What is Speech Synthesis Markup Language (SSML)?
What is SSML?
Speech Synthesis Markup Language (SSML) is an XML-based markup language for speech synthesis applications (converting text into speech). SSML is often embedded in VoiceXML scripts to drive interactive telephony systems (yeah, those annoying robocalls we all hate), much as it is embedded in Amazon Alexa Skills. However, SSML may also be used on its own for things such as creating audiobooks, chatbot voices, video voice-overs, home appliance commands, and many other text-to-speech applications.
For desktop applications, other markup languages are popular, including Apple's embedded speech commands and Microsoft's SAPI text-to-speech (TTS) markup, which is also XML-based. SSML itself is also used to produce speech when writing third-party skills for Google Assistant or Amazon Alexa. In most cases, an SSML text editor is used to create the content, an audiobook for example: the XML tags are applied to the text in the editor, and the resulting output, after processing, is an audio file.
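To make the "tags applied to text" idea concrete, here is a minimal sketch in Python that wraps plain sentences in SSML markup. The helper name build_ssml and its parameters pause_ms and rate are hypothetical, chosen only for illustration. The elements it emits (speak, prosody, s, break) are standard SSML elements. The sketch only produces the markup; turning it into audio is left to whichever speech engine you use, and a full W3C-style document would also carry version and namespace attributes on the root element.

```python
import xml.etree.ElementTree as ET


def build_ssml(sentences, pause_ms=500, rate="medium"):
    """Wrap plain sentences in a minimal SSML fragment.

    Hypothetical helper for illustration only: it builds the markup
    and returns it as a string; it does not synthesize any audio.
    """
    # <speak> is the root element of every SSML document.
    speak = ET.Element("speak")

    # <prosody> controls delivery; here only the speaking rate is set.
    prosody = ET.SubElement(speak, "prosody", {"rate": rate})

    for index, sentence in enumerate(sentences):
        # <s> marks one sentence.
        s = ET.SubElement(prosody, "s")
        s.text = sentence

        # Insert an explicit pause between sentences, not after the last.
        if index < len(sentences) - 1:
            ET.SubElement(prosody, "break", {"time": f"{pause_ms}ms"})

    # ElementTree escapes characters such as "&" in the text for us.
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml(["Hello.", "Welcome back."], pause_ms=300, rate="slow"))
```

Building the markup with an XML library rather than string concatenation keeps the output well-formed even when the source text contains characters like "&" or "<", which would otherwise break the document.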