Tutorial #51
Web Speech Synthesis API Intermediate   2014-03-04


The Web Speech API Specification from the W3C describes APIs for both Speech Recognition and Speech Synthesis. Tutorial #37 Web Speech Recognition described the first of these. This tutorial shows how you can use Speech Synthesis to make a web page talk.

There are many potential applications for this feature, from mobile web applications that can give spoken directions like a SatNav system, to user feedback in web-based games.

Demo 1 screenshot for this tutorial

NOTE: As of March 2014, only the Google Chrome browser (version 33 and higher) fully supports Speech Synthesis. Safari on iOS 7 offers partial support.

NOTE: Some options, such as rate, volume and pitch, do not work with every voice. The default voice on Mac OS X is one example.


Speech Synthesis has been available on Windows and Mac OS X as a system feature for some time. On Mac OS X, for example, you can choose among several voices in the Dictation & Speech preference pane and then have text spoken by selecting it, right-clicking and choosing the Speech item in the menu.

Speech Synthesis in the Browser makes use of this system service and, in the case of Google Chrome, also provides additional voices that appear to be coded within the browser software itself.

Image 1 for this tutorial

On Mac OS X 10.9.2 (Mavericks) with Google Chrome 33, the list of voices is shown below:

Voices 0 through 9 are provided by Google and would appear to be generated within the browser itself.

The other voices come from the operating system, which in my case is the built-in Speech Synthesis capability of Mac OS X.

NOTE: Not all voices support pitch, rate and volume.

Understanding the Code

speechSynthesis is a property of the window object. The demo code refers to it as window.speechSynthesis, but because window is the global object in a browser, the prefix is optional.

You should check whether the user's browser supports the API by testing the presence of speechSynthesis.
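A minimal sketch of such a check is shown here. The helper name supportsSpeechSynthesis is made up for this example and is not part of the API. It takes the global object as a parameter so that it can be tried outside a browser as well.

```javascript
// Hypothetical helper: returns true when the given global object exposes
// both speechSynthesis and the SpeechSynthesisUtterance constructor.
function supportsSpeechSynthesis(globalObject) {
  return globalObject !== null &&
    typeof globalObject === "object" &&
    "speechSynthesis" in globalObject &&
    "SpeechSynthesisUtterance" in globalObject;
}

// In a browser, pass window. Outside a browser the check simply fails.
if (typeof window !== "undefined" && supportsSpeechSynthesis(window)) {
  console.log("Speech Synthesis is available.");
} else {
  console.log("Speech Synthesis is not supported in this environment.");
}
```

In a real page you would probably hide or disable the speech buttons when the check fails, rather than just logging a message.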

A single instance of speech is called a SpeechSynthesisUtterance. You create a SpeechSynthesisUtterance object and specify its attributes before passing it to a call of window.speechSynthesis.speak().
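The two steps can be sketched roughly as follows. The function name sayText and its env parameter are assumptions made for this example. env defaults to the global object, so in a browser you can call sayText("Hello world") directly.

```javascript
// Hypothetical helper: create an utterance and queue it for speaking.
// env is assumed to provide SpeechSynthesisUtterance and speechSynthesis,
// which is the case for the global object in a supporting browser.
function sayText(text, env = globalThis) {
  const utterance = new env.SpeechSynthesisUtterance(text);
  env.speechSynthesis.speak(utterance);
  return utterance;
}

// Typical use from a button's click handler (browsers may refuse to speak
// unless the call is triggered by a user action). The element id is made up:
//
//   document.getElementById("demo-button")
//     .addEventListener("click", () => sayText("Hello world"));
```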

The demo page contains five demos, each wrapped in an event handler for its associated button.

  1. The simplest syntax - two lines of code
  2. The alternate syntax that exposes the attributes
  3. Fetching the list of voices that are available on your system
  4. Specifying a Voice
  5. Applying options to the voice
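Demos 2 to 5 in the list above can be approximated with the sketch below. It is not the demo page's actual code. The helper names listVoices and buildUtterance, the options object and the env parameter are all assumptions made for illustration. The properties it sets (text, voice, lang, rate, pitch, volume) and the getVoices() method come from the Web Speech API.

```javascript
// Hypothetical helper: describe each available voice as "index: name (lang)".
// getVoices() may return an empty list until the browser has loaded its voices.
function listVoices(env = globalThis) {
  return env.speechSynthesis
    .getVoices()
    .map((voice, index) => `${index}: ${voice.name} (${voice.lang})`);
}

// Hypothetical helper: build an utterance using the attribute-style syntax.
// options may contain voiceName, lang, rate, pitch and volume.
function buildUtterance(text, options = {}, env = globalThis) {
  const utterance = new env.SpeechSynthesisUtterance();
  utterance.text = text;

  // Select a voice by name, if one with that name is installed.
  const voice = env.speechSynthesis
    .getVoices()
    .find((candidate) => candidate.name === options.voiceName);
  if (voice) {
    utterance.voice = voice;
  }

  // Only set the options that were supplied. Not every voice honours them.
  if (options.lang !== undefined) utterance.lang = options.lang;
  if (options.rate !== undefined) utterance.rate = options.rate;       // 0.1 to 10, default 1
  if (options.pitch !== undefined) utterance.pitch = options.pitch;    // 0 to 2, default 1
  if (options.volume !== undefined) utterance.volume = options.volume; // 0 to 1, default 1
  return utterance;
}

// Browser usage (the voice name is only an example and may not exist on your system):
//
//   const u = buildUtterance("Hello", { voiceName: "Google UK English Female", rate: 0.8 });
//   speechSynthesis.speak(u);
```

If listVoices() returns an empty array on the first call, try calling it again from a speechSynthesis.onvoiceschanged handler, since some browsers load the voice list asynchronously.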

More Information

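In addition to the features shown in the demos, the API provides event handlers on each utterance, such as onstart, onend and onerror, and methods on speechSynthesis, such as pause(), resume() and cancel(), that control speech while an utterance is being spoken.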
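As a rough illustration of the event handlers, the sketch below attaches onstart, onend and onerror to an utterance before speaking it. The function name speakWithEvents and the log callback are assumptions made for this example and do not appear in the demo code.

```javascript
// Hypothetical helper: speak some text and report progress through log().
// env is assumed to provide SpeechSynthesisUtterance and speechSynthesis.
function speakWithEvents(text, log, env = globalThis) {
  const utterance = new env.SpeechSynthesisUtterance(text);

  utterance.onstart = () => log("started");
  utterance.onend = () => log("finished");
  utterance.onerror = (event) => log("error: " + event.error);

  env.speechSynthesis.speak(utterance);
  return utterance;
}

// Browser usage, with the control methods available while speech is in progress:
//
//   speakWithEvents("A fairly long sentence to speak.", console.log);
//   speechSynthesis.pause();   // pause the current utterance
//   speechSynthesis.resume();  // continue after a pause
//   speechSynthesis.cancel();  // stop and clear the queue
```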

W3C Web Speech API Specification

Code for this Tutorial
