Chrome speech synthesis api not changing options - javascript

I am trying to change the options (such as pitch, volume, etc.) of a speech synthesis API utterance, but it's not working. For some reason, the only way I can get it to change the voice from UK male to UK female is to call speechSynthesis.getVoices() twice (the var voices line below), and the voice is the only option I can change. Here is the code:
//After the document loads (using the Prototype library)
document.observe("dom:loaded", function() {
  //When the speakMe button is clicked
  $("speakMe").observe('click', function() {
    //Get the entered phrase
    phrase = $('phraseBar').getValue();
    //If the phrase is blank
    if (phrase == "") {
      //Warning message
      alert("Please enter a phrase before asking me to speak for you. Thank you!");
    }
    //Declare the speech object & set attributes
    var speech = new SpeechSynthesisUtterance(phrase);
    var voices = speechSynthesis.getVoices();
    var options = new Object();
    speech.default = false;
    speech.localservice = true;
    speech.voice = voices.filter(function(voice) { return voice.name == userVoice; })[0];
    speech.lang = userLang;
    speech.rate = userRate;
    speech.pitch = 2;
    speech.volume = userVolume;
    //Speak the phrase
    window.speechSynthesis.speak(speech);
  });
  var voices = speechSynthesis.getVoices();
});
Any ideas?

There is a known Chrome issue where the rate, volume, or pitch options have no effect for some voices.
Also, speechSynthesis.getVoices() works the second time because in Chrome the voice list is only available once the onvoiceschanged event has fired, so it should be called after that event (see this answer).


How to use COCO-SSD to search for specific objects, in order to take measurements from them

I am attempting to create an app that detects a ball and another object, say a racket, while taking a video. Both objects must be in the camera's FOV the whole time in order to activate a function that then takes the average speed of the ball. I have set up a site that requests permission to access the webcam and displays it on screen, with a recording button and so on. However, my issues start turning up in my JavaScript.
EDIT: The Start Recording button contains id="togglerecordingmodeEl" onclick="toggleRecording()"
let video = null; // video element
let detector = null; // detector object
let detections = []; // store detection result
let videoVisibility = true;
let detecting = false;
const toggleRecordingEl = document.getElementById('toggleRecordingEl');
function toggleRecording() {
  if (!video || !detector) return;
  if (!detecting) {
    toggleDetectingEl.innerText = 'Stop Detecting';
  } else {
    toggleDetectingEl.innerText = 'Start Detecting';
  }
  detecting = !detecting;
}
function detect() {
  // instruct the "detector" object to start detecting objects in the video element;
  // the "onDetected" function is called when an object is detected
  detector.detect(video, onDetected);
}
// callback function, called when an object is detected
function onDetected(error, results) {
  if (error) {
    console.error(error);
  }
  detections = results;
}
In summary, I want it to first recognise two objects in the FOV, and only if these two objects are in the FOV, then I can work on a new function. I have used & to construct this app so far.
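One way to express the "only if both objects are in the FOV" condition is a small check over the detections array. This is a minimal sketch, not a tested solution: the helper name allTargetsVisible is hypothetical, and it assumes each detection looks like { label, confidence }. Some libraries report these fields as class and score instead, so the field names may need adjusting. The labels "sports ball" and "tennis racket" are COCO class names.

```javascript
// Hypothetical helper: true only when every wanted label has been detected
// with at least `minConfidence`.
// Assumes each detection looks like { label: "sports ball", confidence: 0.9 }.
function allTargetsVisible(detections, wantedLabels, minConfidence) {
  var seen = {};
  (detections || []).forEach(function (d) {
    if (d.confidence >= minConfidence) {
      seen[d.label] = true;
    }
  });
  return wantedLabels.every(function (label) {
    return seen[label] === true;
  });
}

// Example data shaped like the assumed detection results.
var sample = [
  { label: "sports ball", confidence: 0.82 },
  { label: "tennis racket", confidence: 0.67 },
  { label: "person", confidence: 0.95 }
];

// Only start measuring while both objects are in view.
var ready = allTargetsVisible(sample, ["sports ball", "tennis racket"], 0.5);
console.log(ready); // true
```

Calling this from the detection callback on every result would let the speed measurement start, pause, or reset as soon as one of the two objects leaves the frame.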

Web Speech API reads in wrong language in Safari [duplicate]

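The following HTML shows an empty array in the console on the first click:
<!DOCTYPE html>
<script>function test(){ console.log(window.speechSynthesis.getVoices()); }</script>
<a href="#" onclick="test()">Test</a>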
On the second click you will get the expected list.
If you add an onload event to call this function (<body onload="test()">), then you can get the correct result on the first click. Note that the first call on onload still doesn't work properly: it returns an empty array on page load but works afterward.
Since it might be a bug in the beta version, I gave up on the "why" questions.
Now, the question is, if you want to access window.speechSynthesis on page load:
What is the best hack for this issue?
How can you make sure speechSynthesis is loaded on page load?
Background and tests:
I was testing the new features in Web Speech API, then I got to this problem in my code:
<script type="text/javascript">
$(document).ready(function(){
  // Browser support messages. (You might need Chrome 33.0 Beta)
  if (!('speechSynthesis' in window)) {
    alert("You don't have speechSynthesis");
  }
  var voices = window.speechSynthesis.getVoices();
  console.log(voices) // []
  $("#test").on('click', function(){
    var voices = window.speechSynthesis.getVoices();
    console.log(voices); // [SpeechSynthesisVoice, ...]
  });
});
</script>
<a id="test" href="#">click here if 'ready()' didn't work</a>
My question was: why does window.speechSynthesis.getVoices() return an empty array after the page is loaded and the ready function is triggered? As you can see, if you click on the link, the same function returns an array of Chrome's available voices from the onclick trigger.
It seems Chrome loads window.speechSynthesis after the page load!
The problem is not in ready event. If I remove the line var voice=... from ready function, for first click it shows empty list in console. But the second click works fine.
It seems window.speechSynthesis needs more time to load after first call. You need to call it twice! But also, you need to wait and let it load before second call on window.speechSynthesis. For example, following code shows two empty arrays in console if you run it for first time:
// First speechSynthesis call
var voices = window.speechSynthesis.getVoices();
// Second speechSynthesis call
voices = window.speechSynthesis.getVoices();
According to Web Speech API Errata (E11 2013-10-17), the voice list is loaded async to the page. An onvoiceschanged event is fired when they are loaded.
voiceschanged: Fired when the contents of the SpeechSynthesisVoiceList, that the getVoices method will return, have changed. Examples include: server-side synthesis where the list is determined asynchronously, or when client-side voices are installed/uninstalled.
So, the trick is to set your voice from the callback for that event listener:
// wait on voices to be loaded before fetching list
window.speechSynthesis.onvoiceschanged = function() {
  var voices = window.speechSynthesis.getVoices();
  // set your voice here
};
You can use a setInterval to wait until the voices are loaded before using them however you need and then clearing the setInterval:
var timer = setInterval(function() {
  var voices = speechSynthesis.getVoices();
  if (voices.length !== 0) {
    var msg = new SpeechSynthesisUtterance(/*some string here*/);
    msg.voice = voices[/*some number here to choose from array*/];
    speechSynthesis.speak(msg);
    clearInterval(timer);
  }
}, 200);
$("#test").on('click', timer);
After studying the behavior on Google Chrome and Firefox, this is what can get all voices:
Since it involves something asynchronous, it might be best done with a promise:
const allVoicesObtained = new Promise(function(resolve, reject) {
  let voices = window.speechSynthesis.getVoices();
  if (voices.length !== 0) {
    resolve(voices);
  } else {
    window.speechSynthesis.addEventListener("voiceschanged", function() {
      voices = window.speechSynthesis.getVoices();
      resolve(voices);
    });
  }
});
allVoicesObtained.then(voices => console.log("All voices:", voices));
When the voiceschanged event fires, we need to call .getVoices() again. The original array won't be populated with content.
On Google Chrome, we don't have to call getVoices() initially. We only need to listen for the event, and it will then fire. On Firefox, listening is not enough: you have to call getVoices(), then listen for the voiceschanged event, and set the array using getVoices() once you get notified.
Using a promise makes the code cleaner. Everything related to getting voices is in this promise code. If you don't use a promise but instead put this code in your speech routine, it gets quite messy.
You can write a voiceObtained promise that resolves to the voice you want, and then your function to say something can just do: voiceObtained.then(voice => { }) and, inside that handler, call window.speechSynthesis.speak() to speak something. Or you can even write a promise speechReady("hello world").then(speech => { window.speechSynthesis.speak(speech) }) to say something.
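The voiceObtained and speechReady idea could look roughly like this. It is only a sketch under some assumptions: the factory name createSpeech and its chooseVoice parameter are hypothetical, and the synthesis object and utterance constructor are passed in as arguments (in a page these would be window.speechSynthesis and SpeechSynthesisUtterance) so the logic can be exercised outside a browser. It also assumes the synthesis object supports addEventListener.

```javascript
// Hypothetical factory mirroring the voiceObtained / speechReady idea.
// `synth` stands in for window.speechSynthesis, `Utterance` for
// SpeechSynthesisUtterance, and `chooseVoice` picks one voice from the list.
function createSpeech(synth, Utterance, chooseVoice) {
  var voiceObtained = new Promise(function (resolve) {
    function tryResolve() {
      var voices = synth.getVoices();
      if (voices.length === 0) {
        return false; // list not loaded yet
      }
      resolve(chooseVoice(voices));
      return true;
    }
    // Call getVoices() first, then fall back to the voiceschanged event.
    if (!tryResolve()) {
      synth.addEventListener("voiceschanged", tryResolve);
    }
  });

  // Resolves to an utterance that already has the chosen voice applied.
  function speechReady(text) {
    return voiceObtained.then(function (voice) {
      var speech = new Utterance(text);
      if (voice) {
        speech.voice = voice;
      }
      return speech;
    });
  }

  return { voiceObtained: voiceObtained, speechReady: speechReady };
}

// Browser-only usage; skipped where speechSynthesis does not exist.
if (typeof window !== "undefined" && "speechSynthesis" in window) {
  var speechApi = createSpeech(
    window.speechSynthesis,
    SpeechSynthesisUtterance,
    function (voices) { return voices[0]; }
  );
  speechApi.speechReady("hello world").then(function (speech) {
    window.speechSynthesis.speak(speech);
  });
}
```

Because a promise settles only once, later voiceschanged events are harmless here: the extra resolve calls are ignored.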
Here's the answer:
function synthVoice(text) {
  const awaitVoices = new Promise(resolve =>
    window.speechSynthesis.onvoiceschanged = resolve)
  .then(() => {
    const synth = window.speechSynthesis;
    var voices = synth.getVoices();
    const utterance = new SpeechSynthesisUtterance();
    utterance.voice = voices[3];
    utterance.text = text;
    synth.speak(utterance);
  });
}
At first I used onvoiceschanged, but it kept firing even after the voices were loaded, so my goal was to avoid onvoiceschanged at all cost.
This is what I came up with. It seems to work so far; I will update if it breaks.
function loadVoicesWhenAvailable() {
  voices = synth.getVoices();
  if (voices.length !== 0) {
    console.log("start loading voices");
  }
  else {
    setTimeout(function () { loadVoicesWhenAvailable(); }, 10)
  }
}
The setInterval solution by Salman Oskooi was perfect.
Please see:
function myFunction() {
  var dtltxt = '';
  var mytimer = setInterval(function() {
    var voices = speechSynthesis.getVoices();
    if (voices.length !== 0) {
      var msg = new SpeechSynthesisUtterance();
      msg.rate = document.getElementById("rate").value; // 0.1 to 10
      msg.pitch = document.getElementById("pitch").value; //0 to 2
      msg.volume = document.getElementById("volume").value; // 0 to 1
      msg.text = document.getElementById("sampletext").value;
      msg.lang = document.getElementById("lang").value; //'hi-IN';
      for (var i = 0; i < voices.length; i++) {
        dtltxt += voices[i].lang + ' ' + voices[i].name + '\n';
        if (voices[i].lang == msg.lang) {
          msg.voice = voices[i]; // Note: some voices don't support altering params
          msg.voiceURI = voices[i].voiceURI;
          // break;
        }
      }
      msg.onend = function(e) {
        console.log('Finished in ' + e.elapsedTime + ' seconds.');
      };
      speechSynthesis.speak(msg);
      clearInterval(mytimer);
    }
  }, 1000);
}
This works fine on Chrome for Mac, Linux (Ubuntu), Windows and Android.
Android has the non-standard en_GB while the others have en-GB as the language code.
You will also see that the same language (lang) has multiple names.
On Mac Chrome you get en-GB Daniel besides en-GB Google UK English Female and en-GB Google UK English Male:
en-GB Daniel (Mac and iOS)
en-GB Google UK English Female
en-GB Google UK English Male
en_GB English United Kingdom
hi-IN Google हिन्दी
hi-IN Lekha (Mac and iOS)
hi_IN Hindi India
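Because Android reports en_GB where other platforms report en-GB, comparing voices[i].lang directly against a language code can miss voices. One possible way around this is to normalize both sides before comparing. The sketch below assumes two hypothetical helpers, normalizeLang and voicesForLang; neither is part of the Web Speech API.

```javascript
// Hypothetical helper: makes "en_GB" and "en-GB" compare equal.
function normalizeLang(tag) {
  return String(tag).replace(/_/g, "-").toLowerCase();
}

// Hypothetical helper: returns every voice whose language matches `lang`,
// regardless of separator or letter case.
function voicesForLang(voices, lang) {
  var wanted = normalizeLang(lang);
  return voices.filter(function (voice) {
    return normalizeLang(voice.lang) === wanted;
  });
}

// Sample data shaped like entries from speechSynthesis.getVoices().
var sampleVoices = [
  { lang: "en-GB", name: "Daniel" },
  { lang: "en_GB", name: "English United Kingdom" },
  { lang: "hi-IN", name: "Lekha" }
];

console.log(voicesForLang(sampleVoices, "en-GB").length); // 2
```

Since one language can map to several voice names, the caller still has to decide which of the matches to use, for example the first one or the one flagged as default.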
Another way to ensure voices are loaded before you need them is to bind their loading state to a promise, and then dispatch your speech commands from a then:
const awaitVoices = new Promise(done => speechSynthesis.onvoiceschanged = done);
function listVoices() {
  awaitVoices.then(() => {
    let voices = speechSynthesis.getVoices();
    console.log(voices);
  });
}
When you call listVoices, it will either wait for the voices to load first, or dispatch your operation on the next tick.
I used this code to load voices successfully:
<select id="voices"></select>
function loadVoices() {
  if (speechSynthesis.onvoiceschanged !== undefined) {
    speechSynthesis.onvoiceschanged = populateVoiceList;
  }
}
function populateVoiceList() {
  var allVoices = speechSynthesis.getVoices();
  allVoices.forEach(function(voice, index) {
    var option = $('<option>').val(index).html(voice.name).prop("selected", voice.default);
    $('#voices').append(option);
  });
  if (allVoices.length > 0 && speechSynthesis.onvoiceschanged !== undefined) {
    // unregister event listener (it is fired multiple times)
    speechSynthesis.onvoiceschanged = null;
  }
}
I found the 'onvoiceschanged' code from this article:
Note: requires JQuery.
Works in Firefox/Safari and Chrome (and in Google Apps Script too - but only in the HTML).
async function speak(txt) {
  await initVoices();
  const u = new SpeechSynthesisUtterance(txt);
  u.voice = speechSynthesis.getVoices()[3];
  speechSynthesis.speak(u);
}
function initVoices() {
  return new Promise(function (res, rej){
    if (window.speechSynthesis.onvoiceschanged) {
      res();
    } else {
      window.speechSynthesis.onvoiceschanged = () => res();
    }
  });
}
The accepted answer works great, but if you're using an SPA and not doing full-page loads, the voices will not be available when navigating between links.
This will run on a full-page load
For SPA, it wouldn't run.
You can check if it's undefined, run it, or else, get it from the window object.
An example that works:
let voices = [];
if (window.speechSynthesis.onvoiceschanged == undefined) {
  window.speechSynthesis.onvoiceschanged = () => {
    voices = window.speechSynthesis.getVoices();
  };
} else {
  voices = window.speechSynthesis.getVoices();
}
// console.log("voices", voices);
I had to do my own research for this to make sure I understood it properly, so just sharing (feel free to edit).
My goal is to:
Get a list of voices available on my device
Populate a select element with those voices (after a particular page loads)
Use easy to understand code
The basic functionality is demonstrated in MDN's official live demo of:
but I wanted to understand it better.
To break the topic down...
The SpeechSynthesis interface of the Web Speech API is the controller
interface for the speech service; this can be used to retrieve
information about the synthesis voices available on the device, start
and pause speech, and other commands besides.
The onvoiceschanged property of the SpeechSynthesis interface
represents an event handler that will run when the list of
SpeechSynthesisVoice objects that would be returned by the
SpeechSynthesis.getVoices() method has changed (when the voiceschanged
event fires.)
Example A
If my application merely has:
var synth = window.speechSynthesis;
Chrome developer tools console will show:
Example B
If I change the code to:
var synth = window.speechSynthesis;
var voices = synth.getVoices();
The before and after states are the same, and voices is an empty array.
Although I'm not confident implementing Promises, the following worked for me:
Defining the function
var synth = window.speechSynthesis;
// declare so that values are accessible globally
var voices = [];
function set_up_speech() {
  return new Promise(function(resolve, reject) {
    // get the voices
    var voices = synth.getVoices();
    // get reference to select element
    var $select_topic_speaking_voice = $("#select_topic_speaking_voice");
    // for each voice, generate select option html and append to select
    for (var i = 0; i < voices.length; i++) {
      var option = $("<option></option>");
      var suffix = "";
      // if it is the default voice, add suffix text
      if (voices[i].default) {
        suffix = " -- DEFAULT";
      }
      // create the option text
      var option_text = voices[i].name + " (" + voices[i].lang + suffix + ")";
      // add the option text
      option.text(option_text);
      // add option attributes
      option.attr("data-lang", voices[i].lang);
      option.attr("data-name", voices[i].name);
      // append option to select element
      $select_topic_speaking_voice.append(option);
    }
    // resolve the voices value
    resolve(voices);
  });
}
Calling the function
// in your handler, populate the select element
if (page_title === "something") {
  set_up_speech();
}
Android Chrome - turn off data saver. It was helpful for me. (Chrome 71.0.3578.99)
// wait until the voices load
window.speechSynthesis.onvoiceschanged = function() {
  window.speechSynthesis.getVoices();
};
let voices = speechSynthesis.getVoices();
let gotVoices = false;
if (voices.length) {
  resolve(voices, message);
} else {
  speechSynthesis.onvoiceschanged = () => {
    if (!gotVoices) {
      voices = speechSynthesis.getVoices();
      gotVoices = true;
      if (voices.length) resolve(voices, message);
    }
  };
}
function resolve(voices, message) {
  var synth = window.speechSynthesis;
  let utter = new SpeechSynthesisUtterance();
  utter.lang = 'en-US';
  utter.voice = voices[65];
  utter.text = message;
  utter.volume = 1.0; // volume ranges from 0 to 1
  synth.speak(utter);
}
Works for Edge, Chrome and Safari - doesn't repeat the sentences.

Speech Synthesis won't pause in google chrome at first load after browser launch

Kill the browser completely, reopen the browser, and start text-to-speech with speechSynthesis.speak(string);
speechSynthesis.pause(); won't work until you refresh the page.
The same can be seen at,
This happens on both Mac and Windows, Chrome 70.
Does anybody know a workaround?
If you speak an empty text first it pauses on first load.
let btnSpeak = document.getElementById("btnSpeak");
let spoken = false;
function speak() {
  btnSpeak.disabled = true;
  let msg = new SpeechSynthesisUtterance();
  if (!spoken) {
    // speak an empty utterance first so that pause() works on first load
    let mt = new SpeechSynthesisUtterance();
    mt.text = " ";
    window.speechSynthesis.speak(mt);
    spoken = true;
  }
  msg.text = "Use a long sentence to give time to hit pause";
  msg.voice = voices[0];
  msg.lang = voices[0].lang;
  msg.onend = function(event) {
    btnSpeak.disabled = false;
  };
  window.speechSynthesis.speak(msg);
}

Continuous Speech Recognition on browser like "ok google" or "hey siri"

I am doing a POC, and my requirement is to implement a feature like "OK Google" or "Hey Siri" in the browser.
I am using the Chrome browser's Web Speech API. I noticed that I can't keep recognition running continuously: it terminates automatically after a certain period of time, and I know this is because of security concerns. As a hack, I restart the SpeechRecognition from its end event whenever it terminates, but this is not the best way to implement such a solution. For example, if I run two instances of the same application in different browser tabs, it doesn't work, and if another application in my browser also uses speech recognition, neither application behaves as expected. I am looking for a better approach to solve this problem.
Thanks in advance.
Since your problem is that you can't run the SpeechRecognition continuously for long periods of time, one way would be to start the SpeechRecognition only when you get some input in the mic.
This way, you will start the SR and look for your magic_word only when there is some input.
If the magic_word is found, then you will be able to use the SR normally for your other tasks.
Input can be detected with the Web Audio API, which is not bound by the time restriction that SR suffers from. You can feed it with a LocalMediaStream from MediaDevices.getUserMedia.
For more info on the script below, you can see this answer.
Here is how you could attach it to a SpeechRecognition:
const magic_word = ##YOUR_MAGIC_WORD##;
// initialize our SpeechRecognition object
let recognition = new webkitSpeechRecognition();
recognition.lang = 'en-US';
recognition.interimResults = false;
recognition.maxAlternatives = 1;
recognition.continuous = true;
// detect the magic word
recognition.onresult = e => {
  // extract all the transcripts
  var transcripts = [].concat.apply([], [...e.results]
    .map(res => [...res]
      .map(alt => alt.transcript)
    )
  );
  if (transcripts.some(t => t.indexOf(magic_word) > -1)) {
    //do something awesome, like starting your own command listeners
  }
  else {
    // didn't understand...
  }
};
// called when we detect silence
function stopSpeech(){
  recognition.stop();
}
// called when we detect sound
function startSpeech(){
  try { // calling it twice will throw...
    recognition.start();
  }
  catch(e) {}
}
// request a LocalMediaStream
navigator.mediaDevices.getUserMedia({audio: true})
// add our listeners
.then(stream => detectSilence(stream, stopSpeech, startSpeech))
.catch(e => console.log(e.message));
function detectSilence(
  stream,
  onSoundEnd = _=>{},
  onSoundStart = _=>{},
  silence_delay = 500,
  min_decibels = -80
) {
  const ctx = new AudioContext();
  const analyser = ctx.createAnalyser();
  const streamNode = ctx.createMediaStreamSource(stream);
  streamNode.connect(analyser);
  analyser.minDecibels = min_decibels;
  const data = new Uint8Array(analyser.frequencyBinCount); // will hold our data
  let silence_start = performance.now();
  let triggered = false; // trigger only once per silence event
  function loop(time) {
    requestAnimationFrame(loop); // we'll loop every 60th of a second to check
    analyser.getByteFrequencyData(data); // get current data
    if (data.some(v => v)) { // if there is data above the given db limit
      if (triggered) {
        triggered = false;
        onSoundStart();
      }
      silence_start = time; // set it to now
    }
    if (!triggered && time - silence_start > silence_delay) {
      onSoundEnd();
      triggered = true;
    }
  }
  loop();
}
As a plunker, since neither StackSnippets nor jsfiddle's iframes will allow gUM in two versions...
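The timing logic inside detectSilence can also be separated from the Web Audio calls, which makes it easier to reason about when the start and stop callbacks fire. The sketch below is an illustration only: createSilenceTracker is a hypothetical helper, and its hasSound argument stands in for the data.some(v => v) check on the analyser data.

```javascript
// Hypothetical helper that isolates the timing logic of silence detection.
// The returned update(hasSound, time) reports "start" when sound resumes
// after a reported silence, "end" when silence has lasted longer than
// `silenceDelay` milliseconds, and null otherwise.
function createSilenceTracker(silenceDelay, startTime) {
  var silenceStart = startTime;
  var triggered = false; // true while we are inside a reported silence

  return function update(hasSound, time) {
    var event = null;
    if (hasSound) {
      if (triggered) {
        triggered = false;
        event = "start";
      }
      silenceStart = time; // any sound resets the silence timer
    }
    if (!triggered && time - silenceStart > silenceDelay) {
      triggered = true; // report each silence only once
      event = "end";
    }
    return event;
  };
}

var update = createSilenceTracker(500, 0);
console.log(update(true, 100));  // null: sound, but no silence was reported yet
console.log(update(false, 400)); // null: only 300 ms of silence so far
console.log(update(false, 700)); // "end": 600 ms of silence
console.log(update(true, 800));  // "start": sound resumed after the silence
```

With this shape, "end" would map to stopping recognition and "start" to starting it, while the animation-frame loop only has to feed in the current time and whether the analyser saw any signal.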

W3C Speech To Text: output values as you speak

I've been using the W3C Web Speech API (speech recognition) for the web in my app. I'd like the words to start appearing as I speak them, because I want the user to have near-instant feedback on the current word they're speaking. Currently, the result events in the spec wait to append the entire array after a second or so of not speaking.
I've looked through the standards, but I've only found that it waits a bit to construct the final results list from the result event:
5.1.3 SpeechRecognition Events
result event: Fired when the speech recognizer returns a result
5.1.8 SpeechRecognitionEvent
results attribute: The array of all current recognition results for this session.
I've also tried retrieving the results in onstart and onpause methods:
recognition = new webkitSpeechRecognition()
recognition.onstart = function (event) {
  //append word
}
recognition.onpause = function (event) {
  //append word
}
Anyone know a way to accomplish this "typing" effect of the words as you speak?
The other issue is that if the user stops speaking for a second and the results list is compiled (i.e., the result event is fired), and they then speak again, the results list is not updated.
This happens even if I set recognition.continuous = true;
Found it from Google Developers Introduction Video.
In addition to recognition.continuous = true, you also need recognition.interimResults = true;.
Then you need to modify your logic slightly in the onresult handler to account for interim results:
recognition.onresult = function (event) {
  var final = "";
  var interim = "";
  for (var i = 0; i < event.results.length; ++i) {
    if (event.results[i].isFinal) {
      final += event.results[i][0].transcript;
    } else {
      interim += event.results[i][0].transcript;
    }
  }
  final_span.innerHTML = final;
  interim_span.innerHTML = interim;
};
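To see what the handler has to deal with, it can help to run the same split on sample data outside the browser. The sketch below assumes a hypothetical helper named splitTranscripts and hand-made sample results. Real results come from event.results, where each entry exposes isFinal and holds its best alternative at index 0.

```javascript
// Hypothetical helper: separates final and interim transcripts.
// `results` is expected to look like event.results from a result event:
// an array-like of results, each with isFinal and its first alternative at [0].
function splitTranscripts(results) {
  var finalText = "";
  var interimText = "";
  for (var i = 0; i < results.length; ++i) {
    var transcript = results[i][0].transcript;
    if (results[i].isFinal) {
      finalText += transcript;
    } else {
      interimText += transcript;
    }
  }
  return { finalText: finalText, interimText: interimText };
}

// Builds a fake result entry shaped like a SpeechRecognitionResult.
function fakeResult(transcript, isFinal) {
  var result = [{ transcript: transcript }];
  result.isFinal = isFinal;
  return result;
}

var sampleResults = [
  fakeResult("hello world ", true), // the recognizer has settled on this part
  fakeResult("how are", false)      // still being revised as the user speaks
];

var split = splitTranscripts(sampleResults);
console.log(split.finalText);   // "hello world "
console.log(split.interimText); // "how are"
```

Rendering the interim text in a separate element, as the answer does with interim_span, gives the "typing" effect, since interim entries are replaced on each result event until they become final.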

