JS Speech Synthesis Issue on iOS - javascript

I recently implemented a basic web app which relied on Google's TTS URL to generate clear MP3 files for playback on the front end.
This has since been subject to an additional security check, meaning I have had to update the code base to use alternative methods.
One such alternative is JavaScript's Speech Synthesis API, i.e. SpeechSynthesisUtterance and window.speechSynthesis.speak(utterance). This works really well on my desktop and laptop, but as soon as I use it on my iOS devices, the audio plays back significantly faster.
Can anyone suggest what I can do to resolve this?
See below for example code:
var msg = new SpeechSynthesisUtterance();
msg.text = item.title;
msg.voice = "Google UK English Male";
msg.rate = 0.7;
msg.onend = function () {
  console.log('message has ended');
  $('.word-img').removeClass('img-isplaying');
};
msg.onerror = function () {
  console.log('ERROR WITH SPEECH API');
  $('.word-img').removeClass('img-isplaying');
};
window.speechSynthesis.speak(msg);

iOS doesn't allow the SpeechSynthesis API to be used programmatically; the user must trigger the action explicitly. I can understand this decision. But I don't understand why the API does not behave the same way everywhere, as is also the case with playing audio files: it does not work in iOS's default Safari, but it does work in web apps.
Here is a little trick:
<a id="trigger_me" onclick="speech_text()"></a>
<script>
  function speech_text() {
    var msg = new SpeechSynthesisUtterance();
    /* ... */
  }
  /* Now trigger the click event for #trigger_me */
  $('#trigger_me').trigger('click');
</script>
This only works with elements that are already present in the page's markup. If you add a new tag to the DOM programmatically, like...
$('body').append('<a id="trigger_me" onclick="speech_text()"></a>');
... the function will not be triggered. It seems that iOS Safari registers events for special internal functions only once, after the DOM has loaded.

OK, I solved this problem today. The problem is that iOS will not let the speech API run programmatically until it has been triggered once during a user interaction.
So we can listen for a user interaction and trigger one silent utterance, which lets us speak programmatically later.
Here is my code.
let hasEnabledVoice = false;
document.addEventListener('click', () => {
  if (hasEnabledVoice) {
    return;
  }
  const lecture = new SpeechSynthesisUtterance('hello');
  lecture.volume = 0;
  speechSynthesis.speak(lecture);
  hasEnabledVoice = true;
});
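One detail in the question's code is independent of the iOS restriction: the voice property of an utterance expects a SpeechSynthesisVoice object from speechSynthesis.getVoices(), not a name string, and a voice such as "Google UK English Male" may not exist on every device. A minimal sketch of looking a voice up by name with a language fallback follows. The helper name pickVoice and the 'en-GB' fallback are assumptions for illustration, not part of the original posts.

```javascript
// Hypothetical helper: choose a voice object by name, falling back to a language match.
// `voices` is an array shaped like the result of speechSynthesis.getVoices().
function pickVoice(voices, preferredName, fallbackLang) {
  const byName = voices.find((v) => v.name === preferredName);
  if (byName) return byName;
  const byLang = voices.find((v) => v.lang === fallbackLang);
  return byLang || null; // null means: leave msg.voice unset and use the default voice
}

// Browser usage (assumed, not executed here):
// const msg = new SpeechSynthesisUtterance(item.title);
// const voice = pickVoice(speechSynthesis.getVoices(), 'Google UK English Male', 'en-GB');
// if (voice) msg.voice = voice;
```

Keeping the lookup separate from the speech call makes it easy to log which voice a given device actually ends up with.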

Related

JavaScript/ HTML video tag in Safari. Block now playing controls [duplicate]

Safari on iOS puts a scrubber on its lock screen for simple HTMLAudioElements. For example:
const a = new Audio();
a.src = 'https://example.com/audio.m4a'
a.play();
JSFiddle: https://jsfiddle.net/0seckLfd/
The lock screen will allow me to choose a position in the currently playing audio file.
How can I disable the ability for the user to scrub the file on the lock screen? The metadata showing is fine, and being able to pause/play is also acceptable, but I'm also fine with disabling it all if I need to.
Disable the player on the lock screen completely
If you want to remove the lock screen player completely, you could do something like this:
const a = new Audio();
document.querySelector('button').addEventListener('click', (e) => {
  a.src = 'http://sprott.physics.wisc.edu/wop/sounds/Bicycle%20Race-Full.m4a';
  a.play();
});
document.addEventListener('visibilitychange', () => {
  // undefined is coerced to the string "undefined", which is not a playable source, so playback stops
  if (document.hidden) a.src = undefined;
});
https://jsfiddle.net/5s8c9eL0/3/
This stops the player when the user changes tabs or locks the screen.
(The code should be cleaned up and improved depending on your needs.)
From my understanding you can't block/hide the scrubbing commands unless you can tag the audio as a live stream. That being said, you can use js to refuse scrubbing server-side. Reference the answer here. Although that answer speaks of video, it also works with audio.
The lock screen / control center scrubber can also be avoided by using Web Audio API.
This is an example of preloading a sound and playing it, with commentary and error handling:
try {
  // <audio> element is simpler for sound effects,
  // but in iOS/iPad it shows up in the Control Center, as if it's music you'd want to play/pause/etc.
  // Also, on subsequent plays, it only plays part of the sound.
  // And Web Audio API is better for playing sound effects anyway because it can play a sound overlapping with itself, without maintaining a pool of <audio> elements.
  window.audioContext = window.audioContext || new AudioContext(); // Interoperate with other things using Web Audio API, assuming they use the same global & pattern.
  const audio_buffer_promise =
    fetch("audio/sound.wav")
      .then(response => response.arrayBuffer())
      .then(array_buffer => audioContext.decodeAudioData(array_buffer));
  var play_sound = async function () {
    audioContext.resume(); // in case it was not allowed to start until a user interaction
    // Note that this should be before waiting for the audio buffer,
    // so that it works the first time (it would no longer be "within a user gesture")
    // This only works if play_sound is called during a user gesture (at least once), otherwise audioContext.resume(); needs to be called externally.
    const audio_buffer = await audio_buffer_promise; // Promises can be awaited any number of times. This waits for the fetch the first time, and is instant the next time.
    // Note that if the fetch failed, it will not retry. One could instead rely on HTTP caching and just fetch() each time, but that would be a little less efficient as it would need to decode the audio file each time, so the best option might be custom caching with request error handling.
    const source = audioContext.createBufferSource();
    source.buffer = audio_buffer;
    source.connect(audioContext.destination);
    source.start();
  };
} catch (error) {
  console.log("AudioContext not supported", error);
  play_sound = function () {
    // no-op
    // console.log("SFX disabled because AudioContext setup failed.");
  };
}
I searched for a way to help you but did not find an effective way to disable the controls. However, I did find a way to customize them, which may help; follow the Apple tutorial link.
I think what's left to do now is wait and see whether iOS 13 brings an option that does what you want.

How to change voice in Speech Synthesis?

I am trying out a simple example with SpeechSynthesis.
<script>
  voices = window.speechSynthesis.getVoices()
  var utterance = new SpeechSynthesisUtterance("Hello World");
  utterance.voice = voices[4];
  utterance.lang = voices[4].lang;
  window.speechSynthesis.speak(utterance);
</script>
But this gives an error that voices is undefined. I found that the voices are loaded asynchronously. I saw this answer and updated my code as shown below to use a callback.
<script>
  window.speechSynthesis.onvoiceschanged = function() {
    voices = window.speechSynthesis.getVoices()
    var utterance = new SpeechSynthesisUtterance("Hello World");
    utterance.voice = voices[4];
    utterance.lang = voices[4].lang;
    window.speechSynthesis.speak(utterance);
  };
</script>
But for some strange reason, the text is spoken three times instead of once. How can I fix this code?
I can't replicate your issue, but try adding an event listener so that your function runs after the voices are loaded.
let voices, utterance;
function speakVoice() {
  voices = this.getVoices();
  utterance = new SpeechSynthesisUtterance("Hello World");
  utterance.voice = voices[1];
  speechSynthesis.speak(utterance);
}
speechSynthesis.addEventListener('voiceschanged', speakVoice);
This can be seen in many JS Bin-type demos. For example:
http://jsbin.com/sazuca/1/edit?html,css,js,output
https://codepen.io/matt-west/pen/wGzuJ
This behaviour is seen in Chrome, which uses the voiceschanged event, when a non-local voice is used. Another effect is that the list of voices is often triplicated.
The W3C specification says:
voiceschanged event: Fired when the contents of the SpeechSynthesisVoiceList, that the getVoices method will return, have changed. Examples include: server-side synthesis where the list is determined asynchronously, or when client-side voices are installed/uninstalled.
...so I presume that the event is fired once when Chrome gets the voices and then twice more when the first non-local voice is used.
Given that there doesn't seem to be a way to distinguish which change is triggering the event I have been using this ugly bit of code:
// Add voices to dropdown list
loadVoices();
// For browsers that use voiceschanged event
speechSynthesis.onvoiceschanged = function(e) {
  // Load the voices into the dropdown
  loadVoices();
  // Don't add more options when voiceschanged again
  speechSynthesis.onvoiceschanged = null;
};
Where loadVoices() is the function that adds the voices to a selection's options. It's not ideal, however it does work on all browsers (with speech synthesis) whether they use onvoiceschanged or not.
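The same one-shot idea can be wrapped in a Promise so that the speaking code runs exactly once, however many times voiceschanged fires. This is only a sketch under stated assumptions: whenVoicesReady is a hypothetical helper name, and synth is assumed to behave like window.speechSynthesis and to support addEventListener. In a browser that returns an empty list and never fires the event, the promise would stay pending.

```javascript
// Hypothetical helper: resolve once with a non-empty voice list.
// `synth` is assumed to look like window.speechSynthesis.
function whenVoicesReady(synth) {
  return new Promise((resolve) => {
    const voices = synth.getVoices();
    if (voices.length > 0) {
      resolve(voices); // list already populated, no event needed
      return;
    }
    const onChange = () => {
      const loaded = synth.getVoices();
      if (loaded.length === 0) return; // still empty, keep waiting
      synth.removeEventListener('voiceschanged', onChange); // ignore later events
      resolve(loaded);
    };
    synth.addEventListener('voiceschanged', onChange);
  });
}

// Browser usage (assumed, not executed here):
// whenVoicesReady(window.speechSynthesis).then((voices) => {
//   const utterance = new SpeechSynthesisUtterance('Hello World');
//   utterance.voice = voices[0];
//   window.speechSynthesis.speak(utterance);
// });
```

Because a promise settles only once, repeated voiceschanged events cannot cause the text to be spoken again.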
I faced the same problem just now, and the solution is pretty easy.
Declare the voices globally, not just inside the onclick function, and assign the voice twice:
utterance.voice = window.speechSynthesis.getVoices()[Math.floor(Math.random()*6)]
setTimeout(() => {
  utterance.voice = window.speechSynthesis.getVoices()[Math.floor(Math.random()*6)]
}, 1000)
Here utterance is a variable holding a SpeechSynthesisUtterance instance.
The Brave browser only supports 6 voices, compared to Chrome's 24,
which is why I pick a random voice from the first six (indices 0 to 5).
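You can simply add this code to use SpeechSynthesis in your project; it works for me.
var su = new SpeechSynthesisUtterance();
su.text = "Hello World";
// Clear anything still queued before speaking; calling cancel() after speak() would remove this utterance again.
speechSynthesis.cancel();
speechSynthesis.speak(su);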

HTML5 Voice Recognition, wait til user answered

I'm playing around with HTML5 voice recognition.
Currently I have a function like this:
function doSomething() {
  listen("name");
  console.log("done");
}
The "listen" Function works currently like this:
recognition = new webkitSpeechRecognition();
recognition.lang = "de-DE";
recognition.continuous = false;
//recognition.interimResults = true;
recognition.onresult = function(event) {
result = event.results[event.resultIndex];
confidence = result[0].confidence;
result = result[0].transcript.trim();
};
//TODO: remove old results, work with results
recognition.start();
What is happening is that Chrome asks for the microphone access and directly does the console.log.
What I want is for the console.log to wait until the speech recognition is done. Like this:
Chrome asks for mic access
User says something
Something is done with what the user said
the console.log and everything that follows will be executed.
How can I do that?
Thank you!
JavaScript programming is event-driven. The code is not a sequence of statements to execute, but a description of events to handle and the reactions to them.
If you want to perform some action once speech has been recognized, you need to put it into the event handler, in your case:
recognition.onresult = function(event) {
  result = event.results[event.resultIndex];
  confidence = result[0].confidence;
  result = result[0].transcript.trim();
  console.log("done");
};
You can access variables inside handler function and do more complex things.
There are many explanations of event-driven programming on the web, but the most complete one is Chapter 17 Handling Events of JavaScript: The Definitive Guide, 6th Edition
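If the calling code should still read top to bottom, the handler can be wrapped in a Promise and awaited. The following is a rough sketch, not the original poster's code: listenOnce is a hypothetical helper, and recognition is assumed to behave like a webkitSpeechRecognition instance with onresult, onerror, onend and start().

```javascript
// Hypothetical helper: wrap a single recognition session in a Promise.
// `recognition` is assumed to look like a webkitSpeechRecognition instance.
function listenOnce(recognition) {
  return new Promise((resolve, reject) => {
    recognition.onresult = (event) => {
      const best = event.results[event.resultIndex][0];
      resolve({ transcript: best.transcript.trim(), confidence: best.confidence });
    };
    recognition.onerror = (event) => {
      reject(new Error(event.error));
    };
    recognition.onend = () => {
      // Has no effect if the promise was already resolved by onresult.
      reject(new Error('no-result'));
    };
    recognition.start();
  });
}

// Browser usage (assumed, not executed here):
// async function doSomething() {
//   const recognition = new webkitSpeechRecognition();
//   recognition.lang = 'de-DE';
//   recognition.continuous = false;
//   const answer = await listenOnce(recognition);
//   console.log('done', answer.transcript);
// }
```

The code is still event-driven underneath; await only suspends doSomething until the result handler has run.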

help with Firefox extension in multiple windows

I'm writing a Firefox extension that creates a socket server which will output the active tab's URL when a client makes a connection to it. I have the following code in my JavaScript file:
var serverSocket;
function startServer()
{
  var listener =
  {
    onSocketAccepted : function(socket, transport)
    {
      try {
        var outputString = gBrowser.currentURI.spec + "\n";
        var stream = transport.openOutputStream(0,0,0);
        stream.write(outputString,outputString.length);
        stream.close();
      } catch(ex2){ dump("::"+ex2); }
    },
    onStopListening : function(socket, status){}
  };
  try {
    serverSocket = Components.classes["@mozilla.org/network/server-socket;1"]
      .createInstance(Components.interfaces.nsIServerSocket);
    serverSocket.init(7055,true,-1);
    serverSocket.asyncListen(listener);
  } catch(ex){ dump(ex); }
  document.getElementById("status").value = "Started";
}
function stopServer ()
{
  if (serverSocket)
    serverSocket.close();
}
window.addEventListener("load", function() { startServer(); }, false);
window.addEventListener("unload", function() { stopServer(); }, false);
As it is, it works for multiple tabs in a single window. If I open multiple windows, it ignores the additional windows. I think it is creating a server socket for each window, but since they all use the same port, the additional sockets fail to initialize. I need it to create a server socket when the browser launches and to keep running when I close the windows (Mac OS X). As it is, when I close a window but Firefox remains running, the socket closes and I have to restart Firefox to get it up and running. How do I go about that?
Firefox extension overlays bind to window objects. One way around this is to create an XPCOM component or find one that someone else already created to allow you to build functionality without binding it to the window objects.
Of course, section #2 below on Observer Notifications may be helpful as well.
Possible workaround #1:
Instead of calling "startServer()" each time a window is opened, you could keep a counter called windowCount that you increment each time a new window opens. If windowCount is already greater than 0, don't call startServer().
As windows close, you could decrement the count. Once it hits 0, stop the server.
Here is information from the Mozilla forums on this problem:
http://forums.mozillazine.org/viewtopic.php?f=19&t=2030279
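The counting idea in workaround #1 might look roughly like the sketch below. It assumes the counter lives in a scope shared by all windows (for example a JavaScript code module), because a script loaded by an overlay runs separately in each window. createWindowCounter, windowOpened and windowClosed are hypothetical names, and start and stop stand in for startServer() and stopServer() from the question.

```javascript
// Hypothetical helper: start a shared resource for the first window and
// stop it when the last window closes.
// Assumption: this object is created once, in a scope shared by all windows.
function createWindowCounter(start, stop) {
  let windowCount = 0;
  return {
    windowOpened() {
      if (windowCount === 0) start(); // only the first window starts the server
      windowCount += 1;
    },
    windowClosed() {
      if (windowCount === 0) return; // ignore a close without a matching open
      windowCount -= 1;
      if (windowCount === 0) stop(); // only the last window stops the server
    },
    get count() {
      return windowCount;
    },
  };
}

// Usage (assumed): call counter.windowOpened() from each window's "load"
// handler and counter.windowClosed() from its "unload" handler.
```

Note that this still stops the server once the last window closes, so it does not by itself keep the socket alive while Firefox runs without windows.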
Possible workaround #2:
With that said, I've also found documentation for Observer Notifications, which may be helpful as there is a section on Application Startup and Shutdown:
https://developer.mozilla.org/en/Observer_Notifications
UPDATE:
Here are some resources on creating XPCOM components in JavaScript and in C++:
https://developer.mozilla.org/en/how_to_build_an_xpcom_component_in_javascript
http://www.codeproject.com/KB/miscctrl/XPCOM_Creation.aspx
https://developer.mozilla.org/en/creating_xpcom_components
You probably want to:
Move your code into a JavaScript component
Register your component as a profile-after-change observer
Whenever someone makes a connection to your socket, find the active window and return its URL.
Use something like
var wm = Components.classes["@mozilla.org/appshell/window-mediator;1"]
  .getService(Components.interfaces.nsIWindowMediator);
var win = wm.getMostRecentWindow("navigator:browser");
var spec = win ? win.getBrowser().currentURI.spec : "";
var outputString = spec + "\n";
etc.
