Automating text to speech in JavaScript

In JavaScript I can get the experimental text-to-speech feature working in functions invoked through onload or onclick, but it does not work inside an interval timer. I guess that has something to do with setting up interrupts within an interrupt timer.
Any suggestions for how I can have a spoken message once every minute?
The test I use is
var speech = new SpeechSynthesisUtterance("hello world");
window.speechSynthesis.speak(speech);
I accept that this feature only works on certain browsers and devices and is experimental, but it is also widely used.
I am trying to produce an automatic spoken status report every minute for a monitoring application.

Actually, you can do this with setInterval. See the example below for one way of doing it.
var messages = ['Hello', 'world', 'It\'s me', 'Good day', 'How are you?', 'Download complete'];
var frequency = 3000;
var myInterval = setInterval(speak, frequency);

function speak() {
  // pick a random index across the whole array
  // (multiplying by messages.length - 1 would never select the last item)
  let rand = Math.floor(Math.random() * messages.length);
  console.log(rand);
  let speech = new SpeechSynthesisUtterance(messages[rand]);
  window.speechSynthesis.speak(speech);
}
I'm just picking random words from an array and speaking one every three seconds. You'd adapt the provided code to your needs (an utterance of your status every minute).
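For the one-minute status report in the question, the adaptation might look like this (a minimal sketch; getStatusReport() is a hypothetical placeholder for your own monitoring code):
// speak a status report once per minute
function speakStatus() {
  const report = getStatusReport(); // hypothetical: returns the current status as a string
  window.speechSynthesis.speak(new SpeechSynthesisUtterance(report));
}
setInterval(speakStatus, 60 * 1000);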

Intervals work perfectly fine with the speechSynthesis API. Here is a working example:
const speech = new SpeechSynthesisUtterance("hello world");
// say "hello world" every 5 seconds
setInterval(() => window.speechSynthesis.speak(speech), 5000);
If you want to check your monitoring updates before the text-to-speech, you can build something like this:
setInterval(() => {
  // Perform some calculation of the current status ...
  const status = getStatusUpdateAsString();
  // Prepare speech
  const speech = new SpeechSynthesisUtterance(status);
  window.speechSynthesis.speak(speech);
}, 5000);

Here is an example using the SpeechSynthesisUtterance end event, an AbortController, and setTimeout to automate a counter. You should be able to adapt it to your specific needs:
let count = 1
let controller = new AbortController()
const synth = window.speechSynthesis
const utter = new SpeechSynthesisUtterance(String(count))
const text = document.getElementById('text')

const onUtterEnd = () => {
  setTimeout(() => {
    count = count + 1
    utter.text = String(count)
    text.innerHTML = `<mark>${count}</mark>`
    synth.speak(utter)
  }, 500)
}

const start = () => {
  text.innerHTML = `<mark>${count}</mark>`
  controller = new AbortController()
  utter.addEventListener('end', onUtterEnd, { signal: controller.signal })
  synth.speak(utter)
}

const stop = () => {
  controller.abort()
  synth.cancel()
}

document.getElementById('start').addEventListener('click', start)
document.getElementById('stop').addEventListener('click', stop)
<button id="start">start</button>
<button id="stop">stop</button>
<p id="text"></p>
If you're using React, check out tts-react for a custom control or hook.


How do I make a div click without the mouse moving?

I'm trying to write JavaScript that injects into cpstest.org so that when you click the start button, it automatically starts auto-clicking. I tried to make a script for it, but it just didn't work.
The start button has id start, but I don't know which element to click. Viewing the source code or using dev tools would help, but those are blocked on my computer.
let i = 1;
document.querySelector('#start').addEventListener('click', function() {
  i = 0;
});
function repeat() {
  if (i == 0) {
    document.querySelector(unknownId).click(); // unknownId: the selector I don't know
  }
  requestAnimationFrame(repeat);
}
repeat();
Here's a solution:
// speed - how many CPS you want
// duration - how long the test is (in seconds)
const rapidClick = (speed, duration) => {
  const startButton = document.querySelector('button#start');
  const clickArea = document.querySelector('div#clickarea');
  // Start the test
  startButton.click();
  // Click the clickArea on an interval based on the "speed" provided
  const interval = setInterval(() => clickArea.click(), 1e3 / speed);
  // Clear the interval after the duration has passed
  setTimeout(() => clearInterval(interval), duration * 1e3);
}

// Do 100 CPS for 5 seconds
rapidClick(100, 5);
I noticed that even when setting the speed parameter to something insane like 1000 CPS, the test reports only 133-139 clicks per second; it seems to cap around 139. Hope this helps.
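As a variation, if the page listens for real mouse events rather than invoking click handlers, dispatching synthetic MouseEvents instead of calling .click() is worth trying. This is a sketch only; whether cpstest.org counts these events is untested:
// fire a full mousedown/mouseup/click sequence on an element
function syntheticClick(el) {
  for (const type of ['mousedown', 'mouseup', 'click']) {
    el.dispatchEvent(new MouseEvent(type, { bubbles: true, cancelable: true, view: window }));
  }
}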

Web Speech API reads in wrong language in Safari [duplicate]

The following HTML shows an empty array in the console on first click:
<!DOCTYPE html>
<html>
<head>
  <script>
    function test() {
      console.log(window.speechSynthesis.getVoices());
    }
  </script>
</head>
<body>
  <!-- the original trigger markup was lost in formatting; something clickable like this is implied -->
  <button onclick="test()">Test</button>
</body>
</html>
On the second click you will get the expected list.
If you add an onload event to call this function (<body onload="test()">), then you get the correct result on the first click. Note that the call from onload itself still doesn't work properly: it returns an empty array on page load, but works afterward.
Questions:
Since it might be a bug in the beta version, I gave up on the "why" questions.
Now, the question is: if you want to access window.speechSynthesis on page load,
what is the best hack for this issue?
How can you make sure speechSynthesis is loaded on page load?
Background and tests:
I was testing the new features of the Web Speech API when I ran into this problem in my code:
<script type="text/javascript">
  $(document).ready(function() {
    // Browser support messages. (You might need Chrome 33.0 Beta)
    if (!('speechSynthesis' in window)) {
      alert("You don't have speechSynthesis");
    }
    var voices = window.speechSynthesis.getVoices();
    console.log(voices); // []
    $("#test").on('click', function() {
      var voices = window.speechSynthesis.getVoices();
      console.log(voices); // [SpeechSynthesisVoice, ...]
    });
  });
</script>
<a id="test" href="#">click here if 'ready()' didn't work</a>
My question was: why does window.speechSynthesis.getVoices() return an empty array after the page has loaded and the ready handler has run, while the same function returns an array of Chrome's available voices from an onclick trigger?
It seems Chrome loads window.speechSynthesis after the page load!
The problem is not in the ready event. If I remove the line var voices = ... from the ready function, the first click still shows an empty list in the console, but the second click works fine.
It seems window.speechSynthesis needs more time to load after the first call. You need to call it twice, but you also need to wait and let it load before the second call to window.speechSynthesis. For example, the following code shows two empty arrays in the console if you run it for the first time:
// First speechSynthesis call
var voices = window.speechSynthesis.getVoices();
console.log(voices);
// Second speechSynthesis call
voices = window.speechSynthesis.getVoices();
console.log(voices);
According to Web Speech API Errata (E11 2013-10-17), the voice list is loaded asynchronously to the page. A voiceschanged event is fired when the voices are loaded.
voiceschanged: Fired when the contents of the SpeechSynthesisVoiceList, that the getVoices method will return, have changed. Examples include: server-side synthesis where the list is determined asynchronously, or when client-side voices are installed/uninstalled.
So, the trick is to set your voice from the callback for that event listener:
// wait on voices to be loaded before fetching list
window.speechSynthesis.onvoiceschanged = function() {
  window.speechSynthesis.getVoices();
  ...
};
You can use a setInterval to wait until the voices are loaded before using them however you need, and then clear the setInterval:
function waitForVoices() {
  var timer = setInterval(function() {
    var voices = speechSynthesis.getVoices();
    console.log(voices);
    if (voices.length !== 0) {
      var msg = new SpeechSynthesisUtterance(/*some string here*/);
      msg.voice = voices[/*some number here to choose from array*/];
      speechSynthesis.speak(msg);
      clearInterval(timer);
    }
  }, 200);
}

// the original bound the interval id itself as the handler, which does nothing;
// bind the polling function instead
$("#test").on('click', waitForVoices);
After studying the behavior on Google Chrome and Firefox, here is code that can get all voices.
Since it involves something asynchronous, it is best done with a promise:
const allVoicesObtained = new Promise(function(resolve, reject) {
  let voices = window.speechSynthesis.getVoices();
  if (voices.length !== 0) {
    resolve(voices);
  } else {
    window.speechSynthesis.addEventListener("voiceschanged", function() {
      voices = window.speechSynthesis.getVoices();
      resolve(voices);
    });
  }
});

allVoicesObtained.then(voices => console.log("All voices:", voices));
Note:
When the voiceschanged event fires, we need to call .getVoices() again. The original array won't be populated with content.
On Google Chrome, we don't have to call getVoices() initially; we only need to listen for the event. On Firefox, listening is not enough: you have to call getVoices() first, then listen for the voiceschanged event, and set the array using getVoices() once you get notified.
Using a promise makes the code cleaner. Everything related to getting voices is in this promise code. If you don't use a promise but instead put this code in your speech routine, it gets quite messy.
You can write a voiceObtained promise that resolves to the voice you want, and then your function to say something can just do voiceObtained.then(voice => { ... }) and call window.speechSynthesis.speak() inside that handler. Or you can even write a promise speechReady("hello world").then(speech => { window.speechSynthesis.speak(speech) }) to say something.
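A sketch of that suggested speechReady helper, building on the allVoicesObtained promise above (the helper name and voice choice are illustrative, not a standard API):
function speechReady(text) {
  return allVoicesObtained.then(voices => {
    const speech = new SpeechSynthesisUtterance(text);
    // pick the default voice if there is one, else the first
    speech.voice = voices.find(v => v.default) || voices[0];
    return speech;
  });
}
speechReady("hello world").then(speech => window.speechSynthesis.speak(speech));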
Here's the answer:
function synthVoice(text) {
  const awaitVoices = new Promise(resolve =>
    window.speechSynthesis.onvoiceschanged = resolve)
    .then(() => {
      const synth = window.speechSynthesis;
      var voices = synth.getVoices();
      console.log(voices);
      const utterance = new SpeechSynthesisUtterance();
      utterance.voice = voices[3];
      utterance.text = text;
      synth.speak(utterance);
    });
}
At first I used onvoiceschanged, but it kept firing even after the voices were loaded, so my goal was to avoid onvoiceschanged at all costs.
This is what I came up with. It seems to work so far; I will update this if it breaks.
const synth = window.speechSynthesis;
let voices;

loadVoicesWhenAvailable();

function loadVoicesWhenAvailable() {
  voices = synth.getVoices();
  if (voices.length !== 0) {
    console.log("start loading voices");
    LoadVoices(); // the app's own function that consumes the voices
  }
  else {
    // try again shortly until the browser has populated the list
    setTimeout(function () { loadVoicesWhenAvailable(); }, 10);
  }
}
The setInterval solution by Salman Oskooi was perfect.
Please see https://jsfiddle.net/exrx8e1y/
function myFunction() {
  var dtlarea = document.getElementById("details");
  //dtlarea.style.display="none";
  var dtltxt = "";
  var mytimer = setInterval(function() {
    var voices = speechSynthesis.getVoices();
    //console.log(voices);
    if (voices.length !== 0) {
      var msg = new SpeechSynthesisUtterance();
      msg.rate = document.getElementById("rate").value; // 0.1 to 10
      msg.pitch = document.getElementById("pitch").value; // 0 to 2
      msg.volume = document.getElementById("volume").value; // 0 to 1
      msg.text = document.getElementById("sampletext").value;
      msg.lang = document.getElementById("lang").value; //'hi-IN';
      for (var i = 0; i < voices.length; i++) {
        dtltxt += voices[i].lang + ' ' + voices[i].name + '\n';
        if (voices[i].lang == msg.lang) {
          msg.voice = voices[i]; // Note: some voices don't support altering params
          msg.voiceURI = voices[i].voiceURI;
          // break;
        }
      }
      msg.onend = function(e) {
        console.log('Finished in ' + e.elapsedTime + ' seconds.');
        dtlarea.value = dtltxt;
      };
      speechSynthesis.speak(msg);
      clearInterval(mytimer);
    }
  }, 1000);
}
This works fine in Chrome on macOS, Linux (Ubuntu), Windows, and Android.
Android has the non-standard en_GB while the others have en-GB as the language code.
Also, you will see that the same language (lang) has multiple names.
On Mac Chrome you get en-GB Daniel besides en-GB Google UK English Female and en-GB Google UK English Male:
en-GB Daniel (Mac and iOS)
en-GB Google UK English Female
en-GB Google UK English Male
en_GB English United Kingdom
hi-IN Google हिन्दी
hi-IN Lekha (Mac and iOS)
hi_IN Hindi India
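If you need to match a voice across platforms despite the underscore variant, one option is to normalize the code before comparing (a small sketch):
// treat Android's "en_GB" and the standard "en-GB" as equivalent
const normalizeLang = lang => lang.replace('_', '-').toLowerCase();
const voicesForLang = (voices, lang) =>
  voices.filter(v => normalizeLang(v.lang) === normalizeLang(lang));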
Another way to ensure voices are loaded before you need them is to bind their loading state to a promise, and then dispatch your speech commands from a then:
const awaitVoices = new Promise(done => speechSynthesis.onvoiceschanged = done);

function listVoices() {
  awaitVoices.then(() => {
    let voices = speechSynthesis.getVoices();
    console.log(voices);
  });
}
When you call listVoices, it will either wait for the voices to load first, or dispatch your operation on the next tick.
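The same pattern works for speaking, not just listing. A sketch built on the awaitVoices promise above (the voice choice here is arbitrary):
function say(text) {
  awaitVoices.then(() => {
    const utterance = new SpeechSynthesisUtterance(text);
    utterance.voice = speechSynthesis.getVoices()[0]; // pick whichever voice you prefer
    speechSynthesis.speak(utterance);
  });
}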
I used this code to load voices successfully:
<select id="voices"></select>
...
function loadVoices() {
  populateVoiceList();
  if (speechSynthesis.onvoiceschanged !== undefined) {
    speechSynthesis.onvoiceschanged = populateVoiceList;
  }
}

function populateVoiceList() {
  var allVoices = speechSynthesis.getVoices();
  allVoices.forEach(function(voice, index) {
    var option = $('<option>').val(index).html(voice.name).prop("selected", voice.default);
    $('#voices').append(option);
  });
  if (allVoices.length > 0 && speechSynthesis.onvoiceschanged !== undefined) {
    // unregister event listener (it is fired multiple times)
    speechSynthesis.onvoiceschanged = null;
  }
}
I found the 'onvoiceschanged' code from this article: https://hacks.mozilla.org/2016/01/firefox-and-the-web-speech-api/
Note: requires jQuery.
Works in Firefox/Safari and Chrome (and in Google Apps Script too, but only in the HTML).
async function speak(txt) {
  await initVoices();
  const u = new SpeechSynthesisUtterance(txt);
  u.voice = speechSynthesis.getVoices()[3];
  speechSynthesis.speak(u);
}

function initVoices() {
  return new Promise(function (res, rej) {
    // if the voices are already loaded, resolve immediately;
    // otherwise wait for the voiceschanged event
    if (speechSynthesis.getVoices().length) {
      res();
    } else {
      window.speechSynthesis.onvoiceschanged = () => res();
    }
  });
}
While the accepted answer works great, if you're using an SPA and not doing full-page loads, the voices will not be available when navigating between links.
On a full-page load this will run:
window.speechSynthesis.onvoiceschanged
For an SPA, it wouldn't run.
You can check whether it's undefined and run it, or else get the voices from the window object directly.
An example that works:
let voices = [];

if (window.speechSynthesis.onvoiceschanged == undefined) {
  window.speechSynthesis.onvoiceschanged = () => {
    voices = window.speechSynthesis.getVoices();
  };
} else {
  voices = window.speechSynthesis.getVoices();
}
// console.log("voices", voices);
I had to do my own research for this to make sure I understood it properly, so just sharing (feel free to edit).
My goal is to:
Get a list of voices available on my device
Populate a select element with those voices (after a particular page loads)
Use easy to understand code
The basic functionality is demonstrated in MDN's official live demo of:
https://github.com/mdn/web-speech-api/tree/master/speak-easy-synthesis
but I wanted to understand it better.
To break the topic down...
SpeechSynthesis
The SpeechSynthesis interface of the Web Speech API is the controller
interface for the speech service; this can be used to retrieve
information about the synthesis voices available on the device, start
and pause speech, and other commands besides.
Source
onvoiceschanged
The onvoiceschanged property of the SpeechSynthesis interface
represents an event handler that will run when the list of
SpeechSynthesisVoice objects that would be returned by the
SpeechSynthesis.getVoices() method has changed (when the voiceschanged
event fires.)
Source
Example A
If my application merely has:
var synth = window.speechSynthesis;
console.log(synth);
console.log(synth.onvoiceschanged);
The Chrome developer tools console will show the SpeechSynthesis object and null for onvoiceschanged (screenshot omitted).
Example B
If I change the code to:
var synth = window.speechSynthesis;
console.log("BEFORE");
console.log(synth);
console.log(synth.onvoiceschanged);
console.log("AFTER");
var voices = synth.getVoices();
console.log(voices);
console.log(synth);
console.log(synth.onvoiceschanged);
The before and after states are the same, and voices is an empty array.
Solution
Although I'm not confident implementing Promises, the following worked for me:
Defining the function
var synth = window.speechSynthesis;
// declare so that values are accessible globally
var voices = [];

function set_up_speech() {
  return new Promise(function(resolve, reject) {
    // get the voices (assign to the global; redeclaring here would shadow it)
    voices = synth.getVoices();
    // get reference to select element
    var $select_topic_speaking_voice = $("#select_topic_speaking_voice");
    // for each voice, generate select option html and append to select
    for (var i = 0; i < voices.length; i++) {
      var option = $("<option></option>");
      var suffix = "";
      // if it is the default voice, add suffix text
      if (voices[i].default) {
        suffix = " -- DEFAULT";
      }
      // create the option text
      var option_text = voices[i].name + " (" + voices[i].lang + suffix + ")";
      // add the option text
      option.text(option_text);
      // add option attributes
      option.attr("data-lang", voices[i].lang);
      option.attr("data-name", voices[i].name);
      // append option to select element
      $select_topic_speaking_voice.append(option);
    }
    // resolve the voices value
    resolve(voices);
  });
}
Calling the function
// in your handler, populate the select element
if (page_title === "something") {
  set_up_speech()
}
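Since set_up_speech() resolves with the voices array, the caller can also consume it via .then (a sketch following the code above):
set_up_speech().then(function(all_voices) {
  // the select is populated; all_voices is the same array stored globally
  console.log(all_voices.length + " voices loaded");
});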
Android Chrome: turn off Data Saver. That was helpful for me (Chrome 71.0.3578.99).
// wait until the voices load
window.speechSynthesis.onvoiceschanged = function() {
  window.speechSynthesis.getVoices();
};
let voices = speechSynthesis.getVoices();
let gotVoices = false;

if (voices.length) {
  resolve(voices, message);
} else {
  speechSynthesis.onvoiceschanged = () => {
    if (!gotVoices) {
      voices = speechSynthesis.getVoices();
      gotVoices = true;
      if (voices.length) resolve(voices, message);
    }
  };
}

function resolve(voices, message) {
  var synth = window.speechSynthesis;
  let utter = new SpeechSynthesisUtterance();
  utter.lang = 'en-US';
  utter.voice = voices[65]; // index of the desired voice on this machine
  utter.text = message;
  utter.volume = 1.0; // volume ranges from 0 to 1 (100.0 would just be clamped)
  synth.speak(utter);
}
Works for Edge, Chrome and Safari, and doesn't repeat the sentences.

new Function() inside of AudioWorklet

I want to create an audio editor where you can connect nodes together to create custom audio components. Every time the nodes change, they get compiled into JavaScript and then run via new Function() for better performance. I just read that it's possible to create an AudioWorklet, which runs on a separate thread. Now I am wondering whether there is a way to combine both ideas: my algorithm gets passed to the AudioWorklet as a string of JavaScript code, which is then turned into a function using new Function(codeString) inside the constructor, and the worklet's process() function calls that custom function somehow.
Is this possible in some way, or am I asking for too much? I would like a "yes, that's possible" or a "no, sorry" before I spend hours trying to get it to work...
Thanks for your help,
dogefromage
With the help of @AKX's comment, I put together this solution. The code inside the string will later be replaced by a compiler.
function generateProcessor() {
  return (`
    class TestProcessor extends AudioWorkletProcessor {
      process(inputs, outputs) {
        const input = inputs[0];
        const output = outputs[0];
        for (let channel = 0; channel < output.length; ++channel) {
          for (let i = 0; i < output[channel].length; i++) {
            output[channel][i] = 0.01 * Math.acos(input[channel][i]);
          }
        }
        return true;
      }
    }
    registerProcessor('test-processor', TestProcessor);
  `);
}
const button = document.querySelector('#button');

button.addEventListener('click', async (e) => {
  const audioContext = new AudioContext();
  await audioContext.audioWorklet.addModule(
    URL.createObjectURL(new Blob([
      generateProcessor()
    ], { type: "application/javascript" })));
  const oscillator = new OscillatorNode(audioContext);
  const testProcessor = new AudioWorkletNode(audioContext, 'test-processor');
  oscillator.connect(testProcessor).connect(audioContext.destination);
  oscillator.start();
});
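If you specifically want new Function() inside the worklet, as originally asked, one variation is to pass the compiled code string through processorOptions and build the function in the processor's constructor. This is an untested sketch; userCode is a made-up processorOptions key here, and note that a page CSP without 'unsafe-eval' will block new Function() in the worklet:
// --- worklet module: a generic processor that compiles its own kernel ---
class DynamicProcessor extends AudioWorkletProcessor {
  constructor(options) {
    super();
    // userCode holds an expression in x, e.g. '0.01 * Math.acos(x)'
    this.fn = new Function('x', 'return ' + options.processorOptions.userCode + ';');
  }
  process(inputs, outputs) {
    const input = inputs[0];
    const output = outputs[0];
    for (let channel = 0; channel < output.length; ++channel) {
      for (let i = 0; i < output[channel].length; i++) {
        output[channel][i] = this.fn(input[channel] ? input[channel][i] : 0);
      }
    }
    return true;
  }
}
registerProcessor('dynamic-processor', DynamicProcessor);
// --- main thread: pass the compiled code when creating the node ---
// const node = new AudioWorkletNode(audioContext, 'dynamic-processor', {
//   processorOptions: { userCode: '0.01 * Math.acos(x)' }
// });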

Is there any way to change the gain individually in real time when playing a node that is a composite of two AudioBuffers?

I was having the following problem:
when I call AudioBufferSourceNode.start() with multiple tracks, I sometimes get a delay.
Then, per chrisguttandin's answer, I tried the method using OfflineAudioContext (thanks to chrisguttandin).
I wanted to play two different mp3 files completely simultaneously, so I used an OfflineAudioContext to synthesize an audioBuffer, and I succeeded in playing the synthesized node.
The following is a demo of it:
CodeSandBox
The code in the demo is based on the code in the following page:
OfflineAudioContext - Web APIs | MDN
However, the demo does not allow you to change the gain for each of the two audio tracks.
Is there any way to change the gain of the two tracks during playback?
What I would like to do is as follows:
I want to play two pieces of audio perfectly simultaneously.
I want to change the gain of each of the two tracks in real time.
If the above can be achieved without OfflineAudioContext, there is no need to use it.
The only way I can think of is to run startRendering on every input type="range" change, but I don't think that is practical from a performance standpoint.
Also, I looked for a solution to this problem but could not find one.
Code:
// React: useState and useEffect are imported from "react" in the sandbox
let ctx = new AudioContext(),
  offlineCtx,
  tr1,
  tr2,
  renderedBuffer,
  renderedTrack,
  tr1gain,
  tr2gain,
  start = false;

const trackArray = ["track1", "track2"];

const App = () => {
  const [loading, setLoading] = useState(true);
  useEffect(() => {
    (async () => {
      const bufferArray = trackArray.map(async (track) => {
        const res = await fetch("/" + track + ".mp3");
        const arrayBuffer = await res.arrayBuffer();
        return await ctx.decodeAudioData(arrayBuffer);
      });
      const audioBufferArray = await Promise.all(bufferArray);
      const source = audioBufferArray[0];
      offlineCtx = new OfflineAudioContext(
        source.numberOfChannels,
        source.length,
        source.sampleRate
      );
      tr1 = offlineCtx.createBufferSource();
      tr2 = offlineCtx.createBufferSource();
      tr1gain = offlineCtx.createGain();
      tr2gain = offlineCtx.createGain();
      tr1.buffer = audioBufferArray[0];
      tr2.buffer = audioBufferArray[1];
      tr1.connect(tr1gain);
      tr1gain.connect(offlineCtx.destination);
      tr2.connect(tr2gain); // was tr1gain in the original, which bypassed tr2's own gain
      tr2gain.connect(offlineCtx.destination);
      tr1.start();
      tr2.start();
      offlineCtx.startRendering().then((buffer) => {
        renderedBuffer = buffer;
        renderedTrack = ctx.createBufferSource();
        renderedTrack.buffer = renderedBuffer;
        setLoading(false);
      });
    })();
    return () => {
      ctx.close();
    };
  }, []);
  const [playing, setPlaying] = useState(false);
  const playAudio = () => {
    if (!start) {
      renderedTrack = ctx.createBufferSource();
      renderedTrack.buffer = renderedBuffer;
      renderedTrack.connect(ctx.destination);
      renderedTrack.start();
      setPlaying(true);
      start = true;
      return;
    }
    ctx.resume();
    setPlaying(true);
  };
  const pauseAudio = () => {
    ctx.suspend();
    setPlaying(false);
  };
  const stopAudio = () => {
    renderedTrack.disconnect();
    start = false;
    setPlaying(false);
  };
  const changeVolume = (e) => {
    const target = e.target.ariaLabel;
    target === "track1"
      ? (tr1gain.gain.value = e.target.value)
      : (tr2gain.gain.value = e.target.value);
  };
  const Inputs = trackArray.map((track, index) => (
    <div key={index}>
      <span>{track}</span>
      <input
        type="range"
        onChange={changeVolume}
        step="any"
        max="1"
        aria-label={track}
        disabled={loading ? true : false}
      />
    </div>
  ));
  return (
    <>
      <button
        onClick={playing ? pauseAudio : playAudio}
        disabled={loading ? true : false}
      >
        {playing ? "pause" : "play"}
      </button>
      <button onClick={stopAudio} disabled={loading ? true : false}>
        stop
      </button>
      {Inputs}
    </>
  );
};
As a test, I'd go back to your original solution, but instead of
tr1.start();
tr2.start();
try something like
let t = ctx.currentTime;
tr1.start(t + 0.1);
tr2.start(t + 0.1);
There will be a delay of about 100 ms before audio starts, but they should be synchronized precisely. If this works, reduce the 0.1 to something smaller, but not zero. Once this is working, you can then connect separate gain nodes to each track and control the gains of each in real-time.
Oh, one other thing, instead of resuming the context after calling start, you might want to do something like
ctx.resume()
  .then(() => {
    let t = ctx.currentTime;
    tr1.start(t + 0.1);
    tr2.start(t + 0.1);
  });
The clock isn't running if the context is suspended, and resuming doesn't happen instantly. It may take some time to restart the audio HW.
Oh, another approach, since I see that the buffer you created with the offline context has two channels in it.
Let s be the AudioBufferSourceNode you created (in the main context) from the offline context's rendered buffer.
let splitter = new ChannelSplitterNode(ctx, {numberOfOutputs: 2});
s.connect(splitter);
let g1 = new GainNode(ctx);
let g2 = new GainNode(ctx);
splitter.connect(g1, 0, 0);
splitter.connect(g2, 1, 0);
// the merger needs one input per channel; with numberOfInputs: 1 the
// second connect below would throw
let merger = new ChannelMergerNode(ctx, {numberOfInputs: 2});
g1.connect(merger, 0, 0);
g2.connect(merger, 0, 1);
// Connect merger to the downstream nodes or the destination.
You can now start s and modify g1 and g2 as desired to produce the output you want.
You can remove the gain nodes created in the offline context; they're not needed unless you really want to apply some kind of gain in the offline context.
But if I were doing this, I'd prefer not to use the offline context unless absolutely necessary.
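For completeness, the non-offline version suggested above might look like this (a sketch reusing audioBufferArray from the question's code):
// two live sources, each with its own gain, scheduled to start together
const startAt = ctx.currentTime + 0.1;
const src1 = new AudioBufferSourceNode(ctx, { buffer: audioBufferArray[0] });
const src2 = new AudioBufferSourceNode(ctx, { buffer: audioBufferArray[1] });
const gain1 = new GainNode(ctx);
const gain2 = new GainNode(ctx);
src1.connect(gain1).connect(ctx.destination);
src2.connect(gain2).connect(ctx.destination);
src1.start(startAt);
src2.start(startAt);
// later, from the range-input handlers:
// gain1.gain.value = e.target.value;
// gain2.gain.value = e.target.value;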

Continuous Speech Recognition on browser like "ok google" or "hey siri"

I am doing a POC, and my requirement is to implement a feature like "OK Google" or "Hey Siri" in the browser.
I am using the Chrome browser's Web Speech API. The thing I noticed is that I can't keep the recognition running continuously: it terminates automatically after a certain period of time, and I know that's intentional because of security concerns. I did a hack: when the SpeechRecognition terminates, I start it again from its end event. But that is not the best way to implement such a solution, because if I am using two instances of the same application in different browser tabs, it doesn't work, or if another application in my browser also uses speech recognition, neither application behaves as expected. I am looking for the best approach to solve this problem.
Thanks in advance.
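For reference, the restart-on-end hack described in the question is usually wired up like this (a minimal sketch of the approach, not a recommendation):
const recognition = new webkitSpeechRecognition();
recognition.continuous = true;
let keepListening = true;
// the browser ends the session after a while; start a new one immediately
recognition.onend = () => {
  if (keepListening) recognition.start();
};
recognition.start();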
Since your problem is that you can't run the SpeechRecognition continuously for long periods of time, one way would be to start the SpeechRecognition only when you get some input on the mic.
This way, only when there is some input will you start the SR, looking for your magic_word.
If the magic_word is found, then you will be able to use the SR normally for your other tasks.
This can be detected by the Web Audio API, which is not tied to the time restriction SR suffers from. You can feed it a LocalMediaStream from MediaDevices.getUserMedia.
For more info on the script below, you can see this answer.
Here is how you could attach it to a SpeechRecognition:
const magic_word = ##YOUR_MAGIC_WORD##;

// initialize our SpeechRecognition object
let recognition = new webkitSpeechRecognition();
recognition.lang = 'en-US';
recognition.interimResults = false;
recognition.maxAlternatives = 1;
recognition.continuous = true;

// detect the magic word
recognition.onresult = e => {
  // extract all the transcripts
  var transcripts = [].concat.apply([], [...e.results]
    .map(res => [...res]
      .map(alt => alt.transcript)
    )
  );
  if (transcripts.some(t => t.indexOf(magic_word) > -1)) {
    // do something awesome, like starting your own command listeners
  }
  else {
    // didn't understand...
  }
}

// called when we detect silence
function stopSpeech() {
  recognition.stop();
}

// called when we detect sound
function startSpeech() {
  try { // calling it twice will throw...
    recognition.start();
  }
  catch (e) {}
}

// request a LocalMediaStream
navigator.mediaDevices.getUserMedia({ audio: true })
  // add our listeners
  .then(stream => detectSilence(stream, stopSpeech, startSpeech))
  .catch(e => log(e.message));

function detectSilence(
  stream,
  onSoundEnd = _ => {},
  onSoundStart = _ => {},
  silence_delay = 500,
  min_decibels = -80
) {
  const ctx = new AudioContext();
  const analyser = ctx.createAnalyser();
  const streamNode = ctx.createMediaStreamSource(stream);
  streamNode.connect(analyser);
  analyser.minDecibels = min_decibels;
  const data = new Uint8Array(analyser.frequencyBinCount); // will hold our data
  let silence_start = performance.now();
  let triggered = false; // trigger only once per silence event
  function loop(time) {
    requestAnimationFrame(loop); // we'll loop every 60th of a second to check
    analyser.getByteFrequencyData(data); // get current data
    if (data.some(v => v)) { // if there is data above the given db limit
      if (triggered) {
        triggered = false;
        onSoundStart();
      }
      silence_start = time; // set it to now
    }
    if (!triggered && time - silence_start > silence_delay) {
      onSoundEnd();
      triggered = true;
    }
  }
  loop();
}
Here it is as a plunker, since neither StackSnippets nor jsfiddle's iframes will allow gUM in two versions...
