According to their documentation:
The onspeechstart property of the SpeechRecognition interface is fired when sound that is recognised by the speech recognition service as speech has been detected.
With that in mind, why does SpeechRecognition.onspeechstart fire even when I cough or make other noises (which definitely contain no speech)?
How can I tell whether the sound received is speech or just noise?
This is a bug in the Web Speech API. Hopefully Google will fix it in the future.
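Until that happens, one possible workaround (an assumption on my part, not something the documentation prescribes) is to ignore onspeechstart and only treat audio as speech once a result arrives with a non-empty transcript and a reasonable confidence. The sketch below works on a plain object shaped like a SpeechRecognitionAlternative; the helper name `looksLikeSpeech` and the 0.5 threshold are hypothetical, and some browsers report a confidence of 0 for valid results, so the threshold would need tuning.

```typescript
// Structural subset of SpeechRecognitionAlternative used by this sketch.
interface AlternativeLike {
  transcript: string;
  confidence: number;
}

// Hypothetical helper: true when the alternative has text and its confidence
// reaches the threshold. The default of 0.5 is an arbitrary starting point.
function looksLikeSpeech(
  alternative: AlternativeLike,
  minConfidence: number = 0.5
): boolean {
  const hasText = alternative.transcript.trim().length > 0;
  return hasText && alternative.confidence >= minConfidence;
}
```

In a browser this would be called from the onresult handler, for example with `event.results[i][0]`.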
Related
I asked a question on how to implement speech recognition in an Android WebView, but have had no response for quite a while now. I would like to know if speech recognition is practical in an Android WebView; if so, how does it work? If it doesn't work, can someone please tell me plainly?
Greetings,
I am currently trying to implement speech recognition functionality in my application. According to the JS documentation here, speech to text has been supported since Safari 14.1. I am using the following configuration:
const { webkitSpeechRecognition } = (window as any);
const recognition = new webkitSpeechRecognition();
recognition.lang = 'pt-BR';
recognition.continuous = true;
recognition.interimResults = false;
recognition.maxAlternatives = 1;
// Avoid garbage collection bugs
this.garbage.push(recognition);
recognition.start();
On Chrome it works just fine, but on Safari the recognition results are super bad. It can understand me sometimes, but often it misinterprets my words, giving me wrong results. For example, if I say: "Hello assistant, change contrast", the result might be something like: "Hello assist charge contract hello assist charge charge" or something.
One peculiarity of this problem is that the only events fired by the speech recognition interface on Safari are start and audiostart.
Is anyone facing a similar issue or found a solution to this problem? I am also accepting alternatives for implementing speech recognition on my application.
Thanks in advance!
EDIT
On my end, you can see this problem by visiting any website that relies on the Web Speech API. Some examples that you can check:
https://www.google.com/chrome/demos/speech.html
https://www.audero.it/demo/web-speech-api-demo.html
So, if anyone else stumbles upon this problem, I have filed an issue on the Chromium forum. You can consult the issue here.
Basically, the Chrome team is having some problems integrating this functionality in their browser on iOS devices.
In my case, what I did was use Hark.js to get events when the user starts and stops speaking, paired with Vosk on my backend to do the offline speech-to-text transcription.
IMO the browser speech recognition API is fine if you want your app to run on a specific browser. However, if you wish to target all browsers across different operating systems, I would advise looking for a different solution.
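If you still want to use the browser API where it exists and fall back to something else where it does not, a feature check along these lines may help. This is only a sketch: `SpeechGlobals` and `findRecognitionCtor` are names I made up, and the check only tells you the constructor exists, not that recognition quality is acceptable on that browser.

```typescript
// Any constructor that takes no arguments; the real one builds a recognizer.
type RecognitionCtor = new () => unknown;

// The two global names under which browsers expose the constructor.
interface SpeechGlobals {
  SpeechRecognition?: RecognitionCtor;
  webkitSpeechRecognition?: RecognitionCtor;
}

// Hypothetical helper: prefers the unprefixed constructor, then the
// webkit-prefixed one, and returns null when neither is present.
function findRecognitionCtor(globals: SpeechGlobals): RecognitionCtor | null {
  return globals.SpeechRecognition ?? globals.webkitSpeechRecognition ?? null;
}
```

In a page you would pass `window` (cast as needed) and switch to your backend solution when the result is null.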
If you go to a page that uses the web SpeechRecognition API, like the demo page for Annyang.js, on the Android Chrome browser, you'll notice that once the mic starts listening for speech input, it will make a notification sound just like the Google Now speech input. And since the speech recognition is activated multiple times to make it seem like it is continuously listening, it gets very annoying.
I'd like to have a kiosk tablet that has a page using Annyang.js, or some other SpeechRecognition API library, to always be listening for commands. However, this notification sound would make it infeasible to leave the volume turned up, and I'd like to be able to play some audio.
Is there a way to disable the audible notification when activating the Speech Recognition API in Android Chrome?
This doesn't have to be on the webpage. I can modify the Android device as needed.
I am making a package for GitHub's Atom.
In it, I need to record voice and output it in the window.
How do I record voice and output it as text in CoffeeScript?
1. Record Voice
This has been asked a lot:
Can I use javascript to record voice on a web app?
Capture Audio Input with flash or html5
HTML5 record audio to file
2. Output as text
Now comes the tricky part (not that the previous step is easy): You need speech recognition. Here are some js libraries you could use:
Pocketsphinx.js
annyang!
Alternative: Web Speech API
If you're up to the task of building your own speech recognition and want to learn new stuff, go ahead. I've done it once with tcl/tk and there was a lot of learning but also fun involved. However, it takes some time to get it right, i.e. get usable results. If you don't want to reinvent the wheel, just use the Web Speech API:
Web Speech API Specification
Web Speech API Demo
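For the "output as text" part with the Web Speech API, the result event carries a list of results, each with one or more alternatives. A rough sketch of pulling the finalized text out of such an event is below. It is written against plain objects shaped like the event (the interface names and `finalTranscript` are my own, not part of the API) and in TypeScript rather than CoffeeScript, so treat it as a starting point to port.

```typescript
// Structural subsets of SpeechRecognitionResult and SpeechRecognitionEvent.
interface ResultLike {
  isFinal: boolean;
  length: number;
  [index: number]: { transcript: string };
}

interface ResultEventLike {
  resultIndex: number;
  results: { length: number; [index: number]: ResultLike };
}

// Hypothetical helper: joins the top alternative of every final result,
// starting at the first result that changed in this event.
function finalTranscript(event: ResultEventLike): string {
  const parts: string[] = [];
  for (let i = event.resultIndex; i < event.results.length; i++) {
    const result = event.results[i];
    if (result && result.isFinal) {
      const best = result[0];
      if (best) {
        parts.push(best.transcript.trim());
      }
    }
  }
  return parts.join(" ");
}
```

In the browser this would be called from the recognizer's onresult handler, and the returned string written to whatever element displays the text.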
I'm experimenting with the Demo of the Web Speech API: https://www.google.com/intl/en/chrome/demos/speech.html. You'll need version 25.0 of Chrome at least to run it.
I'm trying to use the Web Speech API continuously for a long transcription (10-15 minutes). However, I'm noticing that after roughly 1-2 minutes there is a "network" error (as mentioned in the Web Speech API Spec: https://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html#dfn-onerror).
Does anyone know why this "network" error is happening pretty consistently after 1-2 minutes and if there is any way to configure the Web Speech API for longer, continuous transcriptions?
Thank you!
The Web Speech API usually limits a recognition session to about 60 seconds.
onend is an event handler, not a method you call yourself. Assign a function to speechrecognition.onend and call speechrecognition.start() inside it, so that recognition resumes each time the session ends.
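A minimal sketch of that restart pattern, assuming the recognizer exposes onend, start() and stop() the way the browser object does at runtime. The interface and the `keepListening` helper are hypothetical names for illustration, and the sketch does not handle errors such as the "network" one, which you may want to check in onerror before restarting.

```typescript
// The members of the recognizer that this sketch relies on.
interface RestartableRecognizer {
  onend: (() => void) | null;
  start(): void;
  stop(): void;
}

// Hypothetical helper: starts recognition and restarts it every time the
// session ends, until the returned function is called.
function keepListening(recognizer: RestartableRecognizer): () => void {
  let active = true;
  recognizer.onend = () => {
    if (active) {
      recognizer.start();
    }
  };
  recognizer.start();
  return () => {
    active = false;
    recognizer.stop();
  };
}
```

Call the returned function when the transcription should really stop; otherwise the end event fired by stop() would start a new session.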