JavaScript Speech-to-Text for blind people - javascript

I'm developing a website, and I would like to help blind people use it by voice, so I will use:
Text-to-speech, to present some possibilities to the user
Speech-to-text, to allow the user to select one of them by voice
I already have some text-to-speech JavaScript libraries (like speak.js), but now I need a good speech-to-text one. There are some solutions for this purpose (like speechapi), but they use Java applets or Flash, and I want to depend only on JavaScript, to avoid plugins.
I'm trying HTML5's speech input with x-webkit-speech and Google Chrome, and it is good, but you need to click an icon (and blind people can't use a mouse well). Is it possible to use x-webkit-speech by pressing a key? Do you know any alternative API (JavaScript)?
Thank you!

Is it possible to use x-webkit-speech by pressing a key?
According to this post and this post, you cannot trigger the start of speech input in any way other than clicking the microphone.
What x-webkit-speech does is use the audio capture capabilities of HTML5 and send the audio to Google's servers for processing, returning the results as JSON. This blogger has reverse engineered it. You could develop a JavaScript library that listens for a key press to start capturing audio on HTML5-enabled browsers and sends it to Google's service, or to one you have created yourself. The downside of using Google's service is that it is an unsupported API and subject to change at any time. The downside of developing your own service is that it can be expensive to develop and maintain.
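As a rough sketch of the key-press idea, the snippet below starts capturing audio when the user presses the S key, using getUserMedia and MediaRecorder (both newer APIs than x-webkit-speech, and not available in every browser); where the captured blob is sent is left as a hypothetical placeholder:
// Hedged sketch: start capturing audio on a key press ("s") using
// getUserMedia and MediaRecorder, where the browser supports them.
// Posting the blob to a speech-to-text service is left as a placeholder.
document.addEventListener('keydown', function (e) {
  if (e.key !== 's') return;
  navigator.mediaDevices.getUserMedia({ audio: true }).then(function (stream) {
    var recorder = new MediaRecorder(stream);
    var chunks = [];
    recorder.ondataavailable = function (ev) { chunks.push(ev.data); };
    recorder.onstop = function () {
      var blob = new Blob(chunks, { type: 'audio/webm' });
      // POST `blob` to your own (hypothetical) speech-to-text endpoint here.
    };
    recorder.start();
    setTimeout(function () { recorder.stop(); }, 5000); // stop after 5 seconds
  });
});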
Do you know any alternative API (JavaScript)?
This post and this post list some services available for speech recognition. I did not see Nuance listed; you may be able to use the Dragon Mobile SDK for this. You may also want to check out iSpeech.

Google Translate has a very good text-to-speech engine, and I use it to read text aloud. For example, if you have the text "Welcome to Stack Overflow", you can call:
http://translate.google.com/translate_tts?ie=UTF-8&q=Welcome%20to%20stack%20overflow&tl=en&total=1&idx=0&textlen=23&prev=input
then use an HTML5 audio element in the browser to play it.
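For example, a minimal sketch (note that this TTS endpoint is undocumented and may change or stop working at any time):
// Play the unofficial Google Translate TTS URL with an HTML5 Audio element.
var phrase = encodeURIComponent('Welcome to stack overflow');
var audio = new Audio('http://translate.google.com/translate_tts?ie=UTF-8&q=' + phrase + '&tl=en');
audio.play();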
For speech input you can manually activate the listening process; see here:
http://code.google.com/chrome/extensions/experimental.speechInput.html

Related

HTML5 Audio Output Patching

I was wondering if there is a way to control audio output device patching in HTML5/JavaScript. For example, if the user wanted one sound in my web app to go out of one audio device, and another sound out of a different audio device. I know the user can set the default output device on their computer, but for the web app I'm working on, I would like them to be able to send individual sounds to individual outputs while other sounds are playing, similar to the interface below (from a program called QLab).
I feel like the obvious answer is NO, and I do not want to resort to using Flash or Java. I MIGHT be okay with having to write some sort of browser plugin that interfaces with JavaScript.
So, after receiving basically zero helpful answers - and finding no further helpful information online - I think I figured out something that we, as developers, NEED to start requesting from browser vendors and the W3C. We need to be able to request hardware access from users, in a similar fashion to how we can request access to a user's location, or how we can request to send a user push notifications.
Until web developers are allowed the same control over hardware as native application developers, we will be left at a huge disadvantage in what we can offer our users. I don't want my users to have to install third- or fourth-party plugins to get a little more control over, and access to, their I/O. Users should not be burdened with keeping more software than just their web browser updated to have websites run well and securely. And I, for one, do not feel it should be necessary to write in more languages than HTML, JavaScript, CSS, and PHP to give users the same experience they would get from a native application.
I have no idea how we approach browser vendors about this, but I feel like it would be good to start doing this.
I know this is a little old, but just this year a method called setSinkId was added that you can call on a media element (video, audio) to set the device its audio will be output to.
$('#video-element')[0].setSinkId('default');
https://developer.mozilla.org/en-US/docs/Web/API/HTMLMediaElement/setSinkId
Though currently it seems only Chrome supports it. I haven't tested on Firefox or other web browsers.
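For a slightly fuller picture, here is a hedged sketch that lists the available output devices and routes an audio element to one of them; the element id and the choice of device are placeholders:
// Sketch: enumerate audio outputs and route an <audio>/<video> element's
// sound to one of them with setSinkId(). Element id and device choice are
// placeholders; support is limited (Chrome at the time of writing).
var player = document.getElementById('player'); // hypothetical element id
navigator.mediaDevices.enumerateDevices().then(function (devices) {
  var outputs = devices.filter(function (d) { return d.kind === 'audiooutput'; });
  if (outputs.length > 1) {
    player.setSinkId(outputs[1].deviceId)
      .then(function () { console.log('audio routed to', outputs[1].label); })
      .catch(function (err) { console.error('setSinkId failed:', err); });
  }
});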
I suggest you take a look at the Web Audio API:
Specs --- Tutorial
There is a destination property in the Web Audio API. However, it is a read-only property, so it is not settable.
Here:
The destination property always correlates to the default hardware output of sound, whether it’s through speakers, attached headphones, or a Bluetooth headset.
I'm working on a sound installation based on Web Audio and have run into the same problem. I want to map different channel outputs to different speakers. Have you had any progress on this?
This gentleman seems to have managed to do it: http://www.brucewiggins.co.uk/?p=311
I tested this out on an Apogee Quartet and it worked, outputting to 8 different channels.
I also found this article useful: http://www.html5audio.org/2013/03/surround-audio-comes-to-the-web.html
if (context.destination.maxChannelCount >= 4) {
  // the hardware reports at least 4 output channels, so use them
  context.destination.channelCount = 4;
}
// otherwise, let's down-mix to 2.0
else {
  context.destination.channelCount = 2;
}
context.destination.channelCountMode = "explicit";
context.destination.channelInterpretation = "discrete";
// note: context.destination.numberOfOutputs is read-only and cannot be assigned
While you can certainly use the splitter and merger nodes to assign sounds to specific channels on the output, the actual devices you output to are abstracted by the browser and are inaccessible to your code.
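For reference, a small sketch of that splitter/merger routing; the channel indices here are arbitrary examples:
// Sketch: send an oscillator to the third channel of a 4-channel output
// using a ChannelMergerNode. Channel indices are arbitrary examples.
var ctx = new AudioContext();
var osc = ctx.createOscillator();
var merger = ctx.createChannelMerger(4); // merger with 4 inputs
if (ctx.destination.maxChannelCount >= 4) {
  ctx.destination.channelCount = 4;
  ctx.destination.channelCountMode = 'explicit';
  ctx.destination.channelInterpretation = 'discrete';
}
osc.connect(merger, 0, 2);   // oscillator output 0 -> merger input 2 (third channel)
merger.connect(ctx.destination);
osc.start();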
I have done some experiments with 8-channel virtual audio cables and relaying that data to other sound devices outside of the browser. Unfortunately, I can't find a browser that will actually open an 8-channel sound card with more than 2 channels.
Hopefully, browsers in the future will provide more options. This flexibility will never come directly to JavaScript, nor should it. This is an abstraction done for you, and if the browser handles it correctly, it won't be an issue.

Adding audio/video calls in HTML5 app

I'm working on an HTML5 app that lets several users work on one document. I need to add the possibility for the users editing the same document to talk to each other, and I just don't know how to start. Here are my questions:
Is there an HTML5 library that allows transferring sound from the microphone between clients?
What about streaming video from camera?
What is the easiest server-side solution for that?
Any thoughts are strongly appreciated! So don't be shy! :)
UPD: please note that I need the ability for more than two users to talk.
For this you can use WebRTC.
However, this is a very young and unfinished technology that, as already stated, is currently available only in Chrome stable and Firefox beta. This means there will probably be changes to the current spec, something to be aware of in case of early implementation. But it allows you to use video and audio communication directly in the browser.
Quick-start here:
http://www.html5rocks.com/en/tutorials/webrtc/basics/
Other options are Flash-based plugins such as flash-videoio. This is an open source plugin but will naturally require Adobe Flash to be installed. That may or may not be a problem depending on the company's security policy.
For technical details on implementation please see examples on the provided links.
For many-to-many you can use any of the following (a minimal client-side sketch follows the MCU examples below):
"Mesh" - everybody connects to everybody. This, however, is costly on CPU, and mobile devices are often left out.
"Star" - everybody connects through the most capable device. However, with many connections this will soon become slow for the device handling them all.
MCU - a specialized server that handles all connections. It mixes audio and video and handles drop-outs without affecting the other callers.
Examples of MCU's:
http://sourceforge.net/projects/mcumediaserver/ (open source)
http://www.medooze.com/products/mcu.aspx (commercial)
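Whichever topology you choose, the browser side starts the same way; here is a rough getUserMedia/RTCPeerConnection sketch, where sendToPeer() is a hypothetical signalling function (e.g. over WebSockets) and the remote video element id is a placeholder:
// Minimal sketch: capture local audio/video and set up one peer connection.
// sendToPeer() is a hypothetical signalling function; in a mesh you would
// repeat this once per remote participant.
var pc = new RTCPeerConnection({ iceServers: [{ urls: 'stun:stun.l.google.com:19302' }] });
navigator.mediaDevices.getUserMedia({ audio: true, video: true }).then(function (stream) {
  stream.getTracks().forEach(function (track) { pc.addTrack(track, stream); });
  return pc.createOffer();
}).then(function (offer) {
  return pc.setLocalDescription(offer);
}).then(function () {
  sendToPeer({ sdp: pc.localDescription }); // signalling transport is up to you
});
pc.onicecandidate = function (e) {
  if (e.candidate) sendToPeer({ candidate: e.candidate });
};
pc.ontrack = function (e) {
  document.getElementById('remoteVideo').srcObject = e.streams[0]; // hypothetical element
};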
You are searching for navigator.getUserMedia(),
which allows the various users to share video, audio and data.
The support is very low... only Chrome and the latest versions of Opera and Firefox support it,
and there is no support at all on mobile devices... maybe in the next Android Chrome, I don't know.
As there is much to talk about and I have no clue how you want to set everything up, I suggest you read a little more about it at these URLs...
http://caniuse.com/stream
http://www.html5rocks.com/en/tutorials/getusermedia/intro/
http://dev.w3.org/2011/webrtc/editor/getusermedia.html
https://developer.mozilla.org/en-US/docs/WebRTC/navigator.getUserMedia
http://my.opera.com/core/blog/2011/03/23/webcam-orientation-preview
http://simpl.info/getusermedia/
And a server-side solution? Nah... that's not a good solution.
Client-side is the way to go.
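As a bare-bones illustration of getUserMedia itself, here is a local preview sketch; the element id is a placeholder:
// Sketch: show the local camera/microphone stream in a <video> element.
// The element id is a placeholder.
navigator.mediaDevices.getUserMedia({ audio: true, video: true })
  .then(function (stream) {
    var video = document.getElementById('localPreview');
    video.srcObject = stream;
    video.play();
  })
  .catch(function (err) {
    console.error('getUserMedia failed:', err);
  });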
Not sure if you're required to do it yourself from scratch or are able to use third-party libraries/tools.
If you can, I would recommend using TokBox, which has support for WebRTC and an SDK for iOS.
Their API is simple and easy to use.

Interfacing a midi keyboard or other real-time midi input with javascript?

I want to create a simple visualization tool that would let me represent my playing on a MIDI keyboard on the screen. I play a relatively novel instrument type, called the harmonic table:
http://en.wikipedia.org/wiki/Harmonic_table_note_layout
http://www.soundonsound.com/newspix/image/axis49.jpg
And I want to build tools to ease their use and to teach others how to use them.
However, I can't find a good way to get MIDI into a JavaScript environment (or, for that matter, into Flash or Java without a large helping of jiggery-pokery slightly beyond my reach, and the use of code from what look to be rather stale and abandoned open source projects; neither language is one I am enthused to work in for this project in any case).
Is there a suitable library or application that I have missed, that will enable me to do this?
While searching around for another solution (Flash-based, using the functions of the Red5 open source Flash server; really ugly, but I'm desperate at this point) I found someone who had done exactly what I needed using Java to interface with the hardware. They had started with a Flash solution and recently ported it to JavaScript. Yay!
http://www.abumarkub.net/abublog/?p=505
Don't let the caveats about 'proof of concept' discourage you: the basic functionality appears solid, at least with everything I have been able to throw at it.
So now I'm on my way, and so is anyone else who wants to build JavaScript-based MIDI interfaces/synths/what have you.
I can manipulate real-time midi in javascript! This is much better than flying cars and jetboots.
I have made an NPAPI browser plugin that lets you communicate with MIDI devices from JavaScript.
Currently it works on OSX in Chrome and Safari, and on Windows in Chrome.
I am working on support for other browsers (Firefox, Internet Explorer) and other operating systems (Linux, Android, iOS).
See: http://abumarkub.net/abublog/?p=754
EDIT:
I recently published this module: https://github.com/hems/midi2funk. It's a node.js module that listens to MIDI and broadcasts it through socket.io, so if you have the luxury of running a node.js service locally alongside your client side, you might get some fun out of it...
~~~~~
A few other handy links, roughly ordered by what I think would be most important for you:
midibridge.js - A Javascript API for interacting with MIDI devices
midi.js sequencing in javascript
jasmid - A Javascript MIDI file reader and synthesiser
Second web midi api working draft published - 11/12/2012
Jazz Soft - MIDI IN / OUT PLUGIN FOR BROWSER
edit: just realised the thread is old, hopefully the links will help someone ( :
The Web MIDI API is now real in Google Chrome 43+. I even wrote a library to make it easier to work with it. If you are still interested and do not care that it currently only works in Chrome, check it out: https://github.com/cotejp/webmidi
Nowadays browsers support listening to MIDI. All you need is
navigator.requestMIDIAccess().then(requestMIDIAccessSuccess, requestMIDIAccessFailure);
and then listen for incoming messages:
function requestMIDIAccessSuccess(midi) {
  var inputs = midi.inputs.values();
  for (var input = inputs.next(); input && !input.done; input = inputs.next()) {
    console.log('midi input', input);
    input.value.onmidimessage = midiOnMIDImessage;
  }
  midi.onstatechange = midiOnStateChange;
}

// minimal handlers; event.data is a Uint8Array of [status, data1, data2]
function midiOnMIDImessage(event) { console.log('midi message', event.data); }
function midiOnStateChange(event) { console.log('midi state change', event.port.name); }
function requestMIDIAccessFailure(error) { console.log('no MIDI access', error); }
See working example here
Most browsers don't allow access to any hardware except the keyboard and mouse - for obvious security reasons, so it's unlikely that you could access a midi device unless it's plugged in as one of those devices.
You could try finding a driver that would translate midi output to key presses, and then deal with those in the browser, but this would be a single-computer solution only.
I am really excited by the upcoming Web MIDI API. As far as I know, it's only under discussion and hasn't made it into any browsers yet.
Combined with the Web Audio API which has started to be implemented in some browsers already, it will be possible to have complete software synthesis in the browser. Exciting times.
Since the Web MIDI API is still a draft, there is no way to access MIDI events directly in the browser.
A simple workaround could be to write a small server that registers MIDI events and relays them to your JavaScript over a WebSocket. This could be done quite easily in Python.
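The browser side of that workaround could look roughly like this; the port and the JSON message shape are assumptions about the hypothetical relay server:
// Sketch: receive MIDI events relayed by a hypothetical local server over a
// WebSocket. The port and the message format are assumptions.
var socket = new WebSocket('ws://localhost:8765');
socket.onmessage = function (event) {
  var msg = JSON.parse(event.data); // e.g. { status: 144, note: 60, velocity: 100 }
  if ((msg.status & 0xf0) === 0x90 && msg.velocity > 0) {
    console.log('note on:', msg.note);
  } else {
    console.log('other MIDI message:', msg);
  }
};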

Which Browser/IDE for rapid add-on development/prototyping?

I'd like to develop an extension for a browser which does the following.
Prerequisite: Text has been selected, add-on has been triggered (e.g. by click in context menu)
read selected text
pass the text to a (e.g. RESTful) webservice
retrieve a list of comments from the webservice
show them in the browser
optional: show also an input field below to send another comment to the webservice
Writing a Firefox add-on turned out to be quite annoying, since I haven't found proper documentation or an IDE (with a handy build process).
Which Browser/IDE combination do you recommend for rapid add-on development/prototyping?
For Google Chrome, you use web technologies to create extensions (AFAIK Firefox is the same). Google Chrome extensions are documented pretty well: http://code.google.com/chrome/extensions/index.html
For the case you have mentioned, I have answered another user on how to capture selected text and send it to a web service, with a working example that you can learn from if you want:
Chrome Extension: how to capture selected text and send to a web service
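As a rough sketch of that flow in a background script (the menu id and service URL are placeholders, and the manifest wiring is omitted):
// Background script sketch: add a context-menu item for selected text and
// POST the selection to a web service. The menu id and URL are placeholders,
// and the "contextMenus" permission must be declared in the manifest.
chrome.contextMenus.create({
  id: 'send-selection',
  title: 'Send selection to service',
  contexts: ['selection']
});
chrome.contextMenus.onClicked.addListener(function (info) {
  if (info.menuItemId !== 'send-selection') return;
  fetch('https://example.com/comments', {        // hypothetical endpoint
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text: info.selectionText })
  }).then(function (res) {
    return res.json();
  }).then(function (comments) {
    console.log('comments from service:', comments);
  });
});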
Regarding the tools you can use, it depends on what you're comfortable with. Personally, I just use an editor with syntax highlighting, such as Vim, Notepad2, etc. Some people use Dreamweaver, Emacs, Notepad, etc. In the end it all comes down to your taste.
Good luck!

Best option for capturing audio and video online

We are looking for various options that will help us record audio and video through the web on various platforms, including iPhone and iPad. The recorded media will be saved on the server. Any suggestions would be helpful... We are looking for a cross-browser approach.
Thanks and Regards
I hate to say it, but the only way to do it on the desktop in a cross-browser fashion would be Adobe Flash. On iOS you need to develop native apps for that.
HTML5 will provide a device API at some point in the future to achieve your goal.
You have to use Flash. Flash can access your webcam and microphone.
Of course, Flash won't fly on iDevices.
There, you'll need to write a native app. :)
Because you need to gain hardware control, you'll most likely need a native application that can access the hardware drivers and API.
My guess is that Java may be able to do the job.
Here's another StackOverflow question that may have an embedded solution.
Never mind: the iFamily doesn't support Java either.
