Auto lips sync according to mp3/wav - javascript

I'd like to create animated heads in my web apps. It seems that CSS3 transitions, animations and backgrounds, with a little help from JavaScript web APIs, are all I need. Using xface looks like overkill to me; a cartoon-style solution is almost all I need.
I've made some progress already (being able to create a voice-controlled web app), but this time I need mp3/wav input, not direct voice from the microphone sent to Google's servers through x-webkit-speech.
I am considering this approach:
1. Record the speech into an mp3 or wav file and write down its contents as a string.
2. Play the mp3 in the browser and detect the ends of words using AnalyserNode to synchronize the position in the string (I use Czech, which, unlike English, has an almost constant speech speed).
3. Display the cartoon heads (see the link above) according to the letter currently being spoken.
The question: Is there a lower-effort approach (shorter development time for coder and designer)? Step 2 in particular, and English in the future, worry me. Maybe some karaoke tool could produce a speech sync file (which I could parse into CSS3 keyframes)? I am not aware of any such tool.
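For reference, step 2 could be sketched roughly as below. This is only a sketch under assumptions: the function names, the threshold and the minimum gap length are hypothetical tuning choices, and the loudness values are assumed to be sampled at a fixed interval (for example from an AnalyserNode while the file plays).

```javascript
// Hypothetical helper: find the ends of words in a loudness envelope.
// `levels` holds one loudness value (0..1) per analysis frame, sampled
// every `frameMs` milliseconds while the audio plays.
function findWordEnds(levels, frameMs, threshold, minGapFrames) {
  const ends = [];
  let inWord = false;
  let quiet = 0; // consecutive quiet frames seen after a loud one
  for (let i = 0; i < levels.length; i++) {
    if (levels[i] >= threshold) {
      inWord = true;
      quiet = 0;
    } else if (inWord) {
      quiet++;
      if (quiet >= minGapFrames) {
        // The word ended where this run of quiet frames began.
        ends.push((i - quiet + 1) * frameMs);
        inWord = false;
        quiet = 0;
      }
    }
  }
  // A word that is still running when the data stops ends at the last loud frame.
  if (inWord) ends.push((levels.length - quiet) * frameMs);
  return ends;
}

// Pair each word of the transcript with the time at which it ended.
function alignWords(text, wordEnds) {
  const words = text.split(/\s+/).filter(Boolean);
  return words
    .slice(0, wordEnds.length)
    .map((word, i) => ({ word, endMs: wordEnds[i] }));
}
```

Short pauses inside a word and words run together without a pause will confuse this, so the threshold and gap length would need tuning per recording.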

For something more involved you might try:
Step 1. Web speech API to text to voice...
http://updates.html5rocks.com/2013/01/Voice-Driven-Web-Apps-Introduction-to-the-Web-Speech-API
Step 2. Try porting "papagayo" to JS (it uses a dictionary to relate words to phonemes to mouth poses, I believe)
http://anime.smithmicro.com/papagayo.html
The GNU source is available here:
http://anime.smithmicro.com/update_files/papagayo/papagayo_1.2_source.zip
You might also refer to:
http://www.adobe.com/devnet/flash/articles/lip-sync-smartmouth.html
for an overview of what you're trying to achieve
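To illustrate the dictionary idea (word to phonemes to mouth poses), here is a minimal sketch. It is not Papagayo's code: the two dictionary entries, the pose names and the even spacing of phonemes within a word are all assumptions made up for the example.

```javascript
// Hypothetical, hand-written pronunciation dictionary (word -> phonemes).
const PHONEMES = {
  hello: ["HH", "EH", "L", "OW"],
  world: ["W", "ER", "L", "D"],
};

// Hypothetical mapping from phonemes to mouth pose names
// (these could be image names or CSS class names).
const MOUTHS = {
  HH: "etc", EH: "E", L: "L", OW: "O", W: "WQ", ER: "E", D: "etc",
};

// Spread the phonemes of each word evenly across that word's time span.
// `words` is a list of { word, startMs, endMs } entries.
function mouthTimeline(words) {
  const frames = [];
  for (const { word, startMs, endMs } of words) {
    const phonemes = PHONEMES[word.toLowerCase()] || [];
    const step = (endMs - startMs) / (phonemes.length || 1);
    phonemes.forEach((p, i) => {
      frames.push({
        atMs: Math.round(startMs + i * step),
        mouth: MOUTHS[p] || "rest",
      });
    });
    frames.push({ atMs: endMs, mouth: "rest" }); // close the mouth after each word
  }
  return frames;
}
```

The resulting list of timed poses is the kind of data you could turn into CSS3 keyframes or step through with a timer.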

Maybe you could do something really quick and dirty with spectrum analysis:
http://0xfe.muthanna.com/wavebox/
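A quick and dirty version might look something like the sketch below, which uses loudness rather than a full spectrum: the louder the audio, the wider the mouth. It assumes the Web Audio API is available; the helper names, the four mouth images and the 0.3 ceiling are made-up values for illustration.

```javascript
// Root-mean-square loudness of one block of unsigned 8-bit samples, the
// format AnalyserNode.getByteTimeDomainData() fills in (128 = silence).
function rms(bytes) {
  let sum = 0;
  for (let i = 0; i < bytes.length; i++) {
    const v = (bytes[i] - 128) / 128; // scale to -1..1
    sum += v * v;
  }
  return bytes.length ? Math.sqrt(sum / bytes.length) : 0;
}

// Map loudness to one of `frameCount` mouth images (0 = closed).
// `maxLevel` is an assumed tuning value, not something measured.
function mouthFrame(level, frameCount, maxLevel) {
  const scaled = Math.min(level / maxLevel, 1);
  return Math.round(scaled * (frameCount - 1));
}

// Browser-only wiring, shown for context: feeds an <audio> element through
// an AnalyserNode and reports a mouth frame on every animation frame.
function startMouth(audioElement, onFrame) {
  const ctx = new AudioContext();
  const source = ctx.createMediaElementSource(audioElement);
  const analyser = ctx.createAnalyser();
  source.connect(analyser);
  analyser.connect(ctx.destination);
  const data = new Uint8Array(analyser.fftSize);
  (function tick() {
    analyser.getByteTimeDomainData(data);
    onFrame(mouthFrame(rms(data), 4, 0.3));
    requestAnimationFrame(tick);
  })();
}
```

This gives a flapping mouth rather than real lip sync, but it needs no transcript at all.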

Related

Tesseract in a specific information

I want to scan a Spanish DNI, get some information from it and print it on the screen. A DNI has this form: [image]
I want to take the fields DNI, Nombre and Apellidos (in the image, these would be 99999999R, CARMEN, ESPAÑOLA ESPAÑOLA).
I thought the best way would be to use a "cut tool" and run the OCR on the cut images. What do you think? I have to build the project in HTML/JS and I don't really know how to program this.
Thanks.
This is not an easy task and to do it, you need to do the following:
Make sure you "cut" the image precisely around the borders. This method needs to be robust to lighting conditions, low-contrast situations, etc. Ideally, it should use advanced computer vision and ML techniques.
Then you need to define where the individual fields are. This is also not an easy task, because the sizes and positions of the fields vary between different IDs.
In the final step, you need to have a very reliable OCR tool, one which would give you a low error rate, so that you actually have a benefit of doing this automatically, compared to just retyping all these fields manually. Although OCR seems like an easy problem today, it's still very hard, especially on ID documents which can be worn out and damaged and taken in weird lighting conditions.
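One cheap way to catch OCR misreads in the DNI field is to verify the check letter: the letter of a Spanish DNI is determined by the eight-digit number modulo 23. The sketch below shows only that check; the function name is made up for the example and the OCR step itself is not shown.

```javascript
// Check letters for the Spanish DNI, indexed by (number mod 23).
const DNI_LETTERS = "TRWAGMYFPDXBNJZSQVHLCKE";

// Returns true when `text` is eight digits followed by the matching check letter.
function isValidDni(text) {
  const match = /^(\d{8})([A-Z])$/.exec(text.trim().toUpperCase());
  if (!match) return false;
  return DNI_LETTERS[Number(match[1]) % 23] === match[2];
}
```

If the OCR result fails this check, you know at least one character was misread and can ask the user to rescan or retype the field. It does nothing for the Nombre and Apellidos fields, which have no checksum.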
My company Microblink has spent years working on ID scanning, not just for Spanish DNIs, but also for many other document types (there are more than 5000 different types in the world).
If you are interested in reading how we're doing it, here are some of the materials:
Goodbye Templates
BlinkID v5
From OCR to DeepOCR
As for the "cut tool" - we do have a feature that allows you to automatically capture the image of a document and crop it around the edges of the document. We call it "Document capture" and it's a part of our BlinkID SDK.
As for the HTML/JS - it's not clear what exactly you need, but we do have a React Native and Cordova plugins which allow you to build cross-platform mobile apps in JS, and we also have a Frontend SDK and Web API which allow you to scan documents in any browser.

can you edit audio to play different amounts of sound through the 2 speakers of headphones using JavaScript or html5

Holophonic sound is when a sound is played through headphones with more or less volume in one ear than the other, to mimic the way you hear sound in real life. With this you can actually tell where a sound seems to be coming from. I was wondering if I could mimic this and control how much sound comes out of the left and right speakers for different sounds using JavaScript or HTML5. If there is another language I would have to learn, please tell me what it is. If you can use JavaScript and/or HTML5, tell me where I should look to learn it. Thank you for your answer.
The answer is Yes. If it can be programmed in JavaScript it will be programmed in JavaScript. Also HTML5 would probably play a role with the Audio tag. Both work together to create the interaction.
JavaScript is a very powerful language and it is the most popular open source community at the moment. What that means is there is a large variety of modules to choose from. So you need a tool.
STEP #1: Use tools like NPM or GitHub. My first step for such a question would be to search for your key words on NPM like this:
https://www.npmjs.com/search?q=binaural+holophonic+stereo
Or GitHub and Bower:
https://github.com/search?utf8=%E2%9C%93&q=binaural+holophonic+stereo
http://bower.io/search/?q=binaural%20holophonic%20stereo
STEP #2: See what is out there...
Surprisingly GitHub and Bower did not return results, but NPM did.
It looks like there are some good things there...
- web-audio-school
- audiosynth
- augiobugger
- sink
STEP #3: Does it fit your requirements?
The next step is to hit the readme files of each to see if these libraries have the holophonic capabilities that you need.
STEP #4: But fret not if it doesn't exist.
Because of Atwood's Law: any application that can be written in JavaScript will eventually be written in JavaScript.
At least that is the trend we see is proving itself over the years. I personally know a group of computer music hackers in my city. And I can guarantee that if it has not been done - this is the type of thing that they are looking out for to solve.
OVERVIEW
So, if it is possible (which it sounds like it is), then rest with some sense of inevitability that it will turn up on NPM eventually. And if not, it is you that has the pleasure of creating it, once you've mastered JavaScript modularization methodology. I've seen a few of these technologies surface now through NPM and the like. Web Audio is a specific area of advancement; look out for the impact of ES6 there, and Web Workers might help you eventually.
TECHNICALLY:
Diving in technically, if a library does not exist: it sounds like what is required is JS interoperability with the headphone hardware interface, which is increasingly a capability of JS. But you may be slightly ahead of the curve here. It is possible, but the key word that would be missing if it does not exist is interoperability. The need would be for JS-to-hardware interface interoperability... which, by the way, is exactly what JS does really well. So in concept, yes, this is reasonable. In implementation, I can't say. It requires a search.
SEARCHING JAVASCRIPT AUDIO API VOLUME (gives great results):
This one from CreativeJS is highly recommended. Start here (study it):
http://creativejs.com/resources/web-audio-api-getting-started/
And of course MDN is always best otherwise:
https://developer.mozilla.org/en-US/docs/Web/API/AudioChannels_API/Using_the_AudioChannels_API
This looks important (though maybe related indirectly):
http://codetheory.in/controlling-the-volume-of-an-audio-file-in-javascript/
And of course StackOverflow has a great link (maybe similar, with clues):
Correct audio volume output decibels using HTML5 / javascript, turn off pre-processing
In conclusion to your question:
Yes Definitely learn JavaScript.
But also learn the command line tools, because they get you what you want.
Git, NPM, Bower...
All in all, even if you don't finish your holophonic proof of concept - you are learning valuable professional skills. JavaScript is a ripe ecosystem for a hobby deep-dive into Web Audio capabilities. It is a great idea - and if nothing exists... it is up to you to become the creator that gives it to the open-source world.
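For what it's worth, the left/right balance part can be sketched with the Web Audio API alone. This is a minimal sketch, not a holophonic implementation: `panGains` and `playPanned` are names made up for the example, the equal-power curve is one common choice, and real 3D placement would more likely use a PannerNode with its HRTF panning model.

```javascript
// Equal-power gains for a pan position from -1 (full left) to +1 (full right).
function panGains(pan) {
  const p = Math.max(-1, Math.min(1, pan)); // clamp to the valid range
  const angle = ((p + 1) / 2) * (Math.PI / 2);
  return { left: Math.cos(angle), right: Math.sin(angle) };
}

// Browser-only: play an <audio> element with the given left/right balance
// by routing it through a StereoPannerNode.
function playPanned(audioElement, pan) {
  const ctx = new AudioContext();
  const source = ctx.createMediaElementSource(audioElement);
  const panner = ctx.createStereoPanner();
  panner.pan.value = pan;
  source.connect(panner);
  panner.connect(ctx.destination);
  return audioElement.play();
}
```

Calling something like `playPanned(document.querySelector("audio"), -0.5)` would shift the sound towards the left ear; changing `panner.pan.value` over time moves it around.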

Saving Div Content As Image On Server

I have been learning a bit of jQuery and .NET in VB. I have created a product customization tool of sorts that basically layers up divs and adds text, images etc. on top of a T-shirt.
I'm stuck on an important stage!
I need to be able to convert the content of the div that wraps all these divs of text and images to one flat image taking into account any CSS that has been applied to it also.
I have heard of things that I could use to screen capture the content of a browser on the server which could be possible for low res thumbs etc, but it sounds a little troublesome! and it would really be nice to create an image of high res.
I have also heard of converting the HTML to an HTML5 canvas and then writing that out... but it looks too complicated for me to fathom, and browser support is an issue.
Is this possible in .NET?
Perhaps something with javascript could be done?
Any help or guidance in the correct direction would be appreciated!
EDIT:
I'm thinking perhaps I could do with two solutions for this. Ideally I would end up with a normal res jpg/png etc for displaying on the website, But also a print ready high res file would be very desirable as well.
PostScript Printer - I have heard of it but I'm struggling to find a good resource to understand it for a beginner (especially with wiki black out). Perhaps I could create a html page from my div content and send it to print to a EPS file. Anyone know any good tutorials for this?
We did this... about 10 years ago. Interestingly, the tech available really hasn't changed too much.
update - Best Answer
Spreadshirt licenses their product: http://blog.spreadshirt.net/uk/2007/11/27/everyones-a-designer-free-designers-for-premium-partners/
Just license it. Don't do this yourself, unless you have real graphics manipulating and print production experience. I'd say in today's world you're looking at somewhere around 4,000 to 5,000 hours of dev time to duplicate what they did... And that's if you have two top tier people working on it.
Short answer: you can't do it in html.
Slightly longer answer:
It doesn't work in part because you can't screen cap the client side and get the level of resolution needed for production type printing. Modern screen resolution is usually on the order of 100 ppi. For a decent print you really need something between 3 and 6 times that density. Otherwise you'll have lots of pixelation and it will generally look like crap when it comes out.
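To put numbers on that, here is a small sketch of the arithmetic. The 10 x 12 inch print area and the 300 ppi target are example values only, and the function names are made up.

```javascript
// Pixel dimensions needed to print an area of the given size at a given density.
function printPixels(widthInches, heightInches, ppi) {
  return {
    width: Math.round(widthInches * ppi),
    height: Math.round(heightInches * ppi),
  };
}

// Density you actually get when an image of `pixelWidth` pixels is
// stretched across `printWidthInches` inches.
function effectivePpi(pixelWidth, printWidthInches) {
  return pixelWidth / printWidthInches;
}

// Example: a 10 x 12 inch print area at 300 ppi.
const needed = printPixels(10, 12, 300); // { width: 3000, height: 3600 }

// Example: a 1000-pixel-wide screen capture printed 10 inches wide.
const captured = effectivePpi(1000, 10); // 100
```

A screen capture of the design area is typically nowhere near 3000 pixels wide, which is why it pixelates on paper.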
A different Answer:
Your best bet is to leverage something like SVG (scalable vector graphics) and provide a type of drawing surface to the browser. There are several ways of doing this using Flash (Spreadshirt.com uses this) or Silverlight (not recommended). We used flash and it was pretty good.
You might be able to get away with using HTML 5. Regardless, whatever path you pick is going to be complicated.
Once the user is happy with their drawing and wants to print it out, you create the final file and run a process to convert it to Postscript or whatever format your t-shirt provider needs. The converter (aka RIP software) is going to either take a long time to develop or cost a bunch of money... pick one. (helpful hint: buy it. Back then, we spent around $20k US and it was far cheaper than trying to develop).
Of course, this ignores issues such as color matching and calibration. This was actually our primary problem. Everyone's monitor is slightly different and what looks like red on one machine is pink on another.
And for a little background, we were doing customized wrapping paper. The user added text, selected images from our library or uploaded their own, and picked a pattern. Our prints came out on large-format HP Inkjet printers (36" and 60" wide). Ultimately we spent between $200k and $300k just on dev resources to make it happen... and it did, unfortunately, the price point we had to sell at was too high for the market.
If you can use a server-side tool, check out PhantomJS. It is a headless WebKit browser (with no GUI) which can take a screenshot of a page and is driven through a JavaScript API. It should do the trick.
Send the whole div with user generated content back to server using ajax call.
Generate an HTML Document on server using 'HtmlTextWriter' class.
Then you can convert that HTML file using external tools like
(1) http://www.officeconvert.com/products_website_to_image.htm#easyhtmlsnapshot
(2) http://html-to-image.acasystems.com/faq-html-to-picture.htm
which are not free tools, but you can use them by creating new Process on server.
The best option I came across is wkhtmltopdf. It comes with a tool called wkhtmltoimage. It uses QtWebKit (A Qt port of the WebKit rendering engine) to render a web page, and converts the result to PDF or image format of your choice, all done at server side.
Because it uses WebKit, it renders everything (images, css and even javascript) just like a modern browser does. In my use case, the results have been very satisfying and are almost identical to what browsers would render.
To start, you may want to look at how to run external tools in .NET:
Execute an external EXE with C#.NET
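The same idea in JavaScript, for anyone running Node on the server instead of .NET, might look like the sketch below. It assumes `wkhtmltoimage` is installed and on the PATH; the helper names are made up, and only a few of the tool's options are passed through.

```javascript
// Build the argument list for: wkhtmltoimage [options] <input> <output>
function wkhtmltoimageArgs(inputPath, outputPath, options = {}) {
  const args = [];
  if (options.format) args.push("--format", options.format);
  if (options.width) args.push("--width", String(options.width));
  if (options.zoom) args.push("--zoom", String(options.zoom));
  args.push(inputPath, outputPath);
  return args;
}

// Run the external tool. `callback` receives (error, stdout, stderr).
// Assumes Node.js with wkhtmltoimage available on the PATH.
function renderToImage(inputPath, outputPath, options, callback) {
  const { execFile } = require("child_process");
  execFile(
    "wkhtmltoimage",
    wkhtmltoimageArgs(inputPath, outputPath, options),
    callback
  );
}
```

For example, `renderToImage("design.html", "design.png", { format: "png", zoom: 3 }, done)` would render a saved copy of the div at three times its on-screen size, where `design.html` and `done` are placeholders for your own file and callback.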

Javascript - Get Sound Data

I'm wondering if any new HTML5 functions or existing JS library would allow me to access information about the sound that's currently playing in an Audio object. For example, I'd like to be able to access an array of the ranges that a song is currently playing at (that is, low values appear for deep bass sounds and higher values appear for shriller sounds). I'm not a sound engineer, so I'm not quite sure what the correct terminology is.
A comparable library would be the C++ BASS library (http://www.un4seen.com/), although I certainly don't need the same breadth of functionality.
I did a little more digging around and found this: chromium.googlecode.com/svn/trunk/samples/audio/visualizer-gl.html
It's pretty much what I'm looking for, but I can't figure out how it works. Thoughts?
The chromium visualizer uses the Web Audio API.
Firefox offers the Audio Data API.
These are the two options available at the moment, and they're not compatible with each other. Eventually an agreement will be reached.
If you intend to do something cross-browser, you're condemned to using Flash for now, there is a pretty good library called SoundManager2 that gives you the necessary data. Check out their visualization demos.
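With the Web Audio API, the "array of ranges" is what an AnalyserNode hands back from getByteFrequencyData(): one value from 0 to 255 per frequency bin, low frequencies first. A rough sketch of turning those bins into bass and treble levels follows; the helper names and band limits are assumptions for the example.

```javascript
// Frequency in Hz at which bin `index` starts, for a given FFT size.
function binFrequency(index, sampleRate, fftSize) {
  return (index * sampleRate) / fftSize;
}

// Average level (0..255) of the bins that fall inside [lowHz, highHz).
function bandLevel(bins, sampleRate, fftSize, lowHz, highHz) {
  let sum = 0;
  let count = 0;
  for (let i = 0; i < bins.length; i++) {
    const hz = binFrequency(i, sampleRate, fftSize);
    if (hz >= lowHz && hz < highHz) {
      sum += bins[i];
      count++;
    }
  }
  return count ? sum / count : 0;
}

// Browser-only usage, for context:
//   const bins = new Uint8Array(analyser.frequencyBinCount);
//   analyser.getByteFrequencyData(bins);
//   const bass = bandLevel(bins, ctx.sampleRate, analyser.fftSize, 20, 250);
```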

Waveform visualization in JavaScript from audio [duplicate]

This question already has answers here:
How to write a web-based music visualizer?
(4 answers)
Closed 5 years ago.
I'm trying to use JavaScript to display the waveform for an audio file, but I don't even know how to get started. I found the Audio Data API, but I am unfamiliar with most audio terms and don't really know what is provided or how to manipulate it. I found examples of waveforms in JavaScript, but they are too complicated and I can't comprehend what is going on. So my question is: how can you use JavaScript to create a waveform of a song on canvas, and what exactly is the process behind it?
Here's some sample code from my book (HTML5 Multimedia: Develop and Design) that does exactly that; Audio Waveform. It uses the Mozilla Audio Data API.
The code simply takes snapshots of the audio data and uses it to draw on the canvas.
Here's an article from the BBC's R&D team showing how they did exactly that to build a couple of JS libraries and more besides. The results all seem to be openly available and rather good.
Rather than use the Audio Data API, which you cannot be sure is supported by all your users' browsers, it might be better if you generate your waveform data server-side (the BBC team created a C++ app to do that) and then at least you are decoupling the client-side display aspect from the playback aspect. Also, bear in mind that the entire audio file has to reach the browser before you can calculate peaks and render a waveform. I am not sure if streaming files (eg MP3) can be used to calculate peaks as the file is coming in. But overall it is surely better to calculate your peaks once, server-side, then simply send the data via JSON (or even create + cache your graphics server-side - there are numerous PHP chart libraries or you can do it natively with GD).
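The peak calculation itself is simple wherever it runs. Below is a minimal sketch in JavaScript (not the BBC's tool): it reduces a long list of samples to one min/max pair per bucket, which is the data you would send as JSON. The function name and bucket layout are assumptions for the example.

```javascript
// Reduce raw samples (-1..1) to at most `bucketCount` [min, max] pairs.
function computePeaks(samples, bucketCount) {
  const size = Math.ceil(samples.length / bucketCount); // samples per bucket
  const peaks = [];
  for (let start = 0; start < samples.length; start += size) {
    const end = Math.min(start + size, samples.length);
    let min = Infinity;
    let max = -Infinity;
    for (let i = start; i < end; i++) {
      if (samples[i] < min) min = samples[i];
      if (samples[i] > max) max = samples[i];
    }
    peaks.push([min, max]);
  }
  return peaks;
}

// Example: six samples reduced to three buckets, ready to serialize.
const json = JSON.stringify(computePeaks([0, 0.5, -0.25, 1, -1, 0.25], 3));
```

Here `json` is `[[0,0.5],[-0.25,1],[-1,0.25]]`. For a real track you would ask for roughly one bucket per pixel of the final drawing.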
For playback on the browser, there are several good (non-Flash!) options. Personally I like SoundManager 2 as the code is completely decoupled from display, meaning that I am free to create whatever UI / display that I like (or that the client wants). I have found it robust and reliable although I had some initial difficulty on one project with multiple players on the same page. The examples on their site are not great (imho) but with imagination you can do some cool things. SM2 also has an optional Flash fallback option for antique browsers.
I did just that with the web audio api and I used a project called wavesurfer.
http://www.html5audio.org/2012/10/interactive-navigable-audio-visualization-using-webaudio-api-and-canvas.html
What it does is, it draws tiny rectangles and uses an audio buffer to determine the height of each rectangle. Also possible in wavesurfer is playing and pausing using space bar and clicking on the wave to start playing at that point.
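The rectangle drawing can be sketched as follows. This is not wavesurfer's actual code, just the general idea under assumptions: `peaks` is a list of amplitudes from 0 to 1 (one per bar), and the function names are made up.

```javascript
// Turn a list of peak amplitudes (0..1) into rectangles centred vertically
// on a canvas of the given size, with `gap` pixels between bars.
function waveformBars(peaks, width, height, gap) {
  const barWidth = width / peaks.length - gap;
  const mid = height / 2;
  return peaks.map((peak, i) => {
    const h = Math.max(1, Math.abs(peak) * height); // at least 1px so silence is visible
    return { x: i * (barWidth + gap), y: mid - h / 2, width: barWidth, height: h };
  });
}

// Draw the rectangles on a 2D canvas context.
function drawBars(ctx, bars) {
  for (const bar of bars) {
    ctx.fillRect(bar.x, bar.y, bar.width, bar.height);
  }
}
```

In the browser you would pass `canvas.getContext("2d")` as `ctx`. To start playback from a click, divide the click's x position by the canvas width and multiply by the track duration.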
To check out what I made go to this site:
Update: This POC website no longer exists.
This only works in Google Chrome and maybe Safari, but I'm not sure about that.
Let me know if you want more info.
Not well supported yet but take a look at this Firefox tone generator.
