HTML5 audio - currentTime attribute inaccurate? - javascript

I'm playing around a bit with the HTML5 <audio> tag and I noticed some strange behaviour that has to do with the currentTime attribute.
I wanted to have a local audio file played and let the timeupdate event detect when it finishes by comparing the currentTime attribute to the duration attribute.
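Roughly like this (simplified - the full code is in the fiddle linked below):
var audio = document.querySelector("audio");
audio.addEventListener("timeupdate", function () {
  if (audio.currentTime >= audio.duration) {
    console.log("song finished");
  }
});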
This actually works fine if I let the song play from the beginning to the end - the end of the song is detected correctly.
However, changing the currentTime manually (either directly through JavaScript or by using the browser's audio controls) results in the API no longer reporting the correct value of currentTime; instead it seems to sit a few seconds ahead of the position that's actually playing.
(The "few seconds" figure is from Chrome; Firefox seems to go completely haywire, which makes the discrepancy far bigger.)
A little jsFiddle example about the problem: http://jsfiddle.net/yp3o8cyw/2/
Can anybody tell me why this happens - or am I just misunderstanding what the API is supposed to do?
P.S.: I just noticed this actually only happens with MP3-encoded files; OGG files work fine.

After hours of battling this mysterious issue, I believe I have figured out what is going on here. This is not a question of .ogg vs. .mp3; it is a question of variable vs. constant bitrate encoding on MP3s (and perhaps other file types).
I cannot take the credit for discovering this, only for scouring the interwebs. Terrill Thompson, a gentleman and a scholar, wrote a detailed article about this problem back on February 1st, 2015, which includes the following excerpt:
Variable Bit Rate (VBR) uses an algorithm to efficiently compress the media, varying between low and high bitrates depending on the complexity of the data at a given moment. Constant Bit Rate (CBR), in contrast, compresses the file using the same bit rate throughout. VBR is more efficient than CBR, and can therefore deliver content of comparable quality in a smaller file size, which sounds attractive, yes?
Unfortunately, there's a tradeoff if the media is being streamed (including progressive download), especially if timed text is involved. As I've learned, VBR-encoded MP3 files do not play back with dependable timing if the user scrubs ahead or back.
I'm writing this for anyone else who runs into this syncing problem (which makes precise syncing of audio and text impossible), because if you do, it's a real nightmare to figure out what is going on.
My next step is to do some more testing, and finally to figure out an efficient way to convert all my .mp3s to constant bit rate. I'm thinking FFMPEG may be able to help, but I'll explore that in another thread. Thanks also to Loilo for originally posting about this issue and Brad for the information he shared.
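For anyone who wants to try that conversion, here is a rough Node.js sketch of a batch re-encode with FFmpeg (assuming ffmpeg is installed and on the PATH; the directory names and the 128k bitrate are just placeholders - a fixed -b:a value is what forces CBR output):
var fs = require("fs");
var path = require("path");
var execFileSync = require("child_process").execFileSync;

if (!fs.existsSync("audio-cbr")) fs.mkdirSync("audio-cbr");

fs.readdirSync("audio").forEach(function (file) {
  if (path.extname(file) === ".mp3") {
    execFileSync("ffmpeg", [
      "-i", path.join("audio", file),        // VBR input
      "-codec:a", "libmp3lame",
      "-b:a", "128k",                        // constant bit rate output
      path.join("audio-cbr", file)
    ]);
  }
});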

First off, I'm not actually able to reproduce your problem on my machine, but I only have a short MP3 file handy at the moment, so that might be the issue. In any case, I think I can explain what's going on.
MP3 files (MPEG) are very simple streams and do not have absolute positional data within them. It isn't possible from reading the first part of the file to know at what byte offset some arbitrary frame begins. The media player seeks in the file by needle dropping. That is, it knows the size of the entire track and roughly how far into the track your time offset is. It guesses and begins decoding, picking up as soon as it synchronizes to the next frame header. This is an imprecise process.
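Put another way, the best a player can do is a back-of-the-envelope estimate like this (an illustration of the idea, not any browser's actual code):
// With no index in the file, a seek target can only be estimated from
// the overall file size and duration, then refined by resynchronizing
// to the next frame header.
function estimateByteOffset(targetTime, duration, fileSizeBytes) {
  return Math.floor((targetTime / duration) * fileSizeBytes);
}
// With VBR the bytes-per-second ratio varies across the file, so this
// estimate (and the currentTime reported after the seek) can be off by seconds.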
Ogg is a more robust container and has time offsets built into its frame headers. Seeking in an Ogg file is much more straightforward.
The other issue is that most browsers that support MP3 do so only because the codec is already available on your system. Playing Ogg Vorbis and MP3 usually involves completely different libraries with different APIs. While the web standards do a lot to provide a common abstraction, minor implementation details cause quirks like the one you are seeing.

Related

Changing quality of MediaRecorder and canvas.captureStream?

I've recently been trying to generate video in the browser, and have thus been playing with two approaches:
Using the whammy js library to combine webp frames into webm video. More details here.
Using MediaRecorder and canvas.captureStream. More details here.
The whammy approach works well, but is only supported in Chrome, since it's the only browser that currently supports webp encoding (canvas.toDataURL("image/webp")). And so I'm using the captureStream approach as a backup for Firefox (and using libwebpjs for Safari).
So now on to my question: Is there a way to control the video quality of the canvas stream? And if not, has something like this been considered by the browsers / w3c?
Here's a screenshot of one of the frames of the video generated by whammy:
And here's the same frame generated by the MediaRecorder/canvas.captureStream approach:
My first thought is to artificially increase the resolution of the canvas that I'm streaming, but I don't want the output video to be bigger.
I've tried increasing the frame rate passed to the captureStream method (thinking that there may be some strange frame interpolation stuff happening), but this doesn't help. It actually degrades quality if I make it too high. My current theory is that the browser decides on the quality of the stream based on how much computational power it has access to. This makes sense, because if it's going to keep up with the frame rate that I've specified, then something has to give.
So the next thought is that I should slow down the rate at which I'm feeding the canvas with images, and proportionally lower the FPS value that I pass into captureStream. The problem with that is that even if it solved the quality problem, I'd end up with a video that runs slower than it's meant to.
Edit: Here's a rough sketch of the code that I'm using, in case it helps anyone in a similar situation.
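In essence the captureStream approach boils down to the following pattern (a simplified illustration rather than my exact code - the element id, mime type, frame rate and duration are placeholders):
var canvas = document.getElementById("canvas");
var stream = canvas.captureStream(30);                    // 30 fps
var recorder = new MediaRecorder(stream, { mimeType: "video/webm" });
var chunks = [];

recorder.ondataavailable = function (e) { chunks.push(e.data); };
recorder.onstop = function () {
  var blob = new Blob(chunks, { type: "video/webm" });
  console.log(URL.createObjectURL(blob));                 // the finished video
};

recorder.start();
// ... draw the frames onto the canvas here ...
setTimeout(function () { recorder.stop(); }, 5000);       // stop after e.g. 5 seconds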
These are compression artifacts, and there is not much you can do for now...
Video codecs are built mainly with the idea of showing real-life colors and shapes, a bit like JPEG at a really low quality. They will also do their best to keep as little information as possible between keyframes (some using motion-detection algorithms) so that less data needs to be stored.
These codecs normally have configurable settings that allow you to target a constant quality, but because the MediaRecorder spec is codec-agnostic, it doesn't (yet) expose any option to us web devs other than a fixed bit rate (which won't help us much here).
There is this proposal, though, which asks for such a feature.
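For reference, the bit-rate knob that MediaRecorder does expose today is set like this (a minimal sketch; the canvas selector, mime type and number are just examples):
var canvasStream = document.querySelector("canvas").captureStream(30);
var recorder = new MediaRecorder(canvasStream, {
  mimeType: "video/webm",
  videoBitsPerSecond: 5000000   // ask for ~5 Mbps; the browser may still cap or ignore it
});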

Perfect Synchronization with Web Audio API

I'm working on a simple audio visualization application that uses a Web Audio API analyser to pull frequency data, as in this example. As expected, the more visual elements I add to my canvases, the more latency there is between the audio and the rendered visuals.
Is there a standard approach to accounting for this latency? I can imagine a lookahead technique that buffers the upcoming audio data. I could work with synchronizing the JavaScript and Web Audio clocks, but I'm convinced that there's a much more intuitive answer. Perhaps it is as straightforward as playing the audio aloud with a slight delay (although this is not nearly as comprehensive).
The dancer.js library seems to have the same problem (there is always a very subtle delay), whereas other applications seem to have solved the lag issue entirely. I have so far been unable to pinpoint the technical difference. SoundJS seems to handle this a bit better, but it would be nice to build from scratch.
Any methodologies to point me in the right direction are much appreciated.
I think you will find some answers to precise audio timing in this article:
http://www.html5rocks.com/en/tutorials/audio/scheduling/
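The core idea in that article is a short look-ahead scheduler: a fairly coarse JavaScript timer wakes up regularly and schedules anything due in the next little window on the precise AudioContext clock. A rough sketch (the interval, look-ahead and the oscillator beep are arbitrary examples):
var ctx = new AudioContext();
var nextNoteTime = ctx.currentTime;
var lookAhead = 0.1;                   // seconds of audio to schedule ahead

function playNoteAt(time) {            // placeholder "note": a 50 ms beep
  var osc = ctx.createOscillator();
  osc.connect(ctx.destination);
  osc.start(time);
  osc.stop(time + 0.05);
}

function scheduler() {
  while (nextNoteTime < ctx.currentTime + lookAhead) {
    playNoteAt(nextNoteTime);          // scheduled on the audio clock, so it stays precise
    nextNoteTime += 0.25;              // e.g. one event every 250 ms
  }
}
setInterval(scheduler, 25);            // the JS timer only needs to be roughly on time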
SoundJS uses this approach to enable smooth looping, but still uses JavaScript timers for delayed playback. This may not help you sync the audio timing with the animation timing. When I built the music visualizer example for SoundJS I found I had to play around with the different values for FFT size and tick frequency to get good performance. I also needed to cache a single shape and reuse it with scaling to get performant graphics.
Hope that helps.
I'm concerned when you say the more visual elements you add to your canvases, the more latency you get in audio. That shouldn't really happen quite like that. Are your canvases being animated using requestAnimationFrame? What's your frame rate like?
You can't, technically speaking, synchronize the JS and Web Audio clocks - the Web Audio clock is the audio hardware clock, which is literally running off a different clock crystal than the system clock (on many systems, at least). The vast majority of web audio (ScriptProcessorNodes being the major exception) shouldn't have additional latency introduced when your main UI thread becomes a bit more congested.
If the problem is that the analysis just seems to lag (i.e. the visuals are consistently "behind" the audio), it could just be the inherent lag in the FFT processing. You can reduce the FFT size in the Analyser, although you'll get less definition then; to fake a fix, you can also run all the audio through a delay node to get it to re-sync with the visuals.
Also, you may find that the "smoothing" parameter on the Analyser makes it less time-precise - try turning that down.
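A sketch of those two tweaks together (the 50 ms delay and the 0.3 smoothing value are arbitrary starting points - tune them by ear):
var ctx = new AudioContext();
var analyser = ctx.createAnalyser();
analyser.smoothingTimeConstant = 0.3;        // less smoothing = more time-precise, but jumpier

var delay = ctx.createDelay(1);              // allow up to 1 s of delay
delay.delayTime.value = 0.05;                // push the audible audio ~50 ms later

// the source is whatever you're playing - here, an <audio> element (placeholder)
var source = ctx.createMediaElementSource(document.querySelector("audio"));
source.connect(analyser);                    // analyse the un-delayed signal for the visuals...
source.connect(delay);
delay.connect(ctx.destination);              // ...but hear it slightly later, re-synced with them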

MobileSafari crashing due to excessive memory consumption

I'm currently working on an application that utilises SoundJS. I inherited the codebase after it was found not to work correctly on an iPad - the issue being that it creates a manifest of around 16 MP3 files, totalling approximately 35.7mb. This obviously causes the iPad issues, and at 16mb it crashes.
The crash log shows it's due to the Per Process Limit according to the Diagnostics and Usage Logs.
I've done some digging into the underlying structure of SoundJS and can see its default behaviour is to utilise Web Audio, loading each file via an XHR. The response is then parsed as an ArrayBuffer (or an array of ArrayBuffers).
At the moment this means that, after preloading, we have 35.7mb of data in ArrayBuffers - not good. This is after crunching the file size down! There will only ever be one audio file playing at any one time - and this is one file per section of the app, apart from during transitions, where two may be fading into each other.
Is there an easy way to free up the resources held by the underlying structure, i.e. the ArrayBuffers? As far as I'm aware, the previous developer did try using calls to the SoundJS .removeSound() method to free up some memory, but the results weren't good.
At the moment I'm looking at creating an object acting as a registry of all the filenames, and rather than loading them through a manifest - loading them individually and removing them as soon as they are used. However, I'm expecting this to cause headaches with easing one file into another during playback. Furthermore, I expect that may actually result in a problem akin to the images one, where MobileSafari didn't release the memory allocated to an image even after deletion. (The correct fix being to reset the 'src' attribute of the image element prior to deletion.)
Does anyone know of a surefire workaround for loading such large amounts of data in a web app targeting iPad?
Testing SoundJS has shown some issues with the iPad not releasing memory properly. Unfortunately, from a library perspective there isn't much we can do about it.
If you only ever play sounds one at a time, I would suggest loading them only as you need them and removing them after use. The biggest issue you will find with that is waiting for a sound to load, so you may want to implement a smart preload of the next sound you expect to use (meaning you always have the current and the next sound loaded). In theory this can keep you below the iPad's 16 mb memory limit. However, if the iPad refuses to free up the memory, you may need to introduce some form of cache busting.
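A rough sketch of that "current + next" idea with SoundJS (the file names and ids below are placeholders, not the ones from your app; check the SoundJS docs for the exact signatures):
var sections = ["audio/intro.mp3", "audio/section1.mp3", "audio/section2.mp3"];

function loadSection(index) {
  if (index < sections.length) {
    createjs.Sound.registerSound(sections[index], "section" + index);
  }
}

function playSection(index) {
  createjs.Sound.play("section" + index);              // assumes the "fileload" event has fired
  loadSection(index + 1);                              // preload the next one in the background
  if (index > 0) {
    createjs.Sound.removeSound(sections[index - 1]);   // free the previous buffer
  }
}

loadSection(0);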
Another solution would be to reduce the file size through lossy compression, which it sounds like has already been attempted.
A third option might be implementing some form of streaming audio, but this is not something SoundJS can help with.
Hope that helps.

Audio Analysis in WinJS

I've been poking around the API to find what I'm looking for, as well as searching online (but examples of Windows Store apps are pretty scarce). What I'm essentially looking for is a starting point for analyzing audio in a Windows Store JavaScript app. Say, for example, I were creating a simple visualizer and needed to detect the various kinds of "bumps" in the currently playing audio.
Can anybody point me in the right direction here? Is this something that's even possible in a Windows Store JavaScript app? Whether it's the audio of a selected song, or the device's currently playing song, or the audio from the microphone... any of these is fine for my needs at the moment. I'm just looking for where to start in the analysis of the audio.
GGG's response sounded skeptical about the possibility of a signal processor on Windows RT, and I have to admit I don't know much about Windows RT either. But we know you will have JavaScript available, so it sounds like you are interested in digital signal processing in JavaScript. Take a look at these resources; they could get you pointed in the right direction.
https://github.com/corbanbrook/dsp.js
http://www.bores.com/courses/intro/index.htm
http://arc.id.au/SpectrumAnalyser.html
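To give a flavour of the first of those, basic FFT usage with dsp.js looks roughly like this (written from memory of its README, so double-check the signatures; the sample buffer and the threshold are placeholders):
// "samples" would be 2048 time-domain samples you obtained elsewhere
var samples = new Float32Array(2048);

var fft = new FFT(2048, 44100);     // buffer size, sample rate
fft.forward(samples);               // run the transform
var spectrum = fft.spectrum;        // magnitude per frequency bin

// a crude "bump" detector: watch the energy in the lowest bins
var threshold = 1.0;                // tune by experiment
var bass = 0;
for (var i = 0; i < 16; i++) {
  bass += spectrum[i];
}
if (bass > threshold) {
  // a beat-ish moment
}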

How can I play a sound with absolute minimal or *quantified* lag?

I'm writing a psychology app in jQuery. Part of my project requires measurement of the reaction time of a user to a sound (user presses a key). Thus, I need to play a sound file with the smallest possible lag between when I call (& timestamp) the sound file and when it actually starts playing. Most sound plugins don't detail how they handle lag. Please advise me on the best method!
Only answers rooted in solid CS (not "this plugin sounds fast") are useful to me. At the very least, I need to know what the possible range of lag is for the method I'm using (for calculating a confidence interval).
An alternative and actually preferable solution would be a method of quantifying the lag. In that case the length of lag would be unimportant, because I can easily correct for it.
I don't know what sound files I'll be using, but I think they'll be very small. Two .wav files at 100kb each is probably a safe estimate.
After the comment discussion, I'm going to recommend that you use the HTML5 audio tag to play/control the sound. [The jQuery plugin you link to uses the "embed" tag to play the sound; this is going to call a plugin on the client computer which you won't have any control over.]
With HTML5, you'll lose some older/mobile browsers, but that is always the crux with web programming.
For a tutorial on controlling the "audio" tag, see here. Make sure you preload the audio.
I think the only other option besides HTML5 is to use Flash to play the sound. You can pretty easily have Flash and JavaScript interact.
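A minimal sketch of the HTML5 route, including a way to roughly quantify the start-up lag (the file name is a placeholder; note that the "playing" event fires when playback begins, which can still be slightly before sound actually leaves the speakers):
var audio = new Audio("beep.wav");
audio.preload = "auto";            // ask the browser to fetch and decode it up front
audio.load();

var calledAt;
audio.addEventListener("playing", function () {
  // "playing" fires once playback has actually begun
  console.log("start-up lag ~", (performance.now() - calledAt).toFixed(1), "ms");
});

function playStimulus() {
  calledAt = performance.now();    // timestamp the request...
  audio.play();                    // ...then start playback
}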
The first thing I would do is cache the sound file (I don't know what you plan on using for sound - the HTML5 audio tag?), and then when you call the play function you can run something like
var counter = 0;
var timer = setInterval(function () { counter++; }, 1);   // note: browsers typically clamp setInterval to ~4 ms
Then, in the user-response handler, call
clearInterval(timer);
and counter will approximate the milliseconds for the response (minus the time it takes for playback to actually start).
I ran into the very same problem: I tried buffering, audio events, even an AnalyserNode, but there seems to be a hardware lag between when the sound data is parsed and when the sound is produced: in my testing it ranged between 50 and 200 ms.
The lag, however, can be quantified by capturing sound:
play the sound (whether it is buffered or not) and start the timer
stop the timer when the sound capture picks something up (watch out for interference/background noise near the mic)
Read more about JavaScript sound capturing here.
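A rough sketch of that measurement using getUserMedia and an AnalyserNode (the file name and the loudness threshold are placeholders; this measures the full play-to-microphone round trip, so it slightly overestimates the output lag):
navigator.mediaDevices.getUserMedia({ audio: true }).then(function (stream) {
  var ctx = new AudioContext();
  var analyser = ctx.createAnalyser();
  ctx.createMediaStreamSource(stream).connect(analyser);
  var data = new Uint8Array(analyser.fftSize);

  var startedAt = performance.now();
  new Audio("beep.wav").play();          // play the test sound and start the timer

  (function poll() {
    analyser.getByteTimeDomainData(data);
    var loud = false;
    for (var i = 0; i < data.length; i++) {
      if (Math.abs(data[i] - 128) > 20) { loud = true; break; }   // 128 is silence for byte data
    }
    if (loud) {
      console.log("round-trip lag ~", (performance.now() - startedAt).toFixed(1), "ms");
    } else {
      requestAnimationFrame(poll);
    }
  })();
});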
