AudioElement/WebAudio/WebSpeech and output latency - javascript

I tried three methods of playing back/generating audio on a Mac and on Android devices. The three methods are:
1. loading a file into an AudioElement (via <audio>),
2. creating a sound with the WebAudio API,
3. using the WebSpeech API to generate speech.
Methods 1 and 2 have a considerable latency (i.e. time between call to play and perceivable audio) before they can be heard on my Android devices (though one of the devices appears to have less latency than the other). No latency can be perceived on my Mac.
Method 3 doesn't seem to have any latency at all.
The latency of Method 2, WebAudio API, can be mitigated by subtracting a calculated output latency from the desired starting time. The formula is:
outputLatency = audioContext.currentTime - audioContext.getOutputTimestamp().contextTime.
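Roughly, I apply the compensation like this (a sketch only; buffer stands for an already-decoded AudioBuffer and the helper name is made up):
const audioContext = new AudioContext();

function playCompensated(buffer, when) {
  // Estimated output latency, per the formula above; newer browsers also
  // expose audioContext.outputLatency directly.
  const outputLatency =
    audioContext.currentTime - audioContext.getOutputTimestamp().contextTime;

  const source = audioContext.createBufferSource();
  source.buffer = buffer;
  source.connect(audioContext.destination);
  // Start earlier by the estimated latency so the audible output lands at "when".
  source.start(Math.max(audioContext.currentTime, when - outputLatency));
}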
This more or less removes the latency on one of my Android devices, but not on the other.
The improvement I saw after applying the above formula is the main reason I suspect the problem is output latency.
According to my research, output latency is at least in part a hardware matter, so WebSpeech being completely unaffected doesn't make a lot of sense to me.
Is what I am observing here output latency?
If yes, why is WebSpeech not affected by this?
If no, where does the latency come from?

Related

What is the AudioContext currentTime for the first sample recorded?

Using the Web Audio API, I create a bufferSource, and use a new MediaRecorder to record at the same time. I'm recording the sound coming out of my speakers with the built in microphone.
If I play back the original and the new recording, there is a significant lag between the two. (It sounds like about 200ms to me.) If I console.log the value of globalAudioCtx.currentTime at the point of calling the two "start" methods, the two numbers are exactly the same. The values of Date.now() are also exactly the same.
Where is this latency introduced? The latency due to speed of sound is about 1000x as small as what I am hearing.
In short, how can I get these two samples to play back at exactly the same time?
I'm working in Chrome on Linux.
Where is this latency introduced?
On both the playback and the recording side.
Your sound card has a buffer, and software has to write audio to that buffer in small chunks at a time. If the software can't keep up, choppy audio is heard. So, buffer sizes are set to be large enough to prevent that from happening.
The same is true on the recording end. If a buffer isn't large enough, recorded audio data would be lost if the software wasn't able to read from that buffer fast enough, causing choppy and lost audio.
Browsers aren't using the lowest-latency mode of operation with your sound card. There are some tweaks you can apply (such as using WASAPI and exclusive mode on Windows with Chrome), but you're at the mercy of the browser developers, who didn't design this with folks like you and me in mind.
No matter how low you go though in buffer size, there is still going to be lag. That's the nature of computer-based digital audio.
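As a rough, hypothetical example of the math involved (a 1024-frame buffer at 44.1 kHz):
const bufferFrames = 1024;                        // assumed buffer size
const sampleRate = 44100;                         // typical sample rate
const latencySeconds = bufferFrames / sampleRate; // ~0.023 s of delay per buffer
console.log((latencySeconds * 1000).toFixed(1) + ' ms'); // ~23.2 ms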
how can I get these two samples to play back at exactly the same time?
You'll have to delay one of the samples to bring them back in sync.
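For example, something like this sketch, where measuredLag is a value you would have to determine yourself (by ear, or by cross-correlating the two recordings):
const ctx = new AudioContext();
const measuredLag = 0.2; // hypothetical 200 ms offset

function playInSync(originalBuffer, recordedBuffer) {
  const original = ctx.createBufferSource();
  original.buffer = originalBuffer;

  const recorded = ctx.createBufferSource();
  recorded.buffer = recordedBuffer;

  // Delay the original by the measured lag so both line up at the speakers.
  const delay = ctx.createDelay(1.0);
  delay.delayTime.value = measuredLag;

  original.connect(delay);
  delay.connect(ctx.destination);
  recorded.connect(ctx.destination);

  const t = ctx.currentTime + 0.1; // schedule slightly in the future
  original.start(t);
  recorded.start(t);
}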

WebRTC - remove/reduce latency between devices that are sharing their videos stream?

I'm sorry for not posting any code, but I'm trying to learn more about latency and WebRTC. What is the best way to remove latency between two or more devices that are sharing a video stream?
Or, at least, to reduce latency as much as possible?
Thinking about it, I imagined just setting the devices' clocks to the same time and delaying the requests from the server. Is that the real trick?
Latency is a function of the number of steps on the path between the source (microphone, camera) and the output (speakers, screen).
Changing clocks will have zero impact on latency.
The delays you have include:
device internal delays - waiting for screen vsync, etc...; nothing much to be done here
device interface delays - a shorter cable will save you some time, but nothing measurable
software delays - your operating system and browser; you might be able to do something here, but you probably don't want to, trust that your browser maker is doing what they can
encoding delays - a faster processor helps a little here, but the biggest concern is for things like audio, the encoder has to wait for a certain amount of audio to accumulate before it can even start encoding; by default, this is 20ms, so not much; eventually, you will be able to request shorter ptimes (what the control is called) for Opus, but it's early days yet
decoding delays - again, a faster processor helps a little, but there's not much to be done
jitter buffer delay - audio in particular requires a little bit of extra delay at a receiver so that any network jitter (fluctuations in the rate at which packets are delivered) doesn't cause gaps in audio; this is usually outside of your control, but that doesn't mean that it's completely impossible
resynchronization delays - when you are playing synchronized audio and video, if one gets delayed for any reason, playback of the other might be delayed so that the two can be played out together; this should be fairly small, and is likely outside of your control
network delays - this is where you can help, probably a lot, depending on your network configuration
You can't change the physics of the situation when it comes to the distance between two peers, but there are a lot of network characteristics that can change the actual latency:
direct path - if, for any reason, you are using a relay server, then your latency will suffer, potentially a lot, because every packet doesn't travel directly, it takes a detour via the relay server
size of pipe - trying to cram high resolution video down a small pipe can work, but getting big intra-frames down a tiny pipe can add some extra delays
TCP - if UDP is disabled, falling back to TCP can have a pretty dire impact on latency, mainly due to a special exposure to packet loss, which requires retransmissions and causes subsequent packets to be delayed until the retransmission completes (this is called head of line blocking); this can also happen in certain perverse NAT scenarios, even if UDP is enabled in theory, though most likely this will result in a relay server being used
larger network issues - maybe your peers are on different ISPs and those ISPs have poor peering arrangements, so that packets take a suboptimal route between peers; traceroute might help you identify where things are going
congestion - maybe you have some congestion on the network; congestion tends to cause additional delay, particularly when it is caused by TCP, because TCP tends to fill up tail-drop queues in routers, forcing your packets to wait extra time; you might also be causing self-congestion if you are using data channels, the congestion control algorithm there is the same one that TCP uses by default, which is bad for real-time latency
I'm sure that's not a complete taxonomy, but this should give you a start.
I don't think you can do much to reduce the latency besides being on a better network with higher bandwidth and lower latency. If you're on the same network or Wi-Fi, there should be very little latency.
I think the latency is also higher when your devices have little processing power, so they don't decode the video as fast, but there's not much you can do about that; it all happens in the browser.
What you could do is try different codecs. To do that you have to manipulate the SDP before it is sent out and reorder or remove the codecs in the m=audio or the m=video line. (But there's not much to choose from in video codecs, just VP8.)
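A rough sketch of that kind of SDP munging (illustrative only; the payload type number you prefer depends on what the browser actually offers):
const pc = new RTCPeerConnection();

async function offerWithPreferredVideoCodec(preferredPayloadType) {
  const offer = await pc.createOffer();
  const lines = offer.sdp.split('\r\n');

  // m=video <port> <proto> <payload types...> -- the order expresses preference
  const mIndex = lines.findIndex(l => l.startsWith('m=video'));
  if (mIndex !== -1) {
    const parts = lines[mIndex].split(' ');
    const header = parts.slice(0, 3);
    const payloads = parts.slice(3).filter(p => p !== preferredPayloadType);
    lines[mIndex] = header.concat(preferredPayloadType, payloads).join(' ');
  }

  await pc.setLocalDescription({ type: 'offer', sdp: lines.join('\r\n') });
}
Newer browsers also offer RTCRtpTransceiver.setCodecPreferences(), which avoids the string munging entirely.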
You can view the performance of the codecs and the network with the tool that comes with Chrome:
chrome://webrtc-internals/
Just type that into the URL bar.

Perfect Synchronization with Web Audio API

I'm working on a simple audio visualization application that uses a Web Audio API analyzer to pull frequency data, as in this example. Expectedly, the more visual elements I add to my canvases, the more latency there is between the audio and the yielded visual results.
Is there a standard approach to accounting for this latency? I can imagine a lookahead technique that buffers the upcoming audio data. I could work with synchronizing the JavaScript and Web Audio clocks, but I'm convinced that there's a much more intuitive answer. Perhaps it is as straightforward as playing the audio aloud with a slight delay (although this is not nearly as comprehensive).
The dancer.js library seems to have the same problem (there is always a very subtle delay), whereas other applications seem to have solved the lag issue entirely. I have so far been unable to pinpoint the technical difference. SoundJS seems to handle this a bit better, but it would be nice to build from scratch.
Any methodologies to point me in the right direction are much appreciated.
I think you will find some answers to precise audio timing in this article:
http://www.html5rocks.com/en/tutorials/audio/scheduling/
SoundJS uses this approach to enable smooth looping, but still uses JavaScript timers for delayed playback. This may not help you sync the audio timing with the animation timing. When I built the music visualizer example for SoundJS, I found I had to play around with different values for FFT size and tick frequency to get good performance. I also needed to cache a single shape and reuse it with scaling to get performant graphics.
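The core of that scheduling approach is a short lookahead loop, roughly like this sketch (the timing values are just examples):
const ctx = new AudioContext();
const lookahead = 0.1;   // schedule audio 100 ms ahead on the audio clock
const interval = 25;     // wake the scheduler every 25 ms
let nextNoteTime = ctx.currentTime;

function playNoteAt(time) {
  const osc = ctx.createOscillator();
  osc.connect(ctx.destination);
  osc.start(time);
  osc.stop(time + 0.1);
}

function scheduler() {
  // Queue everything that falls inside the lookahead window.
  while (nextNoteTime < ctx.currentTime + lookahead) {
    playNoteAt(nextNoteTime);
    nextNoteTime += 0.5;   // e.g. one note every half second
  }
  setTimeout(scheduler, interval);
}

scheduler();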
Hope that helps.
I'm concerned when you say the more visual elements you add to your canvases, the more latency you get in audio. That shouldn't really happen quite like that. Are your canvases being animated using requestAnimationFrame? What's your frame rate like?
You can't, technically speaking, synchronize the JS and Web Audio clocks - the Web Audio clock is the audio hardware clock, which is literally running off a different clock crystal than the system clock (on many systems, at least). The vast majority of web audio (ScriptProcessorNodes being the major exception) shouldn't have additional latency introduced when your main UI thread becomes a bit more congested.
If the problem is the analysis just seems to lag (i.e. the visuals are consistently "behind" the audio), it could just be the inherent lag in the FFT processing. You can reduce the FFT size in the Analyser, although you'll get less definition then; to fake up fixing it, you can also run all the audio through a delay node to get it to re-sync with the visuals.
Also, you may find that the "smoothing" parameter on the Analyser makes it less time-precise - try turning that down.
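To make that concrete, here's a sketch with assumed values (the delay would need tuning by ear, and it assumes an audio element on the page as the source):
const ctx = new AudioContext();

const analyser = ctx.createAnalyser();
analyser.fftSize = 512;                // smaller FFT = less lag, coarser frequency bins
analyser.smoothingTimeConstant = 0.3;  // lower = more time-precise, but jumpier visuals

const delay = ctx.createDelay();
delay.delayTime.value = 0.05;          // assumed 50 ms re-sync delay

const mediaEl = document.querySelector('audio'); // assumes an <audio> element exists
const source = ctx.createMediaElementSource(mediaEl);
source.connect(analyser);              // analysed path drives the visuals
source.connect(delay);
delay.connect(ctx.destination);        // delayed path is what you actually hear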

Detect Graphics card performance - JS

This is a longshot - is there any way to detect poor vs. strong graphics card performance via a JS plugin?
We have built a parallax site for a client, it stutters on lower performance machines - we could tweak the performance to make it work better across the board - but this of course reduces the experience for users with higher performance machines.
We could also detect the browser version - but the same browser could run on both low and high performance machines - so that doesn't help our situation.
Any ideas?
requestAnimationFrame (rAF) can help with this.
You could figure out your framerate using rAF. Details about that here: calculate FPS in Canvas using requestAnimationFrame. In short, you figure out the time difference between frames, then divide 1 by it (e.g. 1/0.0159 s ~= 62 fps).
Note: with any method you choose, performance will be arbitrarily decided. Perhaps anything over 24 frames per second could be considered "high performance."
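Something like this sketch gives a rough average (the 60-frame sample window and the 24 fps cutoff are arbitrary choices):
let last = performance.now();
const samples = [];

function tick(now) {
  samples.push(1000 / (now - last)); // instantaneous fps for this frame
  last = now;
  if (samples.length < 60) {
    requestAnimationFrame(tick);
  } else {
    const avg = samples.reduce((a, b) => a + b, 0) / samples.length;
    console.log(avg >= 24 ? 'high performance' : 'low performance', avg.toFixed(1), 'fps');
  }
}
requestAnimationFrame(tick);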
Why not let the user decide? Youtube (and many other video sharing sites) implements a selector for quality of playback, now a gear icon with a list of resolutions you can choose from. Would such a HD | SD or Hi-Fi | Lo-Fi selector work (or even make sense) in the context of your application?
This is where "old school" loading screens came in useful: you could render something complex either in the foreground (or hidden away) where it didn't matter if it looked odd or jerky -- and by the time you had loaded your resources you could decide which effects to enable or disable.
Basically you would use what jcage mentioned for this, testing the frame rate (i.e. using a setInterval in conjunction with a timer). This isn't always 100% reliable, however, because if their machine decides in that instant to do a triple-helix-backward-somersault (or something more likely) you'd get a dodgy reading. It is possible, depending on the animations involved, to upgrade and downgrade the effects in realtime -- but this is always more tricky to code, plus your own analysis of the situation can sometimes itself cause dropped performance.
Firefox has a built-in list of graphics cards which are not supported: https://wiki.mozilla.org/Blocklisting/Blocked_Graphics_Drivers (the related help article).
But you can only test for them indirectly, by accessing WebGL features...
http://get.webgl.org/troubleshooting/ leads you to the corresponding browser provider when there are problems. When checking the JS code you will see that they test along these lines to find out whether WebGL (and thus an up-to-date graphics card) is available:
if (!window.WebGLRenderingContext) {
  alert('Your browser does not support WebGL');        // no WebGL support at all
} else if (!document.createElement('canvas').getContext('webgl')) {
  alert('Your graphics card does not support WebGL');  // WebGL exists, but the card/driver is blocked
}
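If you need more than a pass/fail, the WEBGL_debug_renderer_info extension (where the browser exposes it) can at least tell you which GPU you are dealing with; a sketch:
const gl = document.createElement('canvas').getContext('webgl');
if (gl) {
  const ext = gl.getExtension('WEBGL_debug_renderer_info');
  if (ext) {
    console.log(gl.getParameter(ext.UNMASKED_VENDOR_WEBGL));   // GPU vendor string
    console.log(gl.getParameter(ext.UNMASKED_RENDERER_WEBGL)); // card/driver string
  }
}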
You might consider checking to see if the browser supports window.requestAnimationFrame, which would indicate you are running in a newer browser. Or alternatively consider checking jQuery.fx.interval.
You could then implement a custom function to gauge the available processing power. You might try using a custom easing function which can then be run through a function like .slideDown() to get an idea of the computation power available.
See this answer to another question for more ideas on how to check performance in javascript.
If the browser is ultra-modern and supports requestAnimationFrame, you can calculate the animation frame rate in realtime and drop your animation settings for slower machines. That's IE 10, Firefox 4, Chrome 10, and Safari 6.
You would essentially create a 'tick' function that runs on a requestAnimationFrame() callback and tracks how many milliseconds pass between each tick. After so many ticks have been registered, you can average it out to determine your overall frame rate. But there are two caveats to this:
When the tab is in the background requestAnimationFrame callbacks will be suspended -- so a single, sudden delay between frames of several seconds to many minutes does not mean it's a slow animation. And:
Simply having a JavaScript gauge running on each animation frame will cause the overall animation to slow down a bit; there's no way to measure something like that without negatively impacting it. Remember, Heisenberg's a bastard.
For older browsers, I don't know of any way to reliably gauge the frame rate. You may be able to simulate it using setTimeout() but it won't be nearly as accurate -- and might have an even more negative impact on performance.
This might have to be a risk/benefit-based decision. I think you will have to make an important but tough decision here.
1)
If you decide to have two versions, you will have to spend some time to:
figure out a fast, non-intrusive test
spend time implementing it
roll it into production
and you will most probably end up with an incorrect implementation. For example, at the moment my computer is running 70 Chrome tabs, VLC with 1080p anime, and IntelliJ IDEA.
The probability that my 2012-model MBP will be detected as a "slow" computer is high, at least right now.
False positives are really hard to figure out.
2)
If you go for one version you will have to choose between HD and Lo-Fi, as #Patrick mentioned, which is again a mistake in my opinion.
What I will suggest is that you go to Google Analytics, figure out the browser distribution (yes, I know that it can be misleading, but so can any other test) and, based on that, if the majority of users are on Chrome + modern IE/FF, go with the HD version, BUT spend some time figuring out an optimisation strategy.
There are always things that could be done better and faster. Get one older laptop and optimise until you get a decent FPS rate. And that's it; you as a developer need to make that decision, that is your duty.
3)
If from the browser distribution you figure out that you absolutely must go with the Lo-Fi version, well, try to think about whether the "downgrade" is worth it, and implement it only as a last resort.

How does any application (chrome, flash, etc) get a time resolution better than the system time resolution?

This article on Microsoft's tech net site supplies an exe that will calculate your windows machine's minimum time resolution - this should be the smallest "tick" available to any application on that machine:
http://technet.microsoft.com/en-us/sysinternals/bb897568.aspx
The result of running this app on my current box is 15.625 ms. I have also run tests against Internet Explorer and gotten the exact same resolution from the Date() javascript function.
What is confusing is that the SAME test I ran against IE gives back a much finer resolution for Google's Chrome browser (1 ms) and for a Flash movie running in IE (1 ms). Can anyone explain how any application can get a clock resolution better than the machine's clock? If so, is there some way I can get a consistently better resolution in browsers other than Chrome (without using Flash)?
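For reference, the kind of test I mean looks roughly like this (a sketch): spin until Date.now() changes and record the smallest increment seen.
function measureDateResolution(iterations) {
  let smallest = Infinity;
  for (let i = 0; i < iterations; i++) {
    const start = Date.now();
    let next = start;
    while (next === start) next = Date.now(); // busy-wait for the next clock tick
    smallest = Math.min(smallest, next - start);
  }
  return smallest; // ~15.625 ms in IE on the box described above, ~1 ms in Chrome
}
console.log(measureDateResolution(50) + ' ms');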
The first answer below leads to two other questions:
1. How does a multimedia timer get times between system clock ticks? I imagine the system clock as an analog watch with a ticking hand, each tick being 15 ms. How are times between ticks measured?
2. Are multimedia timers available to browsers, especially Internet Explorer? Can I access one with C# or Javascript without having to push software to a user's browser/machine?
You can get down to 1 ms with multimedia timers and even further with QueryPerformanceCounter.
See also GetLocalTime() API time resolution.
EDIT: Partial answer to the new subquestion ...
System time resolution was around 15 ms on the Windows 3 and 95 architecture. On NT (and its successors) you can get better resolution from the hardware abstraction layer (HAL). The article "QueryPerformanceCounter counts elapsed time, not CPU cycles", written by Raymond Chen, may give you some additional insights.
As for the browsers - I have no idea what they are using for timers and how they interact with the OS.
Look at the timeBeginPeriod API. From MSDN: "This function affects a global Windows setting. Windows uses the lowest value (that is, highest resolution) requested by any process."
http://msdn.microsoft.com/en-us/library/ms713413(VS.85).aspx
See "Inside Windows NT High Resolution Timers" referenced from the link you posted.
The APIC on the processor runs at bus speed and has a timer. They may be using that instead of the system time. (Or they might just be giving a bunch of precision that isn't there.)
This description of the Local APIC mentions using it as a timer.
(It may also be the case that there is some performance counter they are using. That actually seems more likely since a device driver would be needed to program the APIC.)
