How to export the last 3s of data from a web audio stream - JavaScript

Question: I am using the Web Audio API. I need to buffer a non-stop audio stream, like a radio stream, and when I get a notification, I need to grab the past 3s of audio data and send it to the server. How can I achieve that? Node.js has a built-in Buffer, but it is not a circular buffer; if I keep writing a non-stop stream into it, it will eventually overflow.
Background to help you understand my question:
I am implementing an ambient-audio-based web authentication method. Briefly, I need to compare two pieces of audio signal (one from the client and one from the anchor device, both time-synced with the server); if they are similar enough, the authentication request is approved by the server. The audio recording is implemented on both the client and the anchor device using the Web Audio API.
I need to manage a buffer on the anchor device to capture the ambient audio. The anchor device is supposed to be running all the time, so the stream is never going to end.

You can capture the audio from a stream using the ScriptProcessorNode. Whilst this is deprecated, no browser as of now actually implements the new AudioWorker.
var N = 1024;
var time = 3; // Desired capture time in seconds
var frame_holder = [];
var time_per_frame = N / context.sampleRate;
var num_frames = Math.ceil(time / time_per_frame); // Minimum number of frames to cover `time`

var script = context.createScriptProcessor(N, 1, 1);
script.connect(context.destination);

script.onaudioprocess = function(e) {
    var input = e.inputBuffer.getChannelData(0);
    var output = e.outputBuffer.getChannelData(0);
    var copy = new Float32Array(input.length);
    for (var n = 0; n < input.length; n++) {
        output[n] = 0.0; // Silence the output, as I guess you are capturing the microphone
        copy[n] = input[n];
    }
    // Add in the current frame
    frame_holder.push(copy);
    // Keep only the most recent num_frames frames (roughly 3s of audio)
    if (frame_holder.length > num_frames) {
        frame_holder = frame_holder.slice(frame_holder.length - num_frames);
    }
};
Then for actual transmission, you just need to string the copied frames together. It is easier than trying to keep one long array, though of course that is also possible.
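As a minimal sketch of that step (assuming frame_holder holds the Float32Array frames from above; the endpoint in the comment is hypothetical):
function framesToFloat32(frames) {
    var total = 0;
    for (var i = 0; i < frames.length; i++) {
        total += frames[i].length;
    }
    var merged = new Float32Array(total);
    var offset = 0;
    for (var j = 0; j < frames.length; j++) {
        merged.set(frames[j], offset);
        offset += frames[j].length;
    }
    return merged;
}

// e.g. send the last ~3s when the notification arrives:
// fetch('/api/audio', { method: 'POST', body: framesToFloat32(frame_holder).buffer });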

Related

Web Audio audiocontext createMediaStreamSource stuttering

I want to mix different audio media streams into one stream. I've been doing this with the Web Audio AudioContext and createMediaStreamSource.
But the final mixed audio is stuttering.
Does anyone have an idea how to optimize this to avoid the stuttering?
// init audio context
var audioContext = new AudioContext({ latencyHint: 0 });
var audioDestination = audioContext.createMediaStreamDestination();
// add audio streams
audioContext.createMediaStreamSource(audioStream1).connect(audioDestination);
audioContext.createMediaStreamSource(audioStream2).connect(audioDestination);
audioContext.createMediaStreamSource(audioStream3).connect(audioDestination);
audioContext.createMediaStreamSource(audioStream4).connect(audioDestination);
// get mixed audio stream tracks
var audioTrack = audioDestination.stream.getTracks()[0];
// get video track
var videoTrack = videoStream.getTracks()[0];
// combine video and audio tracks into single stream.
var finalStream = new MediaStream([videoTrack, audioTrack]);
// assign to video element
el_video.srcObject = finalStream;
You could try setting the latencyHint to 'playback' like this:
const audioContext = new AudioContext({ latencyHint: 'playback' });
This allows the browser to add a bit of latency to the audio graph, which can help on underpowered devices. Setting the latencyHint to 0, on the other hand, tells the browser that it should do things as fast as possible, which increases the risk of dropouts.
Having said that, the latencyHint is only a hint. The browser may very well ignore it. You can check what the browser is actually doing by inspecting the baseLatency property.
console.log(audioContext.baseLatency);

How to stream PCM audio on HTML without lag?

The PCM audio data is captured in Unity3D in real time. All of that data is streamed to the HTML page via WebSockets. The general setup is Socket.IO with a Node.js server.
My main task is adding smooth audio playback to a live video+audio streaming solution on all platforms. This is my work in progress (video streaming): https://youtu.be/82_-a7WF3vs
The audio & video streaming part works well on non-HTML/non-WebGL platforms.
However, I couldn't get smooth audio playback in HTML with JavaScript. It runs in real time, but I found some lag issues that sound like noise...
One of my concerns is that web browsers do not support multi-threading, which adds some lag when receiving streaming data and playing it back at the same time.
Below is my core script for PCM playback. I hope someone can help me improve it.
var startTime = 0;
var audioCtx = new AudioContext();

function ProcessAudioData(_byte) {
    ReadyToGetFrame_aud = false;
    // Read metadata
    SourceSampleRate = ByteToInt32(_byte, 0);
    SourceChannels = ByteToInt32(_byte, 4);
    // Convert byte[] to float
    var BufferData = _byte.slice(8, _byte.length);
    AudioFloat = new Float32Array(BufferData.buffer);
    //=====================playback=====================
    if (AudioFloat.length > 0) StreamAudio(SourceChannels, AudioFloat.length, SourceSampleRate, AudioFloat);
    //=====================playback=====================
    ReadyToGetFrame_aud = true;
}

function StreamAudio(NUM_CHANNELS, NUM_SAMPLES, SAMPLE_RATE, AUDIO_CHUNKS) {
    var audioBuffer = audioCtx.createBuffer(NUM_CHANNELS, (NUM_SAMPLES / NUM_CHANNELS), SAMPLE_RATE);
    for (var channel = 0; channel < NUM_CHANNELS; channel++) {
        // This gives us the actual Float32Array that contains the data
        var nowBuffering = audioBuffer.getChannelData(channel);
        for (var i = 0; i < NUM_SAMPLES; i++) {
            var order = i * NUM_CHANNELS + channel;
            nowBuffering[i] = AUDIO_CHUNKS[order];
        }
    }
    var source = audioCtx.createBufferSource();
    source.buffer = audioBuffer;
    source.connect(audioCtx.destination);
    source.start(startTime);
    startTime += audioBuffer.duration;
}
How to stream PCM audio on HTML without lag?
There is always some lag with digital audio, no matter what you do. This has nothing to do with the web browser itself.
All those data will be streaming to HTML via WebSockets.
Why? The data is only going in one direction, so you can use a regular HTTP response and not have to worry about the overhead of Web Sockets.
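As a rough sketch of that plain-HTTP alternative (the /pcm-stream endpoint is hypothetical, handleChunk stands in for your own reframing plus ProcessAudioData call, and note that HTTP chunk boundaries will not necessarily line up with your PCM frames, so you still need your own framing):
fetch('/pcm-stream').then(function(response) {
    var reader = response.body.getReader();
    function pump() {
        return reader.read().then(function(result) {
            if (result.done) return;
            // result.value is a Uint8Array chunk of the PCM byte stream
            handleChunk(result.value);
            return pump();
        });
    }
    return pump();
});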
One of my concern is that Web Browsers do not support multi-threading
This isn't really accurate.
It runs in real time, but I found some lag issues that sound like noise...
What your code appears to do is take each PCM frame it receives and play it immediately. This isn't good, as the sound is wrecked if you don't play your received buffers contiguously. You must take the data and schedule it to play immediately after the current data is finished, not a sample too early or too late.
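A minimal sketch of that scheduling idea, reusing the asker's audioCtx and AudioBuffer creation from above (the 50 ms start margin is an arbitrary safety value):
var nextStartTime = 0;

function scheduleBuffer(audioBuffer) {
    var source = audioCtx.createBufferSource();
    source.buffer = audioBuffer;
    source.connect(audioCtx.destination);
    // Never schedule in the past; give the very first buffer a small head start
    nextStartTime = Math.max(nextStartTime, audioCtx.currentTime + 0.05);
    source.start(nextStartTime);
    // The next buffer begins exactly when this one ends
    nextStartTime += audioBuffer.duration;
}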
Traditionally this means doing your own buffering and setting up a ScriptProcessorNode to read from those buffers. However, this also requires some DIY resampling because the encoded rate may not be the same as the playback rate.
These days, I think that MediaSource Extensions supports PCM decoding, so you can just pipe your data through that and let the underlying system do all the work for you.

How to play media files sequentially without visible break?

I've worded my title and tags in a way that should be searchable for both video and audio, as this question isn't specific to one. My specific case only concerns audio, though, so my question body is written with that in mind.
First, the big picture:
I'm sending audio to multiple P2P clients who will connect and disconnect at random intervals. The audio I'm sending is a stream, but each client only needs the part of the stream from the point at which they connected. Here's how I solved that:
Every {timeout} (e.g. 1000ms), create a new audio blob
Blob will be a full audio file, with all metadata it needs to be playable
As soon as a blob is created, convert it to an array buffer (better browser support) and upload it to the client over WebRTC (or WebSockets if they don't support it)
That works well. There is a delay, but if you keep the timeout low enough, it's fine.
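(For context, a minimal sketch of one way to produce such self-contained blobs, assuming the capture side restarts a MediaRecorder every {timeout} so that each blob carries its own headers; the poster's actual capture code isn't shown:)
function recordChunks(stream, timeoutMs, onChunk) {
    (function recordOne() {
        var recorder = new MediaRecorder(stream);
        recorder.ondataavailable = function(e) { onChunk(e.data); }; // e.data is a complete, playable Blob
        recorder.start();
        setTimeout(function() {
            recorder.stop();   // flushes the blob via ondataavailable
            recordOne();       // immediately begin recording the next chunk
        }, timeoutMs);
    })();
}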
Now, my question:
How can I play my "stream" without having any audible delay?
I say stream, but I didn't implement it using the Streams API; it is a queue of blobs that gets updated every time the client gets new data.
I've tried a lot of different things like:
Creating a BufferSource, and merging two blobs (converted to audioBuffers) then playing that
Passing an actual stream from Stream API to clients instead of blobs
Playing blobs sequentially, relying on ended event
Loading next blob while current blob is playing
Each has problems, difficulties, or still results in an audible delay.
Here's my most recent attempt at this:
let firstTime = true;
const chunks = [];

Events.on('audio-received', ({ detail: audioChunk }) => {
    chunks.push(audioChunk);

    if (firstTime && chunks.length > 2) {
        const currentAudio = document.createElement("audio");
        currentAudio.controls = true;
        currentAudio.preload = 'auto';
        document.body.appendChild(currentAudio);
        currentAudio.src = URL.createObjectURL(chunks.shift());
        currentAudio.play();

        const nextAudio = document.createElement("audio");
        nextAudio.controls = true;
        nextAudio.preload = 'auto';
        document.body.appendChild(nextAudio);
        nextAudio.src = URL.createObjectURL(chunks.shift());

        let currentAudioStartTime, nextAudioStartTime;

        currentAudio.addEventListener("ended", () => {
            nextAudio.play();
            nextAudioStartTime = new Date();
            if (chunks.length) {
                currentAudio.src = URL.createObjectURL(chunks.shift());
            }
        });

        nextAudio.addEventListener("ended", () => {
            currentAudio.play();
            currentAudioStartTime = new Date();
            console.log(currentAudioStartTime - nextAudioStartTime);
            if (chunks.length) {
                nextAudio.src = URL.createObjectURL(chunks.shift());
            }
        });

        firstTime = false;
    }
});
The audio-received event gets called every ~1000ms. This code works; it plays each "chunk" after the last one was played, but on Chrome there is a ~300ms delay that's very audible. It plays the first chunk, then goes quiet, then plays the second, and so on. On Firefox the delay is about 50ms.
Can you help me?
I can try to create a reproducible example if that would help.

Concerning Web Audio nodes, what does .connect() do?

Trying to follow the example here, which is basically a c&p of this
Think I got most of the parts down, except all the node.connect()'s
From what I understand, this sequence of code is needed to provide the audio analyzer with an audio stream:
var source = audioCtx.createMediaStreamSource(stream);
source.connect(analyser);
analyser.connect(audioCtx.destination);
I can't seem to make sense of it as it looks rather ouroboros-y to me.
And unfortunately, I can't seem to find any documentation on .connect() so quite lost and would appreciate any clarification!
Oh, and I'm loading an .mp3 via pure JavaScript (new Audio('db.mp3').play();) and am trying to use that as the source without creating an <audio> element.
Can a mediaStream object be created from this to feed into .createMediaStreamSource(stream)?
connect simply defines where a filter's output goes: it routes the output of one node into the input of the next.
In this case, your source loads the stream into the buffer and writes to the input of the next filter, which is defined by the connect call. This is repeated for your analyser filter.
Think of it as pipes.
Here is a sample code snippet that I wrote a few years back using the Web Audio API.
this.scriptProcessor = this.audioContext.createScriptProcessor(this.scriptProcessorBufferSize,
                                                               this.scriptProcessorInputChannels,
                                                               this.scriptProcessorOutputChannels);
this.scriptProcessor.connect(this.audioContext.destination);
this.scriptProcessor.onaudioprocess = updateMediaControl.bind(this);

// Set up the Gain Node with a default value of 1 (max volume).
this.gainNode = this.audioContext.createGain();
this.gainNode.connect(this.audioContext.destination);
this.gainNode.gain.value = 1;

sewi.AudioResourceViewer.prototype.playAudio = function() {
    if (this.audioBuffer) {
        this.source = this.audioContext.createBufferSource();
        this.source.buffer = this.audioBuffer;
        this.source.connect(this.gainNode);
        this.source.connect(this.scriptProcessor);
        this.beginTime = Date.now();
        this.source.start(0, this.offset);
        this.isPlaying = true;
        this.controls.update({ playing: this.isPlaying });
        updateGraphPlaybackPosition.call(this, this.offset);
    }
};
So as you can see, my source is connected to both a gainNode and a scriptProcessor, each of which is connected to the destination. When the audio starts playing, the data flows from source -> gainNode -> destination and from source -> scriptProcessor -> destination, through the "pipes" that connect them, which are defined by connect(). As the audio data passes through the gainNode, the volume can be adjusted by changing the amplitude of the audio wave. The scriptProcessor is connected so that events can be attached and triggered while the audio is being processed.

polyphonic audio playback with node.js on raspberry pi

I've been trying to create polyphonic WAV playback with Node.js on a Raspberry Pi 3 running the latest Raspbian:
shelling out to aplay/mpg123/some other program - this only allows me to play a single sound at a time
I tried a combination of https://github.com/sebpiq/node-web-audio-api and https://github.com/TooTallNate/node-speaker (sample code below), but the audio quality is very low, with a lot of distortion
Is there anything I'm missing here? I know I could easily do it in another programming language (I was able to write C++ code with SDL, and Python with pygame), but the question is whether it's possible with Node.js :)
Here's my current web-audio-api + node-speaker code:
var AudioContext = require('web-audio-api').AudioContext;
var Speaker = require('speaker');
var fs = require('fs');

var track1 = './tracks/1.wav';
var track2 = './tracks/1.wav';

var context = new AudioContext();
context.outStream = new Speaker({
    channels: context.format.numberOfChannels,
    bitDepth: context.format.bitDepth,
    sampleRate: context.format.sampleRate
});

function play(audioBuffer) {
    if (!audioBuffer) { return; }
    var bufferSource = context.createBufferSource();
    bufferSource.connect(context.destination);
    bufferSource.buffer = audioBuffer;
    bufferSource.loop = false;
    bufferSource.start(0);
}

var audioData1 = fs.readFileSync(track1);
var audioData2 = fs.readFileSync(track2);
var audioBuffer1, audioBuffer2;

context.decodeAudioData(audioData1, function(audioBuffer) {
    audioBuffer1 = audioBuffer;
    if (audioBuffer1 && audioBuffer2) { playBoth(); }
});

context.decodeAudioData(audioData2, function(audioBuffer) {
    audioBuffer2 = audioBuffer;
    if (audioBuffer1 && audioBuffer2) { playBoth(); }
});

function playBoth() {
    console.log('playing...');
    play(audioBuffer1);
    play(audioBuffer2);
}
audio quality is very low, with a lot of distortion
According to the WebAudio spec (https://webaudio.github.io/web-audio-api/#SummingJunction):
No clipping is applied at the inputs or outputs of the AudioNode to allow a maximum of dynamic range within the audio graph.
Now if you're playing two audio streams, it's possible that summing them results in a value that's beyond the acceptable range, which sounds like distortion.
Try lowering the volume of each audio stream by first piping them through a GainNode, like so:
function play(audioBuffer) {
    if (!audioBuffer) { return; }
    var bufferSource = context.createBufferSource();
    var gainNode = context.createGain();
    gainNode.gain.value = 0.5; // for instance, find a good value
    bufferSource.connect(gainNode);
    gainNode.connect(context.destination);
    bufferSource.buffer = audioBuffer;
    bufferSource.loop = false;
    bufferSource.start(0);
}
Alternatively, you could use a DynamicsCompressorNode, but manually setting the gain gives you more control over the output.
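For reference, a rough sketch of the compressor variant (assuming the web-audio-api package implements createDynamicsCompressor; browsers do, but the Node package may not):
var compressor = context.createDynamicsCompressor(); // default threshold/ratio settings
compressor.connect(context.destination);

function playThroughCompressor(audioBuffer) {
    if (!audioBuffer) { return; }
    var bufferSource = context.createBufferSource();
    bufferSource.connect(compressor); // all sources share one compressor
    bufferSource.buffer = audioBuffer;
    bufferSource.start(0);
}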
This isn't exactly answer-worthy but I can't post comments at the moment ><
I had a similar problem with an app made using the JS audio API, and the rather easy fix was lowering the quality of the audio and changing the format.
In your case, what I can think of is setting the bit depth & sampling frequency as low as possible without affecting the listener's experience (e.g. 44.1 kHz and 16-bit depth).
You might also try changing the format; WAV, in theory, should be quite good at not being CPU-intensive, but there are other uncompressed formats (e.g. .aiff).
You may try using multiple cores of the pi:
https://nodejs.org/api/cluster.html
Although this may prove a bit complicated, if you are doing the audio streaming in parallel with other unrelated processes, you could try moving the audio to a separate core.
An (easy) thing you could try would be running Node with more RAM, although, in your case, I doubt that is possible.
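(If you do try it, the relevant switch is Node's V8 heap-size flag, e.g. the line below; app.js is a placeholder for your entry script, and whether the heap limit is actually the bottleneck is an assumption - on a 1 GB Pi it usually is not.)
node --max-old-space-size=256 app.js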
The biggest problem, however, might be the code; sadly, I am not experienced with the modules you are using and as such can't give real advice on that (hence why I said this is not answer-worthy :p)
When you create the Speaker instance, set its parameters explicitly, like this:
context.outStream = new Speaker({
    channels: 1,       // try 1 or 2 and use whichever gives the best quality
    bitDepth: 16,
    sampleRate: 48000  // normally 44100 for speech, higher for music playback
});
You can spawn two aplay processes from Node, each playing one file. Use detached: true to allow Node to continue running.
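A minimal sketch of that approach (the file paths are just examples):
var spawn = require('child_process').spawn;

function playDetached(file) {
    var child = spawn('aplay', [file], { detached: true, stdio: 'ignore' });
    child.unref(); // don't keep the Node event loop alive because of this child
}

playDetached('./tracks/1.wav');
playDetached('./tracks/2.wav');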
