polyphonic audio playback with node.js on raspberry pi - javascript

I've been trying to get polyphonic WAV playback working with node.js on a Raspberry Pi 3 running the latest Raspbian:
Shelling out to aplay/mpg123/some other program only lets me play a single sound at a time.
I tried a combination of https://github.com/sebpiq/node-web-audio-api and https://github.com/TooTallNate/node-speaker (sample code below), but the audio quality is very low, with a lot of distortions.
Is there anything I'm missing here? I know I could easily do it in another programming language (I was able to write C++ code with SDL, and Python with pygame), but the question is whether it's possible with node.js :)
Here's my current web-audio-api + node-speaker code:
var AudioContext = require('web-audio-api').AudioContext;
var Speaker = require('speaker');
var fs = require('fs');

var track1 = './tracks/1.wav';
var track2 = './tracks/1.wav';

var context = new AudioContext();
context.outStream = new Speaker({
  channels: context.format.numberOfChannels,
  bitDepth: context.format.bitDepth,
  sampleRate: context.format.sampleRate
});

function play(audioBuffer) {
  if (!audioBuffer) { return; }
  var bufferSource = context.createBufferSource();
  bufferSource.connect(context.destination);
  bufferSource.buffer = audioBuffer;
  bufferSource.loop = false;
  bufferSource.start(0);
}

var audioData1 = fs.readFileSync(track1);
var audioData2 = fs.readFileSync(track2);

var audioBuffer1, audioBuffer2;

context.decodeAudioData(audioData1, function(audioBuffer) {
  audioBuffer1 = audioBuffer;
  if (audioBuffer1 && audioBuffer2) { playBoth(); }
});

context.decodeAudioData(audioData2, function(audioBuffer) {
  audioBuffer2 = audioBuffer;
  if (audioBuffer1 && audioBuffer2) { playBoth(); }
});

function playBoth() {
  console.log('playing...');
  play(audioBuffer1);
  play(audioBuffer2);
}

audio quality is very low, with a lot of distortions
According to the WebAudio spec (https://webaudio.github.io/web-audio-api/#SummingJunction):
No clipping is applied at the inputs or outputs of the AudioNode to allow a maximum of dynamic range within the audio graph.
Now if you're playing two audio streams, it's possible that summing them results in a value that's beyond the acceptable range, which is audible as crackling and distortion.
Try lowering the volume of each audio stream by first piping it through a GainNode, like so:
function play(audioBuffer) {
  if (!audioBuffer) { return; }
  var bufferSource = context.createBufferSource();
  var gainNode = context.createGain();
  gainNode.gain.value = 0.5; // for instance, find a good value
  bufferSource.connect(gainNode);
  gainNode.connect(context.destination);
  bufferSource.buffer = audioBuffer;
  bufferSource.loop = false;
  bufferSource.start(0);
}
Alternatively, you could use a DynamicsCompressorNode, but manually setting the gain gives you more control over the output.
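For reference, a rough sketch of the compressor alternative, assuming your AudioContext implementation provides createDynamicsCompressor() (browsers do; node-web-audio-api's coverage may differ):

// Route every source through a shared compressor so summed peaks get tamed
// instead of clipping at the output.
var compressor = context.createDynamicsCompressor();
compressor.connect(context.destination);

function playCompressed(audioBuffer) {
  if (!audioBuffer) { return; }
  var bufferSource = context.createBufferSource();
  bufferSource.buffer = audioBuffer;
  bufferSource.connect(compressor);
  bufferSource.start(0);
}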

This isn't exactly answer-worthy but I can't post comments at the moment ><
I had a similar problem with an app made using the JS audio API, and the rather easy fix was lowering the quality of the audio and changing the format.
In your case, what I can think of is setting the bit depth and sampling frequency as low as possible without affecting the listener's experience (e.g. 44.1 kHz and 16-bit depth).
You might also try changing the format: WAV, in theory, should be quite good at not being CPU intensive; however, there are other uncompressed formats (e.g. .aiff).
You may try using multiple cores of the Pi:
https://nodejs.org/api/cluster.html
Although this may prove a bit complicated, if you are doing the audio streaming in parallel with other unrelated processes, you could try moving the audio to a separate core.
An (easy) thing you could try would be running node with more RAM, although, in your case, I doubt that is possible.
The biggest problem, however, might be the code; sadly I am not experienced with the modules you are using and as such can give no real advice on that (hence why I said this is not answer-worthy :p).

When you create the Speaker instance, set the parameters like this:
new Speaker({
  channels: 1, // try 1 or 2 and keep whichever sounds best
  bitDepth: 16,
  sampleRate: 48000 // normally 44100 for speech and higher for music playback
});

You can spawn two aplay processes from node, each playing one file. Use detached: true to allow node to continue running.
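A rough sketch of that approach (the track paths are placeholders):

// Play two WAV files at once by spawning two detached aplay processes.
var spawn = require('child_process').spawn;

function playDetached(file) {
  var child = spawn('aplay', [file], {
    detached: true,
    stdio: 'ignore' // don't tie the child's stdio to the node process
  });
  child.unref(); // let node keep running (or exit) independently of the player
  return child;
}

playDetached('./tracks/1.wav');
playDetached('./tracks/2.wav');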

Related

How to stream PCM audio on HTML without lag?

The PCM audio data is captured in Unity3D in real time. All of that data will be streamed to HTML via WebSockets. The general setup is Socket.IO with a node.js server.
My major task is adding smooth audio playback to a live video+audio streaming solution on all platforms. This is my work in progress (video streaming): https://youtu.be/82_-a7WF3vs
The audio & video streaming part works well on non-HTML/non-WebGL platforms.
However, I couldn't get smooth audio playback in HTML with JavaScript. It runs in real time, but I found some lag issues and noise...
One of my concerns is that web browsers do not support multi-threading, which adds some lag when receiving streaming data and playing it back at the same time.
Below is my core script for PCM playback. I hope someone can help me improve it.
var startTime = 0;
var audioCtx = new AudioContext();

function ProcessAudioData(_byte) {
  ReadyToGetFrame_aud = false;
  // read meta data
  SourceSampleRate = ByteToInt32(_byte, 0);
  SourceChannels = ByteToInt32(_byte, 4);
  // convert byte[] to float
  var BufferData = _byte.slice(8, _byte.length);
  AudioFloat = new Float32Array(BufferData.buffer);
  //=====================playback=====================
  if (AudioFloat.length > 0) StreamAudio(SourceChannels, AudioFloat.length, SourceSampleRate, AudioFloat);
  //=====================playback=====================
  ReadyToGetFrame_aud = true;
}

function StreamAudio(NUM_CHANNELS, NUM_SAMPLES, SAMPLE_RATE, AUDIO_CHUNKS) {
  var audioBuffer = audioCtx.createBuffer(NUM_CHANNELS, (NUM_SAMPLES / NUM_CHANNELS), SAMPLE_RATE);
  for (var channel = 0; channel < NUM_CHANNELS; channel++) {
    // This gives us the Float32Array holding this channel's samples
    var nowBuffering = audioBuffer.getChannelData(channel);
    for (var i = 0; i < NUM_SAMPLES; i++) {
      var order = i * NUM_CHANNELS + channel;
      nowBuffering[i] = AUDIO_CHUNKS[order];
    }
  }
  var source = audioCtx.createBufferSource();
  source.buffer = audioBuffer;
  source.connect(audioCtx.destination);
  source.start(startTime);
  startTime += audioBuffer.duration;
}
How to stream PCM audio on HTML without lag?
There is always some lag with digital audio, no matter what you do. This has nothing to do with the web browser itself.
All of that data will be streamed to HTML via WebSockets.
Why? The data is only going one direction so you can use a regular HTTP response and not have to worry about the overhead of Web Sockets.
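For instance, a rough sketch of consuming a one-way binary stream over plain HTTP in the browser (the /audio-stream endpoint and handleChunk are placeholders):

fetch('/audio-stream').then(function(response) {
  var reader = response.body.getReader();
  function pump() {
    return reader.read().then(function(result) {
      if (result.done) { return; }
      handleChunk(result.value); // result.value is a Uint8Array of raw bytes
      return pump();
    });
  }
  return pump();
});

function handleChunk(bytes) {
  // feed the bytes into the same decode/playback path used for the WebSocket data
}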
One of my concerns is that web browsers do not support multi-threading
This isn't really accurate.
It runs in real time, but I found some lag issues and noise...
What your code appears to do is take a PCM frame it receives and play it immediately. This isn't good, as the sound is wrecked if you don't play your received buffers contiguously. You must take the data and schedule it to play immediately after the current data finishes, not a sample too early or too late (see the sketch at the end of this answer).
Traditionally this means doing your own buffering and setting up a ScriptProcessorNode to read from those buffers. However, this also requires some DIY resampling because the encoded rate may not be the same as the playback rate.
These days, I think that MediaSource Extensions supports PCM decoding, so you can just pipe your data through that and let the underlying system do all the work for you.
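For illustration, a rough sketch of that kind of scheduling (not part of the original answer), assuming each incoming chunk has already been decoded into an AudioBuffer:

var playhead = 0; // AudioContext time at which the next chunk should start

function scheduleChunk(audioCtx, audioBuffer) {
  var source = audioCtx.createBufferSource();
  source.buffer = audioBuffer;
  source.connect(audioCtx.destination);

  // Never schedule in the past; leave a small safety margin on the first chunk.
  if (playhead < audioCtx.currentTime + 0.05) {
    playhead = audioCtx.currentTime + 0.05;
  }

  source.start(playhead);           // start exactly where the previous chunk ends
  playhead += audioBuffer.duration; // advance the playhead by the chunk's length
}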

rewriting Java code to JS - creating an audio from bytes?

I'm trying to rewrite some (very simple) android code I found written in Java into a static HTML5 app (I don't need a server to do anything, I'd like to keep it that way). I have extensive background in web development, but basic understanding of Java, and even less knowledge in Android development.
The only function of the app is to take some numbers and convert them into an audio chirp from bytes. I have absolutely no problem translating the mathematical logic into JS. Where I'm having trouble is when it gets to actually producing the sound. These are the relevant parts of the original code:
import android.media.AudioFormat;
import android.media.AudioManager;
import android.media.AudioTrack;
// later in the code:
AudioTrack track = new AudioTrack(AudioManager.STREAM_MUSIC, sampleRate, AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT, minBufferSize, AudioTrack.MODE_STATIC);
// some math, and then:
track.write(sound, 0, sound.length); // sound is an array of bytes
How do I do this in JS? I can use a dataURI to produce the sound from the bytes, but does that allow me to control the other information here (i.e., sample rate, etc.)? In other words: What's the simplest, most accurate way to do this in JS?
update
I have been trying to replicate what I found in this answer. This is the relevant part of my code:
window.onload = init;
var context; // Audio context
var buf; // Audio buffer

function init() {
  if (!window.AudioContext) {
    if (!window.webkitAudioContext) {
      alert("Your browser does not support any AudioContext and cannot play back this audio.");
      return;
    }
    window.AudioContext = window.webkitAudioContext;
  }
  context = new AudioContext();
}

function playByteArray(bytes) {
  var buffer = new Uint8Array(bytes.length);
  buffer.set(new Uint8Array(bytes), 0);
  context.decodeAudioData(buffer.buffer, play);
}

function play(audioBuffer) {
  var source = context.createBufferSource();
  source.buffer = audioBuffer;
  source.connect(context.destination);
  source.start(0);
}
However, when I run this I get this error:
Uncaught (in promise) DOMException: Unable to decode audio data
Which I find quite extraordinary, as it's such a general error it manages to beautifully tell me exactly squat about what is wrong. Even more surprising, when I debugged this step by step, even though the chain of the errors starts (expectedly) with the line context.decodeAudioData(buffer.buffer, play); it actually runs into a few more lines within the jQuery file (3.2.1, uncompressed), going through lines 5208, 5195, 5191, 5219, 5223 and lastly 5015 before erroring out. I have no clue why jQuery has anything to do with it, and the error gives me no idea what to try. Any ideas?
If bytes is an ArrayBuffer, it is not necessary to create a Uint8Array. You can pass the ArrayBuffer bytes directly to AudioContext.decodeAudioData(), which returns a Promise; chain .then() to .decodeAudioData() and call it with the play function as its parameter.
In the JavaScript below, an <input type="file"> element is used to accept the upload of an audio file; FileReader.prototype.readAsArrayBuffer() creates an ArrayBuffer from the File object, which is then passed to playByteArray.
window.onload = init;
var context; // Audio context
var buf; // Audio buffer
var reader = new FileReader(); // to create `ArrayBuffer` from `File`

function init() {
  if (!window.AudioContext) {
    if (!window.webkitAudioContext) {
      alert("Your browser does not support any AudioContext and cannot play back this audio.");
      return;
    }
    window.AudioContext = window.webkitAudioContext;
  }
  context = new AudioContext();
}

function handleFile(file) {
  console.log(file);
  reader.onload = function() {
    console.log(reader.result instanceof ArrayBuffer);
    playByteArray(reader.result); // pass `ArrayBuffer` to `playByteArray`
  };
  reader.readAsArrayBuffer(file);
}

function playByteArray(bytes) {
  context.decodeAudioData(bytes)
    .then(play)
    .catch(function(err) {
      console.error(err);
    });
}

function play(audioBuffer) {
  var source = context.createBufferSource();
  source.buffer = audioBuffer;
  source.connect(context.destination);
  source.start(0);
}

<input type="file" accept="audio/*" onchange="handleFile(this.files[0])" />
I solved it myself. I read more into the MDN docs explaining AudioBuffer and realized two important things:
I didn't need decodeAudioData (since I'm creating the data myself, there's nothing to decode). I actually took that bit from the answer I was replicating and, in retrospect, it was entirely needless.
Since I'm working with 16-bit PCM stereo, that meant I needed to use a Float32Array (2 channels, each 16-bit).
Granted, I still had a problem with some of my calculations that resulted in a distorted sound, but as far as producing the sound itself goes, I ended up with this really simple solution:
function playBytes(bytes) {
  var floats = new Float32Array(bytes.length);

  bytes.forEach(function(sample, i) {
    floats[i] = sample / 32767;
  });

  var buffer = context.createBuffer(1, floats.length, 48000),
      source = context.createBufferSource();
  buffer.getChannelData(0).set(floats);
  source.buffer = buffer;
  source.connect(context.destination);
  source.start(0);
}
I can probably optimize it a bit further: the division by 32767 should happen earlier, in the part where I'm calculating the data, for example. Also, I'm creating a Float32Array with two channels and then outputting only one of them, because I really don't need both. I couldn't figure out whether there's a way to create a one-channel mono buffer from an Int16Array, or whether that's even necessary/better.
Anyway, that's essentially it. It's really just the most basic solution, with some minimal understanding on my part of how to handle my data correctly. Hope this helps anyone out there.

Concerning Web Audio nodes, what does .connect() do?

Trying to follow the example here, which is basically a c&p of this
Think I got most of the parts down, except all the node.connect()'s
From what I understand, this sequence of code is needed to provide the audio analyzer with an audio stream:
var source = audioCtx.createMediaStreamSource(stream);
source.connect(analyser);
analyser.connect(audioCtx.destination);
I can't seem to make sense of it as it looks rather ouroboros-y to me.
And unfortunately, I can't seem to find any documentation on .connect() so quite lost and would appreciate any clarification!
Oh and I'm loading an .mp3 via pure javascript new Audio('db.mp3').play(); and am trying to use that as the source without creating an <audio> element.
Can a mediaStream object be created from this to feed into .createMediaStreamSource(stream)?
connect simply defines the output for the filters.
In this case, your source loads the stream into a buffer and writes it to the input of the next filter, which is defined by the connect call. This is repeated for your analyser filter.
Think of it as pipes.
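As a rough illustration (not part of the original answer), the three lines from the question can be read as building this chain; the analyser passes the audio through untouched and merely observes the samples on their way to the speakers:

// stream (MediaStream) --> source --> analyser --> destination (speakers)
var source = audioCtx.createMediaStreamSource(stream); // wrap the stream as a graph node
source.connect(analyser);                // pipe the source's output into the analyser
analyser.connect(audioCtx.destination);  // pipe the analyser's output on to the speakers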
here is a sample code snippet that I have written a few years back using web audio api.
this.scriptProcessor = this.audioContext.createScriptProcessor(this.scriptProcessorBufferSize,
this.scriptProcessorInputChannels,
this.scriptProcessorOutputChannels);
this.scriptProcessor.connect(this.audioContext.destination);
this.scriptProcessor.onaudioprocess = updateMediaControl.bind(this);
//Set up the Gain Node with a default value of 1(max volume).
this.gainNode = this.audioContext.createGain();
this.gainNode.connect(this.audioContext.destination);
this.gainNode.gain.value = 1;
sewi.AudioResourceViewer.prototype.playAudio = function(){
if(this.audioBuffer){
this.source = this.audioContext.createBufferSource();
this.source.buffer = this.audioBuffer;
this.source.connect(this.gainNode);
this.source.connect(this.scriptProcessor);
this.beginTime = Date.now();
this.source.start(0, this.offset);
this.isPlaying = true;
this.controls.update({playing: this.isPlaying});
updateGraphPlaybackPosition.call(this, this.offset);
}
};
So as you can see, my source is connected to both a gainNode and a scriptProcessor. When the audio starts playing, the data flows from source -> gainNode -> destination and from source -> scriptProcessor -> destination, through the "pipes" that connect them, which are defined by connect(). When the audio data passes through the gainNode, the volume can be adjusted by changing the amplitude of the audio wave. It also passes through the script processor, so that events can be attached and triggered while the audio is being processed.

how to export last 3s data of a web audio stream

Question: I am using the Web Audio API. I need to buffer a non-stop audio stream, like a radio stream, and when I get a notification, I need to grab the past 3 s of audio data and send it to the server. How can I achieve that? node.js has a built-in Buffer, but it doesn't seem to be a circular buffer; if I write a non-stop stream into it, it overflows.
Background to help you understand my question:
I am implementing an ambient-audio-based web authentication method. Briefly, I need to compare two pieces of audio signal (one from the client and one from the anchor device; both are time-synced with the server). If they are similar enough, the authentication request will be approved by the server. The audio recording is implemented on both the client and the anchor device using the Web Audio API.
I need to manage a buffer on the anchor device to stream the ambient audio. The anchor device is supposed to be running all the time, so the stream is never going to end.
You can capture the audio from a stream using a ScriptProcessorNode. Whilst this is deprecated, no browser as of now actually implements the new AudioWorker replacement.
var N = 1024;
var time = 3; // Desired capture time in seconds
var frame_holder = [];
var time_per_frame = N / context.sampleRate;
var num_frames = Math.ceil(time / time_per_frame); // Minimum number of frames to cover `time`

var script = context.createScriptProcessor(N, 1, 1);
script.connect(context.destination);

script.onaudioprocess = function(e) {
  var input = e.inputBuffer.getChannelData(0);
  var output = e.outputBuffer.getChannelData(0);
  var copy = new Float32Array(input.length);
  for (var n = 0; n < input.length; n++) {
    output[n] = 0.0; // Silence the output, as I guess you are capturing the microphone
    copy[n] = input[n];
  }
  // Add in the current frame, then keep only the most recent `num_frames` frames
  frame_holder.push(copy);
  if (frame_holder.length > num_frames) {
    frame_holder = frame_holder.slice(frame_holder.length - num_frames);
  }
};
Then for the actual transmission, you just need to string the copied frames together (see the sketch below). Keeping a list of frames is easier than maintaining one long array, though of course that is also possible.
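A rough sketch of that concatenation step (not part of the original answer):

// Flatten the list of Float32Array frames into a single Float32Array for sending.
function collectLastFrames(frames) {
  var total = 0;
  for (var i = 0; i < frames.length; i++) {
    total += frames[i].length;
  }
  var merged = new Float32Array(total);
  var offset = 0;
  for (var j = 0; j < frames.length; j++) {
    merged.set(frames[j], offset);
    offset += frames[j].length;
  }
  return merged;
}

// On notification, e.g.:
// socket.send(collectLastFrames(frame_holder).buffer);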

Web Audio Offline Context and Analyser Node

Is it possible to use the AnalyserNode in an OfflineAudioContext to do frequency analysis?
I found out that the ScriptProcessor's onaudioprocess event still fires in the OfflineAudioContext, and this was the only event source I could use to call getByteFrequencyData on the AnalyserNode. As below:
var offline = new OfflineAudioContext(1, buffer.length, 44100);

var bufferSource = offline.createBufferSource();
bufferSource.buffer = buffer;

var analyser = offline.createAnalyser();
var scp = offline.createScriptProcessor(256, 0, 1);

bufferSource.connect(analyser);
scp.connect(offline.destination); // this is necessary for the script processor to start

var freqData = new Uint8Array(analyser.frequencyBinCount);
scp.onaudioprocess = function() {
  analyser.getByteFrequencyData(freqData);
  console.log(freqData);
  // freqData is always the same.
};

bufferSource.start(0);
offline.startRendering();
The problem here is that the freqData array is always the same and never changes. It seems as if it is only analysing one section of the buffer.
Am I doing anything wrong here? Or is the Analyser not intended to be used in an offline context?
And here is the fiddle with the same code.
An alternative is to use the suspend and resume methods available on an OfflineAudioContext. After suspending, you can use your analyser node to get the desired time- and/or frequency-domain data. Since the context is stopped, this works just fine. If you're going to do this several times, you'll need to schedule the next suspend before resuming (see the sketch below).
Unfortunately, AFAIK, only Chrome has implemented suspend for an OfflineAudioContext.
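A rough sketch of that approach, reusing the buffer and analyser setup from the question (the 0.1 s step is arbitrary):

var offline = new OfflineAudioContext(1, buffer.length, 44100);
var bufferSource = offline.createBufferSource();
bufferSource.buffer = buffer;

var analyser = offline.createAnalyser();
bufferSource.connect(analyser);
analyser.connect(offline.destination);

var freqData = new Uint8Array(analyser.frequencyBinCount);
var step = 0.1; // seconds between analyser readings

function scheduleSuspend(t) {
  offline.suspend(t).then(function() {
    analyser.getByteFrequencyData(freqData);
    console.log('t =', t, freqData);
    // Schedule the next suspend *before* resuming, while rendering is still paused.
    if (t + step < buffer.duration) {
      scheduleSuspend(t + step);
    }
    offline.resume();
  });
}

scheduleSuspend(step);
bufferSource.start(0);
offline.startRendering().then(function() {
  console.log('rendering finished');
});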
The analyser isn't really intended to be used in the offlineContext; in fact, it was originally named "RealtimeAnalyser". But even more importantly, right now you won't get consistent functionality from script processors in offline contexts, either.
