I want to create live audio streaming over WebSockets.
What works for me: listening to the microphone, creating PCM, loading the PCM into a BufferSource, and playing it back.
What doesn't work: sending the PCM over WebSockets or any other protocol.
I request microphone permission with:
navigator.getUserMedia({audio: true}, initializeRecorder, errorCallback);
Then I record from the microphone and create PCM:
function initializeRecorder(MediaStream) {
  var sourceNode = audioCtx.createMediaStreamSource(MediaStream);
  var recorder = audioCtx.createScriptProcessor(2048, 2, 2);
  recorder.onaudioprocess = recorderProcess;
  sourceNode.connect(recorder);
  recorder.connect(audioCtx.destination);
}
Later I load every PCM chunk into a BufferSource and play it back:
function recorderProcess(e) {
  var buff = e.inputBuffer;
  var source = audioCtx.createBufferSource();
  source.buffer = buff;
  source.connect(audioCtx.destination);
  source.start();
}
It works well.
But if I want to send var buff to the server, I get for every PCM.
This PCM is of type Float32Array, and I don't understand why I can't send it as it is.
I can convert it to Int16 with this script:
function convertFloat32ToInt16(buffer) {
  var l = buffer.length;
  var buf = new Int16Array(l);
  while (l--) {
    // Clamp to [-1, 1] on both sides so out-of-range samples do not wrap around
    buf[l] = Math.max(-1, Math.min(1, buffer[l])) * 0x7FFF;
  }
  return buf.buffer;
}
But I don't know how to decode it back to a Float32Array later so that I can push it to a BufferSource on another client.
I think you can.
What's the problem if you try:
var mybuffer = new Float32Array(pcmDataFromSocket);
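One caveat worth spelling out: if the sender used convertFloat32ToInt16, passing the received ArrayBuffer straight to new Float32Array(...) reinterprets the raw bytes as floats rather than converting the values. A minimal sketch of the decoding step, assuming the socket delivers an ArrayBuffer of 16-bit samples in the platform's byte order; the name int16ToFloat32 is hypothetical, and the commented playback part assumes both clients use the same sample rate:

```javascript
// Hypothetical helper: turn 16-bit PCM (as produced by convertFloat32ToInt16)
// back into float samples in [-1, 1].
function int16ToFloat32(arrayBuffer) {
  // View the bytes as signed 16-bit integers (byteLength must be a multiple of 2)
  const int16 = new Int16Array(arrayBuffer);
  const out = new Float32Array(int16.length);
  for (let i = 0; i < int16.length; i++) {
    out[i] = int16[i] / 0x7FFF; // undo the * 0x7FFF scaling used by the sender
  }
  return out;
}

// Browser-side usage sketch (assumes an existing audioCtx and socket `ws`):
// ws.binaryType = "arraybuffer";
// ws.onmessage = function (e) {
//   const samples = int16ToFloat32(e.data);
//   const buffer = audioCtx.createBuffer(1, samples.length, audioCtx.sampleRate);
//   buffer.copyToChannel(samples, 0);
//   const source = audioCtx.createBufferSource();
//   source.buffer = buffer;
//   source.connect(audioCtx.destination);
//   source.start();
// };
```

If the data was sent as an untouched Float32Array instead, new Float32Array(pcmDataFromSocket) is enough, provided pcmDataFromSocket is an ArrayBuffer.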
I would like to play audio from a web socket that sends packages of sound data of unknown total length. The playback should start as soon as the first package arrives and it should not be interrupted by new packages.
What I have done so far:
ws.onmessage = e => {
  const soundDataBase64 = JSON.parse(e.data);
  const bytes = window.atob(soundDataBase64);
  const arrayBuffer = new window.ArrayBuffer(bytes.length);
  const bufferView = new window.Uint8Array(arrayBuffer);
  for (let i = 0; i < bytes.length; i++) {
    bufferView[i] = bytes.charCodeAt(i);
  }
  const blob = new Blob([arrayBuffer], {"type": "audio/mp3"});
  const objectURL = window.URL.createObjectURL(blob);
  const audio = document.createElement("audio");
  audio.src = objectURL;
  audio.controls = "controls";
  document.body.appendChild(audio);
};
However, to my knowledge, it is not possible to extend the size of ArrayBuffer and Uint8Array. I would have to create a new blob, object URL and assign it to the audio element. But I guess, this would interrupt the audio playback.
On the MDN page of <audio>, there is a hint to MediaStream, which looks promising. However, I am not quite sure how to write data onto a media stream and how to connect the media stream to an audio element.
Is it currently possible with JS to write something like pipe where I can input data on one end, which is then streamed to a consumer? How would seamless streaming be achieved in JS (preferably without a lot of micro management code)?
As @Kaiido pointed out in the comments, I can use the MediaSource object. After connecting a MediaSource object to an <audio> element in the DOM, I can add a SourceBuffer to an opened MediaSource object and then append ArrayBuffers to the SourceBuffer.
Example:
const ws = new window.WebSocket(url);
ws.onmessage = _ => {
  console.log("Media source not ready yet... discard this package");
};
const mediaSource = new window.MediaSource();
const audio = document.createElement("audio");
audio.src = window.URL.createObjectURL(mediaSource);
audio.controls = true;
document.body.appendChild(audio);
mediaSource.onsourceopen = _ => {
  const sourceBuffer = mediaSource.addSourceBuffer("audio/mpeg"); // mpeg appears to not work in Firefox, unfortunately :(
  ws.onmessage = e => {
    const soundDataBase64 = JSON.parse(e.data);
    const bytes = window.atob(soundDataBase64);
    const arrayBuffer = new window.ArrayBuffer(bytes.length);
    const bufferView = new window.Uint8Array(arrayBuffer);
    for (let i = 0; i < bytes.length; i++) {
      bufferView[i] = bytes.charCodeAt(i);
    }
    sourceBuffer.appendBuffer(arrayBuffer);
  };
};
I tested this successfully in Google Chrome 94. Unfortunately, in Firefox 92, the MIME type audio/mpeg does not seem to work. There, I get the error Uncaught DOMException: MediaSource.addSourceBuffer: Type not supported in MediaSource and the warning Cannot play media. No decoders for requested formats: audio/mpeg.
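One detail the example does not handle: appendBuffer throws an InvalidStateError when it is called while the SourceBuffer is still processing a previous append (its updating flag is true), which can happen when packages arrive quickly. A minimal sketch of a queue that serializes the appends; createAppendQueue is a hypothetical helper name, and it only relies on the updating flag, appendBuffer and the updateend event of the object it is given:

```javascript
// Hypothetical helper: feeds chunks to a SourceBuffer one at a time.
function createAppendQueue(sourceBuffer) {
  const pending = [];

  // Append the next chunk only when the SourceBuffer is idle
  function flush() {
    if (sourceBuffer.updating || pending.length === 0) return;
    sourceBuffer.appendBuffer(pending.shift());
  }

  // "updateend" fires after each append finishes, so the queue drains itself
  sourceBuffer.addEventListener("updateend", flush);

  return {
    push(chunk) {
      pending.push(chunk);
      flush();
    },
    get size() {
      return pending.length;
    }
  };
}

// Browser-side usage sketch, inside mediaSource.onsourceopen:
// if (!MediaSource.isTypeSupported("audio/mpeg")) { /* fall back to another format */ }
// const queue = createAppendQueue(mediaSource.addSourceBuffer("audio/mpeg"));
// ws.onmessage = e => queue.push(decodeBase64ToArrayBuffer(e.data)); // hypothetical decoder
```

MediaSource.isTypeSupported can be checked up front to detect the Firefox case before addSourceBuffer throws.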
I am using the following JavaScript to record audio and send it to a WebSocket server:
const recordAudio = () =>
  new Promise(async resolve => {
    const constraints = {
      audio: {
        sampleSize: 16,
        channelCount: 1,
        sampleRate: 8000
      },
      video: false
    };
    var mediaRecorder;
    const stream = await navigator.mediaDevices.getUserMedia(constraints);
    var options = {
      audioBitsPerSecond: 128000,
      mimeType: 'audio/webm;codecs=pcm'
    };
    mediaRecorder = new MediaRecorder(stream, options);
    var track = stream.getAudioTracks()[0];
    var constraints2 = track.getConstraints();
    var settings = track.getSettings();
    const audioChunks = [];
    mediaRecorder.addEventListener("dataavailable", event => {
      audioChunks.push(event.data);
      webSocket.send(event.data);
    });
    const start = () => mediaRecorder.start(30);
    const stop = () =>
      new Promise(resolve => {
        mediaRecorder.addEventListener("stop", () => {
          const audioBlob = new Blob(audioChunks);
          const audioUrl = URL.createObjectURL(audioBlob);
          const audio = new Audio(audioUrl);
          const play = () => audio.play();
          resolve({
            audioBlob,
            audioUrl,
            play
          });
        });
        mediaRecorder.stop();
      });
    resolve({
      start,
      stop
    });
  });
This is for real-time STT, and the WebSocket server refused to send any response. By debugging, I confirmed that the sampleRate is not changing to 8 kHz. Upon researching, I found out that this is a known bug in both Chrome and Firefox. I found some other resources like stackoverflow1 and IBM_STT, but I have no idea how to adapt them to my code.
The helpful resources above refer to a buffer, but all I have in my code is a mediaStream (stream) and event.data (a blob).
I am new to both JavaScript and the Audio API, so please pardon me if I did something wrong.
If this helps, I have equivalent Python code that sends data from the mic to the WebSocket server, and it works. Library used: PyAudio. Code:
import pyaudio

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16,  # the constant itself, not a string
                channels=1,
                rate=8000,
                input=True,
                frames_per_buffer=10)
print("* recording, please speak")
packet_size = int((30/1000)*8000)  # 240 frames per 30 ms packet, i.e. 480 bytes
frames = []
#while True:
for i in range(0, 1000):
    packet = stream.read(packet_size)
    ws.send(packet, binary=True)  # ws: WebSocket client created elsewhere
To do real-time downsampling, follow these steps:
First, get a stream instance:
const stream = await navigator.mediaDevices.getUserMedia(constraints);
Create a media stream source from this stream.
var input = audioContext.createMediaStreamSource(stream);
Create a script processor so that you can work with the buffers. Here I create one that continuously takes 4096 samples from the stream at a time and has 1 input channel and 1 output channel.
var scriptNode = audioContext.createScriptProcessor(4096, 1, 1);
Connect your input to scriptNode. You can connect scriptNode to the destination as your requirements dictate.
input.connect(scriptNode);
scriptNode.connect(audioContext.destination);
The script processor has an onaudioprocess handler in which you can do whatever you want with each block of 4096 samples. downsampled will contain the input sample count divided by the sampling ratio. floatTo16BitPCM then converts that to the format you need, since the original data is in 32-bit float format.
scriptNode.onaudioprocess = function (audioProcessingEvent) {
  var inputBuffer = audioProcessingEvent.inputBuffer;
  // The output buffer contains the samples that will be modified and played
  var outputBuffer = audioProcessingEvent.outputBuffer;
  // Loop through the output channels (in this case there is only one)
  for (var channel = 0; channel < outputBuffer.numberOfChannels; channel++) {
    var inputData = inputBuffer.getChannelData(channel);
    var outputData = outputBuffer.getChannelData(channel);
    var downsampled = downsample(inputData);
    var sixteenBitBuffer = floatTo16BitPCM(downsampled);
  }
};
Your sixteenBitBuffer will contain the data you require.
The downsample and floatTo16BitPCM functions are explained at this link to the Watson API: IBM Watson Speech to Text API
You won't need a MediaRecorder instance. The Watson API is open source, and you can look at how they implemented it for their use case to find a more streamlined approach. You should be able to salvage the important functions from their code.
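For orientation, here is a rough sketch of what the two helpers could look like. This is not the Watson implementation: it is a simple block-averaging downsampler with no proper low-pass filter, and the explicit inputRate and outputRate parameters are an assumption added for illustration (the snippet above calls downsample with a single argument). The 16-bit output is written little-endian.

```javascript
// Sketch: reduce the sample rate by averaging each group of input samples
// that maps onto one output sample.
function downsample(samples, inputRate, outputRate) {
  if (outputRate >= inputRate) return samples.slice(); // nothing to reduce
  const ratio = inputRate / outputRate;
  const outLength = Math.floor(samples.length / ratio);
  const out = new Float32Array(outLength);
  for (let i = 0; i < outLength; i++) {
    const start = Math.floor(i * ratio);
    const end = Math.min(samples.length, Math.floor((i + 1) * ratio));
    let sum = 0;
    for (let j = start; j < end; j++) sum += samples[j];
    out[i] = sum / (end - start);
  }
  return out;
}

// Sketch: convert float samples in [-1, 1] to signed 16-bit little-endian PCM.
function floatTo16BitPCM(samples) {
  const view = new DataView(new ArrayBuffer(samples.length * 2));
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i])); // clamp before scaling
    view.setInt16(i * 2, s < 0 ? s * 0x8000 : s * 0x7FFF, true);
  }
  return view.buffer;
}

// Usage sketch inside onaudioprocess (audioContext and webSocket assumed to exist):
// const pcm = floatTo16BitPCM(downsample(inputData, audioContext.sampleRate, 8000));
// webSocket.send(pcm);
```

With a 44100 Hz context, each 4096-sample block yields 743 samples at 8000 Hz, so packets will not line up exactly with the 240-frame packets of the Python version unless they are re-chunked before sending.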
I am receiving raw wave data from a server, and I need to play this array of bytes on the client side.
I tried to use decodeAudioData as in this link, but I got the error:
DOMException : Unable to decode Audio data.
That makes sense, because my raw data is not a regular mp3 file; it is a wave that needs to be played at a rate of 8000 Hz, with 1 channel and 16 bits per sample.
Is there a function to play a byte array received from the server with a given sample rate and number of channels?
I managed to play the bytes in the browser using this method:
function playWave(byteArray) {
  var audioCtx = new (window.AudioContext || window.webkitAudioContext)();
  var myAudioBuffer = audioCtx.createBuffer(1, byteArray.length, 8000);
  var nowBuffering = myAudioBuffer.getChannelData(0);
  for (var i = 0; i < byteArray.length; i++) {
    nowBuffering[i] = byteArray[i];
  }
  var source = audioCtx.createBufferSource();
  source.buffer = myAudioBuffer;
  source.connect(audioCtx.destination);
  source.start();
}
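Note that an AudioBuffer channel expects float samples between -1 and 1, so if byteArray really holds raw bytes of 16-bit audio, every two bytes have to be combined into one sample and scaled first. A minimal sketch of that step, assuming headerless, signed, little-endian 16-bit mono PCM in a Uint8Array; pcm16BytesToFloat32 is a hypothetical name, and if the data starts with a WAV header, that header would need to be skipped before conversion:

```javascript
// Hypothetical helper: raw 16-bit little-endian PCM bytes -> float samples.
function pcm16BytesToFloat32(bytes) {
  // Respect byteOffset in case `bytes` is a view into a larger buffer
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const sampleCount = Math.floor(bytes.byteLength / 2); // 2 bytes per sample
  const samples = new Float32Array(sampleCount);
  for (let i = 0; i < sampleCount; i++) {
    samples[i] = view.getInt16(i * 2, true) / 0x8000; // scale to [-1, 1)
  }
  return samples;
}

// Browser-side usage sketch, adapting playWave:
// var samples = pcm16BytesToFloat32(byteArray);
// var myAudioBuffer = audioCtx.createBuffer(1, samples.length, 8000);
// myAudioBuffer.getChannelData(0).set(samples);
```

The buffer length then equals the number of samples rather than the number of bytes, so the clip plays at its real duration.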
I have the audio buffer of a prerecorded audio file in my application.
I'm trying to get the frequency domain data of the ENTIRE audio track, this is what I've tried:
getAudioDataFromBuffer: function(buf){
  var src = g.audioContext.createBufferSource();
  src.buffer = buf;
  var anal = src.context.createAnalyser();
  src.connect(anal);
  var dataArray = new Uint8Array(buf.length);
  anal.fftSize = 2048;
  anal.getByteFrequencyData(dataArray);
  return dataArray;
},
But this only gives me an array full of zeros.
I need this to compare two audio tracks, one is prerecorded and the other is recorded in the application. I'm thinking I could measure the correlation between their frequency domains.
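One possible reading of "correlation between their frequency domains" is the Pearson correlation coefficient between two equal-length arrays of frequency magnitudes, for example two Uint8Array results from getByteFrequencyData. A minimal sketch under that assumption; pearsonCorrelation is a hypothetical helper, not part of the Web Audio API:

```javascript
// Sketch: Pearson correlation of two equal-length numeric arrays.
// Returns a value in [-1, 1]; returns 0 when either input is constant.
function pearsonCorrelation(a, b) {
  if (a.length !== b.length || a.length === 0) {
    throw new RangeError("inputs must be non-empty and of equal length");
  }
  const n = a.length;
  let meanA = 0;
  let meanB = 0;
  for (let i = 0; i < n; i++) {
    meanA += a[i];
    meanB += b[i];
  }
  meanA /= n;
  meanB /= n;

  let cov = 0;
  let varA = 0;
  let varB = 0;
  for (let i = 0; i < n; i++) {
    const da = a[i] - meanA;
    const db = b[i] - meanB;
    cov += da * db;
    varA += da * da;
    varB += db * db;
  }
  const denom = Math.sqrt(varA * varB);
  return denom === 0 ? 0 : cov / denom;
}
```

A value near 1 means the two spectra rise and fall together; the tracks would still need to be aligned in time and compared frame by frame for this to say much about whole recordings.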
I arrived at the solution after seeing this answer and this discussion.
Basically, you need to use an OfflineAudioContext. Here is the code, starting from an already loaded audio buffer:
var offline = new OfflineAudioContext(2, buffer.length ,44100);
var bufferSource = offline.createBufferSource();
bufferSource.buffer = buffer;
var analyser = offline.createAnalyser();
var scp = offline.createScriptProcessor(256, 0, 1);
bufferSource.connect(analyser);
scp.connect(offline.destination); // this is necessary for the script processor to start
var freqData = new Uint8Array(analyser.frequencyBinCount);
scp.onaudioprocess = function(){
  analyser.getByteFrequencyData(freqData);
  console.log(freqData);
};
bufferSource.start(0);
offline.oncomplete = function(e){
  console.log('analysed');
};
offline.startRendering();
Here's a working example using the latest version of the Web Audio API:
Note: You need to start with an audioBuffer. You can get one using the new File System Access API:
const [fileHandle] = await window.showOpenFilePicker();
const file = await fileHandle.getFile();
const arrayBuffer = await file.arrayBuffer();
const audioCtx = new (window.AudioContext || window.webkitAudioContext)();
const audioBuffer = await audioCtx.decodeAudioData(arrayBuffer);
Once you have the audioBuffer, you can access its contents using an OfflineAudioContext:
const offlineAudioContext = new OfflineAudioContext(
  audioBuffer.numberOfChannels,
  audioBuffer.length,
  audioBuffer.sampleRate
);
const bufferSourceNode = offlineAudioContext.createBufferSource();
bufferSourceNode.buffer = audioBuffer; // without a buffer the source renders silence
bufferSourceNode.connect(offlineAudioContext.destination); // route the source into the rendered output
bufferSourceNode.start(0);
offlineAudioContext
  .startRendering()
  .then(renderedBuffer => {
    const data = renderedBuffer.getChannelData(0);
    for (let i = 0, length = data.length; i < length; i += 1) {
      // careful here, as you can hang the browser by logging this data
      // because 1 second of audio contains 22k ~ 96k samples!
      if (!(i % 1000) && i < 250000) console.log(data[i]);
    }
  });
I think you need something more like
AudioBuffer.getChannelData()
Returns a Float32Array containing the PCM data associated with the channel, defined by the channel parameter (with 0 representing the first channel).
Look it up in the Mozilla or W3C documentation.
Cheers
Kilian
I tried concatenating audio blobs using the WebRTC experiment by Muaz Khan, but when I play the concatenated audio, the HTML audio element does not show the full length of the audio file, and the issue persists if you download the file and play it. I used ffmpeg to concatenate these blobs, but is there a way to concatenate audio blobs using the WebRTC JS experiment by Muaz Khan? A similar attempt that also did not work out: Combine two audio blob recordings
The best way is to convert the blobs into AudioBuffers (convert each blob into an ArrayBuffer using FileReader, then decode those ArrayBuffers into AudioBuffers). You can then merge two or more AudioBuffers and use the result.
The following code will work in this situation:
var blob = "YOUR AUDIO BLOB"; // placeholder: replace with a real Blob
var arrayBuffer = []; // AudioBuffers decoded so far
var resultantbuffer; // the combined AudioBuffer
var f = new FileReader();
f.onload = function (e) {
  audioContext.decodeAudioData(e.target.result, function (buffer) {
    arrayBuffer.push(buffer);
    if (arrayBuffer.length > 1) {
      resultantbuffer = appendBuffer(arrayBuffer[0], arrayBuffer[1]);
      arrayBuffer = [];
      arrayBuffer.push(resultantbuffer);
    }
    else {
      resultantbuffer = buffer;
    }
  }, function (e) {
    console.warn(e);
  });
};
f.readAsArrayBuffer(blob);
This code reads the blob, converts it into an ArrayBuffer (e.target.result), and decodes that into an AudioBuffer (buffer). I used an appendBuffer method to append one AudioBuffer to another. Here is the method:
function appendBuffer(buffer1, buffer2) {
  // Using AudioBuffer
  var numberOfChannels = Math.min(buffer1.numberOfChannels, buffer2.numberOfChannels);
  var tmp = audioContext.createBuffer(numberOfChannels, (buffer1.length + buffer2.length), buffer1.sampleRate);
  for (var i = 0; i < numberOfChannels; i++) {
    var channel = tmp.getChannelData(i);
    channel.set(buffer1.getChannelData(i), 0);
    channel.set(buffer2.getChannelData(i), buffer1.length);
  }
  return tmp;
}
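The same idea extends to any number of recordings. A minimal sketch of the per-channel step, written against plain Float32Array data so it can be tried outside the browser; concatChannelData is a hypothetical helper, and it assumes all chunks share the same sample rate:

```javascript
// Sketch: join several channel-data arrays into one contiguous Float32Array.
function concatChannelData(chunks) {
  let total = 0;
  for (const chunk of chunks) total += chunk.length;

  const joined = new Float32Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    joined.set(chunk, offset); // copy this chunk right after the previous one
    offset += chunk.length;
  }
  return joined;
}

// Browser-side usage sketch for channel 0 of several decoded AudioBuffers
// (`buffers` and `audioContext` are assumed to exist):
// const data = concatChannelData(buffers.map(b => b.getChannelData(0)));
// const out = audioContext.createBuffer(1, data.length, buffers[0].sampleRate);
// out.getChannelData(0).set(data);
```

Allocating the result once avoids the repeated copying that happens when buffers are appended pairwise.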
Do let me know if you face any problem.