I have the audio buffer of a prerecorded audio file in my application.
I'm trying to get the frequency domain data of the ENTIRE audio track, this is what I've tried:
getAudioDataFromBuffer: function(buf){
var src = g.audioContext.createBufferSource();
src.buffer = buf;
var anal = src.context.createAnalyser();
src.connect(anal);
var dataArray = new Uint8Array(buf.length);
anal.fftSize = 2048;
anal.getByteFrequencyData(dataArray);
return dataArray;
},
But this only gives me an array full of zeros.
I need this to compare two audio tracks, one is prerecorded and the other is recorded in the application. I'm thinking I could measure the correlation between their frequency domains.
I arrived to the solution seeing this answer and this discussion.
Basically you need to use an OfflineAudioContext. Here the code staring from an already loaded audio buffer:
var offline = new OfflineAudioContext(2, buffer.length ,44100);
var bufferSource = offline.createBufferSource();
bufferSource.buffer = buffer;
var analyser = offline.createAnalyser();
var scp = offline.createScriptProcessor(256, 0, 1);
bufferSource.connect(analyser);
scp.connect(offline.destination); // this is necessary for the script processor to start
var freqData = new Uint8Array(analyser.frequencyBinCount);
scp.onaudioprocess = function(){
analyser.getByteFrequencyData(freqData);
console.log(freqData);
};
bufferSource.start(0);
offline.oncomplete = function(e){
console.log('analysed');
};
offline.startRendering();
Here's a working example using the latest version of the Web Audio API:
Note: You need to start with an audioBuffer.. you can get one using the new File System Access API:
const [fileHandle] = await window.showOpenFilePicker();
const file = await fileHandle.getFile();
const arrayBuffer = await file.arrayBuffer();
const audioCtx = new (window.AudioContext || window.webkitAudioContext)();
const audioBuffer = await audioCtx.decodeAudioData(arrayBuffer);
Once you have the audioBuffer, you can access it's contents using offlineAudioContext:
const offlineAudioContext = new OfflineAudioContext(
audioBuffer.numberOfChannels,
audioBuffer.length,
audioBuffer.sampleRate
);
const bufferSourceNode = offlineAudioContext.createBufferSource();
bufferSourceNode.start(0);
offlineAudioContext
.startRendering()
.then(renderedBuffer => {
const data = renderedBuffer.getChannelData(0);
for (let i = 0, length = data.length; i < length; i += 1) {
// careful here, as you can hang the browser by logging this data
// because 1 second of audio contains 22k ~ 96k samples!
if (!(i % 1000) && i < 250000) console.log(data[i]);
}
}
I think you need more something like
AudioBuffer.getChannelData()
Returns a Float32Array containing the PCM data associated with the channel, defined by the channel parameter (with 0 representing the first channel).
Lookup at Mozilla or W3C documentation.
Cheers
Kilian
Related
I would like to play audio from a web socket that sends packages of sound data of unknown total length. The playback should start as soon as the first package arrives and it should not be interrupted by new packages.
What I have done so far:
ws.onmessage = e => {
const soundDataBase64 = JSON.parse(e.data);
const bytes = window.atob(soundDataBase64);
const arrayBuffer = new window.ArrayBuffer(bytes.length);
const bufferView = new window.Uint8Array(arrayBuffer);
for (let i = 0; i < bytes.length; i++) {
bufferView[i] = bytes.charCodeAt(i);
}
const blob = new Blob([arrayBuffer], {"type": "audio/mp3"});
const objectURL = window.URL.createObjectURL(blob);
const audio = document.createElement("audio");
audio.src = objectURL;
audio.controls = "controls";
document.body.appendChild(audio);
};
However, to my knowledge, it is not possible to extend the size of ArrayBuffer and Uint8Array. I would have to create a new blob, object URL and assign it to the audio element. But I guess, this would interrupt the audio playback.
On the MDN page of <audio>, there is a hint to MediaStream, which looks promising. However, I am not quite sure how to write data onto a media stream and how to connect the media stream to an audio element.
Is it currently possible with JS to write something like pipe where I can input data on one end, which is then streamed to a consumer? How would seamless streaming be achieved in JS (preferably without a lot of micro management code)?
As #Kaiido pointed out in the comments, I can use the MediaSource object. After connecting a MediaSource object to an <audio> element in the DOM, I can add a SourceBuffer to an opened MediaSource object and then append ArrayBuffers to the SourceBuffer.
Example:
const ws = new window.WebSocket(url);
ws.onmessage = _ => {
console.log("Media source not ready yet... discard this package");
};
const mediaSource = new window.MediaSource();
const audio = document.createElement("audio");
audio.src = window.URL.createObjectURL(mediaSource);
audio.controls = true;
document.body.appendChild(audio);
mediaSource.onsourceopen = _ => {
const sourceBuffer = mediaSource.addSourceBuffer("audio/mpeg"); // mpeg appears to not work in Firefox, unfortunately :(
ws.onmessage = e => {
const soundDataBase64 = JSON.parse(e.data);
const bytes = window.atob(soundDataBase64);
const arrayBuffer = new window.ArrayBuffer(bytes.length);
const bufferView = new window.Uint8Array(arrayBuffer);
for (let i = 0; i < bytes.length; i++) {
bufferView[i] = bytes.charCodeAt(i);
}
sourceBuffer.appendBuffer(arrayBuffer);
};
};
I tested this successfully in Google Chrome 94. Unfortunately, in Firefox 92, the MIME type audio/mpeg seems not working. There, I get the error Uncaught DOMException: MediaSource.addSourceBuffer: Type not supported in MediaSource and the warning Cannot play media. No decoders for requested formats: audio/mpeg.
I am using cordova-plugin-audioinput for recording audio in my cordova based app.
The documentation can be found here : https://www.npmjs.com/package/cordova-plugin-audioinput
I was previously using the MediaRecorder function of the browser to record audio but I switched to the plugin due to audio quality issues.
My problem is that I have a realtime visualizer of the volume during the record, my function used to work using an input stream from the media recorder
function wave(stream) {
audioContext = new AudioContext();
analyser = audioContext.createAnalyser();
microphone = audioContext.createMediaStreamSource(stream);
javascriptNode = audioContext.createScriptProcessor(2048, 1, 1);
analyser.smoothingTimeConstant = 0.8;
analyser.fftSize = 1024;
microphone.connect(analyser);
analyser.connect(javascriptNode);
javascriptNode.connect(audioContext.destination);
javascriptNode.onaudioprocess = function () {
var array = new Uint8Array(analyser.frequencyBinCount);
analyser.getByteFrequencyData(array);
var values = 0;
var length = array.length;
for (var i = 0; i < length; i++) {
values += (array[i]);
}
var average = values / length;
// use average for visualization
}
}
Now that I use the cordova-plugin-audioinput, I can't find a way to retrieve the stream from the microphone even though the documentation mention a "streamToWebAudio" parameter, I can't find a way to make it work.
Any insight on this ?
Thanks you in advance !
I believe you have to connect the analyser instead, such as
function wave(stream) {
var audioContext = new AudioContext();
var analyser = audioContext.createAnalyser();
analyser.connect(audioContext.destination);
audioinput.start({streamToWebAudio: true});
var dest = audioinput.getAudioContext().createMediaStreamDestination();
audioinput.connect(dest);
var stream = dest.stream;
var input = audioContext.createMediaStreamSource(stream);
input.connect(analyser);
analyser.onaudioprocess = function(){
...
}
}
As someone who stumbled upon this a few years later and wondered why there was an extra destination being made in the other answer, i now realise it's because Eric needed to get the input stream into the same AudioContext as the analyser.
Now, ignoring the fact that the spec for analyser has changed since the answer, and just focusing on getting the input stream into something useful. You could just pass the audiocontext into the audioinput config like so and save yourself a few steps
function wave(stream) {
var audioContext = new AudioContext();
var analyser = audioContext.createAnalyser();
analyser.connect(audioContext.destination);
audioinput.start({
streamToWebAudio: true,
audioContext: audioContext
});
audioinput.connect(analyser);
analyser.onaudioprocess = function(){
...
}
}
I am using following javascript to record audio and send it to a websocket server:
const recordAudio = () =>
new Promise(async resolve => {
const constraints = {
audio: {
sampleSize: 16,
channelCount: 1,
sampleRate: 8000
},
video: false
};
var mediaRecorder;
const stream = await navigator.mediaDevices.getUserMedia(constraints);
var options = {
audioBitsPerSecond: 128000,
mimeType: 'audio/webm;codecs=pcm'
};
mediaRecorder = new MediaRecorder(stream, options);
var track = stream.getAudioTracks()[0];
var constraints2 = track.getConstraints();
var settings = track.getSettings();
const audioChunks = [];
mediaRecorder.addEventListener("dataavailable", event => {
audioChunks.push(event.data);
webSocket.send(event.data);
});
const start = () => mediaRecorder.start(30);
const stop = () =>
new Promise(resolve => {
mediaRecorder.addEventListener("stop", () => {
const audioBlob = new Blob(audioChunks);
const audioUrl = URL.createObjectURL(audioBlob);
const audio = new Audio(audioUrl);
const play = () => audio.play();
resolve({
audioBlob,
audioUrl,
play
});
});
mediaRecorder.stop();
});
resolve({
start,
stop
});
});
This is for realtime STT and the websocket server refused to send any response. I checked by debugging that the sampleRate is not changing to 8Khz.Upon researching, I found out that this is a known bug on both chrome and firefox. I found some other resources like stackoverflow1 and IBM_STT but I have no idea on how to adapt it to my code.
The above helpful resources refers to buffer but all i have is mediaStream(stream) and event.data(blob) in my code.
I am new to both javascript and Audio Api, so please pardon me if i did something wrong.
If this helps, I have an equivalent code of python to send data from mic to websocket server which works. Library used = Pyaudio. Code :
p = pyaudio.PyAudio()
stream = p.open(format="pyaudio.paInt16",
channels=1,
rate= 8000,
input=True,
frames_per_buffer=10)
print("* recording, please speak")
packet_size = int((30/1000)*8000) # normally 240 packets or 480 bytes
frames = []
#while True:
for i in range(0, 1000):
packet = stream.read(packet_size)
ws.send(packet, binary=True)
To do realtime downsampling follow these steps:
First get stream instance using this:
const stream = await navigator.mediaDevices.getUserMedia(constraints);
Create media stream source from this stream.
var input = audioContext.createMediaStreamSource(stream);
Create script Processor so that you can play with buffers. I am going to create a script processor which takes 4096 samples from the stream at a time, continuously, has 1 input channel and 1 output channel.
var scriptNode = audioContext.createScriptProcessor(4096, 1, 1);
Connect your input with scriptNode. You can connect script Node to the destination as per your requirement.
input.connect(scriptNode);
scriptNode.connect(audioContext.destination);
Now there is a function onaudioprocess in scriptProcessor where you can do whatever you want with 4096 samples. var downsample will contain (1/sampling ratio) number of packets. floatTo16BitPCM will convert that to your required format since the original data is in 32 bit float format.
var inputBuffer = audioProcessingEvent.inputBuffer;
// The output buffer contains the samples that will be modified and played
var outputBuffer = audioProcessingEvent.outputBuffer;
// Loop through the output channels (in this case there is only one)
for (var channel = 0; channel < outputBuffer.numberOfChannels; channel++) {
var inputData = inputBuffer.getChannelData(channel);
var outputData = outputBuffer.getChannelData(channel);
var downsampled = downsample(inputData);
var sixteenBitBuffer = floatTo16BitPCM(downsampled);
}
Your sixteenBitBuffer will contain the data you require.
Functions for downsampling and floatTo16BitPCM are explained in this link of Watson API:IBM Watson Speech to Text Api
You won't need MediaRecorder instance. Watson API is opensource and you can look for a better streamline approach on how they implemented it for their use case. You should be able to salvage important functions from their code.
I want create live audio streaming by websockets.
This what works for me is listening microphone, create PCM, load PCM to BufferSource and playback.
This what doesn't work is send PCM by websockets or other protocol.
I get microphone permission by:
navigator.getUserMedia({audio: true}, initializeRecorder, errorCallback);
Then record microphone and create PCM:
function initializeRecorder(MediaStream) {
var sourceNode = audioCtx.createMediaStreamSource(MediaStream);
var recorder = audioCtx.createScriptProcessor(2048, 2, 2);
recorder.onaudioprocess = recorderProcess;
sourceNode.connect(recorder);
recorder.connect(audioCtx.destination);
}
Later I push every PCM to BufferSource and playback:
function recorderProcess(e) {
var buff = e.inputBuffer;
var source = audioCtx.createBufferSource();
source.buffer = buff;
source.connect(audioCtx.destination);
source.start();
}
It works well.
But if I want send var buff to server, I get for every PCM.
This PCM is Float32Array type and I don't understand why I can't send it like it is.
I can convert this to UInt16 by script:
function convertFloat32ToInt16(buffer) {
var l = buffer.length;
var buf = new Int16Array(l);
while(l--) {
buf[l] = Math.min(1, buffer[l]) * 0x7FFF;
}
return buf.buffer;
}
But I don't know how to decode this later for Float32Array back to push to BufferSource on another client.
You can I think,
whats the problem if you try:
var mybuffer = new Float32Array(pcmDataFromSocket);
I'm attempting to use Aurora.JS to play audio received from a streaming AAC-encoded source. I'm successfully pulling chunked data, and trying to feed it into a custom emitter, but no audio is actually playing.
Maybe I'm missing something very simple. Here's a sample of what I'm trying to do:
http://jsfiddle.net/Rc6Su/4/
(You're almost certainly gonna get a CORS error when hitting "Play" because the source is cross-domain. The only way I can easily get around that is using this plugin: https://chrome.google.com/webstore/detail/allow-control-allow-origi/nlfbmbojpeacfghkpbjhddihlkkiljbi/related?hl=en)
Before you mention it, this is going into a PhoneGap app and so the cross-domain issue isn't going to be a problem.
The problem code is somewhere in here:
var aurora_source = null;
var player = null;
function make_noise(chunk) {
var uarr = (function (chunk) {
var buf = new ArrayBuffer(chunk.length * 2); // 2 bytes for each character
var bufView = new Uint8Array(buf);
for (var i=0, strLen=chunk.length; i<strLen; i++) {
bufView[i] = chunk.charCodeAt(i);
}
return buf;
})(chunk);
var abData = new AV.Buffer(uarr);
if (!aurora_source) {
var MySource = AV.EventEmitter.extend ({
start : function () {
this.emit('data', abData);
},
pause : function () {
},
reset : function () {
}
});
aurora_source = new MySource();
asset = new AV.Asset(aurora_source);
player = new AV.Player(asset);
player.play();
} else {
$("#debug").append("emit data");
$("#debug").append("\n");
aurora_source.emit('data', abData);
}
}
Could not get audio to play, but found at least that
bufView[i] = chunk.charCodeAt(i);
may have to be replaced by
bufView[i] = chunk.charCodeAt(i) & 0xff;
see What does charCodeAt(...) & 0xff accomplish?
hope it helps.