I am trying to stream audio via WebSocket.
I can get an AudioBuffer from the microphone (or another source) via the Web Audio API and stream the raw audio buffer, but I think this would not be very efficient.
So I looked around for a way to encode the AudioBuffer somehow. If the Opus codec is not practicable,
I am open to alternatives and thankful for any hints in the right direction.
I have tried to use MediaRecorder (from the MediaStream Recording API), but it seems that API only supports plain recording, not streaming.
Here is the part where I get the raw audio buffer:
const handleSuccess = function(stream) {
  const context = new AudioContext();
  const source = context.createMediaStreamSource(stream);
  const processor = context.createScriptProcessor(16384, 1, 1);

  source.connect(processor);
  processor.connect(context.destination);

  processor.onaudioprocess = function(e) {
    const bufferLen = e.inputBuffer.length;
    const inputBuffer = new Float32Array(bufferLen);
    e.inputBuffer.copyFromChannel(inputBuffer, 0);

    const data_to_send = inputBuffer;
    // ... and send the Float32Array
  };
};

navigator.mediaDevices.getUserMedia({ audio: true, video: false })
  .then(handleSuccess);
So the main question is: how can I encode the AudioBuffer (and decode it at the receiver)?
Is there an API or library for this? Can I get an encoded buffer from another API in the browser?
The Web Audio API has a MediaStreamAudioDestinationNode (created via createMediaStreamDestination()) that exposes a .stream MediaStream, which you can then pass through the WebRTC API.
But if you are only dealing with a microphone input, then pass that MediaStream directly to WebRTC; there is no need for the Web Audio step.
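For illustration, a minimal sketch of that Web Audio routing, assuming micStream is a stream you already obtained from getUserMedia (the WebRTC signaling is omitted):

const audioCtx = new AudioContext();
const micSource = audioCtx.createMediaStreamSource(micStream);
const rtcDestination = audioCtx.createMediaStreamDestination();
micSource.connect(rtcDestination); // insert any processing nodes between source and destination

const pc = new RTCPeerConnection();
rtcDestination.stream.getAudioTracks()
  .forEach(track => pc.addTrack(track, rtcDestination.stream));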
P.S.: for those who only want to encode to Opus, MediaRecorder is currently the only native way. It will incur a delay, will generate a webm file (not just the raw data), and will process the data no faster than real time.
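For example, a hedged sketch of pushing those Opus-in-webm chunks over a WebSocket (ws is assumed to be an already open WebSocket and micStream the getUserMedia stream; the chunks are fragments of a webm container, not raw Opus packets):

const recorder = new MediaRecorder(micStream, { mimeType: 'audio/webm;codecs=opus' });
recorder.ondataavailable = (e) => {
  if (e.data.size > 0) ws.send(e.data); // each chunk is a piece of the growing webm file
};
recorder.start(250); // ask for a chunk roughly every 250 ms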
The only other option for now is to write your own encoder and run it in WebAssembly.
Hopefully, in the near future we'll have access to the WebCodecs API, which should solve this use case among others.
The Problem:
During a WebRTC unicast video conference, I can successfully stream video from a mobile device's webcam to a laptop/desktop. I would like to record the remote stream on the laptop/desktop side. (The setup is that a mobile device streams to a laptop/desktop).
However, it is usual for the video stream to hang from time to time. That is not a problem in itself, since the "viewer" side will catch up, but the recording of the remote stream stops at the first hang.
Minimal and Reduced Implementation (Local Recording):
I can successfully record the local stream from navigator.mediaDevices.getUserMedia() as follows:
const recordedChunks = [];
navigator.mediaDevices.getUserMedia({
video: true,
audio: false
}).then(stream => {
const localVideoElement = document.getElementById('local-video');
localVideoElement.srcObject = stream;
return stream;
}).then(stream => {
const mediaRecorder = new MediaRecorder(stream, { mimeType: 'video/webm;codecs=vp9' });
mediaRecorder.ondataavailable = (event) => {
if(event.data && event.data.size > 0) {
recordedChunks.push(event.data);
}
};
mediaRecorder.start(10); // timeslice in ms; the mimeType is passed to the constructor above
});
I can download this quite easily as follows:
const blob = new Blob(recordedChunks, { type: 'video/webm' });
const url = URL.createObjectURL(blob);
const a = document.createElement('a');
document.body.appendChild(a);
a.style = 'display: none';
a.href = url;
a.download = 'test.webm';
a.click();
window.URL.revokeObjectURL(url);
Minimal and Reduced Implementation (Remote Recording):
The setup I am using requires recording the remote stream, not the local stream, because iOS Safari does not support the MediaRecorder API. I included the above to show that recording works on the local side. The implementation of the remote stream recording is no different, except that I manually add a 0 Hz audio track to the video, because Chrome appears to have a bug where it won't record without an audio track.
const mediaStream = new MediaStream();
const audioContext = new AudioContext();
const destinationNode = audioContext.createMediaStreamDestination();
const oscillatorNode = audioContext.createOscillator();
oscillatorNode.frequency.setValueAtTime(0, audioContext.currentTime);
oscillatorNode.connect(destinationNode);
oscillatorNode.start(); // start the 0 Hz oscillator so the track actually produces (silent) samples
const audioTrack = destinationNode.stream.getAudioTracks()[0];
const videoTrack = remoteStream.getVideoTracks()[0]; // Defined somewhere else.
mediaStream.addTrack(videoTrack);
mediaStream.addTrack(audioTrack);
And then I perform the exact same operations that I do on the local stream example above to record the mediaStream variable.
As mentioned, at the first point where the remote stream hangs (due to network latency, perhaps), the remote recording ceases, such that on download, the duration of the .webm file (converted to .mp4 via ffmpeg) is only as long as the point where the first hang occurred.
Attempts to Mitigate:
One attempt I have made to mitigate this issue is, rather than recording the remote stream obtained in the callback for the ontrack event from WebRTC, to use the video stream from the remote video element instead, via remoteVideoElement.captureStream(). This does not fix the issue.
Any help would be much appreciated. Thank you.
Hopefully, someone is able to post an actual fix for you. In the meantime, here is a nasty, inefficient, totally-not-recommended workaround:
Route the incoming MediaStream to a video element.
Use requestAnimationFrame() to schedule drawing frames to a canvas. (Note that this removes any sense of genlock from the original video, and is not something you want to do. Unfortunately, we don't have a way of knowing when incoming frames occur, as far as I know.)
Use CanvasCaptureMediaStream as the video source.
Recombine the video track from CanvasCaptureMediaStream along with the audio track from the original MediaStream in a new MediaStream.
Use this new MediaStream for MediaRecorder.
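A rough sketch of those steps, assuming remoteStream is the stream from the ontrack callback and a remote-video element is already playing it (the element id, canvas size and frame rate are illustrative):

const remoteVideo = document.getElementById('remote-video');
const canvas = document.createElement('canvas');
canvas.width = 640;
canvas.height = 480;
const canvasCtx = canvas.getContext('2d');

function drawFrame() {
  canvasCtx.drawImage(remoteVideo, 0, 0, canvas.width, canvas.height);
  requestAnimationFrame(drawFrame);
}
requestAnimationFrame(drawFrame);

const canvasStream = canvas.captureStream(30); // 30 fps is a guess, not a requirement
const recordStream = new MediaStream([
  canvasStream.getVideoTracks()[0],
  ...remoteStream.getAudioTracks()
]);
const recorder = new MediaRecorder(recordStream);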
I've done this with past projects where I needed to programmatically manipulate the audio and video. It works!
One big caveat is that there's a bug in Chrome where even though a capture stream is attached to a canvas, the canvas won't be updated if the tab isn't active/visible. And, of course, requestAnimationFrame is severely throttled at best if the tab isn't active, so you need another frame clock source. (I used audio processors, ha!)
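For illustration only, a hedged sketch of such an audio-driven frame clock, reusing the canvas and video element from the sketch above (this is my own variant, not necessarily what was used originally; the buffer size is arbitrary):

const clockContext = new AudioContext();
const clockNode = clockContext.createScriptProcessor(2048, 1, 1);
clockNode.onaudioprocess = () => {
  // keeps firing even when the tab is in the background, unlike requestAnimationFrame
  canvasCtx.drawImage(remoteVideo, 0, 0, canvas.width, canvas.height);
};
clockNode.connect(clockContext.destination);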
I want to make a recording where I get multiple audio tracks from different MediaStream objects (some of them remote). I use the getAudioTracks() method and add them to a MediaStream object using addTrack(). When passing this last object as a parameter to MediaRecorder, I realize that it only records the audio track located at position [0]. That leads me to understand that MediaRecorder can only record one track per kind. Is there any way to join these tracks into one so that MediaRecorder records them correctly? I would be grateful for any page that explains this, if one exists.
I was battling with this for a while, and it took me ages to realise that the MediaRecorder only ever recorded the first track I added. My solution was to get the Web Audio API involved. This example uses two user media inputs (e.g. a mic and Stereo Mix) and merges them. The inputs are identified by their deviceId, as shown when you use await navigator.mediaDevices.enumerateDevices().
In summary:
Create an AudioContext()
Get your media using navigator.mediaDevices.getUserMedia()
Add these as a stream source to the AudioContext
Create an AudioContext stream destination object
Connect your sources to this single destination
And your new MediaRecorder() takes this destination as its MediaStream
Now you can record yourself singing along to your favourite song as it streams ;)
const audioContext = new AudioContext();

const audioParams_01 = {
  deviceId: "default",
};
const audioParams_02 = {
  deviceId: "7079081697e1bb3596fad96a1410ef3de71d8ccffa419f4a5f75534c73dd16b5",
};

const mediaStream_01 = await navigator.mediaDevices.getUserMedia({ audio: audioParams_01 });
const mediaStream_02 = await navigator.mediaDevices.getUserMedia({ audio: audioParams_02 });

const audioIn_01 = audioContext.createMediaStreamSource(mediaStream_01);
const audioIn_02 = audioContext.createMediaStreamSource(mediaStream_02);

const dest = audioContext.createMediaStreamDestination();
audioIn_01.connect(dest);
audioIn_02.connect(dest);

const recorder = new MediaRecorder(dest.stream);
const chunks = [];

recorder.onstart = async (event) => {
  // your code here
};

recorder.ondataavailable = (event) => {
  chunks.push(event.data);
};

recorder.onstop = async (event) => {
  // your code here
};
In the end I used a library built by Muaz Khan, which lets you merge the streams and return them as one!
It's incredibly easy!
https://github.com/muaz-khan/MultiStreamsMixer
I am trying to record and save sound clips from the user's microphone using the getUserMedia() and AudioContext APIs.
I have been able to do this with the MediaRecorder API, but unfortunately that's not supported by Safari/iOS, so I would like to do this with just the AudioContext API and the buffer that comes from it.
I got things partially working with this tutorial from Google Web Fundamentals, but I can't figure out how to do the following steps it suggests.
var handleSuccess = function(stream) {
var context = new AudioContext();
var source = context.createMediaStreamSource(stream);
var processor = context.createScriptProcessor(1024, 1, 1);
source.connect(processor);
processor.connect(context.destination);
processor.onaudioprocess = function(e) {
// ******
// TUTORIAL SUGGESTS: Do something with the data, i.e Convert this to WAV
// ******
// I ASK: How can I get this data in a buffer and then convert it to WAV etc.??
// *****
console.log(e.inputBuffer);
};
};
navigator.mediaDevices.getUserMedia({ audio: true, video: false })
.then(handleSuccess);
As the tutorial says:
The data that is held in the buffers is the raw data from the
microphone and you have a number of options with what you can do with
the data:
Upload it straight to the server
Store it locally
Convert to a dedicated file format, such as WAV, and then save it to your servers or locally
I could do all this, but I can't figure out how to get the audio buffer once I stop the context.
With MediaRecorder you can do something like this:
mediaRecorder.ondataavailable = function(e) {
chunks.push(e.data);
}
And then when you're done recording, you have a buffer in chunks. There must be a way to do this with the AudioContext approach too, as suggested by the tutorial, but I can't find the data to push into the buffer in the first code example.
Once I get the audio buffer I could convert it to WAV and make it into a blob etc.
Can anyone help me with this? (I don't want to use the MediaRecorder API)
e.inputBuffer.getChannelData(0)
where 0 is the first channel. This returns a Float32Array with the raw PCM data, whose underlying ArrayBuffer you can access via e.inputBuffer.getChannelData(0).buffer and send to a worker that converts it to the needed format.
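For illustration, a minimal sketch (the variable names are mine) of collecting the per-callback samples so you end up with one contiguous buffer when you stop:

const recordedSamples = [];

processor.onaudioprocess = function(e) {
  // copy the samples, because the underlying buffer is reused between callbacks
  recordedSamples.push(new Float32Array(e.inputBuffer.getChannelData(0)));
};

function mergeSamples() {
  const total = recordedSamples.reduce((sum, chunk) => sum + chunk.length, 0);
  const merged = new Float32Array(total);
  let offset = 0;
  for (const chunk of recordedSamples) {
    merged.set(chunk, offset);
    offset += chunk.length;
  }
  return merged; // raw PCM, ready to be encoded (e.g. to WAV) or uploaded
}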
.getChannelData() Docs: https://developer.mozilla.org/en-US/docs/Web/API/AudioBuffer/getChannelData.
About typed arrays: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Typed_arrays, https://javascript.info/arraybuffer-binary-arrays.
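And if you do want WAV, here is a hedged sketch of wrapping the merged mono Float32 samples in a 16-bit WAV header (pass context.sampleRate as the sample rate); treat it as a starting point rather than a drop-in implementation:

function encodeWav(samples, sampleRate) {
  const buffer = new ArrayBuffer(44 + samples.length * 2);
  const view = new DataView(buffer);
  const writeString = (offset, str) => {
    for (let i = 0; i < str.length; i++) view.setUint8(offset + i, str.charCodeAt(i));
  };

  writeString(0, 'RIFF');
  view.setUint32(4, 36 + samples.length * 2, true);
  writeString(8, 'WAVE');
  writeString(12, 'fmt ');
  view.setUint32(16, 16, true);             // fmt chunk size
  view.setUint16(20, 1, true);              // PCM format
  view.setUint16(22, 1, true);              // mono
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * 2, true); // byte rate
  view.setUint16(32, 2, true);              // block align
  view.setUint16(34, 16, true);             // bits per sample
  writeString(36, 'data');
  view.setUint32(40, samples.length * 2, true);

  // convert float samples in [-1, 1] to 16-bit signed integers
  let offset = 44;
  for (let i = 0; i < samples.length; i++, offset += 2) {
    const s = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(offset, s < 0 ? s * 0x8000 : s * 0x7FFF, true);
  }
  return new Blob([view], { type: 'audio/wav' });
}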
In the browser, I want to capture the stream of an audio tag that has an .mp3 as source, then send it live via WebRTC to the server. I don't want to hear it via the speakers.
Is it possible to call audioElement.play() without having speaker output?
new Audio() returns an HTMLAudioElement that connects to your browser's default audio output device. You can verify this in the dev console by running:
> new Audio().sinkId
<- ""
where the empty string output specifies the user agent default sinkId.
A flexible way of connecting the output of an HTMLAudioElement instance to a non-default sink (for example, if you don't want to hear it through the speakers but just want to send it to another destination like a WebRTC peer connection) is to use the global AudioContext object to create a new MediaStreamAudioDestinationNode. Then you can grab the MediaElementAudioSourceNode from the Audio object holding your mp3 file via audioContext.createMediaElementSource(mp3Audio), and connect that to your new audio destination node. Then, when you run mp3Audio.play(), it will stream only to the destination node, and not to the default (speaker) audio output.
Full example:
// Set up the audio node source and destination...
const mp3FilePath = 'testAudioSample.mp3'
const mp3Audio = new Audio(mp3FilePath)
const audioContext = new AudioContext()
const mp3AudioSource = audioContext.createMediaElementSource(mp3Audio)
const mp3AudioDestination = audioContext.createMediaStreamDestination()
mp3AudioSource.connect(mp3AudioDestination)
// Connect the destination track to whatever you want,
// e.g. another audio node, or an RTCPeerConnection.
const mp3AudioTrack = mp3AudioDestination.stream.getAudioTracks()[0]
const pc = new RTCPeerConnection()
pc.addTrack(mp3AudioTrack)
// Prepare the `Audio` instance playback however you'd like.
// For example, loop it:
mp3Audio.loop = true
// Start streaming the mp3 audio to the new destination sink!
await mp3Audio.play()
It seems that one can mute the audio element and still capture the stream:
audioElement.muted = true;
var stream = audioElement.captureStream();
In JavaScript, how can I connect an audio context to a video element that gets its data from a blob (the video uses the MediaStream capabilities)? No matter what I do, the audio context returns an empty buffer. Is there any way to connect the two?
Probably, createMediaElementSource is not the right kind of source node for this use case.
Rather, you are better off using the createMediaStreamSource node from the Web Audio API if you are trying to handle a live audio stream, not a fixed media source.
The createMediaStreamSource() method of the AudioContext Interface is used to create a new MediaStreamAudioSourceNode object, given a media stream (say, from a navigator.getUserMedia instance), the audio from which can then be played and manipulated.
The linked documentation has a more detailed example. However, the main caveat of this MediaStreamAudioSourceNode is that it can only be created from a MediaStream that you get from a media server or locally (through getUserMedia). In my experience, I couldn't find a way to do it using only the blob URL from the <video> tag.
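As a minimal sketch of that approach (liveStream stands for whatever MediaStream you already have, e.g. from getUserMedia or a WebRTC ontrack callback):

const streamAudioContext = new AudioContext();
const streamSource = streamAudioContext.createMediaStreamSource(liveStream);
streamSource.connect(streamAudioContext.destination); // or into an analyser / processing graph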
While this is an old question, I've searched for something similar and found a solution I want to share.
To connect the Blob, you may use a new Response instance. Here is an example for creating a waveform visualizer.
var audioContext = new (window.AudioContext || window.webkitAudioContext)();
var analyser = audioContext.createAnalyser();
var dataArray = new Uint8Array(analyser.frequencyBinCount);
var arrayBuffer = await new Response(yourBlob).arrayBuffer();
var audioBuffer = await audioContext.decodeAudioData(arrayBuffer);
var source = audioContext.createBufferSource();
source.buffer = audioBuffer;
source.connect(analyser);
source.start(0);
Note: yourBlob needs to be a Blob instance.
You may find this fiddle useful; it records video and audio for 5 seconds, turns the recording into a Blob and then plays it back, including an audio waveform visualization.