HTML5 Audio recording too large - javascript

I managed to create a complete recorder using HTML5.
My problem is the size of the WAV file it creates: it's too large to send to my servers. I'm using the exportWAV function that a lot of users seem to be using.
This function creates a WAV file from the audio blob:
function encodeWAV(samples){
  var buffer = new ArrayBuffer(44 + samples.length * 2); // 44-byte header + 16-bit samples
  var view = new DataView(buffer);
  writeString(view, 0, 'RIFF');
  view.setUint32(4, 32 + samples.length * 2, true);  // RIFF chunk size
  writeString(view, 8, 'WAVE');
  writeString(view, 12, 'fmt ');
  view.setUint32(16, 16, true);                      // length of the fmt sub-chunk
  view.setUint16(20, 1, true);                       // audio format: 1 = PCM
  view.setUint16(22, 1, true);                       // number of channels
  view.setUint32(24, sampleRate, true);              // sample rate
  view.setUint32(28, sampleRate * 4, true);          // byte rate
  view.setUint16(32, 4, true);                       // block align
  view.setUint16(34, 16, true);                      // bits per sample
  writeString(view, 36, 'data');
  view.setUint32(40, samples.length * 2, true);      // data chunk size
  floatTo16BitPCM(view, 44, samples);
  return view;
}
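(For completeness: writeString and floatTo16BitPCM are not shown above; in recorder.js-style code they usually look like the sketch below, included here only so the snippet is self-contained.)
function floatTo16BitPCM(output, offset, input) {
  // clamp each float sample to [-1, 1] and write it as a little-endian 16-bit integer
  for (var i = 0; i < input.length; i++, offset += 2) {
    var s = Math.max(-1, Math.min(1, input[i]));
    output.setInt16(offset, s < 0 ? s * 0x8000 : s * 0x7FFF, true);
  }
}
function writeString(view, offset, string) {
  // write an ASCII string byte by byte into the DataView
  for (var i = 0; i < string.length; i++) {
    view.setUint8(offset + i, string.charCodeAt(i));
  }
}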
I was browsing through the alternatives, but none of them are really sufficient or simple enough:
Zipping the file - doesn't work well and has some security issues.
Converting to MP3 - makes the process much slower and more complicated, also has security issues, and causes the sound to lose a lot of quality.
My question is: does HTML5 getUserMedia export only to .WAV files?
If there were a function like the encodeWAV I'm using, but an encodeMP3 instead - that would be perfect.
What is the recommended way to solve such a problem?
I'd love to get a simple working example if possible.
Thanks.

The recommended way is probably to use the API that is already there in your browser instead of rewriting it yourself with the poor tools we've got.
So, to record an audio stream (https fiddle for Chrome):
// get our audio stream
navigator.mediaDevices.getUserMedia({
  audio: true
}).then(setup);

function startRecording(stream) {
  let recorder = new MediaRecorder(stream);
  let chunks = []; // here we'll store all recorded chunks
  // every time a new chunk is available, store it
  recorder.ondataavailable = (e) => chunks.push(e.data);
  recorder.onstop = () => {
    let blob = new Blob(chunks);
    saveRecordedAudio(blob);
  };
  recorder.start();
  return recorder;
}

function saveRecordedAudio(blob) {
  // do whatever you want with this audio file, e.g.:
  // var form = new FormData();
  // form.append('file', blob, 'myaudio.ogg');
  // xhr.send(form)
  // for the demo here, we'll just append a new audio element with the recorded audio
  var url = URL.createObjectURL(blob);
  var a = new Audio(url);
  a.controls = true;
  document.body.appendChild(a);
  // better to always revoke blob URLs (media elements don't fire 'load', so use 'loadedmetadata')
  a.onloadedmetadata = () => URL.revokeObjectURL(url);
}

function setup(stream) {
  let btn = document.querySelector('button');
  let recording = false;
  var recorder; // weird bug in FF when using let...
  btn.onclick = (e) => {
    if (recording = !recording) {
      recorder = startRecording(stream);
    } else {
      recorder.stop();
    }
    e.target.textContent = (recording ? 'stop' : 'start') + ' recording';
  };
}
<button>start recording</button>
This will record your stream as Opus (in a webm or ogg container, depending on the browser). If you want WAV, simply do the conversion server side.
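If you go the upload route sketched in the comments above, a minimal fetch-based version could look like this (the /upload endpoint and the 'audio' field name are placeholders, not part of the original answer):
// hypothetical upload helper: POST the recorded blob as multipart/form-data
function uploadRecording(blob) {
  const form = new FormData();
  // 'audio' and 'recording.webm' are made-up names; match whatever your server expects
  form.append('audio', blob, 'recording.webm');
  return fetch('/upload', { method: 'POST', body: form })
    .then((res) => {
      if (!res.ok) throw new Error('Upload failed: ' + res.status);
      return res;
    });
}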

Does HTML5 getUserMedia export only to .WAV files?
getUserMedia doesn't export anything at all. getUserMedia only returns a MediaStream for some sort of audio/video capture.
This MediaStream is used in conjunction with the Web Audio API where you can access PCM samples. WAV files typically contain raw PCM samples. (WAV is a container format. PCM is the sample format, and is the most popular way of encoding audio digitally.)
Zipping the file - doesn't work well and has some security issues.
It works just fine when you consider the constraints, and it has no inherent security issues. In this case, you're getting a lossless compression of audio data; the trade-off is that lossless compression typically won't reduce the size by more than 15%-30% or so.
Converting to MP3 - makes the process much slower and more complicated, also has security issues, and causes the sound to lose a lot of quality.
You can encode as you record, so slowness isn't a problem. Complicated... maybe at first, but not really once you've used it. The real issue here is that you're concerned about quality loss.
Unfortunately, you don't get to pick perfect quality and tiny size. These are tradeoffs and there is no magic bullet. Any lossy compression you use (like MP3, AAC, Opus, Vorbis) will reduce your data size considerably by removing part of the audio that we don't normally perceive. The less bandwidth there is, the more artifacts occur from this process. You have to decide between data size and quality.
If I might make a suggestion... use the MediaRecorder API: https://developer.mozilla.org/en-US/docs/Web/API/MediaStream_Recording_API It's a very easy API to use. You create a MediaRecorder, give it a stream, tell it to record, and then deal with the data it gives you in whatever way you wish.
Most browsers supporting the MediaRecorder API also support Opus as an audio codec, which provides good performance at almost any bitrate. You can choose the bitrate you want and know that you're getting about the best quality audio you can get for that amount of bandwidth.
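Since bitrate is the main lever on file size, here is a minimal sketch of asking MediaRecorder for a lower Opus bitrate; stream is assumed to be the MediaStream from getUserMedia, and the mimeType string should be checked per browser:
// sketch: request ~32 kbps Opus audio to keep uploads small
const mimeType = 'audio/webm;codecs=opus'; // Chrome/Firefox; Safari uses a different container
const options = MediaRecorder.isTypeSupported(mimeType)
  ? { mimeType, audioBitsPerSecond: 32000 } // lower bitrate -> smaller file, more artifacts
  : { audioBitsPerSecond: 32000 };          // let the browser pick its default container
const recorder = new MediaRecorder(stream, options);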

You need to encode to MP3 so it takes less space...
libmp3lame.js is the tool for you...
Full article here - how to record to mp3.
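As a rough illustration of what such an encoder looks like in use, here is a sketch written against lamejs (a widely used JavaScript port of LAME); the exact API of the libmp3lame.js build mentioned above may differ, so treat the names as assumptions:
// sketch, assuming the lamejs port of LAME is loaded as `lamejs`
function encodeMP3(samplesInt16, sampleRate) {
  // mono, 128 kbps; both values are illustrative choices
  const encoder = new lamejs.Mp3Encoder(1, sampleRate, 128);
  const blockSize = 1152; // MP3 encodes in frames of 1152 samples
  const mp3Chunks = [];
  for (let i = 0; i < samplesInt16.length; i += blockSize) {
    const chunk = samplesInt16.subarray(i, i + blockSize);
    const encoded = encoder.encodeBuffer(chunk);
    if (encoded.length > 0) mp3Chunks.push(encoded);
  }
  const end = encoder.flush(); // write the last frames
  if (end.length > 0) mp3Chunks.push(end);
  return new Blob(mp3Chunks, { type: 'audio/mp3' });
}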

Related

How to pass a h264 encoded MediaRecorder stream to a MediaSource in Chrome?

Our screen recording Chrome extension allows users to record their screen using the getDisplayMedia API, which returns a stream that is fed into the MediaRecorder API.
Normally, we'd record this stream using the webm video container with the newer vp9 codec like so:
const mediaRecorder = new MediaRecorder(mediaStream, {
  mimeType: "video/webm; codecs=vp9"
});
However, Safari does not support the webm container, nor does it support decoding the vp9 codec. Since the MediaRecorder API in Chrome only supports recording in the webm container but does support the h264 encoding (which Safari can decode), we instead record with the h264 codec in a webm container:
const mediaRecorder = new MediaRecorder(mediaStream, {
  mimeType: "video/webm; codecs=h264"
});
This works well for two reasons:
since our recording app is a chrome extension, we don't mind that it can only record in Chrome
since the video data is encoded as h264, we can now almost instantly move the video data to a .mp4 container, allowing Safari viewers to view these recorded videos without having to wait for an expensive transcoding process (note that you can view the videos without the chrome extension, in a regular web app)
However, because the MediaRecorder API has no method for getting the duration of the video stream recorded so far, and measuring it manually with performance.now proved to be imprecise (with a 25ms to 150ms error), we had to switch to feeding the recorder data into a MediaSource so that we can use mediaSourceBuffer.buffered.end(sourceBuffer.buffered.length - 1) * 1000 to get a 100% accurate read of the video stream duration recorded so far (in milliseconds).
The issue is that for some reason the MediaSource fails to instantiate when we use our "video/webm; codecs=h264" mime type.
Doing this:
mediaSourceBuffer = mediaSource.addSourceBuffer("video/webm; codecs=h264");
Results in:
Failed to execute 'addSourceBuffer' on 'MediaSource': The type provided ('video/webm; codecs=h264') is unsupported.
Why is the mime type supported by MediaRecorder but not by MediaSource? Since they are of the same API family, shouldn't they support the same mime types? How can we record with the h264 codec while passing the data to a MediaSource using addSourceBuffer?
The only solution we can think of so far is to create 2 media recorders, one recording in vp9 for us to read the accurate duration of the video recorded so far using the buffered.end API, and one recording in h264 for us to be able to immediately move the video data to a mp4 container without having to transcode the codec from vp9 to h264 for Safari users. However, this would be very inefficient as it would effectively hold twice as much data in RAM.
Reproduction cases / codesandbox examples
vp9 example (both work)
h264 example (media recorder works, media source does not)
Decoders and encoders are different beasts altogether. For instance, WebKit (Safari) can decode a few formats, but it can't encode anything.
Also, the MediaSource API requires that the media passed to it be fragmented, and thus it can't ingest all the media that the browser can decode. For instance, if one browser someday supported generating standard (non-fragmented) mp4 files, it would still be unable to pass them to the MediaSource API.
I can't tell for sure if they could support this particular codec (I guess yes), but you might not even need all that workaround at all.
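As a quick sanity check (a small sketch reusing the mime string from the question), you can ask both APIs up front what they accept:
// feature-detect before constructing anything
const recMime = 'video/webm; codecs=h264';
console.log('MediaRecorder:', MediaRecorder.isTypeSupported(recMime));
console.log('MediaSource:', MediaSource.isTypeSupported(recMime));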
If your extension is able to generate DOM elements, then you can simply use a <video> element to tell you the duration of your recorded video, using the trick described in this answer:
Set the currentTime of the video to a very large number, wait for the seeked event, and you'll get the correct duration.
const canvas_stream = getCanvasStream();
const rec = new MediaRecorder( canvas_stream.stream );
const chunks = [];
rec.ondataavailable = (evt) => chunks.push( evt.data );
rec.onstop = async (evt) => {
  canvas_stream.stop();
  console.log( "duration:", await measureDuration( chunks ) );
};
rec.start();
setTimeout( () => rec.stop(), 5000 );
console.log( 'Recording 5s' );

function measureDuration( chunks ) {
  const blob = new Blob( chunks, { type: "video/webm" } );
  const vid = document.createElement( 'video' );
  return new Promise( (res, rej) => {
    vid.onerror = rej;
    vid.onseeked = (evt) => res( vid.duration );
    vid.onloadedmetadata = (evt) => {
      URL.revokeObjectURL( vid.src );
      // for demo only, to show it's Infinity in Chrome
      console.log( 'before seek', vid.duration );
    };
    vid.src = URL.createObjectURL( blob );
    vid.currentTime = 1e10;
  } );
}

// just so we can have a MediaStream in StackSnippet
function getCanvasStream() {
  const canvas = document.createElement( 'canvas' );
  const ctx = canvas.getContext( '2d' );
  let stopped = false;
  function draw() {
    ctx.fillRect( 0, 0, 1, 1 );
    if( !stopped ) {
      requestAnimationFrame( draw );
    }
  }
  draw();
  return {
    stream: canvas.captureStream(),
    stop: () => stopped = true
  };
}

Encode AudioBuffer with Opus (or other codec) in Browser

I am trying to stream audio via WebSocket.
I can get an AudioBuffer from the microphone (or another source) via the Web Audio API and stream the raw audio buffer, but I think this would not be very efficient.
So I looked around for a way to encode the AudioBuffer somehow. If the Opus codec is not practicable, I am open to alternatives and thankful for any hints in the right direction.
I have tried to use MediaRecorder (from the MediaStream Recording API), but it seems that only plain recording is possible with that API, not streaming.
Here is the part where I get the raw AudioBuffer:
const handleSuccess = function(stream) {
  const context = new AudioContext();
  const source = context.createMediaStreamSource(stream);
  const processor = context.createScriptProcessor(16384, 1, 1);
  source.connect(processor);
  processor.connect(context.destination);
  processor.onaudioprocess = function(e) {
    const bufferLen = e.inputBuffer.length;
    const inputBuffer = new Float32Array(bufferLen);
    e.inputBuffer.copyFromChannel(inputBuffer, 0);
    let data_to_send = inputBuffer;
    // And send the Float32Array ...
  };
};

navigator.mediaDevices.getUserMedia({ audio: true, video: false })
  .then(handleSuccess);
So the main question is: how can I encode the AudioBuffer (and decode it at the receiver)?
Is there an API or library for this? Can I get the encoded buffer from another API in the browser?
The Web Audio API has a MediaStreamAudioDestinationNode that exposes a .stream MediaStream, which you can then pass through the WebRTC API.
But if you are only dealing with a microphone input, then pass that MediaStream directly to WebRTC; no need for the Web Audio step.
PS: for those who only want to encode to Opus, MediaRecorder is currently the only native way. It will incur a delay, will generate a webm file (not only the raw data), and will process the data no faster than real time.
The only other option for now is to write your own encoder and run it in WebAssembly.
Hopefully, in the near future, we'll have access to the WebCodecs API, which should solve this use case among others.
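To make the MediaStreamAudioDestinationNode suggestion concrete, here is a minimal sketch that routes a Web Audio graph into a MediaRecorder; the oscillator source is only there to keep the example self-contained:
// sketch: record whatever the Web Audio graph produces
const audioCtx = new AudioContext();
const dest = audioCtx.createMediaStreamDestination(); // MediaStreamAudioDestinationNode

// any source node works here; an oscillator keeps the example self-contained
const osc = audioCtx.createOscillator();
osc.connect(dest);
osc.start();

const recorder = new MediaRecorder(dest.stream);
recorder.ondataavailable = (e) => {
  // each chunk is container-wrapped encoded audio (typically Opus in webm/ogg)
  // and could be sent over a WebSocket
  console.log('chunk of', e.data.size, 'bytes');
};
recorder.start(250); // ask for a chunk roughly every 250 ms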

How to stream PCM audio on HTML without lag?

The PCM audio data is captured in Unity3D in real time. All of that data is streamed to an HTML page via WebSockets. The general setup is Socket.IO with a node.js server.
My main task is adding smooth audio playback to a live video+audio streaming solution on all platforms. This is my work in progress (video streaming): https://youtu.be/82_-a7WF3vs
The audio & video streaming part works well on non-HTML/non-WebGL platforms.
However, I couldn't get smooth audio playback in HTML with JavaScript. It runs in real time, but I hear glitches that sound like noise...
One of my concerns is that web browsers do not support multi-threading, which adds some lag when receiving streaming data and playing it back at the same time.
Below is my core script for PCM playback. I hope someone can help me improve it.
var startTime = 0;
var audioCtx = new AudioContext();

function ProcessAudioData(_byte) {
  ReadyToGetFrame_aud = false;
  // read meta data
  SourceSampleRate = ByteToInt32(_byte, 0);
  SourceChannels = ByteToInt32(_byte, 4);
  // convert byte[] to float
  var BufferData = _byte.slice(8, _byte.length);
  AudioFloat = new Float32Array(BufferData.buffer);
  //=====================playback=====================
  if (AudioFloat.length > 0) StreamAudio(SourceChannels, AudioFloat.length, SourceSampleRate, AudioFloat);
  //=====================playback=====================
  ReadyToGetFrame_aud = true;
}

function StreamAudio(NUM_CHANNELS, NUM_SAMPLES, SAMPLE_RATE, AUDIO_CHUNKS) {
  var audioBuffer = audioCtx.createBuffer(NUM_CHANNELS, (NUM_SAMPLES / NUM_CHANNELS), SAMPLE_RATE);
  for (var channel = 0; channel < NUM_CHANNELS; channel++) {
    // This gives us the actual ArrayBuffer that contains the data
    var nowBuffering = audioBuffer.getChannelData(channel);
    for (var i = 0; i < NUM_SAMPLES; i++) {
      var order = i * NUM_CHANNELS + channel;
      nowBuffering[i] = AUDIO_CHUNKS[order];
    }
  }
  var source = audioCtx.createBufferSource();
  source.buffer = audioBuffer;
  source.connect(audioCtx.destination);
  source.start(startTime);
  startTime += audioBuffer.duration;
}
How to stream PCM audio on HTML without lag?
There is always some lag with digital audio, no matter what you do. This has nothing to do with the web browser itself.
All those data will be streaming to HTML via WebSockets.
Why? The data is only going one direction so you can use a regular HTTP response and not have to worry about the overhead of Web Sockets.
One of my concern is that Web Browsers do not support multi-threading
This isn't really accurate.
It runs real-time but I found some lagging issue like noise...
What your code appears to do is take a PCM frame it receives and play it immediately. This isn't good, as the sound is wrecked if you don't play your received buffers contiguously. You must take the data and schedule it to play immediately after the current data is finished, and not a sample early or too late.
Traditionally this means doing your own buffering and setting up a ScriptProcessorNode to read from those buffers. However, this also requires some DIY resampling because the encoded rate may not be the same as the playback rate.
These days, I think that MediaSource Extensions supports PCM decoding, so you can just pipe your data through that and let the underlying system do all the work for you.
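As a rough sketch of that scheduling idea (reusing the question's audioCtx, and assuming decoded Float32 PCM keeps arriving in order), each buffer is queued against the AudioContext clock instead of being started immediately:
// sketch: schedule each incoming buffer right after the previous one
let nextStartTime = 0;

function playChunk(audioBuffer) {
  const source = audioCtx.createBufferSource();
  source.buffer = audioBuffer;
  source.connect(audioCtx.destination);

  // keep a small safety margin so the first chunk isn't scheduled in the past
  const minStart = audioCtx.currentTime + 0.05;
  if (nextStartTime < minStart) nextStartTime = minStart;

  source.start(nextStartTime);
  nextStartTime += audioBuffer.duration; // the next chunk begins exactly where this one ends
}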

How do I get buffers/raw data from AudioContext?

I am trying to record and save sound clips from the user's microphone using the getUserMedia() and AudioContext APIs.
I have been able to do this with the MediaRecorder API, but unfortunately that's not supported by Safari/iOS, so I would like to do this with just the AudioContext API and the buffer that comes from it.
I got things partially working with this tutorial from Google Web Fundamentals, but I can't figure out how to do the following steps they suggest.
var handleSuccess = function(stream) {
  var context = new AudioContext();
  var source = context.createMediaStreamSource(stream);
  var processor = context.createScriptProcessor(1024, 1, 1);
  source.connect(processor);
  processor.connect(context.destination);
  processor.onaudioprocess = function(e) {
    // ******
    // TUTORIAL SUGGESTS: Do something with the data, i.e Convert this to WAV
    // ******
    // I ASK: How can I get this data in a buffer and then convert it to WAV etc.??
    // *****
    console.log(e.inputBuffer);
  };
};

navigator.mediaDevices.getUserMedia({ audio: true, video: false })
  .then(handleSuccess);
As the tutorial says:
The data that is held in the buffers is the raw data from the microphone and you have a number of options with what you can do with the data:
Upload it straight to the server
Store it locally
Convert to a dedicated file format, such as WAV, and then save it to your servers or locally
I could do all this, but I can't figure out how to get the audio buffer once I stop the context.
With MediaRecorder you can do something like this:
mediaRecorder.ondataavailable = function(e) {
chunks.push(e.data);
}
And then when you're done recording, you have a buffer in chunks. There must be a way to do this, as suggested by the tutorial, but I can't find the data to push into the buffer in the first code example.
Once I get the audio buffer I could convert it to WAV and make it into a blob etc.
Can anyone help me with this? (I don't want to use the MediaRecorder API)
e.inputBuffer.getChannelData(0)
Where 0 is the first channel. This should return a Float32Array with the raw PCM data, which you can then convert to an ArrayBuffer with e.inputBuffer.getChannelData(0).buffer and send to a worker that would convert it to the needed format.
.getChannelData() Docs: https://developer.mozilla.org/en-US/docs/Web/API/AudioBuffer/getChannelData.
About typed arrays: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Typed_arrays, https://javascript.info/arraybuffer-binary-arrays.
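A minimal sketch of the buffering step the question is missing: accumulate each callback's samples and merge them into one Float32Array when recording stops. The helper names are illustrative, not from the tutorial; processor is the ScriptProcessorNode from the question's code:
// collect raw PCM from the ScriptProcessorNode callbacks
var recordedChunks = [];

processor.onaudioprocess = function(e) {
  // copy the data: the underlying buffer is reused between callbacks
  recordedChunks.push(new Float32Array(e.inputBuffer.getChannelData(0)));
};

// when recording stops, merge everything into one Float32Array
function mergeChunks(chunks) {
  var total = chunks.reduce(function(sum, c) { return sum + c.length; }, 0);
  var result = new Float32Array(total);
  var offset = 0;
  chunks.forEach(function(c) {
    result.set(c, offset);
    offset += c.length;
  });
  return result; // pass this to an encodeWAV-style function
}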

Streaming fragmented WebM over WebSocket to MediaSource

I am trying to do the following:
On the server I encode h264 packets into a WebM (MKV) container structure, so that each cluster gets a single frame packet. Only the first data chunk is different, as it contains something called the Initialization Segment. Here it is explained quite well.
Then I stream those clusters one by one in a binary stream via WebSocket to a browser, which is Chrome.
It probably sounds weird that I use the h264 codec and not VP8 or VP9, which are the native codecs for the WebM video format. But it appears that the HTML video tag has no problem playing this sort of video container. If I just write the whole stream to a file and pass it to video.src, it plays fine. But I want to stream it in real time. That's why I am breaking the video into chunks and sending them over a WebSocket.
On the client, I am using the MediaSource API. I have little experience with web technologies, but I found that's probably the only way to go in my case.
And it doesn't work. I am getting no errors, the stream runs OK, and the video object emits no warnings or errors (checked via the developer console).
The client side code looks like this:
<script>
$(document).ready(function () {
  var sourceBuffer;
  var player = document.getElementById("video1");
  var mediaSource = new MediaSource();
  player.src = URL.createObjectURL(mediaSource);
  mediaSource.addEventListener('sourceopen', sourceOpen);

  // array with incoming segments:
  var mediaSegments = [];

  var ws = new WebSocket("ws://localhost:8080/echo");
  ws.binaryType = "arraybuffer";

  player.addEventListener("error", function (err) {
    $("#id1").append("video error " + err.error + "\n");
  }, false);
  player.addEventListener("playing", function () {
    $("#id1").append("playing\n");
  }, false);
  player.addEventListener("progress", onProgress);

  ws.onopen = function () {
    $("#id1").append("Socket opened\n");
  };

  function sourceOpen() {
    sourceBuffer = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.64001E"');
  }

  function onUpdateEnd() {
    if (!mediaSegments.length) {
      return;
    }
    sourceBuffer.appendBuffer(mediaSegments.shift());
  }

  var initSegment = true;
  ws.onmessage = function (evt) {
    if (evt.data instanceof ArrayBuffer) {
      var buffer = evt.data;
      // the first segment is always 'initSegment'
      // it must be appended to the buffer first
      if (initSegment == true) {
        sourceBuffer.appendBuffer(buffer);
        sourceBuffer.addEventListener('updateend', onUpdateEnd);
        initSegment = false;
      } else {
        mediaSegments.push(buffer);
      }
    }
  };
});
</script>
I also tried different profile codes for the MIME type, even though I know that my codec is "high profile". I tried the following profiles:
avc1.42E01E baseline
avc1.58A01E extended profile
avc1.4D401E main profile
avc1.64001E high profile
In some examples I found from 2-3 years ago, I have seen developers using type="video/x-matroska", but probably a lot has changed since then, because now even video.src doesn't handle this sort of MIME type.
Additionally, in order to make sure the chunks I am sending through the stream are not corrupted, I opened a local streaming session in VLC player and it played progressively with no issues.
The only thing I suspect is that the MediaSource doesn't know how to handle this sort of hybrid container. And then I wonder why the video object plays such a video OK. Am I missing something in my client-side code? Or does the MediaSource API indeed not support this type of media?
PS: For those curious why I am using the MKV container and not MPEG-DASH, for example: the answer is container simplicity, data writing speed, and size. EBML structures are very compact and easy to write in real time.
