I've worded my title and tags so they're searchable for both video and audio, since this question isn't specific to one. My specific case only concerns audio, though, so the question body is written with that in mind.
First, the big picture:
I'm sending audio to multiple P2P clients who connect and disconnect at random intervals. The audio I'm sending is a stream, but each client only needs the part of the stream from the point at which they connected. Here's how I solved that:
Every {timeout} (e.g. 1000ms), create a new audio blob
Each blob is a full audio file, with all the metadata it needs to be playable
As soon as a blob is created, convert it to an ArrayBuffer (better browser support) and send it to the client over WebRTC (or WebSockets if WebRTC isn't supported)
That works well. There is a delay, but if you keep the timeout low enough, it's fine.
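For context, the sender side is roughly the following sketch (not my exact code; `stream` is the captured audio MediaStream and `send(buffer)` stands in for my WebRTC/WebSocket transport):

const TIMEOUT = 1000; // ms of audio per self-contained blob

function recordOneChunk(stream) {
  // No timeslice: the single dataavailable event fired on stop() contains a
  // complete, playable file with its own header.
  const recorder = new MediaRecorder(stream);
  recorder.ondataavailable = async (e) => {
    const buffer = await e.data.arrayBuffer(); // ArrayBuffer travels better than Blob
    send(buffer);
  };
  recorder.start();
  setTimeout(() => {
    recorder.stop();        // finalize this blob...
    recordOneChunk(stream); // ...and immediately start recording the next one
  }, TIMEOUT);
}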
Now, my question:
How can I play my "stream" without having any audible delay?
I say stream, but I didn't implement it with the Streams API; it's a queue of blobs that gets updated every time the client receives new data.
I've tried a lot of different things like:
Creating a BufferSource and merging two blobs (converted to AudioBuffers), then playing that
Passing an actual stream from Stream API to clients instead of blobs
Playing blobs sequentially, relying on ended event
Loading next blob while current blob is playing
Each has problems, difficulties, or still results in an audible delay.
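For reference, the BufferSource attempt looked roughly like this (a reconstructed sketch, not the exact code I ran): decode two blobs, copy them into one AudioBuffer, and play it.

const ctx = new AudioContext();

async function playMerged(blobA, blobB) {
  // Decode both blobs into AudioBuffers
  const [bufA, bufB] = await Promise.all(
    [blobA, blobB].map(async (b) => ctx.decodeAudioData(await b.arrayBuffer()))
  );
  // Copy them back-to-back into a single buffer
  const merged = ctx.createBuffer(
    bufA.numberOfChannels,
    bufA.length + bufB.length,
    bufA.sampleRate
  );
  for (let ch = 0; ch < bufA.numberOfChannels; ch++) {
    merged.getChannelData(ch).set(bufA.getChannelData(ch), 0);
    merged.getChannelData(ch).set(bufB.getChannelData(ch), bufA.length);
  }
  const source = ctx.createBufferSource();
  source.buffer = merged;
  source.connect(ctx.destination);
  source.start();
}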
Here's my most recent attempt at this:
let firstTime = true;
const chunks = [];

Events.on('audio-received', ({ detail: audioChunk }) => {
  chunks.push(audioChunk);

  // Wait until a few chunks have arrived, then set up two <audio> elements
  // that take turns: while one plays, the other loads the next chunk.
  if (firstTime && chunks.length > 2) {
    const currentAudio = document.createElement("audio");
    currentAudio.controls = true;
    currentAudio.preload = 'auto';
    document.body.appendChild(currentAudio);
    currentAudio.src = URL.createObjectURL(chunks.shift());
    currentAudio.play();

    const nextAudio = document.createElement("audio");
    nextAudio.controls = true;
    nextAudio.preload = 'auto';
    document.body.appendChild(nextAudio);
    nextAudio.src = URL.createObjectURL(chunks.shift());

    let currentAudioStartTime, nextAudioStartTime;

    currentAudio.addEventListener("ended", () => {
      nextAudio.play();
      nextAudioStartTime = new Date();
      if (chunks.length) {
        currentAudio.src = URL.createObjectURL(chunks.shift());
      }
    });

    nextAudio.addEventListener("ended", () => {
      currentAudio.play();
      currentAudioStartTime = new Date();
      // Log the gap between one element ending and the other starting
      console.log(currentAudioStartTime - nextAudioStartTime);
      if (chunks.length) {
        nextAudio.src = URL.createObjectURL(chunks.shift());
      }
    });

    firstTime = false;
  }
});
The audio-received event fires every ~1000ms. This code works; it plays each "chunk" after the previous one has finished, but on Chrome there is a ~300ms delay that's very audible. It plays the first chunk, then goes quiet, then plays the second, and so on. On Firefox the delay is about 50ms.
Can you help me?
I can try to create a reproducible example if that would help.
(See https://github.com/norbjd/wavesurfer-upload-and-record for a minimal reproducible example).
I'm using wavesurfer.js to display audio uploaded by the user as a waveform, and I'm trying to add a feature for recording a part of the uploaded audio.
So I've created a "Record" button (for now it records only 5 seconds of the audio) with the following click handler, using the MediaRecorder API:
document
  .querySelector('[data-action="record"]')
  .addEventListener('click', () => {
    // re-use audio context from wavesurfer instead of creating a new one
    const audioCtx = wavesurfer.backend.getAudioContext();
    const dest = audioCtx.createMediaStreamDestination();
    const audioStream = dest.stream;

    // `audio` is the <audio> element holding the uploaded file (defined elsewhere)
    audioCtx.createMediaElementSource(audio).connect(dest);

    const chunks = [];
    const rec = new MediaRecorder(audioStream);

    rec.ondataavailable = (e) => {
      chunks.push(e.data);
    };

    rec.onstop = () => {
      const blob = new Blob(chunks, { type: "audio/ogg" });
      const a = document.createElement("a");
      a.download = "export.ogg";
      a.href = URL.createObjectURL(blob);
      a.textContent = "export the audio";
      a.click();
      window.URL.revokeObjectURL(a.href);
    };

    wavesurfer.play();
    rec.start();

    setTimeout(() => {
      rec.stop();
      wavesurfer.stop();
    }, 5 * 1000);
  });
When I click the button to record, the wavesurfer should play (wavesurfer.play()), but I can't hear anything from my browser (although I can see the cursor move). At the end of the recording (5 seconds, set with setTimeout), I can download the recorded audio (the rec.onstop function) and the sound plays correctly in VLC or any other media player.
However, I can't play audio anymore on the webpage via the browser. I can still record audio, and the recorded audio can be downloaded and played correctly.
I'm wondering why audio won't play in the browser after clicking the "Record" button for the first time. I think that this line:
audioCtx.createMediaElementSource(audio).connect(dest);
is the issue, but without it, I can't record audio.
I've also tried creating a new AudioContext instead of reusing wavesurfer's:
const audioCtx = new AudioContext();
but that doesn't work any better (same issue).
I've reproduced the issue in a minimal reproducible example: https://github.com/norbjd/wavesurfer-upload-and-record, so feel free to check it out. Any help is welcome!
You don't need a separate AudioContext, but you do need a MediaStreamAudioDestinationNode created from the same AudioContext (wavesurfer's, in your case) as the audio node you want to record, and you need to connect that audio node to the destination.
You can see a complete example of capturing audio and screen video here:
https://github.com/petersalomonsen/javascriptmusic/blob/master/wasmaudioworklet/screenrecorder/screenrecorder.js
(connecting the audio node to record is done after the recording has started, on line 52)
and you can test it live here: https://petersalomonsen.com/webassemblymusic/livecodev2/?gist=c3ad6c376c23677caa41eb79dddb5485
(Toggle the capture checkbox to start recording and press the play button to start the music, toggle the capture checkbox again to stop the recording).
and you can see the actual recording being done on this video: https://youtu.be/FHST7rLxhLM
As you can see in that example, it is still possible to play audio after the recording is finished.
Note that this example has only been tested for Chrome and Firefox.
And specifically for your case with wavesurfer:
Instead of just backend: 'MediaElement', switch to backend: 'MediaElementWebAudio',
and instead of audioCtx.createMediaElementSource(audio).connect(dest);, you can change it to wavesurfer.backend.sourceMediaElement.connect(dest); to reuse the existing source from wavesurfer (though it also works without this).
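Putting those two changes together, the click handler from the question would look roughly like this (a sketch only, assuming wavesurfer was created with backend: 'MediaElementWebAudio'):

document
  .querySelector('[data-action="record"]')
  .addEventListener('click', () => {
    // same AudioContext as wavesurfer, and a destination created from it
    const audioCtx = wavesurfer.backend.getAudioContext();
    const dest = audioCtx.createMediaStreamDestination();

    // reuse wavesurfer's existing source node instead of creating a new one
    wavesurfer.backend.sourceMediaElement.connect(dest);

    const chunks = [];
    const rec = new MediaRecorder(dest.stream);
    rec.ondataavailable = (e) => chunks.push(e.data);
    rec.onstop = () => {
      const blob = new Blob(chunks, { type: "audio/ogg" });
      // ...download the blob exactly as in the original handler
    };

    wavesurfer.play();
    rec.start();
    setTimeout(() => {
      rec.stop();
      wavesurfer.stop();
    }, 5 * 1000);
  });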
For recording audio and video, I am creating WebM files in the ondataavailable callback of the MediaRecorder API, and I need to play each created WebM file individually.
MediaRecorder inserts header information into the first chunk (WebM file) only, so the rest of the chunks do not play individually without that header information.
As suggested in link 1 and link 2, I extracted the header information from the first chunk,
// for the most regular webm files, the header information exists
// between 0 to 189 Uint8 array elements
const headerInformation = arrayBufferFirstChunk.slice(0, 189);
and prepended this header information to the second chunk. The second chunk still could not play, but this time the browser showed a poster (single frame) of the video and a duration equal to the sum of the two chunks, e.g. 10 seconds (the duration of each chunk is 5 seconds).
I did the same header-information transplant with a hex editor: I opened the WebM files in the editor, copied the first 190 elements from the first WebM file, and put them into the second file, something like the image below. Even then, the second WebM file could not play, and the result was the same as in the previous example.
The red color shows the header information:
This time I copied the header and cluster information from the first WebM file and placed it into the second file, something like the image below, but did not have any success.
Questions
What am I doing wrong here?
Is there any way we can play the WebM files/chunks individually?
Note: I can't use the MediaSource to play those chunks.
Edit 1
As #Brad suggested, I want to prepend all the content before the first cluster to a later cluster. I have a few WebM files, each with a duration of 5 seconds. After digging into the files, I found that almost every alternate file has no cluster point (no 0x1F43B675).
Here I am confused: do I have to insert the header information (initialization data) at the beginning of every file, or at the beginning of every first cluster? If I choose the latter option, then how will a WebM file that doesn't have any cluster play?
Or do I first need to make each WebM file in such a way that it has a cluster at the very beginning, so I can prepend the header information before the cluster in those files?
Edit 2
After some digging and reading this, I came to the conclusion that each WebM file needs header info, a cluster, and the actual data.
// for the most regular webm files, the header information exists
// between 0 to 189 Uint8 array elements
Without seeing the actual file data it's hard to say, but this is possibly wrong. The "header information" needs to be everything up to the first Cluster element. That is, you want to keep all data from the start of the file up to before you see 0x1F43B675 and treat it as initialization data. This can/will vary from file to file. In my test file, this occurs a little after 1 KB in.
and prepended this header information to the second chunk. The second chunk still could not play, but this time the browser showed a poster (single frame) of the video and a duration equal to the sum of the two chunks, e.g. 10 seconds (the duration of each chunk is 5 seconds).
The chunks output from the MediaRecorder aren't relevant for segmentation, and can occur at various times. You would actually want to split on the Cluster element. That means you need to parse this WebM file, at least to the point of splitting out Clusters when their identifier 0x1F43B675 comes by.
Is there any way we can play the WebM files/chunks individually?
You're on the right path; just prepend everything before the first Cluster to a later Cluster.
Once you've got that working, the next problem you'll likely hit is that you won't be able to do this with just any cluster. The first Cluster must begin with a keyframe or the browser won't decode it. Chrome will skip over to the next cluster, to a point, but it isn't reliable. Unfortunately, there's no way to configure keyframe placement with MediaRecorder. If you're lucky enough to be able to process this video server-side, here's how to do it with FFmpeg: https://stackoverflow.com/a/45172617/362536
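To make the idea concrete, here is a minimal sketch of that split (not the asker's code; it assumes `chunks` is the array of Blobs from ondataavailable, that the first blob contains a Cluster, that the later chunk you pick starts on a keyframe Cluster, and that it runs inside an async function):

const CLUSTER_ID = [0x1f, 0x43, 0xb6, 0x75];

async function findFirstCluster(blob) {
  const bytes = new Uint8Array(await blob.arrayBuffer());
  for (let i = 0; i + 4 <= bytes.length; i++) {
    if (CLUSTER_ID.every((b, j) => bytes[i + j] === b)) return i;
  }
  return -1; // no Cluster element in this blob
}

// Everything before the first Cluster is the initialization data...
const clusterOffset = await findFirstCluster(chunks[0]);
// (guard for clusterOffset === -1 omitted for brevity)
const initSegment = chunks[0].slice(0, clusterOffset);

// ...and can be prepended to a later chunk, provided that chunk starts on a
// Cluster whose first frame is a keyframe (see the caveat above).
const standalone = new Blob([initSegment, chunks[2]], { type: chunks[0].type });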
Okay, it looks like this is not that easy: you have to scan through the blob to find the magic value.
let offset = -1;
let value = 0;
const magicNumber = parseInt("0x1F43B675".match(/[a-fA-F0-9]{2}/g).reverse().join(''), 16);

while (value !== magicNumber) {
  offset = offset + 1;
  try {
    const arr = await firstChunk.slice(offset, offset + 4).arrayBuffer().then(buffer => new Int32Array(buffer));
    value = arr[0];
  }
  catch (error) {
    return;
  }
}
offset = offset + 4;
The answer is 193 199
const header = firstChunk.slice(0, offset);
const blobType = firstChunk.type;
// `chunk` below is any later blob from ondataavailable
const blob = new Blob([header, chunk], { type: blobType });
And there you have it. Now the question is: how did I get this number? And why is it not a multiple of 42?
Brute force
Well, the logic is simple: record the video, gather chunks, slice the first chunk, build a new blob, and try to play it with an HTMLVideoElement. If it fails, increase the offset.
(async () => {
  const microphoneAudioStream = await navigator.mediaDevices.getUserMedia({ video: true, audio: true });
  const mediaRecorder = new MediaRecorder(microphoneAudioStream);
  let chunks = [];

  mediaRecorder.addEventListener('dataavailable', (event) => {
    const blob = event.data;
    chunks = [...chunks, blob];
  });

  mediaRecorder.addEventListener("stop", async () => {
    const [firstChunk, ...restofChunks] = chunks;
    const [secondBlob] = restofChunks;
    const blobType = firstChunk.type;
    let index = 0;
    const video = document.createElement("video");

    while (index < 1000) {
      const header = firstChunk.slice(0, index);
      const blob = new Blob([header, secondBlob], { type: blobType });
      const url = window.URL.createObjectURL(blob);
      try {
        video.setAttribute("src", url);
        await video.play();
        console.log(index);
        break;
      }
      catch (error) {
      }
      window.URL.revokeObjectURL(url);
      index++;
    }
  });

  mediaRecorder.start(200);

  const stop = () => {
    mediaRecorder.stop();
  };

  setTimeout(stop, 400);
})();
I noticed that for a smaller timeslice param in MediaRecorder.start and a smaller timeout param in setTimeout, the header offset becomes 1. Sadly, still not 42.
The Problem:
During a WebRTC unicast video conference, I can successfully stream video from a mobile device's webcam to a laptop/desktop. I would like to record the remote stream on the laptop/desktop side. (The setup is that a mobile device streams to a laptop/desktop).
However, it is common for the video stream to hang from time to time. That's not a problem in itself, since the "viewer" side will catch up. The recording of the remote stream, however, stops at the first hang.
Minimal, Stripped-Down Implementation (Local Recording):
I can successfully record the local stream from navigator.mediaDevices.getUserMedia() as follows:
const recordedChunks = [];

navigator.mediaDevices.getUserMedia({
  video: true,
  audio: false
}).then(stream => {
  const localVideoElement = document.getElementById('local-video');
  localVideoElement.srcObject = stream;
  return stream;
}).then(stream => {
  // mimeType belongs in the constructor options; start() takes the timeslice
  const mediaRecorder = new MediaRecorder(stream, { mimeType: 'video/webm;codecs=vp9' });
  mediaRecorder.ondataavailable = (event) => {
    if (event.data && event.data.size > 0) {
      recordedChunks.push(event.data);
    }
  };
  mediaRecorder.start(10);
});
I can download this quite easily as follows:
const blob = new Blob(recordedChunks, { type: 'video/webm' });
const url = URL.createObjectURL(blob);
const a = document.createElement('a');
document.body.appendChild(a);
a.style = 'display: none';
a.href = url;
a.download = 'test.webm';
a.click();
window.URL.revokeObjectURL(url);
Minimal, Stripped-Down Implementation (Remote Recording):
The setup I am using requires recording the remote stream, not the local stream, because iOS Safari does not support the MediaRecorder API. I included the above to show that recording works on the local side. The implementation of the remote-stream recording is no different, except that I manually add a 0 Hz audio track to the video, because Chrome appears to have a bug where it won't record without an audio track.
const mediaStream = new MediaStream();
const audioContext = new AudioContext();
const destinationNode = audioContext.createMediaStreamDestination();
const oscillatorNode = audioContext.createOscillator();
oscillatorNode.frequency.setValueAtTime(0, audioContext.currentTime);
oscillatorNode.connect(destinationNode);
const audioTrack = destinationNode.stream.getAudioTracks()[0];
const videoTrack = remoteStream.getVideoTracks()[0]; // Defined somewhere else.
mediaStream.addTrack(videoTrack);
mediaStream.addTrack(audioTrack);
And then I perform the exact same operations that I do on the local stream example above to record the mediaStream variable.
As mentioned, at the first point where the remote stream hangs (due to network latency, perhaps), the remote recording ceases, so that on download the .webm file (converted to .mp4 via ffmpeg) only runs up to the point where the first hang occurred.
Attempts to Mitigate:
One mitigation I have tried is, rather than recording the remote stream obtained in the callback for WebRTC's ontrack event, to record the stream captured from the remote video element via remoteVideoElement.captureStream(). This does not fix the issue.
Any help would be much appreciated. Thank you.
Hopefully, someone is able to post an actual fix for you. In the meantime, here's a nasty, inefficient, totally-not-recommended workaround:
Route the incoming MediaStream to a video element.
Use requestAnimationFrame() to schedule drawing frames to a canvas. (Note that this removes any sense of genlock from the original video, and is not something you want to do. Unfortunately, we don't have a way of knowing when incoming frames occur, as far as I know.)
Use CanvasCaptureMediaStream as the video source.
Recombine the video track from CanvasCaptureMediaStream along with the audio track from the original MediaStream in a new MediaStream.
Use this new MediaStream for MediaRecorder.
I've done this with past projects where I needed to programmatically manipulate the audio and video. It works!
One big caveat is that there's a bug in Chrome where even though a capture stream is attached to a canvas, the canvas won't be updated if the tab isn't active/visible. And, of course, requestAnimationFrame is severely throttled at best if the tab isn't active, so you need another frame clock source. (I used audio processors, ha!)
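For what it's worth, a rough sketch of that round-trip (an outline only; `remoteStream` is the MediaStream from ontrack, and the resolution/frame rate values are placeholders):

// 1. Route the incoming stream to a (muted) video element
const video = document.createElement('video');
video.srcObject = remoteStream;
video.muted = true;
video.play();

// 2. Repaint a canvas on every animation frame (no genlock, as noted above)
const canvas = document.createElement('canvas');
canvas.width = 640;   // match the incoming resolution if known
canvas.height = 480;
const ctx = canvas.getContext('2d');
(function draw() {
  ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
  requestAnimationFrame(draw);
})();

// 3 & 4. Canvas capture as the video source, plus the original audio track(s)
const canvasStream = canvas.captureStream(30); // ~30 fps
const mixedStream = new MediaStream([
  ...canvasStream.getVideoTracks(),
  // if the remote stream has no audio track, substitute the silent
  // oscillator track from the question here instead
  ...remoteStream.getAudioTracks()
]);

// 5. Record the recombined stream
const recorder = new MediaRecorder(mixedStream, { mimeType: 'video/webm' });
const chunks = [];
recorder.ondataavailable = (e) => { if (e.data.size) chunks.push(e.data); };
recorder.start(1000);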
I am trying to do the following:
On the server I encode h264 packets into a WebM (MKV) container structure, so that each cluster gets a single frame packet. Only the first data chunk is different, as it contains something called the Initialization Segment. Here it is explained quite well.
Then I stream those clusters one by one in a binary stream via WebSocket to a browser, which is Chrome.
It probably sounds weird that I use the h264 codec and not VP8 or VP9, which are the native codecs for the WebM video format. But it appears that the HTML video tag has no problem playing this sort of container: if I just write the whole stream to a file and pass it to video.src, it plays fine. But I want to stream it in real time; that's why I am breaking the video into chunks and sending them over the WebSocket.
On the client, I am using the MediaSource API. I have little experience with Web technologies, but I found that this is probably the only way to go in my case.
And it doesn't work. I am getting no errors, the stream runs OK, and the video object emits no warnings or errors (checked via the developer console).
The client side code looks like this:
<script>
  $(document).ready(function () {
    var sourceBuffer;
    var player = document.getElementById("video1");
    var mediaSource = new MediaSource();
    player.src = URL.createObjectURL(mediaSource);
    mediaSource.addEventListener('sourceopen', sourceOpen);

    //array with incoming segments:
    var mediaSegments = [];

    var ws = new WebSocket("ws://localhost:8080/echo");
    ws.binaryType = "arraybuffer";

    player.addEventListener("error", function (err) {
      $("#id1").append("video error " + err.error + "\n");
    }, false);

    player.addEventListener("playing", function () {
      $("#id1").append("playing\n");
    }, false);

    player.addEventListener("progress", onProgress);

    ws.onopen = function () {
      $("#id1").append("Socket opened\n");
    };

    function sourceOpen() {
      sourceBuffer = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.64001E"');
    }

    function onUpdateEnd() {
      if (!mediaSegments.length) {
        return;
      }
      sourceBuffer.appendBuffer(mediaSegments.shift());
    }

    var initSegment = true;

    ws.onmessage = function (evt) {
      if (evt.data instanceof ArrayBuffer) {
        var buffer = evt.data;
        //the first segment is always 'initSegment'
        //it must be appended to the buffer first
        if (initSegment == true) {
          sourceBuffer.appendBuffer(buffer);
          sourceBuffer.addEventListener('updateend', onUpdateEnd);
          initSegment = false;
        }
        else {
          mediaSegments.push(buffer);
        }
      }
    };
  });
</script>
I also tried different profile codes in the MIME type, even though I know that my codec is "high profile". I tried the following profiles:
avc1.42E01E baseline
avc1.58A01E extended profile
avc1.4D401E main profile
avc1.64001E high profile
In some examples I found from 2-3 years ago, I have seen developers using type="video/x-matroska", but a lot has probably changed since then, because now even video.src doesn't handle this sort of MIME type.
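A quick way to sanity-check these MIME strings (a small aside, not part of my original code) is to ask MediaSource itself which ones it will accept before calling addSourceBuffer:

[
  'video/mp4; codecs="avc1.42E01E"',        // baseline
  'video/mp4; codecs="avc1.58A01E"',        // extended
  'video/mp4; codecs="avc1.4D401E"',        // main
  'video/mp4; codecs="avc1.64001E"',        // high
  'video/webm; codecs="vp8"',
  'video/x-matroska; codecs="avc1.64001E"'
].forEach((type) => console.log(type, MediaSource.isTypeSupported(type)));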
Additionally, in order to make sure the chunks I am sending through the stream are not corrupted, I opened a local streaming session in VLC player and it played it progressively with no issues.
The only thing I suspect is that MediaSource doesn't know how to handle this sort of hybrid container. And then I wonder why the video element plays such a video fine. Am I missing something in my client-side code? Or does the MediaSource API indeed not support this type of media?
PS: For those curious why I am using an MKV container and not, for example, MPEG-DASH: the answer is container simplicity, data-writing speed, and size. EBML structures are very compact and easy to write in real time.
I already looked at this question:
Concatenate parts of two or more webm video blobs
and tried the sample code here - https://developer.mozilla.org/en-US/docs/Web/API/MediaSource - (without modifications) in hopes of transforming the blobs into ArrayBuffers and appending those to a SourceBuffer of the MediaSource Web API, but even the sample code wasn't working in my Chrome browser, for which it is said to be compatible.
The crux of my problem is that I can't combine multiple WebM blob clips into one without incorrect playback after the first time it plays. To go straight to the problem, please scroll to just after the first two chunks of code; for background, continue reading.
I am designing a web application that allows a presenter to record scenes of him/herself explaining charts and videos.
I am using the MediaRecorder Web API to record video on Chrome/Firefox. (Side question - is there any other way (besides Flash) that I can record video/audio via webcam and mic? MediaRecorder is not supported on non-Chrome/Firefox user agents.)
navigator.mediaDevices.getUserMedia(constraints)
  .then(gotMedia)
  .catch(e => { console.error('getUserMedia() failed: ' + e); });

function gotMedia(stream) {
  recording = true;
  theStream = stream;
  vid.src = URL.createObjectURL(theStream);
  try {
    recorder = new MediaRecorder(stream);
  } catch (e) {
    console.error('Exception while creating MediaRecorder: ' + e);
    return;
  }
  theRecorder = recorder;
  recorder.ondataavailable =
    (event) => {
      tempScene.push(event.data);
    };
  theRecorder.start(100);
}

function finishRecording() {
  recording = false;
  theRecorder.stop();
  theStream.getTracks().forEach(track => { track.stop(); });
  while (tempScene[0].size != 1) {
    tempScene.splice(0, 1);
  }
  console.log(tempScene);
  scenes.push(tempScene);
  tempScene = [];
}
The function finishRecording gets called and a scene (an array of blobs of mimetype 'video/webm') gets saved to the scenes array. After it is saved, the user can record and save more scenes via this process. He can then view a given scene using the following chunk of code.
function showScene(sceneNum) {
  var sceneBlob = new Blob(scenes[sceneNum], { type: 'video/webm; codecs=vorbis,vp8' });
  vid.src = URL.createObjectURL(sceneBlob);
  vid.play();
}
In the above code, what happens is that the blob array for the scene gets turned into one big blob, for which a URL is created and assigned to the video's src attribute, so -
[blob, blob, blob] => sceneBlob (an object, not array)
Up until this point everything works fine and dandy. Here is where the issue starts:
I try to merge all the scenes into one by combining the blob arrays for each scene into one long blob array. The point of this functionality is so that the user can order the scenes however he/she deems fit and so he can choose not to include a scene. So they aren't necessarily in the same order as they were recorded in, so -
scene 1: [blob-1, blob-1] scene 2: [blob-2, blob-2]
final: [blob-2, blob-2, blob-1, blob-1]
and then I make a blob of the final blob array, so -
final: [blob, blob, blob, blob] => finalBlob
The code is below for merging the scene blob arrays
function mergeScenes() {
  scenes[scenes.length] = [];
  for (var i = 0; i < scenes.length - 1; i++) {
    scenes[scenes.length - 1] = scenes[scenes.length - 1].concat(scenes[i]);
  }
  mergedScenes = scenes[scenes.length - 1];
  console.log(scenes[scenes.length - 1]);
}
This final scene can be viewed by using the showScene function in the second small chunk of code because it is appended as the last scene in the scenes array. When the video is played with the showScene function it plays all the scenes all the way through. However, if I press play on the video after it plays through the first time, it only plays the last scene.
Also, if I download and play the video through my browser, the first time around it plays correctly - the subsequent times, I see the same error.
What am I doing wrong? How can I merge the files into one video containing all the scenes? Thank you very much for your time in reading this and helping me, and please let me know if I need to clarify anything.
I am using a <video> element to display the scenes.
The file's headers (metadata) are only appended to the first chunk of data you've got.
You can't make a new video file by just pasting the chunks one after the other; they've got a structure.
So how do you work around this?
If I understood your problem correctly, what you need is to be able to merge all the recorded videos, as if the recording had only been paused.
Well, this can be achieved thanks to the MediaRecorder.pause() method.
You can keep the stream open and simply pause the MediaRecorder. At each pause event, you'll be able to generate a new video containing all the frames from the beginning of the recording up to this event.
Here is an external demo, because stack snippets don't work well with gUM...
And if you ever also need shorter videos covering each resume/pause span, you could simply create new MediaRecorders for those smaller parts while keeping the big one running.
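A minimal sketch of that pause/resume idea (my wording of it, assuming `stream` comes from getUserMedia and the browser supports MediaRecorder.pause()/resume()/requestData()):

const chunks = [];
const recorder = new MediaRecorder(stream, { mimeType: 'video/webm' });
recorder.ondataavailable = (e) => { if (e.data.size) chunks.push(e.data); };
recorder.start();

function endScene() {
  recorder.requestData();   // flush what has been recorded so far
  recorder.pause();         // keep the recorder (and its single header) alive
}

function startNextScene() {
  recorder.resume();        // continues the same file, like un-pausing
}

function previewAllScenes(videoEl) {
  // One blob containing every scene recorded so far, in recording order
  const blob = new Blob(chunks, { type: 'video/webm' });
  videoEl.src = URL.createObjectURL(blob);
}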