Trim or cut ArrayBuffer from audio by timestamp in Node - javascript

I am fetching data from a remote URL that hosts some audio. For instance: https://www.listennotes.com/e/p/98bcfa3fd1b44727913385938788bcc5/
I do this with the following code:
const buffer = await (await fetch(url)).arrayBuffer();
How do I trim/cut this ArrayBuffer from the audio by time? For example, I might want the ArrayBuffer/Blob between the 12-second and 60-second marks.
All the solutions I have found are web solutions. I am hoping for a way to do this server-side with Node.
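One possible server-side approach, sketched under assumptions (Node 18+ for the global fetch, a local ffmpeg binary, the fluent-ffmpeg package, and an MP3 source; the trimAudio helper is my own name, not from the question), is to hand the downloaded bytes to ffmpeg and let it cut by time:

const { Readable } = require('stream');
const ffmpeg = require('fluent-ffmpeg'); // requires ffmpeg to be installed on the machine

async function trimAudio(url, startSeconds, durationSeconds, outPath) {
  // download the audio exactly as in the question
  const buffer = Buffer.from(await (await fetch(url)).arrayBuffer());
  return new Promise((resolve, reject) => {
    ffmpeg()
      .input(Readable.from([buffer]))   // feed the in-memory bytes to ffmpeg
      .setStartTime(startSeconds)       // e.g. 12
      .setDuration(durationSeconds)     // e.g. 48, to stop at the 60-second mark
      .format('mp3')                    // assumption: the source is MP3; match the real codec
      .on('end', resolve)
      .on('error', reject)
      .save(outPath);
  });
}

Seeking on a piped input makes ffmpeg decode and discard everything before the start time, which is slower than seeking in a file but avoids writing the full download to disk first.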

Related

Split websocket message in multiple frames

I am using the native JavaScript WebSocket in the browser, and we have an application hosted on AWS where every request goes through API Gateway.
In some cases the request data goes up to 60 KB, and then my WebSocket connection closes automatically. In the AWS documentation I found the following explanation of this issue:
https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-known-issues.html
API Gateway supports message payloads up to 128 KB with a maximum frame size of 32 KB. If a message exceeds 32 KB, you must split it into multiple frames, each 32 KB or smaller. If a larger message is received, the connection is closed with code 1009.
I tried to find out how I can split a message into multiple frames using the native JavaScript WebSocket, but could not find any frame-related configuration in the documentation or anywhere else.
I did find something related to message fragmentation, but it seems to be a custom solution that I would need to implement on both the frontend and the backend:
https://developer.mozilla.org/en-US/docs/Web/API/WebSockets_API/Writing_WebSocket_servers#message_fragmentation
As far as I know, you cannot do this using the JS AWS SDK "postToConnection" API. The best you can do is write your own poor man's fragmentation and send the chunks as independent messages.
const splitInChunks =
  (sizeInBytes: number) =>
  (buffer: Buffer): Buffer[] => {
    const size = Buffer.byteLength(buffer);
    let start = 0;
    let end = sizeInBytes;
    const chunks: Buffer[] = [];
    do {
      chunks.push(buffer.subarray(start, end));
      start += sizeInBytes;
      end += sizeInBytes;
    } while (start < size);
    return chunks;
  };
Where sizeInBytes must be smaller than 32 KB. Then you iterate over the chunks:
await Promise.all(chunks.map(c => apiGatewayClient.postToConnection({ data: JSON.stringify(c), connectionId: myConnectionId })));
This may run into rate limits depending on the number of chunks, so consider sending the requests serially and not in parallel.
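For example, a minimal sketch of sending them one at a time, reusing the chunks array and the (hypothetical) apiGatewayClient from the snippet above:

for (const chunk of chunks) {
  // wait for each post to finish before sending the next chunk
  await apiGatewayClient.postToConnection({
    data: JSON.stringify(chunk),
    connectionId: myConnectionId,
  });
}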
Final remark: Buffer.prototype.subarray is very efficient because it does not reallocate memory: the new chunks point at the same memory as the original buffer. Think pointer arithmetic in C.
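A quick standalone illustration of that point (not part of the answer's code):

const buf = Buffer.from([1, 2, 3, 4]);
const view = buf.subarray(0, 2);
view[0] = 9;          // writing through the view...
console.log(buf[0]);  // ...prints 9, because both refer to the same memory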

How to play WEBM files individually which are created by MediaRecorder

For recording audio and video, I am creating webm files in the ondataavailable handler of the MediaRecorder API. I have to play each created webm file individually.
The MediaRecorder API inserts header information into the first chunk (webm file) only, so the rest of the chunks do not play individually without that header information.
As suggested in link 1 and link 2, I have extracted the header information from the first chunk,
// for the most regular webm files, the header information exists
// between 0 to 189 Uint8 array elements
const headerInformation = arrayBufferFirstChunk.slice(0, 189);
and prepended this header information to the second chunk. Still, the second chunk could not play, but this time the browser showed the poster (a single frame) of the video and a duration equal to the sum of the two chunks, e.g. 10 seconds; the duration of each chunk is 5 seconds.
I did the same header-information thing with a hex editor. I opened the webm file in the editor, copied the first 190 elements from the first webm file and put them into the second file, something like the image below. Even then, the second webm file could not play and the result was the same as in the previous example.
The red color shows the header information:
This time I copied the header and cluster information from the first webm file and placed it into the second file, something like the image below, but still did not get it to play.
Questions
What am I doing wrong here?
Is there any way that we can play the webm files/chunks individually?
Note: I can't use the MediaSource to play those chunks.
Edit 1
As #Brad suggested, I want to prepend all the content before the first cluster to a later cluster. I have a few webm files, each with a duration of 5 seconds. After digging into the files, I came to know that almost every alternate file has no cluster point (no 0x1F43B675).
Here I am confused: do I have to insert the header information (initialization data) at the beginning of every file, or at the beginning of every first cluster? If I choose the latter option, then how will a webm file that doesn't have any cluster play?
Or do I first need to make each webm file in such a way that it has a cluster at the very beginning, so I can prepend the header information before the cluster in those files?
Edit 2
After some digging and reading this, I came to the conclusion that each webm file needs the header info, a cluster, and the actual data.
// for the most regular webm files, the header information exists
// between 0 to 189 Uint8 array elements
Without seeing the actual file data it's hard to say, but this is possibly wrong. The "header information" needs to be everything up to the first Cluster element. That is, you want to keep all data from the start of the file up to before you see 0x1F43B675 and treat it as initialization data. This can/will vary from file to file. In my test file, this occurs a little after 1 KB in.
and prepended this header information to the second chunk. Still, the second chunk could not play, but this time the browser showed the poster (a single frame) of the video and a duration equal to the sum of the two chunks, e.g. 10 seconds; the duration of each chunk is 5 seconds.
The chunks output from the MediaRecorder aren't relevant for segmentation, and can occur at various times. You would actually want to split on the Cluster element. That means you need to parse this WebM file, at least to the point of splitting out Clusters when their identifier 0x1F43B675 comes by.
Is there any way that we can play the webm files/chunks individually?
You're on the right path, just prepend everything before the first Cluster to a later Cluster.
Once you've got that working, the next problem you'll likely hit is that you won't be able to do this with just any cluster. The first Cluster must begin with a keyframe or the browser won't decode it. Chrome will skip over to the next cluster, to a point, but it isn't reliable. Unfortunately, there's no way to configure keyframe placement with MediaRecorder. If you're lucky enough to be able to process this video server-side, here's how to do it with FFmpeg: https://stackoverflow.com/a/45172617/362536
Okay, it looks like this is not that easy: you have to scan through the blob to find the magic value.
let offset = -1;
let value = 0;
const magicNumber = parseInt("0x1F43B675".match(/[a-fA-F0-9]{2}/g).reverse().join(''), 16);
while (value !== magicNumber) {
  offset = offset + 1;
  try {
    const arr = await firstChunk.slice(offset, offset + 4).arrayBuffer().then(buffer => new Int32Array(buffer));
    value = arr[0];
  } catch (error) {
    return;
  }
}
offset = offset + 4;
The answer is 193 199
const header = firstChunk.slice(0, offset);
const blobType = firstChunk.type;
const blob = new Blob([header, chunk], { type: blobType });
And there you have it. Now the question is: how did I get this number? Why is it not a multiple of 42?
Brute force
Well, the logic is simple: record the video, gather the chunks, slice the first chunk, compute a new blob and try to play it with an HTMLVideoElement. If it fails, increase the offset.
(async () => {
  const microphoneAudioStream = await navigator.mediaDevices.getUserMedia({ video: true, audio: true });
  const mediaRecorder = new MediaRecorder(microphoneAudioStream);
  let chunks = [];
  mediaRecorder.addEventListener('dataavailable', (event) => {
    const blob = event.data;
    chunks = [...chunks, blob];
  });
  mediaRecorder.addEventListener("stop", async () => {
    const [firstChunk, ...restofChunks] = chunks;
    const [secondBlob] = restofChunks;
    const blobType = firstChunk.type;
    let index = 0;
    const video = document.createElement("video");
    while (index < 1000) {
      const header = firstChunk.slice(0, index);
      const blob = new Blob([header, secondBlob], { type: blobType });
      const url = window.URL.createObjectURL(blob);
      try {
        video.setAttribute("src", url);
        await video.play();
        console.log(index);
        break;
      } catch (error) {
      }
      window.URL.revokeObjectURL(url);
      index++;
    }
  });
  mediaRecorder.start(200);
  const stop = () => {
    mediaRecorder.stop();
  };
  setTimeout(stop, 400);
})();
I noticed that for a smaller timeslice param in MediaRecorder.start and a smaller timeout param in setTimeout, the header offset becomes 1. Sadly, still not 42.

Is there a way to download an audio file of all the audio streams playing on a particular page?

I'm building an application using PizzicatoJS + HowlerJS. Those libraries essentially allow me to play multiple audio files at the same time. Imagine 4 audio tracks, each containing an instrument like guitar, bass, drums, vocals, etc.
Everything plays fine when using PizzicatoJS's Group functionality or running a forEach loop on all my Howl sounds and firing .play(). However, I would like to download the final resulting sound I am hearing from my speakers. Any idea on how to approach that?
I looked into OfflineAudioContext, but I am unsure how to use it to generate an audio file. It looks like it needs an audio source like an <audio> tag. Is what I'm trying to do possible? Any help is appreciated.
I think the OfflineAudioContext can help with your use case.
Let's say you want to create a file with a length of 10 seconds. It should contain one sound playing from the start up to second 8. And there is also another sound which is supposed to start at second 5 and should last until the end. Both sounds are AudioBuffers (named soundBuffer and anotherSoundBuffer) already.
You could arrange and combine the sounds as follows.
const sampleRate = 44100;
const offlineAudioContext = new OfflineAudioContext({
  length: sampleRate * 10,
  sampleRate
});
const soundSourceNode = new AudioBufferSourceNode(offlineAudioContext, {
  buffer: soundBuffer
});
soundSourceNode.start(0);
soundSourceNode.stop(8);
soundSourceNode.connect(offlineAudioContext.destination);
const anotherSoundSourceNode = new AudioBufferSourceNode(offlineAudioContext, {
  buffer: anotherSoundBuffer
});
anotherSoundSourceNode.start(5);
anotherSoundSourceNode.stop(10);
anotherSoundSourceNode.connect(offlineAudioContext.destination);
offlineAudioContext
  .startRendering()
  .then((audioBuffer) => {
    // save the resulting buffer as a file
  });
Now you can use a library to turn the resulting AudioBuffer into an encoded audio file. One library which does that is, for example, audiobuffer-to-wav.
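For example, a sketch using the audiobuffer-to-wav package (the download-via-anchor part is just one way to save the file and is not from the answer):

import toWav from 'audiobuffer-to-wav';

offlineAudioContext
  .startRendering()
  .then((audioBuffer) => {
    const wav = toWav(audioBuffer);                      // ArrayBuffer holding a WAV file
    const blob = new Blob([wav], { type: 'audio/wav' });
    const anchor = document.createElement('a');
    anchor.href = URL.createObjectURL(blob);
    anchor.download = 'mix.wav';
    anchor.click();                                      // triggers the download
  });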

Streaming the Microphone output via HTTP POST using chunked transfer

We are trying to build an app to broadcast live audio to multiple subscribers. The server (written in Go) accepts PCM data in chunks, and a client using PyAudio is able to tap into the microphone and send this data using the code below. We have tested this and it works. The audio plays from any browser with the subscriber URL.
import pyaudio
import requests
import time

p = pyaudio.PyAudio()

# frames per buffer ?
CHUNK = 1024
# 16 bits per sample ?
FORMAT = pyaudio.paInt16
# 44.1k sampling rate ?
RATE = 44100
# number of channels
CHANNELS = 1

STREAM = p.open(
    format=FORMAT,
    channels=CHANNELS,
    rate=RATE,
    input=True,
    frames_per_buffer=CHUNK
)

print "initialized stream"

def get_chunks(stream):
    while True:
        try:
            chunk = stream.read(CHUNK, exception_on_overflow=False)
            yield chunk
        except IOError as ioe:
            print "error %s" % ioe

url = "https://<server-host>/stream/publish/<uuid>/"
s = requests.session()
s.headers.update({'Content-Type': "audio/x-wav;codec=pcm"})
resp = s.post(url, data=get_chunks(STREAM))
But we need browser, iOS and Android clients to do the same thing as the client above. We are able to fetch the audio from the mic using the getUserMedia API in the browser, but are unable to send this audio to the server the way the Python code above does. Can someone point us in the right direction?
This is about a year old now, so I am sure you've moved on, but I think the approach to use from the browser is to stream the data over a WebSocket rather than over HTTP.
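A rough sketch of what that could look like in the browser (the endpoint mirrors the placeholder URL from the Python example; ScriptProcessorNode is deprecated but keeps the sketch short, an AudioWorklet would be the modern replacement):

(async () => {
  const socket = new WebSocket('wss://<server-host>/stream/publish/<uuid>/');
  const micStream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const audioContext = new AudioContext({ sampleRate: 44100 });
  const source = audioContext.createMediaStreamSource(micStream);
  const processor = audioContext.createScriptProcessor(1024, 1, 1);
  processor.onaudioprocess = (event) => {
    const samples = event.inputBuffer.getChannelData(0); // Float32 samples in [-1, 1]
    const pcm = new Int16Array(samples.length);
    for (let i = 0; i < samples.length; i++) {
      pcm[i] = Math.max(-1, Math.min(1, samples[i])) * 0x7fff; // convert to 16-bit PCM
    }
    if (socket.readyState === WebSocket.OPEN) {
      socket.send(pcm.buffer); // raw little-endian PCM frames, like the PyAudio client
    }
  };
  source.connect(processor);
  processor.connect(audioContext.destination);
})();

The server would need a matching WebSocket handler, since the audio no longer arrives as a chunked HTTP POST.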

FFmpeg converting from video to audio missing duration

I'm attempting to load YouTube videos via their direct video URL (retrieved using ytdl-core). I load them using the request library. I then pipe the result to a stream, which is used as the input to ffmpeg (via fluent-ffmpeg). The code looks something like this:
var getAudioStream = function(req, res) {
  var requestUrl = 'http://youtube.com/watch?v=' + req.params.videoId;
  var audioStream = new PassThrough();
  var videoUrl;
  ytdl.getInfo(requestUrl, { downloadURL: true }, function(err, info) {
    res.setHeader('Content-Type', 'audio/x-wav');
    res.setHeader('Accept-Ranges', 'bytes');
    videoUrl = info.formats ? info.formats[0].url : '';
    request(videoUrl).pipe(audioStream);
    ffmpeg()
      .input(audioStream)
      .outputOptions('-map_metadata 0')
      .format('wav')
      .pipe(res);
  });
};
This actually works just fine, and the frontend successfully receives just the audio in WAV format, which is playable. However, the audio is missing any information about its size or duration (and all other metadata). This also makes it unseekable.
I'm assuming this is lost somewhere during the ffmpeg stage, because if I load the video directly via the URL passed to request it loads and plays fine, and has a set duration/is seekable. Any ideas?
It isn't possible to know the output size or duration until the encoding is finished. FFmpeg cannot know this information ahead of time in most cases. Even if it could, the way you are executing FFmpeg prevents you from accessing that extra information.
Besides, to support seeking you need to support range requests. This isn't possible either, short of encoding the file up to the byte requested and streaming from there on.
Basically, this isn't possible by the nature of what you're doing.
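One workaround consistent with that reasoning (not part of the answer, just a sketch reusing the question's fluent-ffmpeg/request setup; the getAudioFile wrapper and temp-file path are hypothetical) is to finish the encode into a temporary file first, so the WAV header, size and duration are known, and then serve that file with normal range support:

const os = require('os');
const path = require('path');

var getAudioFile = function(req, res, videoUrl) {
  var tmpFile = path.join(os.tmpdir(), req.params.videoId + '.wav');
  ffmpeg()
    .input(request(videoUrl))  // videoUrl resolved via ytdl.getInfo as in the question
    .format('wav')
    .on('error', function(err) { res.status(500).send(err.message); })
    .on('end', function() {
      // the finished file has a complete RIFF header, so it reports a duration,
      // and res.sendFile (Express) answers Range requests, which makes it seekable
      res.sendFile(tmpFile);
    })
    .save(tmpFile);
};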
