How do you start and stop an audio stream, so that you can optionally start it again, in JavaScript?
To start the stream, I'm using:
running = false;
function handleAudioStream(stream){
  let audioCtx = new AudioContext();
  let source = audioCtx.createMediaStreamSource(stream);
  let processor = audioCtx.createScriptProcessor(1024, 1, 1);
  source.connect(processor);
  processor.connect(audioCtx.destination);
  processor.onaudioprocess = function(event) {
    console.log('processing audio');
    if (!running) {
      stream.getTracks().forEach(function(track) {
        if (track.readyState == 'live' && track.kind === 'audio') {
          track.stop();
        }
      });
      return;
    }
    var audioData = event.inputBuffer.getChannelData(0);
    do_stuff(audioData);
  };
  processor.connect(audioCtx.destination);
}
function start_audio(){
  running = true;
  navigator.mediaDevices.getUserMedia({
    audio: true,
    video: false
  }).then(handleAudioStream);
}
function stop_audio(){
  running = false;
}
As recommended in other questions, to stop the stream, I'm using a global flag to trigger the calling of the stop method on each track from within the stream callback.
However, this doesn't seem to work very well. This does stop audio data from being available, but the processor.onaudioprocess callback continues to get called, consuming a massive amount of CPU.
Also, if I run start_audio() again, it doesn't re-start the audio. The browser just seems to ignore it and the audio context never re-initializes correctly.
What am I doing wrong? How do I cleanly stop an audio stream so that I can later re-start it?
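One possible cleanup, sketched under the assumption that handleAudioStream keeps references to the objects it creates (the `graph` object and the `stopAudioGraph` name are hypothetical and do not appear in the code above). Calling track.stop() only ends the microphone capture; the ScriptProcessorNode stays connected and keeps firing. Clearing the handler, disconnecting the nodes, and closing the AudioContext releases it, so a later start can build a fresh graph.

```javascript
// Hypothetical helper. `graph` is assumed to hold the objects created in
// handleAudioStream: { stream, audioCtx, source, processor }.
function stopAudioGraph(graph) {
  // Remove the callback first so no further audio events are handled.
  graph.processor.onaudioprocess = null;
  // Detach the nodes from each other and from the destination.
  graph.source.disconnect();
  graph.processor.disconnect();
  // Release the microphone.
  graph.stream.getTracks().forEach(function (track) {
    track.stop();
  });
  // Free the audio hardware; in browsers close() returns a Promise.
  return graph.audioCtx.close();
}
```

With this approach, each call to start_audio() would create a new AudioContext and a new processor rather than reusing closed ones.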
Related
I have an object holding several base64-encoded audio clips. Each clip starts playing on a keydown event. In some cases (when the base64 data is fairly large), there is a delay before playback starts. Is there any way to remove this delay, or at least reduce it?
The app is written in JavaScript and runs on Electron.
// object of base64-encoded audio clips
var audio = {A: new Audio('base64[1]'), B: new Audio('base64[2]'), C: new Audio('base64[3]')};
// audio starts playing on keydown
function keydown(ev) {
  if (audio[String.fromCharCode(ev.keyCode)].classList.contains('holding') == false) {
    audio[String.fromCharCode(ev.keyCode)].classList.add('holding');
    if (audio[String.fromCharCode(ev.keyCode)].paused) {
      playPromise = audio[String.fromCharCode(ev.keyCode)].play();
      if (playPromise) {
        playPromise.then(function() {
          setTimeout(function() {
            // Follow up operation
          }, audio.duration * 1000);
        }).catch(function() {
          // Audio loading failure
        });
      }
    } else {
      audio[String.fromCharCode(ev.keyCode)].currentTime = 0;
    }
  }
}
I wrote up a complete example for you, and annotated below.
Some key takeaways:
If you need any sort of expediency or control over timing, you need to use the Web Audio API. Without it, you have no control over the buffering or other behavior of audio playback.
Don't use base64 for this. You don't need it. Base64 encoding is a method for encoding binary data into a text format. There is no text format here... therefore it isn't necessary. When you use base64 encoding, you add 33% overhead to the storage, you use CPU, memory, etc. There is no reason for it here.
Do use the appropriate file APIs to get what you need. To decode an audio sample, we need an array buffer. Therefore, we can use the .arrayBuffer() method on the file itself to get that. This retains the content in binary the entire time and allows the browser to memory-map if it wants to.
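To make the 33% figure concrete: base64 maps every 3 bytes of binary data to 4 text characters (with padding on the last group). A minimal sketch of that arithmetic, using hypothetical helper names:

```javascript
// Number of base64 characters needed for `byteLength` bytes (padding included).
function base64Length(byteLength) {
  return 4 * Math.ceil(byteLength / 3);
}

// Relative size increase of the base64 text over the raw bytes.
function base64Overhead(byteLength) {
  return base64Length(byteLength) / byteLength - 1;
}
```

For a 3,000,000-byte sample, base64Length gives 4,000,000 characters, which is the one-third increase mentioned above.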
The code:
const audioContext = new AudioContext();
let buffer;
document.addEventListener('DOMContentLoaded', (e) => {
  document.querySelector('input[type="file"]').addEventListener('change', async (e) => {
    // Start the AudioContext, now that we have user interaction
    audioContext.resume();
    // Ensure we actually have at least one file before continuing
    if ( !(e.currentTarget.files && e.currentTarget.files[0]) ) {
      return;
    }
    // Read the file and decode the audio
    buffer = await audioContext.decodeAudioData(
      await e.currentTarget.files[0].arrayBuffer()
    );
  });
});
document.addEventListener('keydown', (e) => {
  // Ensure we've loaded audio
  if (!buffer) {
    return;
  }
  // Create the node that will play our previously decoded buffer
  const bufferSourceNode = audioContext.createBufferSource();
  bufferSourceNode.buffer = buffer;
  // Hook up the buffer source to our output node (speakers, headphones, etc.)
  bufferSourceNode.connect(audioContext.destination);
  // Adjust pitch based on the key we pressed, just for fun
  bufferSourceNode.detune.value = (e.keyCode - 65) * 100;
  // Start playing... right now
  bufferSourceNode.start();
});
JSFiddle: https://jsfiddle.net/bradisbell/sc9jpxvn/1/
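The pitch adjustment in the keydown handler is plain arithmetic: each letter after "A" adds 100 cents, which is one semitone. Below is a small sketch of that mapping, written against event.key because keyCode is deprecated. The function names are hypothetical, and treating non-letter keys as "no detune" is an assumption made for this example.

```javascript
// Detune in cents for a pressed key, relative to the "A" key.
// Keys that are not a single character (e.g. "Enter") get no detune.
function detuneForKey(key) {
  if (key.length !== 1) {
    return 0;
  }
  const offset = key.toUpperCase().charCodeAt(0) - 'A'.charCodeAt(0);
  return offset * 100; // 100 cents = 1 semitone
}

// Playback-rate multiplier that corresponds to a detune value in cents.
function centsToRate(cents) {
  return Math.pow(2, cents / 1200); // 1200 cents = 1 octave = double speed
}
```

In the handler, this could be used as bufferSourceNode.detune.value = detuneForKey(e.key).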
I want to record voice, automatically split the recording (the audio blob) into 1-second chunks, export each chunk to a WAV file, and send it to the back end. This should happen asynchronously while the user speaks.
I currently use the following recorder.js library for these tasks:
https://cdn.rawgit.com/mattdiamond/Recorderjs/08e7abd9/dist/recorder.js
My problem is that the blob/WAV file grows over time. I think the recorded data accumulates, so each chunk is larger than the last. Instead of sending sequential 1-second chunks, I end up sending accumulated audio.
I can't figure out where in my code this is caused. Maybe it happens inside the recorder.js library. If you have used recorder.js or another JavaScript approach for a similar task, I would appreciate it if you could go through this code and tell me where it breaks.
This is my JS code:
var gumStream; // Stream from getUserMedia()
var rec; // Recorder.js object
var input; // MediaStreamAudioSourceNode we'll be recording
var recordingNotStopped; // True while the user has pressed record and not yet pressed stop
const trackLengthInMS = 1000; // Length of audio chunk in milliseconds
const maxNumOfSecs = 1000; // Maximum number of chunks (loop iterations) per recording
// Shim for AudioContext when it's not available.
var AudioContext = window.AudioContext || window.webkitAudioContext;
var audioContext; // Audio context to help us record
var recordButton = document.getElementById("recordButton");
var stopButton = document.getElementById("stopButton");
// Event handlers for the above 2 buttons
recordButton.addEventListener("click", startRecording);
stopButton.addEventListener("click", stopRecording);
// Asynchronous function to stop the recording each second and export the blob to a wav file
const sleep = time => new Promise(resolve => setTimeout(resolve, time));
const asyncFn = async() => {
  for (let i = 0; i < maxNumOfSecs; i++) {
    if (recordingNotStopped) {
      rec.record();
      await sleep(trackLengthInMS);
      rec.stop();
      // Stop microphone access
      gumStream.getAudioTracks()[0].stop();
      // Create the wav blob and pass it on to createWaveBlob
      rec.exportWAV(createWaveBlob);
    }
  }
}
function startRecording() {
  console.log("recordButton clicked");
  recordingNotStopped = true;
  var constraints = {
    audio: true,
    video: false
  }
  recordButton.disabled = true;
  stopButton.disabled = false;
  // Using the standard promise-based getUserMedia()
  navigator.mediaDevices.getUserMedia(constraints).then(function(stream) {
    // Create an audio context after getUserMedia is called
    audioContext = new AudioContext();
    // Assign to gumStream for later use
    gumStream = stream;
    // Use the stream
    input = audioContext.createMediaStreamSource(stream);
    // Create the Recorder object and configure to record mono sound (1 channel)
    rec = new Recorder(input, {
      numChannels: 1
    });
    // Call the asynchronous function to split and export audio
    asyncFn();
    console.log("Recording started");
  }).catch(function(err) {
    // Enable the record button if getUserMedia() fails
    recordButton.disabled = false;
    stopButton.disabled = true;
  });
}
function stopRecording() {
  console.log("stopButton clicked");
  recordingNotStopped = false;
  // Disable the stop button and enable the record button to allow for new recordings
  stopButton.disabled = true;
  recordButton.disabled = false;
  // Set the recorder to stop the recording
  rec.stop();
  // Stop microphone access
  gumStream.getAudioTracks()[0].stop();
}
function createWaveBlob(blob) {
  var url = URL.createObjectURL(blob);
  // Convert the blob to a wav file and call the sendBlob function to send the wav file to the server
  var convertedfile = new File([blob], 'filename.wav');
  sendBlob(convertedfile);
}
Recorder.js keeps a record buffer of the audio that it records. When exportWAV is called, the record buffer is encoded but not cleared. You'd need to call clear on the recorder before calling record again so that the previous chunk of audio is cleared from the record buffer.
This is how it was fixed in the above code.
// Extend the Recorder prototype with a step() method that calls clear()
Recorder.prototype.step = function () {
  this.clear();
};
// After calling exportWAV(), call step() to clear the record buffer
rec.exportWAV(createWaveBlob);
rec.step();
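Put together, the chunking loop with the buffer cleared after every export might look like the sketch below. It relies only on the Recorder.js methods already used in this thread (record, stop, exportWAV, clear). The function name `recordChunks` and the `isRecording` and `onChunk` callbacks are hypothetical stand-ins for the question's flag and createWaveBlob.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// `rec` is a Recorder.js instance; `isRecording` returns false once the user
// has pressed stop; `onChunk` receives each exported WAV blob.
async function recordChunks(rec, isRecording, chunkMs, onChunk) {
  while (isRecording()) {
    rec.record();
    await sleep(chunkMs);
    rec.stop();
    rec.exportWAV(onChunk); // encode what was recorded during this pass
    rec.clear();            // empty the record buffer before the next pass
  }
}
```

A call such as recordChunks(rec, () => recordingNotStopped, trackLengthInMS, createWaveBlob) would then replace asyncFn. Note that the microphone track must stay live between chunks for later passes to contain audio.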
I am building a proof of concept and need to implement a feature like "OK Google" or "Hey Siri" in the browser.
I am using Chrome's Web Speech API. I noticed that recognition cannot run continuously: it terminates automatically after a certain period of time, which I understand is for security reasons. As a workaround, I restart the SpeechRecognition from its end event whenever it terminates. That is not a good solution, though. If two instances of the same application are open in different browser tabs, it doesn't work, and if another application in the browser also uses speech recognition, neither application behaves as expected. I am looking for a better approach to this problem.
Thanks in advance.
Since your problem is that you can't run the SpeechRecognition continuously for long periods of time, one way would be to start the SpeechRecognition only when you get some input in the mic.
This way only when there is some input, you will start the SR, looking for your magic_word.
If the magic_word is found, then you will be able to use the SR normally for your other tasks.
Mic input can be detected with the Web Audio API, which is not subject to the time restriction that SR suffers from. You can feed it a MediaStream from MediaDevices.getUserMedia.
For more information on the script below, see this answer.
Here is how you could attach it to a SpeechRecognition:
const magic_word = ##YOUR_MAGIC_WORD##;
// initialize our SpeechRecognition object
let recognition = new webkitSpeechRecognition();
recognition.lang = 'en-US';
recognition.interimResults = false;
recognition.maxAlternatives = 1;
recognition.continuous = true;
// detect the magic word
recognition.onresult = e => {
  // extract all the transcripts
  var transcripts = [].concat.apply([], [...e.results]
    .map(res => [...res]
      .map(alt => alt.transcript)
    )
  );
  if(transcripts.some(t => t.indexOf(magic_word) > -1)){
    // do something awesome, like starting your own command listeners
  }
  else{
    // didn't understand...
  }
}
// called when we detect silence
function stopSpeech(){
  recognition.stop();
}
// called when we detect sound
function startSpeech(){
  try{ // calling it twice will throw...
    recognition.start();
  }
  catch(e){}
}
// request a MediaStream
navigator.mediaDevices.getUserMedia({audio:true})
  // add our listeners
  .then(stream => detectSilence(stream, stopSpeech, startSpeech))
  .catch(e => console.error(e.message));
function detectSilence(
  stream,
  onSoundEnd = _=>{},
  onSoundStart = _=>{},
  silence_delay = 500,
  min_decibels = -80
) {
  const ctx = new AudioContext();
  const analyser = ctx.createAnalyser();
  const streamNode = ctx.createMediaStreamSource(stream);
  streamNode.connect(analyser);
  analyser.minDecibels = min_decibels;
  const data = new Uint8Array(analyser.frequencyBinCount); // will hold our data
  let silence_start = performance.now();
  let triggered = false; // trigger only once per silence event
  // the first call is made by hand, so default `time` to now
  function loop(time = performance.now()) {
    requestAnimationFrame(loop); // we'll loop every 60th of a second to check
    analyser.getByteFrequencyData(data); // get current data
    if (data.some(v => v)) { // if there is data above the given db limit
      if(triggered){
        triggered = false;
        onSoundStart();
      }
      silence_start = time; // set it to now
    }
    if (!triggered && time - silence_start > silence_delay) {
      onSoundEnd();
      triggered = true;
    }
  }
  loop();
}
As a plunker, since neither StackSnippets nor jsfiddle's iframes will allow gUM in two versions...
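The decision logic inside detectSilence can also be looked at in isolation. The following sketch pulls it out into a small state machine with no Web Audio dependency, which makes the timing behaviour easier to reason about and to test. The name `createSilenceTracker` and its argument order are hypothetical.

```javascript
// Feed update() once per frame with: is there sound right now, and the
// current time in milliseconds.
function createSilenceTracker(silenceDelay, onSoundStart, onSoundEnd) {
  let silenceStart = null; // start of the current quiet period
  let triggered = false;   // true once onSoundEnd has fired for this silence

  return function update(hasSound, time) {
    if (silenceStart === null) {
      silenceStart = time; // first frame: start measuring from here
    }
    if (hasSound) {
      if (triggered) {
        triggered = false;
        onSoundStart(); // sound came back after a reported silence
      }
      silenceStart = time;
    }
    if (!triggered && time - silenceStart > silenceDelay) {
      triggered = true;
      onSoundEnd(); // quiet for longer than silenceDelay
    }
  };
}
```

As in the original loop, onSoundStart only fires after a silence has been reported, so the first sound after page load does not trigger it.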
I'm trying to create an audio stream from the browser and send it to a server.
Here is the code:
let recording = false;
let localStream = null;
const session = {
  audio: true,
  video: false
};
function start () {
  recording = true;
  navigator.webkitGetUserMedia(session, initializeRecorder, onError);
}
function stop () {
  recording = false;
  localStream.getAudioTracks()[0].stop();
}
function initializeRecorder (stream) {
  localStream = stream;
  const audioContext = window.AudioContext;
  const context = new audioContext();
  const audioInput = context.createMediaStreamSource(localStream);
  const bufferSize = 2048;
  // create a javascript node
  const recorder = context.createScriptProcessor(bufferSize, 1, 1);
  // specify the processing function
  recorder.onaudioprocess = recorderProcess;
  // connect stream to our recorder
  audioInput.connect(recorder);
  // connect our recorder to the previous destination
  recorder.connect(context.destination);
}
function onError (e) {
  console.log('error:', e);
}
function recorderProcess (e) {
  if (!recording) return;
  const left = e.inputBuffer.getChannelData(0);
  // send left to server here (socket.io can do the job). We dont need stereo.
}
When start is fired, the samples can be caught in recorderProcess.
When stop is fired, the mic icon in the browser disappears, but...
unless I put if (!recording) return at the beginning of recorderProcess, it still processes samples.
Unfortunately, that's not a solution at all: the samples are still being received by recorderProcess, and if I fire start once more, it gets all samples from the previous stream and from the new one.
My question is:
How can I stop/start recording without this issue?
Or, if that's not the best solution:
How can I completely remove the stream in the stop function, so that I can safely initialize it again at any time?
recorder.disconnect() should help.
You might want to consider the new MediaRecorder functionality in Chrome Canary shown at https://webrtc.github.io/samples/src/content/getusermedia/record/ (currently video-only I think) instead of the WebAudio API.
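To expand on the disconnect() suggestion, here is a sketch of a start/stop controller that never keeps more than one graph alive. Everything named here is hypothetical: `createRecorderController`, plus the injected `openStream` (expected to resolve to a MediaStream, for example via navigator.mediaDevices.getUserMedia({ audio: true })) and `createContext` (expected to return a new AudioContext). Injecting them is a choice made for this example so the logic can be exercised without a browser.

```javascript
function createRecorderController(openStream, createContext, onSamples) {
  let session = null; // Promise for the running session's resources, or null

  function start() {
    if (!session) { // a second start() reuses the session instead of adding a graph
      const pending = openStream().then((stream) => {
        const context = createContext();
        const input = context.createMediaStreamSource(stream);
        const recorder = context.createScriptProcessor(2048, 1, 1);
        recorder.onaudioprocess = (e) => onSamples(e.inputBuffer.getChannelData(0));
        input.connect(recorder);
        recorder.connect(context.destination);
        return { stream, context, input, recorder };
      });
      // If the stream could not be opened, allow a later retry.
      pending.catch(() => { if (session === pending) session = null; });
      session = pending;
    }
    return session;
  }

  async function stop() {
    if (!session) return;
    const pending = session;
    session = null;
    const { stream, context, input, recorder } = await pending;
    recorder.onaudioprocess = null; // no more callbacks from the old graph
    input.disconnect();
    recorder.disconnect();
    stream.getTracks().forEach((track) => track.stop());
    await context.close();
  }

  return { start, stop };
}
```

Because stop() disconnects and closes everything that start() created, a later start() builds a fresh graph and samples from the previous stream no longer reach the callback.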
I have a variable audioBufferSourceNode, which holds the audio file that has been loaded.
On line 136 (line 13 of the snippet below), the start() and stop() methods are already used on the audio node for other things, but I can't get them to ALSO do what I want. How do I call these methods to play and pause the audio? I don't know the correct way to call start() and stop() so that buttons or divs can play/pause the audio. How would I also add a volume slider and a mute button?
Side note: I was told that declaring the variable 'audioBufferSourceNode' globally would be better practice, but I'm not sure how to do that, what was meant exactly, or whether it even has anything to do with my problem.
So, on line 13 the start() and stop() methods are being used on the audio node.
_visualize: function(audioContext, buffer) {
  var audioBufferSourceNode = audioContext.createBufferSource(),
    analyser = audioContext.createAnalyser(),
    that = this;
  //connect the source to the analyser
  audioBufferSourceNode.connect(analyser);
  //connect the analyser to the destination(the speaker), or we won't hear the sound
  analyser.connect(audioContext.destination);
  //then assign the buffer to the buffer source node
  audioBufferSourceNode.buffer = buffer;
  //play the source
  if (!audioBufferSourceNode.start) {
    audioBufferSourceNode.start = audioBufferSourceNode.noteOn //in old browsers use noteOn method
    audioBufferSourceNode.stop = audioBufferSourceNode.noteOff //in old browsers use noteOff method
  };
  //stop the previous sound if any
  if (this.animationId !== null) {
    cancelAnimationFrame(this.animationId);
  }
  if (this.source !== null) {
    this.source.stop(0);
  }
  audioBufferSourceNode.start(0);
  this.status = 1;
  this.source = audioBufferSourceNode;
  audioBufferSourceNode.onended = function() {
    that._audioEnd(that);
  };
  this._updateInfo('Playing ' + this.fileName, false);
  this.info = 'Playing ' + this.fileName;
  document.getElementById('fileWrapper').style.opacity = 0.2;
  this._drawSpectrum(analyser);
},
full code:
https://jsfiddle.net/4hty6kak/1/
Web Audio API doesn't work on jsfiddle, so here's a live working demo:
http://wayou.github.io/HTML5_Audio_Visualizer/
Do you have any simple solid solutions?
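One possible direction, offered as a sketch rather than a drop-in fix for the visualizer: an AudioBufferSourceNode can only be started once, so "pause" has to remember the playback position and "play" has to create a new source node that starts from that offset. Volume and mute can be handled by a GainNode placed between the source and the output. The `createPlayer` name and the shape of the returned object are hypothetical.

```javascript
function createPlayer(audioContext, buffer) {
  const gainNode = audioContext.createGain(); // volume control
  gainNode.connect(audioContext.destination);
  let source = null; // node that is currently playing, if any
  let startedAt = 0; // context time at which playback last (re)started
  let offset = 0;    // position in the buffer, in seconds
  let volume = 1;    // last volume set while not muted

  return {
    play() {
      if (source) return; // already playing
      const node = audioContext.createBufferSource();
      node.buffer = buffer;
      node.connect(gainNode);
      // Rewind when the buffer finishes on its own (not when paused).
      node.onended = () => { if (source === node) { source = null; offset = 0; } };
      startedAt = audioContext.currentTime;
      node.start(0, offset);
      source = node;
    },
    pause() {
      if (!source) return;
      offset += audioContext.currentTime - startedAt;
      const node = source;
      source = null;
      node.stop();
      node.disconnect();
    },
    setVolume(value) { volume = value; gainNode.gain.value = value; },
    mute(on) { gainNode.gain.value = on ? 0 : volume; }
  };
}
```

Buttons would call play() and pause(), and a range input's value could be passed to setVolume(). To keep the spectrum display, gainNode could be connected to the existing analyser instead of directly to the destination.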