How can I convert the sample rate of a buffer from 44100 to 48000 Hz in a browser?
I found a library (https://github.com/taisel/XAudioJS/blob/master/resampler.js) that should allow me to do that, but I have no idea how to use it.
Use an offline audio context. Something like the following may work:
var srcLen = yourSourceBuffer.length; // frames at 44100 Hz
var dstLen = Math.ceil(srcLen * 48000 / 44100); // frames needed at 48000 Hz
var c = new OfflineAudioContext(1, dstLen, 48000);
var b = c.createBuffer(1, srcLen, 44100);
b.copyToChannel(yourSourceBuffer, 0);
var s = c.createBufferSource();
s.buffer = b;
s.connect(c.destination); // connect to the offline context's destination
s.start();
c.startRendering().then(function (result) {
    // result contains the new buffer resampled to 48000 Hz
});
Depending on the implementation, the quality of the resampled signal can vary quite a bit.
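If you'd rather not depend on the browser's resampler (or want deterministic quality across browsers), a linear-interpolation resampler can be sketched in plain JavaScript. This is my own illustrative function, not the XAudioJS API; for music-quality work a windowed-sinc filter such as the one XAudioJS implements would be preferable:

```javascript
// Resamples a mono Float32Array from srcRate to dstRate using linear
// interpolation. Acceptable for speech; music wants a proper filter.
function resampleLinear(input, srcRate, dstRate) {
  if (srcRate === dstRate) return input;
  var ratio = srcRate / dstRate;
  var outLength = Math.round(input.length / ratio);
  var output = new Float32Array(outLength);
  for (var i = 0; i < outLength; i++) {
    var pos = i * ratio;         // fractional position in the input
    var idx = Math.floor(pos);
    var frac = pos - idx;
    var a = input[idx];
    var b = idx + 1 < input.length ? input[idx + 1] : a;
    output[i] = a + (b - a) * frac; // interpolate between neighbours
  }
  return output;
}
```

Calling `resampleLinear(samples, 44100, 48000)` gives the upsampled signal without any audio graph involved, which can be handy in a worker.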
There seemed to be a bug in Mobile Safari where loaded audio was not decoded correctly when the sample rate of the audio context differed from the sample rate of the audio file. Moreover, the sample rate of the audio context changed seemingly at random between 44100 and 48000, usually (but not always) depending on whether the website was loaded with the iPhone's ringer switch on or off.
The workaround for this issue is to read the sample rate of the audio context and then load a different audio file for each sample rate, like this:
window.AudioContext = window.AudioContext || window.webkitAudioContext;
var audio_context = new AudioContext();
var actual_sample_rate = audio_context.sampleRate;
if (actual_sample_rate != 48000) {
    actual_sample_rate = 44100;
}

function finished_loading_sounds(sounds) {
    buffers['piano_do'] = {
        buffer: sounds.piano,
        rate: 1
    };
    // ...do something when the sounds are loaded...
}

var buffer_loader = new BufferLoader(
    audio_context,
    {
        piano: "audio/" + actual_sample_rate + "/piano.m4a",
    },
    finished_loading_sounds
);
buffer_loader.load();
The buffer loader is defined as in this tutorial.
To change the sample rate for the audio file, one can use Audacity.
UPDATE
It seems that even when you try to load a file with the correct sample rate, occasionally the sound still gets distorted on iOS devices.
To fix the issue, I found a hack for creating the AudioContext:
function createAudioContext(desiredSampleRate) {
    var AudioCtor = window.AudioContext || window.webkitAudioContext;

    desiredSampleRate = typeof desiredSampleRate === 'number'
        ? desiredSampleRate
        : 44100;
    var context = new AudioCtor();

    // Check if the hack is necessary. Only occurs on iOS 6+ devices,
    // and only when you first boot the iPhone or play an audio/video
    // file with a different sample rate
    if (/(iPhone|iPad)/i.test(navigator.userAgent) && context.sampleRate !== desiredSampleRate) {
        var buffer = context.createBuffer(1, 1, desiredSampleRate);
        var dummy = context.createBufferSource();
        dummy.buffer = buffer;
        dummy.connect(context.destination);
        dummy.start(0);
        dummy.disconnect();

        context.close(); // dispose of the old context
        context = new AudioCtor();
    }
    return context;
}
Then to use it, create the audio context as follows:
var audio_context = createAudioContext(44100);
Related
My product has a tool that allows you to share a video via WebRTC. When we first deployed it, we tried using code like the following:
this.videoEl = document.createElement("video");
this.videoEl.src = url;
// Arrow function so that `this` stays bound to our object,
// not to the video element firing the event
this.videoEl.oncanplay = () => {
    this.videoEl.oncanplay = undefined;
    this.mediaStream = this.videoEl.captureStream();
};
The issue is that when sending this mediaStream, the result is a solid green video, but with working audio:
The solution we came up with is to create a canvas and draw the video contents to it, something like this:
this.canvas = document.createElement("canvas");
this.videoEl = document.createElement("video");
this.ctx = this.canvas.getContext("2d");
this.videoEl.src = url;
// Arrow function again so `this` refers to our object
this.videoEl.oncanplay = () => {
    this.videoEl.oncanplay = undefined;
    // Some code (stripping a lot of unnecessary stuff)
    // Canvas drawing loop
    this.canvas.width = this.videoEl.videoWidth;
    this.canvas.height = this.videoEl.videoHeight;
    this.ctx.drawImage(this.videoEl, 0, 0, this.videoEl.videoWidth, this.videoEl.videoHeight);
    // Loop ends and more code
    // Media stream element
    this.mediaStream = this.canvas.captureStream(25);
    // Attach audio track to the media stream
    try {
        var audioContext = new AudioContext();
        this.gainNode = audioContext.createGain();
        var audioSource = audioContext.createMediaStreamSource(this.videoEl.captureStream(25));
        var audioDestination = audioContext.createMediaStreamDestination();
        audioSource.connect(this.gainNode);
        this.gainNode.connect(audioDestination);
        this.gainNode.gain.value = 1;
        this.mediaStream.addTrack(audioDestination.stream.getAudioTracks()[0]);
    } catch (e) {
        // No audio tracks found
        this.noAudio = true;
    }
};
The solution works, but it consumes a lot of CPU, and it would be great to avoid having to write all of that code. We also have customers complaining that the audio sometimes gets out of sync (which is understandable, since I'm using a captureStream for the audio but not for the video).
At first I thought it was green because the MediaStream was tainted, but that's not the case, since I can draw the video to a canvas and capture a MediaStream from it without problems. PS: we are using a URL.createObjectURL(file) call to get the video URL.
Do you know why the video is green?
Thanks.
It turns out it's a Google Chrome Bug.
Thanks to Philipp Hancke.
I've been implementing the Google Cloud Speech-to-Text API with realtime streamed audio for the past few weeks. While initially everything looked really good, I've been testing the product on more devices lately and have found some really weird irregularities on certain iDevices.
First of all, here are the relevant code pieces:
Frontend (React Component)
constructor(props) {
    super(props);
    this.audio = props.audio;
    this.socket = new SocketClient();
    this.bufferSize = 2048;
}

/**
 * Initializes the user's microphone and the audio stream.
 *
 * @return {void}
 */
startAudioStream = async () => {
    const AudioContext = window.AudioContext || window.webkitAudioContext;
    this.audioCtx = new AudioContext();
    this.processor = this.audioCtx.createScriptProcessor(this.bufferSize, 1, 1);
    this.processor.connect(this.audioCtx.destination);
    const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
    /* Debug through instant playback:
    this.audio.srcObject = stream;
    this.audio.play();
    return; */
    this.globalStream = stream;
    this.audioCtx.resume();
    this.input = this.audioCtx.createMediaStreamSource(stream);
    this.input.connect(this.processor);
    this.processor.onaudioprocess = (e) => {
        this.microphoneProcess(e);
    };
    this.setState({ streaming: true });
}

/**
 * Processes microphone input and passes it to the server via the open socket connection.
 *
 * @param {AudioProcessingEvent} e
 * @return {void}
 */
microphoneProcess = (e) => {
    const { speaking, askingForConfirmation, askingForErrorConfirmation } = this.state;
    const left = e.inputBuffer.getChannelData(0);
    const left16 = Helpers.downsampleBuffer(left, 44100, 16000);
    if (speaking === false) {
        this.socket.emit('stream', {
            audio: left16,
            context: askingForConfirmation || askingForErrorConfirmation ? 'zip_code_yes_no' : 'zip_code',
            speechContext: askingForConfirmation || askingForErrorConfirmation ? ['ja', 'nein', 'ne', 'nö', 'falsch', 'neu', 'korrektur', 'korrigieren', 'stopp', 'halt', 'neu'] : ['$OPERAND'],
        });
    }
}
Helpers (DownsampleBuffer)
/**
 * Downsamples a given audio buffer from sampleRate to outSampleRate.
 * @param {Float32Array} buffer The audio buffer to downsample.
 * @param {number} sampleRate The original sample rate.
 * @param {number} outSampleRate The new sample rate.
 * @return {ArrayBuffer} The downsampled audio as 16-bit PCM.
 */
static downsampleBuffer(buffer, sampleRate, outSampleRate) {
    if (outSampleRate === sampleRate) {
        return buffer;
    }
    if (outSampleRate > sampleRate) {
        throw new Error('Downsampling rate should be smaller than original sample rate');
    }
    const sampleRateRatio = sampleRate / outSampleRate;
    const newLength = Math.round(buffer.length / sampleRateRatio);
    const result = new Int16Array(newLength);
    let offsetResult = 0;
    let offsetBuffer = 0;
    while (offsetResult < result.length) {
        const nextOffsetBuffer = Math.round((offsetResult + 1) * sampleRateRatio);
        let accum = 0;
        let count = 0;
        for (let i = offsetBuffer; i < nextOffsetBuffer && i < buffer.length; i++) {
            accum += buffer[i];
            count++;
        }
        result[offsetResult] = Math.min(1, accum / count) * 0x7FFF;
        offsetResult++;
        offsetBuffer = nextOffsetBuffer;
    }
    return result.buffer;
}
Backend (Socket Server)
io.on('connection', (socket) => {
    logger.debug('New client connected');
    const speechClient = new SpeechService(socket);
    socket.on('stream', (data) => {
        const audioData = data.audio;
        const context = data.context;
        const speechContext = data.speechContext;
        speechClient.transcribe(audioData, context, speechContext);
    });
});
Backend (Speech Client / Transcribe Function where data is sent to GCloud)
async transcribe(data, context, speechContext, isFile = false) {
    if (!this.recognizeStream) {
        logger.debug('Initiating new Google Cloud Speech client...');
        let waitingForMoreData = false;
        // Create new stream to the Google Speech client
        this.recognizeStream = this.speechClient
            .streamingRecognize({
                config: {
                    encoding: 'LINEAR16',
                    sampleRateHertz: 16000,
                    languageCode: 'de-DE',
                    speechContexts: speechContext ? [{ phrases: speechContext }] : undefined,
                },
                interimResults: false,
                singleUtterance: true,
            })
            .on('error', (error) => {
                if (error.code === 11) {
                    this.recognizeStream.destroy();
                    this.recognizeStream = null;
                    return;
                }
                this.socket.emit('error');
                this.recognizeStream.destroy();
                this.recognizeStream = null;
                logger.error(`Received error from Google Cloud Speech client: ${error.message}`);
            })
            .on('data', async (gdata) => {
                if ((!gdata.results || !gdata.results[0]) && gdata.speechEventType === 'END_OF_SINGLE_UTTERANCE') {
                    logger.debug('Received END_OF_SINGLE_UTTERANCE - waiting 300ms for more data before restarting stream');
                    waitingForMoreData = true;
                    setTimeout(() => {
                        if (waitingForMoreData === true) {
                            // User was silent for too long - restart stream
                            this.recognizeStream.destroy();
                            this.recognizeStream = null;
                        }
                    }, 300);
                    return;
                }
                waitingForMoreData = false;
                const transcription = gdata.results[0].alternatives[0].transcript;
                logger.debug(`Transcription: ${transcription}`);
                // Emit transcription and MP3 file of answer
                this.socket.emit('transcription', transcription);
                const filename = await ttsClient.getAnswerFromTranscription(transcription, 'fairy', context); // TODO-Final: Dynamic character
                if (filename !== null) this.socket.emit('speech', `${config.publicScheme}://${config.publicHost}:${config.publicPort}/$(unknown)`);
                // Restart stream
                if (this.recognizeStream) this.recognizeStream.destroy();
                this.recognizeStream = null;
            });
    }
    // eslint-disable-next-line security/detect-non-literal-fs-filename
    if (isFile === true) fs.createReadStream(data).pipe(this.recognizeStream);
    else this.recognizeStream.write(data);
}
Now, the behavior varies heavily across my test devices. I originally developed on a 2017 iMac using Google Chrome as the browser. Works like a charm. Then I tested on an iPhone 11 Pro and an iPad Air 4, both in Safari and as a full-screen web app. Again, works like a charm.
Afterwards I tried an iPad Pro 12.9" (2017). Suddenly, Google Cloud sometimes doesn't return a transcription at all; at other times it returns text that, with a great deal of imagination, sounds like the actually spoken words. Same behavior on an iPad 5 and an iPhone 6 Plus.
I don't really know where to go from here. What I've read so far is that with the iPhone 6s (no idea about iPads, unfortunately) the hardware sample rate was changed from 44.1 kHz to 48 kHz. So I thought this might be it and played around with the sample rates everywhere in the code, with no success. I also noticed that my iMac with Google Chrome runs at 44.1 kHz, just like the "old" iPads where transcription doesn't work, while the new iPads run at 48 kHz, and there everything works fine. So this can't be it.
What I've noticed as well: when I connect AirPods to the "broken" devices and use them as audio input, everything works again. So this must have something to do with the processing of those devices' internal microphones. I just don't know what exactly.
Could anyone point me in the right direction? What has changed between these device generations with regard to audio and the microphone?
Update 1: I've now implemented a quick function that writes the streamed PCM data from the frontend to a file in the backend using node-wav. I think I'm getting closer: on the devices where the speech recognition goes nuts, I sound like a chipmunk (extremely high-pitched). I've also noticed that the binary audio data flows in much more slowly than on the devices where everything works fine. So this probably has to do with the sample/bit rate, the encoding, or something similar. Unfortunately I'm not an audio expert, so I'm not sure what to do next.
Update 2: After a lot of trial and error, I've found that if I set the sample rate to roughly 9500 to 10000 in the Google Cloud recognize config, everything works. When I set this as the sample rate for the node-wav file output, it sounds okay as well. If I reset the "outgoing" sample rate to GCloud back to 16000 and instead downsample the audio input from 44100 to about 25000 rather than 16000 in the frontend (see microphoneProcess in "Frontend (React Component)"), it works as well. So there seems to be some kind of ~0.6 factor in the sample rate differences. However, I still don't know where this behavior comes from: both Chrome on the working iMac and Safari on the "broken" iPads report an audioContext.sampleRate of 44100. Therefore, when I downsample to 16000 in the code, I'd expect both to work, whereas only the iMac does. It seems like the iPad is working with a different sample rate internally?
After a ton of trial and error, I've found the problem (and the solution).
It seems that "older" iDevice models, like the 2017 iPad Pro, have the weird peculiarity of automatically adjusting the microphone sample rate to the rate of the audio being played. Even though the hardware sample rate of those devices is 44.1 kHz, as soon as some audio is played, the rate changes. This can be observed with something like this:
const audioCtx = new webkitAudioContext();
console.log(`Current sample rate: ${audioCtx.sampleRate}`); // 44100
const audio = new Audio();
audio.src = 'some_audio.mp3';
await audio.play();
console.log(`Current sample rate: ${audioCtx.sampleRate}`); // Sample rate of the played audio
In my case, I had played some synthesized speech from Google Text-to-Speech before opening the speech transcription socket. Those sound files have a sample rate of 24 kHz, exactly the sample rate at which Google Cloud received my audio input.
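Assuming the Text-to-Speech playback really did pull the microphone down to 24 kHz, the numbers match the earlier observations; a quick sanity check (effectiveRate is my own helper, not anything from the code above):

```javascript
// If the browser reports 44100 Hz but the mic is actually captured at
// 24000 Hz, downsampling "from 44100 to 16000" really yields this rate:
function effectiveRate(assumedRate, actualRate, targetRate) {
  return targetRate * actualRate / assumedRate;
}
// 16000 * 24000 / 44100 is roughly 8707 Hz, which is why telling Google
// Cloud to expect ~9500-10000 Hz worked, and why the mismatch looked
// like a factor of 24000 / 44100, about 0.54 (the question's "~0.6").
```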
The solution, therefore, was (something I should have done anyway) to downsample everything to 16 kHz (see my helper function in the question), but not from a hard-coded 44.1 kHz; rather from the current sample rate of the audio context. So I changed my microphoneProcess() function like this:
const left = e.inputBuffer.getChannelData(0);
const left16 = Helpers.downsampleBuffer(left, this.audioCtx.sampleRate, 16000);
Conclusion: Do not trust Safari with the sample rate on page load. It might change.
I am trying to recognize a certain audio clip picked up by the microphone in a browser. The clip could be a beep or short tone of a specific, preferably high, frequency played by another device (similar to how Google Tone works).
I am new to the Web Audio API and currently use getByteFrequencyData() from an AudioContext analyser to check for a high decibel value at a particular frequency, using that to trigger "recognized". But as you may have guessed, this is a useless hack and doesn't work half the time.
I have the original sound byte being played on the other device; it is in the 10000-12000 Hz range so that other sounds in the environment do not interfere with recognition. Is there any way to recognize such a beep in a browser in a decent amount of time?
Sound byte and sample code attached.
let AudioContext = window.AudioContext || window.webkitAudioContext;
let audioContext = new AudioContext();
let x = {}; // holds the analyser once the microphone is ready
let frequencyArray;
navigator.mediaDevices
    .getUserMedia({ audio: true })
    .then((stream) => {
        let audioStream = audioContext.createMediaStreamSource(stream);
        let analyser = audioContext.createAnalyser();
        analyser.fftSize = 2048;
        audioStream.connect(analyser);
        x.analyser = analyser;
        x.bufferLength = analyser.frequencyBinCount;
        frequencyArray = new Uint8Array(x.bufferLength);
        getFrequency(); // start polling once the analyser exists
    })
    .catch(
        (error) =>
            (userAlert.innerHTML =
                'Please allow access to your microphone.<br />' + error)
    );
let getFrequency = () => {
    let totalFrequency = 0;
    x.analyser.getByteFrequencyData(frequencyArray);
    for (let i = 0; i < 5; i++) {
        totalFrequency += Math.floor(frequencyArray[i + 52]);
    }
    if (totalFrequency > 500) {
        // recognized
    } else {
        // continue
    }
    requestAnimationFrame(getFrequency); // schedule instead of recursing synchronously
};
Tone Example (assume 0.2s sound byte)
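Incidentally, which analyser bins cover a given band depends on the context's sample rate and the fftSize. A small helper (my own, not a Web Audio API function) shows that, assuming a 44100 Hz context and fftSize 2048, the 10000-12000 Hz band sits around bins 464-557, far above the bins starting at index 52 that the snippet above sums:

```javascript
// Maps a frequency in Hz to its index in getByteFrequencyData's output.
// Each bin covers sampleRate / fftSize Hz (about 21.5 Hz at 44.1 kHz
// with fftSize 2048).
function frequencyToBin(freq, sampleRate, fftSize) {
  return Math.round(freq * fftSize / sampleRate);
}
```

A detector for the 10-12 kHz beep would then sum the bins from frequencyToBin(10000, ctx.sampleRate, 2048) through frequencyToBin(12000, ctx.sampleRate, 2048).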
I'm exploring the Web Audio API in an attempt to adapt some of its aspects into a non-web framework I'm working on (which'll get compiled for the web via Emscripten).
Take the following code:
var audioCtx = new AudioContext();
// imagine I've called getUserMedia and have the stream from a mic.
var source = audioCtx.createMediaStreamSource(stream);
// make a filter to alter the input somehow
var biquadFilter = audioCtx.createBiquadFilter();
// imagine we've set some settings
source.connect(biquadFilter);
Say I wanted to get the raw data of the input stream after it's been altered by the BiquadFilter (or any other filter). Is there any way to do that? As far as I can tell, the AnalyserNode might be what I'm looking for, but ideally it'd be great to just pull a buffer off the end of the graph if possible.
Any hints or suggestions are appreciated.
There are two ways...
ScriptProcessorNode
You can use a ScriptProcessorNode, which is normally used to process data in your own code, to simply record the raw 32-bit float PCM audio data.
Whether this node outputs anything or not is up to you. I usually copy the input data to the output out of convenience, but there is a slight overhead to that.
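A sketch of this recording approach, assuming you already have some node to tap (for the question, the BiquadFilter). The helper names recordFromNode and mergeBuffers are my own, not Web Audio API functions:

```javascript
// Collects raw 32-bit float PCM from any point in an audio graph.
// Returns a stop() function that yields one contiguous Float32Array.
function recordFromNode(audioCtx, sourceNode, bufferSize) {
  var chunks = [];
  var processor = audioCtx.createScriptProcessor(bufferSize || 4096, 1, 1);
  processor.onaudioprocess = function (e) {
    // Copy the data: the underlying buffer is reused between events.
    chunks.push(new Float32Array(e.inputBuffer.getChannelData(0)));
  };
  sourceNode.connect(processor);
  processor.connect(audioCtx.destination); // some browsers need a sink
  return function stop() {
    processor.disconnect();
    sourceNode.disconnect(processor);
    return mergeBuffers(chunks);
  };
}

// Concatenates the recorded chunks into one Float32Array.
function mergeBuffers(chunks) {
  var total = chunks.reduce(function (n, c) { return n + c.length; }, 0);
  var result = new Float32Array(total);
  var offset = 0;
  chunks.forEach(function (c) { result.set(c, offset); offset += c.length; });
  return result;
}
```

Usage would look like `var stop = recordFromNode(audioCtx, biquadFilter); /* later */ var pcm = stop();`.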
MediaRecorder
The MediaRecorder can be used to record MediaStreams, both audio and/or video. First you'll need a MediaStreamAudioDestinationNode. Once you have that, you can use the MediaRecorder with the resulting stream to record it.
It's important to note that typically with the MediaRecorder, you're recording compressed audio with a lossy codec. This is essentially the purpose of the MediaRecorder. However, support for PCM in WebM has recently been added by at least Chrome. Just use {type: 'audio/webm;codecs=pcm'} when instantiating your MediaRecorder.
(I haven't tested this yet, but I suspect you're going to end up with 16-bit PCM, not 32-bit float which is used internally in the Web Audio API.)
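A sketch of the MediaRecorder route, under the same assumption that you have a source node to tap; pickMimeType is my own helper, and actual codec support should be probed with MediaRecorder.isTypeSupported rather than assumed:

```javascript
// Records the output of `sourceNode` via a MediaStreamAudioDestinationNode
// and delivers the finished recording as a Blob to `onDone`.
function recordStream(audioCtx, sourceNode, onDone) {
  var dest = audioCtx.createMediaStreamDestination();
  sourceNode.connect(dest);
  // Prefer uncompressed PCM in WebM where supported (recent Chrome);
  // fall back to the browser's default (lossy) codec elsewhere.
  var mimeType = pickMimeType(MediaRecorder.isTypeSupported);
  var recorder = new MediaRecorder(dest.stream,
                                   mimeType ? { mimeType: mimeType } : {});
  var parts = [];
  recorder.ondataavailable = function (e) { parts.push(e.data); };
  recorder.onstop = function () {
    onDone(new Blob(parts, { type: recorder.mimeType }));
  };
  recorder.start();
  return recorder; // call recorder.stop() when finished
}

// Returns the first candidate the browser claims to support, or null.
function pickMimeType(isSupported) {
  var candidates = ['audio/webm;codecs=pcm', 'audio/webm'];
  for (var i = 0; i < candidates.length; i++) {
    if (isSupported(candidates[i])) return candidates[i];
  }
  return null;
}
```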
Here is a web page. Save it as mycode.html and open it in your browser; it will prompt for access to your microphone. Take note of createMediaStreamSource, as well as where the raw audio buffer is accessed and printed to the browser console. Essentially, you define callback functions that make the raw audio available on each Web Audio API event loop iteration. Enjoy.
<html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>capture microphone then show time & frequency domain output</title>

<script type="text/javascript">
var webaudio_tooling_obj = function () {

    var audioContext = new AudioContext();

    console.log("audio is starting up ...");

    var BUFF_SIZE_RENDERER = 16384;
    var SIZE_SHOW = 3; // number of array elements to show in console output

    var audioInput = null,
        microphone_stream = null,
        gain_node = null,
        script_processor_node = null,
        script_processor_analysis_node = null,
        analyser_node = null;

    if (!navigator.getUserMedia)
        navigator.getUserMedia = navigator.getUserMedia || navigator.webkitGetUserMedia ||
                                 navigator.mozGetUserMedia || navigator.msGetUserMedia;

    if (navigator.getUserMedia){
        navigator.getUserMedia({audio:true},
            function(stream) {
                start_microphone(stream);
            },
            function(e) {
                alert('Error capturing audio.');
            }
        );
    } else { alert('getUserMedia not supported in this browser.'); }

    // ---

    function show_some_data(given_typed_array, num_row_to_display, label) {
        var size_buffer = given_typed_array.length;
        var index = 0;

        console.log("__________ " + label);

        if (label === "time") {
            for (; index < num_row_to_display && index < size_buffer; index += 1) {
                var curr_value_time = (given_typed_array[index] / 128) - 1.0;
                console.log(curr_value_time);
            }
        } else if (label === "frequency") {
            for (; index < num_row_to_display && index < size_buffer; index += 1) {
                console.log(given_typed_array[index]);
            }
        } else {
            throw new Error("ERROR - must pass time or frequency");
        }
    }

    function process_microphone_buffer(event) {
        var i, N, inp, microphone_output_buffer;
        // not needed for basic feature set
        // microphone_output_buffer = event.inputBuffer.getChannelData(0); // just mono - 1 channel for now
    }

    function start_microphone(stream){
        gain_node = audioContext.createGain();
        gain_node.connect( audioContext.destination );

        microphone_stream = audioContext.createMediaStreamSource(stream);
        microphone_stream.connect(gain_node);

        script_processor_node = audioContext.createScriptProcessor(BUFF_SIZE_RENDERER, 1, 1);
        script_processor_node.onaudioprocess = process_microphone_buffer;

        microphone_stream.connect(script_processor_node);

        // --- enable volume control for output speakers
        document.getElementById('volume').addEventListener('change', function() {
            var curr_volume = this.value;
            gain_node.gain.value = curr_volume;
            console.log("curr_volume ", curr_volume);
        });

        // --- setup FFT
        script_processor_analysis_node = audioContext.createScriptProcessor(2048, 1, 1);
        script_processor_analysis_node.connect(gain_node);

        analyser_node = audioContext.createAnalyser();
        analyser_node.smoothingTimeConstant = 0;
        analyser_node.fftSize = 2048;

        microphone_stream.connect(analyser_node);
        analyser_node.connect(script_processor_analysis_node);

        var buffer_length = analyser_node.frequencyBinCount;
        var array_freq_domain = new Uint8Array(buffer_length);
        var array_time_domain = new Uint8Array(buffer_length);

        console.log("buffer_length " + buffer_length);

        script_processor_analysis_node.onaudioprocess = function() {
            // get the average for the first channel
            analyser_node.getByteFrequencyData(array_freq_domain);
            analyser_node.getByteTimeDomainData(array_time_domain);

            // draw the spectrogram
            if (microphone_stream.playbackState == microphone_stream.PLAYING_STATE) {
                show_some_data(array_freq_domain, SIZE_SHOW, "frequency");
                show_some_data(array_time_domain, SIZE_SHOW, "time"); // store this to record to aggregate buffer/file
            }
        };
    }

}(); // webaudio_tooling_obj = function()
</script>

</head>
<body>
    <p>Volume</p>
    <input id="volume" type="range" min="0" max="1" step="0.1" value="0.0"/>
</body>
</html>
A variation of the above approach lets you replace the microphone with your own logic to synthesize the audio, again with the ability to access the audio buffer.
Another alternative is to create your graph with an OfflineAudioContext. You do have to know beforehand how much data you want to capture, but if you do, you'll usually get your result faster than real time.
You'll get the raw PCM data, so you can save it, analyze it, modify it further, or whatever.
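One caveat: an OfflineAudioContext has no createMediaStreamSource, so this route fits pre-recorded buffers rather than the live microphone from the question. A sketch, with renderFiltered and framesFor as my own illustrative names and a hypothetical 1 kHz lowpass setting:

```javascript
// Renders `buffer` through a lowpass BiquadFilter faster than real time
// and resolves with the filtered AudioBuffer (raw float PCM).
function renderFiltered(buffer, sampleRate) {
  var ctx = new OfflineAudioContext(buffer.numberOfChannels,
                                    buffer.length, sampleRate);
  var source = ctx.createBufferSource();
  source.buffer = buffer;
  var filter = ctx.createBiquadFilter();
  filter.type = 'lowpass';
  filter.frequency.value = 1000; // illustrative setting
  source.connect(filter);
  filter.connect(ctx.destination);
  source.start();
  return ctx.startRendering(); // Promise resolving to an AudioBuffer
}

// How many frames an OfflineAudioContext needs for a given duration.
function framesFor(seconds, sampleRate) {
  return Math.ceil(seconds * sampleRate);
}
```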
The Web Audio API furnishes the method .stop() to stop a sound.
I want my sound to decrease in volume before stopping, so I used a gain node. However, I'm facing weird issues where some sounds just don't play, and I can't figure out why.
Here is a dumbed down version of what I do:
https://jsfiddle.net/01p1t09n/1/
You'll hear that if you remove the line with setTimeout(), every sound plays. When setTimeout() is there, not every sound plays. What really confuses me is that I push and shift accordingly to find the correct source of the sound, yet it seems like a different one stops playing. The only way I can see this happening is if AudioContext.decodeAudioData isn't synchronous. Try the jsfiddle to get a better understanding, and put your headset on, obviously.
Here is the code of the jsfiddle:
let url = "https://raw.githubusercontent.com/gleitz/midi-js-soundfonts/gh-pages/MusyngKite/acoustic_guitar_steel-mp3/A4.mp3";
let soundContainer = {};
let notesMap = {"A4": [] };
let _AudioContext_ = AudioContext || webkitAudioContext;
let audioContext = new _AudioContext_();
var oReq = new XMLHttpRequest();
oReq.open("GET", url, true);
oReq.responseType = "arraybuffer";
oReq.onload = function (oEvent) {
    var arrayBuffer = oReq.response;
    makeLoop(arrayBuffer);
};
oReq.send(null);
function makeLoop(arrayBuffer){
    soundContainer["A4"] = arrayBuffer;
    let currentTime = audioContext.currentTime;
    for(let i = 0; i < 10; i++){
        // playing at the same intervals
        play("A4", currentTime + i * 0.5);
        setTimeout( () => stop("A4"), 500 + i * 500); // remove this line and you will hear all the sounds
    }
}
function play(notePlayed, start) {
    audioContext.decodeAudioData(soundContainer[notePlayed], (buffer) => {
        let source;
        let gainNode;
        source = audioContext.createBufferSource();
        gainNode = audioContext.createGain();
        // pushing notes into the note map
        notesMap[notePlayed].push({ source, gainNode });
        source.buffer = buffer;
        source.connect(gainNode);
        gainNode.connect(audioContext.destination);
        gainNode.gain.value = 1;
        source.start(start);
    });
}
function stop(notePlayed){
    let note = notesMap[notePlayed].shift();
    note.source.stop();
}
What follows just explains why I do it like this; you can skip it. It's only there to explain why I don't simply use stop().
The reason I'm doing all this is that I want to stop the sound gracefully, so if there is a way to do so without using setTimeout, I'd gladly take it.
Basically I have a map at the top containing my sounds (notes like A1, A#1, B1,...).
soundMap = {"A": [], "lot": [], "of": [], "sounds": []};
and a play() function that populates the arrays once I play the sounds:
play(sound) {
    // sound is just { soundName, velocity, start }
    let source;
    let gainNode;
    // soundContainer is just a map from sound name to the sound data
    this.audioContext.decodeAudioData(this.soundContainer[sound.soundName], (buffer) => {
        source = this.audioContext.createBufferSource();
        gainNode = this.audioContext.createGain();
        gainNode.gain.value = sound.velocity;
        // pushing the sound into the sound map
        this.soundMap[sound.soundName].push({ source, gainNode });
        source.buffer = buffer;
        source.connect(gainNode);
        gainNode.connect(this.audioContext.destination);
        source.start(sound.start);
    });
}
And now the part that stops the sounds:
stop(sound){
    // remember: soundMap is a map from "soundName" to { source, gainNode }
    let dasound = this.soundMap[sound.soundName].shift();
    let gain = dasound.gainNode.gain.value - 0.1;
    // we lower the gain in incremental steps so the sound doesn't stop abruptly
    let i = 0;
    for(; gain > 0; i++, gain -= 0.1){ // watch out, funky syntax
        ((gain, i) => {
            setTimeout(() => dasound.gainNode.gain.value = gain, 50 * i );
        })(gain, i)
    }
    // we stop the source after the gain reaches 0 (the timeout is in ms)
    setTimeout(() => dasound.source.stop(), i * 50);
}
Aaah, yes, yes, yes! I finally found a lot of things by eventually bothering to read "everything" in the doc (diagonally). And let me tell you, this API is a diamond in the rough. Anyway, it actually has what I wanted with AudioParam:
The AudioParam interface represents an audio-related parameter, usually a parameter of an AudioNode (such as GainNode.gain). An AudioParam can be set to a specific value or a change in value, and can be scheduled to happen at a specific time and following a specific pattern.
It has a function, linearRampToValueAtTime(), and they even have an example of exactly what I asked for!
// create audio context
var AudioContext = window.AudioContext || window.webkitAudioContext;
var audioCtx = new AudioContext();

// set basic variables for example
var myAudio = document.querySelector('audio');
var pre = document.querySelector('pre');
var myScript = document.querySelector('script');

pre.innerHTML = myScript.innerHTML;

var linearRampPlus = document.querySelector('.linear-ramp-plus');
var linearRampMinus = document.querySelector('.linear-ramp-minus');

// Create a MediaElementAudioSourceNode
// Feed the HTMLMediaElement into it
var source = audioCtx.createMediaElementSource(myAudio);

// Create a gain node and set its gain value to 0
var gainNode = audioCtx.createGain();

// connect the MediaElementAudioSourceNode to the gainNode
// and the gainNode to the destination
gainNode.gain.setValueAtTime(0, audioCtx.currentTime);
source.connect(gainNode);
gainNode.connect(audioCtx.destination);

// set buttons to do something onclick
linearRampPlus.onclick = function() {
    gainNode.gain.linearRampToValueAtTime(1.0, audioCtx.currentTime + 2);
}
linearRampMinus.onclick = function() {
    gainNode.gain.linearRampToValueAtTime(0, audioCtx.currentTime + 2);
}
Working example here
They also have different types of timing, like an exponential ramp instead of a linear one, which I guess would fit this scenario even better.
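Applied to the stop() function from the question, the whole chain of setTimeout calls can collapse into a few calls scheduled on the audio clock. This is a sketch: fadeSeconds is my own parameter, and note is assumed to be one of the { source, gainNode } pairs the question stores in its sound map:

```javascript
// Fades a playing note out over `fadeSeconds` on the audio clock, then
// stops its source; returns the time at which the source will stop.
function fadeOutAndStop(audioCtx, note, fadeSeconds) {
  var now = audioCtx.currentTime;
  var gain = note.gainNode.gain;
  gain.setValueAtTime(gain.value, now);        // anchor the ramp start
  gain.linearRampToValueAtTime(0, now + fadeSeconds);
  note.source.stop(now + fadeSeconds);         // stop once silent
  return now + fadeSeconds;
}
```

Because everything is scheduled by the audio engine itself, there is no setTimeout drift between the gain reaching zero and the source stopping.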