AnalyserNode.getFloatFrequencyData() returns negative values - javascript

I'm trying to get the volume of a microphone input with the Web Audio API, using AnalyserNode.getFloatFrequencyData().
The spec states that "each item in the array represents the decibel value for a specific frequency", but I only get negative values, although they do seem to react to the sound level: a whistle returns a value of around -23 and silence around -80. (The values in the dataArray are also all negative, so I don't think it's to do with how I've added them together.) The same code gives the positive values I'd expect with AnalyserNode.getByteFrequencyData(), but those decibel values have been normalised to the range 0-255, so they are more difficult to add together to determine the overall volume.
Why am I not getting the values I expect? And/or is this perhaps not a good way of getting the volume of the microphone input in the first place?
function getVolume(analyser) {
  analyser.fftSize = 32;
  let bufferLength = analyser.frequencyBinCount;
  let dataArray = new Float32Array(bufferLength);
  analyser.getFloatFrequencyData(dataArray);

  let totalAntilogAmplitude = 0;
  for (let i = 0; i < bufferLength; i++) {
    let thisAmp = dataArray[i]; // amplitude of the current bin, in dB
    let thisAmpAntilog = Math.pow(10, thisAmp / 10); // antilog (linear power) for adding
    totalAntilogAmplitude += thisAmpAntilog;
  }
  let amplitude = 10 * Math.log10(totalAntilogAmplitude);
  return amplitude;
}

Your code looks correct, but without an example it's hard to tell whether it's producing the values you expect. Also, since you're basically computing the sum of all the values of the transform coefficients, you've just done a more expensive version of summing the squares of the time-domain signal.
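For reference, a minimal sketch of that cheaper time-domain route (my own naming, assuming the same analyser as in your getVolume function):

function getVolumeFromTimeDomain(analyser) {
  // Read raw samples instead of FFT bins.
  let dataArray = new Float32Array(analyser.fftSize);
  analyser.getFloatTimeDomainData(dataArray);
  // Mean of squares = signal power; convert to dB.
  let sumOfSquares = 0;
  for (let i = 0; i < dataArray.length; i++) {
    sumOfSquares += dataArray[i] * dataArray[i];
  }
  return 10 * Math.log10(sumOfSquares / dataArray.length);
}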
Another alternative would be to square the signal, filter it a bit to smooth out variations, and read the output value at various times. Something like the following, where s is the node carrying the signal you're interested in.
let g = new GainNode(context, {gain: 0});
s.connect(g);
s.connect(g.gain);
// Output of g is now the square of s.
let f = new BiquadFilterNode(context, {frequency: 10});
// You may want to adjust the frequency to some other value for your needs;
// I arbitrarily chose 10 Hz.
g.connect(f).connect(analyser);
// Now get the time-domain data from the analyser and just take the first
// value from the signal. This is the energy of the signal and represents
// the volume.
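To make that last comment concrete, a minimal sketch of the read step (assuming analyser is the AnalyserNode at the end of the chain above):

// The analyser now sees the squared, smoothed signal, so any single
// sample approximates the current energy (volume).
let buf = new Float32Array(analyser.fftSize);
analyser.getFloatTimeDomainData(buf);
let energy = buf[0];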

More efficient way to copy repeating sequence into TypedArray?

I have a source Float32Array that I create a secondary Float32Array from. I have a sequence of values, model, that I want to copy as a repeating sequence into the secondary Float32Array. I am currently doing this operation with a reverse while loop.
sequence = [1, 0, 0, 0, 0, 1, 0, 0, 2, 0, 1, 0];
n = 3179520; // divisible by sequence length
modelBuffs = new Float32Array(n);
var v = modelBuffs.length;
while (v > 0) {
  // XTransform
  modelBuffs[v-12] = sequence[0];
  modelBuffs[v-11] = sequence[1];
  modelBuffs[v-10] = sequence[2];
  modelBuffs[v-9] = sequence[3];
  // YTransform
  modelBuffs[v-8] = sequence[4];
  modelBuffs[v-7] = sequence[5];
  modelBuffs[v-6] = sequence[6];
  modelBuffs[v-5] = sequence[7];
  // ZTransform
  modelBuffs[v-4] = sequence[8];
  modelBuffs[v-3] = sequence[9];
  modelBuffs[v-2] = sequence[10];
  modelBuffs[v-1] = sequence[11];
  v -= 12;
}
Unfortunately, n can be unknown. I may have to do a significant refactor if there is no alternative solution. I am hoping that I can set the sequence once and that there is a copy-in-place / repeating-fill / bitwise operation that repeats the initial byte sequence.
Edit: simplified the example input.
A fast method to fill an array with a repeated sequence is to double the length of the filled segment on each iteration, using the copyWithin() method of the typed array. You could use set() as well, by creating a different view on the same underlying ArrayBuffer, but it's simpler to use the former for this purpose.
Using, for example, 1234 as the source, the first iteration fills 1:1, or 4 indices in this case:
1234
From there on we use the destination buffer as the source for the remaining fill, so the second iteration fills 8 indices:
12341234
Third iteration fills 16 indices:
1234123412341234
Fourth iteration fills 32 indices:
12341234123412341234123412341234
and so forth.
If the last segment doesn't land exactly on a power of 2, you simply truncate it to the length remaining in the buffer and use that for the final copy.
var
  srcBuffer = new Uint8Array([1,2,3,4]), // any view type will do
  dstBuffer = new Uint8Array(1<<14),     // 16 kB
  len = dstBuffer.length,                // important: use indices length, not byte-length
  sLen = srcBuffer.length,
  p = sLen;                              // set initial position = source sequence length

var startTime = performance.now();

// step 1: copy source sequence to the beginning of dest. array
// todo: dest. buffer might be smaller than source. Check for this here.
dstBuffer.set(srcBuffer);

// step 2: copy existing data, doubling the segment length per iteration
while (p < len) {
  if (p + sLen > len) sLen = len - p; // if not power of 2, truncate last segment
  dstBuffer.copyWithin(p, 0, sLen);   // internal copy
  p += sLen;                          // add current length to offset
  sLen <<= 1;                         // double length for next segment
}

var time = performance.now() - startTime;
console.log("done", time + "ms");
console.log(dstBuffer);
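For comparison, a minimal sketch of the set()-based variant mentioned above, using subarray() on the destination as the source for each doubling step (my own naming; same idea, not a separate technique):

// Same doubling strategy using set() + subarray() instead of copyWithin().
var dst = new Uint8Array(1 << 14);
dst.set([1, 2, 3, 4]); // seed with the source sequence
var filled = 4;
while (filled < dst.length) {
  var copyLen = Math.min(filled, dst.length - filled);
  dst.set(dst.subarray(0, copyLen), filled); // copy the filled region after itself
  filled += copyLen;
}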
If the array is very long it will obviously take some time regardless. In those cases you could consider using a Web Worker with the new SharedArrayBuffer, so that the copying happens in a different process without having to copy or transfer the data back and forth. The gain is merely that the main thread is not blocked; there is little to win in the buffer handling itself, as the internals of copyWithin() are already close to optimal for this purpose. The cons are the async aspect combined with the overhead of the event system (i.e. whether this is useful depends on the case).
A different approach is to use WebAssembly, where you write the buffer-fill code in C/C++, compile it, expose methods that take source and destination buffers, and call those from JavaScript. I don't have an example for this case.
In both of these latter cases you will run into compatibility issues with somewhat older browsers.

JS - How to check if 2 images (their hash) are similar

GOAL
Finding a good way to check whether 2 images are similar by comparing their hash profiles. The hash is a simple array containing 0 and 1 values.
INTRO
I have 2 images. They are the same image but with some small differences: one has a different brightness, a slight rotation, and a different shot.
What I want to do is create a JavaScript method that compares the 2 images and calculates a percentage value telling how similar they are.
WHAT I'VE DONE
After loading the 2 images into an HTML5 canvas to get their image data, I've used the pHash algorithm (www.phash.org) to obtain their hash representation.
The hash is an array containing 0 and 1 values that recreates the image in a "simplified" form.
I've also created a JS script that generates an HTML table with black cells where the array contains 1. The result is the following screenshot (the image is a Van Gogh picture):
Screenshot
Now, what I need to do is compare the 2 arrays to obtain a percentage value that tells "how much" they are similar.
Most of the JavaScript hashing algorithms I've found by googling already come with a comparison algorithm: the Hamming distance. It's very simple and fast, but not very precise. In fact, the Hamming distance says that the 2 images in my screenshot are 67% similar.
THE QUESTION
Starting from 2 simple arrays of the same length, filled with 0 and 1 values: what could be a good algorithm to determine their similarity more precisely?
NOTES
- Pure JavaScript development, no third-party plugins or frameworks.
- No need for a complex algorithm that finds the right similarity when the 2 images are the same but heavily transformed (strong rotation, totally different colors, etc.).
Thanks
PHASH CODE
// size is the image width/height in pixels (for example 128)
var pixels = [];
for (var i = 0; i < imgData.data.length; i += 4) {
  var j = i / 4;
  var y = Math.floor(j / size);
  var x = j - (y * size);
  var pixelPos = x + (y * size);
  var r = imgData.data[i];
  var g = imgData.data[i+1];
  var b = imgData.data[i+2];
  var gs = Math.floor((r * 0.299) + (g * 0.587) + (b * 0.114)); // luma greyscale
  pixels[pixelPos] = gs;
}
var avg = Math.floor(pixels.reduce(function (a, b) { return a + b; }, 0) / pixels.length);
var hash = [];
pixels.forEach(function (px, i) {
  hash[i] = px > avg ? 1 : 0;
});
return hash;
HAMMING DISTANCE CODE
// hash1 and hash2 are the arrays of the "coded" images.
var similarity = hash1.length;
hash1.forEach(function (val, key) {
  if (val != hash2[key]) {
    similarity--;
  }
});
var percentage = (similarity / hash1.length * 100).toFixed(2);
NOTE: the snippets above use the standard Array forEach; a plain for (var i = 0; i < array.length; i++) loop works just as well.
I'm using blockhash, and it seems pretty good so far; the only false positives I get are when half of each picture is the same background color, which is to be expected =/
http://blockhash.io/
BlockHash may be slower than yours but it should be more accurate.
What you do is just calculate the greyscale of each pixel and compare it to the average to create your hash.
What BlockHash does is split the picture into small rectangles of equal size, average the sum of the RGB values of the pixels inside each one, and compare the results to 4 horizontal medians.
So it is normal that it takes longer, but it is still pretty efficient and accurate.
I'm doing it with pictures of a decent resolution, at minimum 1000x800, using 16 bits. This gives a 64-character hexadecimal hash. When using the Hamming distance provided by the same library, I see good results with a similarity threshold of 10.
Your idea of using greyscale isn't bad at all, but you should average out portions of the image instead of comparing individual pixels. That way you can compare a thumbnail version to its original and get pretty much the same pHash!
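A minimal sketch of that block-averaging idea (my own hypothetical helper, not from any library; it assumes the pixels greyscale array and size from the pHash code above, and that size is divisible by blocks):

// Average the size x size greyscale image down to blocks x blocks cells,
// then hash the cell averages with the same "compare to average" step.
function blockAverages(pixels, size, blocks) {
  var cell = size / blocks; // cell edge length in pixels
  var averages = [];
  for (var by = 0; by < blocks; by++) {
    for (var bx = 0; bx < blocks; bx++) {
      var sum = 0;
      for (var y = 0; y < cell; y++) {
        for (var x = 0; x < cell; x++) {
          sum += pixels[(by * cell + y) * size + (bx * cell + x)];
        }
      }
      averages.push(sum / (cell * cell));
    }
  }
  return averages;
}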
I don't know if this will do the trick, but you can simply compare the 0 and 1 values between the arrays:
const arr1 = [1,1,1,1,1,1,1,1,1,1],
      arr2 = [0,0,0,0,0,0,0,0,0,0],
      arr3 = [0,1,0,1,0,1,0,1,0,1],
      arr4 = [1,1,1,0,1,1,1,0,1,1]

const howSimilar = (a1, a2) => {
  let similarity = 0
  a1.forEach((elem, index) => {
    if (a2[index] == elem) similarity++
  })
  let percentage = parseInt(similarity / a1.length * 100) + "%"
  console.log(percentage)
}
howSimilar(arr1,arr2) // 0%
howSimilar(arr1,arr3) // 50%
howSimilar(arr1,arr4) // 80%

Memory-efficient downsampling (charting) of a growing array

A node process of mine receives a sample point every half-second, and I want to update the history chart of all the sample points I receive.
The chart should be an array containing the downsampled history of all points from 0 to the current point.
In other words, the maximum length of the array should be l. If I have received more sample points than l, I want the chart array to be a downsampled-to-l version of the whole history.
To express it with code:
const CHART_LENGTH = 2048
createChart(CHART_LENGTH)
onReceivePoint = function (p) {
  // p can be considered a number
  const chart = addPointToChart(p)
  // chart is an array representing all the samples received, from 0 to now
  console.assert(chart.length <= CHART_LENGTH)
}
I already have a working downsampling function with number arrays:
function downsample (arr, density) {
  const downsampled = []
  for (let i = 0; i < arr.length; i++) {
    const j = ~~(i / arr.length * density) // index in the downsampled array
    if (downsampled[j] == null) downsampled[j] = 0
    downsampled[j] += Math.abs(arr[i] * density / arr.length)
  }
  return downsampled
}
One trivial way of doing this would obviously be to save all the points I receive into an array and apply the downsample function whenever the array grows. This would work, but, since this piece of code would run on a server, possibly for months and months in a row, the supporting array would eventually grow so much that the process would run out of memory.
The question is: Is there a way to construct the chart array by re-using the previous contents of the chart itself, to avoid maintaining a growing data structure? In other words, is there a constant-memory-complexity solution to this problem?
Please note that the chart must contain the whole history since sample point #0 at any moment, so charting the last n points would not be acceptable.
The only operation that does not distort the data and that can be applied repeatedly is aggregation of an integer number of adjacent samples. You probably want a factor of 2.
More specifically: if you find that adding a new sample would exceed the array bounds, do the following: start at the beginning of the array and average pairs of subsequent samples. This halves the array size and gives you space to add new samples. While doing so, you should keep track of the current cluster size c (the number of samples that constitute one entry in the array). It starts at one, and every reduction doubles it.
Now the problem is that you cannot add new samples directly to the array any more because they have a completely different scale. Instead, you should average the next c samples into a new entry. It turns out that it is sufficient to store the number of samples n in the current cluster to do this. So if you add a new sample s, you would do the following:
n++
if n == 1
    append s to array
else
    // update the running average
    last array element += (s - last array element) / n
if n == c
    n = 0 // start a new cluster
So the memory that you actually need is the following:
the history array with predefined length
the number of elements in the history array
the current cluster size c
the number of elements in the current cluster n
The size of the additional memory does not depend on the total number of samples, hence O(1).
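For what it's worth, a minimal runnable sketch of this scheme (my own code, reusing CHART_LENGTH and addPointToChart from the question; a sketch, not a drop-in implementation):

const CHART_LENGTH = 2048
const chart = [] // the history array, never longer than CHART_LENGTH
let c = 1        // current cluster size (original samples per chart entry)
let n = 0        // samples accumulated in the current cluster

function addPointToChart (s) {
  if (n === 0 && chart.length === CHART_LENGTH) {
    // Out of space: average adjacent pairs in place, halving the array
    for (let i = 0; i < CHART_LENGTH / 2; i++) {
      chart[i] = (chart[2 * i] + chart[2 * i + 1]) / 2
    }
    chart.length = CHART_LENGTH / 2
    c *= 2 // each entry now represents twice as many samples
  }
  n++
  if (n === 1) chart.push(s) // first sample starts a new cluster
  else chart[chart.length - 1] += (s - chart[chart.length - 1]) / n // running average
  if (n === c) n = 0 // cluster complete
  return chart
}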

How to correctly determine volume in dB from getByteFrequencyData

I know that getByteFrequencyData returns the volume in dB of each frequency band. How do I determine the total volume in dB of the signal to show in a VU meter?
Most of the time I see code that simply adds up the volume of each frequency band and then divides the sum by the number of bands, but that surely isn't correct. It would mean that EVERY frequency band would need to be at 6 dB for the whole signal to be at 6 dB, which of course is not the case.
My questions:
How can I determine the total volume of the signal correctly?
If minDecibels is set to -96 and maxDecibels to 0, I assume that a value of 0 translates to -96 dB and a value of 255 to 0 dB. But: what would a value of 128 mean? -48 dB?
I think this depends on what you mean by "volume". If it's the energy of the signal, then you can take the average of the output from getFloatFrequencyData, but you should not average the dB values themselves. You need to convert them to linear values before averaging. This is expensive; you could instead take the time-domain data and compute the average sum of squares and get (almost) the same answer.
Yes, the FFT data is converted to dB and then linearly mapped between the min and max values. See https://webaudio.github.io/web-audio-api/#widl-AnalyserNode-getByteFrequencyData-void-Uint8Array-array.
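A minimal sketch of that mapping plus a linear-domain average (my own helpers, using the minDecibels/maxDecibels values from the question):

const minDecibels = -96, maxDecibels = 0;

// Map a byte from getByteFrequencyData back to dB.
function byteToDb(byteValue) {
  return minDecibels + (byteValue / 255) * (maxDecibels - minDecibels);
}

// Average the bins in the linear (power) domain, then convert back to dB.
function averageDb(byteArray) {
  let sum = 0;
  for (const v of byteArray) {
    sum += Math.pow(10, byteToDb(v) / 10);
  }
  return 10 * Math.log10(sum / byteArray.length);
}

console.log(byteToDb(128)); // ≈ -47.8 dB, close to the -48 dB guessed above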
I've been wrestling with this issue too - the results from averaging the getByteFrequencyData output just don't look right.
A simplistic solution would be to return just the peak value from the frequency data.
var array = new Uint8Array(buffer_size);
listenerNode.onaudioprocess = function () {
  // Get the audio data and store it in our array.
  volume_analyser.getByteFrequencyData(array);
  // Get the peak frequency value.
  var peak_frequency = Math.max.apply(null, array);
  // ...
}
This produces results that look normal on a VU meter display.
I found #tom-hazledine's Math.max approach produced too loud a reading, effectively maxing out the volume meter very quickly.
Here's another approach that simply takes the average, and it can be modified easily to adjust if still too loud:
var array = new Uint8Array(buffer_size);
listenerNode.onaudioprocess = function () {
  // Get the audio data and store it in our array.
  volume_analyser.getByteFrequencyData(array);
  // Average the frequency values.
  var average_frequency = array.reduce((a, b) => a + b, 0) / array.length;
  // ...
}
To reduce the average even further, simply add to the length of the array (i.e. array.length + 100).
A more mathematically proper way to do this (though it also produces too loud a signal) is the RMS:
var array = new Uint8Array(buffer_size);
listenerNode.onaudioprocess = function () {
  // Get the audio data and store it in our array.
  volume_analyser.getByteFrequencyData(array);
  // Compute the RMS of the frequency values.
  let sum = 0;
  for (const amplitude of array) {
    sum += amplitude * amplitude;
  }
  const volume = Math.sqrt(sum / array.length);
}
This can also be adjusted by adding to the array length as described above.

Web Audio: Karplus Strong String Synthesis

Edit: Cleaned up the code and the player (on Github) a little so it's easier to set the frequency
I'm trying to synthesize strings using the Karplus Strong string synthesis algorithm, but I can't get the string to tune properly. Does anyone have any idea?
As linked above, the code is on Github: https://github.com/achalddave/Audio-API-Frequency-Generator (the relevant bits are in strings.js).
Wikipedia has the following diagram: [block diagram of Karplus-Strong: a noise burst feeds the output and an N-sample delay line; the delayed signal passes through a low-pass filter and is mixed back into the output and the delay]
So essentially, I generate the noise, which then gets output and sent to a delay filter simultaneously. The delay filter is connected to a low-pass filter, which is then mixed with the output. According to Wikipedia, the delay should be N samples, where N is the sampling frequency divided by the fundamental frequency (N = f_s/f_0).
Excerpts from my code:
Generating the noise (bufferSize is 2048, but that shouldn't matter too much)
var buffer = context.createBuffer(1, bufferSize, context.sampleRate);
var bufferSource = context.createBufferSource();
bufferSource.buffer = buffer;

var bufferData = buffer.getChannelData(0);
for (var i = 0; i < delaySamples + 1; i++) {
  bufferData[i] = 2 * (Math.random() - 0.5); // random noise from -1 to 1
}
Create a delay node
var delayNode = context.createDelayNode();
We need to delay by f_s/f_0 samples. However, the delay node takes the delay in seconds, so we need to divide that by the samples per second, and we get (f_s/f_0) / f_s, which is just 1/f_0.
var delaySeconds = 1/(frequency);
delayNode.delayTime.value = delaySeconds;
Create the lowpass filter (the frequency cutoff, as far as I can tell, shouldn't affect the frequency, and is more a matter of whether the string "sounds" natural):
var lowpassFilter = context.createBiquadFilter();
lowpassFilter.type = lowpassFilter.LOWPASS; // explicitly set type
lowpassFilter.frequency.value = 20000; // make things sound better
Connect the noise to the output and the delay node (destination = context.destination and was defined earlier):
bufferSource.connect(destination);
bufferSource.connect(delayNode);
Connect the delay to the lowpass filter:
delayNode.connect(lowpassFilter);
Connect the lowpass to the output and back to the delay*:
lowpassFilter.connect(destination);
lowpassFilter.connect(delayNode);
Does anyone have any ideas? I can't figure out whether the issue is my code, my interpretation of the algorithm, my understanding of the API, or (though this is least likely) an issue with the API itself.
*Note that on Github, there's actually a Gain Node between the lowpass and the output, but this doesn't really make a big difference in the output.
Here's what I think is the problem. I don't think the DelayNode implementation is designed to handle such tight feedback loops. For a 441 Hz tone, for example, that's only 100 samples of delay, and the DelayNode implementation probably processes its input in blocks of 128 or more. (The delayTime attribute is "k-rate", meaning changes to it are only processed in blocks of 128 samples. That doesn't prove my point, but it hints at it.) So the feedback comes in too late, or only partially, or something.
EDIT/UPDATE: As I state in a comment below, the actual problem is that a DelayNode in a cycle adds 128 sample frames between output and input, so that the observed delay is 128 / sampleRate seconds longer than specified.
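If that is the case, a minimal workaround sketch (my own suggestion, not something from the spec) is to subtract those 128 frames from the requested delay. Note that this only works for frequencies below context.sampleRate / 128 (about 344 Hz at 44.1 kHz), since the compensated delay must stay non-negative:

// Compensate for the extra 128 sample frames a DelayNode adds in a cycle.
var compensated = 1 / frequency - 128 / context.sampleRate;
if (compensated >= 0) {
  delayNode.delayTime.value = compensated;
}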
My advice (and what I've begun to do) is to implement the whole Karplus-Strong including your own delay line in a JavaScriptNode (now known as a ScriptProcessorNode). It's not hard and I'll post my code once I get rid of an annoying bug that can't possibly exist but somehow does.
Incidentally, the tone you (and I) get with a delayTime of 1/440 (which is supposed to be an A) seems to be a G, two semitones below where it should be. Doubling the frequency raises it to a B, four semitones higher. (I could be off by an octave or two - kind of hard to tell.) Probably one could figure out what's going on (mathematically) from a couple more data points like this, but I won't bother.
EDIT: Here's my code, certified bug-free.
var context = new webkitAudioContext();
var frequency = 440;
var impulse = 0.001 * context.sampleRate; // length of the initial noise burst, in samples
var node = context.createJavaScriptNode(4096, 0, 1);
var N = Math.round(context.sampleRate / frequency); // delay-line length
var y = new Float32Array(N); // the delay line
var n = 0;
node.onaudioprocess = function (e) {
  var output = e.outputBuffer.getChannelData(0);
  for (var i = 0; i < e.outputBuffer.length; ++i) {
    var xn = (--impulse >= 0) ? Math.random() - 0.5 : 0; // noise burst, then silence
    // Averaging two adjacent delay-line samples is the low-pass filter.
    output[i] = y[n] = xn + (y[n] + y[(n + 1) % N]) / 2;
    if (++n >= N) n = 0;
  }
}
node.connect(context.destination);
