I know that getByteFrequencyData returns the volume in dB of each frequency band. How do I determine the total volume in dB of the signal to show in a VU meter?
Most of the time I see code that simply adds up the volume of each frequency band and then divides the sum by the number of bands, but that is surely not correct. It would mean that EVERY frequency band would need to be at 6 dB for the whole signal to be at 6 dB. That, of course, is not the case.
My questions:
How can I determine the total volume of the signal correctly?
If minDecibels is set to -96 and maxDecibels to 0, I assume that a value of 0 translates to -96 dB and a value of 255 to 0 dB. But what would a value of 128 mean? -48 dB?
I think this depends on what you mean by "volume". If it's the energy of the signal, then you can take the average of the output from getFloatFrequencyData, but you should not average the dB values directly. You need to convert to linear before averaging. This is expensive; you could instead take the time domain data and compute the average sum of squares and get (almost) the same answer.
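For the first approach, a minimal sketch (assuming an AnalyserNode called analyser that is already connected to the signal):

// Convert each dB bin to linear power, average, then convert back to dB.
const bins = new Float32Array(analyser.frequencyBinCount);

function averageLevelDb() {
    analyser.getFloatFrequencyData(bins);
    let sum = 0;
    for (const db of bins) {
        sum += Math.pow(10, db / 10); // dB -> linear power
    }
    return 10 * Math.log10(sum / bins.length); // mean power back to dB
}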
Yes, the FFT data is converted to dB and then linearly mapped between the min and max values. See https://webaudio.github.io/web-audio-api/#widl-AnalyserNode-getByteFrequencyData-void-Uint8Array-array.
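For the second question, a minimal sketch of the inverse of that mapping (using the -96/0 range from the question):

// Inverse of the byte mapping: 0 -> minDecibels, 255 -> maxDecibels,
// linearly in between.
function byteToDb(byteValue, minDecibels = -96, maxDecibels = 0) {
    return minDecibels + (byteValue / 255) * (maxDecibels - minDecibels);
}

byteToDb(128); // ≈ -47.8 dB, so 128 does mean roughly -48 dB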
I've been wrestling with this issue too - the results from averaging the getByteFrequencyData output just don't look right.
A simplistic solution would be to return just the peak value from the frequency data.
var array = new Uint8Array( buffer_size );

listenerNode.onaudioprocess = function(){
    // Get the audio data and store it in our array.
    volume_analyser.getByteFrequencyData( array );
    // Get the peak value from the frequency data.
    var peak_frequency = Math.max.apply( null, array );
    // ...
}
This produces results that look normal on a VU meter display.
I found that @tom-hazledine's Math.max approach generates too loud a reading, effectively maxing out the volume meter very quickly.
Here's another approach that simply takes the average; it can easily be adjusted if it's still too loud:
var array = new Uint8Array( buffer_size );

listenerNode.onaudioprocess = function(){
    // Get the audio data and store it in our array.
    volume_analyser.getByteFrequencyData( array );
    // Get the average of the frequency data.
    var average = array.reduce((a, b) => a + b, 0) / array.length;
    // ...
}
To reduce the average even further, simply add to the length of the array (i.e. array.length + 100).
Or, for a more mathematically proper way to do this (though it also produces too loud a signal), take the root mean square:
var array = new Uint8Array( buffer_size );

listenerNode.onaudioprocess = function(){
    // Get the audio data and store it in our array.
    volume_analyser.getByteFrequencyData( array );
    // Compute the root mean square of the frequency data.
    let sum = 0;
    for (const amplitude of array) {
        sum += amplitude * amplitude;
    }
    const volume = Math.sqrt(sum / array.length);
}
This can also be adjusted by adding to the array length as described above.
Related
I'm trying to get the volume of a microphone input with the Web Audio API, using AnalyserNode.getFloatFrequencyData().
The spec states that "each item in the array represents the decibel value for a specific frequency", but it returns only negative values, although they do look like they are reacting to the level of sound - a whistle returns a value of around -23 and silence around -80 (the values in the dataArray are also all negative, so I don't think it's to do with how I've added them together). The same code gives the values I'd expect (positive) with AnalyserNode.getByteFrequencyData(), but those decibel values have been normalised to 0-255, so they are more difficult to add together to determine the overall volume.
Why am I not getting the values I expect? And/or is this perhaps not a good way of getting the volume of the microphone input in the first place?
function getVolume(analyser) {
    analyser.fftSize = 32;
    let bufferLength = analyser.frequencyBinCount;
    let dataArray = new Float32Array(bufferLength);
    analyser.getFloatFrequencyData(dataArray);

    let totalAntilogAmplitude = 0;
    for (let i = 0; i < bufferLength; i++) {
        let thisAmp = dataArray[i]; // amplitude of current bin
        let thisAmpAntilog = Math.pow(10, (thisAmp / 10)); // antilog amplitude for adding
        totalAntilogAmplitude = totalAntilogAmplitude + thisAmpAntilog;
    }
    let amplitude = 10 * Math.log10(totalAntilogAmplitude);
    return amplitude;
}
Your code looks correct, but without an example it's hard to tell if it's producing the values you expect. Also, since you're basically computing the sum of all the transform coefficient values, you've just done a more expensive version of summing the squares of the time-domain signal.
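A sketch of that cheaper time-domain version (assuming the same analyser as in the question):

// Mean square of the waveform, expressed in dB (returns -Infinity for silence).
const samples = new Float32Array(analyser.fftSize);

function timeDomainLevelDb() {
    analyser.getFloatTimeDomainData(samples);
    let sumSquares = 0;
    for (const s of samples) sumSquares += s * s;
    return 10 * Math.log10(sumSquares / samples.length);
}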
Another alternative would be to square the signal, filter it a bit to smooth out variations, and read the output value at various times. Something like the following, where s is the node that has the signal you're interested in.
let g = new GainNode(context, {gain: 0});
s.connect(g);
s.connect(g.gain);
// Output of g is now the square of s
let f = new BiquadFilterNode(context, {frequency: 10});
// May want to adjust the frequency some to other value for your needs.
// I arbitrarily chose 10 Hz.
g.connect(f).connect(analyser)
// Now get the time-domain value from the analyser and just take the first
// value from the signal. This is the energy of the signal and represents
// the volume.
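A minimal sketch of that read-out, assuming analyser is an AnalyserNode that f was connected to above:

const buf = new Float32Array(analyser.fftSize);

function currentEnergy() {
    analyser.getFloatTimeDomainData(buf);
    return buf[0]; // first time-domain sample of the smoothed, squared signal
}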
I'm using the standard Fisher-Yates algorithm to randomly shuffle a deck of cards in an array. However, I'm unsure if this will actually produce a true distribution of all possible permutations of a real-world shuffled deck of cards.
V8's Math.random has only 128 bits of internal state. Since there are 52 cards in a deck, 52 factorial orderings would require about 226 bits of internal state to generate all possible permutations.
However, I'm unsure if this applies when using Fisher-Yates, since you aren't actually generating each possible permutation but just picking one position at random out of 52.
function shuffle(array) {
    var m = array.length, t, i;
    while (m) {
        i = Math.floor(Math.random() * m--);
        t = array[m];
        array[m] = array[i];
        array[i] = t;
    }
    return array;
}
In general, if a pseudorandom number generator admits fewer than 52 factorial different seeds, then there are some permutations that particular PRNG can't choose when it shuffles a 52-item list, and Fisher-Yates can't change that. (The set of permutations a particular PRNG can choose can be different from the set of permutations another PRNG can choose, even if both PRNGs are initialized with the same seed.) See also this question.
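For reference, a quick sketch that checks the ~226-bit figure for 52 factorial:

// log2(52!) = log2(2) + log2(3) + ... + log2(52)
let bits = 0;
for (let k = 2; k <= 52; k++) bits += Math.log2(k);
console.log(bits); // ≈ 225.58, hence "about 226 bits"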
Note that although the Math.random algorithm used by V8 admits any of about 2^128 seeds at the time of this writing, no particular random number algorithm is mandated by the ECMAScript specification of Math.random, which states only that that method uses an "implementation-dependent algorithm or strategy" to generate random numbers (see ECMAScript sec. 20.2.2.27).
A PRNG's period can be extended with the Bays-Durham shuffle, which effectively increases that PRNG's state length (see Severin Pappadeux's answer). However, if you merely initialize the Bays-Durham table entries with outputs of the PRNG (rather than using the seed to initialize those entries), that particular PRNG (including the way it initializes those entries and selects table entries based on the random numbers it generates) still can't choose more permutations than the number of possible seeds for its original state, because there is only one way to initialize the Bays-Durham entries for a given seed - unless, of course, the PRNG actually shuffles an exorbitant number of lists, so many that it generates more random numbers without cycling than it otherwise would without the Bays-Durham shuffle.
For example, if the PRNG is 128 bits long, there are only 2^128 possible seeds, so there are only 2^128 ways to initialize the Bays-Durham shuffle, one for each seed, unless a seed longer than 128 bits extends to the Bays-Durham table entries and not just the PRNG's original state. (This is not to imply that the set of permutations that PRNG can choose is always the same no matter how it selects table entries in the Bays-Durham shuffle.)
You are right. With 128 bits of starting state, you can only generate at most 2^128 different permutations. It doesn't matter how often you use this state (i.e. how many times you call Math.random()); the PRNG is deterministic, after all.
Where the number of calls to Math.random() actually matters is when:
- each call would draw some more entropy (e.g. from a hardware source) into the system, instead of relying on the internal state that is initialised only once, or
- the entropy of a single call's result is so low that you don't use the entire internal state over the run of the algorithm.
Well, you definitely need an RNG with a period of at least 2^226 (226 bits) for all permutations to be covered; @PeterO's answer is correct in this regard. But you could extend the period using the Bays-Durham shuffle, paying for it by effectively extending the state of the RNG. There is an estimate of the period of the B-D shuffled RNG, and it is
P = sqrt(Pi * N! / (2*O))
where Pi = 3.1415..., N is the B-D table size, and O is the period of the original generator. If you take log2 of the whole expression, use the Stirling formula for the factorial, and assume P = 2^226 and O = 2^128, you can get an estimate for N, the size of the table in the B-D algorithm. From a back-of-the-envelope calculation, N = 64 would be enough to get all your permutations.
UPDATE
UPDATE: Ok, here is an example implementation of an RNG extended with the B-D shuffle. First, I implemented Xorshift128+ in Javascript using BigInt; this is apparently the default RNG in the V8 engine as well. Compared with the C++ version, they produced identical output for the first couple dozen calls. The 128-bit seed is given as two 64-bit words. Windows 10 x64, NodeJS 12.7.
const WIDTH = 2n ** 64n;
const MASK  = WIDTH - 1n; // to keep things as 64bit values

class XorShift128Plus { // as described in https://v8.dev/blog/math-random
    _state0 = 0n;
    _state1 = 0n;

    constructor(seed0, seed1) { // 128bit seed as 2 64bit values
        this._state0 = BigInt(seed0) & MASK;
        this._state1 = BigInt(seed1) & MASK;
        if (this._state0 <= 0n)
            throw new Error('seed 0 non-positive');
        if (this._state1 <= 0n)
            throw new Error('seed 1 non-positive');
    }

    next() {
        let s1 = this._state0;
        let s0 = this._state1;
        this._state0 = s0;
        s1 = ((s1 << 23n) ^ s1) & MASK;
        s1 ^= (s1 >> 17n);
        s1 ^= s0;
        s1 ^= (s0 >> 26n);
        this._state1 = s1;
        return (this._state0 + this._state1) & MASK; // modulo WIDTH
    }
}
Ok, then on top of XorShift128+ I've implemented the B-D shuffle with a table of size 4. For your purpose you'll need a table with more than 84 entries, and a power-of-two table is much easier to deal with, so a 128-entry table (7-bit index) should be good enough. Anyway, even with a 4-entry table and a 2-bit index, we need to know which bits of the random value to pick to form the index. In the original paper B-D discussed picking them from the back of the random value as well as from the front, etc. This is where the B-D shuffle needs another seed value - telling the algorithm to pick, say, the bits at positions 2 and 6.
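The class below also relies on a findPosition helper that isn't shown; here is a minimal stand-in that returns the position of the lowest set bit of a BigInt:

// Stand-in helper (not part of the original listing): position of the
// lowest set bit of a positive BigInt.
function findPosition(v) {
    let pos = 0n;
    while (((v >> pos) & 1n) === 0n) pos++;
    return pos;
}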
class B_D_XSP {
    _xsprng;
    _seedBD = 0n;
    _pos0 = 0n;
    _pos1 = 0n;
    _t; // B-D table, 4 entries
    _Z = 0n;

    constructor(seed0, seed1, seed2) { // note third seed for the B-D shuffle
        this._xsprng = new XorShift128Plus(seed0, seed1);
        this._seedBD = BigInt(seed2) & MASK;
        if (this._seedBD <= 0n)
            throw new Error('B-D seed non-positive');
        this._pos0 = findPosition(this._seedBD); // first non-zero bit position
        this._pos1 = findPosition(this._seedBD & (~(1n << this._pos0))); // second non-zero bit position

        // filling up table and B-D shuffler
        this._t = new Array(this._xsprng.next(), this._xsprng.next(), this._xsprng.next(), this._xsprng.next());
        this._Z = this._xsprng.next();
    }

    index(rv) { // bit at first position plus 2*bit at second position
        let idx = ((rv >> this._pos0) & 1n) + (((rv >> this._pos1) & 1n) << 1n);
        return idx;
    }

    next() {
        let retval = this._Z;
        let j = this.index(this._Z);
        this._Z = this._t[j];
        this._t[j] = this._xsprng.next();
        return retval;
    }
}
A usage example is as follows.
let rng = new B_D_XSP(1, 2, 4+64); // bits at second and sixth position to make index
console.log(rng._pos0.toString(10));
console.log(rng._pos1.toString(10));
console.log(rng.next());
console.log(rng.next());
console.log(rng.next());
Obviously, a third seed value of, say, 8+128 would produce a different permutation from what is shown in the example; you can play with it.
The last step would be to make a 226-bit random value by calling the B-D shuffled RNG several (3 or 4) times, combining the 64-bit values (and the potential carry-over) into 226 random bits, and then converting them to the deck shuffle.
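A sketch of that final step (my own illustration; it draws 256 bits instead of exactly 226 and ignores the tiny modulo bias): build one big random BigInt from four 64-bit outputs of the B-D shuffled generator, then peel indices off it for a Fisher-Yates shuffle.

// rng is a B_D_XSP instance from the example above.
function random256(rng) {
    let r = 0n;
    for (let k = 0; k < 4; k++) r = (r << 64n) | rng.next();
    return r; // 256 random bits, more than the ~226 needed for 52!
}

function shuffleDeck(deck, rng) {
    let r = random256(rng);
    for (let m = deck.length; m > 1; m--) {
        const i = Number(r % BigInt(m)); // consume part of the big value
        r /= BigInt(m);
        [deck[m - 1], deck[i]] = [deck[i], deck[m - 1]];
    }
    return deck;
}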
I was asked this problem yesterday. I had to write code to split an array into two parts such that the difference between the sums of these two parts is minimal.
Here is the code I wrote, with O(n) complexity:
function solution(a) {
    let leftSum = 0;
    let rightSum = a.reduce((acc, value) => acc + value, 0);
    let min = Math.abs(rightSum - leftSum);
    a.forEach((item, i) => {
        leftSum += a[i];
        rightSum -= a[i];
        const tempMin = Math.abs(rightSum - leftSum);
        if (tempMin < min) min = tempMin;
    })
    return min;
}
But then I was asked if the input array is of length 10 million, how would I solve this problem in a distributed environment?
I am new to distributed programming, need help in this.
If you have N nodes, then split the array into N sequential subarrays; this will give you N sequential sums. Take a pass to determine which subarray contains the desired split point. The difference between the "before" and "after" sums is your bias target for the next phase ...
Now divide that "middle" array into N pieces. Again, you look for the appropriate split point, except that now you know the exact result you'd like (since you have the array sum and your missing difference).
Repeat the step in the previous paragraph until the entire subarray fits on one node; at that point it's fastest to just finish the computation there.
You can speed this up somewhat by keeping a cumulative sum at each value; this will allow you to find the appropriate split point somewhat faster at each stage, as you can use a binary or interpolation search for every stage after the first.
Given an array of length N, and given M available nodes, divide the array into chunks of size N/M. Each node computes the sum of its chunk, and reports back. The total is computed by adding the partial sums. Then the total and the partial sums are distributed to each of the nodes. Each node determines the best split point within its chunk (the local minimum), and reports back. The global minimum is computed from the local minimums.
For example, if the array has 10 million entries, and 200 nodes are available, the chunk size is 50000. So each node receives 50000 numbers, and reports back the sum. The total of the array is computed by adding the 200 partial sums. Then each node is given the total, along with the 200 partial sums. The information at each node now consists of
- a chunk number
- the 50000 array entries for that chunk
- the array total
- the 200 partial sums
From that information, each node can compute its local minimum. The global minimum is computed from the 200 local minimums.
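A sketch of the per-node computation with that information (names are illustrative):

// Each node knows the sums of all chunks, so it can reconstruct the running
// left/right sums for every split point inside its own chunk.
function localMinimum(chunkIndex, chunk, total, partialSums) {
    let leftSum = partialSums.slice(0, chunkIndex).reduce((a, b) => a + b, 0);
    let rightSum = total - leftSum;
    let min = Math.abs(rightSum - leftSum); // split at the chunk's left edge
    for (const value of chunk) {
        leftSum += value;
        rightSum -= value;
        min = Math.min(min, Math.abs(rightSum - leftSum));
    }
    return min;
}

// Coordinator: the global answer is the smallest of the local minimums.
// const globalMin = Math.min(...localMinimums);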
In the ideal case, where network bandwidth is infinite, network latency is zero, and any number of nodes can be used, the chunk size should be sqrt(N). So each node receives sqrt(N) array elements, and then receives sqrt(N) partial sums. Under those ideal conditions, the running time is O(sqrt(N)) instead of O(N).
Of course, in the real world, it makes no sense to try to distribute a problem like this. The amount of time (per array element) to send the array elements over the network is significant. Much larger than the amount of time (per array element) needed to solve the problem on a single computer.
Assume the array is stored sequentially over several nodes N_1, ..., N_k. A simple distributed version of your original algorithm could be the following.
1. On each N_i, calculate the sum s_i of the subarray stored on N_i and send it to a control node M.
2. On node M, using s_1, ..., s_k, calculate leftSum_i and rightSum_i for the left subarray boundary of each N_i and send them back to N_i.
3. On each N_i, using leftSum_i and rightSum_i, search for the local minimum min_i and send it back to M.
4. On node M, calculate the global minimum min from min_1, ..., min_k.
A side note: your original algorithm can be optimized to keep only the value rightSum - leftSum rather than two separate values leftSum and rightSum. The distributed version can also be optimized correspondingly.
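For reference, a sketch of that optimization applied to the original single-machine function:

// Track only d = rightSum - leftSum; moving one element from the right part
// to the left part changes d by -2 * value.
function solution(a) {
    let d = a.reduce((acc, value) => acc + value, 0);
    let min = Math.abs(d);
    for (const value of a) {
        d -= 2 * value;
        min = Math.min(min, Math.abs(d));
    }
    return min;
}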
GOAL
Finding a good way to check if 2 images are similar by comparing their hash profiles. The hash is a simple array containing 0 and 1 values.
INTRO
I have 2 images. They are the same image but with some little differences: one has a different brightness, rotation and shot.
What I want to do is create a Javascript method to compare the 2 images and calculate a percentage value that tells how much they are similar.
WHAT I'VE DONE
After uploading the 2 images into an HTML5 canvas to get their image data, I've used the pHash algorithm (www.phash.org) to obtain their hash representation.
The hash is an array containing 0 and 1 values that recreates the image in a "simplified" form.
I've also created a JS script that generates an HTML table with black cells where the array contains 1. The result is the following screenshot (the image is a Van Gogh picture):
Screenshot
Now, what I should do is compare the 2 arrays to obtain a percentage value that tells "how much" they are similar.
Most of the JavaScript hash algorithms I've found by googling already include a comparison algorithm: the Hamming distance. It's very simple and fast, but not very precise. In fact, the Hamming distance says that the 2 images in my screenshot have a 67% similarity.
THE QUESTION
Starting with 2 simple arrays, with the same length, filled with 0 and 1 values: what could be a good algorithm to determine similarity more precisely?
NOTES
- Pure Javascript development, no third party plugins or framework.
- No need for a complex algorithm to find the right similarity when the 2 images are nominally the same but very different (strong rotation, totally different colors, etc.).
Thanx
PHASH CODE
// Size is the image size (for example 128px)
var pixels = [];
for (var i = 0; i < imgData.data.length; i += 4) {
    var j = (i == 0) ? 0 : i / 4;
    var y = Math.floor(j / size);
    var x = j - (y * size);
    var pixelPos = x + (y * size);
    var r = imgData.data[i];
    var g = imgData.data[i+1];
    var b = imgData.data[i+2];
    var gs = Math.floor((r * 0.299) + (g * 0.587) + (b * 0.114));
    pixels[pixelPos] = gs;
}

var avg = Math.floor( array_sum(pixels) / pixels.length );

var hash = [];
array.forEach(pixels, function(px, i){
    if (px > avg) {
        hash[i] = 1;
    } else {
        hash[i] = 0;
    }
});

return hash;
HAMMING DISTANCE CODE
// hash1 and hash2 are the arrays of the "coded" images.
var similarity = hash1.length;
array.forEach(hash1, function(val, key){
    if (hash1[key] != hash2[key]) {
        similarity--;
    }
});
var percentage = (similarity / hash1.length * 100).toFixed(2);
NOTE: array.forEach is not pure javascript. Consider it a replacement for: for (var i = 0; i < array.length; i++).
I'm using blockhash, and it seems pretty good so far; the only false positives I get are when half the pictures are of the same background color, which is to be expected =/
http://blockhash.io/
BlockHash may be slower than yours but it should be more accurate.
What you do is just calculate the greyscale of EACH pixel and compare it to the average to create your hash.
What BlockHash does is split the picture in small rectangles of equal size and averages the sum of the RGB values of the pixels inside them and compares them to 4 horizontal medians.
So it is normal that it takes longer, but it is still pretty efficient and accurate.
I'm doing it with pictures of a good resolution, at minimum 1000x800, and use 16 bits. This gives a 64-character hexadecimal hash. When using the Hamming distance provided by the same library, I see good results with a similarity threshold of 10.
Your idea of using greyscale isn't bad at all. But you should average out portions of the image instead of comparing individual pixels. That way you can compare a thumbnail version to its original and get pretty much the same pHash!
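A rough sketch of that idea, reusing the greyscale pixels array from the question's pHash code (it assumes size is divisible by blocks; the block count is an arbitrary choice):

// Average the greyscale values inside each blocks x blocks cell, then
// threshold each cell average against the overall mean to get the hash bits.
function blockAveragedHash(pixels, size, blocks) {
    const cell = size / blocks;
    const means = [];
    for (let by = 0; by < blocks; by++) {
        for (let bx = 0; bx < blocks; bx++) {
            let sum = 0;
            for (let y = 0; y < cell; y++) {
                for (let x = 0; x < cell; x++) {
                    sum += pixels[(by * cell + y) * size + (bx * cell + x)];
                }
            }
            means.push(sum / (cell * cell));
        }
    }
    const avg = means.reduce((a, b) => a + b, 0) / means.length;
    return means.map(m => (m > avg ? 1 : 0));
}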
I don't know if this can do the trick, but you can just compare the 0 and 1 similarities between the arrays:
const arr1 = [1,1,1,1,1,1,1,1,1,1],
      arr2 = [0,0,0,0,0,0,0,0,0,0],
      arr3 = [0,1,0,1,0,1,0,1,0,1],
      arr4 = [1,1,1,0,1,1,1,0,1,1];

const howSimilar = (a1, a2) => {
    let similarity = 0;
    a1.forEach((elem, index) => {
        if (a2[index] == elem) similarity++;
    });
    // Use a1.length (not arr1.length) so the function works for any pair.
    let percentage = Math.floor(similarity / a1.length * 100) + "%";
    console.log(percentage);
}

howSimilar(arr1, arr2); // 0%
howSimilar(arr1, arr3); // 50%
howSimilar(arr1, arr4); // 80%
A node process of mine receives a sample point every half a second, and I want to update the history chart of all the sample points I receive.
The chart should be an array which contains the downsampled history of all points from 0 to the current point.
In other words, the maximum length of the array should be l. If I received more sample points than l, I want the chart array to be a downsampled-to-l version of the whole history.
To express it with code:
const CHART_LENGTH = 2048

createChart(CHART_LENGTH)

onReceivePoint = function(p) {
    // p can be considered a number
    const chart = addPointToChart(p)
    // chart is an array representing all the samples received, from 0 to now
    console.assert(chart.length <= CHART_LENGTH)
}
I already have a working downsampling function with number arrays:
function downsample (arr, density) {
    const downsampled = []
    for (let i = 0; i < arr.length; i++) {
        const j = ~~(i / arr.length * density)
        if (downsampled[j] == null) downsampled[j] = 0
        downsampled[j] += Math.abs(arr[i] * density / arr.length)
    }
    return downsampled
}
One trivial way of doing this would obviously be to save all the points I receive into an array and apply the downsample function whenever the array grows. This would work, but since this piece of code would run on a server, possibly for months on end, it would eventually make the backing array grow so much that the process would run out of memory.
The question is: Is there a way to construct the chart array re-using the previous contents of the chart itself, to avoid mantaining a growing data structure? In other words, is there a constant memory complexity solution to this problem?
Please note that the chart must contain the whole history since sample point #0 at any moment, so charting the last n points would not be acceptable.
The only operation that does not distort the data and that can be used several times is aggregation of an integer number of adjacent samples. You probably want 2.
More specifically: if you find that adding a new sample would exceed the array bounds, do the following: start at the beginning of the array and average every two subsequent samples. This halves the array size and gives you space to add new samples. While doing so, you should keep track of the current cluster size c (the number of samples that constitute one entry in the array). It starts at one; every reduction multiplies the cluster size by two.
Now the problem is that you cannot add new samples directly to the array any more because they have a completely different scale. Instead, you should average the next c samples to a new entry. It turns out that it is sufficient to store the number of samples n in the current cluster to do this. So if you add a new sample s, you would do the following.
n++
if n = 1
    append s to array
else
    // update the average
    last array element += (s - last array element) / n
if n = c
    n = 0 // start a new cluster
So the memory that you actually need is the following:
- the history array with predefined length
- the number of elements in the history array
- the current cluster size c
- the number of elements in the current cluster n
The size of the additional memory does not depend on the total number of samples, hence O(1).
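A compact JavaScript sketch of the whole scheme (the class and method names are my own; maxLength is assumed to be even so the halving step pairs up cleanly):

class DownsampledHistory {
    constructor(maxLength) {
        this.maxLength = maxLength; // predefined chart length
        this.chart = [];            // the history array
        this.clusterSize = 1;       // c: samples per chart entry
        this.clusterCount = 0;      // n: samples in the currently open entry
    }

    add(sample) {
        if (this.clusterCount === 0 && this.chart.length === this.maxLength) {
            // Chart is full: average adjacent pairs to halve it.
            const halved = [];
            for (let i = 0; i < this.chart.length; i += 2) {
                halved.push((this.chart[i] + this.chart[i + 1]) / 2);
            }
            this.chart = halved;
            this.clusterSize *= 2;
        }
        this.clusterCount++;
        if (this.clusterCount === 1) {
            this.chart.push(sample); // start a new entry
        } else {
            // Incremental average of the open cluster.
            const last = this.chart.length - 1;
            this.chart[last] += (sample - this.chart[last]) / this.clusterCount;
        }
        if (this.clusterCount === this.clusterSize) this.clusterCount = 0;
    }
}

Calling add(p) for every incoming sample keeps chart at or below the chosen length while still covering the whole history, using only the constant-size state listed above.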