Why does this randomization method give skewed results? - javascript

I'm using two different randomization methods. One gives results whose variance is what I expect, while the other gives results that are skewed, and in a very consistent way too.
The methods:
function randomA() {
    var raw = Number((Math.random() + '').substr(2));
    return raw % NUM_OF_POSSIBLES;
}
function randomB() {
    var raw = Math.round(Math.random() * 10000);
    return raw % NUM_OF_POSSIBLES;
}
When NUM_OF_POSSIBLES = 2, the first method (randomA()) results in a rather consistent 64% of zeros and 36% of 1s, while randomB() does pretty much 50/50.
If NUM_OF_POSSIBLES = 5, the first method is again skewed in a pretty consistent way:
0: 10%, 1: 23%, 2: 22%, 3: 22%, 4: 23%, while the second one gives around 20% to each result.
You can find the full code with multiple tests here: jsfiddle
Why is the first method skewed, and also why is the skewing consistent?

I'm not entirely sure, but my guess is that it has to do with the rounding mode used when JavaScript formats a number as a string. In the first case, your result depends on the choice of last digit, which is sensitive to this rounding. If it's biased toward even numbers, that would explain your results. (In the case of NUM_OF_POSSIBLES == 5, it would be because of a deficit of 5s as the last digit.) In the second routine, the result depends on an intermediate digit of the string representation, which is pretty much isolated from that influence.
You might have better results by chopping off the last digit or two in the first routine.
EDIT I just confirmed experimentally that if the first routine is changed to chop off the last digit:
function randomA() {
    var raw = String(Math.random());
    raw = raw.substring(2, raw.length - 1);
    return raw % NUM_OF_POSSIBLES;
}
then the bias appears to be gone when NUM_OF_POSSIBLES == 2 or 5.
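For reference, here is a minimal sketch of the kind of frequency count such an experiment might use (countBuckets is a hypothetical helper, not code from the question's jsfiddle; it assumes randomA, randomB, and NUM_OF_POSSIBLES are defined as above):

function countBuckets(rng, trials) {
    // Tally how often each result comes out of a zero-argument random function.
    var counts = {};
    for (var i = 0; i < trials; i++) {
        var r = rng();
        counts[r] = (counts[r] || 0) + 1;
    }
    return counts;
}
console.log(countBuckets(randomA, 100000));
console.log(countBuckets(randomB, 100000));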

I have found that the reason randomA works this way is that JavaScript uses floating-point numbers with 52+1 significant bits (see the table under the Basic formats chapter). So when the digit string is converted back into too big a value, it is rounded; for example
Math.pow(2, 54) + 1
// 18014398509481984
Math.pow(2, 54)
// 18014398509481984
Math.pow(2, 54) - 1
// 18014398509481984
all return the same value, which is divisible by 2 (because of the rounding).
For more understanding, you can play around and see how these look in binary format. Examples:
parseInt(Math.pow(2, 54) - 2).toString(2)
// "111111111111111111111111111111111111111111111111111110"
parseInt(Math.pow(2, 54) - 3).toString(2)
// "111111111111111111111111111111111111111111111111111100"
parseInt(Math.pow(2, 54)).toString(2)
// "1000000000000000000000000000000000000000000000000000000"
parseInt(Math.pow(2, 54) - 1).toString(2)
// "1000000000000000000000000000000000000000000000000000000"

You get the bias because the results of Math.random() aren't always the same length, so, for example, both 0.123 and 0.1235 count towards the "ones" heap.
You might think this could be corrected by evening out the lengths with trailing zeroes, but that also won't be correct, because 0.123 could be a rounded 0.122999999999.
The real error of the first method is relying on the least significant digit of an imprecise fraction (both %2 and %5 are affected only by the last digit), which has suffered rounding errors when converted from binary to decimal for presentation.
The original, binary form of the fraction is probably uniformly distributed, but there's no way of reading it in Javascript.
Now, if someone would explain the distribution of trailing digits of a rounded decimal fraction...
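A quick sketch that makes the point visible by tallying the last decimal digit of Math.random()'s string form (an illustration only; the exact counts vary by engine):

var lastDigits = {};
for (var i = 0; i < 100000; i++) {
    var s = String(Math.random());
    var d = s.charAt(s.length - 1);
    lastDigits[d] = (lastDigits[d] || 0) + 1;
}
console.log(lastDigits); // typically far from a uniform 10% per digit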

Related

How to avoid "Infinity" and console.log a large number in Javascript?

I am trying to find the first number in the Fibonacci sequence to contain over 1000 digits.
Given a number n (e.g. 4), I found a way to find what place the first number with n digits has in the Fibonacci sequence, as well as a way to find the number given its place in the sequence.
Say, for example, you need to know the first number with 4 digits in the Fibonacci sequence as well as its place in the sequence. My code would work like this:
var phi = (1 + Math.sqrt(5)) / 2;
var nDigits = 4;
var fEntry = Math.ceil(2 + Math.log(Math.pow(10, nDigits - 1)) / Math.log(phi));
var fNumber = 2 * Math.pow(phi, fEntry);
console.log(fEntry);
console.log(fNumber);
In the console you would see fEntry (that is, the place the number has in the Fibonacci sequence) and fNumber (the number you're looking for). If you want to find the first number with 4 digits and its place in the sequence, for example, you'll get number 1597 at place 17, which is correct.
So far so good.
Problems arise when I want to find big numbers. I need to find the first number with 1000 digits in the Fibonacci sequence, but when I write nDigits = 1000 and run the code, the console displays "Infinity" for fEntry and for fNumber. I guess the reason is that my code involves calculations with numbers higher than what Javascript can deal with.
How can I find that number and avoid Infinity?
How can I find that number and avoid Infinity?
You can't, with the number type. Although it can hold massive values, it loses integer accuracy after Number.MAX_SAFE_INTEGER (9,007,199,254,740,991):
const a = Number.MAX_SAFE_INTEGER;
console.log(a); // 9007199254740991
console.log(a + 1); // 9007199254740992, so far so good
console.log(a + 2); // 9007199254740992, oh dear...
You can use the new BigInt on platforms that support it. Alternatively, use any of several "big int" libraries that store the numbers as strings of digits (literally).
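For instance, a minimal sketch using BigInt (an assumption that the platform supports it; firstFibWithDigits is a hypothetical helper, not the asker's code):

function firstFibWithDigits(nDigits) {
    var a = 1n, b = 1n, entry = 2;           // b is the entry-th Fibonacci number (F1 = F2 = 1)
    var limit = 10n ** BigInt(nDigits - 1);  // smallest integer with nDigits digits
    while (b < limit) {
        var next = a + b;                    // BigInt addition never loses precision
        a = b;
        b = next;
        entry++;
    }
    return { entry: entry, number: b };
}
console.log(firstFibWithDigits(4).entry);                       // 17 (the number is 1597n)
console.log(firstFibWithDigits(1000).number.toString().length); // 1000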

JavaScript toString limits

So my problem is this: I'm writing a program that checks if a number is even or odd without division. So I decided to take the number and turn it into a string with the
number.toString()
method. The problem I'm having is that if you put in a number that is about 17 or more digits long, the string is correct for about the first 17 digits, then it's just 0's and sometimes 2's. For example,
function toStr(number)
{
    return number.toString(10);
}
console.log(toStr(123456789123456789));
prints,
123456789123456780
any ideas?
The problem has nothing to do with strings or your function at all. Try going to your console and just entering the expression 123456789123456789 and pressing return.
You will likewise obtain 123456789123456780.
Why?
The expression 123456789123456789 within the JavaScript language is interpreted as a JavaScript number type, which can only be represented exactly to a certain number of base two significant figures. The input number happens to have more significant digits when expressed in base two than the number of base two significant figures available in JavaScript's representation of a number, and so the value is automatically rounded in base two as follows:
123456789123456789 =
110110110100110110100101110101100110100000101111100010101 (base two)
123456789123456780 =
110110110100110110100101110101100110100000101111100001100 (base two)
Note that you CAN accurately represent some numbers larger than a certain size in JavaScript, but only those numbers with no more significant figures in base two than JavaScript has room for. For instance, 2 times a very large power of 10, which would have only one significant figure in base two.
If you are designing this program to accept user input from a form or dialog box, then you will receive the input as a string. You only need to check the last digit in order to determine if the input number is odd or even (assuming it is indeed an integer to begin with). The other answer has suggested the standard way to obtain the last character of a string as well as the standard way to test if a string value is odd or even.
If you go beyond Javascript's max integer size (9007199254740992) you are asking for trouble: http://ecma262-5.com/ELS5_HTML.htm.
So to solve this problem, you must treat it as a string only. Then extract the last digit in the string and use it to determine whether the number is even or odd.
if(parseInt(("123456789123456789").slice(-1)) % 2)
//odd
else
//even
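The same idea as a reusable sketch (isEvenNumericString is a hypothetical name; the input is assumed to be a string of decimal digits):

function isEvenNumericString(s) {
    // Only the last decimal digit determines parity.
    return parseInt(s.slice(-1), 10) % 2 === 0;
}
console.log(isEvenNumericString("123456789123456789")); // false (it's odd)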
It's a 64-bit floating point number, using the IEEE 754 specification. A feature of this spec is that starting at 2^53 the smallest distance between two numbers is 2.
var x = Math.pow(2, 53);
console.log( x == x + 1 );
This difference is the value of the unit in the last place, or ULP.
This is similar in principle to trying to store fractional values in integral types in other languages; values like .5 can't be represented, so they are discarded. With integers, the ULP value is always 1; with floating point, the ULP value depends on how big or small the number you're trying to represent.
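A small sketch of that magnitude-dependent ULP (assuming the IEEE 754 double format described above):

console.log(1 + Math.pow(2, -52) > 1);                // true: the ULP of 1 is 2^-52
console.log(Math.pow(2, 53) + 1 === Math.pow(2, 53)); // true: at 2^53 the ULP is 2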

JavaScript Math.floor: how to guarantee a number will round down?

I want to normalize an array so that each value is in [0, 1), i.e. "the max will never be 1 but the min can be 0."
This is not unlike the random function returning numbers in the same range.
While looking at this, I found that .99999999999999999===1 is true!
Ditto (1 - Number.MIN_VALUE) === 1. But Math.ceil(Number.MIN_VALUE) is 1, as it should be.
Some others: Math.floor(.999999999999) is 0
while Math.floor(.99999999999999999) is 1
OK so there are rounding problems in JS.
Is there any way I can normalize a set of numbers to lie in the range [0,1)?
It may help to examine the steps that JavaScript performs for each of your expressions.
In .99999999999999999===1:
The source text .99999999999999999 is converted to a Number. The closest Number is 1, so that is the result. (The next closest Number is 0.99999999999999988897769753748434595763683319091796875, which is 1 - 2^-53.)
Then 1 is compared to 1. The result is true.
In (1-Number.MIN_VALUE) === 1:
Number.MIN_VALUE is 2^-1074, about 5e-324.
1 - 2^-1074 is extremely close to one. The exact value cannot be represented as a Number, so the nearest value is used. Again, the nearest value is 1.
Then 1 is compared to 1. The result is true.
In Math.ceil(Number.MIN_VALUE):
Number.MIN_VALUE is 2^-1074, about 5e-324.
The ceiling function of that value is 1.
In Math.floor(.999999999999):
The source text .999999999999 is converted to a Number. The closest Number is 0.99999999999900002212172012150404043495655059814453125, so that is the result.
The floor function of that value is 0.
In Math.floor(.99999999999999999):
The source text .99999999999999999 is converted to a Number. The closest Number is 1, so that is the result.
The floor function of 1 is 1.
There are only two surprising things here, at most. One is that the numerals in the source text are converted to internal Number values. But this should not be surprising. Of course text has to be converted to internal representations of numbers, and the Number type cannot perfectly store all the infinitely many numbers. So it has to round. And of course numbers very near 1 round to 1.
The other possibly surprising thing is that 1-Number.MIN_VALUE is 1. But this is actually the same issue: The exact result is not representable, but it is very near 1, so 1 is used.
The Math.floor function works correctly. It never introduces any error, and you do not have to do anything to guarantee that it will round down. It always does.
However, since you want to normalize numbers, it seems likely you are going to divide numbers at some point. When you divide, there may be rounding problems, because many results of division are not exactly representable, so they must be rounded.
However, that is a separate problem, and you have not given enough information in this question to address the specific calculations you plan to do. You should open a separate question for it.
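That said, here is a minimal sketch of one common approach (an assumption about the intended calculation, not the asker's code): dividing by something strictly larger than the maximum keeps every result below 1.

function normalize(arr) {
    var max = Math.max.apply(null, arr);
    // Dividing by max would map the largest element to exactly 1;
    // dividing by max + 1 keeps every result in [0, 1) for non-negative input.
    return arr.map(function (x) { return x / (max + 1); });
}
console.log(normalize([0, 5, 10])); // [0, 0.45454545..., 0.90909090...]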
Javascript will treat any number between 0.999999999999999994 and 1 as 1, so just subtract .000000000000000006.
Of course that's not as easy as it sounds, since .000000000000000006 is evaluated as 0 in Javascript, so you could do something like:
function trueFloor(x)
{
    x = x * 100;
    if (x > .0000000000000006)
        x = x - .0000000000000006;
    x = Math.floor(x / 100);
    return x;
}
EDIT: Or at least you'd think you could. Apparently JS casts .99999999999999999 to 1 before passing it to a function, so you'd have to try something like:
trueFloor("0.99999999999999999")
function trueFloor(str)
{
x=str.substring(0,9) + 0;
return Math.floor(x); //=> 0
}
Not sure why you'd need that level of precision, but in theory, I guess it works. You can see a working fiddle here
As long as you cast your insanely precise float as a string, that's probably your best bet.
Please understand one thing: this...
.999999999999999999
... is just a Number literal. Just as
.999999999999999998
.999999999999999997
.999999999999999996
...
... you see the pattern.
How JavaScript treats these literals is completely another story. And yes, this treatment is limited by the number of bits that can be used to store a Number value.
The number of possible floating point literals is infinite by definition, no matter how small a range is set for them. For example, take the ones shown above: how many numbers very close to 1 can you express? Right, it's infinite: just keep appending 9 to the line.
But the container for each Number value is quite finite: it has 64 bits. That means it can store at most 2^64 different values (Infinity, -Infinity and NaN among them), and that's all.
You want to work with such literals anyway? Use Strings to store them, not Numbers, and some BigMath JS library (take your pick) to work with those values, as Strings, again.
But from your question it looks like you're not, as you talked about an array of Numbers, that is, Number values. And there is no way .999999999999999999 can be stored there, as there is no such Number value in JavaScript.

Odds to get the usually excluded upper-bound with Math.random()

This may look more like a math question but as it is exclusively linked to Javascript's pseudo-random number generator I guess it is a good fit for SO. If not, feel free to move it elsewhere.
First off, I'm aware that ES does not specify the algorithm to be used in the pseudo-random number generator - Math.random() -, but it does specify that the range should have an approximate uniform distribution:
15.8.2.14 random ( )
Returns a Number value with positive sign, greater than or equal to 0 but less than 1, chosen randomly or pseudo randomly with approximately uniform distribution over that range, using an implementation-dependent algorithm or strategy. This function takes no arguments.
So far, so good. Now I've recently stumbled upon this piece of data from MDN:
Note that as numbers in JavaScript are IEEE 754 floating point numbers with round-to-nearest-even behavior, these ranges, excluding the one for Math.random() itself, aren't exact, and depending on the bounds it's possible in extremely rare cases (on the order of 1 in 2^62) to calculate the usually-excluded upper bound.
Okay. It led me to some testing, the results are (obviously) the same on Chrome console and Firefox's Firebug:
>> 0.99999999999999995
1
>> 0.999999999999999945
1
>> 0.999999999999999944
0.9999999999999999
Let's put it in a simple practical example to make my question more clear:
Math.floor(Math.random() * 1)
Considering the code above, IEEE 754 floating point numbers with round-to-nearest-even behavior, under the assessment of Math.random() range being evenly distributed, I concluded that the odds for it to return the usually excluded upper bound (1 in my code above) would be 0.000000000000000055555..., that is approximately 1/18,000,000,000,000,000.
Looking at the MDN number now, 1/2^62 evaluates to 1/4,611,686,018,427,387,904, that is, over 200 times smaller than the result from my calc.
Am I doing the wrong math? Is Firefox's pseudo-random number generator just not evenly distributed enough as to generate this 200 times difference?
I know how to work around this and I'm aware that such small odds shouldn't even be considered for everyday uses, but I'd love to understand what is going on here and whether my math is broken or Mozilla's (I hope it is the former). =] Any input is appreciated.
You should not be worried about rounding the number from Math.random() up to 1.
When I looked at the implementations (inferred from the results I am getting) in the current versions of IE, Chrome, and FF, several observations make it almost certain that you should always get a number in the interval 0 to 0.11111111111111111111111111111111111111111111111111111 in binary (which is 0.999999999999999944.toString(2); a few smaller decimal numbers convert to the same string, btw).
Chrome: Here it is simple. It generates numbers by generating a 32-bit number and dividing it by 1 << 32. (You can see that (1 << 30) * 4 * Math.random() always returns a whole number.)
FF: Here it seems that the number is always generated to be at most 0.11... (53 ones) and it really uses just those 53 binary places. (You can see that Math.random().toString(2).length - 2 does not return more than 53.)
IE: Here it is very similar to FF, except that the number of places can be more if the first digits after the decimal dot are 0, and those will not round to 1 for sure. (You can see that Math.random().toString(2).match(/1[01]*$/)[0].length does not return more than 53.)
I think (although I cannot provide any proof now) that any implementation should fall into one of the groups described and have no problem with rounding to 1.
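For reference, a sketch of the probes described above (an illustration; the results depend on the engine and version, so treat them as observations of the engines of the time rather than guarantees):

// Chrome-era probe: true whenever Math.random() is k / 2^32 for an integer k.
console.log((1 << 30) * 4 * Math.random() % 1 === 0);
// FF-era probe: the number of binary digits after "0." (at most 53 there).
console.log(Math.random().toString(2).length - 2);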

2.9999999999999999 >> .5?

I heard that you could right-shift a number by .5 instead of using Math.floor(). I decided to check its limits to make sure that it was a suitable replacement, so I checked the following values and got the following results in Google Chrome:
2.5 >> .5 == 2;
2.9999 >> .5 == 2;
2.999999999999999 >> .5 == 2; // 15 9s
2.9999999999999999 >> .5 == 3; // 16 9s
After some fiddling, I found out that the highest possible value of two which, when right-shifted by .5, would yield 2 is 2.9999999999999997779553950749686919152736663818359374999999¯ (with the 9 repeating) in Chrome and Firefox. The number is 2.9999999999999997779¯ in IE.
My question is: what is the significance of the number .0000000000000007779553950749686919152736663818359374? It's a very strange number and it really piqued my curiosity.
I've been trying to find an answer or at least some kind of pattern, but I think my problem lies in the fact that I really don't understand the bitwise operation. I understand the idea in principle, but shifting a bit sequence by .5 doesn't make any sense at all to me. Any help is appreciated.
For the record, the weird digit sequence changes with 2^x. The highest possible values of the following numbers that still truncate properly:
for 0: 0.9999999999999999444888487687421729788184165954589843749¯
for 1: 1.9999999999999999888977697537484345957636833190917968749¯
for 2-3: x+.99999999999999977795539507496869191527366638183593749¯
for 4-7: x+.9999999999999995559107901499373838305473327636718749¯
for 8-15: x+.999999999999999111821580299874767661094665527343749¯
...and so forth
Actually, you're simply ending up doing a floor() on the first operand, without any floating point operations going on. Since the left shift and right shift bitwise operations only make sense with integer operands, the JavaScript engine converts the two operands to integers first:
2.999999 >> 0.5
Becomes:
Math.floor(2.999999) >> Math.floor(0.5)
Which in turn is:
2 >> 0
Shifting by 0 bits means "don't do a shift" and therefore you end up with the first operand, simply truncated to an integer.
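A quick sketch of that equivalence (for non-negative numbers; note that the internal ToInt32 conversion truncates toward zero, so negative values would differ from Math.floor):

console.log(2.999999 >> 0.5);                            // 2
console.log(2.999999 >> 0);                              // 2, the same thing
console.log((2.999999 >> 0.5) === Math.floor(2.999999)); // true for non-negative input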
The SpiderMonkey source code has:
switch (op) {
  case JSOP_LSH:
  case JSOP_RSH:
    if (!js_DoubleToECMAInt32(cx, d, &i))  // Same as Math.floor()
        return JS_FALSE;
    if (!js_DoubleToECMAInt32(cx, d2, &j)) // Same as Math.floor()
        return JS_FALSE;
    j &= 31;
    d = (op == JSOP_LSH) ? i << j : i >> j;
    break;
Your seeing a "rounding up" with certain numbers is due to the fact that the JavaScript engine can't handle decimal digits beyond a certain precision, and therefore your number ends up getting rounded up to the next integer. Try this in your browser:
alert(2.999999999999999);
You'll get 2.999999999999999. Now try adding one more 9:
alert(2.9999999999999999);
You'll get a 3.
This is possibly the single worst idea I have ever seen. Its only possible purpose for existing is winning an obfuscated code contest. There's no significance to the long numbers you posted -- they're an artifact of the underlying floating-point implementation, filtered through god-knows how many intermediate layers. Bit-shifting by a fractional number of bits is insane and I'm surprised it doesn't raise an exception -- but that's JavaScript, always willing to redefine "insane".
If I were you, I'd avoid ever using this "feature". Its only value is as a possible root cause for an unusual error condition. Use Math.floor() and take pity on the next programmer who will maintain the code.
Confirming a couple suspicions I had when reading the question:
Right-shifting any fractional number x by any fractional number y will simply truncate x, giving the same result as Math.floor() while thoroughly confusing the reader.
2.999999999999999777955395074968691915... is simply the largest number that can be differentiated from "3". Try evaluating it by itself -- if you add anything to it, it will evaluate to 3. This is an artifact of the browser and local system's floating-point implementation.
If you wanna go deeper, read "What Every Computer Scientist Should Know About Floating-Point Arithmetic": https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html
Try this javascript out:
alert(parseFloat("2.9999999999999997779553950749686919152736663818359374999999"));
Then try this:
alert(parseFloat("2.9999999999999997779553950749686919152736663818359375"));
What you are seeing is simple floating point inaccuracy. For more information about that, see this for example: http://en.wikipedia.org/wiki/Floating_point#Accuracy_problems.
The basic issue is that the closest a floating point value can get to representing the second number is greater than or equal to 3, whereas the closest a float can get to the first number is strictly less than three.
As for why right shifting by 0.5 does anything sane at all, it seems that 0.5 is just itself getting converted to an int (0) beforehand. Then the original float (2.999...) is getting converted to an int by truncation, as usual.
I don't think your right shift is relevant. You are simply beyond the resolution of a double precision floating point constant.
In Chrome:
var x = 2.999999999999999777955395074968691915273666381835937499999;
var y = 2.9999999999999997779553950749686919152736663818359375;
document.write("x=" + x);
document.write(" y=" + y);
Prints out: x=2.9999999999999996 y=3
The shift right operator only operates on integers (both sides). So, shifting right by .5 bits should be exactly equivalent to shifting right by 0 bits. And, the left hand side is converted to an integer before the shift operation, which does the same thing as Math.floor().
I suspect that converting 2.9999999999999997779553950749686919152736663818359374999999 to its binary representation would be enlightening. It's probably only 1 bit different from true 3.
Good guess, but no cigar.
As the double precision FP number has 53 bits, the last FP number before 3 is actually
(exact): 2.999999999999999555910790149937383830547332763671875
But why, then, is the threshold
2.9999999999999997779553950749686919152736663818359375
(and this is exact, not 49999...!)
which is higher than that last representable number? Rounding. The conversion routine (string to number) is simply correctly programmed to round the input to the nearest floating point number:
2.999999999999999555910790149937383830547332763671875
.......(values between, increasing) -> round down
2.9999999999999997779553950749686919152736663818359375
....... (values between, increasing) -> round up to 3
3
The conversion input must use full precision. If the number is exactly halfway between those two FP numbers (which is 2.9999999999999997779553950749686919152736663818359375), the rounding depends on the flags that are set. The default rounding mode is round to even, meaning that the number will be rounded to the neighbor whose mantissa is even.
Now
3 = 11. (binary)
2.999... = 10.11111111111...... (binary)
All bits are set, so the mantissa of the number just below 3 is odd. That means the exact halfway value will be rounded up to 3 (whose mantissa is even), and that's why you get the strange .....49999 period: the input must be smaller than the exact halfway point to be distinguishable from 3.
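A small sketch of this behavior (assuming the default round-to-nearest-even mode of IEEE 754):

var below3 = 3 - Math.pow(2, -51);       // largest double below 3, exactly representable
console.log(below3);                     // 2.9999999999999996
console.log(3 - Math.pow(2, -52) === 3); // true: the exact halfway value
                                         // rounds up to 3 (even mantissa)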
I suspect that converting 2.9999999999999997779553950749686919152736663818359374999999 to its binary representation would be enlightening. It's probably only 1 bit different from true 3.
And to add to John's answer, the odds of this being more performant than Math.floor are vanishingly small.
I don't know if JavaScript uses floating-point numbers or some kind of infinite-precision library, but either way, you're going to get rounding errors on an operation like this -- even if it's pretty well defined.
It should be noted that the number ".0000000000000007779553950749686919152736663818359374" is quite possibly the Epsilon, defined as "the smallest number E such that (1+E) > 1."
