How random is JavaScript's Math.random? - javascript

For 6 years I've had a random number generator page on my website. For a long time, it was the first or second result on Google for "random number generator" and has been used to decide dozens, if not hundreds of contests and drawings on discussion forums and blogs (I know because I see the referrers in my web logs and usually go take a look).
Today, someone emailed me to tell me it may not be as random as I thought. She tried generating very large random numbers (e.g., between 1 and 10000000000000000000) and found that they almost always had the same number of digits. Indeed, I wrapped the function in a loop so I could generate thousands of numbers, and sure enough, for very large numbers, the variation was only about 2 orders of magnitude.
Why?
Here is the looping version, so you can try it out for yourself:
http://andrew.hedges.name/experiments/random/randomness.html
It includes both a straightforward implementation taken from the Mozilla Developer Network and some code from 1997 that I swiped off a web page that no longer exists (Paul Houle's "Central Randomizer 1.3"). View source to see how each method works.
I've read here and elsewhere about Mersenne Twister. What I'm interested in is why there wouldn't be greater variation in the results from JavaScript's built-in Math.random function. Thanks!

Given numbers between 1 and 100.
9 have 1 digit (1-9)
90 have 2 digits (10-99)
1 has 3 digits (100)
Given numbers between 1 and 1000.
9 have 1 digit
90 have 2 digits
900 have 3 digits
1 has 4 digits
and so on.
So if you select some at random, the vast majority of selected numbers will have the same number of digits, because the vast majority of possible values have the same number of digits.
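You can verify this with a quick sketch that tallies digit lengths over many draws (the 1..10^10 range here is just an example):

const counts = {};
for (let i = 0; i < 100000; i++) {
  const n = Math.floor(Math.random() * 1e10) + 1; // uniform integer in 1..10^10
  const len = String(n).length;
  counts[len] = (counts[len] || 0) + 1;
}
console.log(counts); // roughly 90% have 10 digits, 9% have 9 digits, 0.9% have 8, ...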

Your results are actually expected. If the random numbers are uniformly distributed in the range 1 to 10^n, then you would expect about 9/10 of the numbers to have n digits, and a further 9/100 to have n-1 digits.

There are different types of randomness. Math.random gives you a uniform distribution of numbers.
If you want different orders of magnitude, I would suggest using an exponential function to create what's called a power law distribution:
function random_powerlaw(mini, maxi) {
  // Uniform in log space: each order of magnitude is equally likely.
  return Math.ceil(Math.exp(Math.random() * (Math.log(maxi) - Math.log(mini))) * mini);
}
This function should give you roughly the same number of 1-digit numbers as 2-digit numbers and as 3-digit numbers.
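A quick tally bears that claim out (a sketch using the function above):

const tally = {};
for (let i = 0; i < 100000; i++) {
  const len = String(random_powerlaw(1, 1000)).length;
  tally[len] = (tally[len] || 0) + 1;
}
console.log(tally); // 1-, 2- and 3-digit results appear in roughly equal numbers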
There are also other distributions for random numbers like the normal distribution (also called Gaussian distribution).

Looks perfectly random to me!
(Hint: It's browser dependent.)
Personally, I think my implementation would be better, although I stole it from XKCD, which should ALWAYS be acknowledged:
function random() {
  return 4; // Chosen by a fair dice throw. Guaranteed to be random.
}

The following paper explains how Math.random() in major Web browsers is (un)secure: "Temporary user tracking in major browsers and Cross-domain information leakage and attacks" by Amit Klein (2008). It's no stronger than the typical Java or Windows built-in PRNG functions.
On the other hand, implementing SFMT with period 2^19937-1 requires 2496 bytes of internal state to be maintained for each PRNG sequence, which some people may consider an unforgivable cost.

If you use a number like 10000000000000000000 you're going beyond the precision of the IEEE 754 double that JavaScript uses for all numbers. Note that all the numbers generated end in "00".
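You can check the limit directly in a console; integers are exact only up to 2^53 - 1:

console.log(Number.MAX_SAFE_INTEGER);               // 9007199254740991, i.e. 2^53 - 1
console.log(9007199254740992 === 9007199254740993); // true: 2^53 + 1 is not representable
console.log(10000000000000000000 === 10000000000000000000 + 1); // true: the +1 is lost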

I tried the JS pseudorandom number generator on the Chaos Game. My Sierpiński triangle says it's pretty random.
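For reference, the chaos game needs only a few lines. A minimal sketch, assuming a 400x400 <canvas id="c"> element on the page:

const ctx = document.getElementById("c").getContext("2d");
const verts = [[200, 0], [0, 400], [400, 400]]; // corners of the triangle
let x = 200, y = 200;                           // arbitrary starting point
for (let i = 0; i < 50000; i++) {
  const [vx, vy] = verts[Math.floor(Math.random() * 3)]; // pick a corner at random
  x = (x + vx) / 2; // jump halfway toward it
  y = (y + vy) / 2;
  ctx.fillRect(x, y, 1, 1);
}

If Math.random() is reasonably uniform, the familiar fractal fills in; a biased generator shows up as visible clumps or gaps.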

Well, if you are generating numbers up to, say, 1e6, you will hopefully get all numbers with approximately equal probability. That also means that you only have a one in ten chance of getting a number with one digit less. A one in a hundred chance of getting two digits less, etc. I doubt you will see much difference when using another RNG, because you have a uniform distribution across the numbers, not their logarithm.

Non-random numbers uniformly distributed from 1 to N have the same property. Note that (in some sense) it's a matter of precision. A uniform distribution on 0-99 (as integers) does have 90% of its numbers having two digits. A uniform distribution on 0-999999 has 90% of its numbers having six digits.
Any set of numbers (under some not too restrictive conditions) has a density. When someone wants to discuss "random" numbers, the density of these numbers should be specified (as noted above). A common density is the uniform density. There are others: the exponential density, the normal density, etc. One must choose which density is relevant before proposing a random number generator. Also, numbers coming from one density can often be easily transformed to another density by various means.

Related

What is the probability of a random generator repeating more than once?

Imagine we have two independent pseudo-random number generators using the same algorithm but seeded differently, and we are generating numbers of the same size with these generators, say 32-bit integers. Provided the algorithm gives us a uniform distribution, there is a 1/2^32 probability (or is it?) of a collision. If a collision has just happened, what is the probability that the very next pair will also be a collision? It seems to me this probability might be different from (higher than) that initial uniform-based collision chance. Most currently existing pseudo-random number generators hold internal state to maintain their own stability, and a collision that just happened might signal that those internal states are somewhat "entangled", giving a modified (higher) chance of a collision happening again.
The question is probably too broad to give any precise answer, but revealing general directions/trends could also be nice. Here are some interesting aspects:
Does the size of the initial collision matter? Is there a difference after a collision of 8 consecutive bits vs 64 bits? Approximately how does the chance of the next collision depend on the size of the generated sequence?
Does the pattern of pair generation matter? For example, we could find the initial collision by executing the first generator only once and "searching" the second generator, or we could invoke each generator on every iteration.
I'm particularly interested in the default JavaScript Math.random(). 32-bit integers can be generated from it like this (for example). EDIT: As pointed out in the comments, conversion of a random value from the [0; 1) range should be done carefully, as the exponent of such values is very likely to repeat (and it takes up a decent part of the result extracted this way).
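For illustration, one common way to do the extraction (a sketch; not necessarily what the dead link showed, and it assumes the engine's generator carries at least 32 bits of randomness):

function randomUint32() {
  // Math.random() is uniform in [0, 1), so the product is uniform in [0, 2^32).
  // ">>> 0" truncates it to an unsigned 32-bit integer.
  return (Math.random() * 0x100000000) >>> 0;
}

This uses the high-order bits of the double rather than reinterpreting its raw bit pattern, which sidesteps the repeating-exponent problem mentioned in the edit above.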

How to handle floating points in a JavaScript calculator?

Why is this not a duplicate of these great SO articles?
While the two posts linked in the comments below are excellent I am specifically looking for information that helps me to address this issue in native JS. I know that JS shouldn't be the first choice for complex math, but given the limitation that this calculator is meant to run in the browser it is the tool that I have decided to work with.
Background
I'm trying to make a calculator with TypeScript without any libraries (like Big.js) and without using string concatenation in the inner logic of the calculator.
Examples
When a user wants to type the number 8.5:
The 8 key is pressed
The decimal key is pressed
The 5 key is pressed
Mathematically I create this number in the display with the following snippet:
8 + 5 * 0.1
This works but if I continue down the decimal places I encounter something unexpected:
8.5 + 5 * 0.01 // 8.55
8.55 + 5 * 0.001 // 8.555000000000001
Question
What is the best way to handle this without converting the number to a string? Is there an intelligent way to impose a limit on the precision of the calculator so that it only supports accuracy to so many decimal places?
Thanks for your help!
Use .toFixed():
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Number/toFixed
or .toPrecision():
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Number/toPrecision
depending on your needs.
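For the numbers from the question (note that both methods return strings, not numbers):

(8.55 + 5 * 0.001).toFixed(3); // "8.555" (the stray ...000000000001 is rounded away)
(8.555).toPrecision(4);        // "8.555"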
Note that you don't need to convert numbers all the time. The only place you need to convert is for the final output to the view, and in that case you can even leave the value in string format.
That answers the question of how to manage the issue (as in the description). As for why we get such a result, the first comment provides a great answer.
The easiest way to get the Number value that is closest to what the user enters is to build up a numeral in a string from the user's keypresses and then convert it with Number().
Numbers such as 8.55 or 8.555 are not exactly representable in the Number format. The closest values are 8.550000000000000710542735760100185871124267578125 and 8.55499999999999971578290569595992565155029296875. Converting the strings "8.55" and "8.555" with Number() should produce exactly these values.
Because these are the closest representable values, no calculation or algorithm can produce any closer values in the Number format.
For simple additions, subtractions, and limited multiplications, you can mimic decimal arithmetic by rounding to a desired number of decimal digits after each Number operation. However, this is generally not a feasible approach because other operations and various sequences of operations will exceed the ability to mimic decimal arithmetic reasonably.
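A minimal sketch of that per-operation rounding (the roundTo helper is a made-up name, not from the question):

function roundTo(x, places) {
  const factor = 10 ** places;
  return Math.round(x * factor) / factor;
}

let display = 8;
display = roundTo(display + 5 * 0.1, 4);   // 8.5
display = roundTo(display + 5 * 0.01, 4);  // 8.55
display = roundTo(display + 5 * 0.001, 4); // 8.555, not 8.555000000000001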

How to implement parseFloat

I'm wondering how a low-level implementation of parseFloat, such as the one in JavaScript, works.
All the examples I've seen of typecasting resort to using it at some point, such as this, this, or this. On the other hand, there is this file which is quite large (from here).
I'm wondering whether it is just a very complicated function or whether there is a straightforward implementation, and, if it is too complicated, just generally how it works.
Perhaps this is closer to it.
The essential mathematics of parseFloat is very simple, requiring no more than elementary-school arithmetic. If we have a decimal numeral, we can easily convert it to binary by:
Divide the integer part by two. The remainder (zero or one) becomes a bit in a binary numeral we are building. The quotient replaces the integer part, and we repeat until the integer part is zero. For example, starting with 13, we divide to get a quotient of 6 and a remainder of 1. Then we divide 6 to get a quotient of 3 and a remainder of 0. Then 1 and 1, then 0 and 1, and we are done. The bits we produced, in reverse order, were 1101, and that is the binary numeral for 13.
Multiply the sub-integer part by two. The integer part becomes another bit in the binary numeral. Repeat with the sub-integer part until it is zero or we have enough bits to determine the result. For example, with .1875, we multiply by two to get .375, which has an integer part of 0. Doubling again produces .75, which again has an integer part of 0. Next we get 1.5, which has an integer part of 1. Now when the sub-integer part, .5, is doubled, we get 1 with a sub-integer part of 0. The new bits are .0011.
To determine a floating-point number, we need as many bits as fit in the significand (starting with the leading 1 bit from the binary numeral), and, for rounding purposes, we need to know the next bit and whether any bits after that are non-zero. (The information about the extra bits tells us whether the difference between the source value and the bits that fit in the significand is zero, not zero but less than 1/2 of the lowest bit that fits, exactly 1/2 of the lowest bit, or more than 1/2 of the lowest bit. This information is enough to decide whether to round up or down in any of the usual rounding modes.)
The information above tells you when to stop multiplying in the second part of the algorithm. As soon as you have all the significand bits, plus one more, plus you have either one non-zero bit or the sub-integer part is zero, you have all the information you need and can stop.
Then you construct a floating-point value by rounding the bits according to whatever rounding rule you are using (often round-to-nearest-ties-to-even), putting the bits into the significand of a floating-point object, and setting the exponent to record the position of the leading bit of the binary numeral.
There are some embellishments for checking for overflow or underflow or handling subnormal values. However, the basic arithmetic is simply elementary-school arithmetic.
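To illustrate the multiply-by-two half of the algorithm, here is a sketch that keeps the decimal digits in an array (so nothing is lost to floating-point) and produces the leading bits of the fraction; the function name is made up for this example:

function fractionToBinaryBits(fraction, maxBits) {
  // "fraction" holds the decimal digits after the point, e.g. "1875" for .1875.
  const digits = fraction.split("").map(Number);
  const bits = [];
  while (bits.length < maxBits && digits.some(d => d !== 0)) {
    // Double the decimal numeral, schoolbook style, right to left.
    let carry = 0;
    for (let i = digits.length - 1; i >= 0; i--) {
      const doubled = digits[i] * 2 + carry;
      digits[i] = doubled % 10;
      carry = Math.floor(doubled / 10);
    }
    bits.push(carry); // the carry out of the top digit is the next binary bit
  }
  return bits.join("");
}

console.log(fractionToBinaryBits("1875", 8)); // "0011", i.e. .0011 in binary

The integer part works the same way, with repeated division by two as described above.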
Problems arise because the above uses arbitrary-size arrays and because it does not support scientific notation where an “e” is used to introduce a decimal exponent, as in “2.79e34”. The above algorithm requires that we maintain all the space needed to multiply and divide decimal numerals of any length given to us. Usually, we do not want to do that, and we also want faster algorithms. Note that supporting scientific notation with the above algorithm would also require arbitrary-size arrays. To fill out the decimal numeral for “2.79e34”, we have to fill an array with “27900000000000000000000000000000000”.
So algorithms are developed to do the conversion in smarter ways. Instead of doing exact calculations, we may do limited-precision calculations but carefully analyze the errors produced to ensure they are too small to prevent us from getting the right answer. Also, data may be prepared in advance, such as tables with information about powers of ten, so that we have approximate values of powers of ten already in binary without having to compute them each time a conversion is performed.
The complications of converting decimal to binary floating-point arise out of this desire for algorithms that are fast and use limited resources. Allowing some errors creates a need for mathematical proofs to ensure the computations are correct, and trying to make the routines fast and resource-efficient leads people to think of clever techniques to use, which become tricky and require proof.

Optimal compression for a large base 10 number contained in a string

I am writing compression and decompression functions for strings containing base 10 digits. I figure that, since only 10 distinct characters are being acted upon, there should exist a much smaller string that can represent large strings. The compressed result is encoded in ISO-8859-7, so I can use 256 characters in the result string.
For example, I want to take a string that represents a 1000-digit number (this one, for example) and "compress" it. Numbers of these lengths exceed the number type in the language that I am working in, JavaScript. As such, numeric manipulation/conversion is out of the question. The compression software I use (shoco) does not compress numbers. At all.
How might I go about doing this? Is there a certain algorithm that can be used to compress numbers? I am not looking for speed, but rather for optimal compression for a majority of numbers, not just the number given as an example.
If you work on the number in groups of three digits, you can represent each triplet in 10 bits with very little wastage. Then you "just" need to create a stream of 8-bit octets from your stream of 10-bit triples, which will require a certain amount of bit-shifting, but is not awfully complicated.
That assumes that your number consists of a multiple of 3 digits (you could pad it with leading zeros) or that you know how many digits it contains (in which case you could pad it at the end with trailing zeros). If you encoded subsequences into 50 bit units, you would have enough codespace to encode digit sequences of up to 15 digits, not just exactly 15 digits, which would avoid the need to pad. You could just barely get away with that in a language which uses 53-bit floating point as a common numeric type, but it might or might not be worth the extra complication.
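Here is a sketch of that packing (packDigits is a made-up name; it pads with leading zeros to a multiple of three digits, as suggested above):

function packDigits(str) {
  const padded = str.padStart(Math.ceil(str.length / 3) * 3, "0");
  const bytes = [];
  let acc = 0, accBits = 0; // accumulator of pending bits
  for (let i = 0; i < padded.length; i += 3) {
    acc = (acc << 10) | Number(padded.slice(i, i + 3)); // 0..999 fits in 10 bits
    accBits += 10;
    while (accBits >= 8) {
      bytes.push((acc >>> (accBits - 8)) & 0xff); // emit the top 8 pending bits
      accBits -= 8;
      acc &= (1 << accBits) - 1; // drop the emitted bits
    }
  }
  if (accBits > 0) bytes.push((acc << (8 - accBits)) & 0xff); // zero-padded tail
  return Uint8Array.from(bytes);
}

console.log(packDigits("1234567890").length); // 5 bytes: 4 triplets = 40 bits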
rici's answer, using 10 bits for every three digits, is indeed what I would use for a practical application.
However since you asked for the optimal compression and stated that you don't care about speed, that would be generating a binary representation of the decimal number using multiple precision arithmetic. This code has already been written for you in the GMP library. That library is highly optimized and quite fast, so you would not see a huge speed impact, depending on what else you're doing with the numbers.
As an example your 1000-digit number would take 418 bytes to code using 334 sets of 10 bits. It would take 416 bytes when encoded as a single, large, binary integer. On a 2 GHz i7, I get 1.9 µs for the 1000-digit conversion using sets of 10 bits, vs. 55 µs using multiple precision arithmetic to make a big integer.
Update:
I missed the javascript tag until someone pointed it out in a comment. You can use Crunch for multiple-precision arithmetic in javascript.
Update 2:
As pointed out by rici, the comparison above assumes that the length of the input is known a priori for both encodings. However if the stream of bits needs to embedded in a larger stream and the number of digits is not known a priori, then a means must be provided to determine where the number ends.
The 10-bit encoding of three digits permits using one of the unused 10-bit codes as that marker, since 24 of the possible values are unused. In fact, we can use 10 of those 24 to provide one more digit to the number. (We could even add a "half" digit by using 20 values for 0..19, allowing a leading 1 if present in that position. Or we could use that for a sign to allow negative integers. But I digress.) This turns out to be perfect for the case of 1000 digits, which is a multiple of three, plus one. Then 1000 digits can be encoded with an end marker in 418 bytes, the same as before when not requiring an end marker. (In a stream of bits it can actually be 417.5 bytes.)
For the binary integer we can either precede it with a length in bits, or use bit stuffing to mark the end of the stream with a series of one bits. The overhead is about the same either way. We'll do the latter to make it easy to handle arbitrary-length integers. The 1000-digit integer will take 3322 bits, or 415 bytes and two bits. We can choose the maximum run of one bits in the data to be 11 long. When 11 1's appear in a row, a 0 bit is stuffed into the stream. If 12 1's are seen in a row, then you have reached the end of the stream (the 12 1's and a preceding 0 are discarded.) Using 11 will add 13 bits to the end, plus allowing up to one bit of stuffing to fill the last byte (the mean number of stuffed bits is 0.81), bringing the total bytes to 417.
So there is still a gain, four bits to be precise, though it is smaller now because the 10-bit encoding puts its unused patterns to work.

Math.sqrt() returns infinity?

Math.sqrt() seems to work fine with any number less than 310 characters long.
However, any number 310 chars or over will return Infinity...
If you want to test it out yourself, here it is on jsfiddle http://jsfiddle.net/gqhk9/2
Anyway, I need to get the square root of numbers including some which are 310 chars and longer.
How can I do that in js?
It's not an issue with Math.sqrt: get rid of the Math.sqrt call and you'll still see Infinity. Basically, JavaScript can't cope with numbers that big; it runs out of the range of 64-bit IEEE 754 floating-point values. You'll need to find some sort of library for handling arbitrary-sized integers.
Note that even for numbers smaller than 10^309, you're still going to be losing information after the first ~15 digits. If you care about all of those digits, again you should be looking at specialist maths libraries.
A quick look around the web found BigInt.js referenced a few times, but I don't know how good it is.
Look at Number.MAX_VALUE.
The MAX_VALUE property has a value of approximately 1.79E+308.
Values larger than MAX_VALUE are represented as "Infinity".
Javascript numbers cannot be that big.
If you type
javascript:123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
in your address bar, you'll also get Infinity.
You need to use a bignum library.
The number that you are starting with, 1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890, is Infinity, and Math.sqrt(Infinity) is Infinity.
What you need is a big integer library to simulate it, for example, http://www.leemon.com/crypto/BigInt.html; then with that you can take your big integer to the power of 0.5 to calculate the square root.
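Nowadays native BigInt (which postdates this question) can play the role of such a library. There is no built-in BigInt square root, but Newton's method takes only a few lines:

function bigintSqrt(n) {
  if (n < 0n) throw new RangeError("negative input");
  if (n < 2n) return n;
  let x = n, y = (x + 1n) / 2n; // BigInt division truncates
  while (y < x) {
    x = y;
    y = (x + n / x) / 2n; // Newton step; converges to floor(sqrt(n))
  }
  return x;
}

console.log(bigintSqrt(144n)); // 12n

This works on integers of any length, including ones 310+ characters long.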
