I would like to implement an XorShift PRNG in Java, Python and JavaScript. The different implementations must generate the exact same sequences given the same seed. So far, I have not been able to do this.
My implementation in Java
I have the following implementation of an XorShift PRNG in Java (where x is a long field):
public long randomLong() {
    x ^= (x << 21);
    x ^= (x >>> 35);
    x ^= (x << 4);
    return x;
}
If I seed x to 1, the first four calls to randomLong() will generate:
35651601
1130297953386881
-9204155794254196429
144132848981442561
My implementation in Python
I have tried both with and without numpy. Below is the version that uses numpy.
def randomLong(self):
    self.x ^= np.left_shift(self.x, 21)
    self.x ^= np.right_shift(self.x, 35)
    self.x ^= np.left_shift(self.x, 4)
    return self.x
With the same seed, the Python function will generate:
35651601
1130297953386881
-9204155787274874573 # different
143006948545953793 # different
My JavaScript implementation
I've not attempted one yet, since JavaScript's only number type seems to be doubles based on IEEE 754, which opens up a different can of worms.
What I think the cause is
Java and Python have different number types. Java has fixed-width 32- and 64-bit integers, while Python has arbitrary-precision integers.
The shift operators also seem to have different semantics. For example, Java has both a logical (>>>) and an arithmetic (>>) right shift, while Python has only one right-shift operator.
Questions
I would be happy with an answer that lets me write a PRNG in these three languages, and one that is fast. It does not have to be very good. I have considered porting a C library implementation to the other languages, although that is not a very appealing option.
Can I fix my above implementations so they work?
Should I switch to another PRNG function that is easier to implement across prog.langs?
I have read the SO question where someone suggested using the java.util.Random class from Python. I don't want this, since I'm also going to need the function in JavaScript, and I don't know whether such a package exists there.
I would be happy with an answer that lets me write a PRNG in these three languages, and one that is fast. It does not have to be very good.
You could implement a 32-bit linear congruential generator in 3 languages.
Python:
seed = 0
for i in range(10):
    seed = (seed * 1664525 + 1013904223) & 0xFFFFFFFF
    print(seed)
Java:
int seed = 0;
for (int i = 0; i < 10; i++) {
    seed = seed * 1664525 + 1013904223;
    System.out.println(seed & 0xFFFFFFFFL);
}
JavaScript:
var seed = 0;
for (var i = 0; i < 10; i++) {
    // The intermediate result fits in 52 bits, so no overflow
    seed = (seed * 1664525 + 1013904223) | 0;
    console.log(seed >>> 0);
}
Output:
1013904223
1196435762
3519870697
2868466484
1649599747
2670642822
1476291629
2748932008
2180890343
2498801434
Note that in all 3 languages, each iteration prints an unsigned 32-bit integer.
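The reason the | 0 trick lines up with Java's int arithmetic is that JavaScript's ToInt32 conversion reduces the value modulo 2^32 and reinterprets it as signed, which is exactly how Java's int overflow behaves. A minimal illustration (not part of the original answer):

console.log((0x7fffffff + 1) | 0); // -2147483648, the same wrap-around as Java's Integer.MAX_VALUE + 1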
The tricky part is the logical right shift. The easiest fix in Python, if you have access to NumPy, is to store your x as a uint64 value, so that arithmetic and logical right shifts are the exact same operation, and then cast the output to an int64 before returning it, e.g.:
import numpy as np

class XorShiftRng(object):
    def __init__(self, x):
        self.x = np.uint64(x)

    def random_long(self):
        self.x ^= self.x << np.uint64(21)
        self.x ^= self.x >> np.uint64(35)
        self.x ^= self.x << np.uint64(4)
        return np.int64(self.x)
Those ugly casts of the shift values are required to prevent NumPy from issuing weird casting errors. In any case, this produces the exact same result as your Java version:
>>> rng = XorShiftRng(1)
>>> for _ in range(4):
...     print(rng.random_long())
...
35651601
1130297953386881
-9204155794254196429
144132848981442561
The difference in results between Java and Python is due to a difference in how the languages implement integers. A Java long is a 64-bit signed integer, with the sign in the leftmost bit. Python is... well, different.
Python encodes integers with a varying bit length, depending on the magnitude of the number:
>>> n = 10
>>> n.bit_length()
4
>>> n = 1000
>>> n.bit_length()
10
>>> n = -4
>>> n.bit_length()
3
Negative integers are presented as a sign and a magnitude, and the sign does not appear to be stored in any of the bits. It would normally sit in the leftmost bit, but there is no fixed leftmost bit here; this is a consequence of Python's varying bit length for numbers.
>>> bin(-4)
'-0b100'
where -4 in 64 bit 2's complement would be:
0b1111111111111111111111111111111111111111111111111111111111111100
This makes a huge difference in the algorithm, since shifting 0b100 left or right yields quite different results than shifting 0b1111111111111111111111111111111111111111111111111111111111111100.
Luckily there's a way of tricking Python, but it involves switching between the two representations yourself.
First, some bit masks are needed:
word_size = 64
sign_mask = 1<<(word_size-1)
word_mask = sign_mask | (sign_mask - 1)
Now, to force Python into 2's complement, all one needs is a bitwise AND with the word mask:
>>> bin(4 & word_mask)
'0b100'
>>> bin(-4 & word_mask)
'0b1111111111111111111111111111111111111111111111111111111111111100'
which is what you need for the algorithm to work. Except that you need to convert the numbers back when returning values, since
>>> -4 & word_mask
18446744073709551612L
So the number needs to be converted from 2's complement to signed magnitude:
>>> number = -4 & word_mask
>>> bin(~(number^word_mask))
'-0b100'
But this only works for negative integers:
>>> number = 4 & word_mask
>>> bin(~(number^word_mask))
'-0b1111111111111111111111111111111111111111111111111111111111111100'
Since positive integers should be returned as is, this would be better:
>>> number = -4 & word_mask
>>> bin(~(number^word_mask) if (number&sign_mask) else number)
'-0b100'
>>> number = 4 & word_mask
>>> bin(~(number^word_mask) if (number&sign_mask) else number)
'0b100'
So I've implemented the algorithm like this:
class XORShift:
    def __init__(self, seed=1, word_length=64):
        self.sign_mask = 1 << (word_length - 1)
        self.word_mask = self.sign_mask | (self.sign_mask - 1)
        self.next = self._to2scomplement(seed)

    def _to2scomplement(self, number):
        return number & self.word_mask

    def _from2scomplement(self, number):
        return ~(number ^ self.word_mask) if (number & self.sign_mask) else number

    def seed(self, seed):
        self.next = self._to2scomplement(seed)

    def random(self):
        self.next ^= (self.next << 21) & self.word_mask
        self.next ^= (self.next >> 35) & self.word_mask
        self.next ^= (self.next << 4) & self.word_mask
        return self._from2scomplement(self.next)
Seeding it with 1, the algorithm returns as its first 4 numbers:
>>> prng = XORShift(1)
>>> for _ in range(4):
...     print(prng.random())
...
35651601
1130297953386881
-9204155794254196429
144132848981442561
Of course you get this for free by using numpy.int64, but that is less fun, as it hides the underlying cause of the difference.
I have not been able to implement the same algorithm in JavaScript. JavaScript's bitwise operators work on 32-bit integers, and the shift count is taken modulo 32, so shifting right by 35 actually shifts by only 3. I have not investigated it further.
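For what it's worth, here is a sketch (not part of the original answer) of how the same 64-bit XorShift can be written in modern JavaScript using BigInt, assuming a runtime that supports it; BigInt.asUintN and BigInt.asIntN emulate Java's unsigned wrap-around and signed reinterpretation:

class XorShift64 {
    constructor(seed) {
        // keep the state as an unsigned 64-bit value so >> is a logical shift
        this.x = BigInt.asUintN(64, BigInt(seed));
    }
    randomLong() {
        let x = this.x;
        x ^= BigInt.asUintN(64, x << 21n); // x ^= x << 21, wrapped to 64 bits
        x ^= x >> 35n;                     // logical right shift, since x is unsigned
        x ^= BigInt.asUintN(64, x << 4n);
        this.x = x;
        return BigInt.asIntN(64, x);       // reinterpret as signed, like Java's long
    }
}

const rng = new XorShift64(1);
for (let i = 0; i < 4; i++) {
    console.log(rng.randomLong().toString());
}
// 35651601, 1130297953386881, -9204155794254196429, 144132848981442561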
Related
I have the following deterministic noise function which I've been using in a C# and C++ terrain generator for a while:
float GridNoise(int x, int z, int seed)
{
    int n = (1619*x + 31337*z + 1013*seed) & 0x7fffffff;
    n = (n >> 13) ^ n;
    return 1 - ((n*(n*n*60493 + 19990303) + 1376312589) & 0x7fffffff)/(float)1073741824;
}
It returns a 'random' float between 1 and -1 for any integer x/z coordinates I enter (plus there's a seed so I can generate different terrains). I tried implementing the same function in Javascript, but the results aren't as expected. For small values, it seems OK but as I use larger values (of the order of ~10000) the results are less and less random and eventually all it returns is 1.
You can see it working correctly in C# here, and the incorrect JS results for the same input here.
I suspect it's something to do with JS variables not being strict integers, but can anyone shed more light? Does anyone have a similarly simple deterministic function I could use in JS if this doesn't work?
The underlying problem is that in JavaScript there are no integers; all arithmetic is done with Number, an IEEE-754 double with 53 bits of integer precision.
In C#, if you're using longs, any overflowing bits are simply discarded.
In JavaScript, you need to handle this yourself.
There's a numeric type coming to browsers that will help, but it's not everywhere yet: BigInt. It's in Chrome/Opera and behind a flag in Firefox (desktop, not Android).
(No word on Edge (dead anyway) or Safari (the new IE), and of course IE will never get it.)
The best I can come up with using BigInt is
function gridNoise(x, z, seed) {
    var n = (1619 * x + 31337 * z + 1013 * seed) & 0x7fffffff;
    n = BigInt((n >> 13) ^ n);
    n = n * (n * n * 60493n + 19990303n) + 1376312589n;
    n = parseInt(n.toString(2).slice(-31), 2);
    return 1 - n / 1073741824;
}

function test() {
    for (var i = 10000; i < 11000; i++) {
        console.log(gridNoise(0, 0, i));
    }
}
test();
Note that the n suffix, as in 60493n, is BigInt literal notation.
There are "big integer" libraries you could use in the interim though - https://github.com/peterolson/BigInteger.js
The following doesn't really work and never will be exact, because a 32-bit value times a 32-bit value needs 64 bits, so bits are already being lost. (I had misread the code and thought n was only 19 bits because of the >> 13.)
If you limit the result of n * n * 60493 to 32 bits (actually, I made it 31 bits), it seems to work OK:
function gridNoise(x, z, seed) {
    var n = (1619 * x + 31337 * z + 1013 * seed) & 0x7fffffff;
    n = (n >> 13) ^ n;
    return 1 - ((n * (n * n * 60493 & 0x7fffffff + 19990303) + 1376312589) & 0x7fffffff) / 1073741824;
}
This also works:
return 1 - ((n*(n*n*60493 | 0 + 19990303) + 1376312589) & 0x7fffffff)/1073741824;
That limits the interim result to 32 bits, which may or may not be "accurate".
You may need to play around with it if you want to duplicate exactly what the C# code produces.
I'm afraid your code is exceeding the range in which JavaScript can represent integers exactly. As soon as that happens, it returns 1, because the expression ((n*(n*n*60493 + 19990303) + 1376312589) & 0x7fffffff)/1073741824 will always be 0, and thus 1 - 0 = 1.
To understand what's going on here, one has to look at JavaScript's Number type. It is an IEEE-754 double: in effect a 53-bit integer significand scaled by an 11-bit exponent. If a calculation produces an integer that needs more than 53 bits, only the top 53 bits are kept and the rest are rounded away. Bitwise operators, on the other hand, work on just the lower 32 bits of a value. So once an integer needs more than about 84 bits (53 significand bits scaled by at least 2^32), its lower 32 bits are necessarily zero, and bitwise operations on it will always yield 0. Large results therefore tend toward 0 in JS when you do bitwise math on them, while C# always keeps the lower 32 bits exactly (it simply discards the overflow), so its result stays accurate for those 32 bits.
(2 + 2 ** 53) & (2 + 2 ** 53) // 2
(2 + 2 ** 54) & (2 + 2 ** 54) // 0
Edit (sorry for the poor previous answer):
As others have stated, the problem is that your values exceed what a JS Number can represent exactly.
If you already have the code working in C#, it might be advisable to offload the functionality to an ASP.NET backend, which would handle the calculation and forward the result via some sort of API.
I want the following mechanism:
int64_t MyHash (const std::string& value);
Give any std::string (usually 100 bytes) as an input
The function outputs a 64-bit integer value
However, the value of that integer should be in the range of -2^53 to 2^53-1
I tried using std::hash(). The problem with that is that it's different on every platform; not only that, it can differ from run to run.
Currently, using Qt's QCryptographicHash, I compute a SHA-256 checksum and truncate it to 64 bits. Even with this truncation, the collision probability increases.
Anyhow, my goal is to get that value within 54 bits. One obvious solution is to divide that number by 2048.
Question: Is there any better solution to get a hash of 54-bit?
Javascript solution is also fine.
Purpose: this value is passed to JavaScript. Its Number type is a 64-bit double, which can only represent integers exactly within a 54-bit signed range.
For a 54-bit hash, you're likely to trade quality for speed. The bottom 54 bits of a SHA-256 give as reliable a hash as it's reasonably possible to get, at the cost of not the best performance.
Other possibilities are a 64 bit CRC, which can very easily be found with a quick google search. That's likely to be faster, and still probably fine for any reasonable use case.
As for truncation to the [-2^53 .. 2^53 - 1] range, I'd just use & with a suitable bitmask, and then subtract 2^53.
2^54 - 1 is 0x3FFFFFFFFFFFFF and 2^53 is 0x20000000000000, so it would just be:
crc = (int64_t)(crc & 0x3FFFFFFFFFFFFFULL) - 0x20000000000000LL;
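Since the question says a JavaScript solution is also fine, here is a hypothetical sketch (not from this answer) of doing the same range reduction on the JavaScript side, assuming the 64-bit hash arrives as a BigInt; to54Bit is a name introduced here for illustration:

function to54Bit(hash64) {
    var masked = BigInt.asUintN(54, hash64);   // keep only the low 54 bits
    return Number(masked - (1n << 53n));       // shift into [-2^53, 2^53 - 1]
}
console.log(to54Bit(0xC96C5795D7870F42n));     // safely representable as a plain Number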
As for the 64 bit CRC itself, the following code is taken directly from http://andrewl.dreamhosters.com/filedump/crc64.cpp which is a downloadable .cpp file. The original is written using Windows data types, I've converted here to normal stdint.h types.
uint64_t const poly = 0xC96C5795D7870F42ULL;
uint64_t table[256];

void generate_table()
{
    for(int i = 0; i < 256; ++i)
    {
        uint64_t crc = i;
        for(int j = 0; j < 8; ++j)
        {
            if(crc & 1)
            {
                crc >>= 1;
                crc ^= poly;
            }
            else
            {
                crc >>= 1;
            }
        }
        table[i] = crc;
    }
}
You'll want to call generate_table() exactly once at program startup. Either that, or run it in a small harness which just prints out the results, and directly initialize the table using those values.
To actually evaluate the crc, pass the sequence of bytes and the length to this:
uint64_t calculate_crc(uint8_t *stream, size_t n)
{
    uint64_t crc = 0;
    for(size_t i = 0; i < n; ++i)
    {
        uint8_t index = stream[i] ^ crc;
        uint64_t lookup = table[index];
        crc >>= 8;
        crc ^= lookup;
    }
    return crc;
}
Depending on how curious you are, it may be worth taking a look at the linked source, it has extensive comments that explain what's going on.
A colleague of mine stumbled upon a method to floor float numbers using a bitwise or:
var a = 13.6 | 0; //a == 13
We were talking about it and wondering a few things.
How does it work? Our theory was that using such an operator casts the number to an integer, thus removing the fractional part
Does it have any advantages over doing Math.floor? Maybe it's a bit faster? (pun not intended)
Does it have any disadvantages? Maybe it doesn't work in some cases? Clarity is an obvious one, since we had to figure it out, and well, I'm writing this question.
Thanks.
How does it work? Our theory was that using such an operator casts the
number to an integer, thus removing the fractional part
All bitwise operations except unsigned right shift, >>>, work on signed 32-bit integers. So using bitwise operations will convert a float to an integer.
Does it have any advantages over doing Math.floor? Maybe it's a bit
faster? (pun not intended)
http://jsperf.com/or-vs-floor/2 suggests it is slightly faster.
Does it have any disadvantages? Maybe it doesn't work in some cases?
Clarity is an obvious one, since we had to figure it out, and well,
I'm writing this question.
Will not pass jsLint.
32-bit signed integers only
Odd comparative behavior with NaN: Math.floor(NaN) is NaN, while (NaN | 0) === 0
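A quick illustration of those caveats (not in the original answer):

console.log(Math.floor(NaN));          // NaN
console.log(NaN | 0);                  // 0
console.log(Math.floor(2147483648.5)); // 2147483648
console.log(2147483648.5 | 0);         // -2147483648 (wrapped past the 32-bit range)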
This is truncation as opposed to flooring. Howard's answer is sort of correct, but I would add that Math.floor does exactly what it is supposed to with respect to negative numbers: mathematically, that is what a floor is.
In the case you described above, the programmer was more interested in truncation, i.e. chopping the decimal part off completely, although the syntax they used somewhat obscures the fact that they are converting the float to an int.
In ECMAScript 6, the equivalent of |0 is Math.trunc, kind of, I should say:
It returns the integral part of a number by removing any fractional digits. It simply drops the decimal point and everything behind it, no matter whether the argument is positive or negative.
Math.trunc(13.37) // 13
Math.trunc(42.84) // 42
Math.trunc(0.123) // 0
Math.trunc(-0.123) // -0
Math.trunc("-1.123")// -1
Math.trunc(NaN) // NaN
Math.trunc("foo") // NaN
Math.trunc() // NaN
JavaScript represents Number as a double-precision 64-bit floating-point value.
Math.floor works with this in mind.
Bitwise operations work on 32-bit signed integers. A 32-bit signed integer uses its first bit as the sign and the other 31 bits for the value. Because of this, the minimum and maximum allowed 32-bit signed values are -2,147,483,648 and 2,147,483,647 (0x7FFFFFFF), respectively.
So when you do | 0, you are essentially doing & 0xFFFFFFFF. This means any number whose low 32 bits come out as 0x80000000 (2,147,483,648) or greater will be returned as a negative number.
For example:
// Safe
(2147483647.5918 & 0xFFFFFFFF) === 2147483647
(2147483647 & 0xFFFFFFFF) === 2147483647
(200.59082098 & 0xFFFFFFFF) === 200
(0X7FFFFFFF & 0xFFFFFFFF) === 0X7FFFFFFF
// Unsafe
(2147483648 & 0xFFFFFFFF) === -2147483648
(-2147483649 & 0xFFFFFFFF) === 2147483647
(0x80000000 & 0xFFFFFFFF) === -2147483648
(3000000000.5 & 0xFFFFFFFF) === -1294967296
Also, bitwise operations don't "floor"; they truncate, which is to say they round toward 0. For negative numbers, Math.floor rounds down (toward -Infinity) while bitwise truncation rounds up (toward zero).
As I said before, Math.floor is safer because it operates on the full 64-bit floating-point value. Bitwise is faster, yes, but limited to the 32-bit signed range.
To summarize:
Bitwise works the same if you work from 0 to 2147483647.
Bitwise is 1 off (for values with a fractional part) if you work from -2147483647 to 0.
Bitwise is completely different for numbers less than -2147483648 and greater than 2147483647.
If you really want to tweak performance and use both:
function floor(n) {
    if (n >= 0 && n < 0x80000000) {
        return n & 0xFFFFFFFF;
    }
    if (n > -0x80000000 && n < 0) {
        const bitFloored = n & 0xFFFFFFFF;
        if (bitFloored === n) return n;
        return bitFloored - 1;
    }
    return Math.floor(n);
}
Just to add: Math.trunc works like the bitwise operations do, so you can do this:
function trunc(n) {
    if (n > -0x80000000 && n < 0x80000000) {
        return n & 0xFFFFFFFF;
    }
    return Math.trunc(n);
}
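A few quick checks of those helpers (not in the original answer), showing the fallback to the Math functions outside the 32-bit range:

console.log(floor(-13.6));        // -14
console.log(trunc(-13.6));        // -13
console.log(floor(3000000000.5)); // 3000000000 (outside the bitwise range, falls back to Math.floor)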
Your first point is correct: the number is cast to an integer and thus any decimal digits are removed. Note, though, that Math.floor rounds to the next integer toward minus infinity and thus gives a different result when applied to negative non-integer numbers.
The specs say that it is converted to an integer:
Let lnum be ToInt32(lval).
Performance: this has been tested at jsperf before.
var myNegInt = -1 * Math.pow(2, 32);
var myFloat = 0.010203040506070809;
var my64BitFloat = myNegInt - myFloat;
var trunc1 = my64BitFloat | 0;
var trunc2 = ~~my64BitFloat;
var trunc3 = my64BitFloat ^ 0;
var trunc4 = my64BitFloat - my64BitFloat % 1;
var trunc5 = parseInt(my64BitFloat);
var trunc6 = Math.floor(my64BitFloat);
console.info(my64BitFloat);
console.info(trunc1);
console.info(trunc2);
console.info(trunc3);
console.info(trunc4);
console.info(trunc5);
console.info(trunc6);
IMO: the questions "How does it work?", "Does it have any advantages over doing Math.floor?" and "Does it have any disadvantages?" pale in comparison to "Is it at all logical to use it for this purpose?"
I think, before you try to get clever with your code, you may want to run these. My advice: just move along, there is nothing to see here. If using bitwise tricks to save a few operations matters to you at all, your code architecture probably needs work. As for why it may work sometimes: a stopped clock is right twice a day, but that does not make it useful. These operators have their uses, but not in this context.
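For reference, running that snippet produces roughly the following (exact printing of the float may vary by engine), which makes the point that the three bitwise variants lose the value entirely while the arithmetic approaches keep the integer part:

// my64BitFloat        -> -4294967296.010203...
// trunc1 (| 0)        -> 0
// trunc2 (~~)         -> 0
// trunc3 (^ 0)        -> 0
// trunc4 (x - x % 1)  -> -4294967296
// trunc5 (parseInt)   -> -4294967296
// trunc6 (Math.floor) -> -4294967297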
I'm reading a tutorial on Perlin Noise, and I came across this function:
function IntNoise(32-bit integer: x)
    x = (x<<13) ^ x;
    return ( 1.0 - ( (x * (x * x * 15731 + 789221) + 1376312589) & 7fffffff) / 1073741824.0);
end IntNoise function
While I do understand some parts of it, I really don't get what (x<<13) and & 7fffffff are supposed to mean (I see that the latter is a hex number, but what does it do?). Can someone help me translate this into JS? Also, normal integers are 32-bit in JS on 32-bit computers, right?
It should work in JavaScript with minimal modifications:
function IntNoise(x) {
    x = (x << 13) ^ x;
    return (1 - ((x * (x * x * 15731 + 789221) + 1376312589) & 0x7fffffff) / 1073741824);
}
The << operator is a bitwise left-shift, so << 13 means shift the number 13 bits to the left.
The & operator is a bitwise AND. Doing & 0x7fffffff on a signed 32-bit integer masks out the sign bit, ensuring that the result is always a positive number (or zero).
The way that JavaScript deals with numbers is a bit quirky, to say the least. All numbers are usually represented as IEEE-754 doubles, but... once you start using bitwise operators on a number then JavaScript will treat the operands as signed 32-bit integers for the duration of that calculation.
Here's a good explanation of how JavaScript deals with bitwise operations:
Bitwise Operators
x<<13 means shift x 13 steps to the left (bitwise).
Furthermore, a<<b is equivalent to a*2^b (as long as the result still fits in 32 bits).
& 0x7fffffff means a bitwise AND of the left-hand side with 0x7FFFFFFF.
If you look at the bit pattern of 0x7FFFFFFF, you will notice that bit 31 (the topmost bit) is 0 and bits 0-30 are all 1. This means that you keep bits 0-30 and drop bit 31.
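A small illustration (not in the original answer) of what that mask does to a negative 32-bit value:

console.log((-5 & 0x7fffffff).toString(16)); // "7ffffffb": the sign bit has been cleared
console.log(1 << 13);                        // 8192, i.e. 1 * 2**13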
Suppose I have a hex number "4072508200000000" and I want the floating point number that it represents (293.03173828125000) in IEEE-754 double format to be put into a JavaScript variable.
I can think of a way that uses some masking and a call to pow(), but is there a simpler solution?
A client-side solution is needed.
This may help. It's a website that lets you enter a hex encoding of an IEEE-754 and get an analysis of mantissa and exponent.
http://babbage.cs.qc.edu/IEEE-754/64bit.html
Because people always tend to ask "why?", here's why: I'm trying to fill out an existing but incomplete implementation of Google's Protocol Buffers (protobuf).
I don't know of a good way. It certainly can be done the hard way; here is a single-precision example done entirely within JavaScript:
js> a = 0x41973333
1100428083
js> (a & 0x7fffff | 0x800000) * 1.0 / Math.pow(2,23) * Math.pow(2, ((a>>23 & 0xff) - 127))
18.899999618530273
A production implementation should consider that several bit patterns have magic values, typically implemented by specifying a special interpretation for what would have been the largest or smallest exponent. So, detect NaNs and infinities. The above example should also be checking for negatives (a & 0x80000000).
Update: OK, I've got it for doubles, too. You can't directly extend the above technique, because the internal JS representation is itself a double, so by definition it can handle at best a 53-bit integer, and it can't shift by more than 32 bits at all.
OK, to do a double you first chop off, as a string, the low 8 hex digits, i.e. 32 bits, and process them with a separate object. Then:
js> a = 0x40725082
1081233538
js> (a & 0xfffff | 0x100000) * 1.0 / Math.pow(2, 52 - 32) * Math.pow(2, ((a >> 52 - 32 & 0x7ff) - 1023))
293.03173828125
js>
I kept the above example because it's the one from the OP. A harder case is when the low 32 bits have a non-zero value. Here is the conversion of 0x40725082deadbeef, a full-precision double:
js> a = 0x40725082
1081233538
js> b = 0xdeadbeef
3735928559
js> e = (a >> 52 - 32 & 0x7ff) - 1023
8
js> (a & 0xfffff | 0x100000) * 1.0 / Math.pow(2,52-32) * Math.pow(2, e) +
b * 1.0 / Math.pow(2, 52) * Math.pow(2, e)
293.0319506442019
js>
There are some obvious subexpressions you can factor out but I've left it this way so you can see how it relates to the format.
A quick addition to DigitalRoss' solution, for those finding this page via Google as I did.
Apart from the edge cases for +/- Infinity and NaN, which I'd love input on, you also need to take into account the sign of the result:
s = a >> 31 ? -1 : 1
You can then include s in the final multiplication to get the correct result.
I think for a little-endian encoding you'll also need to reverse the byte order within a and b and swap them.
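Pulling those pieces together, here is a hedged sketch (not from either answer) that decodes a big-endian high/low 32-bit word pair into a double; it ignores the NaN, Infinity and denormal cases discussed above, and decodeDouble is a name introduced here purely for illustration:

function decodeDouble(hi, lo) {
    var sign = (hi >> 31) ? -1 : 1;                  // top bit of the high word
    var exp = ((hi >> 20) & 0x7ff) - 1023;           // 11 exponent bits, biased by 1023
    var mantissa = ((hi & 0xfffff) | 0x100000) / Math.pow(2, 20) +
                   lo / Math.pow(2, 52);             // 20 high + 32 low fraction bits, plus the implicit 1
    return sign * mantissa * Math.pow(2, exp);
}
console.log(decodeDouble(0x40725082, 0x00000000)); // 293.03173828125
console.log(decodeDouble(0x40725082, 0xdeadbeef)); // 293.0319506442019 (matches the full-precision example above)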
The new Typed Arrays mechanism allows you to do this (and is probably an ideal mechanism for implementing protocol buffers):
var buffer = new ArrayBuffer(8);
var bytes = new Uint8Array(buffer);
var doubles = new Float64Array(buffer); // not supported in Chrome
bytes[7] = 0x40; // Load the hex string "40 72 50 82 00 00 00 00"
bytes[6] = 0x72;
bytes[5] = 0x50;
bytes[4] = 0x82;
bytes[3] = 0x00;
bytes[2] = 0x00;
bytes[1] = 0x00;
bytes[0] = 0x00;
my_double = doubles[0];
document.write(my_double); // 293.03173828125
This assumes a little-endian machine.
Unfortunately Chrome does not have Float64Array, although it does have Float32Array. The above example does work in Firefox 4.0.1.
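As a later alternative (not in the original answer, and assuming a runtime with DataView support), a DataView lets you specify the byte order explicitly, so the same conversion works regardless of the machine's endianness:

var buf = new ArrayBuffer(8);
var view = new DataView(buf);
view.setUint32(0, 0x40725082);   // high word, written big-endian by default
view.setUint32(4, 0x00000000);   // low word
console.log(view.getFloat64(0)); // 293.03173828125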