Bitwise XOR in Javascript compared to C++ - javascript

I am porting a simple C++ function to Javascript, but it seems I'm running into problems with the way Javascript handles bitwise operators.
In C++:
AnsiString MyClass::Obfuscate(AnsiString source)
{
int sourcelength=source.Length();
for(int i=1;i<=sourcelength;i++)
{
source[i] = source[i] ^ 0xFFF;
}
return source;
}
Obfuscate("test") yields temporary intvalues
-117, -102, -116, -117
Obfuscate ("test") yields stringvalue
‹šŒ‹
In Javascript:
function obfuscate(str)
{
var obfuscated= "";
for (i=0; i<str.length;i++) {
var a = str.charCodeAt(i);
var b = a ^ 0xFFF;
obfuscated= obfuscated+String.fromCharCode(b);
}
return obfuscated;
}
obfuscate("test") yields temporary intvalues
3979 , 3994 , 3980 , 3979
obfuscate("test") yields stringvalue
ྋྚྌྋ
Now, I realize that there are a ton of threads where they point out that Javascript treats all numbers as floats, and bitwise operations involve a temporary cast to 32bit int.
It really wouldn't be a problem except for that I'm obfuscating in Javascript and reversing in C++, and the different results don't really match.
How do i tranform the Javascript result into the C++ result? Is there some simple shift available?

Working demo
Judging from the result that xoring 116 with 0xFFF gives -117, we have to emulate
2's complement 8-bit integers in javascript:
function obfuscate(str)
{
var bytes = [];
for (var i=0; i<str.length;i++) {
bytes.push( ( ( ( str.charCodeAt(i) ^ 0xFFF ) & 0xFF ) ^ 0x80 ) -0x80 );
}
return bytes;
}
Ok these bytes are interpreted in windows cp 1252 and if they are negative, probably just subtracted from 256.
var ascii = [
0x0000,0x0001,0x0002,0x0003,0x0004,0x0005,0x0006,0x0007,0x0008,0x0009,0x000A,0x000B,0x000C,0x000D,0x000E,0x000F
,0x0010,0x0011,0x0012,0x0013,0x0014,0x0015,0x0016,0x0017,0x0018,0x0019,0x001A,0x001B,0x001C,0x001D,0x001E,0x001F
,0x0020,0x0021,0x0022,0x0023,0x0024,0x0025,0x0026,0x0027,0x0028,0x0029,0x002A,0x002B,0x002C,0x002D,0x002E,0x002F
,0x0030,0x0031,0x0032,0x0033,0x0034,0x0035,0x0036,0x0037,0x0038,0x0039,0x003A,0x003B,0x003C,0x003D,0x003E,0x003F
,0x0040,0x0041,0x0042,0x0043,0x0044,0x0045,0x0046,0x0047,0x0048,0x0049,0x004A,0x004B,0x004C,0x004D,0x004E,0x004F
,0x0050,0x0051,0x0052,0x0053,0x0054,0x0055,0x0056,0x0057,0x0058,0x0059,0x005A,0x005B,0x005C,0x005D,0x005E,0x005F
,0x0060,0x0061,0x0062,0x0063,0x0064,0x0065,0x0066,0x0067,0x0068,0x0069,0x006A,0x006B,0x006C,0x006D,0x006E,0x006F
,0x0070,0x0071,0x0072,0x0073,0x0074,0x0075,0x0076,0x0077,0x0078,0x0079,0x007A,0x007B,0x007C,0x007D,0x007E,0x007F
];
var cp1252 = ascii.concat([
0x20AC,0xFFFD,0x201A,0x0192,0x201E,0x2026,0x2020,0x2021,0x02C6,0x2030,0x0160,0x2039,0x0152,0xFFFD,0x017D,0xFFFD
,0xFFFD,0x2018,0x2019,0x201C,0x201D,0x2022,0x2013,0x2014,0x02DC,0x2122,0x0161,0x203A,0x0153,0xFFFD,0x017E,0x0178
,0x00A0,0x00A1,0x00A2,0x00A3,0x00A4,0x00A5,0x00A6,0x00A7,0x00A8,0x00A9,0x00AA,0x00AB,0x00AC,0x00AD,0x00AE,0x00AF
,0x00B0,0x00B1,0x00B2,0x00B3,0x00B4,0x00B5,0x00B6,0x00B7,0x00B8,0x00B9,0x00BA,0x00BB,0x00BC,0x00BD,0x00BE,0x00BF
,0x00C0,0x00C1,0x00C2,0x00C3,0x00C4,0x00C5,0x00C6,0x00C7,0x00C8,0x00C9,0x00CA,0x00CB,0x00CC,0x00CD,0x00CE,0x00CF
,0x00D0,0x00D1,0x00D2,0x00D3,0x00D4,0x00D5,0x00D6,0x00D7,0x00D8,0x00D9,0x00DA,0x00DB,0x00DC,0x00DD,0x00DE,0x00DF
,0x00E0,0x00E1,0x00E2,0x00E3,0x00E4,0x00E5,0x00E6,0x00E7,0x00E8,0x00E9,0x00EA,0x00EB,0x00EC,0x00ED,0x00EE,0x00EF
,0x00F0,0x00F1,0x00F2,0x00F3,0x00F4,0x00F5,0x00F6,0x00F7,0x00F8,0x00F9,0x00FA,0x00FB,0x00FC,0x00FD,0x00FE,0x00FF
]);
function toStringCp1252(bytes){
var byte, codePoint, codePoints = [];
for( var i = 0; i < bytes.length; ++i ) {
byte = bytes[i];
if( byte < 0 ) {
byte = 256 + byte;
}
codePoint = cp1252[byte];
codePoints.push( codePoint );
}
return String.fromCharCode.apply( String, codePoints );
}
Result
toStringCp1252(obfuscate("test"))
//"‹šŒ‹"

I'm guessing that AnsiString contains 8-bit characters (since the ANSI character set is 8 bits). When you assign the result of the XOR back to the string, it is truncated to 8 bits, and so the resulting value is in the range [-128...127].
(On some platforms, it could be [0..255], and on others the range could be wider, since it's not specified whether char is signed or unsigned, or whether it's 8 bits or larger).
Javascript strings contain unicode characters, which can hold a much wider range of values, the result is not truncated to 8 bits. The result of the XOR will have a range of at least 12 bits, [0...4095], hence the large numbers you see there.
Assuming the original string contains only 8-bit characters, then changing the operation to a ^ 0xff should give the same results in both languages.

I assume that AnsiString is in some form, an array of chars. And this is the problem. in c, char can typically only hold 8-bits. So when you XOR with 0xfff, and store the result in a char, it is the same as XORing with 0xff.
This is not the case with javascript. JavaScript using Unicode. This is demonstrated by looking at the integer values:
-117 == 0x8b and 3979 == 0xf8b
I would recommend XORing with 0xff as this will work in both languages. Or you can switch your c++ code to use Unicode.

First, convert your AnsiString to wchar_t*. Only then obfuscate its individual characters:
AnsiString MyClass::Obfuscate(AnsiString source)
{
/// allocate string
int num_wchars = source.WideCharBufSize();
wchar_t* UnicodeString = new wchar_t[num_wchars];
source.WideChar(UnicodeString, source.WideCharBufSize());
/// obfuscate individual characters
int sourcelength=source.Length();
for(int i = 0 ; i < num_wchars ; i++)
{
UnicodeString[i] = UnicodeString[i] ^ 0xFFF;
}
/// create obfuscated AnsiString
AnsiString result = AnsiString(UnicodeString);
/// delete tmp string
delete [] UnicodeString;
return result;
}
Sorry, I'm not an expert on C++ Builder, but my point is simple: in JavaScript you have WCS2 symbols (or UTF-16), so you have to convert AnsiString to wide chars first.
Try using WideString instead of AnsiString

I don't know AnsiString at all, but my guess is this relates to the width of its characters. Specifically, I suspect they're less than 32 bits wide, and of course in bitwise operations, the width of what you're operating on matters, particularly when dealing with 2's complement numbers.
In JavaScript, your "t" in "test" is character code 116, which is b00000000000000000000000001110100. 0xFFF (4095) is b00000000000000000000111111111111, and the result you're getting (3979) is b00000000000000000000111110001011. We can readily see that you're getting the right result for the XOR:
116 = 00000000000000000000000001110100
4095 = 00000000000000000000111111111111
3979 = 00000000000000000000111110001011
So I'm thinking you're getting some truncation or similar in your C++ code, not least because -117 is b10001011 in eight-bit 2's complement...which is exactly what we see as the last eight bits of 3979 above.

Related

How to convert a hexidecimal value to Int(16) Little Endian in Javascript

For example, if I have an RGB value of 66, 135, 245 which translates to #4287f5, how would I get the int(16) Little Endian value?
Javascript allows bit-shifts, but be careful with operator precedence.
The parentheses are necessary here.
function colorInteger(r,g,b){
return (r<<16) + (g<<8) + b
}
Now colorInteger(66,135,245) gives 4360181
And '#' + colorInteger(66,135,245).toString(16) gives "#4287f5"
To get an integer from the hexadecimal value (without #), you can use parseInt('4287f5',16). You can recover the r,g,b values from that using bit operations, too.
function getRGB(hexadecimal_string){
let intVal = parseInt(hexadecimal_string,16);
let r = intVal >> 16;
let g = (intVal >> 8) & 0xff
let b = intVal & 0xff
return [r,g,b]
}
I'm not sure if I understand your intention, but from the answer in another thread ( Byte order for packed images ) one single integer like ARGB is represented as [B,G,R,A] in little-endian machines.

compressing a string of 0's and 1's in js

Itroduction
I'm currently working on John Conway's Game of Life in js. I have the game working (view here) and i'm working on extra functionalities such as sharing your "grid / game" to your friends. To do this i'm extracting the value's of the grid (if the cell is alive or dead) into a long string of 0's and 1's.
This string has a variable length since the grid is not always the same size. for example:
grid 1 has a length and width of 30 => so the string's length is 900
grid 2 has a length and width of 50 => so the string's length is 2500
The problem
As you can see these string's of 0's and 1's are way too long to copy around and share.
However hard i try I don't seem to be able to come up with a code that would compress a string this long to a easy to handle one.
Any ideas on how to compress (and decompress) this?
I have considered simply writing down every possible grid option for the gird sizes 1x1 to 100x100 and giving them a key/reference to use as sharable code. Doing that by hand would be madness but maybe any of you has an idea on how to create an algorithm that can do this?
GitHub repository
In case it wasn't already obvious, the string you're trying to store looks like a binary string.
Counting systems
Binary is a number in base-2. This essentially means that there are two characters being used to keep count. Normally we are used to count with base-10 (decimal characters). In computer science the hexadecimal system (base-16) is also widely being used.
Since you're not storing the bits as bits but as bytes (use var a = 0b1100001; if you ever wish to store them like bits) the 'binary' you wish to store just takes as much space as any other random string with the same length.
Since you're using the binary system each position just has 2 possible values. When using the hexadecimal value a single position can hold up to 16 possible values. This is already a big improvement when it comes to storing the data compactly. As an example 0b11111111 and 0xff both represents the decimal number 255.
In your situation that'd shave 6 bytes of every 8 bytes you have to store. In the end you'd be stuck with a string just 1/4th of the length of the original string.
Javascript implementation
Essentially what we want to do is to interpret the string you store as binary and retrieve the hexadecimal value. Luckily JavaScript has built in functionality to achieve stuff like this:
var bin =
'1110101110100011' +
'0000101111100001' +
'1010010101011010' +
'0000110111011111' +
'1111111001010101' +
'0111000011100001' +
'1011010100110001' +
'0111111110010100' +
'0111110110100101' +
'0000111101100111' +
'1100001111011100' +
'0101011100001111' +
'0110011011001101' +
'1000110010001001' +
'1010100010000011' +
'0011110000000000';
var returnValue = '';
for (var i = 0; i < parseInt(bin.length / 8); i++) {
returnValue += parseInt(bin.substr(i*8, 8), 2).toString(16);
}
console.log(bin.length); // Will return 265
console.log(returnValue.length); // Will return 64
We're saying "parse this string and interpret it like a base-2 number and store it as a hexadecimal string".
Decoding is practically the same. Replace all occurrences of the number 8 in the example above with 2 and vice versa.
Please note
A prerequisite for this code to work correctly is that the binary length is dividable by 8. See the following example:
parseInt('00011110', 2).toString(16); // returns '1e'
parseInt('1e', 16).toString(2); // returns '11110'
// Technically both representations still have the same decimal value
When decoding you should add leading zeros until you have a full byte (8 bits).
In case the positions you have to store are not dividable by 8 you can, for example, add padding and add a number to the front of the output string to identify how much positions to strip.
Wait, there's more
To get even shorter strings you can build a lookup table with 265 characters in which you search for the character associated with the specific position. (This works because you're still storing the hexadecimal value as a string.) Sadly neither the ASCII nor the UTF-8 encodings are suited for this as there are blocks with values which have no characters defined.
It may look like:
// Go fill this array until you have 265 values within it.
var lookup = ['A', 'B', 'C', 'D'];
var smallerValue = lookup[0x00];
This way you can have 265 possible values at a single position, AND you have used your byte to the fullest.
Please note that no real compression is happening here. We're rather utilising data types to be used more efficiently for your current use case.
If we make the assumption than the grid contains much more 0's than 1's, you may want to try this simple compression scheme:
convert the binary string to an hexadecimal string
convert '00' sub-strings to 'z' symbol
convert 'zz' sub-strings to 'Z' symbol
we could go further, but let's stop here for the demo
Below is an example with a 16x16 grid:
var bin =
'0000000000000000' +
'0000001000000000' +
'0000011100000000' +
'0000001000000000' +
'0000000000000000' +
'0000000000111000' +
'0000100000111000' +
'0000000000111000' +
'0000000000000000' +
'0000000000000000' +
'0000000010000000' +
'0000000101000000' +
'0000000010000000' +
'0000000000000000' +
'0000100000000000' +
'0000000000000000';
var packed = bin
.match(/(.{4})/g)
.map(function(x) {
return parseInt(x, 2).toString(16);
})
.join('')
.replace(/00/g, 'z')
.replace(/zz/g, 'Z');
This will produce the string "Z02z07z02ZZ380838z38ZZz8z14z08Zz8Zz".
The unpacking process is doing the exact opposite:
var bin = packed
.replace(/Z/g, 'zz')
.replace(/z/g, '00')
.split('')
.map(function(x) {
return ('000' + parseInt(x, 16).toString(2)).substr(-4, 4);
})
.join('');
Note that this code will only work correctly if the length of the input string is a multiple of 4. If it's not the case, you'll have to pad the input and crop the output.
EDIT : 2nd method
If the input is completely random -- with roughly as many 0's as 1's and no specific repeating patterns -- the best you can do is probably to convert the binary string to a BASE64 string. It will be significantly shorter (this time with a fixed compression ratio of about 17%) and can still be copied/pasted by the user.
Packing:
var bin =
'1110101110100011' +
'0000101111100001' +
'1010010101011010' +
'0000110111011111' +
'1111111001010101' +
'0111000011100001' +
'1011010100110001' +
'0111111110010100' +
'0111110110100101' +
'0000111101100111' +
'1100001111011100' +
'0101011100001111' +
'0110011011001101' +
'1000110010001001' +
'1010100010000011' +
'0011110000000000';
var packed =
btoa(
bin
.match(/(.{8})/g)
.map(function(x) {
return String.fromCharCode(parseInt(x, 2));
})
.join('')
);
Will produce the string "66ML4aVaDd/+VXDhtTF/lH2lD2fD3FcPZs2MiaiDPAA=".
Unpacking:
var bin =
atob(packed)
.split('')
.map(function(x) {
return ('0000000' + x.charCodeAt(0).toString(2)).substr(-8, 8);
})
.join('');
Or if you want to go a step further, you can consider using something like base91 instead, for a reduced encoding overhead.
LZ-string
Using LZ-string I was able to compress the "code" quite a bit.
By simply compressing it to base64 like this:
var compressed = LZString.compressToBase64(string)
Decompressing is also just as simple as this:
var decompressed = LZString.decompressFromBase64(compressed)
However the length of this compressed string is still pretty long given that you have about as many 0s as 1s (not given in the example)
example
But the compression does work.
ANSWER
For any of you who are wondering how exactly I ended up doing it, here's how:
First I made sure every string passed in would be padded with leading 0s untill it was devidable by 8. (saving the amount of 0s used to pad, since they're needed while decompressing)
I used Corstian's answer and functions to compress my string (interpreted as binary) into a hexadecimal string. Although i had to make one slight alteration.
Not every binary substring with a lenght of 8 will return exactly 2 hex characters. so for those cases i ended up just adding a 0 in front of the substring. The hex substring will have the same value but it's length will now be 2.
Next up i used a functionality from Arnaulds answer. Taking every double character and replacing it with a single character (one not used in the hexadecimal alphabet to avoid conflict). I did this twice for every hexadecimal character.
For example:
the hex string 11 will become h and hh will become H
01101111 will become 0h0H
Since most grids are gonna have more dead cells then alive ones, I made sure the 0s would be able to compress even further, using Arnaulds method again but going a step further.
00 -> g | gg -> G | GG -> w | ww -> W | WW -> x | xx -> X | XX-> y | yy -> Y | YY -> z | zz -> Z
This resulted in Z representing 4096 (binary) 0s
The last step of the compression was adding the amount of leading 0s in front of the compressed string, so we can shave those off at the end of decompressing.
This is how the returned string looks in the end.
amount of leading 0s-compressed string so a 64*64 empty grid, will result in 0-Z
Decompressing is practically doing everything the other way around.
Firstly splitting the number that represents how many leading 0s we've used as padding from the compressed string.
Then using Arnaulds functionality, turning the further "compressed" characters back into hexadecimal code.
Taking this hex string and turning it back into binary code. Making sure, as Corstian pointed out, that every binary substring will have a length of 8. (ifnot we pad the substrings with leading 0s untill the do, exactly, have a length of 8)
And then the last step is to shave off the leading 0s we've used as padding to make the begin string devidable by 8.
The functions
Function I use to compress:
/**
* Compresses the a binary string into a compressed string.
* Returns the compressed string.
*/
Codes.compress = function(bin) {
bin = bin.toString(); // To make sure the binary is a string;
var returnValue = ''; // Empty string to add our data to later on.
// If the lenght of the binary string is not devidable by 8 the compression
// won't work correctly. So we add leading 0s to the string and store the amount
// of leading 0s in a variable.
// Determining the amount of 'padding' needed.
var padding = ((Math.ceil(bin.length/8))*8)-bin.length;
// Adding the leading 0s to the binary string.
for (var i = 0; i < padding; i++) {
bin = '0'+bin;
}
for (var i = 0; i < parseInt(bin.length / 8); i++) {
// Determining the substring.
var substring = bin.substr(i*8, 8)
// Determining the hexValue of this binary substring.
var hexValue = parseInt(substring, 2).toString(16);
// Not all binary values produce two hex numbers. For example:
// '00000011' gives just a '3' while what we wand would be '03'. So we add a 0 in front.
if(hexValue.length == 1) hexValue = '0'+hexValue;
// Adding this hexValue to the end string which we will return.
returnValue += hexValue;
}
// Compressing the hex string even further.
// If there's any double hex chars in the string it will take those and compress those into 1 char.
// Then if we have multiple of those chars these are compressed into 1 char again.
// For example: the hex string "ff will result in a "v" and "ffff" will result in a "V".
// Also: "11" will result in a "h" and "1111" will result in a "H"
// For the 0s this process is repeated a few times.
// (string with 4096 0s) (this would represent a 64*64 EMPTY grid)
// will result in a "Z".
var returnValue = returnValue.replace(/00/g, 'g')
.replace(/gg/g, 'G')
// Since 0s are probably more likely to exist in our binary and hex, we go a step further compressing them like this:
.replace(/GG/g, 'w')
.replace(/ww/g, 'W')
.replace(/WW/g, 'x')
.replace(/xx/g, 'X')
.replace(/XX/g, 'y')
.replace(/yy/g, 'Y')
.replace(/YY/g, 'z')
.replace(/zz/g, 'Z')
//Rest of the chars...
.replace(/11/g, 'h')
.replace(/hh/g, 'H')
.replace(/22/g, 'i')
.replace(/ii/g, 'I')
.replace(/33/g, 'j')
.replace(/jj/g, 'J')
.replace(/44/g, 'k')
.replace(/kk/g, 'K')
.replace(/55/g, 'l')
.replace(/ll/g, 'L')
.replace(/66/g, 'm')
.replace(/mm/g, 'M')
.replace(/77/g, 'n')
.replace(/nn/g, 'N')
.replace(/88/g, 'o')
.replace(/oo/g, 'O')
.replace(/99/g, 'p')
.replace(/pp/g, 'P')
.replace(/aa/g, 'q')
.replace(/qq/g, 'Q')
.replace(/bb/g, 'r')
.replace(/rr/g, 'R')
.replace(/cc/g, 's')
.replace(/ss/g, 'S')
.replace(/dd/g, 't')
.replace(/tt/g, 'T')
.replace(/ee/g, 'u')
.replace(/uu/g, 'U')
.replace(/ff/g, 'v')
.replace(/vv/g, 'V');
// Adding the number of leading 0s that need to be ignored when decompressing to the string.
returnValue = padding+'-'+returnValue;
// Returning the compressed string.
return returnValue;
}
The function I use to decompress:
/**
* Decompresses the compressed string back into a binary string.
* Returns the decompressed string.
*/
Codes.decompress = function(compressed) {
var returnValue = ''; // Empty string to add our data to later on.
// Splitting the input on '-' to seperate the number of paddin 0s and the actual hex code.
var compressedArr = compressed.split('-');
var paddingAmount = compressedArr[0]; // Setting a variable equal to the amount of leading 0s used while compressing.
compressed = compressedArr[1]; // Setting the compressed variable to the actual hex code.
// Decompressing further compressed characters.
compressed = compressed// Decompressing the further compressed 0s. (even further then the rest of the chars.)
.replace(/Z/g, 'zz')
.replace(/z/g, 'YY')
.replace(/Y/g, 'yy')
.replace(/y/g, 'XX')
.replace(/X/g, 'xx')
.replace(/x/g, 'WW')
.replace(/W/g, 'ww')
.replace(/w/g, 'GG')
.replace(/G/g, 'gg')
.replace(/g/g, '00')
// Rest of chars...
.replace(/H/g, 'hh')
.replace(/h/g, '11')
.replace(/I/g, 'ii')
.replace(/i/g, '22')
.replace(/J/g, 'jj')
.replace(/j/g, '33')
.replace(/K/g, 'kk')
.replace(/k/g, '44')
.replace(/L/g, 'll')
.replace(/l/g, '55')
.replace(/M/g, 'mm')
.replace(/m/g, '66')
.replace(/N/g, 'nn')
.replace(/n/g, '77')
.replace(/O/g, 'oo')
.replace(/o/g, '88')
.replace(/P/g, 'pp')
.replace(/p/g, '99')
.replace(/Q/g, 'qq')
.replace(/q/g, 'aa')
.replace(/R/g, 'rr')
.replace(/r/g, 'bb')
.replace(/S/g, 'ss')
.replace(/s/g, 'cc')
.replace(/T/g, 'tt')
.replace(/t/g, 'dd')
.replace(/U/g, 'uu')
.replace(/u/g, 'ee')
.replace(/V/g, 'vv')
.replace(/v/g, 'ff');
for (var i = 0; i < parseInt(compressed.length / 2); i++) {
// Determining the substring.
var substring = compressed.substr(i*2, 2);
// Determining the binValue of this hex substring.
var binValue = parseInt(substring, 16).toString(2);
// If the length of the binary value is not equal to 8 we add leading 0s (js deletes the leading 0s)
// For instance the binary number 00011110 is equal to the hex number 1e,
// but simply running the code above will return 11110. So we have to add the leading 0s back.
if (binValue.length != 8) {
// Determining how many 0s to add:
var diffrence = 8 - binValue.length;
// Adding the 0s:
for (var j = 0; j < diffrence; j++) {
binValue = '0'+binValue;
}
}
// Adding the binValue to the end string which we will return.
returnValue += binValue
}
var decompressedArr = returnValue.split('');
returnValue = ''; // Emptying the return variable.
// Deleting the not needed leading 0s used as padding.
for (var i = paddingAmount; i < decompressedArr.length; i++) {
returnValue += decompressedArr[i];
}
// Returning the decompressed string.
return returnValue;
}
URL shortener
I still found the "compressed" strings a little long for sharing / pasting around. So i used a simple URL shortener (view here) to make this process a little easier for the user.
Now you might ask, then why did you need to compress this string anyway?
Here's why:
First of all, my project is hosted on github pages (gh-pages). The info page of gh-pages tells us that the url can't be any longer than 2000 chars. This would mean that the max grid size would be the square root of 2000 - length of the base url, which isn't that big. By using this "compression" we are able to share much larger grids.
Now the second reason why is that, it's a challange. I find dealing with problems like these fun and also helpfull since you learn a lot.
Live
You can view the live version of my project here. and/or find the github repository here.
Thankyou
I want to thank everyone who helped me with this problem. Especially Corstian and Arnauld, since i ended up using their answers to reach my final functions.
Sooooo.... thanks guys! apriciate it!
In the Game of Life there is a board of ones and zeros. I want to back up to previous generation - size 4800 - save each 16 cells as hexadecimal = 1/4 the size. http://innerbeing.epizy.com/cwebgl/gameoflife.html [g = Go] [b = Backup]
function drawGen(n) {
stop(); var i = clamp(n,0,brw*brh-1), hex = gensave[i].toString();
echo(":",i, n,nGEN); nGEN = i; var str = '';
for (var i = 0; i < parseInt(hex.length / 4); i++)
str = str + pad(parseInt(hex.substr(i*4,4), 16).toString(2),16,'0');
for (var j=0;j<Board.length;j++) Board[j] = intr(str.substr(j,1));
drawBoard();
}
function Bin2Hex(n) {
var i = n.indexOf("1"); /// leading Zeros = NAN
if (i == -1) return "0000";
i = right(n,i*-1);
return pad(parseInt(i,2).toString(16),4,'0');
}
function saveGen(n) {
var b = Board.join(''), str = ''; /// concat array to string 10101
for (var i = 0; i < parseInt(b.length / 16); i++)
str = str + Bin2Hex(b.substr(i*16,16));
gensave[n] = str;
}
function right(st,n) {
var s = st.toString();
if (!n) return s;
if (n < 0) return s.substr(n * -1,s.length + n);
return s.substr(s.length - n,n);
}
function pad(str, l, padwith) {
var s = str;
while (s.length < l) s = padwith + s;
return s;
}

Compressing a Hex String in JavaScript/NodeJS

My app generates links, which contain hex string like: 37c1fbcabbc31f2f8d2ad31ceb91cd8d0d189ca5963dc6d353188d3d5e75b8b3e401d4e74e9b3e02efbff0792cda5c4620cb3b1f84aeb47b8d2225cd40e761a5. I would really like to make them shorter, like the solution mentioned for Ruby in Compressing a hex string in Ruby/Rails.
Is there a way to do this in JavaScript/NodeJS?
node int-encoder does this, using the strategy already mentioned.
it also supports large numbers
npm install int-encoder
var en = require('int-encoder');
//simple integer conversion
en.encode(12345678); // "ZXP0"
en.decode('ZXP0'); // 12345678
//convert big hex number using optional base argument
en.encode('e6c6b53d3c8160b22dad35a0f705ec09', 16); // 'hbDcW9aE89tzLYjDgyzajJ'
en.decode('hbDcW9aE89tzLYjDgyzajJ', 16); // 'e6c6b53d3c8160b22dad35a0f705ec09'
You could use toString and parseInt method, that basically are doing the same thing of the methods you mentioned in the link:
var hexString = "4b3fc1400";
var b36 = parseInt(hexString, 16).toString(36); // "9a29mgw"
And to convert it back, you just need to do the opposite:
hexString = parseInt(b36, 36).toString(16); // "4b3fc1400"
The only problem with your string, is that is too big to be threat as number in JavaScript. You should split them in chunk. JavaScript's numbers are accurate up to 2^53 (plus sign), so the max positive number you can handle is 0x20000000000000 (in hexadecimal, that is 9007199254740992 in decimal); you can use the accuracy to handle the chunk:
var hexString = "37c1fbcabbc31f2f8d2ad31ceb91cd8d0d189ca5963dc6d353188d3d5e75b8b3e401d4e74e9b3e02efbff0792cda5c4620cb3b1f84aeb47b8d2225cd40e761a5"
var b36 = "", b16 = "";
var chunk, intChunk;
// 14 is the length of 0x20000000000000 (2^53 in base 16)
for (var i = 0, max = 14; i < hexString.length; i += max) {
chunk = hexString.substr(i, max);
intChunk = parseInt(chunk, 16);
if (intChunk.toString(16) !== chunk) {
intChunk = parseInt(hexString.substr(i, max - 1), 16);
i -= 1;
}
b36 += intChunk.toString(36)
}
// 11 is the length of 2gosa7pa2gv (2^53 in base 36)
for (var i = 0, max = 11; i < b36.length; i += max ) {
chunk = b36.substr(i, max);
intChunk = parseInt(chunk, 36);
if (intChunk.toString(36) !== chunk) {
intChunk = parseInt(b36.substr(i, max - 1), 36);
i -= 1;
}
b16 += intChunk.toString(16)
}
console.log(hexString);
console.log(b36);
console.log(b16);
Update: You could also use a base 62 instead of 36 to compress more, but notice that JS supports up to base 36, so you need to implement that personal notation manually (I believe there are already some implementation around).
The simplest and fastest thing to do is define a set of 64 safe characters for use in the URL, such as A-Z, a-z, 0-9, _, and $. Then encode every three hex digits (4 bits each) into two safe characters (6 bits each). This requires no multiplication and division, and it can be used on arbitrarily long strings.
You will need to pick a 65th character to use at the end of the string to indicate if the last four-bit piece is used or not. Otherwise you will have an ambiguity for character strings with an even number of characters. Let's call it 2n. Then there are either 3n-1 or 3n hex digits encoded within, but there is no way to tell which. You can follow the sequence with a special character to indicate one of those cases. E.g. a '.' (period).
Note: The last few characters picked here for the set differ from Base64 encoding, since URLs have their own definition of safe punctuation characters. See RFC 1738.

Opposite of Number.toExponential in JS

I need to get the value of an extremely large number in JavaScript in non-exponential form. Number.toFixed simply returns it in exponential form as a string, which is worse than what I had.
This is what Number.toFixed returns:
>>> x = 1e+31
1e+31
>>> x.toFixed()
"1e+31"
Number.toPrecision also does not work:
>>> x = 1e+31
1e+31
>>> x.toPrecision( 21 )
"9.99999999999999963590e+30"
What I would like is:
>>> x = 1e+31
1e+31
>>> x.toNotExponential()
"10000000000000000000000000000000"
I could write my own parser but I would rather use a native JS method if one exists.
You can use toPrecision with a parameter specifying how many digits you want to display:
x.toPrecision(31)
However, among the browsers I tested, the above code only works on Firefox. According to the ECMAScript specification, the valid range for toPrecision is 1 to 21, and both IE and Chrome throw a RangeError accordingly. This is due to the fact that the floating-point representation used in JavaScript is incapable of actually representing numbers to 31 digits of precision.
Use Number(string)
Example :
var a = Number("1.1e+2");
Return :
a = 110
The answer is there's no such built-in function. I've searched high and low.
Here's the RegExp I use to split the number into sign, coefficient (digits before decimal point), fractional part (digits after decimal point) and exponent:
/^([+-])?(\d+)\.?(\d*)[eE]([+-]?\d+)$/
"Roll your own" is the answer, which you already did.
It's possible to expand JavaScript's exponential output using string functions. Admittedly, what I came up is somewhat cryptic, but it works if the exponent after the e is positive:
var originalNumber = 1e+31;
var splitNumber = originalNumber.toString().split('e');
var result;
if(splitNumber[1]) {
var regexMatch = splitNumber[0].match(/^([^.]+)\.?(.*)$/);
result =
/* integer part */ regexMatch[1] +
/* fractional part */ regexMatch[2] +
/* trailing zeros */ Array(splitNumber[1] - regexMatch[2].length + 1).join('0');
} else result = splitNumber[0];
"10000000000000000000000000000000"?
Hard to believe that anybody would rather look at that than 1.0e+31,
or in html: 1031.
But here's one way, much of it is for negative exponents(fractions):
function longnumberstring(n){
var str, str2= '', data= n.toExponential().replace('.','').split(/e/i);
str= data[0], mag= Number(data[1]);
if(mag>=0 && str.length> mag){
mag+=1;
return str.substring(0, mag)+'.'+str.substring(mag);
}
if(mag<0){
while(++mag) str2+= '0';
return '0.'+str2+str;
}
mag= (mag-str.length)+1;
while(mag> str2.length){
str2+= '0';
}
return str+str2;
}
input: 1e+30
longnumberstring: 1000000000000000000000000000000
to Number: 1e+30
input: 1.456789123456e-30
longnumberstring: 0.000000000000000000000000000001456789123456
to Number: 1.456789123456e-30
input: 1.456789123456e+30
longnumberstring: 1456789123456000000000000000000
to Number: 1.456789123456e+30
input: 1e+80 longnumberstring: 100000000000000000000000000000000000000000000000000000000000000000000000000000000
to Number: 1e+80

bitwise AND in Javascript with a 64 bit integer

I am looking for a way of performing a bitwise AND on a 64 bit integer in JavaScript.
JavaScript will cast all of its double values into signed 32-bit integers to do the bitwise operations (details here).
Javascript represents all numbers as 64-bit double precision IEEE 754 floating point numbers (see the ECMAscript spec, section 8.5.) All positive integers up to 2^53 can be encoded precisely. Larger integers get their least significant bits clipped. This leaves the question of how can you even represent a 64-bit integer in Javascript -- the native number data type clearly can't precisely represent a 64-bit int.
The following illustrates this. Although javascript appears to be able to parse hexadecimal numbers representing 64-bit numbers, the underlying numeric representation does not hold 64 bits. Try the following in your browser:
<html>
<head>
<script language="javascript">
function showPrecisionLimits() {
document.getElementById("r50").innerHTML = 0x0004000000000001 - 0x0004000000000000;
document.getElementById("r51").innerHTML = 0x0008000000000001 - 0x0008000000000000;
document.getElementById("r52").innerHTML = 0x0010000000000001 - 0x0010000000000000;
document.getElementById("r53").innerHTML = 0x0020000000000001 - 0x0020000000000000;
document.getElementById("r54").innerHTML = 0x0040000000000001 - 0x0040000000000000;
}
</script>
</head>
<body onload="showPrecisionLimits()">
<p>(2^50+1) - (2^50) = <span id="r50"></span></p>
<p>(2^51+1) - (2^51) = <span id="r51"></span></p>
<p>(2^52+1) - (2^52) = <span id="r52"></span></p>
<p>(2^53+1) - (2^53) = <span id="r53"></span></p>
<p>(2^54+1) - (2^54) = <span id="r54"></span></p>
</body>
</html>
In Firefox, Chrome and IE I'm getting the following. If numbers were stored in their full 64-bit glory, the result should have been 1 for all the substractions. Instead, you can see how the difference between 2^53+1 and 2^53 is lost.
(2^50+1) - (2^50) = 1
(2^51+1) - (2^51) = 1
(2^52+1) - (2^52) = 1
(2^53+1) - (2^53) = 0
(2^54+1) - (2^54) = 0
So what can you do?
If you choose to represent a 64-bit integer as two 32-bit numbers, then applying a bitwise AND is as simple as applying 2 bitwise AND's, to the low and high 32-bit 'words'.
For example:
var a = [ 0x0000ffff, 0xffff0000 ];
var b = [ 0x00ffff00, 0x00ffff00 ];
var c = [ a[0] & b[0], a[1] & b[1] ];
document.body.innerHTML = c[0].toString(16) + ":" + c[1].toString(16);
gets you:
ff00:ff0000
Here is code for AND int64 numbers, you can replace AND with other bitwise operation
function and(v1, v2) {
var hi = 0x80000000;
var low = 0x7fffffff;
var hi1 = ~~(v1 / hi);
var hi2 = ~~(v2 / hi);
var low1 = v1 & low;
var low2 = v2 & low;
var h = hi1 & hi2;
var l = low1 & low2;
return h*hi + l;
}
This can now be done with the new BigInt built-in numeric type. BigInt is currently (July 2019) only available in certain browsers, see the following link for details:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/BigInt
I have tested bitwise operations using BigInts in Chrome 67 and can confirm that they work as expected with up to 64 bit values.
Javascript doesn't support 64 bit integers out of the box. This is what I ended up doing:
Found long.js, a self contained Long implementation on github.
Convert the string value representing the 64 bit number to a Long.
Extract the high and low 32 bit values
Do a 32 bit bitwise and between the high and low bits, separately
Initialise a new 64 bit Long from the low and high bit
If the number is > 0 then there is correlation between the two numbers
Note: for the code example below to work you need to load
long.js.
// Handy to output leading zeros to make it easier to compare the bits when outputting to the console
function zeroPad(num, places){
var zero = places - num.length + 1;
return Array(+(zero > 0 && zero)).join('0') + num;
}
// 2^3 = 8
var val1 = Long.fromString('8', 10);
var val1High = val1.getHighBitsUnsigned();
var val1Low = val1.getLowBitsUnsigned();
// 2^61 = 2305843009213693960
var val2 = Long.fromString('2305843009213693960', 10);
var val2High = val2.getHighBitsUnsigned();
var val2Low = val2.getLowBitsUnsigned();
console.log('2^3 & (2^3 + 2^63)')
console.log(zeroPad(val1.toString(2), 64));
console.log(zeroPad(val2.toString(2), 64));
var bitwiseAndResult = Long.fromBits(val1Low & val2Low, val1High & val2High, true);
console.log(bitwiseAndResult);
console.log(zeroPad(bitwiseAndResult.toString(2), 64));
console.log('Correlation betwen val1 and val2 ?');
console.log(bitwiseAndResult > 0);
Console output:
2^3
0000000000000000000000000000000000000000000000000000000000001000
2^3 + 2^63
0010000000000000000000000000000000000000000000000000000000001000
2^3 & (2^3 + 2^63)
0000000000000000000000000000000000000000000000000000000000001000
Correlation between val1 and val2?
true
The Closure library has goog.math.Long with a bitwise add() method.
Unfortunately, the accepted answer (and others) appears not to have been adequately tested. Confronted by this problem recently, I initially tried to split my 64-bit numbers into two 32-bit numbers as suggested, but there's another little wrinkle.
Open your JavaScript console and enter:
0x80000001
When you press Enter, you'll obtain 2147483649, the decimal equivalent. Next try:
0x80000001 & 0x80000003
This gives you -2147483647, not quite what you expected. It's clear that in performing the bitwise AND, the numbers are treated as signed 32-bit integers. And the result is wrong. Even if you negate it.
My solution was to apply ~~ to the 32-bit numbers after they were split off, check for a negative sign, and then deal with this appropriately.
This is clumsy. There may be a more elegant 'fix', but I can't see it on quick examination. There's a certain irony that something that can be accomplished by a couple of lines of assembly should require so much more labour in JavaScript.

Categories

Resources