Compressing a Hex String in JavaScript/NodeJS

Compressing a Hex String in JavaScript/NodeJS - javascript

My app generates links, which contain hex string like: 37c1fbcabbc31f2f8d2ad31ceb91cd8d0d189ca5963dc6d353188d3d5e75b8b3e401d4e74e9b3e02efbff0792cda5c4620cb3b1f84aeb47b8d2225cd40e761a5. I would really like to make them shorter, like the solution mentioned for Ruby in Compressing a hex string in Ruby/Rails.
Is there a way to do this in JavaScript/NodeJS?

node int-encoder does this, using the strategy already mentioned.
it also supports large numbers
npm install int-encoder
var en = require('int-encoder');
//simple integer conversion
en.encode(12345678); // "ZXP0"
en.decode('ZXP0'); // 12345678
//convert big hex number using optional base argument
en.encode('e6c6b53d3c8160b22dad35a0f705ec09', 16); // 'hbDcW9aE89tzLYjDgyzajJ'
en.decode('hbDcW9aE89tzLYjDgyzajJ', 16); // 'e6c6b53d3c8160b22dad35a0f705ec09'

You could use toString and parseInt method, that basically are doing the same thing of the methods you mentioned in the link:
var hexString = "4b3fc1400";
var b36 = parseInt(hexString, 16).toString(36); // "9a29mgw"
And to convert it back, you just need to do the opposite:
hexString = parseInt(b36, 36).toString(16); // "4b3fc1400"
The only problem with your string, is that is too big to be threat as number in JavaScript. You should split them in chunk. JavaScript's numbers are accurate up to 2^53 (plus sign), so the max positive number you can handle is 0x20000000000000 (in hexadecimal, that is 9007199254740992 in decimal); you can use the accuracy to handle the chunk:
var hexString = "37c1fbcabbc31f2f8d2ad31ceb91cd8d0d189ca5963dc6d353188d3d5e75b8b3e401d4e74e9b3e02efbff0792cda5c4620cb3b1f84aeb47b8d2225cd40e761a5"
var b36 = "", b16 = "";
var chunk, intChunk;
// 14 is the length of 0x20000000000000 (2^53 in base 16)
for (var i = 0, max = 14; i < hexString.length; i += max) {
chunk = hexString.substr(i, max);
intChunk = parseInt(chunk, 16);
if (intChunk.toString(16) !== chunk) {
intChunk = parseInt(hexString.substr(i, max - 1), 16);
i -= 1;
}
b36 += intChunk.toString(36)
}
// 11 is the length of 2gosa7pa2gv (2^53 in base 36)
for (var i = 0, max = 11; i < b36.length; i += max ) {
chunk = b36.substr(i, max);
intChunk = parseInt(chunk, 36);
if (intChunk.toString(36) !== chunk) {
intChunk = parseInt(b36.substr(i, max - 1), 36);
i -= 1;
}
b16 += intChunk.toString(16)
}
console.log(hexString);
console.log(b36);
console.log(b16);
Update: You could also use a base 62 instead of 36 to compress more, but notice that JS supports up to base 36, so you need to implement that personal notation manually (I believe there are already some implementation around).

The simplest and fastest thing to do is define a set of 64 safe characters for use in the URL, such as A-Z, a-z, 0-9, _, and $. Then encode every three hex digits (4 bits each) into two safe characters (6 bits each). This requires no multiplication and division, and it can be used on arbitrarily long strings.
You will need to pick a 65th character to use at the end of the string to indicate if the last four-bit piece is used or not. Otherwise you will have an ambiguity for character strings with an even number of characters. Let's call it 2n. Then there are either 3n-1 or 3n hex digits encoded within, but there is no way to tell which. You can follow the sequence with a special character to indicate one of those cases. E.g. a '.' (period).
Note: The last few characters picked here for the set differ from Base64 encoding, since URLs have their own definition of safe punctuation characters. See RFC 1738.

Related

How to generate a fixed-length code from a set of integers of a specific bit count in JavaScript

Generate string from integer with arbitrary base in JavaScript received the following answer:
function parseInt(value, code) {
return [...value].reduce((r, a) => r * code.length + code.indexOf(a), 0);
}
function toString(value, code) {
var digit,
radix= code.length,
result = '';
do {
digit = value % radix;
result = code[digit] + result;
value = Math.floor(value / radix);
} while (value)
return result;
}
console.log(parseInt('dj', 'abcdefghijklmnopqrstuvwxyz0123456789+-'));
console.log(toString(123, 'abcdefghijklmnopqrstuvwxyz0123456789+-'));
console.log(parseInt('a', 'abcdefghijklmnopqrstuvwxyz0123456789+-'));
console.log(toString(0, 'abcdefghijklmnopqrstuvwxyz0123456789+-'));
I am interested something slightly different. Whereas this will generate the shortest code for the number, I would like to now generate a constant-length code based on the number of bits. I am not sure if this is also a complex radix solution as well.
Say I want to generate 8-bit codes using a 16-character alphabet. That means I should be able to take the first 4 bits to select 1 character, and the next 4 bits to select the second character. So I might end up with MV if my 16 character set was ABDHNMOPQRSTUVYZ. Likewise if I had a 16-bit range, I would have 4 character code, and 32-bit range would be an 8-character code. So calling code32(1, 'ABDHNMOPQRSTUVYZ') would give an 8 letter code, while code8(1, 'ABDHNMOPQRSTUVYZ') would give a 2 digit code.
How could that be implemented in JavaScript? Something along these lines?
code8(i, alpha) // 0 to 255 it accepts
code16(i, alpha) // 0 to 65535 it accepts
code32(i, alpha) // 0 to 2^32-1 it accepts
Likewise, how would you get the string code back into the original number (or bit sequence)?

This really comes down to changing toString so that:
It only accepts a code that has a length of a power of 2
It pads the result to a given number of "digits" (characters)
The actual number of digits you would use for a 16 bit number depends on the size of the code. If the code has 16 characters, then it can cover for 4 bits, and so an output of 4 characters would be needed. If however the code has 4 characters, then the output would need 8 characters. You can have cases where the match is not exact, like when you would have a code with 8 characters. Then the output would need 6 characters.
Here I have highlighted the changes to the toString method. My personal preference is to also put the value as last parameter to toString.
function toString(digitCount, code, value) { // <-- add argument digitCount
// Perform a sanity check: code must have a length that is power of 2
if (Math.log2(code.length) % 1) throw "code size not power of 2: " + code.length;
var digit,
radix = code.length,
result = '';
do {
digit = value % radix;
result = code[digit] + result;
value = Math.floor(value / radix);
} while (value)
return result.padStart(digitCount, code[0]); // Pad to the desired output size
}
console.log(toString(4, 'abcdefghijklmnop', 123));
console.log(toString(4, 'abcdefghijklmnop', 0));
console.log(toString(4, 'abcdefghijklmnop', 0xFFFF));
// You could define some more specific functions
const code8 = (code, value) => toString(Math.ceil(8 / Math.log2(code.length)), code, value);
const code16 = (code, value) => toString(Math.ceil(16 / Math.log2(code.length)), code, value);
console.log(code16('abcdefghijklmnop', 123));
console.log(code16('abcdefghijklmnop', 0));
console.log(code16('abcdefghijklmnop', 0xFFFF));
console.log(code8('abcdefghijklmnop', 123));
console.log(code8('abcdefghijklmnop', 0));
console.log(code8('abcdefghijklmnop', 0xFF));

EDIT: I just noticed that you required a decoder as well. It is easy to implement a non-optimal version too, while an optimal one can be implemented via go through each letter and accumulate their value times their weighs.
Is this what you want? I tested this code for bit=16 and bit=8, but when bit=32 the count of codewords becomes too large and hangs the devtools of the browser. It's only a demonstrative code and may need optimization if need to be applied in practical use...
function genCode(len, alpha){
let tmp = [...alpha];
for(let i = 1; i != len; ++i){
const ttmp = [];
tmp.forEach(te => {
[...alpha].forEach(e => {
ttmp.push(te + e);
});
});
tmp = ttmp;
}
return tmp;
}
function code(bits, i, alpha){
const len = Math.ceil(bits / Math.floor(Math.log2(alpha.length)));
return genCode(len, alpha)[i];
}
function decode(bits, c, alpha){
const len = Math.ceil(bits / Math.floor(Math.log2(alpha.length)));
const codes = genCode(len, alpha);
return codes.indexOf(c);
}
console.log(code(16, 2, "ABDHNMOPQRSTUVYZ"));
console.log(decode(16, "AAAD", "ABDHNMOPQRSTUVYZ"));
console.log(code(8, 255, "ABDHNMOPQRSTUVYZ"));
console.log(decode(8, "ZZ", "ABDHNMOPQRSTUVYZ"));

compressing a string of 0's and 1's in js

Itroduction
I'm currently working on John Conway's Game of Life in js. I have the game working (view here) and i'm working on extra functionalities such as sharing your "grid / game" to your friends. To do this i'm extracting the value's of the grid (if the cell is alive or dead) into a long string of 0's and 1's.
This string has a variable length since the grid is not always the same size. for example:
grid 1 has a length and width of 30 => so the string's length is 900
grid 2 has a length and width of 50 => so the string's length is 2500
The problem
As you can see these string's of 0's and 1's are way too long to copy around and share.
However hard i try I don't seem to be able to come up with a code that would compress a string this long to a easy to handle one.
Any ideas on how to compress (and decompress) this?
I have considered simply writing down every possible grid option for the gird sizes 1x1 to 100x100 and giving them a key/reference to use as sharable code. Doing that by hand would be madness but maybe any of you has an idea on how to create an algorithm that can do this?
GitHub repository

In case it wasn't already obvious, the string you're trying to store looks like a binary string.
Counting systems
Binary is a number in base-2. This essentially means that there are two characters being used to keep count. Normally we are used to count with base-10 (decimal characters). In computer science the hexadecimal system (base-16) is also widely being used.
Since you're not storing the bits as bits but as bytes (use var a = 0b1100001; if you ever wish to store them like bits) the 'binary' you wish to store just takes as much space as any other random string with the same length.
Since you're using the binary system each position just has 2 possible values. When using the hexadecimal value a single position can hold up to 16 possible values. This is already a big improvement when it comes to storing the data compactly. As an example 0b11111111 and 0xff both represents the decimal number 255.
In your situation that'd shave 6 bytes of every 8 bytes you have to store. In the end you'd be stuck with a string just 1/4th of the length of the original string.
Javascript implementation
Essentially what we want to do is to interpret the string you store as binary and retrieve the hexadecimal value. Luckily JavaScript has built in functionality to achieve stuff like this:
var bin =
'1110101110100011' +
'0000101111100001' +
'1010010101011010' +
'0000110111011111' +
'1111111001010101' +
'0111000011100001' +
'1011010100110001' +
'0111111110010100' +
'0111110110100101' +
'0000111101100111' +
'1100001111011100' +
'0101011100001111' +
'0110011011001101' +
'1000110010001001' +
'1010100010000011' +
'0011110000000000';
var returnValue = '';
for (var i = 0; i < parseInt(bin.length / 8); i++) {
returnValue += parseInt(bin.substr(i*8, 8), 2).toString(16);
}
console.log(bin.length); // Will return 265
console.log(returnValue.length); // Will return 64
We're saying "parse this string and interpret it like a base-2 number and store it as a hexadecimal string".
Decoding is practically the same. Replace all occurrences of the number 8 in the example above with 2 and vice versa.
Please note
A prerequisite for this code to work correctly is that the binary length is dividable by 8. See the following example:
parseInt('00011110', 2).toString(16); // returns '1e'
parseInt('1e', 16).toString(2); // returns '11110'
// Technically both representations still have the same decimal value
When decoding you should add leading zeros until you have a full byte (8 bits).
In case the positions you have to store are not dividable by 8 you can, for example, add padding and add a number to the front of the output string to identify how much positions to strip.
Wait, there's more
To get even shorter strings you can build a lookup table with 265 characters in which you search for the character associated with the specific position. (This works because you're still storing the hexadecimal value as a string.) Sadly neither the ASCII nor the UTF-8 encodings are suited for this as there are blocks with values which have no characters defined.
It may look like:
// Go fill this array until you have 265 values within it.
var lookup = ['A', 'B', 'C', 'D'];
var smallerValue = lookup[0x00];
This way you can have 265 possible values at a single position, AND you have used your byte to the fullest.
Please note that no real compression is happening here. We're rather utilising data types to be used more efficiently for your current use case.

If we make the assumption than the grid contains much more 0's than 1's, you may want to try this simple compression scheme:
convert the binary string to an hexadecimal string
convert '00' sub-strings to 'z' symbol
convert 'zz' sub-strings to 'Z' symbol
we could go further, but let's stop here for the demo
Below is an example with a 16x16 grid:
var bin =
'0000000000000000' +
'0000001000000000' +
'0000011100000000' +
'0000001000000000' +
'0000000000000000' +
'0000000000111000' +
'0000100000111000' +
'0000000000111000' +
'0000000000000000' +
'0000000000000000' +
'0000000010000000' +
'0000000101000000' +
'0000000010000000' +
'0000000000000000' +
'0000100000000000' +
'0000000000000000';
var packed = bin
.match(/(.{4})/g)
.map(function(x) {
return parseInt(x, 2).toString(16);
})
.join('')
.replace(/00/g, 'z')
.replace(/zz/g, 'Z');
This will produce the string "Z02z07z02ZZ380838z38ZZz8z14z08Zz8Zz".
The unpacking process is doing the exact opposite:
var bin = packed
.replace(/Z/g, 'zz')
.replace(/z/g, '00')
.split('')
.map(function(x) {
return ('000' + parseInt(x, 16).toString(2)).substr(-4, 4);
})
.join('');
Note that this code will only work correctly if the length of the input string is a multiple of 4. If it's not the case, you'll have to pad the input and crop the output.
EDIT : 2nd method
If the input is completely random -- with roughly as many 0's as 1's and no specific repeating patterns -- the best you can do is probably to convert the binary string to a BASE64 string. It will be significantly shorter (this time with a fixed compression ratio of about 17%) and can still be copied/pasted by the user.
Packing:
var bin =
'1110101110100011' +
'0000101111100001' +
'1010010101011010' +
'0000110111011111' +
'1111111001010101' +
'0111000011100001' +
'1011010100110001' +
'0111111110010100' +
'0111110110100101' +
'0000111101100111' +
'1100001111011100' +
'0101011100001111' +
'0110011011001101' +
'1000110010001001' +
'1010100010000011' +
'0011110000000000';
var packed =
btoa(
bin
.match(/(.{8})/g)
.map(function(x) {
return String.fromCharCode(parseInt(x, 2));
})
.join('')
);
Will produce the string "66ML4aVaDd/+VXDhtTF/lH2lD2fD3FcPZs2MiaiDPAA=".
Unpacking:
var bin =
atob(packed)
.split('')
.map(function(x) {
return ('0000000' + x.charCodeAt(0).toString(2)).substr(-8, 8);
})
.join('');
Or if you want to go a step further, you can consider using something like base91 instead, for a reduced encoding overhead.

LZ-string
Using LZ-string I was able to compress the "code" quite a bit.
By simply compressing it to base64 like this:
var compressed = LZString.compressToBase64(string)
Decompressing is also just as simple as this:
var decompressed = LZString.decompressFromBase64(compressed)
However the length of this compressed string is still pretty long given that you have about as many 0s as 1s (not given in the example)
example
But the compression does work.

ANSWER
For any of you who are wondering how exactly I ended up doing it, here's how:
First I made sure every string passed in would be padded with leading 0s untill it was devidable by 8. (saving the amount of 0s used to pad, since they're needed while decompressing)
I used Corstian's answer and functions to compress my string (interpreted as binary) into a hexadecimal string. Although i had to make one slight alteration.
Not every binary substring with a lenght of 8 will return exactly 2 hex characters. so for those cases i ended up just adding a 0 in front of the substring. The hex substring will have the same value but it's length will now be 2.
Next up i used a functionality from Arnaulds answer. Taking every double character and replacing it with a single character (one not used in the hexadecimal alphabet to avoid conflict). I did this twice for every hexadecimal character.
For example:
the hex string 11 will become h and hh will become H
01101111 will become 0h0H
Since most grids are gonna have more dead cells then alive ones, I made sure the 0s would be able to compress even further, using Arnaulds method again but going a step further.
00 -> g | gg -> G | GG -> w | ww -> W | WW -> x | xx -> X | XX-> y | yy -> Y | YY -> z | zz -> Z
This resulted in Z representing 4096 (binary) 0s
The last step of the compression was adding the amount of leading 0s in front of the compressed string, so we can shave those off at the end of decompressing.
This is how the returned string looks in the end.
amount of leading 0s-compressed string so a 64*64 empty grid, will result in 0-Z
Decompressing is practically doing everything the other way around.
Firstly splitting the number that represents how many leading 0s we've used as padding from the compressed string.
Then using Arnaulds functionality, turning the further "compressed" characters back into hexadecimal code.
Taking this hex string and turning it back into binary code. Making sure, as Corstian pointed out, that every binary substring will have a length of 8. (ifnot we pad the substrings with leading 0s untill the do, exactly, have a length of 8)
And then the last step is to shave off the leading 0s we've used as padding to make the begin string devidable by 8.
The functions
Function I use to compress:
/**
* Compresses the a binary string into a compressed string.
* Returns the compressed string.
*/
Codes.compress = function(bin) {
bin = bin.toString(); // To make sure the binary is a string;
var returnValue = ''; // Empty string to add our data to later on.
// If the lenght of the binary string is not devidable by 8 the compression
// won't work correctly. So we add leading 0s to the string and store the amount
// of leading 0s in a variable.
// Determining the amount of 'padding' needed.
var padding = ((Math.ceil(bin.length/8))*8)-bin.length;
// Adding the leading 0s to the binary string.
for (var i = 0; i < padding; i++) {
bin = '0'+bin;
}
for (var i = 0; i < parseInt(bin.length / 8); i++) {
// Determining the substring.
var substring = bin.substr(i*8, 8)
// Determining the hexValue of this binary substring.
var hexValue = parseInt(substring, 2).toString(16);
// Not all binary values produce two hex numbers. For example:
// '00000011' gives just a '3' while what we wand would be '03'. So we add a 0 in front.
if(hexValue.length == 1) hexValue = '0'+hexValue;
// Adding this hexValue to the end string which we will return.
returnValue += hexValue;
}
// Compressing the hex string even further.
// If there's any double hex chars in the string it will take those and compress those into 1 char.
// Then if we have multiple of those chars these are compressed into 1 char again.
// For example: the hex string "ff will result in a "v" and "ffff" will result in a "V".
// Also: "11" will result in a "h" and "1111" will result in a "H"
// For the 0s this process is repeated a few times.
// (string with 4096 0s) (this would represent a 64*64 EMPTY grid)
// will result in a "Z".
var returnValue = returnValue.replace(/00/g, 'g')
.replace(/gg/g, 'G')
// Since 0s are probably more likely to exist in our binary and hex, we go a step further compressing them like this:
.replace(/GG/g, 'w')
.replace(/ww/g, 'W')
.replace(/WW/g, 'x')
.replace(/xx/g, 'X')
.replace(/XX/g, 'y')
.replace(/yy/g, 'Y')
.replace(/YY/g, 'z')
.replace(/zz/g, 'Z')
//Rest of the chars...
.replace(/11/g, 'h')
.replace(/hh/g, 'H')
.replace(/22/g, 'i')
.replace(/ii/g, 'I')
.replace(/33/g, 'j')
.replace(/jj/g, 'J')
.replace(/44/g, 'k')
.replace(/kk/g, 'K')
.replace(/55/g, 'l')
.replace(/ll/g, 'L')
.replace(/66/g, 'm')
.replace(/mm/g, 'M')
.replace(/77/g, 'n')
.replace(/nn/g, 'N')
.replace(/88/g, 'o')
.replace(/oo/g, 'O')
.replace(/99/g, 'p')
.replace(/pp/g, 'P')
.replace(/aa/g, 'q')
.replace(/qq/g, 'Q')
.replace(/bb/g, 'r')
.replace(/rr/g, 'R')
.replace(/cc/g, 's')
.replace(/ss/g, 'S')
.replace(/dd/g, 't')
.replace(/tt/g, 'T')
.replace(/ee/g, 'u')
.replace(/uu/g, 'U')
.replace(/ff/g, 'v')
.replace(/vv/g, 'V');
// Adding the number of leading 0s that need to be ignored when decompressing to the string.
returnValue = padding+'-'+returnValue;
// Returning the compressed string.
return returnValue;
}
The function I use to decompress:
/**
* Decompresses the compressed string back into a binary string.
* Returns the decompressed string.
*/
Codes.decompress = function(compressed) {
var returnValue = ''; // Empty string to add our data to later on.
// Splitting the input on '-' to seperate the number of paddin 0s and the actual hex code.
var compressedArr = compressed.split('-');
var paddingAmount = compressedArr[0]; // Setting a variable equal to the amount of leading 0s used while compressing.
compressed = compressedArr[1]; // Setting the compressed variable to the actual hex code.
// Decompressing further compressed characters.
compressed = compressed// Decompressing the further compressed 0s. (even further then the rest of the chars.)
.replace(/Z/g, 'zz')
.replace(/z/g, 'YY')
.replace(/Y/g, 'yy')
.replace(/y/g, 'XX')
.replace(/X/g, 'xx')
.replace(/x/g, 'WW')
.replace(/W/g, 'ww')
.replace(/w/g, 'GG')
.replace(/G/g, 'gg')
.replace(/g/g, '00')
// Rest of chars...
.replace(/H/g, 'hh')
.replace(/h/g, '11')
.replace(/I/g, 'ii')
.replace(/i/g, '22')
.replace(/J/g, 'jj')
.replace(/j/g, '33')
.replace(/K/g, 'kk')
.replace(/k/g, '44')
.replace(/L/g, 'll')
.replace(/l/g, '55')
.replace(/M/g, 'mm')
.replace(/m/g, '66')
.replace(/N/g, 'nn')
.replace(/n/g, '77')
.replace(/O/g, 'oo')
.replace(/o/g, '88')
.replace(/P/g, 'pp')
.replace(/p/g, '99')
.replace(/Q/g, 'qq')
.replace(/q/g, 'aa')
.replace(/R/g, 'rr')
.replace(/r/g, 'bb')
.replace(/S/g, 'ss')
.replace(/s/g, 'cc')
.replace(/T/g, 'tt')
.replace(/t/g, 'dd')
.replace(/U/g, 'uu')
.replace(/u/g, 'ee')
.replace(/V/g, 'vv')
.replace(/v/g, 'ff');
for (var i = 0; i < parseInt(compressed.length / 2); i++) {
// Determining the substring.
var substring = compressed.substr(i*2, 2);
// Determining the binValue of this hex substring.
var binValue = parseInt(substring, 16).toString(2);
// If the length of the binary value is not equal to 8 we add leading 0s (js deletes the leading 0s)
// For instance the binary number 00011110 is equal to the hex number 1e,
// but simply running the code above will return 11110. So we have to add the leading 0s back.
if (binValue.length != 8) {
// Determining how many 0s to add:
var diffrence = 8 - binValue.length;
// Adding the 0s:
for (var j = 0; j < diffrence; j++) {
binValue = '0'+binValue;
}
}
// Adding the binValue to the end string which we will return.
returnValue += binValue
}
var decompressedArr = returnValue.split('');
returnValue = ''; // Emptying the return variable.
// Deleting the not needed leading 0s used as padding.
for (var i = paddingAmount; i < decompressedArr.length; i++) {
returnValue += decompressedArr[i];
}
// Returning the decompressed string.
return returnValue;
}
URL shortener
I still found the "compressed" strings a little long for sharing / pasting around. So i used a simple URL shortener (view here) to make this process a little easier for the user.
Now you might ask, then why did you need to compress this string anyway?
Here's why:
First of all, my project is hosted on github pages (gh-pages). The info page of gh-pages tells us that the url can't be any longer than 2000 chars. This would mean that the max grid size would be the square root of 2000 - length of the base url, which isn't that big. By using this "compression" we are able to share much larger grids.
Now the second reason why is that, it's a challange. I find dealing with problems like these fun and also helpfull since you learn a lot.
Live
You can view the live version of my project here. and/or find the github repository here.
Thankyou
I want to thank everyone who helped me with this problem. Especially Corstian and Arnauld, since i ended up using their answers to reach my final functions.
Sooooo.... thanks guys! apriciate it!

In the Game of Life there is a board of ones and zeros. I want to back up to previous generation - size 4800 - save each 16 cells as hexadecimal = 1/4 the size. http://innerbeing.epizy.com/cwebgl/gameoflife.html [g = Go] [b = Backup]
function drawGen(n) {
stop(); var i = clamp(n,0,brw*brh-1), hex = gensave[i].toString();
echo(":",i, n,nGEN); nGEN = i; var str = '';
for (var i = 0; i < parseInt(hex.length / 4); i++)
str = str + pad(parseInt(hex.substr(i*4,4), 16).toString(2),16,'0');
for (var j=0;j<Board.length;j++) Board[j] = intr(str.substr(j,1));
drawBoard();
}
function Bin2Hex(n) {
var i = n.indexOf("1"); /// leading Zeros = NAN
if (i == -1) return "0000";
i = right(n,i*-1);
return pad(parseInt(i,2).toString(16),4,'0');
}
function saveGen(n) {
var b = Board.join(''), str = ''; /// concat array to string 10101
for (var i = 0; i < parseInt(b.length / 16); i++)
str = str + Bin2Hex(b.substr(i*16,16));
gensave[n] = str;
}
function right(st,n) {
var s = st.toString();
if (!n) return s;
if (n < 0) return s.substr(n * -1,s.length + n);
return s.substr(s.length - n,n);
}
function pad(str, l, padwith) {
var s = str;
while (s.length < l) s = padwith + s;
return s;
}

Compresing / decompresing a binary string into/from hex in javascript not working

Introduction
I'm currently working on John Conway's Game of Life in js. I have the game working (view here) and i'm working on extra functionalities such as sharing your "grid / game" to your friends. To do this i'm extracting the value's of the grid (if the cell is alive or dead) into a long string of 0's and 1's.
This long string can be seen as binary code and im trying to "compress" it into a hexadecimal string by chopping the binary up into substrings with a lenght of 8 and then determining its hexadecimal value. decompressing works the other way around. Deviding the hex string into bits of two and determining its binary value.
parseInt('00011110', 2).toString(16); // returns '1e'
parseInt('1e', 16).toString(2); // returns '11110'
// Technically both representations still have the same decimal value
As shown above js will cut off the leading 0s since they're 'not needed'.
I've fixed this problem by looking if the lenght of the binary string returned by the function is 8, ifnot it adds enough 0s in front untill its length is exactly 8.
It could be that this function is not working correctly but i'm not sure.
It seems to work with small binary values.
please note you can only put in strings with a length devidable by 8
The problem
Longer binary strings don't seem to work (shown below) and this is probably not caused by overflow (that would probably result in a long row of 0s at the end).
EDIT:
var a = "1000011101110101100011000000001011111100111011010011110000000100101000000111111010111111110101100001100101110001100110110101000111110001001010110111001010100011010010111001110010111001101100000100001001101000001010101110001001001110101001110001001111010110011000010100001111000111000011000101010110010011101100000100011101101110110000100101000110011101101011011111010111001001000101000001001111010010010010100000110101101101110101110101010101111101100110101110100100110000010000000110000100000001110001011001011011000101111110101000100011010100011001000101111001000010001011001011100100110001101100001111110110000000111010100101110110101110110111001100000001001100111110000111001010111110110100010111001011101110011011100100111010001100010111100111011010111110111101010000111101010100011000000111000010101011101101011110010011001110000111100000111011111011000000100000010100001111110101001110001100011001"
a.length
904
var c = compress(a)
c
"87758c2fced3c4a07ebfd619719b51f12b72a34b9cb9b042682ae24ea713d66143c7c5593b0476ec2519dadf5c91413d24ad6dd7557d9ae93040611c596c5fa88d4645e422cb931b0fd80ea5daedcc04cf872bed172ee6e4e8c5e76bef5f546070abb5e4ce1eefb25fd4e319"
var d = decompress(c)
d
"100001110111010110001100001011111100111011010011110001001010000001111110101111111101011000011001011100011001101101010001111100010010101101110010101000110100101110011100101110011011000001000010011010000010101011100010010011101010011100010011110101100110000101000011110001111100010101011001001110110000010001110110111011000010010100011001110110101101111101011100100100010100000100111101001001001010110101101101110101110101010101111101100110101110100100110000010000000110000100011100010110010110110001011111101010001000110101000110010001011110010000100010110010111001001100011011000011111101100000001110101001011101101011101101110011000000010011001111100001110010101111101101000101110010111011100110111001001110100011000101111001110110101111101111010111110101010001100000011100001010101110110101111001001100111000011110111011111011001001011111110101001110001100011001"
d == a
false
end of edit
My code
The function I use to compress:
function compress(bin) {
bin = bin.toString(); // To make sure the binary is a string;
var returnValue = ''; // Empty string to add our data to later on.
for (var i = 0; i < parseInt(bin.length / 8); i++) {
// Determining the substring.
var substring = bin.substr(i*8, 8)
// Determining the hexValue of this binary substring.
var hexValue = parseInt(substring, 2).toString(16);
// Adding this hexValue to the end string which we will return.
returnValue += hexValue;
}
// Returning the to hex compressed string.
return returnValue;
}
The function I use to decompress:
function decompress(compressed) {
var returnValue = ''; // Empty string to add our data to later on.
for (var i = 0; i < parseInt(compressed.length / 2); i++) {
// Determining the substring.
var substring = compressed.substr(i*2, 2);
// Determining the binValue of this hex substring.
var binValue = parseInt(substring, 16).toString(2);
// If the length of the binary value is not equal to 8 we add leading 0s (js deletes the leading 0s)
// For instance the binary number 00011110 is equal to the hex number 1e,
// but simply running the code above will return 11110. So we have to add the leading 0s back.
if (binValue.length != 8) {
// Determining how many 0s to add:
var diffrence = 8 - binValue.length;
// Adding the 0s:
for (var j = 0; j < diffrence; j++) {
binValue = '0'+binValue;
}
}
// Adding the binValue to the end string which we will return.
returnValue += binValue
}
// Returning the decompressed string.
return returnValue;
}
Does anyone know what's going wrong? Or how to do this properly?

Problem is you are expecting your compress function to always add pairs of 2 hexa letters, but that is not always the case. For example '00000011' gives just a '3', but you actually want '03'. So you need to cover those cases in your compress function:
var hexValue = parseInt(substring, 2).toString(16);
if(hexValue.length == 1) hexValue = '0'+hexValue

Bitwise XOR in Javascript compared to C++

I am porting a simple C++ function to Javascript, but it seems I'm running into problems with the way Javascript handles bitwise operators.
In C++:
AnsiString MyClass::Obfuscate(AnsiString source)
{
int sourcelength=source.Length();
for(int i=1;i<=sourcelength;i++)
{
source[i] = source[i] ^ 0xFFF;
}
return source;
}
Obfuscate("test") yields temporary intvalues
-117, -102, -116, -117
Obfuscate ("test") yields stringvalue
‹šŒ‹
In Javascript:
function obfuscate(str)
{
var obfuscated= "";
for (i=0; i<str.length;i++) {
var a = str.charCodeAt(i);
var b = a ^ 0xFFF;
obfuscated= obfuscated+String.fromCharCode(b);
}
return obfuscated;
}
obfuscate("test") yields temporary intvalues
3979 , 3994 , 3980 , 3979
obfuscate("test") yields stringvalue
ྋྚྌྋ
Now, I realize that there are a ton of threads where they point out that Javascript treats all numbers as floats, and bitwise operations involve a temporary cast to 32bit int.
It really wouldn't be a problem except for that I'm obfuscating in Javascript and reversing in C++, and the different results don't really match.
How do i tranform the Javascript result into the C++ result? Is there some simple shift available?

Working demo
Judging from the result that xoring 116 with 0xFFF gives -117, we have to emulate
2's complement 8-bit integers in javascript:
function obfuscate(str)
{
var bytes = [];
for (var i=0; i<str.length;i++) {
bytes.push( ( ( ( str.charCodeAt(i) ^ 0xFFF ) & 0xFF ) ^ 0x80 ) -0x80 );
}
return bytes;
}
Ok these bytes are interpreted in windows cp 1252 and if they are negative, probably just subtracted from 256.
var ascii = [
0x0000,0x0001,0x0002,0x0003,0x0004,0x0005,0x0006,0x0007,0x0008,0x0009,0x000A,0x000B,0x000C,0x000D,0x000E,0x000F
,0x0010,0x0011,0x0012,0x0013,0x0014,0x0015,0x0016,0x0017,0x0018,0x0019,0x001A,0x001B,0x001C,0x001D,0x001E,0x001F
,0x0020,0x0021,0x0022,0x0023,0x0024,0x0025,0x0026,0x0027,0x0028,0x0029,0x002A,0x002B,0x002C,0x002D,0x002E,0x002F
,0x0030,0x0031,0x0032,0x0033,0x0034,0x0035,0x0036,0x0037,0x0038,0x0039,0x003A,0x003B,0x003C,0x003D,0x003E,0x003F
,0x0040,0x0041,0x0042,0x0043,0x0044,0x0045,0x0046,0x0047,0x0048,0x0049,0x004A,0x004B,0x004C,0x004D,0x004E,0x004F
,0x0050,0x0051,0x0052,0x0053,0x0054,0x0055,0x0056,0x0057,0x0058,0x0059,0x005A,0x005B,0x005C,0x005D,0x005E,0x005F
,0x0060,0x0061,0x0062,0x0063,0x0064,0x0065,0x0066,0x0067,0x0068,0x0069,0x006A,0x006B,0x006C,0x006D,0x006E,0x006F
,0x0070,0x0071,0x0072,0x0073,0x0074,0x0075,0x0076,0x0077,0x0078,0x0079,0x007A,0x007B,0x007C,0x007D,0x007E,0x007F
];
var cp1252 = ascii.concat([
0x20AC,0xFFFD,0x201A,0x0192,0x201E,0x2026,0x2020,0x2021,0x02C6,0x2030,0x0160,0x2039,0x0152,0xFFFD,0x017D,0xFFFD
,0xFFFD,0x2018,0x2019,0x201C,0x201D,0x2022,0x2013,0x2014,0x02DC,0x2122,0x0161,0x203A,0x0153,0xFFFD,0x017E,0x0178
,0x00A0,0x00A1,0x00A2,0x00A3,0x00A4,0x00A5,0x00A6,0x00A7,0x00A8,0x00A9,0x00AA,0x00AB,0x00AC,0x00AD,0x00AE,0x00AF
,0x00B0,0x00B1,0x00B2,0x00B3,0x00B4,0x00B5,0x00B6,0x00B7,0x00B8,0x00B9,0x00BA,0x00BB,0x00BC,0x00BD,0x00BE,0x00BF
,0x00C0,0x00C1,0x00C2,0x00C3,0x00C4,0x00C5,0x00C6,0x00C7,0x00C8,0x00C9,0x00CA,0x00CB,0x00CC,0x00CD,0x00CE,0x00CF
,0x00D0,0x00D1,0x00D2,0x00D3,0x00D4,0x00D5,0x00D6,0x00D7,0x00D8,0x00D9,0x00DA,0x00DB,0x00DC,0x00DD,0x00DE,0x00DF
,0x00E0,0x00E1,0x00E2,0x00E3,0x00E4,0x00E5,0x00E6,0x00E7,0x00E8,0x00E9,0x00EA,0x00EB,0x00EC,0x00ED,0x00EE,0x00EF
,0x00F0,0x00F1,0x00F2,0x00F3,0x00F4,0x00F5,0x00F6,0x00F7,0x00F8,0x00F9,0x00FA,0x00FB,0x00FC,0x00FD,0x00FE,0x00FF
]);
function toStringCp1252(bytes){
var byte, codePoint, codePoints = [];
for( var i = 0; i < bytes.length; ++i ) {
byte = bytes[i];
if( byte < 0 ) {
byte = 256 + byte;
}
codePoint = cp1252[byte];
codePoints.push( codePoint );
}
return String.fromCharCode.apply( String, codePoints );
}
Result
toStringCp1252(obfuscate("test"))
//"‹šŒ‹"

I'm guessing that AnsiString contains 8-bit characters (since the ANSI character set is 8 bits). When you assign the result of the XOR back to the string, it is truncated to 8 bits, and so the resulting value is in the range [-128...127].
(On some platforms, it could be [0..255], and on others the range could be wider, since it's not specified whether char is signed or unsigned, or whether it's 8 bits or larger).
Javascript strings contain unicode characters, which can hold a much wider range of values, the result is not truncated to 8 bits. The result of the XOR will have a range of at least 12 bits, [0...4095], hence the large numbers you see there.
Assuming the original string contains only 8-bit characters, then changing the operation to a ^ 0xff should give the same results in both languages.

I assume that AnsiString is in some form, an array of chars. And this is the problem. in c, char can typically only hold 8-bits. So when you XOR with 0xfff, and store the result in a char, it is the same as XORing with 0xff.
This is not the case with javascript. JavaScript using Unicode. This is demonstrated by looking at the integer values:
-117 == 0x8b and 3979 == 0xf8b
I would recommend XORing with 0xff as this will work in both languages. Or you can switch your c++ code to use Unicode.

First, convert your AnsiString to wchar_t*. Only then obfuscate its individual characters:
AnsiString MyClass::Obfuscate(AnsiString source)
{
/// allocate string
int num_wchars = source.WideCharBufSize();
wchar_t* UnicodeString = new wchar_t[num_wchars];
source.WideChar(UnicodeString, source.WideCharBufSize());
/// obfuscate individual characters
int sourcelength=source.Length();
for(int i = 0 ; i < num_wchars ; i++)
{
UnicodeString[i] = UnicodeString[i] ^ 0xFFF;
}
/// create obfuscated AnsiString
AnsiString result = AnsiString(UnicodeString);
/// delete tmp string
delete [] UnicodeString;
return result;
}
Sorry, I'm not an expert on C++ Builder, but my point is simple: in JavaScript you have WCS2 symbols (or UTF-16), so you have to convert AnsiString to wide chars first.
Try using WideString instead of AnsiString

I don't know AnsiString at all, but my guess is this relates to the width of its characters. Specifically, I suspect they're less than 32 bits wide, and of course in bitwise operations, the width of what you're operating on matters, particularly when dealing with 2's complement numbers.
In JavaScript, your "t" in "test" is character code 116, which is b00000000000000000000000001110100. 0xFFF (4095) is b00000000000000000000111111111111, and the result you're getting (3979) is b00000000000000000000111110001011. We can readily see that you're getting the right result for the XOR:
116 = 00000000000000000000000001110100
4095 = 00000000000000000000111111111111
3979 = 00000000000000000000111110001011
So I'm thinking you're getting some truncation or similar in your C++ code, not least because -117 is b10001011 in eight-bit 2's complement...which is exactly what we see as the last eight bits of 3979 above.

Is there "0b" or something similar to represent a binary number in Javascript

I know that 0x is a prefix for hexadecimal numbers in Javascript. For example, 0xFF stands for the number 255.
Is there something similar for binary numbers ? I would expect 0b1111 to represent the number 15, but this doesn't work for me.

Update:
Newer versions of JavaScript -- specifically ECMAScript 6 -- have added support for binary (prefix 0b), octal (prefix 0o) and hexadecimal (prefix: 0x) numeric literals:
var bin = 0b1111; // bin will be set to 15
var oct = 0o17; // oct will be set to 15
var oxx = 017; // oxx will be set to 15
var hex = 0xF; // hex will be set to 15
// note: bB oO xX are all valid
This feature is already available in Firefox and Chrome. It's not currently supported in IE, but apparently will be when Spartan arrives.
(Thanks to Semicolon's comment and urish's answer for pointing this out.)
Original Answer:
No, there isn't an equivalent for binary numbers. JavaScript only supports numeric literals in decimal (no prefix), hexadecimal (prefix 0x) and octal (prefix 0) formats.
One possible alternative is to pass a binary string to the parseInt method along with the radix:
var foo = parseInt('1111', 2); // foo will be set to 15

In ECMASCript 6 this will be supported as a part of the language, i.e. 0b1111 === 15 is true. You can also use an uppercase B (e.g. 0B1111).
Look for NumericLiterals in the ES6 Spec.

I know that people says that extending the prototypes is not a good idea, but been your script...
I do it this way:
Object.defineProperty(
Number.prototype, 'b', {
set:function(){
return false;
},
get:function(){
return parseInt(this, 2);
}
}
);
100..b // returns 4
11111111..b // returns 511
10..b+1 // returns 3
// and so on

If your primary concern is display rather than coding, there's a built-in conversion system you can use:
var num = 255;
document.writeln(num.toString(16)); // Outputs: "ff"
document.writeln(num.toString(8)); // Outputs: "377"
document.writeln(num.toString(2)); // Outputs: "11111111"
Ref: MDN on Number.prototype.toString

As far as I know it is not possible to use a binary denoter in Javascript. I have three solutions for you, all of which have their issues. I think alternative 3 is the most "good looking" for readability, and it is possibly much faster than the rest - except for it's initial run time cost. The problem is it only supports values up to 255.
Alternative 1: "00001111".b()
String.prototype.b = function() { return parseInt(this,2); }
Alternative 2: b("00001111")
function b(i) { if(typeof i=='string') return parseInt(i,2); throw "Expects string"; }
Alternative 3: b00001111
This version allows you to type either 8 digit binary b00000000, 4 digit b0000 and variable digits b0. That is b01 is illegal, you have to use b0001 or b1.
String.prototype.lpad = function(padString, length) {
var str = this;
while (str.length < length)
str = padString + str;
return str;
}
for(var i = 0; i < 256; i++)
window['b' + i.toString(2)] = window['b' + i.toString(2).lpad('0', 8)] = window['b' + i.toString(2).lpad('0', 4)] = i;

May be this will usefull:
var bin = 1111;
var dec = parseInt(bin, 2);
// 15

No, but you can use parseInt and optionally omit the quotes.
parseInt(110, 2); // this is 6
parseInt("110", 2); // this is also 6
The only disadvantage of omitting the quotes is that, for very large numbers, you will overflow faster:
parseInt(10000000000000000000000, 2); // this gives 1
parseInt("10000000000000000000000", 2); // this gives 4194304

I know this does not actually answer the asked Q (which was already answered several times) as is, however I suggest that you (or others interested in this subject) consider the fact that the most readable & backwards/future/cross browser-compatible way would be to just use the hex representation.
From the phrasing of the Q it would seem that you are only talking about using binary literals in your code and not processing of binary representations of numeric values (for which parstInt is the way to go).
I doubt that there are many programmers that need to handle binary numbers that are not familiar with the mapping of 0-F to 0000-1111.
so basically make groups of four and use hex notation.
so instead of writing 101000000010 you would use 0xA02 which has exactly the same meaning and is far more readable and less less likely to have errors.
Just consider readability, Try comparing which of those is bigger:
10001000000010010 or 1001000000010010
and what if I write them like this:
0x11012 or 0x9012

Convert binary strings to numbers and visa-versa.
var b = function(n) {
if(typeof n === 'string')
return parseInt(n, 2);
else if (typeof n === 'number')
return n.toString(2);
throw "unknown input";
};

Using Number() function works...
// using Number()
var bin = Number('0b1111'); // bin will be set to 15
var oct = Number('0o17'); // oct will be set to 15
var oxx = Number('0xF'); // hex will be set to 15
// making function convTo
const convTo = (prefix,n) => {
return Number(`${prefix}${n}`) //Here put prefix 0b, 0x and num
}
console.log(bin)
console.log(oct)
console.log(oxx)
// Using convTo function
console.log(convTo('0b',1111))

Develop Reference

JavaScript is the programming language of the Web.

Compressing a Hex String in JavaScript/NodeJS - javascript

Related

How to generate a fixed-length code from a set of integers of a specific bit count in JavaScript

compressing a string of 0's and 1's in js

Compresing / decompresing a binary string into/from hex in javascript not working

Bitwise XOR in Javascript compared to C++

Is there "0b" or something similar to represent a binary number in Javascript

Categories

Resources