From this answer I adapted the code below:
function _makeCRCTable() {
    const CRCTable = new Uint32Array(256);
    for (let i = 256; i--;) {
        let char = i;
        for (let j = 8; j--;) {
            char = char & 1 ? 3988292384 ^ char >>> 1 : char >>> 1;
        }
        CRCTable[i] = char;
    }
    return CRCTable;
}
This code generates a table as here, but for Ogg I need a different table - as here.
From Ogg documentation:
32 bit CRC value (direct algorithm, initial val and final XOR = 0,
generator polynomial=0x04c11db7)
parseInt('04c11db7', 16)
returns 79764919 - I tried this polynomial, but the resulting table is not correct.
I am new to CRCs; as far as I can tell, there are a few variations of the CRC32 algorithm.
I'm not sure of JavaScript precedence, but the XOR needs to occur after the shift:
char = char & 1 ? 3988292384 ^ (char >>> 1) : char >>> 1;
However, the first table you show seems correct, as table[128] = table[0x80] = 3988292384 = 0xEDB88320, which is 0x104c11db7 bit-reversed, then shifted right one bit.
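As a quick sanity check (a sketch of my own, using a small reflect32 helper written just for illustration), bit-reversing the 32-bit polynomial does give the constant used by the right-shifting table:

function reflect32(x) {
    // Reverse the 32 bits of x, least significant bit first.
    let r = 0;
    for (let i = 0; i < 32; i++) {
        r = ((r << 1) | (x & 1)) >>> 0;
        x >>>= 1;
    }
    return r;
}
console.log(reflect32(0x04c11db7).toString(16)); // "edb88320"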
The second table you have is for a left-shifting CRC, where table[1] = 0x04c11db7. In this case the inner loop would include something like this:
let char = i << 24;
for (let j = 8; j--;) {
    char = char & 0x80000000 ? 0x04c11db7 ^ char << 1 : char << 1;
}
Example C code for comparison; it generates the CRC for the patterns {0x01}, {0x01,0x00}, {0x01,0x00,0x00}, {0x01,0x00,0x00,0x00}.
#include <stdio.h>

typedef unsigned char uint8_t;
typedef unsigned int uint32_t;

uint32_t crctbl[256];

void gentbl(void)
{
    uint32_t crc;
    uint32_t b;
    uint32_t c;
    uint32_t i;
    for (c = 0; c < 0x100; c++) {
        crc = c << 24;
        for (i = 0; i < 8; i++) {
            b = crc >> 31;
            crc <<= 1;
            crc ^= (0 - b) & 0x04c11db7;
        }
        crctbl[c] = crc;
    }
}

uint32_t crc32(uint8_t * bfr, size_t size)
{
    uint32_t crc = 0;
    while (size--)
        crc = (crc << 8) ^ crctbl[(crc >> 24) ^ *bfr++];
    return (crc);
}

int main(int argc, char** argv)
{
    uint32_t crc;
    uint8_t bfr[4] = {0x01, 0x00, 0x00, 0x00};
    gentbl();
    crc = crc32(bfr, 1);    /* 0x04c11db7 */
    printf("%08x\n", crc);
    crc = crc32(bfr, 2);    /* 0xd219c1dc */
    printf("%08x\n", crc);
    crc = crc32(bfr, 3);    /* 0x01d8ac87 */
    printf("%08x\n", crc);
    crc = crc32(bfr, 4);    /* 0xdc6d9ab7 */
    printf("%08x\n", crc);
    return (0);
}
For JS:
function _makeCRC32Table() {
    const polynomial = 79764919;
    const mask = 2147483648;
    const CRCTable = new Uint32Array(256);
    for (let i = 256; i--;) {
        let char = i << 24;
        for (let j = 8; j--;) {
            char = char & mask ? polynomial ^ char << 1 : char << 1;
        }
        CRCTable[i] = char;
    }
    return CRCTable;
}
How to use this table:
[1, 0].reduce((crc, byte) => crc << 8 >>> 0 ^ CRCTable[crc >>> 24 ^ byte], 0) >>> 0
Here we added >>> 0, which takes the value modulo 2^32 (an "unsigned cast") - there is no unsigned int in JS; in fact JavaScript doesn't have integers at all, only double-precision floating-point numbers, and the bitwise operators work on signed 32-bit values.
Note that for Ogg you must store the generated CRC bytes in reverse (little-endian) order.
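Putting it together, here is a minimal sketch (my own, not the Ogg reference code; the helper name oggPageChecksum is made up) of checksumming a whole Ogg page held in a Uint8Array. It assumes the standard page layout where the 4-byte CRC field sits at offsets 22-25 and must count as zero while hashing, and it writes the result back least significant byte first:

const CRCTable = _makeCRC32Table();

function oggPageChecksum(page) {
    // `page` is a Uint8Array containing one complete Ogg page.
    let crc = 0;
    for (let i = 0; i < page.length; i++) {
        const byte = (i >= 22 && i < 26) ? 0 : page[i]; // CRC field hashes as zero
        crc = ((crc << 8) >>> 0 ^ CRCTable[crc >>> 24 ^ byte]) >>> 0;
    }
    // Store the checksum in reverse (little-endian) byte order.
    page[22] = crc & 0xff;
    page[23] = (crc >>> 8) & 0xff;
    page[24] = (crc >>> 16) & 0xff;
    page[25] = (crc >>> 24) & 0xff;
    return crc;
}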
I'm looking to decode an encoded Google polyline:
`~oia#
However, reversing one of the steps requires undoing the bitwise OR operation, which is destructive.
I see it's done here: How to decode Google's Polyline Algorithm? - but I can't see how to do that in JavaScript.
Here is what I have so far:
const partialDecodedPolyline = "`~oia#".split('').map(char => (char.codePointAt()-63).toString(2))
console.log(partialDecodedPolyline)
The next step is to reverse the bitwise OR... how is that possible?
There is a library for that: https://github.com/mapbox/polyline/blob/master/src/polyline.js
/*
https://github.com/mapbox/polyline/blob/master/src/polyline.js
*/
const decode = function(str, precision) {
    var index = 0,
        lat = 0,
        lng = 0,
        coordinates = [],
        shift = 0,
        result = 0,
        byte = null,
        latitude_change,
        longitude_change,
        factor = Math.pow(10, Number.isInteger(precision) ? precision : 5);

    // Coordinates have variable length when encoded, so just keep
    // track of whether we've hit the end of the string. In each
    // loop iteration, a single coordinate is decoded.
    while (index < str.length) {
        // Reset shift, result, and byte
        byte = null;
        shift = 0;
        result = 0;

        do {
            byte = str.charCodeAt(index++) - 63;
            result |= (byte & 0x1f) << shift;
            shift += 5;
        } while (byte >= 0x20);

        latitude_change = ((result & 1) ? ~(result >> 1) : (result >> 1));

        shift = result = 0;

        do {
            byte = str.charCodeAt(index++) - 63;
            result |= (byte & 0x1f) << shift;
            shift += 5;
        } while (byte >= 0x20);

        longitude_change = ((result & 1) ? ~(result >> 1) : (result >> 1));

        lat += latitude_change;
        lng += longitude_change;

        coordinates.push([lat / factor, lng / factor]);
    }

    return coordinates;
};
console.log(decode("`~oia#"));
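As a side note on the actual question (this is my own sketch distilled from the library code above, not part of mapbox/polyline): the OR never has to be reversed, because each character contributes 5 disjoint bits. Masking with 0x1f and shifting those chunks back into place rebuilds the number, and the lowest bit then tells you whether to invert:

// Decode a single signed value starting at `index`; returns the value and
// the index of the next undecoded character.
function decodeValue(str, index) {
    let result = 0, shift = 0, byte;
    do {
        byte = str.charCodeAt(index++) - 63;  // undo the +63 offset
        result |= (byte & 0x1f) << shift;     // place this 5-bit chunk
        shift += 5;
    } while (byte >= 0x20);                   // 0x20 bit set means more chunks follow
    const value = (result & 1) ? ~(result >> 1) : (result >> 1);
    return { value, index };
}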
I am writing a JavaScript version of this Microsoft string decoding algorithm and it's failing on large numbers. This seems to be because of sizing (int / long) issues. If I step through the code in C#, I see that the JS implementation fails on this line:
n |= (b & 31) << k;
This happens when the values are as follows (the C# result is 240518168576):
(39 & 31) << 35
If I play around with these values in C# I can replicate the JS issue when b is an int, and if I set b to be a long it works correctly.
So then I checked the max size of a JS number, and compared it to the C# long result
240518168576 < Number.MAX_SAFE_INTEGER = true
So, I can see that there is some kind of number size issue happening, but I do not know how to force JS to treat this number as a long.
Full JS code:
private getPointsFromEncodedString(encodedLine: string): number[][] {
    const EncodingString = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_-";
    var points: number[][] = [];
    if (!encodedLine) {
        return points;
    }
    var index = 0;
    var xsum = 0;
    var ysum = 0;
    while (index < encodedLine.length) {
        var n = 0;
        var k = 0;
        debugger;
        while (true) {
            if (index >= encodedLine.length) {
                return points;
            }
            var b = EncodingString.indexOf(encodedLine[index++]);
            if (b == -1) {
                return points;
            }
            n |= (b & 31) << k;
            k += 5;
            if (b < 32) {
                break;
            }
        }
        var diagonal = ((Math.sqrt(8 * n + 5) - 1) / 2);
        n -= diagonal * (diagonal + 1) / 2;
        var ny = n;
        var nx = diagonal - ny;
        nx = (nx >> 1) ^ -(nx & 1);
        ny = (ny >> 1) ^ -(ny & 1);
        xsum += nx;
        ysum += ny;
        points.push([ysum * 0.000001, xsum * 0.000001]);
    }
    console.log(points);
    return points;
}
Expected input and output:
Encoded string
qkoo7v4q-lmB0471BiuuNmo30B
Decoded points:
35.89431, -110.72522
35.89393, -110.72578
35.89374, -110.72606
35.89337, -110.72662
Bitwise operators treat their operands as a sequence of 32 bits
(zeroes and ones), rather than as decimal, hexadecimal, or octal
numbers. For example, the decimal number nine has a binary
representation of 1001. Bitwise operators perform their operations on
such binary representations, but they return standard JavaScript
numerical values.
(39 & 31) << 35 tries to shift by 35 bits when there are only 32 (JavaScript masks the shift count to its low five bits, so it actually shifts by 3).
Bitwise Operators
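To see what the 32-bit operator actually does with that count (a small sketch of my own, not from the linked reference), note that the shift amount is taken modulo 32, so << 35 behaves like << 3:

console.log((39 & 31) << 35); // 56 - the count 35 is masked to 35 & 31 = 3
console.log((39 & 31) << 3);  // 56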
To solve this problem you could use BigInt to perform those operations and then downcast the result back to Number:
Number((39n & 31n) << 35n)
You can try this:
function getPointsFromEncodedString(encodedLine) {
    const EncodingString = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_-";
    var points = [];
    if (!encodedLine) {
        return points;
    }
    var index = 0;
    var xsum = 0;
    var ysum = 0;
    while (index < encodedLine.length) {
        var n = 0n;
        var k = 0n;
        while (true) {
            if (index >= encodedLine.length) {
                return points;
            }
            var b = EncodingString.indexOf(encodedLine[index++]);
            if (b === -1) {
                return points;
            }
            // b is a plain Number, so convert it before mixing with BigInt
            n |= (BigInt(b) & 31n) << k;
            k += 5n;
            if (b < 32n) {
                break;
            }
        }
        // The square root is done in Number space, then truncated back to BigInt
        var diagonal = BigInt(Math.floor((Math.sqrt(8 * Number(n) + 5) - 1) / 2));
        n -= diagonal * (diagonal + 1n) / 2n;
        var ny = n;
        var nx = diagonal - ny;
        nx = (nx >> 1n) ^ -(nx & 1n);
        ny = (ny >> 1n) ^ -(ny & 1n);
        xsum += Number(nx);
        ysum += Number(ny);
        points.push([ysum * 0.000001, xsum * 0.000001]);
    }
    console.log(points);
    return points;
}
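A quick way to sanity-check the rewrite is to run it on the question's test string; the expected coordinates below are the ones listed in the question, not values computed here:

console.log(getPointsFromEncodedString("qkoo7v4q-lmB0471BiuuNmo30B"));
// Expected (from the question):
// 35.89431, -110.72522
// 35.89393, -110.72578
// 35.89374, -110.72606
// 35.89337, -110.72662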
I am trying to rewrite the CCNET driver for CashCode from Node to Python.
But I really can't get the CRC generator working.
You can find the "working" code in the GitHub repo.
Here is the JS function:
function getCRC16(bufData) {
    var POLYNOMIAL = 0x08408;
    var sizeData = bufData.length;
    var CRC, i, j;
    CRC = 0;
    for (i = 0; i < sizeData; i++) {
        CRC ^= bufData[i];
        for (j = 0; j < 8; j++) {
            if (CRC & 0x0001) {
                CRC >>= 1;
                CRC ^= POLYNOMIAL;
            } else CRC >>= 1;
        }
    }
    var buf = new Buffer(2);
    buf.writeUInt16BE(CRC, 0);
    CRC = buf;
    return Array.prototype.reverse.call(CRC);
}
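For reference, a usage sketch of my own (the frame bytes below are placeholders, not a real CCNET command): the Node driver feeds this function the raw frame bytes and appends the returned two-byte, byte-swapped CRC to the packet:

const frame = Buffer.from([0x02, 0x03, 0x06, 0x33]); // placeholder frame bytes
const crc = getCRC16(frame);                          // two-byte Buffer, low byte first
console.log(crc);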
I tried crcmod, but it's not a predefined function, and when I try to set the polynomial I get an error.
Here is my sometimes-working code:
@staticmethod
def getCRC16(data):
    CRC = 0
    for i in range(0, len(data), 2):
        CRC ^= int(str(data[i:(i+2)]), 16)
        for j in range(8):
            if (CRC & 0x0001):
                CRC >>= 1
                CRC ^= 0x8408
            else:
                CRC >>= 1
    CRC = format(CRC, '02x')
    return CRC[2:4] + CRC[0:2]
And I get:
CRC ^= int(str(data[i:(i+2)]), 16)
ValueError: invalid literal for int() with base 16: '\x02\x03'
Help me with that function (the input is binary/integers or hex strings).
UPD: It works with bytearray.fromhex(data). Thanks!
@staticmethod
def getCRC16(data):
    data = bytearray.fromhex(data)
    CRC = 0
    for bit in data:
        CRC ^= bit
        for j in range(0, 8):
            if (CRC & 0x0001):
                CRC >>= 1
                CRC ^= 0x8408
            else:
                CRC >>= 1
    CRC = format(CRC, '02x')
    return CRC[2:4] + CRC[0:2]
You need:
import struct
my_int = struct.unpack("h", data[i:i+2])[0]  # unpack returns a 1-tuple; my_int = 770 ?
You will also need to truncate the result by ANDing it with 0xFFFF (I think), since Python ints will just keep growing forever.
Since Python 2.6: bytearray.fromhex(data).
E.g.:
for byte in bytearray.fromhex(data):
    CRC ^= byte
    ...
I'm trying to convert this function from the Mozilla Firefox code base; it's called HashString. It calls a bunch of functions, which are all in this file: https://dxr.mozilla.org/mozilla-central/source/mfbt/HashFunctions.h#294
So these are the C++ functions it calls:
static const uint32_t kGoldenRatioU32 = 0x9E3779B9U;

MOZ_WARN_UNUSED_RESULT inline uint32_t
HashString(const wchar_t* aStr)
{
  return detail::HashUntilZero(aStr);
}

template<typename T>
uint32_t
HashUntilZero(const T* aStr)
{
  uint32_t hash = 0;
  for (T c; (c = *aStr); aStr++) {
    hash = AddToHash(hash, c);
  }
  return hash;
}

MOZ_WARN_UNUSED_RESULT inline uint32_t
AddToHash(uint32_t aHash, A* aA)
{
  /*
   * You might think this function should just take a void*. But then we'd only
   * catch data pointers and couldn't handle function pointers.
   */
  static_assert(sizeof(aA) == sizeof(uintptr_t), "Strange pointer!");
  return detail::AddUintptrToHash<sizeof(uintptr_t)>(aHash, uintptr_t(aA));
}

inline uint32_t
AddUintptrToHash<8>(uint32_t aHash, uintptr_t aValue)
{
  /*
   * The static cast to uint64_t below is necessary because this function
   * sometimes gets compiled on 32-bit platforms (yes, even though it's a
   * template and we never call this particular override in a 32-bit build). If
   * we do aValue >> 32 on a 32-bit machine, we're shifting a 32-bit uintptr_t
   * right 32 bits, and the compiler throws an error.
   */
  uint32_t v1 = static_cast<uint32_t>(aValue);
  uint32_t v2 = static_cast<uint32_t>(static_cast<uint64_t>(aValue) >> 32);
  return AddU32ToHash(AddU32ToHash(aHash, v1), v2);
}

inline uint32_t
AddU32ToHash(uint32_t aHash, uint32_t aValue)
{
  return kGoldenRatioU32 * (RotateBitsLeft32(aHash, 5) ^ aValue);
}

inline uint32_t
RotateBitsLeft32(uint32_t aValue, uint8_t aBits)
{
  MOZ_ASSERT(aBits < 32);
  return (aValue << aBits) | (aValue >> (32 - aBits));
}
And here is my JS code:
function HashString(aStr, aLength) {
    // moz win32 hash function
    if (aLength) {
        console.error('NS_ERROR_NOT_IMPLEMENTED');
        throw Components.results.NS_ERROR_NOT_IMPLEMENTED;
    } else {
        return HashUntilZero(aStr);
    }
}

function HashUntilZero(aStr) {
    var hash = 0;
    //for (T c; (c = *aStr); aStr++) {
    for (var c = 0; c < aStr.length; c++) {
        hash = AddToHash(hash, aStr.charCodeAt(c));
    }
    return hash;
}

function AddToHash(aHash, aA) {
    //return detail::AddU32ToHash(aHash, aA);
    //return AddU32ToHash(aHash, aA);
    //return detail::AddUintptrToHash<sizeof(uintptr_t)>(aHash, aA);
    return AddUintptrToHash(aHash, aA);
}

function AddUintptrToHash(aHash, aValue) {
    //return AddU32ToHash(aHash, static_cast<uint32_t>(aValue));
    return AddU32ToHash(aHash, aValue);
}

function AddU32ToHash(aHash, aValue) {
    var kGoldenRatioU32 = 0x9E3779B9;
    return (kGoldenRatioU32 * (RotateBitsLeft32(aHash, 5) ^ aValue));
}

function RotateBitsLeft32(aValue, aBits) {
    // MOZ_ASSERT(aBits < 32);
    return (aValue << aBits) | (aValue >> (32 - aBits));
}

console.log(HashString('C:\Users\Vayeate\AppData\Roaming\Mozilla\Firefox\Profiles\aksozfjt.Unnamed Profile 10')); // should return 3181739213
This isn't working right: HashString('C:\Users\Vayeate\AppData\Roaming\Mozilla\Firefox\Profiles\aksozfjt.Unnamed Profile 10') should return 3181739213, however it doesn't. It keeps returning -159266146140.
Let's implement a more minimal C++ version first, which also dumps intermediate values that we can compare against later.
#include <iostream>
#include <iomanip>
#include <string>
#include <stdint.h>

using namespace std;

static const uint32_t gr = 0x9E3779B9U;

template<typename T>
static uint32_t add(uint32_t hash, T val) {
  const uint32_t rv = gr * (((hash << 5) | (hash >> 27)) ^ val);
  cerr << dec << setw(7) << (uint32_t)val << " " << setw(14) << rv << " " << hex << rv << endl;
  return rv;
}

int main() {
  const auto text = string("C:\\Users\\Vayeate\\AppData\\Roaming\\Mozilla\\Firefox\\Profiles\\aksozfjt.Unnamed Profile 10");
  uint32_t rv = 0;
  for (auto c: text) {
    rv = add(rv, c);
  }
  cout << "Result: " << dec << setw(14) << rv << " " << hex << rv << endl;
}
Result: 3181739213 bda57ccd, so we're on the right track.
Now, for some JavaScript:
GetNativePath returns an nsAutoCString, a.k.a. an 8-bit string, by converting the internal 16-bit string to UTF-8.
JavaScript does not actually know about 32-bit unsigned integers, just 32-bit signed integers, but there are some dirty tricks (mainly the >>> 0 "unsigned cast").
32-bit unsigned multiplication does not work, but we can actually implement that operation ourselves.
Properly escaping the backslashes \ in your test string also helps ;)
Putting these things together, I arrived at the following function, which seems to produce correct results.
/**
 * Javascript implementation of
 * https://hg.mozilla.org/mozilla-central/file/0cefb584fd1a/mfbt/HashFunctions.h
 * aka. the mfbt hash function.
 */
let HashString = (function() {
    // Note: >>>0 is basically a cast-to-unsigned for our purposes.
    const encoder = new TextEncoder("utf-8");
    const kGoldenRatio = 0x9E3779B9;

    // Multiply two uint32_t like C++ would ;)
    const mul32 = (a, b) => {
        // Split into 16-bit integers (hi and lo words)
        let ahi = (a >> 16) & 0xffff;
        let alo = a & 0xffff;
        let bhi = (b >> 16) & 0xffff;
        let blo = b & 0xffff;
        // Compute new hi and lo separately and recombine.
        return (
            (((((ahi * blo) + (alo * bhi)) & 0xffff) << 16) >>> 0) +
            (alo * blo)
        ) >>> 0;
    };

    // kGoldenRatioU32 * (RotateBitsLeft32(aHash, 5) ^ aValue);
    const add = (hash, val) => {
        // Note, cannot >> 27 here, but / (1<<27) works as well.
        let rotl5 = (
            ((hash << 5) >>> 0) |
            (hash / (1 << 27)) >>> 0
        ) >>> 0;
        return mul32(kGoldenRatio, (rotl5 ^ val) >>> 0);
    };

    return function(text) {
        // Convert to utf-8.
        // Also decomposes the string into uint8_t values already.
        let data = encoder.encode(text);
        // Compute the actual hash
        let rv = 0;
        for (let c of data) {
            rv = add(rv, c | 0);
        }
        return rv;
    };
})();
let res = HashString('C:\\Users\\Vayeate\\AppData\\Roaming\\Mozilla\\Firefox\\Profiles\\aksozfjt.Unnamed Profile 10');
console.log(res, res === 3181739213);
Might not be the most efficient implementation, but well, it works at least ;)
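As a design note (not part of the original answer): modern engines also expose Math.imul, which performs exactly this kind of C-style 32-bit multiplication, so mul32 could be replaced where it is available:

const mul32 = (a, b) => Math.imul(a, b) >>> 0; // low 32 bits of a*b, viewed as unsigned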
There is a simpler way
var file = new FileUtils.File('C:\\Users\\Vayeate\\AppData\\Roaming\\Mozilla\\Firefox\\Profiles\\aksozfjt.Unnamed Profile 10');
file.QueryInterface(Ci.nsIHashable);
console.log(file.hashCode === 3181739213);
Before I start, a disclaimer: while I can get around in C/C++ code, I am no wizard, nor have I ever done enough programming in it to call myself a capable programmer.
I'm trying to use CRC32C to validate data that is coming into our servers from the browser. Currently both implementations use the same code (Node.js on the server), but we would like to switch to a hardware implementation (blog post, GitHub repo) when available, and for that I need a correctly functioning version in the browser.
I tried to go with this implementation (and another, internally developed but also non-working one) using the correct polynomial (0x82F63B78 instead of 0xEDB88320, and also 0x1EDC6F41 & 0x8F6E37A0), but no polynomial that I used produces the correct output.
Continuing my research, I found a post from Mark Adler which includes a software implementation, and decided to try to convert it to JavaScript (to the best of my understanding of C).
The result:
function crc32c_table_intel() {
    var POLY = 0x82f63b78;
    var n, crc, k;
    var crc32c_table = gen2darr(8, 256, 0);
    for (n = 0; n < 256; n++) {
        crc = n;
        crc = crc & 1 ? (crc >> 1) ^ POLY : crc >> 1;
        crc = crc & 1 ? (crc >> 1) ^ POLY : crc >> 1;
        crc = crc & 1 ? (crc >> 1) ^ POLY : crc >> 1;
        crc = crc & 1 ? (crc >> 1) ^ POLY : crc >> 1;
        crc = crc & 1 ? (crc >> 1) ^ POLY : crc >> 1;
        crc = crc & 1 ? (crc >> 1) ^ POLY : crc >> 1;
        crc = crc & 1 ? (crc >> 1) ^ POLY : crc >> 1;
        crc = crc & 1 ? (crc >> 1) ^ POLY : crc >> 1;
        crc32c_table[0][n] = crc;
    }
    for (n = 0; n < 256; n++) {
        crc = crc32c_table[0][n];
        for (k = 1; k < 8; k++) {
            crc = crc32c_table[0][crc & 0xff] ^ (crc >> 8);
            crc32c_table[k][n] = crc;
            _crc_tmptable.push(crc32c_table[k][n]);
        }
    }
    return crc32c_table;
}
function crc32c_sw(crci, str) {
    var len = str.length;
    var crc;
    var crc32c_table = crc32c_table_intel();
    crc = crci ^ 0xffffffff;
    for (var next = 0; next < 7; next++) { // was: while (len && ((uintptr_t)next & 7) != 0) {
        crc = crc32c_table[0][(crc ^ str.charCodeAt(next++)) & 0xff] ^ (crc >> 8);
        len--;
    }
    while (len >= 8) {
        // was: crc ^= *(uint64_t *)next;
        crc ^= str.charCodeAt(next);
        crc = crc32c_table[7][crc & 0xff] ^
              crc32c_table[6][(crc >> 8) & 0xff] ^
              crc32c_table[5][(crc >> 16) & 0xff] ^
              crc32c_table[4][(crc >> 24) & 0xff] ^
              crc32c_table[3][(crc >> 32) & 0xff] ^
              crc32c_table[2][(crc >> 40) & 0xff] ^
              crc32c_table[1][(crc >> 48) & 0xff] ^
              crc32c_table[0][crc >> 56];
        next += 1;
        len -= 1;
    }
    while (len) {
        // was: crc = crc32c_table[0][(crc ^ *next++) & 0xff] ^ (crc >> 8);
        crc = crc32c_table[0][(crc ^ str.charCodeAt(next++)) & 0xff] ^ (crc >> 8);
        len--;
    }
    return crc ^ 0xffffffff;
}
// a helper function
function gen2darr(rows, cols, defaultValue) {
    var arr = [];
    for (var i = 0; i < rows; i++) {
        arr.push([]);
        arr[i].push(new Array(cols));
        for (var j = 0; j < cols; j++) {
            arr[i][j] = defaultValue;
        }
    }
    return arr;
}
Still, no luck. No matter what function, what table or what polynomial I use, the results do not match:
SSE4.2: 606105071
JS (example): 1249991249
Then I started thinking that it must be something to do with the conversion from JavaScript strings to C/C++ data, and I see that the Node.js implementation uses UTF-8 (https://github.com/Voxer/sse4_crc32/blob/master/src/sse4_crc32.cpp#L56) while JavaScript uses UCS-2 encoding.
Now, the questions that I have are these:
Are any of these functions valid? The first seems to be; for the one I posted I am not sure whether I translated all the bitwise operations correctly.
How do I get around the encoding issues? Is it even an encoding issue, as I suspect? Does anyone have any other ideas on how to ensure that the Node.js hardware implementation and the client-side implementation return the same output?
Thanks for any ideas!
See this answer for a compatible software implementation and a fast implementation using the hardware instruction.
You have a few problems there. One is that in JavaScript you need to use a logical right shift >>> instead of an arithmetic right shift >>. Second is that you are using charCodeAt, which returns the Unicode value of a character, which may be more than one byte. The CRC algorithm operates on a sequence of bytes, not a sequence of Unicode characters. Third is that you're computing the same table every time -- the table should only be computed once. Last is that you're jumping straight to a complicated implementation.
As a simple example, this will compute a CRC-32C in JavaScript on an array of values expected to be integers in the range 0..255, i.e. bytes:
function crc32c(crc, bytes) {
    var POLY = 0x82f63b78;
    var n;
    crc ^= 0xffffffff;
    for (n = 0; n < bytes.length; n++) {
        crc ^= bytes[n];
        crc = crc & 1 ? (crc >>> 1) ^ POLY : crc >>> 1;
        crc = crc & 1 ? (crc >>> 1) ^ POLY : crc >>> 1;
        crc = crc & 1 ? (crc >>> 1) ^ POLY : crc >>> 1;
        crc = crc & 1 ? (crc >>> 1) ^ POLY : crc >>> 1;
        crc = crc & 1 ? (crc >>> 1) ^ POLY : crc >>> 1;
        crc = crc & 1 ? (crc >>> 1) ^ POLY : crc >>> 1;
        crc = crc & 1 ? (crc >>> 1) ^ POLY : crc >>> 1;
        crc = crc & 1 ? (crc >>> 1) ^ POLY : crc >>> 1;
    }
    return crc ^ 0xffffffff;
}
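To deal with the encoding question (a sketch of my own, not from the original answer): hash bytes, not JS characters. Encoding the string to UTF-8 with TextEncoder on the browser side makes it see the same byte sequence as the Node.js/SSE4.2 side:

const bytes = new TextEncoder().encode("some payload"); // UTF-8 bytes, each 0..255
const crc = crc32c(0, bytes) >>> 0;                      // >>> 0 to print it as unsigned
console.log(crc.toString(16));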