In JavaScript I am trying to convert Unicode text into byte-based hex escape sequences that are compatible with C:
i.e. 😄
becomes: \xF0\x9F\x98\x84 (correct)
NOT javascript surrogates, not \uD83D\uDE04 (wrong)
I cannot figure out the math relationship between the four bytes C wants vs the two surrogates javascript uses. I suspect the algorithm is far more complex than my feeble attempts.
Thanks for any tips.
encodeURIComponent does this work:
var input = "\uD83D\uDE04";
var result = encodeURIComponent(input).replace(/%/g, "\\x"); // \xF0\x9F\x98\x84
Update: C strings can contain digits and letters without escaping (though a hex digit directly after a \x escape is read as part of that escape), but if you really need to escape everything:
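function escape(s, escapeEverything) {
if (escapeEverything) {
// Pre-escape ASCII 0x10-0x7F as "-xHH"; "-", "x" and hex digits pass through encodeURIComponent untouched.
s = s.replace(/[\x10-\x7f]/g, function (s) {
return "-x" + s.charCodeAt(0).toString(16).toUpperCase();
});
}
// Percent-encode the remaining characters as UTF-8 bytes, then turn each "%HH" into "\xHH".
s = encodeURIComponent(s).replace(/%/g, "\\x");
if (escapeEverything) {
// Turn the "-xHH" placeholders into "\xHH".
s = s.replace(/\-/g, "\\");
}
return s;
}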
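One caveat when pasting the output into C source: a C hex escape consumes every hexadecimal digit that follows it, so \x84abc is read as a single (out-of-range) escape rather than as byte 0x84 followed by abc. A minimal sketch of one workaround, assuming the result ends up inside a double-quoted C string literal, is to split the literal with "" at those points, since adjacent string literals are concatenated by the compiler. The name toCEscapes is hypothetical and not part of the answer above:

```javascript
// Hypothetical helper: UTF-8 hex escapes that stay unambiguous inside a C string literal.
function toCEscapes(s) {
  // encodeURIComponent percent-encodes the UTF-8 bytes of everything except
  // A-Z a-z 0-9 - _ . ! ~ * ' ( ), and throws a URIError on a lone surrogate.
  var escaped = encodeURIComponent(s).replace(/%/g, "\\x");
  // Break the literal wherever an escape is directly followed by a hex digit.
  return escaped.replace(/(\\x[0-9A-F]{2})(?=[0-9A-Fa-f])/g, '$1""');
}

console.log(toCEscapes("\uD83D\uDE04abc")); // \xF0\x9F\x98\x84""abc
```

In C, "\xF0\x9F\x98\x84""abc" then compiles to the four UTF-8 bytes followed by abc.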
Found a solution here: http://jonisalonen.com/2012/from-utf-16-to-utf-8-in-javascript/
I would have never figured out THAT math, wow.
Somewhat minified:
function UTF8seq(s) {
var i,c,u=[];
for (i=0; i < s.length; i++) {
c = s.charCodeAt(i);
if (c < 0x80) { u.push(c); }
else if (c < 0x800) { u.push(0xc0 | (c >> 6), 0x80 | (c & 0x3f)); }
else if (c < 0xd800 || c >= 0xe000) { u.push(0xe0 | (c >> 12), 0x80 | ((c>>6) & 0x3f), 0x80 | (c & 0x3f)); }
else { i++; c = 0x10000 + (((c & 0x3ff)<<10) | (s.charCodeAt(i) & 0x3ff));
u.push(0xf0 | (c >>18), 0x80 | ((c>>12) & 0x3f), 0x80 | ((c>>6) & 0x3f), 0x80 | (c & 0x3f)); }
}
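for (i=0; i < u.length; i++) { u[i]=(u[i] < 0x10 ? '0' : '')+u[i].toString(16).toUpperCase(); } // two uppercase hex digits per byte
return u.length ? '\\x'+u.join('\\x') : ''; // an empty input yields an empty string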
}
Your C code expects a UTF-8 string (the symbol is represented as 4 bytes). The JS representation you see is UTF-16, however (the symbol is represented as 2 uint16s, a surrogate pair).
You will first need to get the (Unicode) code point for your symbol (from the UTF-16 JS string), then build the UTF-8 representation for it from that.
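The relationship between the two representations can be sketched as follows. This is only an illustration of the arithmetic, with hypothetical function names, and it assumes well-formed input (no lone surrogates):

```javascript
// Combine a UTF-16 surrogate pair into a Unicode code point.
// The high surrogate carries the top 10 bits, the low surrogate the bottom 10,
// and the result is offset by 0x10000.
function surrogatesToCodePoint(hi, lo) {
  return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00);
}

// Encode one code point as an array of UTF-8 byte values (1 to 4 bytes).
function codePointToUtf8(cp) {
  if (cp < 0x80) return [cp];
  if (cp < 0x800) return [0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)];
  if (cp < 0x10000) {
    return [0xE0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3F), 0x80 | (cp & 0x3F)];
  }
  return [
    0xF0 | (cp >> 18),
    0x80 | ((cp >> 12) & 0x3F),
    0x80 | ((cp >> 6) & 0x3F),
    0x80 | (cp & 0x3F)
  ];
}

var codePoint = surrogatesToCodePoint(0xD83D, 0xDE04);
console.log(codePoint.toString(16)); // 1f604
console.log(codePointToUtf8(codePoint).join(" ")); // 240 159 152 132, i.e. F0 9F 98 84
```

For the example symbol: 0xD83D - 0xD800 = 0x3D and 0xDE04 - 0xDC00 = 0x204, so the code point is 0x10000 + (0x3D << 10) + 0x204 = 0x1F604.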
Since ES6 you can use the codePointAt method for the first part, which I would recommend using (with a shim where it is not supported natively). I guess you don't want to decode surrogate pairs yourself :-)
For the rest, I don't think there's a library method, but you can write it yourself according to the spec:
function hex(x) {
x = x.toString(16);
return (x.length > 2 ? "\\u0000" : "\\x00").slice(0,-x.length)+x.toUpperCase();
}
var c = "😄";
console.log(c.length, hex(c.charCodeAt(0))+hex(c.charCodeAt(1))); // 2, "\uD83D\uDE04"
var cp = c.codePointAt(0);
var bytes = new Uint8Array(4); // 4-byte form: only valid for code points >= 0x10000
bytes[3] = 0x80 | cp & 0x3F;
bytes[2] = 0x80 | (cp >>>= 6) & 0x3F;
bytes[1] = 0x80 | (cp >>>= 6) & 0x3F;
bytes[0] = 0xF0 | (cp >>>= 6) & 0x3F;
console.log(Array.prototype.map.call(bytes, hex).join("")) // "\xF0\x9F\x98\x84"
(tested in Chrome)
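Where the TextEncoder API is available (current browsers and Node.js), the UTF-8 encoding step can be delegated to it instead of being written by hand. A minimal sketch with a hypothetical function name; note that TextEncoder replaces lone surrogates with U+FFFD instead of throwing:

```javascript
// Hypothetical helper: every UTF-8 byte of s as an uppercase \xHH escape.
function toUtf8HexEscapes(s) {
  var bytes = new TextEncoder().encode(s); // Uint8Array holding the UTF-8 bytes
  return Array.from(bytes, function (b) {
    return "\\x" + b.toString(16).toUpperCase().padStart(2, "0");
  }).join("");
}

console.log(toUtf8HexEscapes("\uD83D\uDE04")); // \xF0\x9F\x98\x84
```

Because every byte is escaped, including plain ASCII, no unescaped hex digit can follow an escape in the resulting C literal.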
I need to create a SHA-256 digest from a file (~6MB) inside the browser. The only way that I've managed to do it so far was like this:
var reader = new FileReader();
reader.onload = function() {
// this gets rid of the mime-type data header
var actual_contents = reader.result.slice(reader.result.indexOf(',') + 1);
var what_i_need = new jsSHA(actual_contents, "B64").getHash("SHA-256", "HEX");
}
reader.readAsDataURL(some_file);
While this works correctly, the problem is that it's very slow. It took ~2-3 seconds for a 6MB file. How can I improve this?
You may want to take a look at the Stanford JS Crypto Library (SJCL); see its GitHub repository and its website with examples.
From the website:
SJCL is secure. It uses the industry-standard AES algorithm at 128, 192 or 256 bits; the SHA256 hash function; the HMAC authentication code; the PBKDF2 password strengthener; and the CCM and OCB authenticated-encryption modes.
SJCL has a test page that shows how long each test takes. It reports 184 milliseconds for the iterative SHA-256 test and 50 milliseconds for the SHA-256 test from catameringue.
Sample code:
Encrypt data:
sjcl.encrypt("password", "data")
Decrypt data:
sjcl.decrypt("password", "encrypted-data")
This is an old question, but it is worth noting that asmCrypto is significantly faster than jsSHA, and also faster than CryptoJS and SJCL:
https://github.com/vibornoff/asmcrypto.js/
There is also a lite version (a fork of the above) maintained by OpenPGP.js, which only includes SHA-256 and a couple of AES features:
https://github.com/openpgpjs/asmcrypto-lite
To use asmCrypto, you can simply do the following:
var sha256HexValue = asmCrypto.SHA256.hex(myArraybuffer);
I'm able to hash a 150MB+ file in < 2 seconds consistently in Chrome.
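Here is what you're looking for. I derived this from a C version of the SHA-256 algorithm. It also includes SHA256D. I don't think you're going to get much faster than this with JavaScript. I tried unrolling the loops and it ran slower, due to optimizations performed by the JavaScript engine. To use it, create an instance with new vnet.crypt.sha2() and call its sha256(hash, data) method, where data is an array of byte values; the 32 digest bytes are written to hash.hash.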
// From: https://github.com/Hartland/GPL-CPU-Miner/blob/master/sha2.c
// Create the namespace objects if they do not exist yet.
if ("undefined" == typeof vnet) {
var vnet = {};
}
if ("undefined" == typeof vnet.crypt) {
vnet.crypt = {};
}
vnet.crypt.sha2 = function() {
var sha256_h = [
0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a,
0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19
];
var sha256_k = [
0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5,
0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3,
0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc,
0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7,
0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13,
0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3,
0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5,
0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208,
0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2
];
var sha256_init = function(s) {
s.state = [
sha256_h[0],
sha256_h[1],
sha256_h[2],
sha256_h[3],
sha256_h[4],
sha256_h[5],
sha256_h[6],
sha256_h[7],
];
}; this.sha256_init = sha256_init;
/*
* SHA256 block compression function. The 256-bit state is transformed via
* the 512-bit input block to produce a new state.
*/
var sha256_transform = function(s, b, swap) {
var block = b.block;
var state = s.state;
var W;
var S;
var t0;
var t1;
var i;
/* 1. Prepare message schedule W. */
if (swap) {
W = [
((((block[0] ) << 24) & 0xff000000) | (((block[0] ) << 8) & 0x00ff0000) | (((block[0] ) >> 8) & 0x0000ff00) | (((block[0] ) >> 24) & 0x000000ff)),
((((block[1] ) << 24) & 0xff000000) | (((block[1] ) << 8) & 0x00ff0000) | (((block[1] ) >> 8) & 0x0000ff00) | (((block[1] ) >> 24) & 0x000000ff)),
((((block[2] ) << 24) & 0xff000000) | (((block[2] ) << 8) & 0x00ff0000) | (((block[2] ) >> 8) & 0x0000ff00) | (((block[2] ) >> 24) & 0x000000ff)),
((((block[3] ) << 24) & 0xff000000) | (((block[3] ) << 8) & 0x00ff0000) | (((block[3] ) >> 8) & 0x0000ff00) | (((block[3] ) >> 24) & 0x000000ff)),
((((block[4] ) << 24) & 0xff000000) | (((block[4] ) << 8) & 0x00ff0000) | (((block[4] ) >> 8) & 0x0000ff00) | (((block[4] ) >> 24) & 0x000000ff)),
((((block[5] ) << 24) & 0xff000000) | (((block[5] ) << 8) & 0x00ff0000) | (((block[5] ) >> 8) & 0x0000ff00) | (((block[5] ) >> 24) & 0x000000ff)),
((((block[6] ) << 24) & 0xff000000) | (((block[6] ) << 8) & 0x00ff0000) | (((block[6] ) >> 8) & 0x0000ff00) | (((block[6] ) >> 24) & 0x000000ff)),
((((block[7] ) << 24) & 0xff000000) | (((block[7] ) << 8) & 0x00ff0000) | (((block[7] ) >> 8) & 0x0000ff00) | (((block[7] ) >> 24) & 0x000000ff)),
((((block[8] ) << 24) & 0xff000000) | (((block[8] ) << 8) & 0x00ff0000) | (((block[8] ) >> 8) & 0x0000ff00) | (((block[8] ) >> 24) & 0x000000ff)),
((((block[9] ) << 24) & 0xff000000) | (((block[9] ) << 8) & 0x00ff0000) | (((block[9] ) >> 8) & 0x0000ff00) | (((block[9] ) >> 24) & 0x000000ff)),
((((block[10]) << 24) & 0xff000000) | (((block[10]) << 8) & 0x00ff0000) | (((block[10]) >> 8) & 0x0000ff00) | (((block[10]) >> 24) & 0x000000ff)),
((((block[11]) << 24) & 0xff000000) | (((block[11]) << 8) & 0x00ff0000) | (((block[11]) >> 8) & 0x0000ff00) | (((block[11]) >> 24) & 0x000000ff)),
((((block[12]) << 24) & 0xff000000) | (((block[12]) << 8) & 0x00ff0000) | (((block[12]) >> 8) & 0x0000ff00) | (((block[12]) >> 24) & 0x000000ff)),
((((block[13]) << 24) & 0xff000000) | (((block[13]) << 8) & 0x00ff0000) | (((block[13]) >> 8) & 0x0000ff00) | (((block[13]) >> 24) & 0x000000ff)),
((((block[14]) << 24) & 0xff000000) | (((block[14]) << 8) & 0x00ff0000) | (((block[14]) >> 8) & 0x0000ff00) | (((block[14]) >> 24) & 0x000000ff)),
((((block[15]) << 24) & 0xff000000) | (((block[15]) << 8) & 0x00ff0000) | (((block[15]) >> 8) & 0x0000ff00) | (((block[15]) >> 24) & 0x000000ff))
];
} else {
W = [
block[0],
block[1],
block[2],
block[3],
block[4],
block[5],
block[6],
block[7],
block[8],
block[9],
block[10],
block[11],
block[12],
block[13],
block[14],
block[15]
];
}
for (i = 16; i < 64; i += 2) {
W[i] = ((
((((W[i-2] >>> 17) | (W[i-2] << 15)) ^ ((W[i-2] >>> 19) | ((W[i-2] << 13)>>>0) ) ^ (W[i - 2] >>> 10)) >>> 0) + //s1 (W[i - 2]) +
W[i - 7] +
((((W[i - 15] >>> 7) | (W[i - 15] << 25)) ^ ((W[i - 15] >>> 18) | ((W[i - 15] << 14) >>> 0)) ^ (W[i - 15] >>> 3)) >>> 0) + //s0 (W[i - 15]) +
W[i - 16]
) & 0xffffffff) >>> 0;
W[i+1] = ((
((((W[i-1] >>> 17) | (W[i-1] << 15)) ^ ((W[i-1] >>> 19) | (W[i-1] << 13)) ^ (W[i - 1] >>> 10)) >>> 0)+ //s1 (W[i - 1]) +
W[i - 6] +
((((W[i - 14] >>> 7) | (W[i - 14] << 25)) ^ ((W[i - 14] >>> 18) | (W[i - 14] << 14)) ^ (W[i - 14] >>> 3)) >>> 0) + //s0 (W[i - 14]) +
W[i - 15]
) & 0xffffffff) >>> 0;
}
/* 2. Initialize working variables. */
S = [
state[0],
state[1],
state[2],
state[3],
state[4],
state[5],
state[6],
state[7],
];
/* 3. Mix. */
i=0;
for(;i<64;++i) {
//RNDr(S,W,i)
t0 = S[(71 - i) % 8] +
((((S[(68 - i) % 8] >>> 6) | (S[(68 - i) % 8] << 26)) ^ ((S[(68 - i) % 8] >>> 11) | (S[(68 - i) % 8] << 21)) ^ ((S[(68 - i) % 8] >>> 25) | (S[(68 - i) % 8] << 7)))) + //S1 (S[(68 - i) % 8]) +
(((S[(68 - i) % 8] & (S[(69 - i) % 8] ^ S[(70 - i) % 8])) ^ S[(70 - i) % 8]) ) + // Ch
W[i] +
sha256_k[i];
t1 = ((((S[(64 - i) % 8] >>> 2) | ((S[(64 - i) % 8] & 3) << 30)) ^ ((S[(64 - i) % 8] >>> 13) | (S[(64 - i) % 8] << 19)) ^ ((S[(64 - i) % 8] >>> 22) | (S[(64 - i) % 8] << 10)))) + //S0 (S[(64 - i) % 8]) +
(((S[(64 - i) % 8] & (S[(65 - i) % 8] | S[(66 - i) % 8])) | (S[(65 - i) % 8] & S[(66 - i) % 8]))); // Maj
S[(67 - i) % 8] = ((S[(67 - i) % 8] + t0) & 0xFFFFFFFF) >>> 0;
S[(71 - i) % 8] = ((t0 + t1) & 0xFFFFFFFF) >>> 0;
}
/* 4. Mix local working variables into global state */
i=0;
for(;i<8;++i) {
s.state[i] = (0xFFFFFFFF & (state[i] + S[i])) >>> 0;
}
}; this.sha256_transform = sha256_transform;
var sha256d_hash1 = [
0x00000000, 0x00000000, 0x00000000, 0x00000000,
0x00000000, 0x00000000, 0x00000000, 0x00000000,
0x80000000, 0x00000000, 0x00000000, 0x00000000,
0x00000000, 0x00000000, 0x00000000, 0x00000100
];
var sha256d_80_swap = function(hash, data)
{
var S = new Array();
var i;
var b1 = new Array();
var b2 = new Array();
var b3 = new Array();
b1.block = [
data[0],
data[1],
data[2],
data[3],
data[4],
data[5],
data[6],
data[7],
data[8],
data[9],
data[10],
data[11],
data[12],
data[13],
data[14],
data[15]
];
b2.block = [
data[16],
data[17],
data[18],
data[19],
data[20],
data[21],
data[22],
data[23],
data[24],
data[25],
data[26],
data[27],
data[28],
data[29],
data[30],
data[31]
];
sha256_init(S);
sha256_transform(S, b1, 0);
sha256_transform(S, b2, 0);
b3.block = [
S.state[0],
S.state[1],
S.state[2],
S.state[3],
S.state[4],
S.state[5],
S.state[6],
S.state[7],
sha256d_hash1[8],
sha256d_hash1[9],
sha256d_hash1[10],
sha256d_hash1[11],
sha256d_hash1[12],
sha256d_hash1[13],
sha256d_hash1[14],
sha256d_hash1[15]
];
sha256_init(hash);
sha256_transform(hash, b3, 0);
for (i = 0; i < 8; i++) {
hash.state[i] = ((((hash.state[i] ) << 24) & 0xff000000) | (((hash.state[i] ) << 8) & 0x00ff0000) | (((hash.state[i] ) >> 8) & 0x0000ff00) | (((hash.state[i] ) >> 24) & 0x000000ff)); //swab32(hash[i]);
}
}; this.sha256d_80_swap = sha256d_80_swap;
var sha256d = function(hash, data) {
var S;
var T;
var block_in;
S = new Array();
T = new Array();
T.block = [];
var i, r;
//hash.hash = new Array(32).join('0').split('').map(parseFloat);
sha256_init(S);
for (r = data.length; r > -9; r -= 64) {
if (r < 64) {
if (r > 0) {
block_in = data.slice(data.length - r,data.length);
block_in.push.apply(block_in, new Array(64-r).join('0').split('').map(parseFloat));
} else {
block_in = new Array(64).join('0').split('').map(parseFloat);
}
} else {
block_in = data.slice(data.length - r,data.length - r + 64);
}
//memcpy(T, data + len - r, r > 64 ? 64 : (r < 0 ? 0 : r));
if (r >= 0 && r < 64) {
block_in[r] = 0x80;
}
for (i = 0; i < 16; i++) {
T.block[i] = (((0xff & block_in[(i*4)]) << 24) | ((0xff & block_in[(i*4)+1]) << 16) | ((0xff & block_in[(i*4)+2]) << 8) | (0xff & block_in[(i*4)+3])) >>> 0;
}
if (r < 56) {
T.block[15] = 8 * data.length;
}
sha256_transform(S, T, 0);
}
//memcpy(S + 8, sha256d_hash1 + 8, 32);
S.block = S.state;
for(i=8;i<16;i++) {
S.block[i] = sha256d_hash1[i];
}
sha256_init(T);
sha256_transform(T, S, 0);
hash.hash = [
(T.state[0] >> 24) & 0xff,
(T.state[0] >> 16) & 0xff,
(T.state[0] >> 8) & 0xff,
T.state[0] & 0xff,
(T.state[1] >> 24) & 0xff,
(T.state[1] >> 16) & 0xff,
(T.state[1] >> 8) & 0xff,
T.state[1] & 0xff,
(T.state[2] >> 24) & 0xff,
(T.state[2] >> 16) & 0xff,
(T.state[2] >> 8) & 0xff,
T.state[2] & 0xff,
(T.state[3] >> 24) & 0xff,
(T.state[3] >> 16) & 0xff,
(T.state[3] >> 8) & 0xff,
T.state[3] & 0xff,
(T.state[4] >> 24) & 0xff,
(T.state[4] >> 16) & 0xff,
(T.state[4] >> 8) & 0xff,
T.state[4] & 0xff,
(T.state[5] >> 24) & 0xff,
(T.state[5] >> 16) & 0xff,
(T.state[5] >> 8) & 0xff,
T.state[5] & 0xff,
(T.state[6] >> 24) & 0xff,
(T.state[6] >> 16) & 0xff,
(T.state[6] >> 8) & 0xff,
T.state[6] & 0xff,
(T.state[7] >> 24) & 0xff,
(T.state[7] >> 16) & 0xff,
(T.state[7] >> 8) & 0xff,
T.state[7] & 0xff
];
}; this.sha256d = sha256d;
var sha256 = function(hash, data) {
var S;
var T;
var block_in;
S = new Array();
T = new Array();
T.block = [];
var i, r;
hash.hash = new Array(32).join('0').split('').map(parseFloat);
sha256_init(S);
for (r = data.length; r > -9; r -= 64) {
if (r < 64) {
if (r > 0) {
block_in = data.slice(data.length - r,data.length);
block_in.push.apply(block_in, new Array(64-r).join('0').split('').map(parseFloat));
} else {
block_in = new Array(64).join('0').split('').map(parseFloat);
}
} else {
block_in = data.slice(data.length - r,data.length - r + 64);
}
//memcpy(T, data + len - r, r > 64 ? 64 : (r < 0 ? 0 : r));
if (r >= 0 && r < 64) {
block_in[r] = 0x80;
}
for (i = 0; i < 16; i++) {
T.block[i] = (((0xff & block_in[(i*4)]) << 24) | ((0xff & block_in[(i*4)+1]) << 16) | ((0xff & block_in[(i*4)+2]) << 8) | (0xff & block_in[(i*4)+3])) >>> 0;
}
if (r < 56) {
T.block[15] = 8 * data.length;
}
sha256_transform(S, T, 0);
}
for (i = 0; i < 8; i++) {
//be32enc((uint32_t *)hash + i, T[i]);
hash.hash[(i * 4)] = (S.state[i] >> 24) & 0xff;
hash.hash[(i * 4)+1] = (S.state[i] >> 16) & 0xff;
hash.hash[(i * 4)+2] = (S.state[i] >> 8) & 0xff;
hash.hash[(i * 4)+3] = S.state[i] & 0xff;
}
}; this.sha256 = sha256;
};
It might be faster to use an Emscripten-compiled version of the crypto libraries:
Q. How fast will the compiled code be?
A. Emscripten's default code generation mode is in asm.js format,
which is a subset of JavaScript designed to make it possible for
JavaScript engines to execute very quickly. See here for up-to-date
benchmark results. In many cases, asm.js can get quite close to native
speed.
You can find an Emscripten-compiled NaCl cryptographic library here.
I use SubtleCrypto.digest().
With a test file of about 85 MB, it takes less than a second to finish.
<input type="file" multiple/>
<input placeholder="Press `Enter` when done."/>
<script>
/**
* @param {"SHA-1"|"SHA-256"|"SHA-384"|"SHA-512"} algorithm https://developer.mozilla.org/en-US/docs/Web/API/SubtleCrypto/digest
* @param {string|Blob} data
*/
async function getHash(algorithm, data) {
const main = async (msgUint8) => { // https://developer.mozilla.org/en-US/docs/Web/API/SubtleCrypto/digest#converting_a_digest_to_a_hex_string
const hashBuffer = await crypto.subtle.digest(algorithm, msgUint8)
const hashArray = Array.from(new Uint8Array(hashBuffer))
return hashArray.map(b => b.toString(16).padStart(2, '0')).join(''); // convert bytes to hex string
}
if (data instanceof Blob) {
const arrayBuffer = await data.arrayBuffer()
const msgUint8 = new Uint8Array(arrayBuffer)
return await main(msgUint8)
}
const encoder = new TextEncoder()
const msgUint8 = encoder.encode(data)
return await main(msgUint8)
}
const inputFile = document.querySelector(`input[type="file"]`)
const inputText = document.querySelector(`input[placeholder^="Press"]`)
inputFile.onchange = async (event) => {
for (const file of event.target.files) {
console.log(file.name, file.type, file.size + " bytes")
const hashHex = await getHash("SHA-256", file) // a File is already a Blob
console.log(hashHex)
}
}
inputText.onkeyup = async (keyboardEvent) => {
if (keyboardEvent.key === "Enter") {
const hashHex = await getHash("SHA-256", keyboardEvent.target.value)
console.log(hashHex)
}
}
</script>
As some have answered, it can be done without a library (the snippet below uses TypeScript type annotations):
async function getChecksumSha256(blob: Blob): Promise<string> {
const uint8Array = new Uint8Array(await blob.arrayBuffer());
const hashBuffer = await crypto.subtle.digest('SHA-256', uint8Array);
const hashArray = Array.from(new Uint8Array(hashBuffer));
return hashArray.map((h) => h.toString(16).padStart(2, '0')).join('');
}
Source: https://gist.github.com/bilelz/c96fb0b1f62983d061910e8d310a5162
You can do that without external libraries using the crypto.subtle API (Web Crypto). More details here.
Example:
function b2h(buffer) {
return Array.prototype.map.call(new Uint8Array(buffer), x => ('00' + x.toString(16)).slice(-2)).join('');
}
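// file is a File or Blob, e.g. taken from an <input type="file"> element
const FILEREADER = new FileReader();
// Register the handler before starting the read; onload fires only on success.
FILEREADER.onload = async function(entry) {
const FILE_HASH = b2h(await crypto.subtle.digest('SHA-256', entry.target.result)); // hex-encoded SHA-256 digest of the file
}
FILEREADER.readAsArrayBuffer(file);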