decompress gzip compressed data in javascript

decompress gzip compressed data in javascript - javascript

i am compressing data in gzip format in java like this
Object obj = parser.parse(new FileReader("/home/netstorm/FinalFunnelData.json"));
JSONObject jsonObject = (JSONObject) obj;
String data = jsonObject.toJSONString();
ByteOutputStream bos = new ByteOutputStream();
GZIPOutputStream gzip = new GZIPOutputStream(bos);
byte B_ARRAY[] = new byte[1024];
gzip.write(data.trim().getBytes());
gzip.close();
// response.setHeader("Content-Type", "application/json");
//response.setHeader("Content-Encoding", "gzip");
System.out.println(bos.toString());
out.println(bos.toString());
but i am unable to decompress it on javascript side?

The only thing that stands out in your code is :
out.println(bos.toString());
gzip'ed data is binary, so you cannot treat it as text (toString() and println).
I'm no Java expert, but you may want to try
GZIPOutputStream gzip = new GZIPOutputStream(out);
By giving it out directly, instead of first putting it in a ByteOutputStream, it might just do the right thing.

Related

base64 encode in python, decode in javascript

A Python backend reads a binary file, base64 encodes it, inserts it into a JSON doc and sends it to a JavaScript frontend:
#Python
with open('some_binary_file', 'rb') as in_file:
return base64.b64encode(in_file.read()).decode('utf-8')
The JavaScript frontend fetches the base64 encoded string from the JSON document and turns it into a binary blob:
#JavaScript
b64_string = response['b64_string'];
decoded_file = atob(b64_string);
blob = new Blob([decoded_file], {type: 'application/octet-stream'});
Unfortunately when downloading the blob the encoding seems to be wrong but I'm not sure where the problem is. E.g. it is an Excel file that I can't open anymore. In the Python part I've tried different decoders ('ascii', 'latin1') but that doesn't make a difference. Is there a problem with my code?

I found the answer here. The problem was at the JavaScript side. It seems only applying atob to the base64 encoded string doesn't work for binary data. You'll have to convert that into a typed byte array. What I ended up doing (LiveScript):
byte_chars = atob base64_str
byte_numbers = [byte_chars.charCodeAt(index) for bc, index in byte_chars]
byte_array = new Uint8Array byte_numbers
blob = new Blob [byte_array], {type: 'application/octet-stream'}

nodejs binary websocket mimetype handling

i'm not 100% sure but from what i read when i send a blob (binary data) over websocket, the blob does not contain any file information. (Also the official specification states that wesockets only send the raw binary)
the filesize
the mimetype
user info (explain later)
i'm using https://github.com/websockets/ws
Testing:
Sending directly the blob from an input file.
ws.send(this.files[0]) //this should already contain the info
Creating a new blob with the native javascript api from file setting the proper mimetype.
ws.send(new Blob([this.files[0]],{type:this.files[0].type})); //also this
on both sides you can get only the effective blob without any other information.
Is it possible to append let's say a 4kb predefined json data converted also to binary that contains important information like the mimetype and the filesize,
and then just split off the 4kb when needed?
{"mime":"txt/plain","size":345}____________4KB_REST_OF_THE_BINARY
OR
ws.send({"mime":"txt\/plain","size":345})
ws.send(this.files[0])
Even if the first one is the worst solution ever it would allow me to send everything in one time.
The second one has a big problem:
it's a chat that allows to send also files like documents,images,music videos.
i could write some sort of handshaking system when sending the file/user info before i send the binary data.
BUT
if another person sends also a file, as it's async, the handshaking system has no chance to determine wich file is the right one for the correct user and mimetype.
So how do you properly send a binary file in a multiuser async envoirement?
i know i can convert to base64 but thats 30% bigger.
btw. Totally disappointed with Apple... while chrome shows every binary data properly, my ios devices are not able to handle blob's, only images will show in blob or base64 format, not even a simple txt file. Basically only a <img> tag can read dynamic files.
How everything works (now):
user sends a file
nodejs gets the binary data, also user info... but not mimetype,filename,size.
nodejs broadcasts the raw binary file to all the users.(can't specify user & file info)
clients create a bloburl (who send that? XD).
EDIT
what i have now:
client 1 (sends a file)CHROME
fileInput.addEventListener('change',function(e){
var file=this.files[0];
ws.send(new Blob([file],{
type:file.type //<- SET MIMETYPE
}));
//file.size
},false);
note: file is already a blob ... but this is how you would normally create a new blob specifying the mimetype.
server (broadcasts the binary data to the other clients)NODEJS
aaaaaand the mimetype is gone...
ws.addListener('message',function(binary){
var b=0,c=wss.clients.length;
while(b<c){
wss.clients[b++].send(binary)
}
});
client 2 (recieves the binary)CHROME
ws.addEventListener('message',function(msg){
var blob=new Blob([msg.data],{
type:'application/octet-stream' //<- LOST
});
var file=window.URL.createObjectURL(blob);
},false);
note: m.data is already a blob ... but this is how you would normally create a new blob specifying the mimetype witch is lost.
In client 2 i need the mimetype and naturally i also need the info about the user, wich can be retrieved from client 1 or the server (not a good choice)...

You're a bit out of luck with this because Node doesn't support the Blob interface and so any data you send or receive in Binary with Node is just Binary. You would have to have something that knew how to interpret a Blob object.
Here's an idea, and let me know if this works. Reading through the documentation for websockets\ws it says it supports sending and receiving ArrayBuffers. Which means you can use TypedArrays.
Here's where it gets nasty. You set a certain fixed n number of bytes at the beginning of every TypedArray to signal the mime type encoded in utf8 or what have you, and the rest of your TypedArray contains your file's bytes.
I would recommend using UInt8Array because utf8 characters are 8 bits long and your text will probably be readable when encoded that way. As for the file bits you'll probably just end up writing those down somewhere and appending an ending to it.
Also note, this method of interpretation works both ways whether from Node or in the Browser.
This solution is really just a form of type casting and you might get some unexpected results. The fixed length of your mime type field is crucial.
Here it is illustrated. Copy, paste, set the image file to whatever you want and then run that. You'll see the mime type I set pop out.
var fs = require('fs');
//https://stackoverflow.com/questions/8609289/convert-a-binary-nodejs-buffer-to-javascript-arraybuffer
function toUint8Array(buffer) {
var ab = new ArrayBuffer(buffer.length);
var array = new Uint8Array(ab);
for (var i = 0; i < buffer.length; ++i) {
array[i] = buffer[i];
}
return array;
}
//data is a raw Buffer object
fs.readFile('./ducklings.png', function (err, data) {
var mime = new Buffer('image/png');
var allBuffed = Buffer.concat([mime, data]);
var array = toUint8Array(allBuffed);
var mimeBytes = array.subarray(0,9); //number of characters in mime Buffer
console.log(String.fromCharCode.apply(null, mimeBytes));
});
Here's how you do it on the client side:
SOLUTION A: GET A PACKAGE
Get buffer, an implementation of Node's Buffer API for browsers. The solution to concatenate Byte buffers will work exactly as before. You can append fields like To: and what not as well. The way you format your headers in order to best serve your clients will be an evolving process I'm sure.
SOLUTION B: OLD SCHOOL
STEP 1: Convert your Blob to an ArrayBuffer
Notes: How to convert a String to an ArrayBuffer
var fr = new FileReader();
fr.addEventListener('loadend', function () {
//Asynchronous action in part 2.
var message = concatenateBuffers(headerStringAsBuffer, fr.result);
ws.send(message);
});
fr.readAsArrayBuffer(blob);
STEP 2: Concatenate ArrayBuffers
function concatenateBuffers(buffA, buffB) {
var byteLength = buffA.byteLength + buffB.byteLength;
var resultBuffer = new ArrayBuffer(byteLength);
//wrap ArrayBuffer in a typedArray/view
var resultView = new Uint8Array(resultBuffer);
var viewA = new Uint8Array(resultBuffer);
var viewB = new Uint8Array(resultBuffer);
//Copy 8 bit integers AKA Bytes
resultView.set(viewA);
resultView.set(viewB, viewA.byteLength);
return resultView.buffer
}
STEP 3: Receive and Reblob
I'm not going to repeat how to convert the concatenated String bytes back into a string because I've done it in the server example, but for turning the file bytes into a blob of your mime type is fairly simple.
new Blob(buffer.slice(offset, buffer.byteLength), {type: mimetype});
This Gist by robnyman goes into further details on how you would use an image transmitted via XHR, put it into localstorage, and use it in an image tag on your page.

I liked #Breedly's idea of prepending a fixed length byte array to indicate mime type of the ArrayBuffer so I created this npm package that I use when dealing with websockets but maybe others' might find it useful.
Example usage
const {
arrayBufferWithMime,
arrayBufferMimeDecouple
} = require('arraybuffer-mime')
// some image array buffer
const uint8 = new Uint8Array(1)
uint8[0] = 1
const ab = uint8.buffer
const mime = 'image/png'
const abWithMime = arrayBufferWithMime(ab, mime)
const {mime, arrayBuffer} = arrayBufferMimeDecouple(abWithMime)
console.log(mime) // "image/png"
console.log(arrayBuffer) // ArrayBuffer

JavaScript Zlib Decompress

I'm trying to decompress zlib'ed XML such as the following:
https://drive.google.com/file/d/0B52P0MZLTdw8ZzQwQzVpZGZVZWc
Uploading to online decompress services works, such as: http://i-tools.org/gzip
In PHP, I'm using this code and works just fine, I get the XML string:
$raw = file_get_contents("file_here");
$uncompressed = zlib_decode($raw);
However, I want to do this in JavaScript.
The app is a client-side Chrome extension which uses chrome.devtools.network that reads from the network logs
Reads binary responses. Example at Google Drive link at the top
JS needs to decompress that response to its original XML and parsed afterwards into object
The only problem I have is the zlib decompress part.
As of Latest Update, the Decompression Libraries work but unpacking doesn't. Please skip to the Update Sept 16 at the bottom.
I have already tried several JavaScript libraries and still cannot make it work:
Pako: https://github.com/nodeca/pako
unpack() code: https://codereview.stackexchange.com/questions/3569/pack-and-unpack-bytes-to-strings
function unpack(str) {
var bytes = [];
for(var i = 0, n = str.length; i < n; i++) {
var char = str.charCodeAt(i);
bytes.push(char >>> 8, char & 0xFF);
}
return bytes;
}
$.get("file_here", function(response){
var charData = unpack(response);
var binData = new Uint8Array(charData);
var data = pako.inflate(binData);
var strData = String.fromCharCode.apply(null, new Uint16Array(data));
console.log(strData);
});
Error: Uncaught incorrect header check
It's the same even placing the response elsewhere:
new Uint8Array(response);
pako.inflate(response);
Imaya's zlib: https://github.com/imaya/zlib.js
$.get("file_here", function(response){
var inflate = new Zlib.Inflate(response);
var output = inflate.decompress();
console.log(output);
});
Error: Uncaught Error: unsupported compression method inflate.js:60
Still using Imaya's zlib, combining with this Stack Overflow question:
Decompress gzip and zlib string in javascript
$.get("file_here", function(response){
var response = response.split('').map(function(e) {
return e.charCodeAt(0);
});
var inflate = new Zlib.Inflate(response);
var output = inflate.decompress();
console.log(output);
});
Error: Uncaught Error: invalid fcheck flag:29 inflate.js:65
dankogai's js-deflate: https://github.com/dankogai/js-deflate
console.log(RawDeflate.inflate(response));
Output: empty
augustl's js-inflate: https://github.com/augustl/js-inflate
console.log(JSInflate.inflate(response));
Output: empty
zlib-browserify: https://github.com/brianloveswords/zlib-browserify
Error: ReferenceError: exports is not defined
This is just a wrapper for Imaya's zlib. I think this is requireJS? I'm not even sure how to use it. Can it even be used without installing anything and just jQuery/JS? The app as mentioned is downloadable Chrome extension with just HTML importing JS files.
UPDATE Sept 16, 2014
It seems the problem is with the JavaScript unpack( ) function. When I use the ByteArray generated by PHP: http://pastebin.com/uDWvK94B, the JavaScript decompression functions work.
PHP unpacking that works:
$unpacked = unpack("C*", $raw);
For the JavaScript unpack( ) code that I use, which doesn't work, see top of the post under Pako section.
So the new question is, why does JavaScript generate a different ByteArray values than the one generated by PHP.
Is it really a problem with the unpack( ) function?
or is it something when the JS fetches the file, the encoding or whatsoever changes thus bytes get messed up?
and lastly, what is your suggested fix?
UPDATE Sept 20, 2014
With more research and some of the answers here giving leads
Sebastian S opening the idea that the problem was in the manner of retrieving data and it had something to do with text encodings
user3995789 providing an example that it will work even without the unpack( ) function, though outside the context of Chrome extensions
Isaac providing examples in the context of Chrome extensions, but still does not work
With that I researched further combining all leads which lead me to a theory that the reason behind all this is that Chrome is unable to get "raw" data through its request.getContent function. See here for the Chrome documentation for the said function.
As of now, I have taken the issue to Chrome, see here.
UPDATE March 24, 2015
Although the problem was not fully resolved, the answer which I think was the most useful to me was from #Sebastian S, who proposed that "the way" I was taking or receiving the data was at fault and a bad conversion was the cause, which is as near as the problem was.

Jquery reads in utf8 format, you have to read the raw file, this function will work.
function readTextFile(file)
{
var rawFile = new XMLHttpRequest();
rawFile.open('GET', file, true);
rawFile.responseType = 'arraybuffer';
rawFile.onload = function (response)
{
var words = new Uint8Array(rawFile.response);
console.log(words[1]);
console.log(pako.ungzip(words));
};
rawFile.send();
}
For more information see this answer.

I understood that you wanna use the zlib decompression inside a chrome extension while reading reponses bodies from the network log.
You need first to retrieve the base64 who will be decompressed. You can achieve this while using the getContent method.
function zlibDecompress(base64Content){
// var base64Content = base64Content.split(',')[1]; // Not sure if need to keep it
// Decode base64 (convert ascii to binary)
var strData = atob(base64Content);
// Convert binary string to character-number array
var charData = strData.split('').map(function(x){return x.charCodeAt(0);});
// Turn number array into byte-array
var binData = new Uint8Array(charData);
// Pako inflate
var data = pako.inflate(binData, { to: 'string' });
return data;
}
chrome.devtools.network.onRequestFinished.addListener(
function(request) {
request.getContent(
function(content, encoding){
if(encoding == 'base64'){
var output = zlibDecompress(content);
}
}
);
}
);
https://developer.chrome.com/extensions/devtools_network#type-Request
Using XMLHttpRequest :
<script type="text/javascript" src="pako.js"></script>
<script type="text/javascript">
function zlibDecompress(url){
var xhr = new XMLHttpRequest();
xhr.open('GET', url, true);
xhr.responseType = 'blob';
xhr.onload = function(oEvent) {
// Base64 encode
var reader = new window.FileReader();
reader.readAsDataURL(xhr.response);
reader.onloadend = function() {
base64data = reader.result;
var base64 = base64data.split(',')[1];
// Decode base64 (convert ascii to binary)
var strData = atob(base64);
// Convert binary string to character-number array
var charData = strData.split('').map(function(x){return x.charCodeAt(0);});
// Turn number array into byte-array
var binData = new Uint8Array(charData);
// Pako inflate
var data = pako.inflate(binData, { to: 'string' });
console.log(data);
}
};
xhr.send();
}
zlibDecompress('fileurl');
</script>
If you wanna use XMLHttpRequest with chrome extension
{
"name": "My extension",
...
"permissions": [
"http://www.domain.com/", // The domain that hold the file
"http://*/" // Or every domain
],
...
}
https://developer.chrome.com/extensions/xhr
Feel free to ask if you have any questions ;)

In my opinion the question you should really be asking is: How do you retrieve the compressed data? As soon as it becomes an UTF-16 string, the trouble begins. I'm not even sure, if the conversion from raw byte data to javascript strings is lossless.
As you wrote something about php, I assume you're communicating to some sort of backend. If this is true, there are options to handle binary data with native means. Maybe this can help you: https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest/Sending_and_Receiving_Binary_Data

Download file using Javascript from a bytearray

I have a button in my asp net app that when pressed I am calling a rest service that returns a bytearray.
This bytearray is actually a file that I need to start downloading in the browser when the button is pressed. How can I achieve this?
I am thinking along the lines of writing the bytearray to the response and setting the headers.
Is this thinking correct and does someone have some code samples?
---------Update on 3/25----------------
Thanks Justin but not yet what I need. Please look at this link it will return a file for download. What I need to do is have an event client side that will get this file for download without redirecting to this page. It has to be downloaded from my page and not this link.
http://ops.epo.org/3.0/rest-services/published-data/images/US/5000001/PA/firstpage.pdf?Range=1
If you check it out with Fiddler, you will see how the pdf is received as binary.

You can set the responseType as arraybuffer and access it that way. There is even a way to stream the data using onprogress events. JavaScript has come a long way.
var oReq = new XMLHttpRequest();
oReq.open("GET", "/myfile.png", true);
oReq.responseType = "arraybuffer";
oReq.onload = function (oEvent) {
var arrayBuffer = oReq.response; // Note: not oReq.responseText
if (arrayBuffer) {
var byteArray = new Uint8Array(arrayBuffer);
for (var i = 0; i < byteArray.byteLength; i++) {
// do something with each byte in the array
}
}
};
oReq.send(null);
https://developer.mozilla.org/en-US/docs/DOM/XMLHttpRequest/Sending_and_Receiving_Binary_Data
If you mean that the webservice isn't returning binary data, and instead is JSON data like: [0,1,3,4] then that is a different issue.

As far as i know, the closest thing you can do with javascript is an ajax request that has the server parse the data as text

ZLIB Decompression - Client Side

I am receiving data as an "ZLIB" compressed inputstream.
Using Javascript/Ajax/JQuery, I need to uncompress it on the client side.
Is there a way to do so?
I already have this working in JAVA as below, but need to do this on Client Side.
url = new URL(getCodeBase(), dataSrcfile);
URLConnection urlConn = url.openConnection();
urlConn.setUseCaches(false);
InputStream in = urlConn.getInputStream();
InflaterInputStream inflate = new InflaterInputStream(in);
InputStreamReader inputStreamReader = new InputStreamReader(inflate);
InputStreamReader inputStreamReader = new InputStreamReader(in);
BufferedReader bufReader = new BufferedReader(inputStreamReader);
// Read until no more '#'
int i = 0;
int nHidden = 0;
String line1;
do //------------------------Parsing Starts Here
{
line1 = bufReader.readLine();
.............
...... so on

Pako is a full and modern Zlib port.
Here is a very simple example and you can work from there.
Get pako.js and you can decompress byteArray like so:
<html>
<head>
<title>Gunzipping binary gzipped string</title>
<script type="text/javascript" src="pako.js"></script>
<script type="text/javascript">
// Get datastream as Array, for example:
var charData = [31,139,8,0,0,0,0,0,0,3,5,193,219,13,0,16,16,4,192,86,214,151,102,52,33,110,35,66,108,226,60,218,55,147,164,238,24,173,19,143,241,18,85,27,58,203,57,46,29,25,198,34,163,193,247,106,179,134,15,50,167,173,148,48,0,0,0];
// Turn number array into byte-array
var binData = new Uint8Array(charData);
// Pako magic
var data = pako.inflate(binData);
// Convert gunzipped byteArray back to ascii string:
var strData = String.fromCharCode.apply(null, new Uint16Array(data));
// Output to console
console.log(strData);
</script>
</head>
<body>
Open up the developer console.
</body>
</html>
Running example: http://jsfiddle.net/9yH7M/
Alternatively you can base64 encode the array before you send it over as the Array takes up a lot of overhead when sending as JSON or XML. Decode likewise:
// Get some base64 encoded binary data from the server. Imagine we got this:
var b64Data = 'H4sIAAAAAAAAAwXB2w0AEBAEwFbWl2Y0IW4jQmziPNo3k6TuGK0Tj/ESVRs6yzkuHRnGIqPB92qzhg8yp62UMAAAAA==';
// Decode base64 (convert ascii to binary)
var strData = atob(b64Data);
// Convert binary string to character-number array
var charData = strData.split('').map(function(x){return x.charCodeAt(0);});
// Turn number array into byte-array
var binData = new Uint8Array(charData);
// Pako magic
var data = pako.inflate(binData);
// Convert gunzipped byteArray back to ascii string:
var strData = String.fromCharCode.apply(null, new Uint16Array(data));
// Output to console
console.log(strData);
Running example: http://jsfiddle.net/9yH7M/1/
To go more advanced, here is the pako API documentation.

A more recent offering is https://github.com/imaya/zlib.js
I think it's much better than the alternatives.

Our library JSXGraph contains the deflate, unzip and gunzip algorithm. Please, have a look at jsxcompressor (a spin-off from JSXGraph, see http://jsxgraph.uni-bayreuth.de/wp/download/) or at Utils.js in the source code.

Try pako https://github.com/nodeca/pako , it's not just inflate/deflate, but exact zlib port to javascript, with almost all features and options supported. Also, it's the fastest implementation in modern browsers.

Just as the first comments to your question suggest, I would suspect that you actually want the browser to handle the decompression. If I am mistaken, you might want to check out the JSXGraph library, it is supposed to contain pure JS implementations for deflate and unzip.

The js-deflate project by dankogai may be what you are looking for. I haven't actually tried it, but the rawinflate.js code seems fairly minimal, and should be able to decompress DEFLATE/zlib:ed data.

Using Pako you can decode the compressed(gzib) response, short code for above answer
JSON.parse(Pako.inflate(Buffer.from(data, 'base64'), { to: 'string' }))
Buffer.from, Pako

Browserify-zlib works perfectly for me, it uses pako and has the exact same api as zlib. After I struggled for days with compressing/decompressing zlib encoded payloads in client side with pako, I can say that browserify-zlib is really convenient.

you should see zlib rfc.
the javascript inflate code I tested: inflate in Javascript
the java code I wrote:
static public byte[] compress(byte[] input) {
Deflater deflater = new Deflater();
deflater.setInput(input, 0, input.length);
deflater.finish();
byte[] buff = new byte[input.length + 50];
deflater.deflate(buff);
int compressedSize = deflater.getTotalOut();
if (deflater.getTotalIn() != input.length)
return null;
byte[] output = new byte[compressedSize - 6];
System.arraycopy(buff, 2, output, 0, compressedSize - 6);// del head and
// foot byte
return output;
}
The very Important thing is in deflate in Java you must cut the head 2 byte,foot 4 byte,to get the raw deflate.

Develop Reference

JavaScript is the programming language of the Web.

decompress gzip compressed data in javascript - javascript

Related

base64 encode in python, decode in javascript

nodejs binary websocket mimetype handling

JavaScript Zlib Decompress

Download file using Javascript from a bytearray

ZLIB Decompression - Client Side

Categories

Resources