I'm getting gzipped data from a business API I'm working with, but I can't manage to decompress it into something readable in JS, though I managed with C#.
My question is: how do I unzip the received gzipped input to a string or JSON?
The following code works well for me in C#:
using (HttpWebResponse response = (HttpWebResponse)WebRequest.Create(url).GetResponse())
{
using (GZipStream decompress = new GZipStream(response.GetResponseStream(), CompressionMode.Decompress))
{
using (StreamReader reader = new StreamReader(decompress))
{
responseFromServer = reader.ReadToEnd();
}
}
}
I've read various answers and tried some libraries but still can't manage to decompress it in JS (using same URL).
This is where the code should be in JS:
var requestData = {
url: url,
headers: {
"Allow-Encoding": "gzip"
}
}
request.get(requestData, function(error, response, body) {
// compressed data is in body
});
I've tried pako and zlib, but I am probably not using them correctly.
[edit]
Some of my tries:
// Decode base64 (convert ascii to binary)
var strData = new Buffer(body).toString('base64');
// Convert binary string to character-number array
var charData = strData.split('').map(function (x) { return x.charCodeAt(0); });
// Turn number array into byte-array
var binData = new Uint8Array(charData);
// Pako magic
var data = pako.inflate(binData);
// Convert gunzipped byteArray back to ascii string:
var strData2 = String.fromCharCode.apply(null, new Uint8Array(data));
This code is running in a Node.js application, and I'm using the request package.
Thanks
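For reference, here is a minimal sketch of the decompression step in Node.js using the built-in zlib module. It assumes the response body is available as a raw Buffer rather than an already decoded string (with the request package, that would mean passing `encoding: null`; the package also documents a `gzip: true` option that decompresses for you). Note too that the standard request header is `Accept-Encoding`, not `Allow-Encoding`. The payload below is simulated with `zlib.gzipSync`, and `gunzipToJson` is a hypothetical helper name, not part of any library.

```javascript
const zlib = require('zlib');

// Hypothetical helper: gunzip a Buffer and parse the result as JSON.
function gunzipToJson(compressed) {
  const text = zlib.gunzipSync(compressed).toString('utf8');
  return JSON.parse(text);
}

// Simulated API response body (a stand-in for the real gzipped payload).
const fakeBody = zlib.gzipSync(
  Buffer.from(JSON.stringify({ ok: true, items: [1, 2, 3] }))
);

const decoded = gunzipToJson(fakeBody);
console.log(decoded.items.length);
```

The important part is that the bytes reach `gunzipSync` unmodified. If the body has already been decoded into a string, the compressed data is likely damaged before decompression starts.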
I have a PDF file which I want to read into memory using NodeJS. Ideally I'd like to encode it using base64 for transferring it. But somehow the read function does not seem to read the full PDF file, which makes no sense to me. The original PDF was generated using pdfKit, and is ok and viewable using a PDF reader program.
The original file test.pdf has 90kB on disk. But if I read and write it back to disk there are just 82kB and the new PDF test-out.pdf is not ok. The pdf viewer says:
Unable to open document. The pdf document is damaged.
The base64 encoding therefore also does not work correctly. I tested it using this webservice. Does someone know why and what is happening here? And how to resolve it.
I found this post already.
fs = require('fs');
let buf = fs.readFileSync('test.pdf'); // returns raw buffer binary data
// buf = fs.readFileSync('test.pdf', {encoding:'base64'}); // for the base64 encoded data
// ...transfer the base64 data...
fs.writeFileSync('test-out.pdf', buf); // should be pdf again
EDIT MCVE:
const fs = require('fs');
const PDFDocument = require('pdfkit');
let filepath = 'output.pdf';
class PDF {
constructor() {
this.doc = new PDFDocument();
this.setupdocument();
this.doc.pipe(fs.createWriteStream(filepath));
}
setupdocument() {
var pageNumber = 1;
this.doc.on('pageAdded', () => {
this.doc.text(++pageNumber, 0.5 * (this.doc.page.width - 100), 40, {width: 100, align: 'center'});
}
);
this.doc.moveDown();
// draw some headline text
this.doc.fontSize(25).text('Some Headline');
this.doc.fontSize(15).text('Generated: ' + new Date().toUTCString());
this.doc.moveDown();
this.doc.font('Times-Roman', 11);
}
report(object) {
this.doc.moveDown();
this.doc
.text(object.location+' '+object.table+' '+Date.now())
.font('Times-Roman', 11)
.moveDown()
.text(object.name)
.font('Times-Roman', 11);
this.doc.end();
let report = fs.readFileSync(filepath);
return report;
}
}
let pdf = new PDF();
let buf = pdf.report({location: 'athome', table:'wood', name:'Bob'});
fs.writeFileSync('outfile1.pdf', buf);
The encoding option for fs.readFileSync() is for you to tell the readFile function what encoding the file already is so the code reading the file knows how to interpret the data it reads. It does not convert it into that encoding.
In this case, your PDF is binary - it's not base64 so you are telling it to try to convert it from base64 into binary which causes it to mess up the data.
You should not be passing the encoding option at all and you will then get the RAW binary buffer (which is what a PDF file is - raw binary). If you then want to convert that to base64 for some reason, you can then do buf.toString('base64') on it. But, that is not its native format and if you write that converted data back out to disk, it won't be a legal PDF file.
To just read and write the same file out to a different filename, leave off the encoding option entirely:
const fs = require('fs');
let buf = fs.readFileSync('test.pdf'); // get raw buffer binary data
fs.writeFileSync('test-out.pdf', buf); // write out raw buffer binary data
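As a rough illustration of the transfer case, the sketch below reads a binary file as a raw Buffer, converts it to base64 with `buf.toString('base64')`, and converts it back before writing. The file names and the sample bytes are made up for the example; a real PDF would be read the same way.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Stand-in for a binary file: these bytes are not valid UTF-8 text.
const original = Buffer.from([0x25, 0x50, 0x44, 0x46, 0xff, 0x00, 0x80, 0xfe]);
const src = path.join(os.tmpdir(), 'sketch-in.bin');  // hypothetical path
const dst = path.join(os.tmpdir(), 'sketch-out.bin'); // hypothetical path
fs.writeFileSync(src, original);

const buf = fs.readFileSync(src);            // no encoding option: raw Buffer
const b64 = buf.toString('base64');          // text-safe form for transfer
const restored = Buffer.from(b64, 'base64'); // back to raw bytes
fs.writeFileSync(dst, restored);

const roundTripOk = fs.readFileSync(dst).equals(original);
console.log(roundTripOk);
```

The base64 string is only the transport form. What gets written back to disk should be the decoded Buffer, not the base64 text.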
After a lot of searching I found this GitHub issue. The problem in my question seems to be the call to doc.end(), which does not wait for the write stream to finish (the stream's finish event). Therefore, as suggested in the GitHub issue, the following approaches work:
callback based:
doc = new PDFDocument();
writeStream = fs.createWriteStream('filename.pdf');
doc.pipe(writeStream);
doc.end()
writeStream.on('finish', function () {
// do stuff with the PDF file
});
or promise based:
const stream = fs.createWriteStream(localFilePath);
doc.pipe(stream);
.....
doc.end();
await new Promise<void>(resolve => {
stream.on("finish", function() {
resolve();
});
});
or even nicer, instead of calling doc.end() directly, call the function savePdfToFile below:
function savePdfToFile(pdf : PDFKit.PDFDocument, fileName : string) : Promise<void> {
return new Promise<void>((resolve, reject) => {
// To determine when the PDF has finished being written successfully
// we need to confirm the following 2 conditions:
//
// 1. The write stream has been closed
// 2. PDFDocument.end() was called synchronously without an error being thrown
let pendingStepCount = 2;
const stepFinished = () => {
if (--pendingStepCount == 0) {
resolve();
}
};
const writeStream = fs.createWriteStream(fileName);
writeStream.on('close', stepFinished);
pdf.pipe(writeStream);
pdf.end();
stepFinished();
});
}
This function should correctly handle the following situations:
PDF generated successfully
Error is thrown inside pdf.end() before write stream is closed
Error is thrown inside pdf.end() after write stream has been closed
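The same idea can be sketched in plain JavaScript without PDFKit: wrap the write stream's 'finish' event in a Promise and only read the file once it resolves. Here `pipeToFile` is a hypothetical helper, and a generic Readable stands in for the PDF document (with PDFKit, the document would be piped the same way and `doc.end()` called afterwards).

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { Readable } = require('stream');

// Hypothetical helper: pipe a readable stream to a file and resolve
// only once the write stream has flushed everything ('finish').
function pipeToFile(readable, fileName) {
  return new Promise((resolve, reject) => {
    const writeStream = fs.createWriteStream(fileName);
    writeStream.on('finish', () => resolve(fileName));
    writeStream.on('error', reject);
    readable.on('error', reject);
    readable.pipe(writeStream);
  });
}

const target = path.join(os.tmpdir(), 'sketch-stream.txt'); // hypothetical path
const source = Readable.from([Buffer.from('hello '), Buffer.from('world')]);

// Reading the file is safe only after the promise has resolved.
const done = pipeToFile(source, target).then((file) =>
  fs.readFileSync(file, 'utf8')
);
```

Calling `fs.readFileSync` immediately after starting the pipe, as in the question's `report()` method, would race with the pending writes and may return a truncated file.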
I want to compress a large JSON object in JavaScript and decompress it in Java. What is the best compression algorithm that supports this?
Continuing on the same topic, I tried to use gzip but am facing an issue with it:
"I have JavaScript on the client side and Java (JBoss RESTEasy) on the server side. I tried your approach, but it is not working. I used the zlib library in JavaScript to compress with gzip and set Content-Encoding to gzip, along with the GZIP annotation on the JBoss server side to decompress it automatically. It did not work. I also tried to use InputStreamReader to decompress it in Java, which threw a "Data not in Gzip Format" error. Can you please help me here and, if possible, paste example code for this?"
Code in Javascript
zlib.gzip(JSON.stringify($scope.jsonCompressCheck),function(err, buffer) {
if (!err) {
console.log("USing gzip: ");
console.log("Byte Length: "+Buffer.byteLength(buffer));
console.log(sizeof(buffer));
$scope.compressed = buffer;
var buf2 = Buffer.from(JSON.stringify($scope.jsonCompressCheck));
$http.post(ATS_URL + ATS_INSTANCE + '/rest/private/decompress/' + clientName + '/gzipdecompress', ($scope.compressed), {
contentType: 'application/json',
contentEncoding: 'gzip'
}).success(function (data, status, headers) {
console.log("Output Response :- "+data+" Headers: "+headers+" status: "+status);
}).error(function (reason) {
console.log(" Error reason "+reason);
});
Java Code here : Jboss RestEasy
Endpoint
@POST @NoCache
@ApiOperation(value = "Decompress Given Compressed Json object Using Gzip",
response = ElasticSearchResults.class, position = 0)
@Path("/{client}/gzipdecompress")
public String gzipJsonDecompress(
@ApiParam(value = "This required field should be the client name as defined in the datasources.", required = true)
@PathParam("client") String client,
@GZIP byte[] compressedObject) throws ATSException {
return decompressService.gzipJsonDecompress(client,compressedObject);
}
Implementation Code
public String gzipJsonDecompress(String client,byte[] compressedObject)throws ATSException{
validateDomain(client);
try
{ InputStream inputStream = new
ByteArrayInputStream(compressedObject);
GZIPInputStream gzipInput = new GZIPInputStream(inputStream); //Not working here
....
The most appropriate compression is probably GZip. You can upload the content compressed with GZip and configure the server to handle the Content-Encoding header so that the body is decompressed automatically on the server side.
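A quick way to check the client side of this is to confirm that the bytes actually sent are still a gzip stream. The sketch below is an assumption about one possible cause of a "not in GZIP format" error, not a confirmed diagnosis: if the compressed Buffer is serialized as JSON text by the HTTP client instead of being sent as raw bytes, the gzip magic bytes (0x1f 0x8b) are no longer at the start of the body. The payload is hypothetical.

```javascript
const zlib = require('zlib');

const payload = { client: 'demo', rows: [1, 2, 3] }; // hypothetical data
const gzipped = zlib.gzipSync(JSON.stringify(payload));

// A gzip stream always begins with the magic bytes 0x1f 0x8b.
const looksLikeGzip = gzipped[0] === 0x1f && gzipped[1] === 0x8b;

// If the Buffer is turned into JSON text on the way out (instead of
// being sent as raw bytes), the body starts with '{' and the receiver
// can no longer recognise it as gzip.
const accidentallySerialized = Buffer.from(JSON.stringify(gzipped));
const stillGzip =
  accidentallySerialized[0] === 0x1f && accidentallySerialized[1] === 0x8b;

console.log(looksLikeGzip, stillGzip);
```

Logging the first two bytes of the request body on the server is a cheap way to tell whether the problem is in compression or in how the body was transmitted.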
**Compress a normal JSON object as an LZW string:**
var lzwString = JSONC.pack( json );
**Decompress using Java:**
String input = BinaryStdIn.readString();
TST<Integer> st = new TST<Integer>();
for (int i = 0; i < R; i++)
st.put("" + (char) i, i);
int code = R+1; // R is codeword for EOF
while (input.length() > 0) {
String s = st.longestPrefixOf(input); // Find max prefix match s.
BinaryStdOut.write(st.get(s), W); // Print s's encoding.
int t = s.length();
if (t < input.length() && code < L) // Add s to symbol table.
st.put(input.substring(0, t + 1), code++);
input = input.substring(t); // Scan past s in input.
}
BinaryStdOut.write(R, W);
BinaryStdOut.close();
I have a PDF file that is generated on my local server by my server-side code. I want to send it to another server in a POST request. The POST method takes FormData as its parameter, with two fields: one is a string and the other is a file.
content-type
form-data
Body
PDF file (file type)
string value
Is it possible to make the POST request without browsing the file location?
After some research I overcame this problem with the following steps. For security reasons, there is no way to get a file object from a physical location automatically on the client side (basically in JS) without the user browsing for it.
On my local server I created a REST service that responds with the base64 string of the desired file.
Then I call the REST API from my JavaScript and receive the base64 string as the response. I then convert it into a byte array, then a Blob object, and then a File object.
base64 string==>bytes array==>Blob object==>File object
var base64 = this.getpdfFromLocal() //get the base64 string
var byteArray= this.base64ToByte(base64 );
var file = this.getFileFromByteArray(byteArray);
//return the byte array from the base64 string
MyApi.prototype.base64ToByte= function(base64) {
var binaryString = window.atob(base64);
var binaryLen = binaryString.length;
var bytes = new Uint8Array(binaryLen);
for (var i = 0; i < binaryLen; i++) {
var ascii = binaryString.charCodeAt(i);
bytes[i] = ascii;
}
return bytes;
};
MyApi.prototype.getFileFromByteArray=function(byteArray) {
var blob = new Blob([byteArray]);
var file = new File([blob], "resource.pdf");
return file;
};
Lastly, I build the form data using the File object and send the request to the other server's REST web service.
var formdata = new FormData();
formdata.append("some_value", "Some String");
formdata.append("file", file);
var url = "http://yoururl.com";
var result =$.ajax({
url : url ,
type : 'POST',
data : formdata,
contentType : false,
cache : false,
processData : false,
scriptCharset : 'utf-8',
async : false
}).responseText;
I'm trying to do some experiments with HTML5, WebSocket and the File API.
I'm using the Tomcat7 WebSocket implementation.
I'm able to send and receive text messages from the servlet. What I want to do now is send JSON objects from the servlet to the client, but I want to avoid text messages in order to skip the JSON.parse (or similar) on the client, so I'm trying to send binary messages.
The servlet part is really simple:
String s = "{arr : [1,2]}";
CharBuffer cbuf = CharBuffer.wrap(s);
CharsetEncoder encoder = Charset.forName("UTF-8").newEncoder();
getWsOutbound().writeBinaryMessage(encoder.encode(cbuf));
getWsOutbound().flush();
After this message, on the client I see that I received a binary frame, that is converted to a Blob object (http://www.w3.org/TR/FileAPI/#dfn-Blob).
The question is: is it possible to get the JSON object from the Blob?
I took a look at the FileReader interface (http://www.w3.org/TR/FileAPI/#FileReader-interface), and I used code like this to inspect what the FileReader can do (the first line creates a brand new Blob, so you can test on the fly if you want):
var b = new Blob([{"test": "toast"}], {type : "application/json"});
var fr = new FileReader();
fr.onload = function(evt) {
var res = evt.target.result;
console.log("onload",arguments, res, typeof res);
};
fr.readAsArrayBuffer(b);
using all the "readAs..." methods that I saw in the FileReader implementation (I'm using Chrome 22). However, I didn't find anything useful.
Do you have any suggestions? Thanks.
You should have tried readAsText() instead of readAsArrayBuffer() (JSON is text in the end).
You also forgot to stringify the object (convert it to JSON text):
var b = new Blob([JSON.stringify({"test": "toast"})], {type : "application/json"}),
fr = new FileReader();
fr.onload = function() {
console.log(JSON.parse(this.result))
};
fr.readAsText(b);
To convert a Blob/File that contains JSON data to a JavaScript object, use this:
JSON.parse(await blob.text());
The example:
Select a JSON file, then you can use it in the browser's console (json object).
const input = document.createElement("input");
input.type = "file";
input.accept = "application/json";
document.body.prepend(input);
input.addEventListener("change", async event => {
const json = JSON.parse(await input.files[0].text());
console.log("json", json);
globalThis.json = json;
});
What you're doing is conceptually wrong. JSON is a string representation of an object, not an object itself. So, when you send a binary representation of JSON over the wire, you're sending a binary representation of the string. There's no way to get around parsing JSON on the client side to convert a JSON string to a JavaScript Object.
You absolutely should always send JSON as text to the client, and you should always call JSON.parse. Nothing else is going to be easy for you.
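To make the point concrete, here is a small sketch of what a binary frame carrying JSON amounts to: the UTF-8 bytes of the JSON text, which still have to be decoded and parsed on the receiving side. The payload is hypothetical, and the encode step stands in for whatever the server does.

```javascript
const original = { arr: [1, 2], label: 'toast' }; // hypothetical payload

// What a binary frame would carry: the UTF-8 bytes of the JSON text.
const bytes = new TextEncoder().encode(JSON.stringify(original));

// Receiving side: bytes -> text -> object. The parse step remains.
const text = new TextDecoder('utf-8').decode(bytes);
const parsed = JSON.parse(text);

console.log(parsed.arr);
```

Sending the data as binary therefore adds a decoding step rather than removing the parsing step.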
let reader = new FileReader()
reader.onload = e => {
if (e.target.readyState === 2) {
let res = {}
if (window.TextDecoder) {
const enc = new TextDecoder('utf-8')
res = JSON.parse(enc.decode(new Uint8Array(e.target.result))) // convert to a JSON object
} else {
res = JSON.parse(String.fromCharCode.apply(null, new Uint8Array(e.target.result)))
}
console.info('import-back:: ', res)
}
}
reader.readAsArrayBuffer(response)
I am using WebSockets for file transfer. When I download a file I receive the data as it is, but when I open a downloaded image file it is corrupted. Data files download fine. The code is as follows.
try {
fileEntry = fs.root.getFile(filename, { create : creat_file });
var byteArray = new Uint8Array(data.data.length);
for (var i = 0; i < data.data.length; i++) {
byteArray[i] = data.data.charCodeAt(i) & 0xff;
}
BlobBuilderObj = new WebKitBlobBuilder();
BlobBuilderObj.append(byteArray.buffer);
if (!writer) {
writer = fileEntry.createWriter();
pos = 0;
}
//self.postMessage(writer.position);
writer.seek(pos);
writer.write(BlobBuilderObj.getBlob());
pos += 4096;
}
catch (e) {
errorHandler(e);
}
It looks like you are reading data from a WebSocket as a string, converting it to a Blob, and then writing this to a file.
If you have control of the WebSocket server then the best thing would be to send the data as binary frames instead of UTF-8 text data. If you can get the server to send the data as binary frames then you can just tell the WebSocket to deliver the data as Blobs:
ws.binaryType = "blob";
ws.onmessage = function (event) {
if (event.data instanceof Blob) {
// event.data is a Blob
} else {
// event.data is a string
}
}
If that is not an option and you can only send text frames from the server, then you will need to encode the binary data to text before sending it from the server and then decode the text on the other end. If you try to send binary data directly as text frames over WebSockets, then doing charCodeAt(x) & 0xff will result in corrupt data.
For example you could base64 encode the data at the server and then base64 decode the data in the client:
ws.onmessage = function (event) {
raw = window.atob(event.data);
}
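The difference can be demonstrated outside the browser with Node.js Buffers. This is only a sketch with made-up sample bytes: pushing binary data through a UTF-8 text round trip alters it, while a base64 round trip returns the same bytes.

```javascript
// Sample binary data that is not valid UTF-8 as a whole.
const original = Buffer.from([0xff, 0xd8, 0x00, 0x80]);

// Treating the bytes as UTF-8 text replaces invalid sequences with U+FFFD,
// so the bytes that come back are different from the ones that went in.
const viaText = Buffer.from(original.toString('utf8'), 'utf8');

// Base64 maps every byte to ASCII, so the round trip is lossless.
const encoded = original.toString('base64');      // what the server would send
const viaBase64 = Buffer.from(encoded, 'base64'); // what the client recovers

console.log(viaText.equals(original), viaBase64.equals(original));
```

Plain text files are usually unaffected because their bytes are already valid text, which would be consistent with data files downloading fine while images are corrupted.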
Update:
There is a very well performing pure JavaScript base64 decode/encode contained in websockify. It decodes to an array of numbers from 0-255 but could easily be modified to return a string instead if that is what you require (Disclaimer: I made websockify).