AJAX response gives a corrupted compressed (.tgz) file

We are implementing a client-side web application that communicates with the server exclusively via XMLHttpRequests (an AJAX engine).
The XHR responses are usually plain text with some XML in it, but in this case the server is sending compressed data as a .tgz file. We know for sure that the data the server sends is correct, because if we use an HTTP command-line client such as curl, the file sent as the response is valid and contains the expected data.
However, when making an AJAX call and "blobbing" the response into a downloadable file, the file we obtain is larger than the correct one and is not recognized by the decompressor. It gives the following error:
gzip: stdin: not in gzip format
/bin/gtar: Child returned status 1
/bin/gtar: Error is not recoverable: exiting now
The code I'm using is the following:
$.ajax(/* request parameters omitted */).done(function (data) {
    window.URL = window.webkitURL || window.URL;
    var contentType = 'application/x-compressed-tar';
    var file = new Blob([data], {type: contentType});
    var a = document.createElement('a'),
        ev = document.createEvent("MouseEvents");
    a.download = "browser_download2.tgz";
    a.href = window.URL.createObjectURL(file);
    ev.initMouseEvent("click", true, false, self, 0, 0, 0, 0, 0,
        false, false, false, false, 0, null);
    a.dispatchEvent(ev);
});
I omitted the parameters used to make the AJAX call, but let's assume that is not the problem, as I do receive a response. I used this contentType because it is the same one reported by curl, but I tried different ones too. The code may look a little weird, so I'll break it down: I'm basically creating a link and attaching the download URL and the file name to it (it's a dirty way of being able to name the file). Finally, I'm virtually clicking the link.
I compared the correct .tgz file and the one obtained via the browser with a hex viewer, and I observed a repeated pattern in the corrupted one (EF, BF and BD, all along the file) that is not present in the correct one.
So I can think of some possible causes:
(a) The browser is adding extra characters, or maybe the response header is still in the downloaded file.
(b) The file has been partially decompressed, because when I inspect the request header I can see "Accept-Encoding: gzip, deflate"; although I don't know whether the browser (Firefox in my case) automatically decompresses data.
(c) The code that I'm using to blob the data is not correct, although it did the job with a plain-text file on another occasion.
Edit
I also provide you the links to the hex inspection:
(a) Corrupted file: http://en.webhex.net/view/278aac05820c34dfbdd2217c03970dd9/0
(b) (Presumably) correct file: http://en.webhex.net/view/4a01894b814c17d2ec71ba49ac48e683

I don't know if this thread will be helpful to anybody, but just in case, I figured out the cause of my problem and a possible solution.
The cause
JavaScript strings hold Unicode text, not raw bytes, so they are not suited to storing binary data. When the response is decoded as text, every byte sequence that is not valid UTF-8 is replaced by the Unicode replacement character U+FFFD, whose UTF-8 encoding is the three bytes EF BF BD. This explains both the repeated EF, BF, BD pattern seen in the hex viewer and why the corrupted file is larger than the original.
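The corruption can be reproduced without any network request. The following is a minimal sketch, with made-up sample bytes, that decodes binary data as UTF-8 text and writes it back out, which is roughly what happens when a text response is passed to new Blob([string]):

```javascript
// Illustrative bytes: the gzip magic number 0x1f 0x8b followed by arbitrary data.
const original = new Uint8Array([0x1f, 0x8b, 0x08, 0x00, 0xff, 0x41]);

// Reading binary data as text: invalid UTF-8 sequences become U+FFFD.
const asText = new TextDecoder('utf-8').decode(original);

// Writing that text back out encodes it as UTF-8 again.
const roundTripped = new TextEncoder().encode(asText);

function toHex(bytes) {
    return Array.from(bytes, function (b) {
        return b.toString(16).padStart(2, '0');
    }).join(' ');
}

console.log(toHex(original));     // 1f 8b 08 00 ff 41
console.log(toHex(roundTripped)); // 1f ef bf bd 08 00 ef bf bd 41
```

Each byte that is not valid UTF-8 on its own (0x8b and 0xff here) comes back as EF BF BD, so the output grows and no longer starts with the gzip signature.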
The solution
Recent browser versions implement so-called typed arrays: JavaScript arrays that can hold raw binary data in several numeric formats. If one specifies that the XMLHttpRequest response is binary, the data is stored correctly and, when blobbed into a file, the file is not corrupted. Here is the code I used:
var xhr = new XMLHttpRequest();
xhr.open('POST', url, true);
xhr.responseType = 'arraybuffer';
Notice that the key point is to set responseType to "arraybuffer". It may also be worth noting that I decided not to use jQuery for the AJAX call anymore. It implements this feature poorly, and all my attempts to make it work with jQuery were in vain (overrideMimeType, described elsewhere, didn't work in my case). Plain old XMLHttpRequest, on the other hand, worked nicely.
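For completeness, here is a sketch of how the three lines above might be combined with the download-link trick from the question. The function name downloadTgz and its url and filename parameters are placeholders, not part of the original code, and the request body (if the server expects one) is left out:

```javascript
// Sketch only: `url` and `filename` are placeholders supplied by the caller.
function downloadTgz(url, filename) {
    var xhr = new XMLHttpRequest();
    xhr.open('POST', url, true);
    xhr.responseType = 'arraybuffer'; // set before calling send()
    xhr.onload = function () {
        if (xhr.status !== 200) {
            console.error('Download failed with status ' + xhr.status);
            return;
        }
        // xhr.response is an ArrayBuffer holding the bytes exactly as received.
        var file = new Blob([xhr.response], {type: 'application/x-compressed-tar'});
        var a = document.createElement('a');
        a.download = filename;
        a.href = URL.createObjectURL(file);
        // Attach the link temporarily so that click() works in all browsers.
        document.body.appendChild(a);
        a.click();
        document.body.removeChild(a);
    };
    xhr.send();
}
```

Calling downloadTgz(url, 'browser_download2.tgz') from a page would then offer the file for saving. Because the Blob is built from an ArrayBuffer rather than a string, no text decoding takes place.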

Related

How to process the response of an XMLHttpRequest where the responseText is a compressed (gzip) format

I very much appreciate anyone looking at this question and any helpful responses.
I am exploring the possibility of trying to load a text file from an external site to my site. The text file has been compressed or deflated with gzip, so the path looks like https://host/filename.txt.gz
I am trying to load the contents with an XMLHttpRequest, and then I am trying to decompress/inflate the contents using the https://github.com/augustl/js-inflate library. The response content-type is Application/Octet-stream.
So, my problem is that however the responseText is decoded, a lot of the characters produced are the "replacement character" (code 65533, or �). It is my understanding that this is produced when the decoder can't process the byte sequence.
The text files I am trying to decode/decompress are certainly valid, because if I download them they can be decompressed and viewed just fine.
var request = new XMLHttpRequest();
request.open('GET', 'https://host.something/filename.txt.gz');
request.onload = function() {
    // the response text is all there, looking like ��`�P GSE43615_non_normalized.txt t�I�,A�$���}���yXFf����D��...
    var inflated = JSInflate.inflate(request.responseText); // (note: tried to base64-decode the response first in case that's it; it doesn't seem to be)
    // the 'inflated' result comes back as an empty string.
    // As I debug the JSInflate library, it appears that the library is looking for bytes that signal how the text should be processed.
    // The code breaks out of the processing in the first conditional because the byte is not recognized
    console.log(inflated || 'failed'); // it's 'failed'
};
request.send();
I hope I explained this so it makes sense. So, my questions are:
Is what I am trying to do possible? (emphasis on possible, as opposed to reasonable)
If so, the vague question is, how can I read the response so it can be processed and decompressed? More specifically, how can text be 'read' in from an XMLHttpRequest in a way that an inflating algorithm can work with it?
Thanks a lot for any help!!!

Make sure you are using the correct responseType. "text" is the default; perhaps it should be "blob" or "arraybuffer"?
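A rough sketch of what that suggestion could look like with "arraybuffer", together with a quick sanity check that the bytes arrived intact. The helper names fetchBytes and looksLikeGzip are made up for this example. Whether a given inflate library accepts a Uint8Array, and whether it handles the gzip header or only raw deflate data, depends on the library and is not shown here:

```javascript
// gzip streams begin with the two magic bytes 0x1f 0x8b.
function looksLikeGzip(buffer) {
    var bytes = new Uint8Array(buffer);
    return bytes.length >= 2 && bytes[0] === 0x1f && bytes[1] === 0x8b;
}

// Hypothetical helper: fetches `url` as raw bytes and hands them to `onBytes`.
function fetchBytes(url, onBytes) {
    var request = new XMLHttpRequest();
    request.open('GET', url);
    request.responseType = 'arraybuffer'; // no text decoding is applied
    request.onload = function () {
        if (!looksLikeGzip(request.response)) {
            console.warn('Response does not start with the gzip signature');
        }
        onBytes(new Uint8Array(request.response));
    };
    request.send();
}
```

If looksLikeGzip returns false for a file that decompresses fine after a normal download, the bytes were most likely altered somewhere between the network and the script.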

Wrong encoding on JavaScript Blob when fetching file from server

Using a FileStreamResult from C# in a SPA website (.NET Core 2, SPA React template), I request a file from my endpoint, which triggers this response in C#:
var file = await _docService.GetFileAsync(token.UserName, instCode.Trim().ToUpper(), fileSeqNo);
string contentType = MimeUtility.GetMimeMapping(file.FileName);
var result = new FileStreamResult(file.File, contentType);
var contentDisposition = new ContentDispositionHeaderValue("attachment");
Response.Headers[HeaderNames.ContentDisposition] = contentDisposition.ToString();
return result;
The returned response is handled using msSaveBlob (specifically for MS browsers, but the problem also occurs when I use createObjectURL and a different browser; yes, I have tried multiple solutions to this, but none of them seems to work). This is the code I use to send the request and receive the PDF FileStreamResult from the server.
if (window.navigator.msSaveBlob) {
    axios.get(url).then(response => {
        window.navigator.msSaveOrOpenBlob(
            new Blob([response.data], {type: "application/pdf"}),
            filename);
    });
}
The problem is that the returned PDF file that I get has a wrong encoding on it somehow. So the PDF will not open.
I have tried adding encoding to the end of type: {type: "application/pdf; encoding=UTF-8"} which was suggested in different posts, however, it makes no difference.
Comparing a PDF file that I have fetched in a different way, I can clearly see that the encoding is wrong. Most of the special characters are not correct. Indicated by the response header, the PDF file should be in UTF-8, but I have no idea how to actually find out and check.

I don't know axios well, but from its readme it seems to use JSON as the default responseType. This may alter the content, since it is then treated as text (axios will probably bail out when it cannot convert the data to an actual JSON object and keep the string/text source as the response data).
A PDF should be loaded as binary data. Although it can be either 8-bit binary content or 7-bit ASCII, both should in any case be treated as a byte stream. From the Adobe PDF reference, sec. 2.2.1:
PDF files are represented as sequences of 8-bit binary bytes.
A PDF file is designed to be portable across all platforms and
operating systems. The binary representation is intended to be
generated, transported, and consumed directly, without translation
between native character sets, end-of-line representations, or other
conventions used on various platforms. [...].
Any PDF file can also be represented in a form that uses only 7-bit
ASCII [...] character codes. This is useful for the purpose of
exposition, as in this book. However, this representation is not
recommended for actual use, since it is less efficient than the normal
binary representation. Regardless of which representation is
used, PDF files must be transported and stored as binary files,
not as text files. [...]
So, to avoid the conversion, I would suggest specifying the responseType configuration entry when making the request:
axios.get(url, {responseType: "arraybuffer"}) ...
or in this form:
axios({
    method: 'get',
    url: url,
    responseType: 'arraybuffer'
})
.then( ... )
You can also go directly to response-type blob if you are sure the mime-type is preserved in the process.
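One way to check whether the bytes survived the transfer is to look at the start of the file, since a PDF normally begins with the ASCII signature %PDF-. The sketch below assumes axios is loaded on the page; hasPdfHeader and fetchPdfBlob are illustrative names, not part of axios:

```javascript
// A PDF file normally starts with the five ASCII bytes "%PDF-".
function hasPdfHeader(buffer) {
    var head = new Uint8Array(buffer).subarray(0, 5);
    return String.fromCharCode.apply(null, head) === '%PDF-';
}

// Sketch: `axios` is assumed to be available as a global; `url` is a placeholder.
function fetchPdfBlob(url) {
    return axios.get(url, {responseType: 'arraybuffer'}).then(function (response) {
        if (!hasPdfHeader(response.data)) {
            throw new Error('Response does not look like a PDF');
        }
        return new Blob([response.data], {type: 'application/pdf'});
    });
}
```

The resulting Blob can then be handed to msSaveOrOpenBlob or createObjectURL as in the question.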

Fetching zipped text file and unzipping in client browsers, feasible in Javascript?

I am developing a web page containing Javascript. This js uses static string data (about 1-2 MB) which is stored in a flat file. I could compress it with gzip or any other algorithm to reduce the transfer load.
Would it be possible to fetch this binary file with Ajax and decompress it into a string (which I could split later) in the client browser? If yes, how can I achieve this? Does anyone have a code example?

Another library is this one. Although it has few examples, it has some thorough test cases that can be consulted.
https://github.com/imaya/zlib.js
Here are some of the complex test cases
https://github.com/imaya/zlib.js/blob/master/test/browser-test.js
https://github.com/imaya/zlib.js/blob/master/test/browser-plain-test.js
The code example seems very compact. Just these two lines of code...
// compressed = Array.<number> or Uint8Array
var gunzip = new Zlib.Gunzip(compressed);
var plain = gunzip.decompress();
If you look at https://github.com/imaya/zlib.js/blob/master/bin/gunzip.min.js you will see the packed js file that you need to include. You might need to include one or two of the others in https://github.com/imaya/zlib.js/blob/master/bin.
In any event, get those files into your page, then feed the Gunzip object your pre-gzipped data from the server, and the output will be as expected.
You will need to download the data and get it into an array yourself using other functions. I do not think the library includes that support.
So try this download example from https://developer.mozilla.org/en-US/docs/DOM/XMLHttpRequest/Sending_and_Receiving_Binary_Data
function load_binary_resource(url) {
    var req = new XMLHttpRequest();
    req.open('GET', url, false);
    req.overrideMimeType('text/plain; charset=x-user-defined');
    req.send(null);
    if (req.status != 200) return '';
    return req.responseText;
}
// Each byte is now encoded in a 2 byte string character. Just AND it with 0xFF to get the actual byte and then feed that to GUnzip...
var filestream = load_binary_resource(url);
var abyte = filestream.charCodeAt(x) & 0xff; // throw away high-order byte (f7)
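To turn the whole string into something Gunzip can consume, the masking step can be applied to every character. A small sketch follows; binaryStringToBytes is a name chosen for this example:

```javascript
// Converts a string fetched with charset=x-user-defined into raw bytes by
// keeping only the low-order byte of each character code.
function binaryStringToBytes(text) {
    var bytes = new Uint8Array(text.length);
    for (var i = 0; i < text.length; i++) {
        bytes[i] = text.charCodeAt(i) & 0xff;
    }
    return bytes;
}

// Intended use with the snippets above (requires zlib.js to be loaded):
// var plain = new Zlib.Gunzip(binaryStringToBytes(filestream)).decompress();
```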
=====================================
Also there is Node.js
Question is similar to
Simplest way to download and unzip files in Node.js cross-platform?
There is this example code at nodejs documentation. I do not know how much more specific it gets than that...
http://nodejs.org/api/zlib.html
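On the Node.js side, a minimal round trip with the built-in zlib module might look like the following. The sample string is made up, and in a real setup the compressed buffer would come from a file or an HTTP response rather than being produced in the same script:

```javascript
const zlib = require('zlib');

// Compress a sample string, then restore it again.
const compressed = zlib.gzipSync(Buffer.from('static string data', 'utf8'));
const restored = zlib.gunzipSync(compressed).toString('utf8');

console.log(restored); // static string data
```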

Just enable gzip compression in Apache and everything will be done automatically.
You will probably have to store the string in a .js file as JSON and enable gzip for the js mime type.

I remember that I used js-deflate for an offline JS app with large databases (needed due to limitations of local storage) and it worked perfectly. It depends on js-base64.

How can binary files be requested from GreaseMonkey userscripts?

Backstory
I wrote a specialized image inliner script that is intended to be used with both GreaseMonkey and Google Chrome. It is supposed to download PNG files and store them in data: urls in image src attributes. This may sound ridiculous, but a certain website sets Content-Disposition to attachment for images, and I don't want the «Save As» dialog to pop up every time.
The actual question
The script fetches data with an XMLHttpRequest, encodes it into base64 and stores it in a proper place. So far, good. But it only works when I run it through both Firebug and Chrome dev consoles, and does not when I use it as a proper userscript. As far as I understand, this is because Greasemonkey scripts cannot use XMLHttpRequest objects directly and should rely on calls to GM_xmlhttpRequest instead. However, I cannot set responseType to "blob" or "arraybuffer" that way, and the binary parameter seems to only work for sending data through POST requests. I only get Unicode strings.
Just in case, the images are served from the same domain as the page that links to them. I believe it satisfies the «same origin» thingy.
The GM_xmlhttpRequest docs are at http://wiki.greasespot.net/GM_xmlhttpRequest
Is there a way to fetch an arraybuffer from within a Greasemonkey userscript?

If it is same-domain, then you can use XMLHttpRequest, with no problems. The only reason to use GM_xmlhttpRequest (which currently has a crippled subset of functionality) is if the images/files are cross domain.
For same-domain, you can use XHR2 as shown in this answer.
For cross-domain, you must: use GM_xmlhttpRequest, override the mime-type, and use a custom encoder algorithm. Again, this is all shown in that same answer.
However, it sounds like you are just trying to make it easier to download images? If that is so, then you might be better off just using the excellent DownThemAll extension.

overrideMimeType String (Compatibility: 0.6.8+) Optional. A MIME type
to specify with the request (E.G. "text/html; charset=ISO-8859-1").
You can set this to text/plain; charset=x-user-defined (the type doesn't matter, but the charset does), then bitwise AND each character of the response string with 0xFF, add the values to a typed array, and get the buffer:
var text = xhr.responseText,
    len = text.length,
    arr = new Uint8Array(len),
    i = 0;
for (i = 0; i < len; ++i) {
    arr[i] = text.charCodeAt(i) & 0xFF;
}
arr.buffer // the ArrayBuffer
Note: this is for raw binary responses, not base64.
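Since the goal in the question is a data: URL rather than an ArrayBuffer, the same masking step can feed btoa directly. This is only a sketch: toDataUrl and inlineImage are names invented here, the image type is assumed to be PNG, and the GM_xmlhttpRequest options used are the ones described in the docs linked in the question:

```javascript
// Converts a response string fetched with charset=x-user-defined into a data: URL.
function toDataUrl(binaryText, mimeType) {
    var latin1 = '';
    for (var i = 0; i < binaryText.length; i++) {
        // Keep only the low byte so that btoa sees characters in the 0-255 range.
        latin1 += String.fromCharCode(binaryText.charCodeAt(i) & 0xff);
    }
    return 'data:' + mimeType + ';base64,' + btoa(latin1);
}

// Hypothetical usage inside a userscript; `img` is an <img> element on the page.
function inlineImage(img) {
    GM_xmlhttpRequest({
        method: 'GET',
        url: img.src,
        overrideMimeType: 'text/plain; charset=x-user-defined',
        onload: function (response) {
            img.src = toDataUrl(response.responseText, 'image/png');
        }
    });
}
```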

Upload a binary file using pure JavaScript

I'm working on a Chrome app that uses the HTML5 Filesystem API, and allows users to import and sync files. One issue I'm having is that if the user tries to sync image files, the files get corrupted during the upload process to the server. I'm assuming it's because they're binary.
For uploading, I opted just to make an Ajax POST request (using MooTools) and then put the file contents as the body of the request. I told MooTools to turn off urlEncoding and set the charset to "x-user-defined" (not sure if that's necessary, I just saw it on some websites).
Given that Chrome doesn't have support for xhr.sendAsBinary, does anyone have any sample code that would allow me to send binary files via Ajax?

FF's xhr.sendAsBinary() is not standard. XHR2 supports sending files (xhr.send(file)) and blobs (xhr.send(blob)):
function upload(blobOrFile) {
    var xhr = new XMLHttpRequest();
    xhr.open('POST', '/server', true);
    xhr.onload = function(e) { ... };
    // Listen to the upload progress.
    xhr.upload.onprogress = function(e) { ... };
    xhr.send(blobOrFile);
}
You can also send an ArrayBuffer.

If you're writing the server, then you can just transform the bytes that you read into pure text, send it to the server, and decode it back there.
Here's the simplest way (not very efficient, but it shows the technique):
Translate each byte you read from the file into a string of two hexadecimal characters. If you read the byte 53 (in decimal), translate it into "35" (the hexadecimal representation of 53). Concatenate all these strings together and send the resulting string to the server.
On the server side, break the string at even positions and translate each pair of digits back into a byte.
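A small sketch of both halves of that technique. The decoding half would run on the server in whatever language it is written in; it is shown in JavaScript here only for illustration, and the function names are made up:

```javascript
// Client side: two hexadecimal characters per byte, e.g. 53 becomes "35".
function bytesToHex(bytes) {
    var hex = '';
    for (var i = 0; i < bytes.length; i++) {
        hex += (bytes[i] < 16 ? '0' : '') + bytes[i].toString(16);
    }
    return hex;
}

// Server side: split at even positions and parse each pair back into a byte.
function hexToBytes(hex) {
    var bytes = new Uint8Array(hex.length / 2);
    for (var i = 0; i < bytes.length; i++) {
        bytes[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
    }
    return bytes;
}
```

The encoded string is twice the size of the original data, which is the inefficiency mentioned above.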
