How can binary files be requested from GreaseMonkey userscripts?

Backstory
I wrote a specialized image inliner script that is intended to be used with both GreaseMonkey and Google Chrome. It is supposed to download PNG files and store them as data: URLs in image src attributes. This may sound ridiculous, but a certain website sets Content-Disposition to attachment for images, and I don't want the «Save As» dialog to pop up every time.
The actual question
The script fetches data with an XMLHttpRequest, encodes it into base64 and stores it in the proper place. So far, so good. But it only works when I run it through the Firebug and Chrome dev consoles, and not when I use it as a proper userscript. As far as I understand, this is because Greasemonkey scripts cannot use XMLHttpRequest objects directly and should rely on calls to GM_xmlhttpRequest instead. However, I cannot set responseType to "blob" or "arraybuffer" that way, and the binary parameter seems to only work for sending data through POST requests. I only get Unicode strings.
Just in case, the images are served from the same domain as the page that links to them. I believe it satisfies the «same origin» thingy.
The GM_xmlhttpRequest docs are here: http://wiki.greasespot.net/GM_xmlhttpRequest
Is there a way to fetch an arraybuffer from within a Greasemonkey userscript?

If it is same-domain, then you can use XMLHttpRequest, with no problems. The only reason to use GM_xmlhttpRequest (which currently has a crippled subset of functionality) is if the images/files are cross domain.
For same-domain, you can use XHR2 as shown in this answer.
For cross-domain, you must: use GM_xmlhttpRequest, override the mime-type, and use a custom encoder algorithm. Again, this is all shown in that same answer.
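For illustration, a rough sketch of that cross-domain approach (the URL is a placeholder, and the byte-extraction loop is the same custom encoder shown in the next answer):
GM_xmlhttpRequest({
    method: 'GET',
    url: 'http://example.com/image.png', // placeholder
    overrideMimeType: 'text/plain; charset=x-user-defined', // keeps the raw bytes intact
    onload: function (response) {
        var text = response.responseText,
            bytes = new Uint8Array(text.length);
        for (var i = 0; i < text.length; ++i) {
            bytes[i] = text.charCodeAt(i) & 0xFF; // strip the mangled high byte
        }
        // bytes.buffer is the ArrayBuffer; base64-encode it to build a data: URL
    }
});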
However, it sounds like you are just trying to make it easier to download images? If that is so, then you might be better off just using the excellent DownThemAll extension.

overrideMimeType String (Compatibility: 0.6.8+) Optional. A MIME type
to specify with the request (e.g. "text/html; charset=ISO-8859-1").
You can set this to text/plain; charset=x-user-defined (the type doesn't matter, but the charset does), bitwise-AND your way through the response string, add the values to a typed array, and take its buffer:
var text = xhr.responseText,
    len = text.length,
    arr = new Uint8Array(len),
    i = 0;

for (i = 0; i < len; ++i) {
    arr[i] = text.charCodeAt(i) & 0xFF; // keep only the low byte of each code unit
}

arr.buffer // the ArrayBuffer
Note: this is for raw binary responses, not base64.
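If you do need base64 (e.g. for a data: URL), a small sketch building on the arr above (fine for modestly sized files):
var binary = '';
for (var i = 0; i < arr.length; ++i) {
    binary += String.fromCharCode(arr[i]); // each value is already <= 0xFF
}
var base64 = btoa(binary); // usable as 'data:image/png;base64,' + base64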

Related

How can I make my browser show files (images) that a server is sending me with a download prompt?

The only solution I've found is to grab the link with getElementsByClassName and then inject it into an HTML snippet on the page, but it looks so fake, and is also unnecessary (I don't want all the links).
I want to right-click a link (one at a time) and open it in the next tab. If I right-click the link, the server sends me a download prompt. How can I avoid this?
I think the browser decides to download a file or display it based on its MIME type.
If the server is under your control, you should make sure you supply the correct Content-Type HTTP header (e.g. call header('Content-Type: image/png') in PHP; other languages have similar mechanisms).
Otherwise, for a purely client-side solution in JavaScript, you can fetch the file with an XMLHttpRequest (most JavaScript toolkits have wrappers around it). Then, you can convert it to base64, prefix the result with data:image/png;base64, (the trailing comma is part of the URL scheme), and use it as the src attribute of an img element (thanks https://stackoverflow.com/a/21508186/324969).
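A minimal sketch of that approach, assuming XHR2 support and a same-origin image (the URL and the img element's id are placeholders):
var xhr = new XMLHttpRequest();
xhr.open('GET', 'http://example.com/image.png', true); // placeholder URL
xhr.responseType = 'arraybuffer';
xhr.onload = function () {
    var bytes = new Uint8Array(xhr.response),
        binary = '';
    for (var i = 0; i < bytes.length; ++i) {
        binary += String.fromCharCode(bytes[i]);
    }
    document.getElementById('preview').src = 'data:image/png;base64,' + btoa(binary); // hypothetical img id
};
xhr.send();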
Note that from a security standpoint, grabbing arbitrary files and stuffing them into a data: URL might not be safe. I don't know whether any cross-site scripting or CORS attacks could be built upon this. You'll have to ask a separate question to know if the client-side solution is unsafe. For the server side, be careful not to set the wrong content-type for user-uploaded data, or for endpoints of your service (e.g. letting the client send you the Content-Type it would like in the request, as tempting as it looks, is a big no-no).
To open the image in a new tab, you can use window.open as usual, but download the image beforehand (using XMLHttpRequest) and put the data:image/png;base64,… as the URL of the new tab.
Since you can already see the images by placing their URL in an img tag, you can paint that img onto a canvas, extract a PNG from the canvas, craft a data:image/png;base64,… URL from that, and then either automatically open many tabs with these URLs, or write a series of links to data: URLs into your page.
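A sketch of that canvas route (the image must be fully loaded and same-origin, otherwise the canvas is tainted and toDataURL throws; note also that newer browsers may block opening data: URLs in a top-level tab):
var img = document.querySelector('img'); // whichever image you already display
var canvas = document.createElement('canvas');
canvas.width = img.naturalWidth;
canvas.height = img.naturalHeight;
canvas.getContext('2d').drawImage(img, 0, 0);
window.open(canvas.toDataURL('image/png')); // the data:image/png;base64,… URL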
You could also have a link to a tiny web page containing just the img tag that you currently use.

AJAX response gives a corrupted compressed (.tgz) file

We are implementing a client-side web application that communicates with the server exclusively via XMLHttpRequests (an AJAX engine).
The XHR responses usually are plain text with some XML in it, but in this case the server is sending compressed data as a .tgz file. We know for sure that the data the server sends is correct, because if we use an HTTP command-line client such as curl, the file sent as the response is valid and contains the expected data.
However, when making an AJAX call and "blobbing" the response into a downloadable file, the file we obtain is larger than the correct one and is not recognized by the decompressor. It gives the following error:
gzip: stdin: not in gzip format
/bin/gtar: Child returned status 1
/bin/gtar: Error is not recoverable: exiting now
The code I'm using is the following:
$.ajax(/* parameters omitted */).done(function (data) {
    window.URL = window.webkitURL || window.URL;
    var contentType = 'application/x-compressed-tar';
    var file = new Blob([data], { type: contentType });
    var a = document.createElement('a'),
        ev = document.createEvent("MouseEvents");
    a.download = "browser_download2.tgz";
    a.href = window.URL.createObjectURL(file);
    ev.initMouseEvent("click", true, false, self, 0, 0, 0, 0, 0,
                      false, false, false, false, 0, null);
    a.dispatchEvent(ev);
});
I omitted the parameters used to make the AJAX call, but let's assume that this is not the problem, as I correctly receive a response. I used this contentType because it is the same one reported when using curl, but I tried different ones. The code may look a little weird, so I'll break it down for you: I'm basically creating a link and attaching the download URL and the file name to it (it's a dirty way to be able to name the file). Finally, I'm virtually clicking the link.
I compared the correct tgz file and the one obtained via browser with a hex viewer and I observed the repetition of patterns in the corrupted one (EF, BF and BD, all along the file) that is not present in the correct one.
Therefore I can think of some possible causes:
(a) The browser is adding extra characters, or maybe the response header is still in the downloaded file.
(b) The file has been partially decompressed, because when I inspect the request header I can see "Accept-Encoding: gzip, deflate", although I don't know if the browser (Firefox in my case) automatically decompresses data.
(c) The code that I'm using to blob the data is not correct, although it accomplished the aim well with a plain-text file on another occasion.
Edit
I also provide you the links to the hex inspection:
(a) Corrupted file: http://en.webhex.net/view/278aac05820c34dfbdd2217c03970dd9/0
(b) (Presumably) correct file: http://en.webhex.net/view/4a01894b814c17d2ec71ba49ac48e683
I don't know if this thread will be helpful for somebody, but just in case, I'm posting the cause I figured out and a possible solution for my problem.
The cause
By default, JavaScript strings store text as Unicode code units; they are not meant to hold raw binary data, which is why wrongly interpreted characters appear. This also explains the repeated EF BF BD sequences observed in the hex viewer: they are the UTF-8 encoding of U+FFFD, the replacement character that gets substituted for every byte sequence that isn't valid text.
The solution
Recent browser versions implement so-called typed arrays. They are JavaScript arrays that can store data in different formats (including binary). If one specifies that the XMLHttpRequest response is in binary format, the data will be stored correctly and, when blobbed into a file, the file will not be corrupted. Check out the code I used:
var xhr = new XMLHttpRequest();
xhr.open('POST', url, true);
xhr.responseType = 'arraybuffer'; // store the response as raw binary, not text
Notice that the key point is to set the responseType to "arraybuffer". It may also be worth noting that I decided not to use jQuery for the AJAX call anymore. It implements this feature poorly, and all my attempts to work around that were in vain (the overrideMimeType trick described elsewhere didn't work in my case). Instead, plain old XMLHttpRequest worked nicely.
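Putting the pieces together, a sketch of the complete plain-XHR version of the download (reusing the URL, content type, and file name from the original snippet):
var xhr = new XMLHttpRequest();
xhr.open('POST', url, true);
xhr.responseType = 'arraybuffer';
xhr.onload = function () {
    var file = new Blob([xhr.response], { type: 'application/x-compressed-tar' });
    var a = document.createElement('a');
    a.download = 'browser_download2.tgz';
    a.href = (window.URL || window.webkitURL).createObjectURL(file);
    a.dispatchEvent(new MouseEvent('click')); // or initMouseEvent, as in the original snippet
};
xhr.send();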

Convert an image to base64 without using HTML5 Canvas

First, a little background:
I apologize ahead of time for the long-winded nature of this preface; however, it may assist in providing an alternate solution that is not specific to the nature of the question.
I have an ASP.NET MVC application that uses embedded WinForm UserControls. These controls provide "ink-over" support to Tablet PCs through the Microsoft.Ink library. They are an unfortunate necessity due to an IE8 corporate standard; otherwise, HTML5 Canvas would be the solution.
Anyway, an image URL is passed to the InkPicture control through a <PARAM>.
<object VIEWASTEXT="true" classid="MyInkControl.dll#MyInkControl.MyInkControl"
    id="myImage" name="myImage" runat="server">
    <PARAM name="ImageUrl" value="http://some-website/Content/images/myImage.png" />
</object>
The respective property in the UserControl takes that URL, calls a method that performs an HttpWebRequest, and the returned image is placed in the InkPicture.
public Image DownloadImage(string url)
{
    Image _tmpImage = null;
    try
    {
        // Open a connection
        HttpWebRequest _HttpWebRequest = (HttpWebRequest)HttpWebRequest.Create(url);
        _HttpWebRequest.AllowWriteStreamBuffering = true;

        // Use the default credentials
        _HttpWebRequest.Credentials = CredentialCache.DefaultCredentials;

        // Request the response
        System.Net.WebResponse _WebResponse = _HttpWebRequest.GetResponse();

        // Open the data stream
        System.IO.Stream _WebStream = _WebResponse.GetResponseStream();

        // Convert the web stream to an image
        _tmpImage = Image.FromStream(_WebStream);

        // Cleanup
        _WebResponse.Close();
    }
    catch (Exception)
    {
        // Rethrow without resetting the stack trace ("throw ex" would reset it)
        throw;
    }
    return _tmpImage;
}
Problem
This works, but there's a lot of overhead in this process that significantly delays my page load (15 images taking 15 seconds... not ideal). Doing Image img = new Bitmap(url); in the UserControl does not work in this situation because of FileIO permission issues (full trust or not, I have been unsuccessful in eliminating that issue).
Initial Solution
Even though using canvas is not a current option, I decided to test a solution using it. I would load each image in JavaScript, then use canvas and toDataURL() to get the base64 data. Then, instead of passing the URL to the UserControl and having it do all the legwork, I pass the base64 data as a <PARAM> instead. It then quickly converts that data back to an image.
That 15 seconds for 15 images is now less than 3 seconds. Thus began my search for an image-to-base64 solution that works in IE7/8.
Here are some additional requirements/restrictions:
The solution cannot have external dependencies (i.e. $.getImageData).
It needs to be 100% encapsulated so it can be portable.
The source and quantity of images are variable, and they must be in URL format (base64 data up front is not an option).
I hope I've provided sufficient information and I appreciate any direction you're able to give.
Thanks.
You can use any of the FlashCanvas, fxCanvas or excanvas libraries, which simulate canvas using Flash or VML in old Internet Explorer versions. I believe all of these provide the toDataURL method from the canvas API, allowing you to get an encoded representation of your image.
After extensive digging around (I'm in the same fix) I believe this is the only solution, short of writing a PHP script that you can send the image to. The problem with that of course is that there isn't a way to send images to the PHP script unless any of these three conditions is true:
The browser supports typed arrays (Uint8Array)
The browser supports sendAsBinary
The image is being uploaded by someone via a form (in which case it can be sent to a PHP script that responds with the base 64 encoding)

Including base64 gzipped stylesheets/images in javascript?

I know you can include css and images, among other file types, which have been stored in base64 form within a javascript file. However, those are decently huge... and gzipped, they shrink down a LOT, even with the ~33% overhead from base64 encoding.
Non-gzipped, images are data:image/gif;base64, data:image/jpeg;base64, or data:image/png;base64, and css is data:text/css;base64. What MIME type can/should I be using, then, to include css or image data URIs which are gzipped? (Or if gzip+base64 can't work, is there any other compression I can do to bring down the string's size, while still keeping the data stored within the javascript?)
Edit
I think the question is being misunderstood. I am not asking if I should include gzipped base64 strings within JavaScript. Yes, I know it's best, in most cases, to gzip the JavaScript and other files on the server end. But that is not applicable for a userscript; a userscript has no server, and consists of only a single file. Firefox allows a @require directive, but Opera and Chrome do not, and local-file security issues come into play with loading any local files. Thus anything needed by the script has to be either: 1) on the web (slow) or 2) embedded in the userscript (big).
Now this question assumes that big is preferable to slow, but that big does not have to mean we totally ignore just how big; if it can be smaller, that's an improvement.
So assuming that a base64 string is embedded in javascript, the question is how to make it into something meaningful.
Either:
1) atob() can convert raw base64-encoded gzip to raw gzip within JavaScript (atob does not need to know the mediatype); a sketch follows below this list. The question then would be how to decompress that raw gzipped css or image file so that the resulting output can be fed into the document.
or 2) given the proper mediatype, browsers at least theoretically (per the data URI RFC) should be able to load any file directly from a data URI. A <link rel="stylesheet" href="data:text/css;base64,…"> element is sufficient to load a non-gzipped css stylesheet. The question here would be which link type attribute and data URI mediatype combination should work (and in which browsers)? Preferably, for a userscript, this would be a combination that works in Opera, FF, and Chrome.
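For the first option, a sketch of the decoding step. Decompression itself would still need a JavaScript inflate implementation bundled into the script; pako is one such library, named here only as an example:
var binaryString = atob(base64Gzip); // base64 -> raw gzip bytes, one per character
var bytes = new Uint8Array(binaryString.length);
for (var i = 0; i < binaryString.length; ++i) {
    bytes[i] = binaryString.charCodeAt(i);
}
// with a bundled inflate library, e.g. pako:
// var css = pako.ungzip(bytes, { to: 'string' });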
In HTTP, compression is most often only applied for transmission to reduce the payload that is to be transmitted. This is done by the Content-Encoding header field.
But the data URL scheme is very limited and you can only specify the media type:
dataurl := "data:" [ mediatype ] [ ";base64" ] "," data
Although you could use a multipart message, most user agents don't support them in data URLs. It would also be questionable whether the additional data needed to describe such a multipart message wouldn't be more than the data you save by compressing the actual payload.
So compressing the data in a data URL is possible in theory but impracticable. It is better to simply compress the whole document the data URL is embedded in.

Why pass parameters to CSS and JavaScript link files like src="../cnt.js?ver=4.0"?

Looking at many sites' source code, I saw parameters being passed to the linked files (CSS/JavaScript).
In the Stack Overflow source, I got
<script type="text/javascript" src="http://sstatic.net/js/master.js?v=55c7eccb8e19"></script>
Why is master.js?v=55c7eccb8e19 used?
I am sure that JavaScript/CSS files can't get the parameters.
What is the reason?
It is usually done to prevent the browser from using a stale cached copy.
Let's say you deploy version 2 of your new application and you want to cause the clients to refresh their CSS; you could add this extra parameter to indicate that it should be re-requested from the server. Of course, there are other approaches as well, but this is pretty simple.
As the others have said, it's probably an attempt to control caching, although I think it's best to do so by changing the actual resource name (foo.v2.js, not foo.js?v=2) rather than a version in the query string. (That doesn't mean you have to rename files, there are better ways of mapping that URL to the underlying file.) This article, though four years old and therefore ancient in the web world, is still a quite useful discussion. In it, the author claims that you don't want to use query strings for versions because:
...According the letter of the HTTP caching specification, user agents should never cache URLs with query strings. While Internet Explorer and Firefox ignore this, Opera and Safari don’t...
That statement may not be quite correct, because what the spec actually says is
...since some applications have traditionally used GETs and HEADs with query URLs (those containing a "?" in the rel_path part) to perform operations with significant side effects, caches MUST NOT treat responses to such URIs as fresh unless the server provides an explicit expiration time...
(That emphasis at the end is mine.) So using a version in the query string may be fine as long as you're also including explicit caching headers. Provided browsers implement the above correctly. And proxies do. You see why I think you're better off with versions in the actual resource locator, rather than query parameters (which [again] doesn't mean you have to constantly rename files; see the article linked above for more). You know browsers, proxies, etc. along the way are going to fetch the updated resource if you change its name, which means you can give the previous "name" a never-ending cache time to maximize the benefit of intermediate caches.
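For instance, once the version is part of the file name, the response for that resource can carry explicit, far-future caching headers (the values here are illustrative):
Cache-Control: public, max-age=31536000
Expires: Thu, 31 Dec 2037 23:55:55 GMT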
Regarding:
I am sure that Js/CSS files can't get the parameters.
Just because the result coming back is a JavaScript or CSS resource, it doesn't mean that it's a literal file on the server's file system. The server could well be doing processing based on the query string parameters and generating a customized JavaScript or CSS response. There's no reason I can't configure my server to route all .js files to (say) a PHP handler that looks at the query string and returns something customized to match the fields given. Thus, foo.js?v=2 may well be different from foo.js?v=1 if I've set up my server to do so.
That's to prevent the browser from caching the file. The appended version parameter has no effect on the JavaScript file itself, but to the browser's caching engine it looks like a unique file now.
For example, if you had scripts.js, the browser would download and cache (store) that file on the first page visit to make the next visit faster. However, if you make a change, the browser may not notice it until the cache has expired. scripts.js?v2, however, forces the browser to re-fetch because the "name" has changed (even though it hasn't; just the contents have).
A server-side script generating the CSS or JavaScript code could make use of them, but it is probably just being used to change the URI when the content of the file changes, so that old, cached versions won't cause problems.
<script type="text/javascript">
    // front-end cache bust
    var cacheBust = ['js/StrUtil.js', 'js/protos.common.js', 'js/conf.js', 'bootstrap_ECP/js/init.js'];
    for (var i = 0; i < cacheBust.length; i++) {
        var el = document.createElement('script');
        el.src = cacheBust[i] + "?v=" + Math.random(); // random query string defeats the cache
        document.getElementsByTagName('head')[0].appendChild(el);
    }
</script>
This is to force the browser to re-cache the .js file if there has been any update.
You see, when you update your JS on a site, some browsers may have cached the old version (to improve performance). Since you want them to use your new one, you can append something to the query field of the name, and voilà! The browser re-fetches the file!
This applies to all files sent from the server btw.
Since JavaScript and CSS files are cached by the client browser, we append some numeric value to their names in order to serve a non-cached version of the file.
"I am sure that JavaScript /CSS files can't get the parameters"
function getQueryParams(qs) {
    qs = qs.split("+").join(" ");
    var params = {},
        tokens,
        re = /[?&]?([^=]+)=([^&]*)/g;
    while (tokens = re.exec(qs)) {
        params[decodeURIComponent(tokens[1])] = decodeURIComponent(tokens[2]);
    }
    return params;
}
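A hypothetical usage of the snippet above, reading such parameters off the current page's URL:
// for a URL ending in ?v=55c7eccb8e19
var params = getQueryParams(window.location.search);
params.v; // "55c7eccb8e19"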
This is referred to as Cache Busting.
The browser will cache the file, including the query string. The next time the query string is updated, the browser will be forced to download the new version of the file.
There are various types of cache-busting, for example (one of each is sketched after this list):
Static
Date/Time
Software Version
Hashed-Content
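Hypothetical examples of each style (file names and values are made up):
<script src="app.js?v=1"></script>          <!-- static -->
<script src="app.js?20190101"></script>     <!-- date/time -->
<script src="app.js?v=2.1.0"></script>      <!-- software version -->
<script src="app.js?3f2a1c9d"></script>     <!-- hash of the file contents -->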
I've written an article on cache busting previously, which you may find useful:
http://curtistimson.co.uk/front-end-dev/what-is-cache-busting/
