In our web portal we generate PDFs for certain kinds of data. The user downloads the PDF by clicking an <a> tag that references something that we return with Content-Type: application/pdf;charset=utf-8
This works well when it works; the browser realizes that it is getting a PDF file and opens an internal or external PDF reader, or asks the user to save the file, depending on browser and user configuration.
We have some cases where we may fail to generate the PDF, though. At first we didn't handle the error: a NullPointerException fell through and we got an ugly new page full of JSON-formatted garbage. Then we tried returning an empty result, which the browser thinks is fine, so it just saves or opens an empty file. Then I tried returning a redirect, which confused Chrome; it showed an alert telling the user that something strange was happening.
The href in the tag is of the format "/module/showmypdf.cmd?pdfid=67482". This, as I said, works fine when a valid PDF is returned.
So, is there any kind of best practice for error handling when it comes to sending non-HTML files to browsers? Is there something else I could try to make the browser interpret my response as a redirect?
Ok I figured out why the redirect didn't work. I was doing this in my Java Spring controller:
response.sendRedirect("redirect:mypage.html?pdfError=true");
The "redirect:" prefix is something you can use when returning the view name from a controller. In the sendDirect() call it only adds confusion. Removing "redirect:" fixed it.
I have a jQuery('#Frame').animate360 script on the page and it calls a file called Profile.xml to get its settings etc.
The problem is that Profile.xml is on Azure Blob Storage and is in uppercase (PROFILE.XML). This means a 404 File Not Found error.
I can't change the filename (Profile.xml) on Azure.
The piece of JS that calls Profile.xml seems to be buried inside a library (HTML5Loader.js), i.e. the text 'Profile.xml' does not appear in any file and can only be found with the Chrome debugger in an unnamed file.
My instinct was to use something like Application_BeginRequest etc to catch a 'call' to https://storageblabla.blob.core.windows.net/uploads/ALPHA/3DIMAGES/1.010659/Profile.xml
and change it to ...../PROFILE.XML
but it's too late at that stage; it already knows that it's a 404.
There must be some point, accessible from code, where the remote URL is being requested and where it can be intercepted.
Reckon it's a one-line fix, but I just can't find the right term to search on.
The problem:
I work on an internal tool that allows users to upload images - and then displays those images back to them and others.
It's a Java/Spring application. I have the benefit of only needing to worry about IE11 exactly and Firefox v38+ (Chrome v43+ would be a nice to have)
After first developing the feature, it seems that users can just create a text file like:
<script>alert("malicious code here!")</script>
and save it as "maliciousImage.jpg" and upload it.
Later, when that image is displayed inside image tags like:
<img src="blah?imgName=foobar" id="someImageID">
actualImage.jpg displays normally, and maliciousImage.jpg displays as a broken link - and most importantly no malicious content is interpreted!
However, if the user right-clicks on this broken link and clicks 'view image'... bad things happen.
The browser does 'content sniffing', a concept which was new to me: it detects that 'maliciousImage.jpg' is actually a text file, and very kindly renders it as HTML without hesitation. Any script tags are passed to the JavaScript interpreter and, as you can imagine, we don't want this.
What I've tried so far
In short, every possible combination of response headers I can think of to prevent the browser from content sniffing. All the answers I've found here on Stack Overflow, and other docs, imply that setting the Content-Type header should prevent most browsers from content sniffing, and setting X-Content-Type-Options should prevent some versions of IE.
I'm setting X-Content-Type-Options to nosniff, and I'm setting the response content type. The docs I've read lead me to believe this should stop content sniffing:
response.setHeader("X-Content-Type-Options", "nosniff");
response.setContentType("image/jpg");
I'm intercepting the response and these headers are present, but seem to have no effect on how the malicious content is processed...
I've also tried detecting which images are and are not malicious at the point of upload, but I'm quickly realizing this is very much non-trivial...
End goal:
Naturally - any output at all for images that aren't really images (garbled nonsense, an unhandled exception, etc.) would be better than executing the text file as HTML/JavaScript in the clear, but displaying any malicious HTML as escaped/CDATA'd plain text would be ideal... though maybe a bit impractical.
So I ended up fixing this problem but forgot to answer my own question:
Step 1: blocking invalid images
To get a quick fix out, I simply added some fairly blunt code that checked whether an image actually was an image - during upload and before serving it - using the ImageIO lib:
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import javax.imageio.ImageIO;
//......
// Load the stored bytes and let ImageIO try to decode them;
// anything that is not a real image fails this check.
Image img = attBO.getImage(imgId);
InputStream x = new ByteArrayInputStream(img.getData());
BufferedImage s;
try {
    s = ImageIO.read(x);
    s.getWidth(); // NullPointerException here if ImageIO could not decode the stream
} catch (Exception e) {
    throw new myCustomException("Invalid image");
}
Now, initially I'd hoped that would fix my problem - but in reality it wasn't that simple; it just made generating a payload more difficult.
While this would block:
<script>alert("malicious code here!")</script>
It's very possible to generate a valid image that's also an XSS payload - just a little more effort....
Step 2: framework silliness
It turned out there was an entire post-processing workflow that I'd never touched, that did things such as append tokens to response bodies and use additional frameworks to decorate responses with CSS, headers, footers etc.
This meant that, although the controller was explicitly returning image/png, the post-processing step was grabbing that byte stream and wrapping it in a header and footer to form a fully fledged 'view' - and this view would always have the content type text/html, so the image was never displayed correctly.
The crux of this problem was that my controller was directly returning an image, in a RESTful fashion, when the rest of the framework was built to handle controllers returning full fledged views.
So I had to step through this workflow and create exceptions for the controllers in my code that worked in a RESTful fashion and returned something other than a full view.
For example, with SiteMesh it was just an exclude (as always, a simple fix once I understood the problem...):
<decorators defaultdir="/WEB-INF/decorators">
    <excludes>
        <pattern>*blah.ctl*</pattern>
    </excludes>
    <decorator name="foo" page="myDecorator.jsp">
        <pattern>*</pattern>
    </decorator>
</decorators>
and then some other bespoke post-invocation interceptors.
Step 3: Content negotiation
Now, I finally got to the stage where only the raw image bytes were being served and no view was being specified or explicitly generated.
A Spring feature called 'content negotiation' kicked in. It tries to reconcile the 'Accept' header of the request with the MessageConverters it has on hand to produce such responses.
Because Spring by default doesn't have a MessageConverter that produces image/png responses, it was falling back to text/html - and I was still seeing problems.
Now, were I using Spring 4, I could've simply added the annotation:
@Produces("image/png")
to my controller - simple fix...
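For reference, the same idea expressed with Spring's own request-mapping support would look roughly like this (a sketch; the mapping path, handler name and imageService lookup are made up for illustration):
// Hypothetical handler: declare the produced media type on the mapping and return the
// raw bytes, so Spring writes them with ByteArrayHttpMessageConverter instead of
// trying to resolve a view.
@RequestMapping(value = "/showImage", produces = "image/png")
@ResponseBody
public byte[] showImage(@RequestParam("imgName") String imgName) {
    return imageService.loadImageBytes(imgName); // hypothetical lookup
}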
Step 4: Legacy dependencies
but because I was stuck on Spring 3.0.5 (and couldn't upgrade it) I had to try other things.
I tried registering new MessageConverters, but that was a headache. I also tried adding a new post-method interceptor to simply change the content type back to 'image/png' - but that was a hacky headache too.
In the end I just exposed the request/response in the controller, and wrote my image directly to the response body - circumventing Spring's content negotiation altogether.
....and finally my image was served as an image and displayed as an image - and no injected code was executed!
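A minimal sketch of that final approach, assuming a servlet-style Spring controller method (the mapping, parameter name and imageService call are illustrative, not the original code):
// Hypothetical handler: write the bytes straight to the response and return void,
// so Spring treats the response as already handled and skips view resolution.
@RequestMapping("/showImage")
public void showImage(HttpServletRequest request, HttpServletResponse response) throws IOException {
    byte[] imageBytes = imageService.loadImageBytes(request.getParameter("imgName")); // hypothetical lookup
    response.setContentType("image/png");
    response.setHeader("X-Content-Type-Options", "nosniff");
    response.setContentLength(imageBytes.length);
    response.getOutputStream().write(imageBytes);
    response.getOutputStream().flush();
}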
That sounds odd, because it works perfectly elsewhere. Are you sure the X-Content-Type-Options header is present in the responses?
Here is a demo I built a while back, where I have a file that's a valid html, gif and javascript. As you can see it first loads as an HTML, but then loads itself as an image and as a script (which executes):
http://research.insecurelabs.org/content-sniffing/gifjs.html
However if you load it using the "X-Content-Type-Options: nosniff" header, the script no longer executes:
http://research.insecurelabs.org/content-sniffing/nosniff/gifjs.html
Btw, the image renders properly in FF/IE, but not in Chrome.
Here is a demo, where I attempted what you described:
http://research.insecurelabs.org/content-sniffing/stackexchange.html
First image is without nosniff, and second is with, and it seems to work as intended. Second one does not run the script when opened with "view image".
Edit:
Firefox doesn't seem to support X-Content-Type-Options: nosniff
So, you should also add "Content-disposition: attachment;filename=image.gif" or similar to the images. The image will load normally if loaded through an image tag, but if you open the URL directly, you will force a download instead of showing the image directly in the browser.
Example: http://research.insecurelabs.org/content-sniffing/attachment/
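In the asker's Java servlet setup, applying that advice would look roughly like this (a sketch; the filename is illustrative):
// Forbid sniffing, and force a download when the URL is opened directly
// rather than through an <img> tag.
response.setContentType("image/gif");
response.setHeader("X-Content-Type-Options", "nosniff");
response.setHeader("Content-Disposition", "attachment; filename=image.gif");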
adeneo is pretty much spot-on. You should use whatever image library you want to check if the uploaded file is a valid file for the type it claims to be. Anything the client sends can be manipulated.
I have a PHP script that outputs a CSV file, and up until now I've been using a plain link, passing the parameters that determine the output in the GET data. However, the size of the data recently increased and now that code gets Error 414 - Request-URI Too Large. I tried using a hidden form to do it with POST, but it just reloaded the page and didn't supply a prompt to download the file, and all of the suggestions I've been able to find online about doing it with AJAX suggest using a link with GET data instead. Does anyone know a workaround that will have the browser still let the user easily download the data?
Presently I'm just setting the href attribute of an <a> tag.
$("#exportCSV").attr('href', "myscript.php/?data=" + exportData);
exportData has become too long for GET data but I want to maintain the behavior where if you click on a link that has say a CSV file being outputted the browser provides a download dialog for the user.
I have the following question:
I'm currently working with a piece of software (MicroStrategy, a BI tool) which has a functionality that exports reports to PDF. It works something like this:
Each report has a unique ID, so you select the report to export, and with JSP I send this report's ID to the exporting tool, which then generates a complete URL with some parameters that the MicroStrategy server will read to generate the PDF.
What I'm trying to do is capture this PDF URL and send it to a Java method that will save the PDF to the hard drive without prompting the user for anything.
My problem is that this URL isn't generated instantly, it takes a while, AND some redirections are made in the process.
So, after all that chitchat, how can I capture that damn URL?
What I'm doing is making the PDF load into an iframe, then extracting the URL with some JS code I found while searching, assigning it to a JSP variable, and then, once I have the PDF URL, calling the Java method. But it is not working.
The JavaScript function is this:
<script language="text/javascript">
function getSrc()
{
    var CurrentUrl = document.getElementById('miframe').contentWindow.location.href;
    if(currentUrl.substr(length-5)==".pdf")
    {
        return currentUrl;
    }
    else
    {
        setTimeout(getSrc(),5000);
    }
}
</script>
and this is the call i make to it:
<% jsp code
String currentUrl="<script>document.writeln(getSrc());</script>";
more jsp code %>
The rest of the code is actually fine, tried it with a normal pdf URL and it saved the pdf into the disk.
Hope it is understandable, and thanks in advance!
Your main problem is that you are calling getSrc, not passing it to setTimeout (you are actually passing undefined to setTimeout, unless the second call to getSrc happens to work, in which case you are passing a string, which setTimeout can't process due to "syntax errors").
Instead, use setTimeout(getSrc,5000); - no parentheses after getSrc. This passes the function, rather than its result.
Also, currentUrl.substr(length-5) is wrong, partly because length is undefined (you need currentUrl.length in there), and partly because you need -4 to get the last four characters.
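Putting both fixes together, the polling function would look roughly like this (a sketch of the corrected version, with consistent variable casing; untested against MicroStrategy):
function getSrc()
{
    var currentUrl = document.getElementById('miframe').contentWindow.location.href;
    // compare the last four characters against ".pdf"
    if (currentUrl.substr(currentUrl.length - 4) === ".pdf")
    {
        return currentUrl;
    }
    else
    {
        // pass the function itself; calling it here would pass its result instead
        setTimeout(getSrc, 5000);
    }
}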
I don't know what kind of access you have to MicroStrategy, but there is a MicroStrategy Java API that will allow you to execute the document and get the PDF without capturing the URL.
Check out their Knowledge Base for examples.
Why don't you just save the report/document with PDF as the default export format? That way, when you open the report it will automatically be generated as a PDF.
If you don't like the idea of saving a report in PDF (for example because you need it also as a regular report and you don't want to maintain two versions of the same object), you can use the URL API to generate the PDF using &executionMode=3 and &currentViewMedia=32.
I'm not sure about these parameters; the best way for you to figure out which ones they are (besides some MicroStrategy TNs) is to export the report to PDF and check the URL.
In WebKit I get the following error on my JavaScript:
Refused to execute a JavaScript script. The source code of script found within request.
The code is for a JavaScript spinner, see ASCII Art.
The code used to work OK and is still working correctly in Camino and Firefox. The error only seems to be thrown when the page is saved via a POST and then retrieved via a GET. It happens in both Chrome/Mac and Safari/Mac.
Anyone know what this means, and how to fix this?
This "feature" can be disabled by sending the non-standard HTTP header X-XSS-Protection on the affected page.
X-XSS-Protection: 0
It's a security measure to prevent XSS (cross-site scripting) attacks.
This happens when some JavaScript code is sent to the server via an HTTP POST request, and the same code comes back via the HTTP response. If Chrome detects this situation, it refuses to run the script, and you get the error message Refused to execute a JavaScript script. Source code of script found within request.
Also see this blog post about Security in Depth: New Security Features.
Short answer: refresh the page after making your initial submission of the javascript, or hit the URL that will display the page you're editing.
Long answer: because the text you filled into the form includes javascript, and the browser doesn't necessarily know that you are the source of the javascript, it is safer for the browser to assume that you are not the source of this JS, and not run it.
An example: suppose I sent you a link in your email or on Facebook with some JavaScript in it, and imagine that the JavaScript would message all your friends my cool link. The game of getting that link invoked then becomes simply: find a place to send the JavaScript such that it will be included in the page.
Chrome and other WebKit browsers try to mitigate this risk by not executing any javascript that is in the response, if it was present in the request. My nefarious attack would be thwarted because your browser would never run that JS.
In your case, you're submitting it into a form field. The POST of the form causes a render of the page that displays the JavaScript, causing the browser to worry. If your JavaScript is truly saved, however, hitting that same page without submitting the form will allow it to execute.
As others have said, this happens when an HTTP response contains a JavaScript and/or HTML string that was also in the request. This is usually caused by entering JS or HTML into a form field, but can also be triggered in other ways such as manually tweaking the URL's parameters.
The problem with this is that someone with bad intentions could put whatever JS they want as the value, link to that URL with the malicious JS value, and cause your users trouble.
In almost every case, this can be fixed by HTML encoding the response, though there are exceptions. For example, this will not be safe for content inside a <script> tag. Other specific cases can be handled differently - for example, injecting input into a URL is better served by URL encoding.
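As an illustration of that encoding step, here is a minimal self-contained sketch (plain Java; the htmlEncode helper is hypothetical and stands in for whatever escaping utility your framework already provides):
// Escape the characters HTML cares about before echoing user input back in a response.
static String htmlEncode(String input) {
    return input
        .replace("&", "&amp;")   // must be replaced first
        .replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace("\"", "&quot;")
        .replace("'", "&#39;");
}
// htmlEncode("<script>alert(1)</script>") -> "&lt;script&gt;alert(1)&lt;/script&gt;"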
As Kendall Hopkins mentioned, there may be a few cases when you actually want JavaScript from form inputs to be executed, such as creating an application like JSFiddle. In those cases, I'd recommend that you at least scrub through the input in your backend code before blindly writing it back. After that, you can use the method he mentioned to prevent the XSS blockage (at least in Chrome), but be aware that it is opening you up to attackers.
I used this hacky PHP trick just after committing to the database, but before the script is rendered from my _GET request:
if(!empty($_POST['contains_script'])) {
echo "<script>document.location='template.php';</script>";
}
This was the cheapest solution for me.