I have a simple web application to manipulate image files in the browser.
It is entirely client-side. I had some questions about the safety of the operation here: Is this client side application secure?
I want to validate the files to make sure only allowed formats can be 'uploaded'. I put uploaded in quotations because, I'd like to repeat, everything happens on the client side in javascript.
PNG files specifically
I am learning about the structure of PNG files and I am thinking of using the FileReader object and its readAsArrayBuffer() method to read the bytes of the PNG file, so that I can evaluate the first 8 bytes of the header (137 80 78 71 13 10 26 10) along with the chunk types (IHDR, IDAT, IEND, etc.) and the CRCs. In fact I have already done so, but it isn't part of my web app yet. Basically, when a user tries to 'upload' a file, my app would spot-check some key bytes of the file and determine, roughly, 'OK, this is a PNG file. It's OK to work with this file'.
Would this be a good enough validation?
My reasoning for this precaution, even though this whole thing is client-side, is to protect an unsuspecting end user who might 'upload' a file that looks like a PNG but actually contains some harmful script.
If this isn't sufficient, I'm hoping someone can point me in the right direction so that I know what does constitute proper validation.
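A minimal sketch of the spot check described above, assuming the file has already been read into a Uint8Array (for example from the ArrayBuffer that readAsArrayBuffer() produces). The names checkPngStructure and crc32 are made up for this illustration. It only verifies the container structure (signature, chunk order, CRCs, nothing after IEND); it does not decode the image data, so passing it does not prove a file is harmless.

```javascript
// PNG signature: 137 80 78 71 13 10 26 10
const PNG_SIGNATURE = [137, 80, 78, 71, 13, 10, 26, 10];

// Build the standard CRC-32 lookup table (polynomial 0xEDB88320) once.
const CRC_TABLE = (() => {
  const table = new Uint32Array(256);
  for (let n = 0; n < 256; n++) {
    let c = n;
    for (let k = 0; k < 8; k++) {
      c = c & 1 ? 0xedb88320 ^ (c >>> 1) : c >>> 1;
    }
    table[n] = c >>> 0;
  }
  return table;
})();

function crc32(bytes) {
  let crc = 0xffffffff;
  for (const b of bytes) {
    crc = CRC_TABLE[(crc ^ b) & 0xff] ^ (crc >>> 8);
  }
  return (crc ^ 0xffffffff) >>> 0;
}

// Returns { ok: true, chunks: [...] } or { ok: false, reason: "..." }.
function checkPngStructure(bytes) {
  if (bytes.length < 8 || !PNG_SIGNATURE.every((v, i) => bytes[i] === v)) {
    return { ok: false, reason: "bad signature" };
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const chunks = [];
  let pos = 8;
  // Each chunk is: 4-byte length, 4-byte type, data, 4-byte CRC.
  while (pos + 12 <= bytes.length) {
    const length = view.getUint32(pos); // big-endian by default
    const end = pos + 12 + length;
    if (end > bytes.length) return { ok: false, reason: "truncated chunk" };
    const type = String.fromCharCode(...bytes.subarray(pos + 4, pos + 8));
    // The CRC covers the type and data fields, not the length.
    const typeAndData = bytes.subarray(pos + 4, pos + 8 + length);
    if (crc32(typeAndData) !== view.getUint32(pos + 8 + length)) {
      return { ok: false, reason: "bad CRC in " + type };
    }
    chunks.push(type);
    pos = end;
    if (type === "IEND") break;
  }
  if (chunks[0] !== "IHDR") return { ok: false, reason: "IHDR is not first" };
  if (!chunks.includes("IDAT")) return { ok: false, reason: "no IDAT chunk" };
  if (chunks[chunks.length - 1] !== "IEND") return { ok: false, reason: "no IEND chunk" };
  if (pos !== bytes.length) return { ok: false, reason: "trailing data after IEND" };
  return { ok: true, chunks };
}
```

Rejecting trailing bytes after IEND is a design choice in this sketch: it is one of the places where extra content can be appended to an otherwise valid PNG.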
I developed a kind of job application website and I only now realized that by allowing the upload of PDF files I'm at risk of receiving PDF documents containing encrypted data, active content (e.g. JavaScript, PostScript), and external references.
What could I use to sanitize or rebuild the content of every PDF file uploaded by users?
I want the companies that will later review the uploaded resumes to be able to open them in their browsers without being put at risk.
The simplest way to flatten or sanitise a PDF is with Ghostscript in safer mode, which requires just one pass:
For a Windows user it is as "simple" as using the new 9.55 command:
"c:\path to gs9.55\bin\GSwin64c.exe" -sDEVICE=pdfwrite -dNEWPDF -o "Output.pdf" "Input.pdf"
On other platforms, replace gs9.55\bin\GSwin64c with the version 9.55 GS command for your system.
It is not a fast method: around 40 pages per minute is not uncommon, so 4 pages take about 6 seconds to reprint, whereas a 400-page document could take 10 minutes.
Advantages: the file size is often smaller once redundant content is removed. Reconstructing images and fonts may save storage, e.g. a 100 MB file may be reduced to 30 MB, but that is a general bonus, not an aim.
JavaScript actions are usually discarded. However, links such as bookmarks are usually retained, so be cautious, as the result can still contain rogue hyperlinks.
The next best suggestion is two passes via PostScript as discussed here https://security.stackexchange.com/questions/103323/effectiveness-of-flattening-a-pdf-to-remove-malware
GS[win64c] -sDEVICE=ps2write -o "%temp%\temp.ps" "Input.pdf"
GS[win64c] -sDEVICE=pdfwrite -o "Output.pdf" "%temp%\temp.ps"
But there is no proof that it's any different from, or more effective than, the one-line approach.
Finally, the strictest method of all is to burst the PDF into image-only pages, then stitch the images back into a single PDF while running OCR to reconstruct a searchable PDF (this drops bookmarks). That can also be done using Ghostscript enabled with Tesseract.
Note: visible external hyperlinks may then still be reactivated, because PDF readers can natively turn visible URLs into links.
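If the sanitising step has to run on the upload server, the two-pass commands above could be scripted. The following is only a sketch, assuming a Node.js back end (not stated in the question); twoPassArgs and flattenPdf are hypothetical helper names, and gsPath is whatever Ghostscript executable is installed (for example GSwin64c.exe). Only the -sDEVICE and -o options shown in the commands above are used.

```javascript
// Build the argument lists for the two Ghostscript passes:
// PDF -> PostScript, then PostScript -> PDF.
function twoPassArgs(inputPdf, tempPs, outputPdf) {
  return [
    ["-sDEVICE=ps2write", "-o", tempPs, inputPdf],
    ["-sDEVICE=pdfwrite", "-o", outputPdf, tempPs],
  ];
}

// Run both passes in order. Passing arguments as an array (no shell string)
// keeps user-influenced file names from being interpreted by a shell.
async function flattenPdf(gsPath, inputPdf, tempPs, outputPdf) {
  const { execFile } = await import("node:child_process");
  const { promisify } = await import("node:util");
  const run = promisify(execFile);
  for (const args of twoPassArgs(inputPdf, tempPs, outputPdf)) {
    await run(gsPath, args);
  }
}

// Example call (paths are placeholders):
// await flattenPdf("gs", "Input.pdf", "/tmp/temp.ps", "Output.pdf");
```

It is safer to generate the input, temporary and output paths on the server rather than reuse the name the uploader supplied.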
I have a file which is split up into multiple parts on the server side. The complete file is huge, it can be 10 or more gigabytes in size (that is the reason for splitting it in the first place).
The file is in a specific format, and it has to be processed on the client side before being downloaded.
Now, I know that I can download into a Blob to the client side, do the processing, then download Blobs from there with this approach: JavaScript blob filename without link
The problem here is that I would need to construct a single huge blob from all the file parts on the client side, which I do not want, because it will probably exceed RAM limitations rather quickly.
I would like to download each part of the file individually and then process it and download the "partial" blob. That means that I would need to start a download and then piece by piece add blobs to it until the download is complete.
Is there any possibility of doing this? How? I know that mega.co.nz does something similar with file downloads where they process the file on the client side first (for decryption). Are they using such techniques?
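You can save the downloaded parts to localStorage. You will have to serialize each part into a string first; then you can call localStorage.setItem. Bear in mind that browsers typically limit localStorage to around 5 MB per origin, so this only works for small parts. Your code might look like this: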
localStorage.setItem('download-part-' + chunkIndex, chunkDataAsString);
chunkDataAsString = ''; // let the garbage collector collect the large string
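The serialization step is not shown above, so here is one possible sketch, assuming each part arrives as a Uint8Array. The helper names (bytesToBase64, base64ToBytes, saveChunk, loadChunk) are made up for this example; the storage argument is expected to be localStorage in the browser, but any object with setItem and getItem works.

```javascript
// Convert a binary chunk to a base64 string. The work is done in slices so
// that String.fromCharCode never receives an enormous argument list.
function bytesToBase64(bytes) {
  let binary = "";
  const SLICE = 0x8000;
  for (let i = 0; i < bytes.length; i += SLICE) {
    binary += String.fromCharCode.apply(null, bytes.subarray(i, i + SLICE));
  }
  return btoa(binary);
}

// Reverse of bytesToBase64.
function base64ToBytes(text) {
  const binary = atob(text);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return bytes;
}

// Store one part under the same key scheme as the snippet above.
function saveChunk(storage, chunkIndex, bytes) {
  storage.setItem("download-part-" + chunkIndex, bytesToBase64(bytes));
}

// Read a part back; returns null if it was never stored.
function loadChunk(storage, chunkIndex) {
  const text = storage.getItem("download-part-" + chunkIndex);
  return text === null ? null : base64ToBytes(text);
}

// In the browser: saveChunk(localStorage, 0, new Uint8Array(partBuffer));
```

Base64 makes the stored string roughly a third larger than the raw bytes, which is the price of storing binary data as text.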
I'm creating an ASP.NET form with a FileUpload control which will then email the details of the form and the file to an admin. I want to ensure this is secure (for the server and the recipient). The attachment should be a CV, so I will restrict it to typical text documents.
From what I can tell the best bet is to check that the file extension or MIME Type is of that kind and check it against the "magic numbers" to verify that the extension hasn't been changed. I'm not too concerned about how to go about doing that but want to know if that really is enough.
I'd also be happy to use a third party product that takes care of this and I've looked at a couple:
blueimp jQuery file upload
http://blueimp.github.io/jQuery-File-Upload/
and cutesoft ajaxuploader
http://ajaxuploader.com/Demo/
But the blueimp one still seems to require custom server validation (I guess that, being jQuery, it only handles client-side validation), and the .NET one checks that the MIME type matches the extension, but I thought the MIME type followed the extension anyway.
So,
Do I need to worry about server security when the file is added as an attachment but not saved?
Is there a plugin or control that takes care of this well?
If I need to implement something for server validation myself is matching the MIME-type to the "magic numbers" good enough?
I'm sure nothing is 100% bulletproof but file upload is pretty common stuff and I assume most implementations are "safe enough" - but how!?
If it's relevant, here is my basic code so far
<p>Please attach your CV here</p>
<asp:FileUpload ID="fileUploader" runat="server" />
and on submit
MailMessage message = new MailMessage();
if (fileUploader.HasFile)
{
    try
    {
        // ContentType is a full MIME type such as "text/plain", so compare the prefix
        if (fileUploader.PostedFile.ContentType.StartsWith("text/", StringComparison.OrdinalIgnoreCase))
        {
            // check magic numbers indicate same content type... if(){}
            if (fileUploader.PostedFile.ContentLength < 102400)
            {
                string fileName = System.IO.Path.GetFileName(fileUploader.PostedFile.FileName);
                message.Attachments.Add(new Attachment(fileUploader.PostedFile.InputStream, fileName));
            }
            else
            {
                // show a message saying the file is too large
            }
        }
        else
        {
            // show a message saying the file is not a text based document
        }
    }
    catch (Exception ex)
    {
        // display ex.Message;
    }
}
A server can never be 100% secure, but we should do our best to minimize the risk of an incident. I should say at this point that I am not an expert; I am just a computer science student. So, here is the approach I would follow in such a case. Please comment with any additional tips you can give.
Generally speaking, to have a secure form, all client inputs must be checked and validated. Any information that does not originate from our system is not trusted.
Inputs from the client in our case:
file's name
    name
    extension
file's content
Extension
We don't really care about the MIME type; that is information for a web server. We care about the file extension, because it tells the OS how to run, read or open a file. We should support only specific file extensions (whatever your admin's PC can handle); there is no point in supporting unknown file types.
Name (without the extension)
The name of the file is not always valuable information. When I deal with file uploads I usually rename the file to an id (a username, a timestamp, a hash, etc.). If the name is important, always check and trim it: if you only expect letters or numbers, delete all other characters (I avoid leaving "/", "\" and "." in, because they can be used to inject paths).
So now we suppose that the generated file name is safe.
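The extension and name rules above can be sketched as follows. The allow-list, the 50-character limit and the helper names (splitName, safeStoredName) are assumptions made for this example, and id stands for whatever identifier the server generates itself. JavaScript is used only for illustration; the same logic applies in any server language.

```javascript
// Hypothetical allow-list: only what the admin's PC is expected to open.
const ALLOWED_EXTENSIONS = new Set(["pdf", "doc", "docx", "txt", "rtf"]);

// Split a client-supplied file name into name and lower-cased extension.
function splitName(fileName) {
  // Drop any directory part the client may have sent, in either slash style.
  const base = fileName.split(/[\\/]/).pop();
  const dot = base.lastIndexOf(".");
  if (dot <= 0) return { name: base, extension: "" };
  return { name: base.slice(0, dot), extension: base.slice(dot + 1).toLowerCase() };
}

// Returns a name that is safe to store, or null if the extension is not allowed.
function safeStoredName(fileName, id) {
  const { name, extension } = splitName(fileName);
  if (!ALLOWED_EXTENSIONS.has(extension)) return null;
  // Keep only letters and digits, so "/", "\" and "." cannot survive.
  const cleaned = name.replace(/[^A-Za-z0-9]/g, "").slice(0, 50);
  return cleaned ? `${id}-${cleaned}.${extension}` : `${id}.${extension}`;
}
```

Only the last extension is checked, so a name like resume.pdf.exe is rejected rather than treated as a PDF.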
Content
When you support unstructured files, you simply cannot validate the file's content yourself. So let an expert program do it for you: scan the files with an antivirus. Call the antivirus from the console (carefully, using mechanisms that avoid injection). Many antivirus programs can scan the contents of zip files too (leaving a malicious file in a folder on your server is not a good idea). Always keep the scanning program updated.
In the comments I suggested zipping the file, in order to avoid any automatic execution on the admin's machine and on the server. The antivirus on the admin's machine can then handle it before it is unzipped.
Some more tips: don't give the client more information than they need. Don't let the client know where the files are saved, and don't let the web server access them for distribution if there is no need to. Keep a log of weird actions (slashes in file names, files that are too big, names that are too long, suspicious extensions like "sh", "exe", "bat") and email the admins if anything weird happens (it is good to know whether your protections work).
All of this creates extra server workload (and more potential holes), so you should probably count the number of files currently being scanned or checked before accepting a new upload request (that is where I would launch a DDoS attack).
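Counting the scans in progress can be as simple as the sketch below. createScanLimiter and its method names are invented for this example, and the limit of 2 is arbitrary; it assumes a single server process, so the counter is not shared across machines.

```javascript
// Tracks how many scans are running and refuses new ones above a limit.
function createScanLimiter(maxConcurrent) {
  let active = 0;
  return {
    // Returns true and takes a slot if one is free, otherwise false.
    tryAcquire() {
      if (active >= maxConcurrent) return false;
      active++;
      return true;
    },
    // Gives a slot back; call this when a scan finishes or fails.
    release() {
      if (active > 0) active--;
    },
    activeCount() {
      return active;
    },
  };
}

// Usage inside an upload handler (scanFile is a placeholder):
// const limiter = createScanLimiter(2);
// if (!limiter.tryAcquire()) { /* reject the upload: server busy */ }
// else { try { scanFile(path); } finally { limiter.release(); } }
```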
A quick Google search turns up "Avast! For Linux - Command Line Guide". I do not promote Avast; I am just showing it as an existing example.
Last but not least, you are not being paranoid. I manage a custom translation system that I coded, and spam and hack attacks have occurred more than once.
Some more thoughts: JavaScript running on a web page is only secure for the client's computer (thanks to the browser's security). We can use it to prevent invalid posts to the server, but this does not ensure that such requests will never be made, as JavaScript can be bypassed or edited.
So, all JavaScript solutions are only a first validation (usually just to help the user correct mistakes) and a way to set the form data correctly.
I had to check the file type in a file uploader to determine whether a file was an image (jpg, png), and I decided to do it by reading the file's magic number (the first 4 bytes) with FileReader. But I have some doubts about this method:
Is this method safe? Is there a way to upload a non-jpg file as jpg with this method?
I've seen file types with magic numbers of different sizes, like 2, 4 or 6 bytes. So if I had to write a generic method to determine not just image file types but others as well, I would have to read enough bytes from the file to cover the largest magic number, right?
It's not safe. The problem is not only in the magic numbers but in the fact that you are trying to validate on the client side at all.
A form can be submitted directly from a script, bypassing your client-side validation.
The correct way to do it is to validate everything on the server side using proven techniques.
Right. Different file formats have different magic numbers at different offsets. But still, if you care about security, don't trust anything.
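To illustrate the point about different lengths and offsets, here is a small sketch of a signature table. The table is deliberately short, and the names SIGNATURES, BYTES_NEEDED and sniffType are made up for this example. It works on a Uint8Array, so the same function can run in the browser as a convenience check and again on the server, where the check that actually counts has to happen.

```javascript
// Each entry lists the byte sequences that must appear at given offsets.
const SIGNATURES = [
  { type: "png", parts: [{ offset: 0, bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] }] },
  { type: "jpeg", parts: [{ offset: 0, bytes: [0xff, 0xd8, 0xff] }] },
  { type: "gif", parts: [{ offset: 0, bytes: [0x47, 0x49, 0x46, 0x38] }] }, // "GIF8"
  { type: "pdf", parts: [{ offset: 0, bytes: [0x25, 0x50, 0x44, 0x46] }] }, // "%PDF"
  // WebP needs two pieces: "RIFF" at offset 0 and "WEBP" at offset 8.
  {
    type: "webp",
    parts: [
      { offset: 0, bytes: [0x52, 0x49, 0x46, 0x46] },
      { offset: 8, bytes: [0x57, 0x45, 0x42, 0x50] },
    ],
  },
];

// How many leading bytes must be read to test every signature in the table.
const BYTES_NEEDED = Math.max(
  ...SIGNATURES.flatMap((s) => s.parts.map((p) => p.offset + p.bytes.length))
);

// Returns the first matching type name, or null if nothing matches.
function sniffType(bytes) {
  const match = SIGNATURES.find((s) =>
    s.parts.every((p) => p.bytes.every((b, i) => bytes[p.offset + i] === b))
  );
  return match ? match.type : null;
}
```

In the browser the leading bytes could be obtained with file.slice(0, BYTES_NEEDED).arrayBuffer() and wrapped in a Uint8Array. A matching signature only says what the file claims to be; a file can start with JPEG bytes and still contain anything after them.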
I am building a multiple-file upload project (the server language is PHP).
In particular, I need to get each file's size in bytes on the client side, before uploading.
What is the best cross-browser solution or plugin for this today?
I found SWFUpload. Is there a better solution, or should I use SWFUpload?
In HTML5 a file drag & drop event creates a File object which has a .size property.
Look in ev.dataTransfer for the list of File objects.
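A minimal sketch of reading the sizes, assuming the list of File objects is taken from ev.dataTransfer.files in a drop handler. The helper name describeFiles is invented for this example; the files property of an <input type="file"> element holds the same kind of File objects and could be passed in the same way.

```javascript
// Turn a FileList (or any array-like of File objects) into plain
// { name, size } records; size is in bytes.
function describeFiles(fileList) {
  return Array.from(fileList, (file) => ({ name: file.name, size: file.size }));
}

// Browser wiring, skipped when no document is available.
if (typeof document !== "undefined") {
  // dragover must be cancelled, otherwise the drop event never fires.
  document.addEventListener("dragover", (ev) => ev.preventDefault());
  document.addEventListener("drop", (ev) => {
    ev.preventDefault();
    console.log(describeFiles(ev.dataTransfer.files));
  });
}
```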