Content-Length header doesn't match request body on chunked upload - javascript

Hello,
I have a problem with chunked uploading to Google Cloud Storage (GCS) using dropzone.js.
What I'm trying to do:
I want to upload a fairly large file to Google Cloud Storage via Dropzone. Because the file is large, I'm using Dropzone's built-in chunking to split and upload it. I upload to a signed URL that I create and initialize, then pass to Dropzone so that it knows where to send the file. I also add a Content-Range header to each request so that GCS can keep track of the upload progress.
Current status:
When the upload starts, it seems to work. But after the first chunk finishes, Dropzone tries to re-upload the first chunk, because the XHR response status is 400.
The problem:
When I inspected the XHR request and response, I saw that the Content-Length header and the byte range in the Content-Range header do NOT describe the same size. That is also what the response text reports as the error.
My code in .on("sending") by dropzone:
let procChunks = file.upload.chunks; // Get all chunks processed until now
const latestChunk = procChunks[procChunks.length-1]; // Select latest chunk
const chunkFirstByte = dz.options.chunkSize * latestChunk.index; // Calc first byte
const chunkLastByte = chunkFirstByte + (latestChunk.dataBlock.data.size-1); // Calc last byte
let header = "bytes "+chunkFirstByte+"-"+chunkLastByte +"/"+ (file.size-1); // Create header value
xhr.setRequestHeader("Content-Range", header); // Set header to xhr
Request Header:
PUT /upload/storage/v1/b/(myBucket)/o?uploadType=resumable&name=test2.fna&upload_id=(myUpLoadID) HTTP/2
Host: storage.googleapis.com
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:83.0) Gecko/20100101 Firefox/83.0
Accept: application/json
Accept-Language: de,en-US;q=0.7,en;q=0.3
Accept-Encoding: gzip, deflate, br
Cache-Control: no-cache
X-Requested-With: XMLHttpRequest
Content-Range: bytes 0-8388607/13088352
Content-Type: multipart/form-data; boundary=---------------------------428770732237854445893641225227
Content-Length: 8388843
Origin: http://127.0.0.1:5000
Connection: keep-alive
Referer: http://127.0.0.1:5000/upload
TE: Trailers
The request body:
-----------------------------428770732237854445893641225227
Content-Disposition: form-data; name="file"; filename="test2.fna"
Content-Type: application/octet-stream
>NC_000913.3 Escherichia coli str. K-12 substr. MG1655, complete genome
AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGCTTCTGAACTG
GTTACCTGCCGTGAGTAAATTAAAATTTTATTGACTTAGGTCACTAAATACTTTAACCAATATAGGCATAGCGCACAGAC
AGATAAAAATTACAGAGTACACAACATCCATGAAACGCATTAGCACCACCATTACCACCACCATCACCATTACCACAGGT
AACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACCTGACAGTGCGGGCTTTTTTTTTCGACCAAAGG
TAACGAGGTAACAACCATGCGAGTGTTGAAGTTCGGCGGTACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCG
ATATTCTGGAAAGCAATGCCAGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTG
GCGATGATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAACGTATTTTTGCCGAACTTTT
GACGGGACTCGCCGCCGCCCAGCCGGGGTTCCCGCTGGCGCAATTGAAAACTTTCGTCGATCAGGAATTTGCCCAAATAA
AACATGTCCTGCATGGCATTAGTTTGTTGGGGCAGTGCCCGGATAGCATCAACGCTGCGCTGATTTGCCGTGGCGAGAAA
ATGTCGATCGCCATTATGGCCGGCGTATTAGAAGCGCGCGGTCACAACGTTACTGTTATCGATCCGGTCGAAAAACTGCT
GGCAGTGGGGCATTACCTCGAATCTACCGTCGATATTGCTGAGTCCACCCGCCGTATTGCGGCAAGCCGCATTCCGGCTG
ATCACATGGTGCTGATGGCAGGTTTCACCGCCGGTAATGAAAAAGGCGAACTGGTGGTGCTTGGACGCAACGGTTCCGAC
TACTCTGCTGCGGTGCTGGCTGCCTGTTTACGCGCCGATTGTTGCGAGATTTGGACGGACGTTGACGGGGTCTATACCTG
CGACCCGCGTCAGGTGCCCGATGCGAGGTTGTTGAAGTCGATGTCCTACCAGGAAGCGATGGAGCTTTCCTACTTCGGCG
CTAAAGTTCTTCACCCCCGCACCATTACCCCCATCGCCCAGTTCCAGATCCCTTGCCTGATTAAAAATACCGGAAATCCT
CAAGCACCAGGTACGCTCATTGGTGCCAGCCGTGATGAAGACGAATTACCGGTCAAGGGCATTTCCAATCTGAATAACAT
GGCAATGTTCAGCGTTTCTGGTCCGGGGATGAAAGGGATGGTCGGCATGGCGGCGCGCGTCTTTGCAGCGATGTCACGCG
CCCGTATTTCCGTGGTGCTGATTACGCAATCATCT
Response Text:
Invalid request. There were 8388843 byte(s) (or more) in the request body. There should have been 8388608 byte(s) (starting at offset 0 and ending at offset 8388607) according to the Content-Range header.
As you can see, the first 4 lines are NOT from the file.
Additional Information:
The file is pure text data (UTF-8 encoded, I guess).
I'm chunking in 8 MiB chunks (recommended by Google).
Somewhere in Google's API guide I read that when uploading via PUT the request body must contain nothing except the file data.
The Content-Length header gets added automatically by Dropzone.
My Questions:
Where do those 4 lines come from?
Can I (re)set the XHR request body?
Is my problem maybe caused by some formatting problem of the file data (like bytes vs. strings)?
If that's not the problem, do you have any other idea what it could be?
Thank you very much!
Any help appreciated! If you need more information, please ask!
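The numbers in the response text give a hint: 8388843 - 8388608 = 235 extra bytes, which fits the multipart/form-data boundary lines and part headers that wrap the chunk (the request's Content-Type is multipart/form-data). Below is a minimal sketch of the byte bookkeeping, assuming the chunk is sent as the raw request body. contentRangeFor and expectedBodyLength are hypothetical helpers, not Dropzone APIs, and 13088353 is an assumed file size used only for illustration. The value after the slash in Content-Range is the total size in bytes, not the last offset.

```javascript
// Hypothetical helper (not a Dropzone API): builds the Content-Range value
// for one chunk of a resumable upload.
function contentRangeFor(chunkIndex, chunkSize, chunkByteLength, totalSize) {
  const first = chunkIndex * chunkSize;
  const last = first + chunkByteLength - 1; // the last byte offset is inclusive
  return `bytes ${first}-${last}/${totalSize}`;
}

// Number of body bytes the server expects for an inclusive byte range.
function expectedBodyLength(first, last) {
  return last - first + 1;
}

const chunkSize = 8 * 1024 * 1024; // 8388608 bytes
const assumedFileSize = 13088353; // assumption, for illustration only

console.log(contentRangeFor(0, chunkSize, chunkSize, assumedFileSize));
// Bytes in the reported body that are multipart framing rather than file data:
console.log(8388843 - expectedBodyLength(0, 8388607));
```

How to make Dropzone send the bare chunk instead of a multipart body depends on the Dropzone version and is not shown here.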

Related

Cannot recognize image upload at Google Drive in js

I'm trying to upload an image file to Google Drive, using an OAuth token and fetch.
https://developers.google.com/drive/api/v3/manage-uploads
Perform a multipart upload, HTTP.
When I try to upload, the fetch response returns status 200 and the file shows up in Google Drive, but it cannot be displayed (the image is reported as an unsupported format).
These are my headers:
method: post
Authorization: `Bearer ${token}`
Content-Type: `multipart/related; boundary=${boundaryString}`
Content-Length: ${body.Length}
and this is my body:
--`${boundaryString}`
Content-Type: application/json; charset=UTF-8
{"name":"myimage.png","description":"Upload image","mimeType":"image/png"}
--`${boundaryString}`
Content-Type: image/png; Content-Transfer-Encoding: base64
data:image/png;base64,iVBO......TkSuQmCC
--`${boundaryString}`--
response :
status: 200 url: "https://www.googleapis.com/upload/drive/v3/files?uploadType=multipart"
body: {
id: "~~~~"
kind: "drive#file"
mimeType: "image/png"
name: "myimage.png"
When I go to Drive, the file exists and its details are correct (name, description, mimeType),
but it is not recognized like other images (file format is not supported).
When I check <img src ="data:image/png;base64,iVBO......TkSuQmC" /> it works.
Could you tell me what the problem is?
How about this modification?
Modification points:
Please remove Content-Type: image/png; from the data part.
Please remove the data URL prefix (data:image/png;base64,) from the base64 data.
When the above points are reflected in your request body, it becomes as follows.
Modified request body:
--`${boundaryString}`
Content-Type: application/json; charset=UTF-8

{"name":"myimage.png","description":"Upload image","mimeType":"image/png"}
--`${boundaryString}`
Content-Transfer-Encoding: base64

iVBO......TkSuQmCC
--`${boundaryString}`--
Note:
In this case, the line breaks are important. Please be careful about this.
This modification assumes that your access token can be used for uploading the file to Google Drive.
Although I'm not sure about your actual script, if you use JavaScript, how about the following modified script?
var data = `--${boundaryString}
Content-Type: application/json; charset=UTF-8

{"name":"myimage.png","description":"Upload image","mimeType":"image/png"}
--${boundaryString}
Content-Transfer-Encoding: base64

iVBO......TkSuQmCC
--${boundaryString}--`;
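The two modification points can also be applied programmatically. This is only a sketch: buildDriveMultipartBody is a hypothetical helper name, the short base64 string is a placeholder rather than a real image, and the CRLF line endings are an assumption about what the endpoint expects.

```javascript
// Hypothetical helper: assembles a multipart/related body following the
// modification points above (no data URL prefix, no image Content-Type).
function buildDriveMultipartBody(metadata, dataUrl, boundary) {
  // Drop a leading "data:<mime>;base64," prefix if one is present.
  const base64 = dataUrl.replace(/^data:[^;,]+;base64,/, "");
  return [
    `--${boundary}`,
    "Content-Type: application/json; charset=UTF-8",
    "", // an empty line separates part headers from part content
    JSON.stringify(metadata),
    `--${boundary}`,
    "Content-Transfer-Encoding: base64",
    "",
    base64,
    `--${boundary}--`,
  ].join("\r\n");
}

const body = buildDriveMultipartBody(
  { name: "myimage.png", description: "Upload image", mimeType: "image/png" },
  "data:image/png;base64,iVBORw0KGgo=", // placeholder data, not a full PNG
  "example_boundary"
);
console.log(body);
```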

API Data Returning Unicode Characters in Console

I have been facing a rather confusing problem for the last two days. I am working on a document management system that uses an API which pulls in data from SOLR. The response is around 15 MB and contains records for more than 4000 documents. The API responds in this format:
{
"documents": [
{
id: 123,
some_field: "abcd",
some_other_field: "abcdef"
},
{
id: 124,
some_field: "abcd1",
some_other_field: "abcdef1"
}
]
}
Everything works fine in browser. If I hit the endpoint in Chrome or Firefox browser, it gives me the correct output and I am able to see the JSON output.
However, if I hit the same API endpoint from Java or JS code, the response code is 200, but the output in the console (Terminal or Eclipse) shows Unicode characters like \u0089 \u0078 U+0080. All of the output comes out this way, and since the API fetches more than 4000 records, the console fills up with these characters.
The only difference I see between the requests made from the browser and from the code is that in the browser I can see Content-Encoding: gzip, while I cannot find this header from the code that I have written. For example, in JS code, using the Chakram framework, I can check
expect(response).to.be.encoded.with.gzip
mentioned here. However, this returns a failure stating expected undefined to match gzip
What am I missing here? Is this something related to encoding/decoding or something entirely different?
Edit 1 : The Response Headers as seen in Network tab of Chrome :
cache-control: max-age=0, private, must-revalidate, max-age=315360000
content-encoding: gzip
content-type: application/json; charset=utf-8
date: Tue, 22 May 2018 06:07:26 GMT
etag: "a07eb7c1eef4ab97699afc8d61fb9c5d"
expires: Fri, 19 May 2028 06:07:26 GMT
p3p: CP="NON CUR OTPi OUR NOR UNI"
server: Apache
Set-Cookie : some_cookie
status: 200 OK
strict-transport-security:
transfer-encoding: chunked
vary: Accept-Encoding
x-content-type-options: nosniff
x-frame-options: SAMEORIGIN
x-request-id: abceefr4-1234-acds-100b-d2bef2413r47
x-runtime: 3.213943
x-ua-compatible: chrome=1
x-xss-protection: 1; mode=block
The Request Headers as seen in Network tab of Chrome
Accept: application/json, text/plain, */*
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.9
Connection: keep-alive
Cookie: some_cookie
Host: abcd.bcd.com
IV_USER: demouser123
IV_USER_L: demouser123
MAIL: demouser#f.com
PERSON_ID: 123
Referer: http://abcd.bcd.com
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36
X-CSRF-TOKEN: some_csrf_token
Edit 2 : The tests that I am using
describe('Hits required API',()=>{
  before(()=>{
    return chakram.wait(api_response = chakram.get(url,options));
  });
  it('displays response',()=>{
    return api_response.then((t_resp)=>{
      console.log(JSON.stringify(t_resp));
      expect(t_resp).to.have.header('Content-Encoding','gzip');
    });
  });
});
This has nothing to do with character encoding. Web servers commonly compress responses with gzip to save bandwidth, since it is wasteful to transfer the whole 15 MB payload as is. See this article for more about gzip and how it works ( https://betterexplained.com/articles/how-to-optimize-your-site-with-gzip-compression/ ).
So what went wrong, and why did it work in Chrome? Chrome has a built-in Unicode parser (and even an HTML parser) in its DevTools, which shows you the parsed content rather than the weird text (the same can be seen in the Response tab next to the Preview tab).
You see weird text because you are stringifying the response, which escapes special characters if there are any: console.log(JSON.stringify(t_resp));. Something like console.log("response", t_resp); without stringifying will not help in a terminal either, since the terminal has no JSON or Unicode parser and just prints text. Try removing that console.log, since stringifying a 15 MB response is a costly operation.
Edit 1:-
If you still want to print the output to the console, here is what to do.
Since Node cannot decode gzip by default (and neither does Chakram, which is just an API testing platform), you can use zlib to do this. Here is an example snippet:
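const zlib = require('zlib');
describe('Hits required API',()=>{
  before(()=>{
    return chakram.wait(api_response = chakram.get(url,options));
  });
  it('displays response',()=>{
    return api_response.then((t_resp)=>{
      // Assumption: t_resp.body is a Buffer holding the raw gzip bytes
      // (for example because the request was made with encoding: null).
      zlib.gunzip(t_resp.body, function(err, dezipped) {
        if (err) throw err;
        console.log(dezipped.toString('utf8'));
      });
    });
  });
});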
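To see the effect in isolation, here is a small self-contained sketch that needs no server. It compresses a sample JSON string with zlib, shows that printing the compressed bytes as text is unreadable, and recovers the JSON with gunzip. The sample document is made up for illustration.

```javascript
const zlib = require("zlib");

// Stand-in for the bytes a server sends with Content-Encoding: gzip.
const original = JSON.stringify({ documents: [{ id: 123, some_field: "abcd" }] });
const compressed = zlib.gzipSync(Buffer.from(original, "utf8"));

// Treating the compressed bytes as text gives unreadable output ...
const garbled = compressed.toString("utf8");

// ... while decompressing first recovers the JSON.
const restored = zlib.gunzipSync(compressed).toString("utf8");
console.log(JSON.parse(restored).documents[0].id);
```

It may also be worth checking whether the HTTP client can do this for you. As far as I know, the request library that Chakram builds on documents a gzip: true option that decodes compressed responses.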
Try with console.dir to display your values
describe('Hits required API',()=>{
  before(()=>{
    return chakram.wait(api_response = chakram.get(url,options));
  });
  it('displays response',()=>{
    return api_response.then((t_resp)=>{
      console.dir(t_resp, { depth: null });
    });
  });
});
Console.dir

How is an image sent when submitting a form? [duplicate]

When I submit a simple form like this with a file attached:
<form enctype="multipart/form-data" action="http://localhost:3000/upload?upload_progress_id=12344" method="POST">
<input type="hidden" name="MAX_FILE_SIZE" value="100000" />
Choose a file to upload: <input name="uploadedfile" type="file" /><br />
<input type="submit" value="Upload File" />
</form>
How does it send the file internally? Is the file sent as part of the HTTP body as data? In the headers of this request, I don't see anything related to the name of the file.
I would just like to know the internal workings of HTTP when sending a file.
Let's take a look at what happens when you select a file and submit your form (I've truncated the headers for brevity):
POST /upload?upload_progress_id=12344 HTTP/1.1
Host: localhost:3000
Content-Length: 1325
Origin: http://localhost:3000
... other headers ...
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryePkpFF7tjBAqx29L

------WebKitFormBoundaryePkpFF7tjBAqx29L
Content-Disposition: form-data; name="MAX_FILE_SIZE"

100000
------WebKitFormBoundaryePkpFF7tjBAqx29L
Content-Disposition: form-data; name="uploadedfile"; filename="hello.o"
Content-Type: application/x-object

... contents of file goes here ...
------WebKitFormBoundaryePkpFF7tjBAqx29L--
NOTE: each boundary string must be prefixed with an extra --, just like at the end of the last boundary string. The example above already includes this, but it can be easy to miss. See the comment by @Andreas below.
Instead of URL encoding the form parameters, the form parameters (including the file data) are sent as sections in a multipart document in the body of the request.
In the example above, you can see the input MAX_FILE_SIZE with the value set in the form, as well as a section containing the file data. The file name is part of the Content-Disposition header.
The full details are here.
How does it send the file internally?
The format is called multipart/form-data, as asked at: What does enctype='multipart/form-data' mean?
I'm going to:
add some more HTML5 references
explain why he is right with a form submit example
HTML5 references
There are three possibilities for enctype:
application/x-www-form-urlencoded
multipart/form-data (spec points to RFC2388)
text/plain. This is "not reliably interpretable by computer", so it should never be used in production, and we will not look further into it.
How to generate the examples
Once you see an example of each method, it becomes obvious how they work, and when you should use each one.
You can produce examples using:
nc -l or an ECHO server: HTTP test server accepting GET/POST requests
a user agent like a browser or cURL
Save the form to a minimal .html file:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8"/>
<title>upload</title>
</head>
<body>
<form action="http://localhost:8000" method="post" enctype="multipart/form-data">
<p><input type="text" name="text1" value="text default">
<p><input type="text" name="text2" value="a&#x03C9;b">
<p><input type="file" name="file1">
<p><input type="file" name="file2">
<p><input type="file" name="file3">
<p><button type="submit">Submit</button>
</form>
</body>
</html>
We set the default text value to a&#x03C9;b, which means aωb because ω is U+03C9; in UTF-8, aωb is the bytes 61 CF 89 62.
Create files to upload:
echo 'Content of a.txt.' > a.txt
echo '<!DOCTYPE html><title>Content of a.html.</title>' > a.html
# Binary file containing 4 bytes: 'a', 0xCF, 0x89 and 'b'.
printf 'a\xCF\x89b' > binary
Run our little echo server:
while true; do printf '' | nc -l localhost 8000; done
Open the HTML in your browser, select the files, click submit, and check the terminal.
nc prints the request received.
Tested on: Ubuntu 14.04.3, nc BSD 1.105, Firefox 40.
multipart/form-data
Firefox sent:
POST / HTTP/1.1
[[ Less interesting headers ... ]]
Content-Type: multipart/form-data; boundary=---------------------------735323031399963166993862150
Content-Length: 834

-----------------------------735323031399963166993862150
Content-Disposition: form-data; name="text1"

text default
-----------------------------735323031399963166993862150
Content-Disposition: form-data; name="text2"

aωb
-----------------------------735323031399963166993862150
Content-Disposition: form-data; name="file1"; filename="a.txt"
Content-Type: text/plain

Content of a.txt.

-----------------------------735323031399963166993862150
Content-Disposition: form-data; name="file2"; filename="a.html"
Content-Type: text/html

<!DOCTYPE html><title>Content of a.html.</title>

-----------------------------735323031399963166993862150
Content-Disposition: form-data; name="file3"; filename="binary"
Content-Type: application/octet-stream

aωb
-----------------------------735323031399963166993862150--
For the binary file and text field, the bytes 61 CF 89 62 (aωb in UTF-8) are sent literally. You could verify that with nc -l localhost 8000 | hd, which says that the bytes:
61 CF 89 62
were sent (61 == 'a' and 62 == 'b').
Therefore it is clear that:
Content-Type: multipart/form-data; boundary=---------------------------735323031399963166993862150 sets the content type to multipart/form-data and says that the fields are separated by the given boundary string.
But note that the:
boundary=---------------------------735323031399963166993862150
has two fewer dashes (--) than the actual boundary delimiter
-----------------------------735323031399963166993862150
This is because the standard requires the boundary to start with two dashes --. The other dashes appear to be just how Firefox chose to implement the arbitrary boundary. RFC 7578 clearly mentions that those two leading dashes -- are required:
4.1. "Boundary" Parameter of multipart/form-data
As with other multipart types, the parts are delimited with a
boundary delimiter, constructed using CRLF, "--", and the value of
the "boundary" parameter.
Every field gets some sub-headers before its data: Content-Disposition: form-data;, the field name, the filename, followed by the data.
The server reads the data until the next boundary string. The browser must choose a boundary that will not appear in any of the fields, so this is why the boundary may vary between requests.
Because we have the unique boundary, no encoding of the data is necessary: binary data is sent as is.
TODO: what is the optimal boundary size (log(N) I bet), and name / running time of the algorithm that finds it? Asked at: https://cs.stackexchange.com/questions/39687/find-the-shortest-sequence-that-is-not-a-sub-sequence-of-a-set-of-sequences
Content-Type is automatically determined by the browser.
How it is determined exactly was asked at: How is mime type of an uploaded file determined by browser?
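The delimiter rule described above (CRLF, "--", then the boundary value) can be sketched in a few lines. buildFormData and splitParts are hypothetical helpers written for this illustration. They handle text fields only and do no escaping of field names, so they are not a replacement for the browser's FormData.

```javascript
// Hypothetical helper: serializes text fields as multipart/form-data.
function buildFormData(fields, boundary) {
  const parts = Object.entries(fields).map(
    ([name, value]) =>
      `--${boundary}\r\n` +
      `Content-Disposition: form-data; name="${name}"\r\n` +
      `\r\n` + // empty line between the part headers and the part data
      `${value}\r\n`
  );
  return parts.join("") + `--${boundary}--\r\n`;
}

// Splits a body back into raw parts by looking for "--" + boundary.
function splitParts(body, boundary) {
  return body
    .split(`--${boundary}`)
    .slice(1, -1) // drop the empty preamble and the closing "--" remainder
    .map((part) => part.replace(/^\r\n/, "").replace(/\r\n$/, ""));
}

const boundary = "---------------------------735323031399963166993862150";
const body = buildFormData({ text1: "text default", text2: "aωb" }, boundary);
const parts = splitParts(body, boundary);
console.log(parts);
```

Note that the delimiter lines in the body are the boundary parameter with two extra leading dashes, which matches the Firefox output shown earlier.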
application/x-www-form-urlencoded
Now change the enctype to application/x-www-form-urlencoded, reload the browser, and resubmit.
Firefox sent:
POST / HTTP/1.1
[[ Less interesting headers ... ]]
Content-Type: application/x-www-form-urlencoded
Content-Length: 51
text1=text+default&text2=a%CF%89b&file1=a.txt&file2=a.html&file3=binary
Clearly the file data was not sent, only the basenames. So this cannot be used for files.
As for the text field, we see that usual printable characters like a and b were sent in one byte, while non-printable ones like 0xCF and 0x89 took up 3 bytes each: %CF%89!
Comparison
File uploads often contain lots of non-printable characters (e.g. images), while text forms almost never do.
From the examples we have seen that:
multipart/form-data: adds a few bytes of boundary overhead to the message, and must spend some time calculating it, but sends each byte in one byte.
application/x-www-form-urlencoded: has a single byte boundary per field (&), but adds a linear overhead factor of 3x for every non-printable character.
Therefore, even if we could send files with application/x-www-form-urlencoded, we wouldn't want to, because it is so inefficient.
But for printable characters found in text fields, it does not matter and generates less overhead, so we just use it.
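The size difference can be checked with a rough calculation. The sketch below uses encodeURIComponent for the text field and a hypothetical urlencodedSize helper that approximates the form-urlencoded rules (letters, digits and * - . _ stay one byte, a space becomes +, everything else becomes %XX).

```javascript
const text = "aωb";
const rawBytes = Buffer.byteLength(text, "utf8"); // size when sent as-is in multipart/form-data
const encoded = encodeURIComponent(text); // percent-encoded form of the same text
console.log(rawBytes, encoded, encoded.length);

// Hypothetical helper: approximate size of a byte sequence once
// it is application/x-www-form-urlencoded.
function urlencodedSize(bytes) {
  let size = 0;
  for (const b of bytes) {
    const ch = String.fromCharCode(b);
    const oneByte = ch === " " || /[A-Za-z0-9*\-._]/.test(ch);
    size += oneByte ? 1 : 3; // every other byte is written as "%XX"
  }
  return size;
}

console.log(urlencodedSize(Buffer.from("text default"))); // printable text: no growth
console.log(urlencodedSize(Buffer.from([0xcf, 0x89, 0xff, 0x00]))); // binary data: 3x
```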
Send file as binary content (upload without form or FormData)
In the given answers/examples the file is (most likely) uploaded with a HTML form or using the FormData API. The file is only a part of the data sent in the request, hence the multipart/form-data Content-Type header.
If you want to send the file as the only content then you can directly add it as the request body and you set the Content-Type header to the MIME type of the file you are sending. The file name can be added in the Content-Disposition header. You can upload like this:
var xmlHttpRequest = new XMLHttpRequest();
var file = ...file handle...
var fileName = ...file name...
var target = ...target...
var mimeType = ...mime type...
xmlHttpRequest.open('POST', target, true);
xmlHttpRequest.setRequestHeader('Content-Type', mimeType);
xmlHttpRequest.setRequestHeader('Content-Disposition', 'attachment; filename="' + fileName + '"');
xmlHttpRequest.send(file);
If you don't (want to) use forms and you are only interested in uploading one single file this is the easiest way to include your file in the request.
Update:
In all modern browsers you can these days also use the fetch API for (binary) upload. The same as mentioned in the example above would then look like this:
const promise = fetch(target, {
method: 'POST',
body: file,
headers: {
'Content-Type': mimeType,
'Content-Disposition': `attachment; filename="${fileName}"`,
},
});
promise.then(
(response) => { /*...do something with response*/ },
(error) => { /*...handle error*/ },
);
I have this sample Java Code:
import java.io.*;
import java.net.*;
import java.nio.charset.StandardCharsets;
public class TestClass {
public static void main(String[] args) throws IOException {
ServerSocket socket = new ServerSocket(8081);
Socket accept = socket.accept();
InputStream inputStream = accept.getInputStream();
InputStreamReader inputStreamReader = new InputStreamReader(inputStream, StandardCharsets.UTF_8);
int readChar; // read() returns an int so that -1 (end of stream) can be detected
while ((readChar = inputStreamReader.read()) != -1) {
System.out.print((char) readChar);
}
inputStream.close();
accept.close();
System.exit(1);
}
}
and I have this test.html file:
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>File Upload!</title>
</head>
<body>
<form method="post" action="http://localhost:8081" enctype="multipart/form-data">
<input type="file" name="file" id="file">
<input type="submit">
</form>
</body>
</html>
and finally the file I will be using for testing purposes, named a.dat has the following content:
0x39 0x69 0x65
If you interpret the bytes above as ASCII or UTF-8 characters, they represent:
9ie
So let's run our Java code, open test.html in our favorite browser, upload a.dat, submit the form, and see what our server receives:
POST / HTTP/1.1
Host: localhost:8081
Connection: keep-alive
Content-Length: 196
Cache-Control: max-age=0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Origin: null
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.97 Safari/537.36
Content-Type: multipart/form-data; boundary=----WebKitFormBoundary06f6g54NVbSieT6y
DNT: 1
Accept-Encoding: gzip, deflate
Accept-Language: en,en-US;q=0.8,tr;q=0.6
Cookie: JSESSIONID=27D0A0637A0449CF65B3CB20F40048AF

------WebKitFormBoundary06f6g54NVbSieT6y
Content-Disposition: form-data; name="file"; filename="a.dat"
Content-Type: application/octet-stream

9ie
------WebKitFormBoundary06f6g54NVbSieT6y--
Well I am not surprised to see the characters 9ie because we told Java to print them treating them as UTF-8 characters. You may as well choose to read them as raw bytes..
Cookie: JSESSIONID=27D0A0637A0449CF65B3CB20F40048AF
is actually the last HTTP Header here. After that comes the HTTP Body, where meta and contents of the file we uploaded actually can be seen.
An HTTP message may have a body of data sent after the header lines. In a response, this is where the requested resource is returned to the client (the most common use of the message body), or perhaps explanatory text if there's an error. In a request, this is where user-entered data or uploaded files are sent to the server.
http://www.tutorialspoint.com/http/http_messages.htm
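The split between headers and body can be made explicit with a few lines of JavaScript. splitHttpMessage is a hypothetical helper for illustration, and the sample request is an abbreviated version of the one captured above. The only rule it relies on is that the first empty line (CRLF CRLF) ends the header section.

```javascript
// Hypothetical helper: splits a raw HTTP/1.1 message at the first empty line.
function splitHttpMessage(raw) {
  const sep = raw.indexOf("\r\n\r\n");
  if (sep === -1) return { startLine: raw, headers: {}, body: "" };
  const [startLine, ...headerLines] = raw.slice(0, sep).split("\r\n");
  const headers = {};
  for (const line of headerLines) {
    const colon = line.indexOf(":");
    headers[line.slice(0, colon).trim().toLowerCase()] = line.slice(colon + 1).trim();
  }
  return { startLine, headers, body: raw.slice(sep + 4) };
}

// Abbreviated version of the captured request.
const raw = [
  "POST / HTTP/1.1",
  "Host: localhost:8081",
  "Content-Type: multipart/form-data; boundary=----WebKitFormBoundary06f6g54NVbSieT6y",
  "",
  "------WebKitFormBoundary06f6g54NVbSieT6y",
  'Content-Disposition: form-data; name="file"; filename="a.dat"',
  "Content-Type: application/octet-stream",
  "",
  "9ie",
  "------WebKitFormBoundary06f6g54NVbSieT6y--",
  "",
].join("\r\n");

const message = splitHttpMessage(raw);
console.log(message.headers);
console.log(message.body);
```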

Express JS 4.0, serve binary data, request Accept header changes output

Thanks in advance.
Short:
Express JS 4.0 alters the output data depending on the Accept header in the request.
Is there a way for me to override this behaviour and just write the same data regardless of the request headers?
When Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8 is present, the output is changed.
Is there a way I can ignore, remove, or override these headers?
Long (probably tl;dr):
I am trying to serve binary data from a Node/ExpressJS app.
I am storing a compressed log file (text/plain) that has been gzipped, base64 encoded and sent to my server app, where it is stored in a Mongo database using Mongoose. I know this is probably not optimal, but it is currently a necessary evil. This is working fine.
$(gzip --stdout /var/log/cloud-init-script.log | base64 --wrap=0)
This is used to compress and base64-encode the data before it is sent with other data as part of a JSON POST.
The problem occurs when I attempt to retrieve, decode the base64 encoded string and send to the browser as a binary gzip file.
// node, referring to the machine the log came from
var log = new Buffer(node.log, 'base64');
res.setHeader('Content-Disposition', 'attachment; filename=' + node.name + "-log.gz");
res.setHeader('Content-Type', 'application/x-gzip');
res.setHeader('Content-Length', log.length);
console.log(log.toString('hex'));
// res.end(log, 'binary'); I tried this hoping I could by pass, some content-negotiation
res.send(log);
I had this working with ExpressJS 3.0 using res.send.
But when I updated to ExpressJS 4.0, the downloaded data ceased to extract properly. The data being pulled down was seemingly corrupted somehow.
I started to try and fix this by comparing the downloaded file and the source file in hexadecimal output using xxd or od, and found that the downloaded file was different from the source. I also dumped the hex of the NodeJS Buffer to the console just before it is sent to the client, and this matches the source.
I have been banging my head against this issue for nearly a day now, and suspected that NodeJS might be doing something funky with character encoding (UTF-8 vs. Buffer vs. UTF-16 strings) or OS endianness.
Eventually, finding none of this to be the problem, I assumed NodeJS had always been outputting the wrong data to the browser. It was outputting wrong data, but not always.
I had a breakthrough when I did a curl request to the endpoint and the data came through as expected (matching the source). I then added the request headers that were sent with my browser requests and got back the mangled data.
Actual log file:
I'm a log file
Good Request:
> User-Agent: curl/7.37.1
> Host: 127.0.0.1:9000
> Accept: */*
>
< HTTP/1.1 200 OK
< X-Powered-By: Express
< Last-Modified: Tue, 26 May 2015 11:47:46 GMT
< Content-Description: File Transfer
< Content-Disposition: attachment; filename=test-log.gz
< Content-Type: application/x-gzip
< Content-Transfer-Encoding: binary
< Content-Length: 57
< Date: Tue, 26 May 2015 11:47:46 GMT
< Connection: keep-alive
0000000: 1f8b 0808 0256 6455 0003 636c 6f75 642d .....VdU..cloud-
0000010: 696e 6974 2d73 6372 6970 742e 6c6f 6700 init-script.log.
0000020: f354 cf55 4854 c8c9 4f57 48cb cc49 e502 .T.UHT..OWH..I..
0000030: 003b 5ff5 5f0f 0000 00 .;_._....
Bad Request:
> Host: localhost:9000
> Connection: keep-alive
> Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
> User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.65 Safari/537.36
> Referer: http://localhost:9000/nodes?query=environment%3D5549b6cbdc023b5e26fe6bd4%20type%3Dnat
> Accept-Language: en-US,en;q=0.8
>
< HTTP/1.1 200 OK
< X-Powered-By: Express
< Last-Modified: Tue, 26 May 2015 11:47:00 GMT
< Content-Description: File Transfer
< Content-Disposition: attachment; filename=test-log.gz
< Content-Type: application/x-gzip
< Content-Transfer-Encoding: binary
< content-length: 57
< Date: Tue, 26 May 2015 11:47:00 GMT
< Connection: keep-alive
0000000: 1ffd 0808 0256 6455 0003 636c 6f75 642d .....VdU..cloud-
0000010: 696e 6974 2d73 6372 6970 742e 6c6f 6700 init-script.log.
0000020: fd54 fd55 4854 fdfd 4f57 48fd fd49 fd02 .T.UHT..OWH..I..
0000030: 003b 5ffd 5f0f 0000 00 .;_._....
What worked for me in the end was calling
res.end(node.log, 'base64');
instead of
res.send(log);
Where node.log is the raw base64 encoded String and log was a Buffer that had decoded that string.
Bearing in mind I am using Node v0.10.38.
I ended up following the function call chain.
// I call
res.send(log);
// ExpressJS calls on http.ServerResponse
this.end(chunk, encoding); // chunk = Buffer, encoding = undefined
// NodeJS http.ServerResponse calls
res.inject(string);
At this point NodeJS appears to be treating the data as a string, which is where the buffer contents were being mangled.
This behaviour was different when the 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8' header was not present, a different end(chunk, encoding) function was being called in this case, not using res.inject and not mangling the Buffer data.
I am not entirely sure where the content negotiation is happening and what is swapping in the different res.end functions, whether this is NodeJS or ExpressJS, but it would be nice to be able to control this content negotiation in some simple way.
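The two hex dumps are consistent with one particular kind of string round trip: the Buffer is decoded as UTF-8, where every invalid byte sequence becomes U+FFFD, and the resulting string is then written with one byte per character, which keeps only the low byte 0xFD. The sketch below reproduces that pattern on the third row of the good dump. It is only a reading of the dumps, not a confirmed trace of what Node v0.10 or Express does internally.

```javascript
// Third row (offset 0x20) of the good dump above.
const good = Buffer.from("f354cf554854c8c94f5748cbcc49e502", "hex");

// Decode as UTF-8 (invalid sequences become U+FFFD), then write each
// character's low byte, which is what the "latin1"/"binary" encoding does.
const asString = good.toString("utf8");
const mangled = Buffer.from(asString, "latin1");

console.log(good.toString("hex"));
console.log(mangled.toString("hex")); // compare with the third row of the bad dump
```

The length is unchanged because each damaged byte is replaced by exactly one byte, which would also explain why Content-Length is 57 in both responses.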

image upload to web service in JavaScript

I need to upload an image to a web service from JavaScript. I have to send a JSON string and a file (image). In Java we have MultipartEntity. I have the following code in Java:
HttpPost post = new HttpPost( aWebImageUrl2 );
MultipartEntity entity = new MultipartEntity( HttpMultipartMode.BROWSER_COMPATIBLE );
// For File parameters
entity.addPart( "picture", new FileBody((( File ) imgPath )));
// For usual String parameters
entity.addPart( "url", new StringBody( aImgCaption, "text/plain", Charset.forName( "UTF-8" )));
post.setEntity( entity );
Now I need to do the same image upload in JavaScript.
But in JavaScript I didn't find any equivalent of MultipartEntity. Please suggest a solution.
For uploading images I use either Valum's ajax upload plugin or jQuery form plugin that allows to submit a normal form in an ajax way.
If you will use POST requests then don't forget to use MAX_FILE_SIZE hidden attribute:
<input type="hidden" name="MAX_FILE_SIZE" value="20000000">
Note that it must precede the file input field. It is in bytes, so this will limit the upload to 20MB. See PHP documentation for details.
Assuming that your Java code is using Apache HttpComponents (what you really should have said then), your code, when augmented with
URI aWebImageUrl2 = new URI("http://localhost:1337/");
File imgPath = new File("…/face.png");
final String aImgCaption = "face";
// …
HttpClient httpClient = new DefaultHttpClient();
httpClient.execute(post);
submits the following example HTTP request (as tested with nc -lp 1337, see GNU Netcat):
POST / HTTP/1.1
Content-Length: 990
Content-Type: multipart/form-data; boundary=oQ-4zTK_UL007ymPgBL2VYESjvFwy4cN8C-F
Host: localhost:1337
Connection: Keep-Alive
User-Agent: Apache-HttpClient/4.1.2 (java 1.5)
--oQ-4zTK_UL007ymPgBL2VYESjvFwy4cN8C-F
Content-Disposition: form-data; name="picture"; filename="face.png"
Content-Type: application/octet-stream
�PNG[…]
The simplest solution to do something like this in HTML is, of course, to use a FORM element and no or minimal client-side scripting:
<form action="http://service.example/" method="POST"
enctype="multipart/form-data">
<input type="file" name="picture">
<input type="submit">
</form>
which submits (either when submitted with the submit button or the form object's submit() method) the following example request:
POST / HTTP/1.1
Host: localhost:1337
Connection: keep-alive
Content-Length: 886
Cache-Control: max-age=0
Origin: http://localhost
User-Agent: Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.835.202 Safari/535.1
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryhC26St5JdG0WUaCi
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Referer: http://localhost/scripts/test/XMLHTTP/file.html
Accept-Encoding: gzip,deflate,sdch
Accept-Language: de-CH,de;q=0.8,en-US;q=0.6,en;q=0.4
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
------WebKitFormBoundaryhC26St5JdG0WUaCi
Content-Disposition: form-data; name="picture"; filename="face.png"
Content-Type: image/png
�PNG[…]
But since you have asked explicitly about a "javascript" solution (there really is no such programming language), I presume that you want to have more client-side control over the submit process. In that case, you can use the W3C File API and XMLHttpRequest or XMLHttpRequest2 APIs as provided by recent browsers (not the programming languages):
<script type="text/javascript">
  function isHostMethod(obj, property)
  {
    if (!obj)
    {
      return false;
    }
    var t = typeof obj[property];
    return (/\bunknown\b/i.test(t) || /\b(object|function)\b/i.test(t) && obj[property]);
  }
  var global = this;
  function handleSubmit(f)
  {
    if (isHostMethod(global, "XMLHttpRequest"))
    {
      try
      {
        var input = f.elements["myfile"];
        var file = input.files[0];
        var x = new XMLHttpRequest();
        x.open("POST", f.action, false); // ¹
        try
        {
          var formData = new FormData();
          formData.append("picture", file);
          x.send(formData);
          return false;
        }
        catch (eFormData)
        {
          try
          {
            var reader = new FileReader();
            reader.onload = function (evt) {
              var boundary = "o" + Math.random();
              x.setRequestHeader(
                "Content-Type", "multipart/form-data; boundary=" + boundary);
              x.send(
                "--" + boundary + "\r\n"
                + 'Content-Disposition: form-data; name="picture"; filename="' + file.name + '"\r\n'
                + 'Content-Type: application/octet-stream\r\n\r\n'
                + evt.target.result
                + '\r\n--' + boundary + '--\r\n');
            };
            reader.readAsBinaryString(file);
            return false;
          }
          catch (eFileReader)
          {
          }
        }
      }
      catch (eFileOrXHR)
      {
      }
    }
    return true;
  }
</script>
<form action="http://service.example/" method="POST"
enctype="multipart/form-data"
onsubmit="return handleSubmit(this)">
<input type="file" name="myfile">
<input type="submit">
</form>
This approach tries to use the XMLHttpRequest API. If that fails, the function returns true, so true is returned to the event handler (see the attribute value), and the form is submitted the usual way (the latter might not work with your Web service; test before use by disabling script support).
If XMLHttpRequest can be used, it is "tested"² if the file input has a files property and the object referred to by that has a 0 property (referring to the first selected File for that form control, if supported).
If yes, the XMLHttpRequest2 API is tried, whose send() method can take a reference to a FormData object and do all the multi-part magic by itself. If the XMLHttpRequest2 API is not supported (which should throw an exception), the File API's FileReader is tried, which can read the contents of a File as a binary string (readAsBinaryString()); if that is successful (onload), the request is prepared and submitted. If one of those approaches seemingly worked, the form is not submitted (return false).
Example request submitted with this code using the FormData API:
POST / HTTP/1.1
Host: localhost:1337
Connection: keep-alive
Content-Length: 887
Origin: http://localhost
User-Agent: Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.835.202 Safari/535.1
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryLIXsjWnCpVbD8FVA
Accept: */*
Referer: http://localhost/scripts/test/XMLHTTP/file.html
Accept-Encoding: gzip,deflate,sdch
Accept-Language: de-CH,de;q=0.8,en-US;q=0.6,en;q=0.4
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
------WebKitFormBoundaryLIXsjWnCpVbD8FVA
Content-Disposition: form-data; name="picture"; filename="face.png"
Content-Type: image/png
�PNG[…]
The example request looks slightly different when the FileReader API was used instead (just as proof of concept):
POST / HTTP/1.1
Host: localhost:1337
Connection: keep-alive
Content-Length: 1146
Origin: http://localhost
User-Agent: Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.835.202 Safari/535.1
Content-Type: multipart/form-data; boundary=o0.9578036249149591
Accept: */*
Referer: http://localhost/scripts/test/XMLHTTP/file.html
Accept-Encoding: gzip,deflate,sdch
Accept-Language: de-CH,de;q=0.8,en-US;q=0.6,en;q=0.4
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
--o0.9578036249149591
Content-Disposition: form-data; name="picture"; filename="face.png"
Content-Type: application/octet-stream
PNG[…]
Notice that the XMLHttpRequest2, FormData and File APIs have only Working Draft status and so are still in flux. Also, this approach works as-is only if the resource submitted from and the resource submitted to use the same protocol, domain, and port number; otherwise you have to deal with and work around the Same Origin Policy. Add feature tests and more exception handling as necessary.
Also notice that the request made using FileReader is larger for the same file and lacks the leading character, as indicated in the question referred to by Frits van Campen. This may be due to a (WebKit) bug, in which case you may want to remove this alternative; suffice it to say that the readAsBinaryString() method is already deprecated in the File API Working Draft in favor of readAsArrayBuffer(), which should use Typed Arrays.
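As a rough sketch of the readAsArrayBuffer() route mentioned above, the multipart body can be assembled as bytes rather than as a string, so that no character encoding step ever touches the file content. A plausible explanation for the larger request is that send() encodes a string argument as UTF-8, which turns every byte above 0x7F into two bytes. The helper name buildMultipartBody and its parameters are hypothetical, and TextEncoder is assumed to be available:

```javascript
// Hypothetical helper: builds a multipart/form-data body with one file part
// as a Uint8Array, so the file bytes are never treated as text.
function buildMultipartBody(boundary, fieldName, fileName, fileBytes) {
  var encoder = new TextEncoder(); // used only for the ASCII framing lines
  var head = encoder.encode(
    "--" + boundary + "\r\n"
    + 'Content-Disposition: form-data; name="' + fieldName
    + '"; filename="' + fileName + '"\r\n'
    + "Content-Type: application/octet-stream\r\n\r\n");
  var tail = encoder.encode("\r\n--" + boundary + "--\r\n");

  // Concatenate framing and file content byte by byte.
  var body = new Uint8Array(head.length + fileBytes.length + tail.length);
  body.set(head, 0);
  body.set(fileBytes, head.length);
  body.set(tail, head.length + fileBytes.length);
  return body;
}

// First four bytes of a PNG file; 0x89 is the byte that does not survive
// a round trip through a UTF-8 encoded string unchanged.
var pngStart = new Uint8Array([0x89, 0x50, 0x4e, 0x47]);
var body = buildMultipartBody("o0.123", "picture", "face.png", pngStart);

// For comparison: the same four bytes as a "binary string", encoded as UTF-8.
var asText = new TextEncoder().encode("\x89PNG");
```

In a reader.onload handler started with readAsArrayBuffer(file), new Uint8Array(evt.target.result) would take the place of pngStart, and the returned array would be passed to x.send() after setting the Content-Type header with the same boundary.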
See also "Using files from web applications".
¹ Use true for asynchronous handling; this avoids UI blocking, but requires you to do processing in the event listener, and you will always have to cancel form submission (even if XHR was unsuccessful).
² If the property access is not possible, an exception will be thrown. If you prefer a real test, implement (additional) feature-testing (instead), and be aware that not everything can be safely feature-tested.
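Both footnotes can be combined into a small sketch: an explicit feature test instead of relying on exceptions, and an asynchronous request whose outcome is handled in event listeners. The names canUseFormDataUpload and sendPictureAsync and the callback shape are hypothetical; the XMLHttpRequest object is passed in by the caller so that the helper makes no assumption about how it was created:

```javascript
// Hypothetical feature test: true if the APIs used below appear to exist
// on the given global object (e.g. window).
function canUseFormDataUpload(globalObject) {
  return typeof globalObject.XMLHttpRequest != "undefined"
    && typeof globalObject.FormData != "undefined";
}

// Hypothetical helper: sends `file` as form field "picture" asynchronously.
// `callback(error, status)` is called once the request has finished or failed.
function sendPictureAsync(xhr, url, file, callback) {
  var formData = new FormData();
  formData.append("picture", file);

  xhr.open("POST", url, true); // true: asynchronous, does not block the UI
  xhr.onload = function () {
    callback(null, xhr.status);
  };
  xhr.onerror = function () {
    callback(new Error("Request failed"));
  };
  xhr.send(formData);
}
```

In a submit handler like handleSubmit() above, one would call sendPictureAsync(new XMLHttpRequest(), f.action, input.files[0], ...) only if canUseFormDataUpload(global) is true, and then always return false, because the result of the request is not yet known when the handler returns.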
You can actually invoke a service using JavaScript; there is sample code for this here.
If your requirement is to upload the image and make the web service call from JS, then it could be tricky.
Alternatively, you can simply upload the image to a server and have the server call the web service; there are plenty of tools that help you upload a file to a server.
MultipartEntity sounds like Multipart/form-data.
You can use a regular XMLHttpRequest to make a POST request. You can use the HTML 5 FormData to build your Multipart/form-data request.
Here is an example: HTML5 File API readAsBinaryString reads files as much larger, different than files on disk
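A minimal sketch of that suggestion, mirroring the two parts of the Java code in the question (a file part named "picture" and a string part named "url"). The function name buildUploadForm and the sample values are hypothetical:

```javascript
// Hypothetical helper: collects the same two parts as the Java MultipartEntity.
function buildUploadForm(file, fileName, caption) {
  var formData = new FormData();
  formData.append("picture", file, fileName); // file part; third argument sets the filename
  formData.append("url", caption);            // plain string part
  return formData;
}

// Stand-in for a File taken from <input type="file">; any Blob works here.
var sampleFile = new Blob(["not really a PNG"], { type: "image/png" });
var form = buildUploadForm(sampleFile, "face.png", "face");
```

The resulting object would then be passed to send() of an XMLHttpRequest opened with POST. The Content-Type header should not be set by hand in that case, because the browser has to add the boundary it generated.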
I've done this before and it works, using HTML5's canvas element. I'll be using jQuery here. I'm assuming a generic image of 300px by 300px.
First, add a hidden canvas to your page :
$("body").append('<canvas id="theCanvas" style="display:none" width="300px" height="300px"></canvas>');
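Then, load the image to the canvas. The drawing has to happen in the image's load handler, because drawImage() draws nothing while the image is still loading:
var canvas = document.getElementById('theCanvas');
var context = canvas.getContext('2d');
var imageObj = new Image();
imageObj.onload = function () {
    context.drawImage(imageObj, 0, 0, 300, 300);
    postCanvas();
};
imageObj.src = "/path/to/image.jpg";
Once the image has been drawn, you can access what's on the canvas as a data string and post it to the web service using jQuery's post function (this is the postCanvas() function called above):
function postCanvas() {
    $.post("path/to/service", {'image': canvas.toDataURL("image/png"), 'url': 'caption'}, function (file) {
        // Callback code
    });
}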
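Note that what the service receives in the image field is a data URL, i.e. a string of the form data:image/png;base64,…, not raw image bytes, so it has to be decoded somewhere. A small sketch of that decoding step, written in JavaScript for illustration; the function name dataUrlToBytes is hypothetical, and only base64 data URLs like the ones produced by toDataURL() are handled:

```javascript
// Hypothetical helper: splits a base64 data URL into MIME type and raw bytes.
function dataUrlToBytes(dataUrl) {
  var match = /^data:([^;,]+);base64,(.*)$/.exec(dataUrl);
  if (!match) {
    throw new Error("Not a base64 data URL");
  }
  var binary = atob(match[2]); // decoded string with one character per byte
  var bytes = new Uint8Array(binary.length);
  for (var i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return { type: match[1], bytes: bytes };
}

// "iVBORw==" is the base64 encoding of the four bytes 0x89 'P' 'N' 'G'.
var decoded = dataUrlToBytes("data:image/png;base64,iVBORw==");
```

On the client, the decoded bytes could also be wrapped in a Blob and appended to a FormData object if the service expects a real file part rather than a base64 string.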
