Cannot encode/decode base64 a file using Node on Windows - javascript

I am trying to send a ZIP file over HTTP; to do that, I encode and decode it in Base64. Unfortunately, that is not working.
I have figured out the issue is actually in the encode/decode itself and was able to isolate and reproduce it.
Consider a simple piece of code which:
1. Reads a file from the filesystem.
2. Base64 encodes that file.
3. Base64 decodes the previously computed Base64 string into a binary stream and saves it into another file (which should be identical to the original).
const fs = require("fs");
const buffer = fs.readFileSync("C:/users/Public/myzip.zip"); // 1. read
const base64data = buffer.toString("base64"); // 2. encode
fs.writeFileSync("C:/users/Public/myzip2.zip",
    new Buffer(base64data, "base64"),
    "base64"); // 3. decode + save
The code runs fine (I am on Windows 10), no errors. It successfully reads and writes files. However, file myzip2.zip is written but it cannot be opened: Windows complains it is invalid :(
A bit more context
The reason for this question is the following. I am using Base64 encoding in order to successfully send over a ZIP file from a client to a server.
This code isolates the problem I am having by leaving the networking complexity out of the equation. I need to figure out how to properly encode/decode a file using Base64. Once I can make it work on a single machine, it will work when sending the file.
Why is this basic set of commands not working?
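A minimal sketch of the same three steps, written so the result can be checked byte for byte. It assumes a small stand-in file in a temporary directory instead of the real ZIP (the `sample.bin` and `sample-copy.bin` names are hypothetical), uses `Buffer.from` in place of the deprecated `new Buffer`, and writes the decoded Buffer without an encoding argument. According to the Node documentation, the encoding option is ignored when the data is a Buffer, so comparing the two files directly may help locate where the bytes actually change.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical paths in a temp directory; substitute your own ZIP file.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "b64-"));
const source = path.join(dir, "sample.bin");
const copy = path.join(dir, "sample-copy.bin");

// Stand-in for the ZIP: the ZIP magic number followed by bytes that are not valid text.
fs.writeFileSync(source, Buffer.from([0x50, 0x4b, 0x03, 0x04, 0x00, 0xff, 0x80, 0x0a, 0x0d]));

const original = fs.readFileSync(source);          // 1. read (no encoding, so a Buffer)
const base64data = original.toString("base64");    // 2. encode
const decoded = Buffer.from(base64data, "base64"); // 3. decode
fs.writeFileSync(copy, decoded);                   //    save the raw bytes

// Compare the copy with the original byte for byte.
const roundTripOk = fs.readFileSync(copy).equals(original);
console.log("identical:", roundTripOk);
```

If the comparison reports identical files for the real ZIP as well, the corruption is likely introduced somewhere other than the encode/decode step.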

Related

Remove the mutation (or corruption) from a base64 string

I upload a file from the browser using a file-type input. On the client side, my system receives that file, reads its binary content, converts it to base64 (using btoa), does some more processing, and then uploads it to the remote Apache server as a multipart request.
When I download the file, I do the reverse: parse the response, get the raw string, convert it to base64, send that string to the client side, convert that base64 string into binary again (using atob), and save the file with the octet-stream type using the Blob constructor and the createObjectURL method. The file is saved on the disk, so there is no problem there. But when I open the file, its content is unreadable.
I did some research and compared my base64 during upload with the base64 during the download part and I see that the base64 string gets corrupted in the download process.
For example (I am just making up the strings, but the intent is to show what is happening):
Upload base64:
UIABvcfce+tre2df
Download corrupted base64:
UIABvcfDuuuvvv55rrre2df
As shown above, ce+t is changed to Duuuvvv55rr.
Therefore, my question is: how do I remove the mutation (or corruption) from a base64 string?
I have searched Stack Overflow and other online platforms extensively, following both direct and indirect references, but nothing I found has worked.
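One failure mode that produces a longer, different base64 string is the raw bytes being treated as text and re-encoded as UTF-8 before the second base64 step. The question does not confirm that this is what happens here, so the following is only a sketch under that assumption, using made-up bytes and a hypothetical helper named `undoUtf8Expansion`.

```javascript
// Made-up bytes standing in for a small binary file.
const original = Buffer.from([0x50, 0x80, 0x01, 0xbd, 0xc7, 0xdc]);
const uploadBase64 = original.toString("base64");

// Assumed failure mode: each byte is read as one character (latin1),
// then the text is re-encoded as UTF-8 before being base64 encoded again.
// Every byte >= 0x80 becomes two bytes, so the base64 string grows.
const asText = original.toString("latin1");
const downloadBase64 = Buffer.from(asText, "utf8").toString("base64");

// Hypothetical repair: base64 -> UTF-8 text -> one byte per character.
function undoUtf8Expansion(base64) {
  const text = Buffer.from(base64, "base64").toString("utf8");
  return Buffer.from(text, "latin1");
}

const repaired = undoUtf8Expansion(downloadBase64);
console.log("upload:  ", uploadBase64);
console.log("download:", downloadBase64);
console.log("repaired matches original:", repaired.equals(original));
```

This only works when the expansion was lossless. If some step replaced invalid bytes with the Unicode replacement character, the original bytes are gone and the fix has to happen at the point where the raw response is read, by keeping it as bytes instead of a string.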

NodeJS read file into string without encoding like PHP file_get_contents

I'm trying to process an image and pass the image data from a NodeJS app to an API written in PHP.
I use fs.readFileSync to read the image file (I'm using PNG here). The API only takes a string as the file content to upload. It seems that PHP file_get_contents doesn't apply any specific encoding: mb_detect_encoding($fileContent) outputs false, and the file content starts like \x89PNG\r.
I'm using Node v8, and it seems I have to use some encoding to convert a Buffer to a string. I tried a couple of encodings such as base64 and binary, and the file content starts like \xc2\x89PNG\r or \xef\xbf\xbdPNG\r.
What is the equivalent of PHP file_get_contents in NodeJS? How can I get the image data in the right format?
Thanks!
If I understood you right, you want to read an image file using NodeJS and send the image as a string to an API built with PHP. Have you tried encoding it to base64 first and sending that string?
const fs = require('fs');
// read the file as raw bytes (no encoding argument, so this returns a Buffer)
const png = fs.readFileSync(pathToImageFile);
// convert the bytes to a base64 encoded string
const imageString = png.toString('base64');
Then send the imageString as a string, hope this helps!
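The two byte patterns from the question can be reproduced with the PNG signature alone. The sketch below assumes the string is turned back into UTF-8 bytes when it is sent (the question does not show that step) and contrasts it with the base64 route suggested above.

```javascript
// The 8-byte PNG signature, standing in for the start of the image file.
const pngSignature = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Decoding as UTF-8: 0x89 is not valid UTF-8, so it becomes U+FFFD,
// which is EF BF BD once the string is sent as UTF-8 again.
const viaUtf8 = Buffer.from(pngSignature.toString("utf8"), "utf8");

// Decoding as latin1 ("binary"): 0x89 becomes U+0089,
// which is C2 89 once the string is sent as UTF-8.
const viaLatin1 = Buffer.from(pngSignature.toString("latin1"), "utf8");

// Base64 uses only ASCII characters, so it survives being sent as text.
const base64Text = pngSignature.toString("base64");
const restored = Buffer.from(base64Text, "base64");

console.log(viaUtf8.toString("hex"));   // efbfbd504e470d0a1a0a
console.log(viaLatin1.toString("hex")); // c289504e470d0a1a0a
console.log(restored.equals(pngSignature)); // true
```

On the PHP side, the base64 string would then need to be decoded (for example with base64_decode) before it is treated as image data.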

Why can't I extract a zip file from a POST request?

I have a piece of client-side code that exports a .docx file from Google Drive and sends the data to my server. It's pretty straightforward: it exports the file, makes it into a blob, and sends the blob to a POST endpoint.
gapi.client.drive.files.export({
    fileId: file_id,
    mimeType: "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
}).then(function (response) {
    // the zip file data is now in response.body
    var blob = new Blob([response.body], {type: "application/vnd.openxmlformats-officedocument.wordprocessingml.document"});
    // send the blob to the server to extract
    var request = new XMLHttpRequest();
    request.open('POST', 'return-xml.php', true);
    request.setRequestHeader("Content-type", "application/x-www-form-urlencoded");
    request.onload = function() {
        // the extracted data is in the request.responseText
        // do something with it
    };
    request.send(blob);
});
Here is my server side code to save this file onto my server so I can do things with it:
<?php
file_put_contents('tmp/document.docx', fopen('php://input', 'r'));
When I run this, the file is created on my server. However, I believe it is corrupted, because when I try to unzip it (as you can do with .docx), this happens:
$ mv tmp/document.docx tmp/document.zip
$ unzip tmp/document.zip
Archive: document.zip
error [document.zip]: missing 192760059 bytes in zipfile
(attempting to process anyway)
error [document.zip]: start of central directory not found;
zipfile corrupt.
(please check that you have transferred or created the zipfile in the
appropriate BINARY mode and that you have compiled UnZip properly)
Why isn't it recognizing it as a proper .zip file?
You should first download the original zip and compare its content to what you receive on your server. You can do this with, e.g., Total Commander or the command-line "diff" command.
When you do this, you will see whether your zip is changed during transfer.
With this information you can continue searching for WHY it is changed.
E.g. if ASCII 10 in your zip file is transformed to "13" or "10 13", it could be a line ending problem in the file transfer.
When you open files in PHP with fopen(..., 'r') on Windows, \n characters can be translated. You could try fopen(..., 'rb'), which forces BINARY reading of the file without translating line endings.
#see: https://stackoverflow.com/a/7652022/2377961
#see php documentation fopen
I think it may depend on that "application/x-www-form-urlencoded". When you read the request data from php://input, it may also save some HTTP properties, so the .zip is corrupted. Try opening the .zip file and looking at what is inside.
To fix it, if the problem is what I described, try changing the Content-type to application/octet-stream.
I would suggest using base64 to encode the binary data into a text stream before posting. I've done this before and it works well; using URL encoding for binary data isn't going to work. Then, on your server, you base64 decode to convert back to binary before storing.
Once it's in base64 you can post it as text.
Well, to me it is not a ZIP file. Looking at the Drive API, you can see that application/vnd.openxmlformats-officedocument.wordprocessingml.document is not zipped the way application/zip is. You should handle the file as a DOCX, I think. Have you tried that?
You are sending a BLOB (binary file) using "Content-type", "application/x-www-form-urlencoded" with no URL encoding applied to the BLOB, so the file that PHP receives is not a ZIP file; it's a corrupted one. Change the "Content-type" or apply URL encoding to the BLOB. You can get a better idea by looking at MDN - Sending forms through JavaScript. These questions should help too: question 1, question 2. You must send the file properly.
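A further possibility, separate from the Content-type header, is that the bytes change when the Blob is built. The sketch below assumes (the question does not confirm this) that response.body arrives as a JavaScript string in which each character stands for one byte. A string passed to the Blob constructor is encoded as UTF-8, so characters above 0x7F become two bytes each. The sample string is made up, and the `require("buffer")` line is only needed to run this in Node, where Blob is not a browser global.

```javascript
// Node 15.7+ exposes Blob from the buffer module; browsers provide it globally.
const { Blob } = require("buffer");

// Made-up "binary string": ZIP magic number followed by two high bytes.
const binaryString = "PK\u0003\u0004\u00ff\u00e0";

// Passing the string directly: the Blob stores its UTF-8 encoding.
const naiveBlob = new Blob([binaryString]);

// Converting each character back to one byte first keeps the size exact.
const bytes = Uint8Array.from(binaryString, (ch) => ch.charCodeAt(0));
const exactBlob = new Blob([bytes], { type: "application/octet-stream" });

console.log(binaryString.length); // 6
console.log(naiveBlob.size);      // 8
console.log(exactBlob.size);      // 6
```

Comparing the Blob size with the size of the original export is a quick way to check whether this applies.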

File broken after encoding to base64 and decoding in another app

A PDF file is being generated client-side using jsPDF, encoded in base64 using btoa(), sent to a PHP API and there it's decoded and saved as a binary file, but it isn't working and I'm getting a malformed PDF.
PHP code:
$destination = 'test/file.pdf';
$content = base64_decode($content);
$uploaded = file_put_contents($destination, $content);
If I compare both files (The pdf file downloaded directly from the frontend, which works, vs the one downloaded from the server) this is what I get:
Original PDF File fragment (I cannot disclose the full file):
Post encode/decode one:
What could be causing this difference? Seems to be an encoding problem?
I cannot comment, because I need 50 rep :) Leaving an answer instead.
Make sure you are doing your POST request correctly. Instead of your PDF file, try to post another file to the server, for example an image file and try to open posted image file on the server.
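One thing worth checking in the POST itself, assuming (the question does not say) that the base64 string is sent as a form-urlencoded field: a literal + in the body is decoded as a space, so the server no longer sees the same base64 text. The sketch uses made-up bytes and a hypothetical field name `content`, and parses the body with URLSearchParams to stand in for the server's form decoding.

```javascript
// Two made-up bytes whose base64 form contains '+' characters.
const base64 = Buffer.from([0xfb, 0xef]).toString("base64"); // "++8="

// Body built by plain concatenation versus with encodeURIComponent.
const rawBody = "content=" + base64;
const safeBody = "content=" + encodeURIComponent(base64);

// Form decoding turns '+' into a space and '%2B' back into '+'.
const receivedRaw = new URLSearchParams(rawBody).get("content");
const receivedSafe = new URLSearchParams(safeBody).get("content");

console.log(JSON.stringify(receivedRaw));  // "  8="
console.log(JSON.stringify(receivedSafe)); // "++8="
```

If the string received by the PHP API contains spaces where the client's string had + characters, this is the likely cause.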

javascript sendfile binary data to web service

At work we are trying to upload files from a web page to a web service, using HTML5/JavaScript on the browser end and C# in the web service, but we are having some trouble with encoding of some sort.
As for the javascript we get the file's binary data with help from a FileReader.
var file = ... // gets the file from an input
var fileReader = new FileReader();
fileReader.onload = dataRecieved;
fileReader.readAsBinaryString(file);
function dataRecieved() {
    // Here we do a normal jquery ajax post with the file data (fileReader.result).
}
The reason we post the data manually rather than with XmlHttpRequest (or similar) is to make posting to our web service easier from different parts of the web page (it's wrapped in a function). But that doesn't seem to be the problem.
The code in the Web Service looks like this
[WebMethod]
public string SaveFileValueFieldValue(string value)
{
    System.Text.UnicodeEncoding encoder = new UnicodeEncoding();
    byte[] bytes = encoder.GetBytes(value);
    // Saves file from bytes here...
}
Everything runs, and the data seems normal, but when we try to open an uploaded file (an image, for example) it cannot be opened. Very basic text files seem to turn out okay. But if I upload a "binary" file like an image and then open both the original and the uploaded version in a plain text editor such as Notepad to see what differs, only a few "invisible" characters appear to be wrong, along with something that displays as a new line a few bytes in from the start.
So basically, the file seems to have just a few bytes encoded wrongly somewhere in the conversions.
I've also tried creating an int array in JavaScript from the data and then transforming it to a byte[] in the web service, with the exact same problem. If I try to convert with anything other than Unicode (like UTF-8), the data turns out completely different from the original, so I think I'm on the right track here, but with something slightly wrong.
The request itself is text, so binary data is lost if you send the wrong enc-type.
What you can do is encode the binary to base64 and decode it on the other side.
To change the enc-type to multi-part/mixed and set boundaries (just like an e-mail or something) you'd have to assemble the request yourself.
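A minimal sketch of the base64 suggestion on the JavaScript side. It assumes the file is read with readAsArrayBuffer instead of readAsBinaryString and wrapped in a Uint8Array. The helper name `bytesToBase64` is hypothetical, and the Buffer fallback exists only so the snippet also runs in Node versions without btoa.

```javascript
// Convert raw bytes to a base64 string that can be posted as plain text.
function bytesToBase64(bytes) {
  let binary = "";
  for (const b of bytes) {
    binary += String.fromCharCode(b); // one character per byte, 0..255
  }
  return typeof btoa === "function"
    ? btoa(binary)
    : Buffer.from(binary, "latin1").toString("base64");
}

// In the browser this would be: new Uint8Array(fileReader.result)
// after fileReader.readAsArrayBuffer(file). Here: the first bytes of a JPEG.
const sampleBytes = new Uint8Array([0xff, 0xd8, 0xff, 0xe0]);
const payload = bytesToBase64(sampleBytes);
console.log(payload); // /9j/4A==
```

On the C# side, the string parameter could then be turned back into bytes with Convert.FromBase64String(value) in place of UnicodeEncoding.GetBytes.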
