Storing and sending raw file data within a JSON object

I'm looking for a way to transfer the raw file data of any file-type with any possible content (By that I mean files and file-content are all user generated) both ways using xhr/ajax calls in a Backbone front-end against a Django back-end.
EDIT: Maybe the question is still unclear...
If you open a file in an IDE (such as Sublime), you can view and edit the actual code that comprises that file. I'm trying to put THAT raw content into a JSON so I can send to the browser, it can be modified, and then sent back.
I posted this question because I was under the impression that because the contents of these files can effectively be in ANY coding language that just stringify-ing the contents and sending it seems like a brittle solution that would be easy to break or exploit. Content could contain any number of ', ", { and } chars that would seem to break JSON formatting, and escaping those characters would leave artifacts within the code that would effectively break them (wouldn't it?).
If that assumption is wrong, THAT would also be an acceptable answer (so long as you could point out whatever it is I'm overlooking).
The project I'm working on is a browser-based IDE that will receive a complete file-structure from the server. Users can add/remove files, edit the content of those files, then save their changes back to the server. The sending/receiving all has to be handled via ajax/xhr calls.
Within Backbone, each "file" is instantiated as a model and stored in a Collection. The contents of the file would be stored as an attribute on the model.
Ideally, file content would still reliably throw all the appropriate events when changes are made.
Fetching contents should not be broken out into a separate call from the rest of the file model. I'd like to just use a single save/fetch call for sending/receiving files including the raw content.
Solutions that require Underscore/jQuery are fine, and I am able to bring in additional libraries if there is something available that specializes in managing that raw file data.

Interesting question. The code required to implement this would be quite involved, sorry that I'm not providing examples, but you seem like a decent programmer and should be able to implement what's mentioned below.
Regarding the sending of raw data through JSON, all you would need to do to make it JSON-safe and not break your code is to escape the special characters by stringifying with Python's json.dumps and JavaScript's JSON.stringify. [1]
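To make the escaping point concrete, here is a minimal sketch showing that source text full of quotes, braces, backslashes and newlines survives a JSON.stringify / JSON.parse round trip unchanged. The fileModel object and its name/content fields are made up for illustration; only JSON.stringify and JSON.parse are real APIs here.

```javascript
// Hypothetical file content containing characters that look "dangerous" for JSON.
const content = [
  'var s = "double" + \'single\';',
  "if (a < b && c > d) { obj = {\"k\": [1, 2]}; }",
  "var path = 'C:\\\\temp\\\\file.txt';\t// tab and backslashes",
].join("\n");

// Hypothetical model attributes (not taken from the question's code).
const fileModel = { name: "example.js", content: content };

// Serialising escapes ", \ and control characters inside the string value.
const wire = JSON.stringify(fileModel);

// Parsing reverses the escaping, so no artifacts remain in the content.
const received = JSON.parse(wire);

console.log(received.content === content);
```

The escapes only exist in the serialised text on the wire. Once parsed, the attribute holds exactly the original characters, so the editor never sees them.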
If you are concerned about some form of basic tamper-proofing, then light encoding of your data will fit the purpose, in addition to having the client and server pass a per-session token back and forth with JSON transfers to ensure that the JSON isn't forged from a malicious address.
If you want to check the end-to-end integrity of the data, then generate an md5 checksum and send it inside your JSON and then generate another md5 on arrival and compare with the one inside your JSON.
Base64 encoding: The size of your data would grow by 33% as it encodes four characters to represent three bytes of data.
Base85: Encodes four bytes as five characters and will grow your data by 25%, but uses much more processing overhead than Base64 in Python. That's an 8-percentage-point improvement in data size, but at the expense of processing overhead. Also, it's not string-safe: the common Ascii85 alphabet includes the double quotation mark and the backslash, which cannot be used unescaped inside a JSON string. Needs to be stringified before JSON transport. [2]
yEnc has as little as 2-3% overhead (depending on the frequency of identical bytes in the data), but is ruled out by impractical flaws (see [3]).
ZeroMQ Base-85, aka Z85. It's a string-safe variant of Base85, with a data overhead of 25%, which is better than Base64. No stringifying necessary for sticking it into JSON. I highly recommend this encoding algorithm. [4] [5] [6]
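For a feel of what Z85 involves, here is a rough sketch of an encoder and decoder in plain JavaScript. The alphabet is the one listed in the Z85 spec [4]; the function names z85Encode and z85Decode are my own. Z85 only accepts input whose length is a multiple of 4 bytes, so padding arbitrary files is left to the caller and is not shown here.

```javascript
// Z85 alphabet as listed in the ZeroMQ Z85 specification (reference [4]).
const Z85_ALPHABET =
  "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" +
  ".-:+=^!/*?&<>()[]{}@%$#";

// Encode an array of bytes whose length is a multiple of 4.
function z85Encode(bytes) {
  if (bytes.length % 4 !== 0) {
    throw new Error("Z85 input length must be a multiple of 4 bytes");
  }
  let out = "";
  for (let i = 0; i < bytes.length; i += 4) {
    // Build the 32-bit big-endian value arithmetically (avoids signed bit ops).
    let value = 0;
    for (let j = 0; j < 4; j++) value = value * 256 + bytes[i + j];
    // Emit five base-85 digits, most significant first.
    let chunk = "";
    for (let k = 0; k < 5; k++) {
      chunk = Z85_ALPHABET[value % 85] + chunk;
      value = Math.floor(value / 85);
    }
    out += chunk;
  }
  return out;
}

// Decode a Z85 string whose length is a multiple of 5.
function z85Decode(text) {
  if (text.length % 5 !== 0) {
    throw new Error("Z85 text length must be a multiple of 5 characters");
  }
  const bytes = [];
  for (let i = 0; i < text.length; i += 5) {
    let value = 0;
    for (let j = 0; j < 5; j++) {
      const digit = Z85_ALPHABET.indexOf(text[i + j]);
      if (digit < 0) throw new Error("invalid Z85 character: " + text[i + j]);
      value = value * 85 + digit;
    }
    // Split the 32-bit value back into four big-endian bytes.
    const group = [0, 0, 0, 0];
    for (let k = 3; k >= 0; k--) {
      group[k] = value % 256;
      value = Math.floor(value / 256);
    }
    bytes.push(...group);
  }
  return bytes;
}

const sample = [0x86, 0x4f, 0xd2, 0x6f, 0xb5, 0x59, 0xf7, 0x5b];
const encoded = z85Encode(sample);
console.log(encoded.length); // 10 characters for 8 bytes
console.log(Buffer.from(sample).toString("base64").length); // 12 characters for 8 bytes
```

Because the alphabet contains neither a double quote nor a backslash, the encoded text can be dropped into a JSON string value as-is.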
If you're sending only small files (say a few KB), then the overhead of binary-to-text conversion will be acceptable. With files as large as a few MB, it might not be acceptable to have them grow by 25-33%. In this case you can try to compress them before sending. [7]
You can also send data to the server using multipart/form-data, but I can't see how this will work bi-directionally.
UPDATE
In conclusion, here's my solution's algorithm:
Sending data
Generate a session token and store it for the associated user upon login (server), or retrieve from the session cookie (client)
Generate MD5 hash for the data for integrity checking during transport.
Encode the raw data with Z85 to add some basic tamper-proofing and JSON-friendliness.
Place the above inside a JSON and send POST when requested.
Reception
Grab JSON from POST
Retrieve session token from storage for the associated user (server), or retrieve from the session cookie (client).
Generate MD5 hash for the received data and test against MD5 in received JSON, reject or accept conditionally.
Z85-decode the data in received JSON to get raw data and store in file or DB (server) or process/display in GUI/IDE (client) as required.
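The steps above can be sketched roughly as follows. This is only an illustration under several assumptions: it uses Node's built-in crypto module for MD5, Base64 stands in for Z85 to keep the example self-contained, the checksum is taken over the raw (unencoded) content, and the envelope field names (token, md5, data) as well as the function names are made up.

```javascript
const { createHash } = require("node:crypto");

// MD5 hex digest of a UTF-8 string (an integrity check, not a security measure).
function md5Hex(text) {
  return createHash("md5").update(text, "utf8").digest("hex");
}

// Sender: wrap raw file content in a JSON envelope.
// The field names token/md5/data are illustrative assumptions.
function packFile(rawContent, sessionToken) {
  return JSON.stringify({
    token: sessionToken,
    md5: md5Hex(rawContent),
    // Base64 stands in for Z85 here; swap in a Z85 encoder if preferred.
    data: Buffer.from(rawContent, "utf8").toString("base64"),
  });
}

// Receiver: check the token, decode the data, then verify the checksum.
function unpackFile(json, expectedToken) {
  const envelope = JSON.parse(json);
  if (envelope.token !== expectedToken) {
    throw new Error("session token mismatch");
  }
  const rawContent = Buffer.from(envelope.data, "base64").toString("utf8");
  if (md5Hex(rawContent) !== envelope.md5) {
    throw new Error("checksum mismatch");
  }
  return rawContent;
}

const original = 'print("hello {world}")\n';
const wireJson = packFile(original, "session-token-123");
console.log(unpackFile(wireJson, "session-token-123") === original);
```

The same two functions would exist on both sides (Python on the Django side, JavaScript in the browser), each calling pack when sending and unpack when receiving.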
References
[1] How to escape special characters in building a JSON string?
[2] Binary Data in JSON String. Something better than Base64
[3] https://en.wikipedia.org/wiki/YEnc
[4] http://rfc.zeromq.org/spec:32
[5] Z85 implementation in C/C++ https://github.com/artemkin/z85
[6] Z85 implementation in Python https://gist.github.com/minrk/6357188
[7] JavaScript zip library http://stuk.github.io/jszip/
[8] JavaScript implementation of Gzip (Stack Overflow)

As far as I am concerned, a simple Base64 conversion will do it. Stringify, convert to Base64, then pass it to the server and decode it there. Then you won't have the raw file transfer and you will still keep your code simple.
I know this solution could seem a bit too simple, but think about it: many cryptographic algorithms can be broken given the right hardware. One of the most secure means would be to use a digital certificate, encrypt the data with the private key, and then send it over to the server. But to reach this level of security, every user of your application would have to have a digital certificate, which I think would be an excessive demand on your users.
So ask yourself: if implementing a really safe solution adds a lot of hassle for your users, why do you need a safe transfer at all? Based on that, I reaffirm what I said before. A simple Base64 conversion will do. You can also use some other algorithms like SHA256 or something to make it a little bit safer.

If the only concern here is that the raw content of your code files (the "data" your model is storing) will cause some type of issue when stored in JSON, this is easily avoided by escaping your data.
Stringifying your raw code file contents can cause issues, as anything resembling JavaScript or JSON will be parsed into an actual JSON object. Your code file data can and should be stored simply as an escaped string. Your fear here is that said string may contain characters that could break it when stored inside a JavaScript string; this is alleviated by escaping the entire string, and thus double, triple, quadruple, etc. escaping anything already escaped in the code file.
In essence it is important to remember here that raw code in a file is nothing but a glorified string when stored in a database, unless you are adding in-line metadata dynamically. It's just text, and doing standard escaping will make it safe to store in whatever format as a string (inside "" or '') in JSON.
I recommend reading this SO answer, as I also referenced it to verify what I already thought was correct:
How To Escape a JSON string containing newline characters using JavaScript

Related

How to create Web user auth token for Sinch?

I would like to create an authTicket for use with the Sinch Web SDK, as described in the docs (Authentication by your backend).
For the server-side code required to do this, Sinch provides at least two examples:
https://github.com/sinch/sinch-js-ticketgen
https://github.com/sinch/php-auth-ticket
The first step is JSON encoding. However, if I run this in JavaScript or PHP respectively, I get different results:
JavaScript JSON.stringify(userTicket)
{
"applicationKey":"XXXXXXXXXXXXXXXX",
"identity":{"type":"username","endpoint":"johndoe"},
"created":"2017-04-12T12:34:56.789Z",
"expiresIn":86400
}
PHP json_encode($userTicket)
{
"identity":{"type":"username","endpoint":"johndoe"},
"expiresIn":86400,
"applicationKey":"XXXXXXXXXXXXXXXX",
"created":"2017-04-12T12:34:56.789Z"
}
(Please disregard the whitespace, this has to do with StackOverflow formatting. I'm asking about order of keys.)
Later, this output is supposed to go into a hash function. Since both JSON strings have the keys in different order, both inputs won't possibly result in the same hashed output.
What is the correct algorithm to compute the authTicket, especially when it comes to the JSON encoding part?

How can you access the HTTP response from a server using client-side JavaScript?

I'm trying to do client-side processing of some data sent in a server's HTTP response.
Here's what I'm working with: I have a web application that sends commands to a backend simulation engine, which then sends back a bunch of data/results in the response body. I want to be able to access this response using JavaScript (note: not making a new response, but simply accessing the data already sent from the server).
Right now, I am able to do this via a kludgy hack of sorts:
var responseText = "{{response}}";
This is using Django's template system, where I have already pre-formatted the template context variable "response" to contain a pre-formatted string representation of a csv file (i.e., proper unicode separators, etc).
This results in a huge string being transmitted to the page. Right now, this supports my immediate goal of making this data available for download as a csv, but it doesn't really support more sophisticated tasks. Also, I'm not sure if it will scale well when my string is, say, 2 MB as opposed to less than 1 KB.
I'd like to have this response data stored more elegantly, perhaps as part of the DOM or maybe in a cache (?) [not familiar with this].

The ideal way to do this is to not load the csv on document load, either as a JavaScript variable or as part of the DOM. Why would you want to send 2 MB of data to the user every time when their intention may not be to download the csv every time?
I suggest creating a controller/action for downloading the csv and get it on click of the download button.

Protecting raw JSON data from being copied

I'm creating an application with Node.js and Mongo DB, rendering the views with Swig.
I have a database of business names, addresses and geo location data that is being plotted onto a Google map with pins.
I'd like to stop users from easily copying the raw JSON data using view source, Firebug, Chrome Dev tools etc.
I'm not after bank grade security, just want to make it hard enough for most users to give up.
I have two routes of delivering the JSON package to the browser:
1) Using Swig, passing the JSON package directly to the view. Problem is that a simple view source will show the JSON.
2) Requesting the data with an AJAX call. In this scenario the data is easily accessible with Chrome Dev tools.
What are my options?

Base-64 encode the string.
Then you can just base64-decode it in JavaScript.
That should make it sufficiently unreadable, no real security though - of course.
Plus it's fast.
You need to take care with UTF-8 characters (e.g. German äöüÄÖÜ, or French èéàâôû)
e.g. like this in JavaScript:
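// Encode: percent-escape as UTF-8, collapse to a byte string, then Base64
var str = "äöüÄÖÜçéèñ";
var b64 = window.btoa(unescape(encodeURIComponent(str)));
console.log(b64);
// Decode: reverse the steps
var str2 = decodeURIComponent(escape(window.atob(b64)));
console.log(str2);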
example:
var imgsrc = 'data:image/svg+xml;base64,' + btoa(unescape(encodeURIComponent(markup)));
var img = new Image(1, 1); // width, height values are optional params
img.src = imgsrc;
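Since escape and unescape are deprecated legacy functions, the same UTF-8-safe round trip can also be sketched with TextEncoder and TextDecoder. The helper names utf8ToBase64 and base64ToUtf8 are my own, and the loop-based conversion is meant for modest string sizes rather than tuned for speed.

```javascript
// Encode a JavaScript string as Base64 of its UTF-8 bytes.
function utf8ToBase64(str) {
  const bytes = new TextEncoder().encode(str);
  let binary = "";
  // Build a "binary string" (one character per byte), which is what btoa expects.
  for (const byte of bytes) binary += String.fromCharCode(byte);
  return btoa(binary);
}

// Reverse the process: Base64 -> bytes -> UTF-8 decoded string.
function base64ToUtf8(b64) {
  const binary = atob(b64);
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

const text = "äöüÄÖÜçéèñ";
const textB64 = utf8ToBase64(text);
console.log(textB64);
console.log(base64ToUtf8(textB64) === text);
```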
More secure variant:
Return encrypted base64 encoded JSON, plus the decryption algorithm, base64 encode them server-side, bit-shift it a few bits, return via ajax, then de-bitshift the string on the webpage, pass it to eval, which will give you the decrypt function, then decrypt the encrypted base64 string, then base-64 decode that string.
But that takes only a few seconds more on the Chrome debug console to decrypt. I did decrypt such a thing once, I think on CodeCanyon, to get to a "Tabs" script for free (don't bother with the tabs, they're bloatware; better to invest the time to do it yourself) ;)
I think you can find that nowadays at http://www.slidetabs.com/, but I don't know if the "encryption" method is still in there.
Additionally, you can also escape the string in JavaScript, that then looks like this:
var _0xe91d=["\x28\x35\x28\x24\x29\x7B\x24\x2E\x32\x77\x2E
...
x5F\x63\x6F\x6E\x74\x5F\x64\x75\x72\x7C\x76\x5F\x74\x61\x62\x73\x5F\x61\x6C\x69\x67\x6E\x7C\x76\x5F\x74\x61\x62\x73\x5F\x64\x75\x72\x7C\x76\x5F\x73\x63\x72\x6F\x6C\x6C\x7C\x63\x6F\x6E\x74\x5F\x61\x6E\x69\x6D\x7C\x63\x6F\x6E\x74\x5F\x66\x78\x7C\x74\x61\x62\x5F\x66\x78\x7C\x72\x65\x70\x6C\x61\x63\x65\x7C\x62\x61\x6C\x69\x67\x6E\x7C\x61\x6C\x69\x67\x6E\x5F\x7C\x75\x6E\x6D\x6F\x75\x73\x65\x77\x68\x65\x65\x6C\x7C\x73\x77\x69\x74\x63\x68\x7C\x64\x65\x66\x61\x75\x6C\x74\x7C\x6A\x51\x75\x65\x72\x79","","\x66\x72\x6F\x6D\x43\x68\x61\x72\x43\x6F\x64\x65","\x72\x65\x70\x6C\x61\x63\x65","\x5C\x77\x2B","\x5C\x62","\x67"]
;eval(function (_0x173cx1,_0x173cx2,_0x173cx3,_0x173cx4,_0x173cx5,_0x173cx6){_0x173cx5=function (_0x173cx3){return (_0x173cx3<_0x173cx2?_0xe91d[4]:_0x173cx5(parseInt(_0x173cx3/_0x173cx2)))+((_0x173cx3=_0x173cx3%_0x173cx2)>35?String[_0xe91d[5]](_0x173cx3+29):_0x173cx3.toString(36));} ;if(!_0xe91d[4][_0xe91d[6]](/^/,String)){while(_0x173cx3--){_0x173cx6[_0x173cx5(_0x173cx3)]=_0x173cx4[_0x173cx3]||_0x173cx5(_0x173cx3);} ;_0x173cx4=[function (_0x173cx5){return _0x173cx6[_0x173cx5];} ];_0x173cx5=function (){return _0xe91d[7];} ;_0x173cx3=1;} ;while(_0x173cx3--){if(_0x173cx4[_0x173cx3]){_0x173cx1=_0x173cx1[_0xe91d[6]]( new RegExp(_0xe91d[8]+_0x173cx5(_0x173cx3)+_0xe91d[8],_0xe91d[9]),_0x173cx4[_0x173cx3]);} ;} ;return _0x173cx1;} (_0xe91d[0],62,284,_0xe91d[3][_0xe91d[2]](_0xe91d[1]),0,{}));
You can then bring the string back like:
"\x66\x72\x6F\x6D\x43\x68\x61\x72\x43\x6F\x64\x65".toString()
But for a moderate coder (like me), figuring out the system and decrypting the data of all this combined will take only approx. 15-30 minutes (an empirical finding from the CodeCanyon attempt).
It's questionable if such a thing is worth the expense of your time, because it takes somebody like me less time to reverse-engineer your "encryption" than it takes you to "code" it.
Note that if you put a string like "\x66\x72\x6F\x6D\x43\x68\x61\x72\x43\x6F\x64\x65" into your application, you may trigger false alarms on certain virus scanners (McAfee, Trend Micro, Norton, etc., the usual suspects).
You can also partition the JSON string into an array of JSON-string chunks, makes it harder to decrypt it (maybe rotating the sequence in the array according to a certain system might help as well).
You can also break the string into an array of char:
var x = ['a', 'b', 'c'];
You can then bring it back like
console.log(x.join(""));
You can also reverse the string, and put that into an array (amCharts does that).
Then you bring it back with
x.reverse().join("");
The last one might be tricky for UTF-8, as you need to correctly reverse strings like "Les misérables".
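To show where reversal goes wrong, here is a small sketch comparing three ways of reversing a string. The function names are my own. The last variant relies on Intl.Segmenter, which is assumed to be available (recent browsers and Node versions have it).

```javascript
// Naive reversal splits on UTF-16 code units and can break surrogate pairs.
function reverseByCodeUnit(s) {
  return s.split("").reverse().join("");
}

// Spreading a string iterates by code point, keeping surrogate pairs intact.
function reverseByCodePoint(s) {
  return [...s].reverse().join("");
}

// Intl.Segmenter splits on grapheme clusters, so a base letter stays
// attached to its combining marks.
function reverseByGrapheme(s) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  return Array.from(segmenter.segment(s), (part) => part.segment)
    .reverse()
    .join("");
}

// "e" followed by a combining acute accent (decomposed form of "é").
const decomposed = "mise\u0301rables";
console.log(reverseByCodePoint(decomposed) === "selbar\u0301esim"); // accent lands on "r"
console.log(reverseByGrapheme(decomposed) === "selbare\u0301sim"); // accent stays on "e"
```

If the string only contains precomposed characters from the Basic Multilingual Plane, all three give the same result. The differences show up with emoji and decomposed accents.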

Since the data will go on your client's computer, there is no other way to fully protect that data than... not sending it.
So, you could render some views on the server side and send them to the client but it may not be doable in your case.
Another way would be to send the data, but to make it difficult for an unauthorized user to access it.
If your application is using a user database, you could generate a fixed key per user and encrypt sensitive data before sending it to the client, and then the client would decrypt it with the same key calculated on the client side.
In addition, you can fine-tune which data you want to send or not send to each user.
If you want to protect the data between the moment the client receives it and the moment it goes into your map, I'm afraid it is not possible, as the map component you're using is probably waiting for standard JSON data.
Anyway, it makes no sense to protect your data as it will be displayed on your map.

Nothing that is passed to the client is safe. You can try obfuscating the data, but in the end the place where you put it into the map will be accessible by just adding a line of console.log().
Another option (I'm just speculating, as I'm not really sure how Google Maps works): you might first send only the geolocation to the map, so that you have pins on the map, and then only after clicking on a pin fetch the other data from the API (name, address). Google Maps should support something like onclick.
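A rough sketch of that split, server-side, might look like this. The record shape and the function names listPins and getDetails are invented for illustration; in the question's setup the records would come from MongoDB and the two functions would sit behind two separate routes.

```javascript
// Hypothetical in-memory records standing in for the database.
const businesses = [
  { id: 1, name: "Cafe Example", address: "1 Main St", lat: 51.5, lng: -0.12 },
  { id: 2, name: "Sample Books", address: "2 High St", lat: 48.85, lng: 2.35 },
];

// Initial payload: only what is needed to draw the pins.
function listPins(records) {
  return records.map(({ id, lat, lng }) => ({ id, lat, lng }));
}

// Follow-up payload: details for one pin, fetched when it is clicked.
function getDetails(records, id) {
  const match = records.find((record) => record.id === id);
  return match
    ? { id: match.id, name: match.name, address: match.address }
    : null;
}

console.log(JSON.stringify(listPins(businesses)));
console.log(JSON.stringify(getDetails(businesses, 2)));
```

This does not stop a determined scraper, but it forces one request per pin, which is easier to notice and throttle than a single bulk download.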

Annoy a potential scraper/hacker with all the tricks everyone talks about on this thread and others. But as it's been said many times, once the data is sent to the client, it's basically unprotected.
Perhaps your thinking should involve these things too:
-How to identify when someone is scraping (e.g. monitoring IPs, thresholds, user activity, etc) and do something about it or at least identify the culprit.
-Put copyrights and other identification on anything you can, to help other users see and understand that it's your data, not the scrapers'. Look at what artists have been doing already, for a long time.
-Lay hidden traps in your data to help identify it as unique: ones that only you know about and that the scraper wouldn't bother to look for or would be too lazy to check. If the scraper uses your data publicly too, then maybe this can be used in a legal case, or at least you could publicly shame the offender.
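For the first point, a very basic threshold monitor could be sketched as below. It keeps a sliding window of request timestamps per client key (for example an IP address) and flags the client once the count in the window exceeds a limit. The function name createRequestMonitor and the default window size and threshold are arbitrary illustrative choices, not recommendations.

```javascript
// Sliding-window request counter keyed by a client identifier (e.g. IP address).
// The default window size and threshold are arbitrary illustrative values.
function createRequestMonitor({ windowMs = 60000, maxRequests = 100 } = {}) {
  const history = new Map(); // key -> array of request timestamps in ms

  return {
    // Record a request and report whether the client exceeded the threshold.
    record(key, now = Date.now()) {
      const recent = (history.get(key) || []).filter((t) => now - t < windowMs);
      recent.push(now);
      history.set(key, recent);
      return recent.length > maxRequests;
    },
  };
}

const monitor = createRequestMonitor({ windowMs: 1000, maxRequests: 3 });
const flags = [0, 100, 200, 300].map((t) => monitor.record("203.0.113.7", t));
console.log(flags); // the fourth request inside the window is flagged
```

What to do once a client is flagged (log it, slow it down, block it) is a separate decision.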

In node.js and express, how should a client send a large, up to 3k, amount of data to the server?

The client will be sending my server a change log containing a list of commands and parameters; JSON or not is TBD.
This payload can be 3 or 4 KB, not likely to be more.
What is the standard approach to deal with this requirement?
Should the client send JSON containing all of the changes as part of the request body?
Any recommendations? Lessons learned?

Just POST the data. 3-4 KB is nothing unless you're dealing with feature-phone WAP browsers in the middle of rural India, performance issues of the "OMG, I'm Google and care about every byte ever because of my zillion-user userbase" type, or something like that.
If you're really worried about payload size, you can gzip-base64 encode it before sending - but only do this if a) you really care about this (which is unlikely) and b) your payload is large enough that this saves you bandwidth. (gzip-base64'ing small payloads often increases their size, since there isn't enough data to get enough compression benefit to offset the 33% size increase from base64 encoding.)
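To see the size effect for yourself, here is a minimal sketch using Node's built-in zlib module. The helper names gzipBase64 and gunzipBase64 and the sample payloads are made up. In a browser you would need a library or another API for the gzip step.

```javascript
const zlib = require("node:zlib");

// Gzip a string and Base64-encode the result so it can travel as text.
function gzipBase64(text) {
  return zlib.gzipSync(Buffer.from(text, "utf8")).toString("base64");
}

// Reverse: Base64-decode, then gunzip back to the original string.
function gunzipBase64(encoded) {
  return zlib.gunzipSync(Buffer.from(encoded, "base64")).toString("utf8");
}

// A tiny payload and a larger, repetitive one (both hypothetical change logs).
const small = JSON.stringify({ cmd: "move", x: 1 });
const large = JSON.stringify(
  Array.from({ length: 200 }, (_, i) => ({ cmd: "move", x: i, y: i * 2 }))
);

for (const payload of [small, large]) {
  const encoded = gzipBase64(payload);
  console.log(payload.length, "->", encoded.length);
}
```

The tiny payload comes out longer than it went in, while the repetitive one shrinks, which is the trade-off described above.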

You can use a normal JSON POST to send across 3-4 KB of data.
You should pay more attention to what you do with the data received on the server side, whether you buffer up all data before you start processing them (store in db or elsewhere), or process them in chunks. If you are simply dumping the data into files on server, you should create a Writable stream and pump chunks of data received into the stream.
How are you going to process the received data on the server? But then, 3-4 KB is not really a worrying amount of data.

You can set the maximum upload size with
app.use(express.limit('5mb'));
if that's an issue?
But there shouldn't really be any limitations on this by default, except the max buffer size (which I believe is 1GB).
It also sounds like this is something that you can just post to the server with a regular POST request, in other words you use a form with a file input and just upload the file the regular way, as 4kb isn't really a big file.

send unescaped string with XHR (javascript)

I have a string containing JavaScript file content and I need to upload it to the server (it's not part of the server-side code; it's used only for the development (on-site code editing) process, so security is not important). The PHP script uses file and content variables. I found only implementations where the xhr.send() argument was a string in standard GET format (var1=sth&var2=sthelse), but I'd like to send this string without any encoding / decoding / escaping / unescaping on the server side. As it's JS, it contains all possible characters including '&' or '='.
Is it possible to pass data to POST in another way than using a standard query?
I'm rather not interested in a jQuery solution

Sending JS file contents in a query string parameter is not a good idea. A few points:
If your situation supports HTML5 you can send files to your server with JS.
If you don't want to send as a file upload I would send it as POST data.
Any data encoded or escaped on the client can be decoded or unescaped on the server. In some setups this is automatically done for you. There's no reason to avoid this if it's the proper way to handle data.
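As a sketch of the POST-data route: encodeURIComponent takes care of '&', '=' and friends, so the content cannot split the form fields. The field names file and content come from the question; buildFormBody is my own helper, the url in the comment is a placeholder, and URLSearchParams is used here only to stand in for the decoding a server normally does for you.

```javascript
// Build an application/x-www-form-urlencoded body; encodeURIComponent escapes
// '&', '=', '+', '%' and other reserved characters so they cannot split fields.
function buildFormBody(fields) {
  return Object.keys(fields)
    .map((key) => encodeURIComponent(key) + "=" + encodeURIComponent(fields[key]))
    .join("&");
}

const source = "var a = b && c; url = 'x?y=1&z=2'; total = 1 + 1; // 100%";
const body = buildFormBody({ file: "app.js", content: source });

// In a browser the body would be sent roughly like this (url is a placeholder):
//   xhr.open("POST", url);
//   xhr.setRequestHeader("Content-Type", "application/x-www-form-urlencoded");
//   xhr.send(body);

// URLSearchParams stands in for the server-side decoding step.
const decoded = new URLSearchParams(body);
console.log(decoded.get("content") === source);
```

The server-side script then sees the original, unescaped text in its content variable without any manual decoding.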
