I was wondering what the current state of javascript based json compression is. Are there any libraries currently available that allow compressing json, either by replacing long names with single characters, or some other method?
Someone has implemented HPack in Javascript, which could really improve JSON data set sizes, assuming your data set is homogeneous.
Since your emphasis is on transfer, rather than storage, don't forget to use things like gzip and to minimise your JSON. Those should be the first steps before adding yet more compression overhead.
Related
I'm just wondering if it's worth it, I'm using nodejs with socket.io and I need to send medium sized arrays to clients which contains small strings and numbers.
Would it be worth it to zip them or something or would the time to compress them would defeat it's own purpose to be faster ? The array I'm trying to compress are less that 1 mb.
As of now I see no latency but who knows, someone might have slow internet or old devices.
It depends entirely upon how large the arrays are and how much they would benefit from compression - neither of which you have disclosed.
For example, if they were 50k and could be compressed to 40k, that difference would be unlikely to be perceived.
If they were 1MB and could be compressed to 300k, that difference could be meaningful.
You will need to measure how large they typically are and then, if those are in a range where it might make a meaningful difference to compress them, then do some tests on how much they compress.
FYI, you can also look at how exactly the data is being sent over the wire because socket.io's default of JSON is not always the most compact way to format things either. For example, sending a large array of objects is going to repeat property names over and over in the JSON which might benefit a lot from compression, but might benefit even more from using a custom data format that's more compact.
What is the most efficient way in JavaScript to parse huge amounts of data from a file?
Currently I use JSON parse to serialize an uncompressed 250MB file, which is really slow. Is there a simple and fast way to read a lot of data in JavaScript from a file without looping through every character? The data stored in the file are only a few floating point arrays?
UPDATE:
The file contains a 3d mesh, 6 buffers (vert, uv etc). Also the buffers need to be presented as typed arrays. streaming is not a option because the file has to be fully loaded before the graphics engine can continue. Maybe a better question is how to transfer huge typed arrays from a file to javascript in the most efficient way.
I would recommend a SAX based parser for these kind of JavaScript or a stream parser.
DOM parsing would load the whole thing in memory and this is not the way to go by for large files like you mentioned.
For Javascript based SAX Parsing (in XML) you might refer to
https://code.google.com/p/jssaxparser/
and
for JSON you might write your own, the following link demonstrates how to write a basic SAX based parser in Javascript
http://ajaxian.com/archives/javascript-sax-based-parser
Have you tried encoding it to a binary and transferring it as a blob?
https://developer.mozilla.org/en-US/docs/DOM/XMLHttpRequest/Sending_and_Receiving_Binary_Data
http://www.htmlgoodies.com/html5/tutorials/working-with-binary-files-using-the-javascript-filereader-.html#fbid=LLhCrL0KEb6
There isn't a really good way of doing that, because the whole file is going to be loaded into memory and we all know that all of them have big memory leaks. Can you not instead add some paging for viewing the contents of that file?
Check if there are any plugins that allow you to read the file as a stream, that will improve this greatly.
UPDATE
http://www.html5rocks.com/en/tutorials/file/dndfiles/
You might want to read about the new HTML5 API's to read local files. You will have the issue with downloading 250mb of data still tho.
I can think of 1 solution and 1 hack
SOLUTION:
Extending the split the data in chunks: it boils down to http protocol. REST parts on the notion that http has enough "language" for most client-server scenarios.
You can setup on the client a request header Content-len to establish how much data you need per request
Then on the backend have some options http://httpstatus.es
Reply a 413 if the server is simply unable to get that much data from the db
417 if the server is able to reply but not under the requested header (Content-len)
206 with the provided chunk, letting know the client "there is more from where that came from"
HACK:
Use Websocket and get the binary file. Then use the html5 FileAPI to load it into memory.
This is likely to fail though because its not the download causing the problem, but the parsing of an almost-endless JS object
You're out of luck on the browser. Not only do you have to download the file, but you'll have to parse the json regardless. Parse it on the server, break it into smaller chunks, store that data into the db, and query for what you need.
I have a grid in a browser.
I want to send rows of data to the grid via JSON, but the browser should continuously parse the JSON as it receives it and add rows to the grid as they are parsed. Put another way, the rows shouldn't be added to the grid all at once after the entire JSON object is received -- they should be added as they are received.
Is this possible? Particularly using jQuery, Jackson, and Spring 3 MVC?
Does this idea have a name? I only see bits of this idea sparsely documented online.
You can use Oboe.js which was built exactly for this use case.
Oboe.js is an open source Javascript library for loading JSON using streaming, combining the convenience of DOM with the speed and fluidity of SAX.
It can parse any JSON as a stream, is small enough to be a micro-library, doesn’t have dependencies, and doesn’t care which other libraries you need it to speak to.
You can't parse incomplete or invalid JSON using the browser's JSON.parse. If you are streaming text, it will invariably try and parse invalid JSON at some point which will cause it to fail. There exists streaming JSON parsers out there, you might be able to find something to suit your needs.
Easiest way in your case would remain to send complete JSON documents for each row.
Lazy.js is able to parse "streaming" JSON (demo).
Check out SignalR.
http://www.hanselman.com/blog/AsynchronousScalableWebApplicationsWithRealtimePersistentLongrunningConnectionsWithSignalR.aspx
March 2017 update:
Websockets allow you to mantain an open connection to the server that you can use to stream the data to the table. You could encode individual rows as JSON objects and send them, and each time one is received you can append it to the table. This is perhaps the optimal way to do it, but it requires using websockets which might not be fully supported by your technology stack. If websockets is not an option, then you can try requesting the data in the smallest chunks the server will allow, but this is costly since every request has an overhead and you would end up doing multiple requests to get the data.
AFAIK there is no way to start parsing an http request before it's finished, and there is no way to parse a JSON string partially. Also, getting the data is orders of magnitude slower than processing it so it's not really worth doing.
If you need to parse large amounts of data your best bet is to do the streaming approach.
I have a simple 2D array in javascript. I want to pass this array to a ASP.NET page.
I wanted to know what would be the best option, going with JSON or XML.
The metric is speed and size of the data. In some cases, the array may be long in size.
Thank You.
The metric is speed and size of the data.
JSON is faster then XML in terms of speed. It's smaller then XML in terms of size.
XML is bloated to allow you to represent and validate structures.
However there are various BSON formats around where people take JSON and then hand optimise the storage format excessively. (BSON is binary JSON)
Some BSON spec I picked from google
Bison, A JavaScript parser for some arbitary BSON format.
Now if you really have bottlenecks with transferring data (which you probably don't) you may want to use WebSockets to send data over TCP rather then HTTP, thus reducing the amount of traffic and data you send.
Of course you only care about that if you making say X000 requests per second.
JSON Should be your best bet XML datatype sending might be a big pain as sometimes you would have to add in new configs just to support XML datatype to be sent as form data to the server. Genreally it is not a recommended practice due to security concerns
After some comments by David, I've decided to revise my question. The original question can be found below as well as the newly revised question. I'm leaving the original question simply to have a history as to why this question was started.
Original Question (Setting LZMA properties for jslzma)
I've got some large json files I need to transfer with ajax. I'm currently using jQuery and $.getJSON(). I'd like to use the jslzma library to decompress the files upon receiving them. Currently, I'm using django with the pylzma library to compress the files.
The only problem is that there's a lack of documentation for the jslzma library. There is some, but not enough. So I have two questions about how to use the library.
It gives this as an example:
LZMA.decompress(properties, inStream, outStream, outSize);
I know how to set the inStream and outStream variables, but not the properties or the outSize. So can anyone give an example(s) on how to set the properties variable (ie. what's expected) and how to calculate the outSize...
Thanks.
Edit #1 (Revised Question)
I'm looking for a compression scheme that lends itself to highly repeatable data using python (django) and javascript.
The data being transferred contains elevation measurements. Each file has 1200x1200 data points, which equates to about 2.75MB in it's raw binary form uncompressed. JSON balloons it to between 5-6MB. I've also looked into base64 (just to cover all the bases), which would reduce the size but I haven't had any success reading it in js. I think the data lends itself to easy compression just because of the highly repeatable data values. For example, one file only has 83 unique elevation values to describe 1440000 data points.
I just haven't had much luck, mainly because I'm just starting to learn JavaScript.
So can anyone suggest a compression scheme for this type of data? The goal is to minimize the transfer time by reducing the size for the data.
Thanks.
For what it's worth LZMA is typically very slow to compress as well as decompress; and thus it is more common to use bit faster compression schemes. Standard GZIP (deflate) has reasonably good balance: its compression ratio is acceptable, and its compression speed is MUCH better than that of LZMA or bzip2.
Also: most web servers and clients support automatic handling of gzip compression, which makes it even more convenient to use.
Decompression on the client side with Javacscript can take a significant longer time and highly depends on the available bandwidth of the client's box. Why not just implement a lesser but faster and easier to write decompression like rle, delta or golomb code? Or maybe you want to look into compressed Jsons?