Alternatives to Local Storage with JavaScript

I'm working on an app that needs to post to the server frequently via AJAX, which hurts performance. As a workaround, I started testing localStorage as an alternative. The results are less than ideal: reading and writing localStorage frequently is surprisingly slow, and it kills performance too.
On the other hand, I don't really want to post to the server every time an event fires, since that performs about as badly as localStorage.
I'm wondering what other solutions there are.
Could I create a "data" object that gets continually updated as events fire, with a timeout function that falls back to a post? If so, how would you set that up?
What options are there for storing data on events that can be retrieved quickly? If someone could point me in the right direction I'd be super appreciative.

A simple solution is to collect the information you wish to save in an array and send it every once in a while:
var buffer = [];

// collect all mousemove events on #mydiv
$('#mydiv').mousemove(function(event) {
  buffer.push(event.pageX + ", " + event.pageY);
});

setInterval(function() {
  if (buffer.length > 0) { // save only if something happened
    var to_transmit = buffer; // theoretically between this line and the next
                              // line you may miss an event, but in practice I wouldn't worry
    buffer = [];              // new buffer - next events will be inserted here
    $.ajax({url: "/post/here/", data: to_transmit});
  }
}, 5000); // send ajax every 5 seconds
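
One thing the interval approach leaves open is whatever is still sitting in the buffer when the user navigates away. As a hedged refinement (not part of the original answer), navigator.sendBeacon can flush the remainder during unload:

// flush anything left in the buffer when the page is being torn down;
// sendBeacon queues the request so it can outlive the page
window.addEventListener('unload', function() {
  if (buffer.length > 0) {
    navigator.sendBeacon('/post/here/', JSON.stringify(buffer));
    buffer = [];
  }
});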

Related

Non blocking Javascript and concurrency

I have code in a web worker, and because I can't post an object with methods (functions) to it, I don't know how to stop this code from blocking the UI:
if (data != 'null') {
  obj['backupData'] = obj.tbl.data().toArray();
  obj['backupAllData'] = data[0];
}
obj.tbl.clear();
obj.tbl.rows.add(obj['backupAllData']);
var ext = config.extension.substring(1);
$.fn.dataTable.ext.buttons[ext + 'Html5'].action(e, dt, button, config);
obj.tbl.clear();
obj.tbl.rows.add(obj['backupData']);
This code exports records from an HTML table. data is an array returned from a web worker and can sometimes contain 50k or more objects.
Since obj and all the methods it contains are not transferable to the web worker, the UI blocks when data contains 30k, 40k, 50k elements or more.
What is the best way to do this?
Thanks in advance.
You could try wrapping the heavy work in an async callback such as a timeout, which lets the engine queue the whole chunk of logic and run it as soon as it has time:
setTimeout(function() {
  if (data != 'null') {
    obj['backupData'] = obj.tbl.data().toArray();
    obj['backupAllData'] = data[0];
  }
  //heavy stuff
}, 0);
Or, if the work takes extremely long, you can try to work out a strategy to split your code into chunks of operations and execute each chunk in a separate async function (timeout), as in the sketch below.
Best way to iterate over an array without blocking the UI
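
A minimal sketch of that chunking strategy (chunk size and handler names are illustrative): process the array one slice at a time and yield back to the event loop between slices so the UI can repaint.

// illustrative: process `rows` in chunks of 1000 without blocking the UI
function processInChunks(rows, handleRow, done) {
  var CHUNK = 1000;
  var i = 0;
  (function next() {
    var end = Math.min(i + CHUNK, rows.length);
    for (; i < end; i++) {
      handleRow(rows[i]); // e.g. push one row into the DataTable
    }
    if (i < rows.length) {
      setTimeout(next, 0); // yield so the browser can render and handle input
    } else if (done) {
      done();
    }
  })();
}

// usage with hypothetical handlers:
// processInChunks(data[0], function(row) { obj.tbl.rows.add([row]); },
//                 function() { obj.tbl.draw(); });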
Update:
Sadly, ImmutableJS doesn't work across web workers at the moment. You should be able to transfer the ArrayBuffer so you don't need to parse it back into an array. Also read this article. If your workload is that heavy, it would be best to send back one item at a time from the worker.
Previously:
The code is converting all the data into an array, which is immediately costly. If possible, try returning an immutable data structure from the web worker. That guarantees it won't change when references change, and you can keep iterating over it slowly in batches.
The next thing you can do is use requestIdleCallback to schedule small batches of items to be processed, roughly as sketched below.
That way the UI gets a chance to breathe.
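
A rough sketch of that requestIdleCallback batching (items and processItem are placeholders, and requestIdleCallback is not available in every browser):

// process a queue of items only while the browser has idle time
function processWhenIdle(items, processItem) {
  requestIdleCallback(function work(deadline) {
    // keep going while there is idle time left and work to do
    while (deadline.timeRemaining() > 0 && items.length > 0) {
      processItem(items.shift());
    }
    if (items.length > 0) {
      requestIdleCallback(work); // schedule the next batch
    }
  });
}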

Getting large AJAX responses in Chrome without crashing

I am working on a project aimed at the Chrome browser. The goal is to get a one-million-record array into the browser so we can work with the data. A test file I generated with a million records was a bit more than one gigabyte.
For reasons I will explain, I believe we can accomplish this if we can get the browser to collect garbage when necessary. I believe the browser holds on to the text of the AJAX responses when it doesn't need to, and crashes for that reason.
Now, I can generate a million records within the browser and manipulate them as I need to. However, I have trouble sending the AJAX responses to the browser without crashing it.
Since sending one million records crashes it, I tried sending batches of one hundred thousand. I can get two such batches across and parse the JSON. If I do not set an onreadystatechange on my AJAX call, I can make the call a number of times. Also, if I receive one batch of a hundred thousand records, I can loop over it ten times and build the full array.
Because I seem to be able to hold one million records, I believe that, as I said, holding the response texts is what overwhelms the browser.
To try to get better memory management, I have pushed the AJAX requests and parsing into a web worker. When the web worker gets the AJAX response and builds the hundred-thousand-record array, it pushes it to the DOM thread. When the DOM thread has taken the data, it has the web worker do another AJAX request.
However, it still crashes.
I am open to using websockets or something else, if that would help somehow.
Here is the code in the DOM thread:
var iterations = 3;
var url = 'hunthou.json';
var worker = new Worker('src/worker.js');
var count = 0;
var bigArr = []; // one 100k-record batch per slot

worker.addEventListener('message', function(e) {
  alert('count: ' + count);
  //bigArr = bigArr.concat(e.data);
  console.log('e.data length: ' + e.data.length);
  bigArr[count] = e.data;
  console.log('bigArr length: ' + bigArr.length);
  if (count < (iterations - 1)) {
    worker.postMessage(url);
  } else {
    alert('done');
    console.log('done');
    worker.terminate();
    console.log('bye');
  }
  count++;
});
worker.postMessage(url);
Here is the webworker:
var arr = [];
var request = new XMLHttpRequest();
request.onreadystatechange = function() {
  var DONE = this.DONE || 4;
  if (this.readyState === DONE) {
    arr = JSON.parse(request.responseText);
    self.postMessage(arr);
    arr.length = 0;
    request.responseText.length = 0; // note: responseText is read-only, so this has no effect
    console.log('okay');
  }
};
self.addEventListener('message', function(e) {
  var url = e.data;
  console.log('url: ' + url);
  request.open("GET", '../' + url, true);
  request.send(null);
}, false);
Instead of sending all the data at once, create multiple requests that each retrieve a chunk of the data. This will keep the browser from crashing. A sketch follows.
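A sketch of that chunked approach, assuming the server can be made to serve the records in pages (the ?page= query parameter and page count are hypothetical, not part of the original setup):

var TOTAL_PAGES = 10; // ten 100k-record pages instead of one 1M-record response
var bigArr = [];

function fetchPage(page) {
  var request = new XMLHttpRequest();
  request.onload = function() {
    bigArr.push(JSON.parse(request.responseText)); // keep each page as its own array
    if (page + 1 < TOTAL_PAGES) {
      fetchPage(page + 1); // only one response text is alive at a time
    } else {
      console.log('done, pages received: ' + bigArr.length);
    }
  };
  request.open('GET', 'hunthou.json?page=' + page, true); // ?page= is a hypothetical server feature
  request.send(null);
}

fetchPage(0);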

Parsing a large JSON array in Javascript

I'm supposed to parse a very large JSON array in JavaScript. It looks like:
mydata = [
  {'a': 5, 'b': 7, ... },
  {'a': 2, 'b': 3, ... },
  ...
]
Now the thing is, if I pass this entire object to my parsing function parseJSON(), it works, but it blocks the tab's process for 30-40 seconds (with an array of 160,000 objects).
During the whole process of requesting this JSON from the server and parsing it, I display a 'loading' gif to the user. Of course, once I call the parse function, the gif freezes too, which makes for a bad user experience. I guess there's no way to avoid this time entirely, but is there a way to at least keep the loading gif from freezing?
Something like calling parseJSON() on chunks of my JSON every few milliseconds? I'm unable to implement that, though, being a noob in JavaScript.
Thanks a lot, I'd really appreciate if you could help me out here.
You might want to check this link. It's about multithreading.
Basically :
var url = 'http://bigcontentprovider.com/hugejsonfile';
var f = '(function() {' +
        '  send = function(e) {' +
        '    postMessage(e);' +
        '    self.close();' +
        '  };' +
        '  importScripts("' + url + '?format=json&callback=send");' +
        '})();';
var _blob = new Blob([f], { type: 'text/javascript' });
_worker = new Worker(window.URL.createObjectURL(_blob));
_worker.onmessage = function(e) {
  // Do what you want with your JSON
};
// the worker starts running as soon as it is created, so no postMessage is needed
Haven't tried it myself, to be honest.
EDIT about portability: Sebastien D. posted a comment with a link to MDN; I added a reference to the compatibility section.
I have never encountered a complete page lock-down of 30-40 seconds; I'm almost impressed! Restructuring your data to be much smaller, or splitting it into many files on the server side, is the real answer. Do you actually need every little byte of the data?
Alternatively, if you can't change the file, @Cyrill_DD's answer of a worker thread will be able to parse the data for you and send it to your primary JS. This is not a perfect fix, as you would guess. Passing data between the two threads requires the information to be serialised and reinterpreted, so you could see a significant slowdown when the data crosses between threads, and be back to square one if you try to pass it all across at once. Building a query system into your worker thread for requesting chunks of the data when you need them, and using the message callback, will avoid the parsing slowdown on the main thread and still give you access to all the data without loading it into your main context. A sketch of that idea follows.
I should add that worker threads are relatively new; main browser support is good, but mobile is terrible... just a heads up!
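A sketch of that query-system idea, with made-up event names, file name and chunk size: the worker downloads and parses once, and only ships the slice the page asks for.

// worker.js - download and parse once, then answer requests for slices
var parsed = null;
self.addEventListener('message', function(e) {
  if (e.data.type === 'load') {
    var xhr = new XMLHttpRequest();
    xhr.onload = function() {
      parsed = JSON.parse(xhr.responseText);
      self.postMessage({ type: 'ready', total: parsed.length });
    };
    xhr.open('GET', e.data.url, true);
    xhr.send(null);
  } else if (e.data.type === 'chunk') {
    // only the requested slice crosses the thread boundary
    self.postMessage({ type: 'chunk', start: e.data.start,
                       items: parsed.slice(e.data.start, e.data.start + e.data.count) });
  }
});

// main page - ask for a manageable chunk whenever you actually need one
var worker = new Worker('worker.js');
worker.onmessage = function(e) {
  if (e.data.type === 'ready') worker.postMessage({ type: 'chunk', start: 0, count: 1000 });
  if (e.data.type === 'chunk') render(e.data.items); // render() is a hypothetical display step
};
worker.postMessage({ type: 'load', url: 'mydata.json' });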

Socket.io: How to limit the size of emitted data from client to the websocket server

I have a node.js server with socket.io. My clients use socket.io to connect to the node.js server.
Data is transmitted from clients to server in the following way:
On the client
var Data = {'data1':'somedata1', 'data2':'somedata2'};
socket.emit('SendToServer', Data);
On the server
socket.on('SendToServer', function(Data) {
  for (var key in Data) {
    // Do some work with Data[key]
  }
});
Suppose somebody modifies his client and emits a really big chunk of data to the server. For example:
var Data = {'data1':'somedata1', 'data2':'somedata2', ... and so on until he reaches, say, 'data100000':'data100000'};
socket.emit('SendToServer', Data);
Because of this loop on the server...
for (var key in Data) {
  // Do some work with Data[key]
}
... the server would take a very long time to loop through all this data.
So, what is the best solution to prevent such scenarios?
Thanks
EDIT:
I used this function to validate the object:
function ValidateObject(obj) {
  var i = 0;
  for (var key in obj) {
    i++;
    if (i > 10) { // object is too big
      return false;
    }
  }
  return true; // object is small enough
}
So the easiest thing to do is just check the size of the data before doing anything with it.
socket.on('someevent', function (data) {
  if (JSON.stringify(data).length > 10000) // roughly 10 KB
    return;
  console.log('valid data: ' + data);
});
To be honest, this is a little inefficient: your client sends the message, socket.io parses it into an object, and then you turn it back into a string.
If you want to be more efficient, then on the client side you should enforce maximum message lengths (a sketch of that is below, after the side note).
For even more efficiency (and to protect against malicious users), you should discard packets as they come into Socket.io if they get too long. You'll either need to figure out a way to extend the prototypes to do what you want, or pull the source and modify it yourself. Also, I haven't looked into the socket.io protocol, but I'm sure you'll have to do more than just "discard" the packet. And some packets are ack-backs and nack-backs, so you don't want to mess with those either.
Side note: If you ONLY care about the number of keys then you can use Object.keys(obj) which returns an array of keys:
if (Object.keys(obj).length > 10)
  return;
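The client-side enforcement mentioned above could be as simple as a wrapper around emit. A rough sketch (safeEmit and the limit are made-up names, not part of socket.io):

var MAX_CHARS = 10000; // roughly the same limit the server checks above

function safeEmit(socket, event, payload) {
  // refuse to send anything whose serialized form exceeds the limit
  if (JSON.stringify(payload).length > MAX_CHARS) {
    console.warn('payload too large, not emitting');
    return false;
  }
  socket.emit(event, payload);
  return true;
}

// usage on the client: safeEmit(socket, 'SendToServer', Data);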
You might consider switching to socket.io-stream and handling the input stream directly.
That way you have to join chunks and parse the JSON input manually, but you get the chance to close the connection as soon as the input length exceeds whatever threshold you decide on.
Otherwise (staying with the plain socket.io approach) your callback won't be called until the whole message has been received. That doesn't block your JS main thread, but it wastes memory, CPU and bandwidth.
On the other hand, if your only goal is to avoid overloading your processing algorithm, you can keep limiting it by counting the elements in the received object. For instance:
if (Object.keys(data).length > n) return; // where n is your maximum acceptable number of elements
// But, anyway, this doesn't control the actual size of each element.
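For the socket.io-stream route, here is a rough server-side sketch, assuming socket.io-stream's stream event API (the limit, event name and the io instance are whatever your app already uses; treat the details as an assumption rather than a drop-in):

var ss = require('socket.io-stream');
var LIMIT = 1e6; // max bytes we are willing to accept per message

io.on('connection', function(socket) {
  ss(socket).on('SendToServer', function(stream) {
    var size = 0;
    var chunks = [];
    stream.on('data', function(chunk) {
      size += chunk.length;
      if (size > LIMIT) {
        socket.disconnect(true); // too much data - drop the connection early
        return;
      }
      chunks.push(chunk);
    });
    stream.on('end', function() {
      if (size > LIMIT) return;
      var Data = JSON.parse(Buffer.concat(chunks).toString()); // join chunks and parse manually
      // ... do some work with Data here
    });
  });
});

The client would then have to send its payload through ss.createStream() instead of a plain socket.emit.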
EDIT: Since the question is about how to handle server overload, you should also look at load balancing with nginx: http://nginx.com/blog/nginx-nodejs-websockets-socketio/ - you could have additional servers available in case one client creates a bottleneck. Even if you solve this problem, there are still others, like a client sending lots of small packets, and so on.
The Socket.io library is a bit problematic here: limiting over-sized messages is not available at the websockets layer. There was a pull request three years ago which gives an idea of how it might be solved:
https://github.com/Automattic/socket.io/issues/886
However, because the WebSockets protocol has a finite packet size, it would allow you to stop processing packets once a certain size has been reached. The most efficient place to do this is before the packet is transformed onto the JavaScript heap, which means handling the WebSocket framing yourself - this is what socket.io does for you, but it does not take packet size into account.
If you want to implement your own websocket layer, this WebSocket-Node implementation might be useful:
https://github.com/theturtle32/WebSocket-Node
If you do not need to support older browsers, this pure-websockets approach might be a suitable solution.
Well, I'll go with the JavaScript side of things... let's say you don't want to allow users to go over a certain limit of data; you can just:
var allowedSize = 10;
Object.keys(Data).forEach(function(key, idx) {
  if (idx >= allowedSize) return; // ignore everything past the limit
  // Do some work with Data[key]
});
This not only lets you cycle through the elements of your object properly, it makes the limit easy to apply. (Obviously it can also truncate your own legitimate requests.)
Maybe the destroy buffer size setting is what you need.
From the wiki:
destroy buffer size defaults to 10E7
Used by the HTTP transports. The Socket.IO server buffers HTTP request bodies up to this limit. This limit is not applied to websocket or flashsockets.
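On the 0.9.x line of Socket.IO (where that wiki setting comes from), it would presumably be applied with io.set. A sketch, offered as an assumption rather than verified configuration:

// assumption: socket.io 0.9.x style configuration
var io = require('socket.io').listen(server);
io.configure(function() {
  io.set('destroy buffer size', 1024 * 1024); // reject HTTP request bodies larger than ~1 MB
});

As the wiki note says, this only covers the HTTP transports, not websocket or flashsocket connections.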

Database Backed Work Queue

My situation ...
I have a set of workers that are scheduled to run periodically, each at different intervals, and would like to find a good implementation to manage their execution.
Example: Let's say I have a worker that goes to the store and buys me milk once a week. I would like to store this job and its configuration in a MySQL table. But it seems like a really bad idea to poll the table (every second?) to see which jobs are ready to be put into the execution pipeline.
All of my workers are written in javascript, so I'm using node.js for execution and beanstalkd as a pipeline.
If new jobs (ie. scheduling a worker to run at a given time) are being created asynchronously and I need to store the job result and configuration persistently, how do I avoid polling a table?
Thanks!
I agree that it seems inelegant, but given the way that computers work something *somewhere* is going to have to do polling of some kind in order to figure out which jobs to execute when. So, let's go over some of your options:
Poll the database table. This isn't a bad idea at all - it's probably the simplest option if you're storing the jobs in MySQL anyway. A rate of one query per second is nothing - give it a try and you'll notice that your system doesn't even feel it.
Some ideas to help you scale this to possibly hundreds of queries per second, or just keep system resource requirements down:
- Create a second table, 'job_pending', where you put the jobs that need to be executed within the next X seconds/minutes/hours.
- Run queries on your big table of all jobs only once in a longer while, then populate the small table which you query every shorter while.
- Remove jobs that were executed from the small table in order to keep it small.
- Use an index on your 'execute_time' (or whatever you call it) column.
If you have to scale even further, keep the main jobs table in the database and use the second, smaller table I suggested, but put that table in RAM: either as a memory table in the DB engine, or as a queue of some kind in your program. Query the queue at extremely short intervals if you have to - it will take some extreme use cases to cause any performance issues here.
The main issue with this option is that you'll have to keep track of jobs that were in memory but didn't execute, e.g. due to a system crash - more coding for you...
Create a thread for each of a bunch of jobs (say, all jobs that need to execute in the next minute), and call thread.sleep(millis_until_execution_time) (or whatever, I'm not that familiar with node.js).
This option has the same problem as no. 2 - you have to keep track of job execution for crash recovery. It's also the most wasteful imo - every sleeping job thread still takes system resources.
There may be additional options of course - I hope that others answer with more ideas.
Just realize that polling the DB every second isn't a bad idea at all. It's the most straightforward way imo (remember KISS), and at this rate you shouldn't have performance issues, so avoid premature optimization. A sketch of such a polling loop follows.
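A minimal sketch of that polling loop, assuming the node mysql driver and a job_pending table with an indexed execute_time column (the table, columns and the dispatch helper are illustrative names, not from the question):

var mysql = require('mysql');
var conn = mysql.createConnection({ host: 'localhost', user: 'app', database: 'jobs' });

// poll the small pending table once per second and dispatch anything that is due
setInterval(function() {
  conn.query(
    'SELECT id, task, configuration FROM job_pending WHERE execute_time <= NOW()',
    function(err, rows) {
      if (err) return console.error(err);
      rows.forEach(function(row) {
        enqueueInBeanstalkd(row); // hypothetical helper that pushes the job into the pipeline
        conn.query('DELETE FROM job_pending WHERE id = ?', [row.id]); // keep the table small
      });
    });
}, 1000);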
Why not have a Job object in node.js that's saved to the database?
var Job = {
  id: long,
  task: String,
  configuration: JSON,
  dueDate: Date,
  finished: bit
};
I would suggest you only store the id in RAM and leave all the other Job data in the database. When your timeout function finally runs it only needs to know the .id to get the other data.
var job = createJob(...); // create from async data somewhere.
job.save(); // save the job.
var id = job.id; // only store the id in RAM

// ask the job to be run in the future.
setTimeout(function() {
  // load the job when you want to run it
  db.load(id, function(job) {
    // run it.
    run(job);
    // mark as finished
    job.finished = true;
    // save your finished = true state
    job.save();
  });
}, job.dueDate - Date.now());

// remove job from RAM now.
job = null;
If the server ever crashes, all you have to do is query all jobs that have finished = false, load them into RAM, and start the setTimeouts again.
If anything goes wrong, you should be able to restart cleanly like this:
db.find("job", { finished: false }, function(jobs) {
each(jobs, function(job) {
var id = job.id;
setTimeout(Date.now - job.dueDate, function() {
// load the job when you want to run it
db.load(id, function(job) {
// run it.
run(job);
// mark as finished
job.finished = true;
// save your finished = true state
job.save();
});
});
job = null;
});
});
