I have code running in a web worker, but because I can't post an object with methods (functions) to it, I don't know how to stop the following code from blocking the UI:
if (data != 'null') {
obj['backupData'] = obj.tbl.data().toArray();
obj['backupAllData'] = data[0];
}
obj.tbl.clear();
obj.tbl.rows.add(obj['backupAllData']);
var ext = config.extension.substring(1);
$.fn.dataTable.ext.buttons[ext + 'Html5'].action(e, dt, button, config);
obj.tbl.clear();
obj.tbl.rows.add(obj['backupData'])
This code exports records from an HTML table. data is an array returned from a web worker, and it can sometimes hold 50k or more objects.
Because obj and all the methods it contains cannot be transferred to the web worker, the UI blocks when data holds 30k, 40k, 50k or more objects.
What is the best way to do this?
Thanks in advance.
You could try wrapping the heavy work in an asynchronous callback such as a timeout. This lets the engine queue the whole block and run it once the current task has finished:
setTimeout(function(){
if (data != 'null') {
obj['backupData'] = obj.tbl.data().toArray();
obj['backupAllData'] = data[0];
}
//heavy stuff
}, 0)
Or, if the code is extremely long, you can try to work out a strategy for splitting it into chunks of operations and running each chunk in a separate asynchronous callback (timeout).
Best way to iterate over an array without blocking the UI
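A minimal sketch of the chunking idea, assuming the heavy step can be expressed as a per-item callback. The names sliceRanges, processInChunks and chunkSize are made up for illustration and are not part of DataTables or the question's code.

```javascript
// Compute [start, end) index pairs that cover `length` items in steps of `chunkSize`.
function sliceRanges(length, chunkSize) {
  var ranges = [];
  for (var start = 0; start < length; start += chunkSize) {
    ranges.push([start, Math.min(start + chunkSize, length)]);
  }
  return ranges;
}

// Run `handleItem` over `items`, one chunk per timer tick, so the browser
// can paint and handle input between chunks. Calls `done` when finished.
function processInChunks(items, chunkSize, handleItem, done) {
  var ranges = sliceRanges(items.length, chunkSize);
  var next = 0;
  function runChunk() {
    if (next >= ranges.length) {
      if (done) done();
      return;
    }
    var range = ranges[next++];
    for (var i = range[0]; i < range[1]; i++) {
      handleItem(items[i], i);
    }
    setTimeout(runChunk, 0); // yield to the event loop before the next chunk
  }
  runChunk(); // the first chunk runs immediately
}
```

This only helps for work that can be done item by item. A single synchronous call, such as the export action in the question, would still run in one go.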
Update:
Sadly, ImmutableJS doesn't work at the moment across webworkers. You should be able to transfer the ArrayBuffer so you don't need to parse it back into an array. Also read this article. If your workload is that heavy, it would be best to actually send back one item at a time from the worker.
Previously:
The code is converting all the data into an array, which is immediately costly. Try returning an immutable data structure from web worker if possible. This will guarantee that it doesn't change when the references change and you can continue iterating over it slowly in batches.
The next thing you can do is to use requestIdleCallback to schedule small batches of items to be processed.
This way you should be able to make the UI breathe a bit.
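Roughly what the requestIdleCallback batching could look like. This is a sketch under assumptions: scheduleIdle, drainWhileIdle and processWhenIdle are names invented here, and the 50 ms budget in the fallback is an arbitrary choice for environments without requestIdleCallback.

```javascript
// Use requestIdleCallback where it exists; otherwise fall back to a timer
// that hands the callback a fixed 50 ms budget (an assumption for this sketch).
function scheduleIdle(callback) {
  if (typeof requestIdleCallback === 'function') {
    return requestIdleCallback(callback);
  }
  return setTimeout(function () {
    var start = Date.now();
    callback({
      didTimeout: false,
      timeRemaining: function () { return Math.max(0, 50 - (Date.now() - start)); }
    });
  }, 1);
}

// Process items while the deadline still reports time remaining.
// Returns the index of the next unprocessed item.
function drainWhileIdle(items, startIndex, deadline, handleItem) {
  var i = startIndex;
  while (i < items.length && deadline.timeRemaining() > 0) {
    handleItem(items[i], i);
    i++;
  }
  return i;
}

// Keep scheduling idle callbacks until every item has been handled.
function processWhenIdle(items, handleItem, done) {
  var index = 0;
  function step(deadline) {
    index = drainWhileIdle(items, index, deadline, handleItem);
    if (index < items.length) {
      scheduleIdle(step);
    } else if (done) {
      done();
    }
  }
  scheduleIdle(step);
}
```

The batch size is not fixed here: each idle period handles as many items as fit before the browser needs the thread back.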
Related
I'm developing a Facebook app that searches for Facebook events near your position. The only way to do so is to search for all the place IDs in your zone and then check, for each of them, whether there is an event today. The problem is that the computation takes about 1 to 1.5 minutes, which is rather long. This is the code I use (it might not be the best, I know):
foreach (var item in allPlacesIds)
{
RunOnUiThread (() =>loading.Text = string.Format ("Loading {0} possible events out of {1}",count,allPlacesIds.Count));
string query = string.Format ("{0}?&fields=id,name,events.fields(id,name,description,start_time,attending_count,declined_count,maybe_count,noreply_count).since({1}).until({2})", item,dateNow,dateTomorrow);
JsonObject result=(JsonObject)fb.Get (query, null);
try
{
JsonArray allEvents= (JsonArray)((JsonObject) result ["events"])["data"];
foreach (var events in allEvents)
{
Events theEvent= new Events(((JsonObject)events) ["id"].ToString(),
((JsonObject)events) ["name"].ToString(),
((JsonObject)events) ["description"].ToString(),
((JsonObject)events) ["start_time"].ToString(),
int.Parse(((JsonObject)events) ["attending_count"].ToString()),
int.Parse(((JsonObject)events) ["declined_count"].ToString()),
int.Parse(((JsonObject)events) ["maybe_count"].ToString()),
int.Parse(((JsonObject)events) ["noreply_count"].ToString()));
todaysEvents.Add(theEvent);
}
}
catch(Exception ex)
{
}
count++;
}
Where the try starts I used to have an if, but that made it take even longer, so I replaced it with a try block, as the result comes back as null.
I know this isn't exactly a technical issue, but I felt you might know a faster and better implementation. My only other option is to create and host a web service and use it just to query the data. The problem with that is that I would need to invest a lot of money in a server and a real IP, and then create a scheduled job to update the data daily.
Each API call takes some time; the only way to make it faster is to use Batch Requests. Here's the documentation about those: https://developers.facebook.com/docs/graph-api/making-multiple-requests
Keep in mind that a batch will not count as one API call. It's still the same number of calls, so be careful with API limits.
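To illustrate the shape of a batch, here is a sketch in JavaScript (the question's code is C#, but the payload is the same JSON either way). It assumes the batch format from the linked documentation: an array of objects with method and relative_url, with at most 50 entries per batch. buildBatches and fieldsQuery are hypothetical names; check the documentation for the current limit.

```javascript
var MAX_BATCH_SIZE = 50; // assumed per-batch limit, taken from the batch documentation

// Group place ids into batch payloads, one GET entry per place.
function buildBatches(placeIds, fieldsQuery, batchSize) {
  var size = Math.min(batchSize || MAX_BATCH_SIZE, MAX_BATCH_SIZE);
  var batches = [];
  for (var i = 0; i < placeIds.length; i += size) {
    batches.push(placeIds.slice(i, i + size).map(function (id) {
      return { method: 'GET', relative_url: id + '?fields=' + fieldsQuery };
    }));
  }
  return batches;
}
```

Each element of the result would then be serialized with JSON.stringify and sent as the batch parameter of a single POST, so 500 places need 10 round trips instead of 500.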
I am working on a project which is aimed at the Chrome browser. Our goal which we would like to accomplish is to get a one million record array into the browser to work with the data. When I generated a test file that contained a million records it was a bit more than one gigabyte.
For reasons I will explain, I believe we can accomplish the goal if we can get the browser to collect garbage when necessary. I believe the browser holds on to the text of the AJAX responses when it doesn't need to, and crashes for that reason.
Now, I can generate a million records within the browser and manipulate it as I need to. However, I have trouble sending the AJAX to the browser without crashing it.
Since sending one million crashes it, I tried sending batches of one hundred thousand. I can get two such batches across and parse the JSON. If I do not have a onreadystatechange on my AJAX call, I can make the call a number of times. Also, if I receive a hundred thousand records, I can go over it ten times and make the full array.
Because I seem to be able to actually hold one million records, I believe that, as I said, holding the response texts is overwhelming the browsers.
To try to get better memory management, I have pushed the AJAX requests and parsing into a web worker. When the web worker gets the AJAX response and builds the hundred-thousand-record array, it pushes it to the DOM thread. Once the DOM thread has taken the data, it has the web worker make another AJAX request.
However, it still crashes.
I am open to using websockets or something else, if that would help somehow.
Here is the code in the DOM thread:
var iterations=3;
var url='hunthou.json';
var worker=new Worker('src/worker.js');
var count=0;
worker.addEventListener('message',function(e){
alert('count: '+count);
//bigArr=bigArr.concat(e.data);
console.log('e.data length: '+e.data.length);
bigArr[count]=e.data;
console.log('bigArr length: '+bigArr.length);
if(count<(iterations-1)){
worker.postMessage(url);
} else{
alert('done');
console.log('done');
worker.terminate();
console.log('bye');
}
count++;
});
worker.postMessage(url);
Here is the webworker:
var arr=[];
var request = new XMLHttpRequest();
request.onreadystatechange = function () {
var DONE = this.DONE || 4;
if (this.readyState === DONE){
arr=JSON.parse(request.responseText);
self.postMessage(arr);
arr.length=0;
request.responseText.length=0;
console.log('okay');
}
};
self.addEventListener('message', function(e) {
var url=e.data;
console.log('url: '+url);
request.open("GET",'../'+url,true);
request.send(null);
}, false);
Instead of sending the whole data set at once, you can create multiple requests that each retrieve a chunk of the data. This should prevent your browser from crashing.
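A sketch of what that could look like, assuming the server can return a slice of the records for a given offset and limit. pageWindows and loadSequentially are made-up names, and fetchPage is a hypothetical function that returns a Promise of one page of records.

```javascript
// List the { offset, limit } windows needed to cover `total` records.
function pageWindows(total, pageSize) {
  var pages = [];
  for (var offset = 0; offset < total; offset += pageSize) {
    pages.push({ offset: offset, limit: Math.min(pageSize, total - offset) });
  }
  return pages;
}

// Request the pages one after another, so that at most one response is
// in flight at a time. `onPage` receives each page as it arrives.
function loadSequentially(total, pageSize, fetchPage, onPage) {
  return pageWindows(total, pageSize).reduce(function (chain, page) {
    return chain.then(function () {
      return fetchPage(page.offset, page.limit);
    }).then(function (records) {
      onPage(records, page);
    });
  }, Promise.resolve());
}
```

With pages much smaller than one hundred thousand records, each response text stays small, which may make it easier for the browser to reclaim memory between requests.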
I'm supposed to parse a very large JSON array in JavaScript. It looks like:
mydata = [
{'a':5, 'b':7, ... },
{'a':2, 'b':3, ... },
.
.
.
]
Now the thing is, if I pass this entire object to my parsing function parseJSON(), then of course it works, but it blocks the tab's process for 30-40 seconds (in case of an array with 160000 objects).
During this entire process of requesting this JSON from a server and parsing it, I'm displaying a 'loading' gif to the user. Of course, after I call the parse function, the gif freezes too, leading to bad user experience. I guess there's no way to get around this time, is there a way to somehow (at least) keep the loading gif from freezing?
Something like calling parseJSON() on chunks of my JSON every few milliseconds? I'm unable to implement that though being a noob in javascript.
Thanks a lot, I'd really appreciate if you could help me out here.
You might want to check this link. It's about multithreading.
Basically :
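var url = 'http://bigcontentprovider.com/hugejsonfile';
// Source of the worker, built as a string (a quoted string cannot span lines).
// The server is expected to answer with JSONP that calls send(...), which
// forwards the data to the page and then closes the worker.
var f = [
'(function() {',
'  self.send = function(e) {',
'    postMessage(e);',
'    self.close();',
'  };',
'  importScripts("' + url + '?format=json&callback=send");',
'})();'
].join('\n');
var _blob = new Blob([f], { type: 'text/javascript' });
var _worker = new Worker(window.URL.createObjectURL(_blob));
_worker.onmessage = function(e) {
// Do what you want with your JSON (available as e.data)
};
// No postMessage call is needed: the worker starts loading as soon as it is created.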
Haven't tried it myself to be honest...
EDIT about portability: Sebastien D. posted a comment with a link to mdn. I just added a ref to the compatibility section id.
I have never encountered a complete page lock down of 30-40 seconds, I'm almost impressed! Restructuring your data to be much smaller or splitting it into many files on the server side is the real answer. Do you actually need every little byte of the data?
Alternatively, if you can't change the file, a worker thread as in @Cyrill_DD's answer will be able to parse the data for you and send it to your primary JS. This is not a perfect fix, as you might guess. Passing data between the two threads requires the information to be serialised and reinterpreted, so you could see a significant slowdown when the data is passed between the threads, and be back to square one if you try to pass all the data across at once. Building a query system into your worker thread, so that you request chunks of the data when you need them and receive them through the message callback, will prevent the slowdown from parsing on the main thread and give you complete access to the data without loading it all into your main context.
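A rough sketch of such a query system, to be placed in the worker script. The message shapes (type 'load', 'query', 'loaded', 'rows') and the helper answerQuery are invented for this example; the worker could just as well fetch the file itself instead of receiving the JSON text.

```javascript
// Worker side: the parsed array lives here and never crosses to the page in full.
var records = [];

// Pure helper: build the reply for a { start, count } query.
function answerQuery(allRecords, query) {
  var start = Math.max(0, query.start);
  return {
    start: start,
    total: allRecords.length,
    rows: allRecords.slice(start, start + query.count)
  };
}

// Attach the message handler only when running inside a worker.
if (typeof importScripts === 'function') {
  self.onmessage = function (e) {
    if (e.data.type === 'load') {
      records = JSON.parse(e.data.json); // parsing happens off the main thread
      self.postMessage({ type: 'loaded', total: records.length });
    } else if (e.data.type === 'query') {
      self.postMessage({ type: 'rows', result: answerQuery(records, e.data) });
    }
  };
}
```

On the page, worker.postMessage({ type: 'query', start: 0, count: 500 }) would then bring back only 500 rows, so each transfer between the threads stays small.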
I should add that worker threads are relatively new, main browser support is good but mobile is terrible... just a heads up!
I'm trying to transmit an image file from the server to the client, but my JavaScript callback fires before the stream closes. I'm doing this because sending it with a traditional render json: times out and takes way too long anyway. The stream takes much less time, but I can't get all the data before the callback fires.
controller code
def mytest
image=ImageList.new(AssistMe.get_url(image_url))
response.stream.write image.export_pixels(0, 0, image.columns, image.rows, 'RGBA').to_s
response.stream.close
end
javascript
var getStream, runTest;
runTest = function() {
return $.post('/dotest', getStream);};
getStream = function(params) {
return document.getElementById('whatsup2').innerHTML =
"stream is here " + params.length;};
The response is an array. I can make it an array of arrays by adding a "[" at the front and a "],['finish'] at the end to be able to detect the end of the data, but I haven't been able to figure out how to get JavaScript to wait until the end of the stream before running. I assume I need to set up some kind of poll to check for the end, but how do I attach it to the callback?
Okay, here's a blog that describes this pretty well
blog
But I decided to forgo a stream and use .to_s. Since you can chain several actions together, as in
render object.method.method.to_s, you get all the server-side benefits of using a stream without the complexity. If you have a slow process where you need to overlap the client and server actions, then go to the blog and do it. Otherwise to_s covers it pretty well.
My situation ...
I have a set of workers that are scheduled to run periodically, each at different intervals, and would like to find a good implementation to manage their execution.
Example: Let's say I have a worker that goes to the store and buys me milk once a week. I would like to store this job and its configuration in a MySQL table. But it seems like a really bad idea to poll the table (every second?) to see which jobs are ready to be put into the execution pipeline.
All of my workers are written in javascript, so I'm using node.js for execution and beanstalkd as a pipeline.
If new jobs (ie. scheduling a worker to run at a given time) are being created asynchronously and I need to store the job result and configuration persistently, how do I avoid polling a table?
Thanks!
I agree that it seems inelegant, but given the way that computers work something *somewhere* is going to have to do polling of some kind in order to figure out which jobs to execute when. So, let's go over some of your options:
1. Poll the database table. This isn't a bad idea at all; it's probably the simplest option if you're storing the jobs in MySQL anyway. A rate of one query per second is nothing. Give it a try and you'll notice that your system doesn't even feel it.
Some ideas to help you scale this to possibly hundreds of queries per second, or just keep system resource requirements down:
Create a second table, 'job_pending', where you put the jobs that need to be executed within the next X seconds/minutes/hours.
Run queries on your big table of all jobs only once in a longer while, then populate the small table, which you query at shorter intervals.
Remove jobs that were executed from the small table in order to keep it small.
Use an index on your 'execute_time' (or whatever you call it) column.
2. If you have to scale even further, keep the main jobs table in the database and use the second, smaller table I suggest, but put that table in RAM: either as a memory table in the DB engine, or in a queue of some kind in your program. Query the queue at extremely short intervals if you have to; it'll take some extreme use cases to cause any performance issues here.
The main issue with this option is that you'll have to keep track of jobs that were in memory but didn't execute, e.g. due to a system crash. More coding for you...
3. Create a thread for each of a bunch of jobs (say, all jobs that need to execute in the next minute), and call thread.sleep(millis_until_execution_time) (or whatever, I'm not that familiar with node.js).
This option has the same problem as no. 2: you have to keep track of job execution for crash recovery. It's also the most wasteful imo, since every sleeping job thread still takes system resources.
There may be additional options of course - I hope that others answer with more ideas.
Just realize that polling the DB every second isn't a bad idea at all. It's the most straightforward way imo (remember KISS), and at this rate you shouldn't have performance issues so avoid premature optimizations.
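As a sketch of the "small pending list, polled often" idea in node.js: takeDueJobs and startPolling are made-up names, and loadUpcoming and enqueue are hypothetical callbacks. loadUpcoming would read the jobs due soon from the database (and is assumed to return only jobs that have not been handed off yet), and enqueue would push a job into the pipeline.

```javascript
// Split the in-memory pending list into jobs that are due and jobs still waiting.
function takeDueJobs(pending, now) {
  var due = [];
  var waiting = [];
  pending.forEach(function (job) {
    (job.executeTime <= now ? due : waiting).push(job);
  });
  return { due: due, waiting: waiting };
}

// Refill the pending list from the database every `refillEveryMs`,
// and check it for due jobs once per second. Returns a stop function.
function startPolling(loadUpcoming, enqueue, refillEveryMs) {
  var pending = [];
  function refill() {
    loadUpcoming(function (jobs) { pending = jobs; });
  }
  refill();
  var refillTimer = setInterval(refill, refillEveryMs);
  var tickTimer = setInterval(function () {
    var split = takeDueJobs(pending, Date.now());
    pending = split.waiting;
    split.due.forEach(enqueue);
  }, 1000);
  return function stop() {
    clearInterval(refillTimer);
    clearInterval(tickTimer);
  };
}
```

The expensive query runs only on refill; the once-per-second check touches nothing but an array in memory.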
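Why not have a Job object in node.js that's saved to the database?
// Shape of a stored job; the intended type of each field is noted in its comment.
var Job = {
id: 0, // long
task: '', // String
configuration: {}, // JSON
dueDate: new Date(), // Date
finished: false // bit
};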
I would suggest you only store the id in RAM and leave all the other Job data in the database. When your timeout function finally runs it only needs to know the .id to get the other data.
var job = createJob(...); // create from async data somewhere.
job.save(); // save the job.
var id = job.id; // only store the id in RAM
// time left until the job is due (0 if it is already overdue)
var delay = Math.max(0, job.dueDate - Date.now());
// ask the job to be run in the future.
// setTimeout takes the callback first, then the delay in milliseconds.
setTimeout(function() {
// load the job when you want to run it
db.load(id, function(job) {
// run it.
run(job);
// mark as finished
job.finished = true;
// save your finished = true state
job.save();
});
}, delay);
// remove job from RAM now.
job = null;
If the server ever crashes, all you have to do is query all jobs that have [finished=false], load them into RAM and start the setTimeouts again.
If anything goes wrong you should be able to restart cleanly like such:
db.find("job", { finished: false }, function(jobs) {
each(jobs, function(job) {
var id = job.id;
// callback first, then the remaining delay (0 if the job is already overdue)
setTimeout(function() {
// load the job when you want to run it
db.load(id, function(job) {
// run it.
run(job);
// mark as finished
job.finished = true;
// save your finished = true state
job.save();
});
}, Math.max(0, job.dueDate - Date.now()));
job = null;
});
});
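One thing to watch with this approach: setTimeout stores its delay as a 32-bit signed integer, so delays above 2147483647 ms (about 24.8 days) do not wait as long as requested. A weekly job is fine, but for jobs further out, a small wrapper can re-arm the timer. This is a sketch; nextDelay and scheduleAt are names made up here.

```javascript
var MAX_TIMEOUT_MS = 2147483647; // 2^31 - 1, the largest delay setTimeout handles reliably

// Milliseconds to wait now: never negative, never above the timer limit.
function nextDelay(dueTime, now) {
  return Math.min(Math.max(0, dueTime - now), MAX_TIMEOUT_MS);
}

// Call `run` at `dueTime` (milliseconds since the epoch), re-arming the
// timer when the wait had to be split into several shorter timeouts.
function scheduleAt(dueTime, run) {
  setTimeout(function () {
    if (Date.now() >= dueTime) {
      run();
    } else {
      scheduleAt(dueTime, run); // not due yet, arm the timer again
    }
  }, nextDelay(dueTime, Date.now()));
}
```

In the snippets above, the setTimeout call would become scheduleAt(job.dueDate.getTime(), function() { ... }), with the same body inside.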