I have a CPU-intensive task in JavaScript that blocks the VM while executing inside a Promise.
An example could be the following (try it out in the browser):
function task() {
  return new Promise((r, s) => {
    for (var x = 0; x < 1000000 * 1000000; x++) {
      var Y = Math.sqrt(x / 2);
    }
    return r(true);
  });
}
I would like to keep the VM main thread from being blocked, so I tried to detach the work with a setTimeout inside the Promise, passing the resolve and reject along as arguments, like:
function task() {
  return new Promise((r, s) => {
    var self = this;
    setTimeout(function (r, s) {
      for (var x = 0; x < 1000000 * 1000000; x++) {
        var Y = Math.sqrt(Math.sin(x / 2) + Math.cos(x / 2));
      }
      return r(true);
    }, 500, r, s);
  });
}
but with no success. Any idea how to keep the main thread from getting stuck?
In addition to using web workers or similar, you can also (depending on the task) break the work up into smaller chunks that are processed and scheduled using setImmediate. This example is a little silly, but you get the idea.
function brokenUpTask() {
  let x = 0; // Keep track of progress here, not in the loop.
  const limit = 1000000;
  const chunk = 100000;
  return new Promise((resolve) => {
    function tick() { // Work a single chunk.
      const chunkLimit = Math.min(x + chunk, limit);
      for (; x < chunkLimit; x++) { // Resume from x; don't reset it.
        var Y = Math.sqrt(x / 2);
      }
      if (x === limit) { // All done?
        resolve(true);
        return;
      }
      setImmediate(tick); // Still work to do.
    }
    tick(); // Start work.
  });
}
brokenUpTask().then(() => console.log('ok'));
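A caveat on the example above: setImmediate is natively available only in Node.js (and legacy IE/Edge). In other browsers, a rough stand-in (a sketch, not a complete polyfill) can be built on MessageChannel, whose message callbacks are queued as macrotasks without the ~4 ms clamping that nested setTimeout calls get:
// Minimal setImmediate stand-in for browsers, based on MessageChannel.
// Unlike setTimeout(fn, 0), the message callback is not throttled to ~4ms.
const setImmediatePolyfill = (() => {
  const queue = [];
  const channel = new MessageChannel();
  channel.port1.onmessage = () => queue.shift()();
  return (fn) => {
    queue.push(fn);
    channel.port2.postMessage(null);
  };
})();

// Drop-in usage for the tick scheduling above:
setImmediatePolyfill(() => console.log('next chunk'));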
You can use "Web Workers". That way, you stay conform with standard. This will basically be a background thread (for example like BackgroundWorker in older C#).
MDN Web Workers
Using Web Workers to Speed-Up Your JavaScript Applications
I am not sure whether current Node.js supports this natively, so use the
npm web workers package
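For the original loop, a minimal sketch of the worker approach (browser environment assumed; the worker script is created from a Blob URL here purely so the example is self-contained):
function task() {
  return new Promise((resolve, reject) => {
    // Worker source: runs the heavy loop off the main thread.
    const src = `
      onmessage = (e) => {
        let y = 0;
        for (let x = 0; x < e.data; x++) {
          y = Math.sqrt(x / 2);
        }
        postMessage(y);
      };`;
    const url = URL.createObjectURL(new Blob([src], { type: 'text/javascript' }));
    const worker = new Worker(url);
    worker.onmessage = (e) => { resolve(e.data); worker.terminate(); };
    worker.onerror = reject;
    worker.postMessage(1e8); // iteration count
  });
}

task().then(() => console.log('done, and the main thread never blocked'));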
Related
I have a Cordova app for iOS in which I'm using IndexedDB to store significant amounts of data in separate stores in one database.
I want to inform the user of the amount of space being used by the app in this way, partly because the limit for IndexedDB seems to be unclear/different on different devices and I'd like to see where the usage is at the point of failure, and also as a way to warn the user that they need to manage the data they're storing offline before it becomes a problem (although I know I can capture this in the transaction abort event; I just have no idea what the limit is!).
In development I've been using the function below in the browser (I have the browser platform added, just for development), and it has worked well:
function showIndexedDbSize(db_name) {
  "use strict";
  var this_db;
  var storesizes = [];

  function openDatabase() {
    return new Promise(function (resolve, reject) {
      var request = window.indexedDB.open(db_name);
      request.onsuccess = function (event) {
        this_db = event.target.result;
        resolve(this_db.objectStoreNames);
      };
    });
  }

  function getObjectStoreData(storename) {
    return new Promise(function (resolve, reject) {
      // Use the string 'readonly'; the IDBTransaction.READ_ONLY constant is deprecated.
      var trans = this_db.transaction(storename, 'readonly');
      var store = trans.objectStore(storename);
      var items = [];
      trans.oncomplete = function (evt) {
        var szBytes = toSize(items);
        var szMBytes = (szBytes / 1024 / 1024).toFixed(2);
        storesizes.push({
          'Store Name': storename,
          'Items': items.length,
          'Size': szMBytes + 'MB (' + szBytes + ' bytes)'
        });
        resolve();
      };
      var cursorRequest = store.openCursor();
      cursorRequest.onerror = function (error) {
        reject(error);
      };
      cursorRequest.onsuccess = function (evt) {
        var cursor = evt.target.result;
        if (cursor) {
          items.push(cursor.value);
          cursor.continue();
        }
      };
    });
  }

  function toSize(items) {
    var size = 0;
    for (var i = 0; i < items.length; i++) {
      // Rough estimate: two bytes per UTF-16 code unit of the serialized object.
      var objectSize = JSON.stringify(items[i]).length;
      size += objectSize * 2;
    }
    return size;
  }

  openDatabase().then(function (stores) {
    var PromiseArray = [];
    for (var i = 0; i < stores.length; i++) {
      PromiseArray.push(getObjectStoreData(stores[i]));
    }
    Promise.all(PromiseArray).then(function () {
      this_db.close();
      console.table(storesizes);
    });
  });
}
It works well on the device too when the stores total less than about 150MB (there isn't a clear threshold), but it uses JSON.stringify to serialize the objects in order to count the bytes, and as the database grows larger on the device this process forces the app to restart. I'm watching the memory usage in Xcode and it doesn't peak at all. Nothing. It hovers between 25 and 30MB whatever you do, not just here, which seems fine to me. The CPU is also below 5%. The energy usage is high, but I'm not sure that would affect the app negatively beyond draining the battery faster (unless I've misunderstood something). So I'm not sure why it forces an ugly restart. In my endless googling I've learnt that JSON.parse and JSON.stringify are very hungry processes, which is why I switched to IndexedDB in the first place, as it allows storing objects directly and avoids these processes entirely.
My questions are as follows:
Is there a way to amend the function to slow it down (it doesn't need to be fast, just reliable!) to prevent the restart?
Why would the app restart if there is no discernible pressure on the memory in Xcode? Or is this not a very good way of detecting this sort of thing? Is there some hidden garbage collection problem in the function? (I'm a noob when it comes to GC generally, but there don't seem to be any leaks in the app.)
Is there a better way to show the usage of the database that would avoid this problem? Everything I find relies on these JSON processes, and the navigator.storage Web API doesn't appear to be supported on the Cordova iOS platform (which is a real shame, as it works amazingly well in the browser! Gah!)
Any suggestions/thoughts massively appreciated!
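One direction to explore for the first question (a sketch only, untested on Cordova): accumulate the size estimate per record inside the cursor callback instead of collecting every item in an array first, so only one object is held at a time:
// Sketch: same 2-bytes-per-character estimate as above, but computed
// incrementally, so the items array (and its memory) is never needed.
function getObjectStoreSize(db, storename) {
  return new Promise(function (resolve, reject) {
    var trans = db.transaction(storename, 'readonly');
    var store = trans.objectStore(storename);
    var size = 0;
    var count = 0;
    trans.oncomplete = function () {
      resolve({ store: storename, items: count, bytes: size });
    };
    trans.onerror = reject;
    store.openCursor().onsuccess = function (evt) {
      var cursor = evt.target.result;
      if (cursor) {
        size += JSON.stringify(cursor.value).length * 2;
        count++;
        cursor.continue(); // only one record is alive at a time
      }
    };
  });
}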
Following is the code to create a 2D matrix in JavaScript:
function Create2DArray(rows) {
  var arr = [];
  for (var i = 0; i < rows; i++) {
    arr[i] = [];
  }
  return arr;
}
Now I have a couple of 2D matrices inside an array:
const matrices = [];
for (let i = 1; i < 10000; i++) {
  matrices.push(new Create2DArray(i * 100));
}
// I'm just mocking it here. In reality we have data available in matrix form.
I want to do operations on each matrix like this:
for (let i = 0; i < matrices.length; i++) {
  ...doAnythingWithEachMatrix()
}
& since it will be a computationally expensive process, I would like to do it via a web worker so that the main thread is not blocked.
I'm using paralleljs for this purpose since it will provide nice api for multithreading. (Or should I use the native Webworker? Please suggest.)
update() {
  for (let i = 0; i < matrices.length; i++) {
    var p = new Parallel(matrices[i]);
    p.spawn(function (matrix) {
      return doAnythingOnMatrix(matrix);
      // can be anything like transpose, scaling, translate etc...
    }).then(function (matrix) {
      // Return the result back so that I can use those values to update the DOM,
      // or directly update the DOM here.
      // Suggest the best way so that I can prevent crashes and improve performance.
    });
  }
  requestAnimationFrame(update);
}
So my question is: what is the best way of doing this?
Is it ok to use a new Webworker or Parallel instance inside a for loop?
Would it cause memory issues?
Or is it ok to create a global instance of Parallel or Webworker and use it for manipulating each matrix?
Or suggest a better approach.
I'm using Parallel.js as an alternative to a Web Worker.
Is it ok to use parallel.js for multithreading? (Or do I need to use the native Webworker?)
In reality, the matrices would contain position data, and this data is processed by the Web Worker or parallel.js instance behind the scenes; the processed result is returned to the main app, which then uses it to draw items / update the canvas.
UPDATE NOTE
Actually, this is an animation. So it will have to be updated for each matrix during each tick.
Currently, I'm creating a new instance of Parallel inside the for loop. I fear that this is an unconventional approach, or that it could cause memory leaks. I need the best way of doing this. Please suggest.
UPDATE
This is my example:
Following our discussion in the comments, here is an attempt at using chunks. The data is processed in groups of 10 (a chunk), so that you can receive their results regularly, and we only start the animation after receiving 200 of them (a buffer) to get a head start (think of it like a video stream). These values may need to be adjusted depending on how long each matrix takes to process.
That being said, you added details afterwards about the lag you get. I'm not sure if this will solve it, or if the problem lies in your canvas update function. It's just a path to explore:
/*
 * A helper function to process data in chunks
 */
async function processInChunks({ items, processingFunc, chunkSize, bufferSize, onData, onComplete }) {
  const results = [];
  // For each group of {chunkSize} items
  for (let i = 0; i < items.length; i += chunkSize) {
    // Process this group in parallel
    const p = new Parallel(items.slice(i, i + chunkSize));
    // p.map does not return a real Promise, so we create one
    // to be able to await it
    const chunkResults = await new Promise(resolve => {
      return p.map(processingFunc).then(resolve);
    });
    // Add to the results
    results.push(...chunkResults);
    // Pass the results to a callback if we're above the {bufferSize}
    if (i >= bufferSize && typeof onData === 'function') {
      // Flush the results
      onData(results.splice(0, results.length));
    }
  }
  // In case there was less data than the wanted {bufferSize},
  // pass the results anyway
  if (results.length) {
    onData(results.splice(0, results.length));
  }
  if (typeof onComplete === 'function') {
    onComplete();
  }
}
/*
 * Usage
 */

// For the demo, a fake matrix Array
const matrices = new Array(3000).fill(null).map((_, i) => i + 1);
const results = [];
let animationRunning = false;

// For the demo, a function which takes time to complete
function doAnythingWithMatrix(matrix) {
  const start = new Date().getTime();
  while (new Date().getTime() - start < 30) { /* sleep */ }
  return matrix;
}

processInChunks({
  items: matrices,
  processingFunc: doAnythingWithMatrix,
  chunkSize: 10, // Receive results after each group of 10
  bufferSize: 200, // But wait for at least 200 before starting to receive them
  onData: (chunkResults) => {
    results.push(...chunkResults);
    if (!animationRunning) { runAnimation(); }
  },
  onComplete: () => {
    console.log('All the matrices were processed');
  }
});

function runAnimation() {
  animationRunning = results.length > 0;
  if (animationRunning) {
    updateCanvas(results.shift());
    requestAnimationFrame(runAnimation);
  }
}

function updateCanvas(currentMatrixResult) {
  // Just for the demo, we're not really using a canvas
  canvas.innerHTML = `Frame ${currentMatrixResult} out of ${matrices.length}`;
  info.innerHTML = results.length;
}
<script src="https://unpkg.com/paralleljs#1.0/lib/parallel.js"></script>
<h1 id="canvas">Buffering...</h1>
<h3>(we've got a headstart of <span id="info">0</span> matrix results)</h3>
I use async and await very often, and I get this error:
RangeError: Value undefined out of range for undefined options property undefined
    at Set.add (<anonymous>)
    at AsyncHook.init (internal/inspector_async_hook.js:19:25)
    at PromiseWrap.emitInitNative (internal/async_hooks.js:134:43)
I don't know how to fix this. I write my code completely in TypeScript and I haven't created any file named 'async_hooks'.
I don't run more than 10 async functions at once, and I use await very often, so it shouldn't stack up; but JavaScript does not seem to reduce the asyncId, and it reaches the number limit very fast.
I tried to use less async/await, but this didn't fix the problem; the error message just came later. Only by using very little async/await can I keep the error from appearing before the function successfully finishes the job.
(I use Electron 7.)
Electron seems to have a very low async pool, but the problem can be reproduced with plain TypeScript code:
class Test {
  private async testCompare(a, b): Promise<boolean> {
    return a == b;
  }

  public async testRun(): Promise<void> {
    for (let index = 0; index < 999999999; index++) {
      for (let index2 = 0; index2 < 999999999; index2++) {
        await this.testCompare(index, index2);
      }
    }
  }
}

new Test().testRun();
This code produces very high RAM usage, and I think I have the same problem in my program: the async pool gets filled up until it reaches its limit.
I got the same error for Set.add as you did once my set size reached 16777216 (2^24). I'm unable to find any info on this limit in the documentation, but I'll assume Sets are limited to 16777216 unique values.
This is easily tested with a simple for loop.
This will throw the exact same error:
let s = new Set();
for (let i = 0; i <= 16777216; i++) s.add(i);
This will run successfully:
let s = new Set();
for (let i = 0; i < 16777216; i++) s.add(i);
Do note that this will eat up some ~5GB of memory, so increase your heap limit accordingly if it's crashing due to memory restrictions.
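If you genuinely need more entries than that, one workaround (a sketch, not a drop-in replacement for Set) is to shard values across several Sets, picking the shard with a cheap hash of the value:
// Sketch: shard entries across multiple Sets so that no single Set
// hits the ~2^24 entry limit. Works for primitive values.
class ShardedSet {
  constructor(shards = 16) {
    this.shards = Array.from({ length: shards }, () => new Set());
  }
  _shard(v) {
    // Cheap string hash to pick a shard; assumes values stringify sensibly.
    const s = String(v);
    let h = 0;
    for (let i = 0; i < s.length; i++) h = (h * 31 + s.charCodeAt(i)) | 0;
    return this.shards[Math.abs(h) % this.shards.length];
  }
  add(v) { this._shard(v).add(v); return this; }
  has(v) { return this._shard(v).has(v); }
  get size() { return this.shards.reduce((n, s) => n + s.size, 0); }
}

const big = new ShardedSet(32);
big.add(123);
console.log(big.has(123)); // true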
Just encountered this using a Set; I used the following as a quick workaround:
// Note: this shadows the built-in Set and only supports add/has.
class Set {
  hash = {};
  add(v) {
    this.hash[JSON.stringify(v)] = true;
  }
  has(v) {
    return this.hash[JSON.stringify(v)] === true;
  }
}
This has no such limit; it is only bounded by your system's memory.
A workaround for that (about as fast as a Set, and it keeps values unique):
const setA = {};
try {
  for (let i = 0; i < 16777216 + 500; i++) setA[i] = null;
} catch (err) {
  // A plain object has no .size property; count its keys instead.
  console.log('Died at', Object.keys(setA).length, 'because of');
  console.error(err);
}
console.log('lived even after', Object.keys(setA).length);
I've got a blob to construct, and I received almost 100 parts (of 500k each) to decrypt and assemble into a blob file.
It actually works fine, but the decryption is CPU-intensive and freezes my page.
I've tried different approaches, with jQuery's Deferred and with timeouts, but I always hit the same problem.
Is there a way to avoid freezing the UI thread?
var parts = blobs.sort(function (a, b) {
  return a.part - b.part;
});

// our final byte arrays
var byteArrays = [];
for (var i = 0; i < blobs.length; i++) {
  // That job is intensive, and takes time
  byteArrays.push(that.decryptBlob(parts[i].blob.b64, fileType));
}

// create a new blob with all the data
var blob = new Blob(byteArrays, { type: fileType });
The body inside the for(...) loop is synchronous, so the entire decryption process is synchronous: in simple words, decryption happens chunk after chunk. How about making it asynchronous, decrypting multiple chunks in parallel? In JavaScript terminology we can use asynchronous workers. These workers can work in parallel, so if you spawn 5 workers, for example, the total time is reduced to roughly T / 5 (where T is the total time in synchronous mode).
Read more about worker threads here :
https://blog.logrocket.com/node-js-multithreading-what-are-worker-threads-and-why-do-they-matter-48ab102f8b10/
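A minimal sketch of that idea in the browser (decrypt-worker.js is a hypothetical worker script that posts back the decrypted bytes for every part it receives; re-ordering the results by part index before building the blob is left to the caller):
// Sketch: decrypt parts in parallel with a small pool of Workers.
function decryptAll(parts, poolSize = 5) {
  return Promise.all(
    Array.from({ length: poolSize }, (_, w) => {
      // Each worker handles every poolSize-th part.
      const mine = parts.filter((_, i) => i % poolSize === w);
      return new Promise((resolve, reject) => {
        const out = [];
        if (mine.length === 0) { resolve(out); return; }
        const worker = new Worker('decrypt-worker.js');
        worker.onmessage = (e) => {
          out.push(e.data);
          if (out.length === mine.length) {
            worker.terminate();
            resolve(out);
          }
        };
        worker.onerror = reject;
        mine.forEach((part) => worker.postMessage(part));
      });
    })
  ).then((perWorker) => perWorker.flat());
}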
Thanks to Sebastian Simon,
I took the avenue of workers, and it's working fine.
var chunks = [];
var decryptedChunkFnc = function (args) {
  // My blob-building job here
};

// Determine the maximum number of workers to use
var maxWorker = 5;
if (totalParts < maxWorker) {
  maxWorker = totalParts;
}

// Keep the workers in an array; no need for eval here
var workers = [];
for (var iw = 0; iw < maxWorker; iw++) {
  var wo = new Worker("decryptfile.min.js");
  var item = blobs.pop();
  wo.postMessage(MyObjectPassToTheFile);
  wo.onmessage = decryptedChunkFnc;
  workers.push(wo);
}
Dear JavaScript gurus:
I have the following requirements:
Process a large array in batches of 1000 (or any arbitrary size).
When each batch is processed, update the UI to show our progress.
When all batches have been processed, continue with the next step.
For example:
function process_array(batch_size) {
  var da_len = data_array.length;
  var idx = 0;

  function process_batch() {
    var idx_end = Math.min(da_len, idx + batch_size);
    while (idx < idx_end) {
      // do the voodoo we need to do
    }
  }

  // This loop kills the browser ...
  while (idx < da_len) {
    setTimeout(process_batch, 10);
    // Show some progress (no luck) ...
    show_progress(idx);
  }
}
// Process array ...
process_array(1000);
// Continue with next task ...
// BUT NOT UNTIL WE HAVE FINISHED PROCESSING THE ARRAY!!!
Since I am new to JavaScript, I discovered that everything is done on a single thread, and as such one needs to get a little creative with regard to processing and updating the UI. I have found some examples using recursive setTimeout calls (one key difference is that I have to wait until the array has been fully processed before continuing), but I cannot seem to get things working as described above.
Also -- I am in need of a "pure" JavaScript solution -- no third-party libraries and no web workers (as they are not fully supported).
Any (and all) guidance would be appreciated.
Thanks in advance.
You can make a stream from the array and use batch-stream to create batches, so that you can stream batches to the UI.
stream-array
and
batch-stream
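A rough sketch of how those two packages fit together (Node.js; assuming the documented stream-array and batch-stream APIs, and the data_array / process_batch / show_progress names from the question):
const streamify = require('stream-array');
const BatchStream = require('batch-stream');

streamify(data_array)
  .pipe(new BatchStream({ size: 1000 }))
  .on('data', function (batch) {
    process_batch(batch);        // do the voodoo on up to 1000 items
    show_progress(batch.length); // then report progress
  })
  .on('end', function () {
    // All batches processed: continue with the next task here.
  });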
In JavaScript, when executing scripts in an HTML page, the page becomes unresponsive until the script is finished. This is because JavaScript is single-threaded.
You could consider using a web worker in JavaScript, which runs in the background, independently of other scripts, without affecting the performance of the page.
In this case the user can continue to do whatever they want in the UI.
You can send and receive messages from the web worker.
More info on Web Workers here.
So part of the magic of recursion is really thinking about the things that you need to pass in to make it work.
And in JS (and other functional languages) that frequently involves functions.
function processBatch (remaining, processed, batchSize,
                       transform, onComplete, onProgress) {
  if (!remaining.length) {
    return onComplete(processed);
  }
  const batch = remaining.slice(0, batchSize);
  const tail = remaining.slice(batchSize);
  const totalProcessed = processed.concat(batch.map(transform));
  return scheduleBatch(tail, totalProcessed, batchSize,
                       transform, onComplete, onProgress);
}

function scheduleBatch (remaining, processed, batchSize,
                        transform, onComplete, onProgress) {
  onProgress(processed, remaining, batchSize);
  setTimeout(() => processBatch(remaining, processed, batchSize,
                                transform, onComplete, onProgress));
}

const noop = () => {};
const identity = x => x;

function processArray (array, batchSize, transform, onComplete, onProgress) {
  scheduleBatch(
    array,
    [],
    batchSize,
    transform || identity,
    onComplete || noop,
    onProgress || noop
  );
}
This can be simplified extremely, and the reality is that I'm just having a little fun here; but if you follow the trail, you should see recursion in a closed system that works with an arbitrary transform, on arbitrary objects, of arbitrary array lengths, with arbitrary code execution on completion and at the end of each batch, scheduling the next run.
To be honest, you could even swap this implementation out for a custom scheduler, by changing 3 lines of code or so, and then you could log whatever you wanted...
const numbers = [1, 2, 3, 4, 5, 6];
const batchSize = 2;

const showWhenDone = numbers => console.log(`Done with: ${numbers}`);
const showProgress = (processed, remaining) =>
  console.log(`${processed.length} done; ${remaining.length} to go`);
const quintuple = x => x * 5;

processArray(
  numbers,
  batchSize,
  quintuple,
  showWhenDone,
  showProgress
);

// 0 done; 6 to go
// 2 done; 4 to go
// 4 done; 2 to go
// Done with: 5, 10, 15, 20, 25, 30
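For instance, the custom scheduler mentioned above only touches scheduleBatch; a sketch (assuming requestIdleCallback support, with a setTimeout fallback for browsers such as Safari that lack it):
// Sketch: identical pipeline, but batches run when the browser is idle.
function scheduleBatch (remaining, processed, batchSize,
                        transform, onComplete, onProgress) {
  onProgress(processed, remaining, batchSize);
  const run = () => processBatch(remaining, processed, batchSize,
                                 transform, onComplete, onProgress);
  if (typeof requestIdleCallback === 'function') {
    requestIdleCallback(run); // run whenever the browser has spare time
  } else {
    setTimeout(run); // fallback: plain macrotask, as in the original
  }
}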
Overkill? Oh yes. But worth familiarizing yourself with the concepts, if you're going to spend some time in the language.
Thank you all for your comments and suggestions.
Below is the code I settled on. It works for any task (in my case, processing an array) and gives the browser time to update the UI if need be.
The do_task function starts an anonymous function via setInterval that alternates between two steps: processing the array in batches and showing progress. It continues until all elements in the array have been processed.
function do_task() {
  const k_task_process_array = 1;
  const k_task_show_progress = 2;
  var working = false;
  var task_step = k_task_process_array;
  var batch_size = 1000;
  var idx = 0;
  var idx_end = 0;
  var da_len = data_array.length;

  // Start the task ...
  var task_id = setInterval(function () {
    if (!working) {
      working = true;
      switch (task_step) {
        case k_task_process_array:
          idx_end = Math.min(idx + batch_size, da_len);
          while (idx < idx_end) {
            // do the voodoo we need to do ...
            idx++; // advance inside the loop, or it never ends
          }
          task_step = k_task_show_progress;
          break;
        default:
          // Show progress here ...
          // Continue processing the array ...
          task_step = k_task_process_array;
      }
      // Check if done ...
      if (idx >= da_len) {
        clearInterval(task_id);
        task_id = null;
      }
      working = false;
    }
  }, 1);
}
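One possible refinement, sketched below under the same assumptions (a data_array being crunched and a show_progress helper, as in the question; my_data is just a stand-in name for your real array): wrapping the interval in a Promise covers the third requirement, continuing with the next step only after the whole array is done.
// Sketch: the same batching idea, wrapped in a Promise so callers can
// continue only once the whole array has been processed.
function do_task_async(data_array, batch_size = 1000) {
  return new Promise(function (resolve) {
    var idx = 0;
    var task_id = setInterval(function () {
      var idx_end = Math.min(idx + batch_size, data_array.length);
      while (idx < idx_end) {
        // do the voodoo we need to do ...
        idx++;
      }
      show_progress(idx); // assumed to exist, as in the question
      if (idx >= data_array.length) {
        clearInterval(task_id);
        resolve();
      }
    }, 1);
  });
}

// Continue with the next task only after processing has finished:
do_task_async(my_data).then(function () {
  console.log('array fully processed');
});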