How to return memory of terminated web worker safely? - javascript

I'm using web workers to upload files.
It works fine with small files.
But with a large file, the upload becomes very slow and my script eventually makes the web page collapse.
The memory used by the web workers is never released.
See the attachment.
The Dedicated Workers keep accumulating and consume gigabytes of memory while a large file is being uploaded.
And I see this warning pile up whenever a web worker calls close():
Scripts may close only the windows that were opened by them.
I throttled threadsQuantity to 5.
So I expect the number of web workers to never exceed 5.
class Queue {
  constructor() {
    this.timestamp = new Date().getTime();
    this.activeConnections = {};
    this.threadsQuantity = 5;
  }

  async sendNext() {
    const activeConnections = Object.keys(this.activeConnections).length;
    if (activeConnections >= this.threadsQuantity) {
      return;
    }
    if (!this.chunksQueue.length) {
      if (!activeConnections) {
        this.complete();
      }
      return;
    }
    let chunkId = this.chunksQueue.pop();
    this.activeConnections[chunkId] = true;
    this.sendChunk(chunkId);
  }

  sendChunk(chunkId) {
    if (window.Worker) {
      let chunk = this.getChunk(chunkId);
      const myWorker = new Worker("/assets/js/worker.js?v=" + this.timestamp);
      myWorker.postMessage([this.timestamp, chunkId, chunk]);
      myWorker.onmessage = (e) => {
        var obj = JSON.parse(e.data);
        if (obj.success) {
          delete this.activeConnections[chunkId];
          this.sendNext();
          close();
        } else {
          this.sendChunk(chunkId);
        }
      };
    }
  }
}
I tried close() and self.close(), but both produced the same warning and failed.
I tried this.close(), but it causes this error:
app.0a4dcc55.js:32 Uncaught TypeError: this.close is not a function
    at _.onmessage
How can I safely kill the web workers once they are done, while the upload is still in progress?
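One thing worth noting here: the onmessage handler above runs in the main window, not inside the worker, so the bare close() call there tries to close the browser window itself, which is exactly what the warning complains about. A minimal sketch of releasing each worker from the main thread with Worker.terminate() instead (based on the sendChunk method from the question, not a verified fix for this particular upload script):

sendChunk(chunkId) {
  if (window.Worker) {
    let chunk = this.getChunk(chunkId);
    const myWorker = new Worker("/assets/js/worker.js?v=" + this.timestamp);
    myWorker.postMessage([this.timestamp, chunkId, chunk]);
    myWorker.onmessage = (e) => {
      var obj = JSON.parse(e.data);
      // The worker has delivered its result either way, so release it
      // from the main thread instead of calling close() here.
      myWorker.terminate();
      if (obj.success) {
        delete this.activeConnections[chunkId];
        this.sendNext();
      } else {
        // retry creates a fresh worker
        this.sendChunk(chunkId);
      }
    };
  }
}

Alternatively, the worker script itself can call self.close() after posting its result, since self there refers to the worker's own global scope rather than the window.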

Related

Web worker Integration

I want to use a web worker to handle my zipcode checker function. I haven't worked with web workers before, so the concept is new to me.
This is my zipcode function:
function checkZipCode() {
  event.preventDefault();
  if (document.getElementById('zipcode').value < 20000) {
    document.getElementById('zip-result').innerHTML = 'Sorry, we haven’t expanded to that area yet';
  } else if (document.getElementById('zipcode').value >= 20000) {
    document.getElementById('zip-result').innerHTML = 'We’ve got your area covered!'
  } else {
    return null
  }
};
As per the docs workers are pretty easy to spin up:
// in a JS file
const myWorker = new Worker('./myWorker.js'); // worker requested and top-level scope code executed
myWorker.postMessage('hello');

myWorker.addEventListener('message', e => {
  // e.data will hold data sent from worker
  const message = e.data;
  console.log(message); // HELLO
  // if it's just a one-time thing, you can kill the worker
  myWorker.terminate();
});

myWorker.addEventListener('error', e => { // worker might throw an error
  e.preventDefault();
  console.log(e.message, `on line ${e.lineno}`);
});

// myWorker.js
// run whatever you need, just no DOM stuff, no window etc
console.log('this line runs when worker loads');
addEventListener('message', (e) => {
  postMessage(e.data.toUpperCase()); // up-case message and send it right back
});
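Applying that pattern to the zipcode checker mostly means keeping the DOM work on the page and sending only the value to the worker. A minimal sketch, assuming the element IDs 'zipcode' and 'zip-result' from the question; the form id 'zip-form' and the file name zipWorker.js are made up for illustration:

// in the page script
const zipWorker = new Worker('./zipWorker.js'); // hypothetical worker file

document.getElementById('zip-form').addEventListener('submit', e => { // hypothetical form id
  e.preventDefault();
  zipWorker.postMessage(Number(document.getElementById('zipcode').value));
});

zipWorker.addEventListener('message', e => {
  // the worker only computes the message; the DOM update stays on the main thread
  document.getElementById('zip-result').innerHTML = e.data;
});

// zipWorker.js
addEventListener('message', e => {
  postMessage(e.data < 20000
    ? 'Sorry, we haven’t expanded to that area yet'
    : 'We’ve got your area covered!');
});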

Postmessage in a loop

I want to bombard the receiver with, say, 1M messages, and I want the receiver to get each message as soon as possible. The naive way is to loop 1M times and just send each message to the receiver with postMessage. That doesn't work.
What I get is that the whole 1M messages are queued, and only when the loop finishes does the receiver start processing them.
What I need is for the sender to send 1M messages while the receiver simultaneously processes them as they arrive.
For example, what I have now is something like this:
sender : send m1.
sender : send m2.
sender : send m3.
receiver : received m1.
receiver : received m2.
receiver : received m3.
What I want:
sender : send m1.
receiver : received m1.
sender : send m2.
receiver : received m2.
sender : send m3.
receiver : received m3.
How can I achieve this? I cannot make the receiver send acks. My goal is to send as many messages as I can, as fast as possible.
Edit: the code I have now:
Sender:
function sendx(x) {
  console.log("start spam");
  for (let i = 0; i < 200000; i++) {
    window.opener.postMessage(x, '*');
  }
  console.log("done");
}
Receiver:
window.addEventListener("message", r_function );
function r_function(event)
{
let index = event.data;
let junk = something(index);
return junk;
}
The sender is a new window created by the receiver. What I get in practice is that the receiver only starts receiving messages once the sendx function ends.
What I need is for the sender to send 1M messages while the receiver simultaneously processes them as they arrive.
That's what happens already.
const worker = new Worker(URL.createObjectURL(
  new Blob([worker_script.textContent])
));
let logged_first = false;
worker.onmessage = e => {
  if (e.data === "spam") {
    if (!logged_first) {
      console.log('received first message at', new Date().toLocaleString());
      logged_first = true; // ignore next messages
    }
  } else {
    console.log(e.data);
  }
};
<script type="text/worker-script" id="worker_script">
const now = performance.now();
postMessage("start spamming at " + new Date().toLocaleString());
while(performance.now() - now < 5000) {
postMessage('spam');
}
postMessage("end spamming at " + new Date().toLocaleString());
</script>
However, for it to work, there is one big condition that needs to be met:
Your two JavaScript instances (sender & receiver) must run on different threads.
That is, if you were doing it with a MessageChannel on the same thread, the receiver would obviously be unable to process the messages at the same time the sender is sending them:
const channel = new MessageChannel();
channel.port1.onmessage = e => {
  console.log('received first message at', new Date().toLocaleString());
  channel.port1.onmessage = null; // ignore next messages
};
const now = performance.now();
console.log("start spamming at ", new Date().toLocaleString());
while (performance.now() - now < 5000) {
  channel.port2.postMessage('spam');
}
console.log("end spamming at ", new Date().toLocaleString());
And if you are dealing with an iframe or another window, you cannot be sure this condition is met. Browsers all behave differently here, but they will all run at least some windows in the same process. You have no control over which process will be used, and hence can't guarantee that you'll run in a different one.
So the best you can do is run your loop as a timed loop, which leaves the browser some idle time during which it can process the other windows' event loops correctly.
And the fastest timed loop we have is actually the one postMessage offers us.
So to do what you wish, the best would be to run each iteration of your loop in the message event of a MessageChannel object.
For this, the generator functions (function*) introduced in ES6 are quite useful:
/***************************/
/* Make Fake Window part   */
/* ONLY for DEMO           */
/***************************/
const fake_win = new MessageChannel();
const win = fake_win.port1;    // window.open('your_url', '')
const opener = fake_win.port2; // used in Receiver

/********************/
/* Main window part */
/********************/
const messages = [];
win.onmessage = e => {
  messages.push(e.data);
};
!function log_msg() {
  document.getElementById('log').textContent = messages.length;
  requestAnimationFrame(log_msg);
}();

/*******************/
/* Receiver part   */
/*******************/
// make our loop a Generator function
function* ourLoopGen(i) {
  while (i++ < 1e6) {
    opener.postMessage(i);
    yield i;
  }
}
const ourLoop = ourLoopGen(0);
// here we init our time-loop
const looper = new MessageChannel();
looper.port2.onmessage = e => {
  const result = ourLoop.next();
  if (!result.done)
    looper.port1.postMessage(''); // wait next frame
};
// start our time-loop
looper.port1.postMessage('');
<pre id="log"></pre>
We could also do the same using ES6 async/await syntax. Since we can be sure that nothing else will interfere with our MessageChannel-powered timed loop (unlike with a Window's postMessage), we can promisify it:
/***************************/
/* Make Fake Window part   */
/* ONLY for DEMO           */
/***************************/
const fake_win = new MessageChannel();
const win = fake_win.port1;    // window.open('your_url', '')
const opener = fake_win.port2; // used in Receiver

/********************/
/* Main window part */
/********************/
const messages = [];
win.onmessage = e => {
  messages.push(e.data);
};
!function log_msg() {
  document.getElementById('log').textContent = messages.length;
  requestAnimationFrame(log_msg);
}();

/*******************/
/* Receiver part   */
/*******************/
const looper = makeLooper();
// our async loop function
async function loop(i) {
  while (i++ < 1e6) {
    opener.postMessage(i);
    await looper.next();
  }
}
loop(0);
// here we init our promisified time-loop
function makeLooper() {
  const engine = new MessageChannel();
  return {
    next() {
      return new Promise((res) => {
        engine.port2.onmessage = e => res();
        engine.port1.postMessage('');
      });
    }
  };
}
<pre id="log"></pre>
But it could obviously also be made entirely ES5 style with callbacks and everything:
/***************************/
/* Make Fake Window part   */
/* ONLY for DEMO           */
/***************************/
var fake_win = new MessageChannel();
var win = fake_win.port1;    // window.open('your_url', '')
var opener = fake_win.port2; // used in Receiver

/********************/
/* Main window part */
/********************/
var messages = [];
win.onmessage = function(e) {
  messages.push(e.data);
};
!function log_msg() {
  document.getElementById('log').textContent = messages.length;
  requestAnimationFrame(log_msg);
}();

/*******************/
/* Receiver part   */
/*******************/
var i = 0;
var looper = makeLooper(loop);
// our callback loop function
function loop() {
  if (i++ < 1e6) {
    opener.postMessage(i);
    looper.next(loop);
  }
}
loop();
// here we init our callback-based time-loop
function makeLooper(callback) {
  var engine = new MessageChannel();
  return {
    next: function() {
      engine.port2.onmessage = function(e) {
        callback();
      };
      engine.port1.postMessage('');
    }
  };
}
<pre id="log"></pre>
But note that browsers will throttle pages that are not in focus anyway, so you may get slower results than in these snippets.

fs.createWriteStream doesn't use back-pressure when writing data to a file, causing high memory usage

Problem
I'm trying to scan a drive directory (recursively walk all the paths) and write all the paths to a file as they are found, using fs.createWriteStream in order to keep memory usage low. But it doesn't work: memory usage reaches 2 GB during the scan.
Expected
I was expecting fs.createWriteStream to automatically handle memory/disk usage at all times, keeping memory usage at a minimum with back-pressure.
Code
const fs = require('fs')
const walkdir = require('walkdir')

let dir = 'C:/'
let options = {
  "max_depth": 0,
  "track_inodes": true,
  "return_object": false,
  "no_return": true,
}

const wstream = fs.createWriteStream("C:/Users/USERNAME/Desktop/paths.txt")
let walker = walkdir(dir, options)

walker.on('path', (path) => {
  wstream.write(path + '\n')
})

walker.on('end', (path) => {
  wstream.end()
})
Is it because I'm not using .pipe()? I tried creating a new stream.Readable({ read() {} }) and then, inside the .on('path') handler, pushing paths into it with readable.push(path), but that didn't really work.
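For clarity, the attempt described above probably looked roughly like this (a minimal sketch reconstructed from the description, not the asker's actual code). Pushing unconditionally like this still ignores back-pressure, because nothing ever pauses walkdir when push() returns false, so the paths simply pile up in the Readable's internal buffer instead of in the write stream's:

const fs = require('fs')
const stream = require('stream')
const walkdir = require('walkdir')

const readable = new stream.Readable({ read() {} }) // no-op _read
const wstream = fs.createWriteStream("C:/Users/USERNAME/Desktop/paths.txt")
readable.pipe(wstream) // pipe() applies back-pressure between the two streams...

walkdir('C:/').on('path', (path) => {
  // ...but push() here never checks its return value and walkdir is never
  // paused, so memory still grows unbounded
  readable.push(path + '\n')
})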
UPDATE:
Method 2:
I tried the drain method proposed in the answers, but it doesn't help much. It does reduce memory usage to 500 MB (which is still too much for a stream), but it slows the code down significantly (from seconds to minutes).
Method 3:
I also tried using readdirp. It uses even less memory (~400 MB) and is faster, but I don't know how to pause it and use the drain method there to reduce memory usage further:
const fs = require('fs')
const readdirp = require('readdirp')

let dir = 'C:/'
const wstream = fs.createWriteStream("C:/Users/USERNAME/Desktop/paths.txt")

readdirp(dir, { alwaysStat: false, type: 'files_directories' })
  .on('data', (entry) => {
    wstream.write(`${entry.fullPath}\n`)
  })
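Since readdirp returns a standard Readable stream, one way to apply back-pressure here would be the same pause/resume dance as with walkdir, keyed off write()'s return value and the drain event. A minimal sketch, assuming the readdirp version in use returns a Readable with pause()/resume(), as current versions do:

const fs = require('fs')
const readdirp = require('readdirp')

const wstream = fs.createWriteStream("C:/Users/USERNAME/Desktop/paths.txt")
const entries = readdirp('C:/', { alwaysStat: false, type: 'files_directories' })

entries.on('data', (entry) => {
  // write() returns false when its internal buffer is above the high-water mark
  if (!wstream.write(`${entry.fullPath}\n`)) {
    entries.pause()                                // stop producing entries...
    wstream.once('drain', () => entries.resume())  // ...until the buffer is flushed
  }
})
entries.on('end', () => wstream.end())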
Method 4:
I also tried doing this operation with a custom recursive walker. It uses only 30 MB of memory, which is what I wanted, but it is about 10 times slower than the readdirp method, and it is synchronous, which is undesirable:
const fs = require('fs')
const path = require('path')

let dir = 'C:/'

function customRecursiveWalker(dir) {
  fs.readdirSync(dir).forEach(file => {
    let fullPath = path.join(dir, file)
    // Folders
    if (fs.lstatSync(fullPath).isDirectory()) {
      fs.appendFileSync("C:/Users/USERNAME/Desktop/paths.txt", `${fullPath}\n`)
      customRecursiveWalker(fullPath)
    }
    // Files
    else {
      fs.appendFileSync("C:/Users/USERNAME/Desktop/paths.txt", `${fullPath}\n`)
    }
  })
}

customRecursiveWalker(dir)
Preliminary observation: you've attempted to get the results you want using multiple approaches. One complication when comparing them is that they do not all do the same work. If you run tests on a file tree that contains only regular files and no mount points, you can probably compare the approaches fairly, but once you add mount points, symbolic links, etc., you may get different memory and time statistics simply because one approach excludes files that another approach includes.
I initially attempted a solution using readdirp, but unfortunately that library appears buggy to me. Running it on my system here, I got inconsistent results: one run would output 10 MB of data, another run with the same input parameters would output 22 MB, then I'd get yet another number, and so on. I looked at the code and found that it does not respect the return value of push:
_push(entry) {
  if (this.readable) {
    this.push(entry);
  }
}
As per the documentation, the push method may return false, in which case the Readable stream should stop producing data and wait until _read is called again. readdirp entirely ignores that part of the specification. Paying attention to the return value of push is crucial for proper handling of back-pressure. There were also other things in that code that seemed questionable to me.
So I abandoned that and worked on a proof of concept showing how it could be done. The crucial parts are: when the push method returns false, it is imperative to stop adding data to the stream; instead, we record where we were and stop. We start again only when _read is called.
If you uncomment the console.log statements that print START and STOP, you'll see them printed in succession on the console: we start, produce data until Node tells us to stop, then stop until Node tells us to start again, and so on.
const stream = require("stream");
const fs = require("fs");
const { readdir, lstat } = fs.promises;
const path = require("path");
class Walk extends stream.Readable {
constructor(root, maxDepth = Infinity) {
super();
this._maxDepth = maxDepth;
// These fields allow us to remember where we were when we have to pause our
// work.
// The path of the directory to process when we resume processing, and the
// depth of this directory.
this._curdir = [root, 1];
// The directories still to process.
this._dirs = [this._curdir];
// The list of files to process when we resume processing.
this._files = [];
// The location in `this._files` were to continue processing when we resume.
this._ix = 0;
// A flag recording whether or not the fetching of files is currently going
// on.
this._started = false;
}
async _fetch() {
// Recall where we were by loading the state in local variables.
let files = this._files;
let dirs = this._dirs;
let [dir, depth] = this._curdir;
let ix = this._ix;
while (true) {
// If we've gone past the end of the files we were processing, then
// just forget about them. This simplifies the code that follows a bit.
if (ix >= files.length) {
ix = 0;
files = [];
}
// Read directories until we have files to process.
while (!files.length) {
// We've read everything, end the stream.
if (dirs.length === 0) {
// This is how the stream API requires us to indicate the stream has
// ended.
this.push(null);
// We're no longer running.
this._started = false;
return;
}
// Here, we get the next directory to process and get the list of
// files in it.
[dir, depth] = dirs.pop();
try {
files = await readdir(dir, { withFileTypes: true });
}
catch (ex) {
// This is a proof-of-concept. In a real application, you should
// determine what exceptions you want to ignore (e.g. EPERM).
}
}
// Process each file.
for (; ix < files.length; ++ix) {
const dirent = files[ix];
// Don't include in the results those files that are not directories,
// files or symbolic links.
if (!(dirent.isFile() || dirent.isDirectory() || dirent.isSymbolicLink())) {
continue;
}
const fullPath = path.join(dir, dirent.name);
if (dirent.isDirectory() & depth < this._maxDepth) {
// Keep track that we need to walk this directory.
dirs.push([fullPath, depth + 1]);
}
// Finally, we can put the data into the stream!
if (!this.push(`${fullPath}\n`)) {
// If the push returned false, we have to stop pushing results to the
// stream until _read is called again, so we have to stop.
// Uncomment this if you want to see when the stream stops.
// console.log("STOP");
// Record where we were in our processing.
this._files = files;
// The element at ix *has* been processed, so ix + 1.
this._ix = ix + 1;
this._curdir = [dir, depth];
// We're stopping, so indicate that!
this._started = false;
return;
}
}
}
}
async _read() {
// Do not start the process that puts data on the stream over and over
// again.
if (this._started) {
return;
}
this._started = true; // Yep, we've started.
// Uncomment this if you want to see when the stream starts.
// console.log("START");
await this._fetch();
}
}
// Change the paths to something that makes sense for you.
stream.pipeline(new Walk("/home/", 5),
fs.createWriteStream("/tmp/paths3.txt"),
(err) => console.log("ended with", err));
When I run the first attempt you made with walkdir here, I get the following statistics:
Elapsed time (wall clock): 59 sec
Maximum resident set size: 2.90 GB
When I use the code I've shown above:
Elapsed time (wall clock): 35 sec
Maximum resident set size: 0.1 GB
The file tree I use for the tests produces a file listing of 792 MB
You could exploit the return value of WritableStream.write(): it essentially tells you whether you should continue writing or not. A WritableStream has an internal property that stores the threshold above which the buffer should be flushed by the OS. The drain event is emitted once the buffer has been flushed, i.e. when you can safely call WritableStream.write() again without the risk of filling the buffer (and thus the RAM) excessively. Luckily for you, walkdir lets you control the process: the emitter returned by walkdir() has pause() (pause the walk; no more events will be emitted until resume) and resume() (resume the walk) methods, so you can pause and resume the walk according to the state of your write stream. Try this:
// "walker" is the emitter returned by walkdir(dir, options) in the question
let is_emitter_paused = false;

wstream.on('drain', (evt) => {
  if (is_emitter_paused) {
    walker.resume();
  }
});

walker.on('path', function(path, stat) {
  is_emitter_paused = !wstream.write(path + '\n');
  if (is_emitter_paused) {
    walker.pause();
  }
});
Here's an implementation inspired by @Louis's answer. I think it's a bit easier to follow, and in my minimal testing it performs about the same.
const fs = require('fs');
const path = require('path');
const stream = require('stream');

class Walker extends stream.Readable {
  constructor(root = process.cwd(), maxDepth = Infinity) {
    super();
    // Dirs to process
    this._dirs = [{ path: root, depth: 0 }];
    // Max traversal depth
    this._maxDepth = maxDepth;
    // Files to flush
    this._files = [];
  }

  _drain() {
    while (this._files.length > 0) {
      const file = this._files.pop();
      if (file.isFile() || file.isDirectory() || file.isSymbolicLink()) {
        const filePath = path.join(this._dir.path, file.name);
        if (file.isDirectory() && this._maxDepth > this._dir.depth) {
          // Add directory to be walked at a later time
          this._dirs.push({ path: filePath, depth: this._dir.depth + 1 });
        }
        if (!this.push(`${filePath}\n`)) {
          // Halt walking
          return false;
        }
      }
    }
    if (this._dirs.length === 0) {
      // Walking complete
      this.push(null);
      return false;
    }
    // Continue walking
    return true;
  }

  async _step() {
    try {
      this._dir = this._dirs.pop();
      this._files = await fs.promises.readdir(this._dir.path, { withFileTypes: true });
    } catch (e) {
      this.emit('error', e); // Uh oh...
    }
  }

  async _walk() {
    this.walking = true;
    while (this._drain()) {
      await this._step();
    }
    this.walking = false;
  }

  _read() {
    if (!this.walking) {
      this._walk();
    }
  }
}

stream.pipeline(new Walker('some/dir/path', 5),
                fs.createWriteStream('output.txt'),
                (err) => console.log('ended with', err));

Electron app stops rendering while running Javascript code

I'm currently working on an electron app that is essentially a DDNS launcher for a media server I control. Basically, it checks for an internet connection, gets the current IP for the server, then opens it in the system's default browser. However, the splash screen that I wrote is totally broken.
Whenever I launch the app on my system (using npm from the terminal), it loads the frame, but the image freezes at about the one-third point. It won't load the rest of the image until the script at the bottom of the main HTML page has finished executing.
Is there something I'm missing about this? I can provide excerpts of the code if needed.
EDIT:
Source code excerpt:
<script>
  function wait(ms) {
    var start = new Date().getTime();
    var end = start;
    while (end < start + ms) {
      end = new Date().getTime();
    }
  }

  const isOnline = require('is-online');
  const isReachable = require('is-reachable');
  const { shell } = require('electron');

  window.onload = function() {
    // Main Script
    console.log('before');
    wait(3000);
    document.getElementById('progresstext').innerHTML = "Testing connection...";
    bar.animate(0.15); // Number from 0.0 to 1.0
    wait(250);
    var amIOnline = false;
    if (isOnline()) {
      amIOnline = true;
    }
    console.log("Internet Test Ran");
    if (!amIOnline) {
      document.getElementById('errortext').innerHTML = "ERROR: No internet connection. Check the internet connection.";
      document.getElementById('progresstext').innerHTML = "ERROR";
    }
    var isEmbyReachable = false;
    if (isReachable('******')) {
      isEmbyReachable = true;
      document.getElementById('progresstext').innerHTML = "Connection Test: Passed";
      //=> true
    }
    // Open Emby in the default browser
    if (amIOnline && isEmbyReachable) {
      shell.openExternal("*****");
    }
  };
</script>
Pastebin link to the full source: https://pastebin.com/u1iZeSSK
Thanks
Development System Specs: macOS Mojave 10.14, Latest stable build of electron
The problem is in your wait function: since Node.js is single-threaded, your wait function blocks the whole process. I really recommend that you take a look at how to write async functions in JavaScript, and at setTimeout and setInterval, as a start.
But for the time being, you may try this code:
window.onload = function () {
  // Main Script
  console.log('before');
  // wait 3 seconds
  setTimeout(function () {
    document.getElementById('progresstext').innerHTML = "Testing connection...";
    bar.animate(0.15); // Number from 0.0 to 1.0
    // wait 250 ms
    setTimeout(function () {
      var amIOnline = false;
      if (isOnline()) {
        amIOnline = true;
      }
      console.log("Internet Test Ran");
      if (!amIOnline) {
        document.getElementById('errortext').innerHTML = "ERROR: No internet connection. Check the internet connection.";
        document.getElementById('progresstext').innerHTML = "ERROR";
      }
      var isEmbyReachable = false;
      if (isReachable('******')) {
        isEmbyReachable = true;
        document.getElementById('progresstext').innerHTML = "Connection Test: Passed";
        //=> true
      }
      // Open Emby in the default browser
      if (amIOnline && isEmbyReachable) {
        shell.openExternal("*****");
      }
    }, 250)
  }, 3000)
};
You should not use while loops (or any other blocking loop) to wait in JavaScript, since they block all other execution, including page rendering.
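Note also that is-online and is-reachable return Promises, so if (isOnline()) is always truthy. A connection check along these lines, using async/await instead of nested timeouts, would avoid that pitfall (a minimal sketch, keeping the question's element IDs and placeholder hosts; not the original answer's code):

window.onload = async function () {
  document.getElementById('progresstext').innerHTML = "Testing connection...";
  bar.animate(0.15);

  const amIOnline = await isOnline(); // resolves to true/false
  if (!amIOnline) {
    document.getElementById('errortext').innerHTML = "ERROR: No internet connection. Check the internet connection.";
    document.getElementById('progresstext').innerHTML = "ERROR";
    return;
  }

  const isEmbyReachable = await isReachable('******'); // placeholder host from the question
  if (isEmbyReachable) {
    document.getElementById('progresstext').innerHTML = "Connection Test: Passed";
    shell.openExternal("*****"); // placeholder URL from the question
  }
};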

Chrome crash with webworkers and createImageBitmap

I'm currently having an issue when loading images with web workers. I want to batch-load a bunch of images and then do some processing on them (in my case, convert the source image to an ImageBitmap using createImageBitmap). Currently the user has the ability to cancel the request, and this causes a crash when terminating the worker if the worker hasn't finished. I've created a fiddle here https://jsfiddle.net/e4wcro0o/18/ that crashes consistently.
The issue lies here:
function closeWorker() {
  if (!isClosed) {
    console.log("terminating worker");
    isClosed = true;
    worker.terminate();
  }
}

for (let i = 0; i < srcImages.length; i++) {
  loadImageWithWorker(new URL(srcImages[i], window.location).toString()).then(function(img) {
    closeWorker();
    console.log(img);
  });
}
It may look a bit funky to call closeWorker() on the first resolved promise, but doing so means the crash is reproducible. I've only tested on Chrome 64.0.3282.186 (Official Build) (64-bit).
Any ideas on what I'm doing wrong?
I have come across the same issue. I think the cause is terminating the worker while createImageBitmap is still running.
I have modified your JSFiddle with a method of terminating the worker at the earliest safe point to avoid the crash.
const worker = createWorker(() => {
  const pendingBitmaps = {};
  var pendingKill = false;

  self.addEventListener("message", e => {
    const src = e.data;

    if (src == "KILL") {
      pendingKill = true;
      Promise.all(Object.values(pendingBitmaps)).then(_ => self.postMessage("READY"));
    }

    // not accepting any more conversions
    if (pendingKill) {
      self.postMessage({ src, bitmap: null });
      return;
    }

    pendingBitmaps[src] = fetch(src).then(response => response.blob())
      .then(blob => {
        if (pendingKill) return null;
        return createImageBitmap(blob);
      })
      .then(bitmap => {
        self.postMessage({ src, bitmap });
        delete pendingBitmaps[src];
      });
  });
});
https://jsfiddle.net/wrf1sLbx/16/
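The snippet above shows only the worker side. The main-thread side of this handshake would presumably look something like the following (a minimal sketch, not taken from the linked fiddle; the KILL/READY message names match the worker code above, and isClosed/worker come from the question):

function closeWorker() {
  if (!isClosed) {
    isClosed = true;
    // Ask the worker to finish its pending createImageBitmap calls first...
    worker.postMessage("KILL");
    worker.addEventListener("message", e => {
      // ...and only terminate once it reports that it is safe to do so.
      if (e.data === "READY") {
        console.log("terminating worker");
        worker.terminate();
      }
    });
  }
}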
