I have an Electron app in which I'm writing to a file.
if (fs.existsSync(envFilepath)) {
    envFile = fs.createWriteStream(envFilepath)
    var envs = document.getElementById("envs").value.split(",")
    envs.forEach(function(env) {
        if (env && env.localeCompare(" ") !== 0) {
            env = env.trim()
            envFile.write(env + '\n')
        }
    })
    envFile.end()
    envFile.on('finish', function syncConfigs() {
        // perform additional tasks
    })
}
Sometimes the array envs has elements and the writes happen. In those cases, end() is called, the finish event fires, and I roll along smoothly. However, sometimes envs is empty and nothing is written to the file. Node.js then seems to hang: because I never called write(), the finish event never fires.
Why is this happening? Is there a workaround for this situation?
If you use fs.createWriteStream(), you should listen for the 'close' event, not the 'finish' event.
envFile.on('close', () => {
    // handle done writing
})
You can work around your issue by simply not opening the file when you have nothing to write to it. And if you collect all your data first, it's both more efficient to write it in one call and easy to know in advance whether there is anything to write at all, so you can avoid even opening the file when there's nothing to write:
if (fs.existsSync(envFilepath)) {
    function additionalTasks() {
        // perform additional tasks after data is written
    }

    // get data to be written
    let envs = document.getElementById("envs").value.split(",");
    let data = envs.map(function(env) {
        if (env && env.localeCompare(" ") !== 0) {
            return env.trim() + '\n';
        } else {
            return "";
        }
    }).join("");

    // if there is data to be written, write it
    if (data.length) {
        fs.writeFile(envFilepath, data, function(err) {
            if (err) {
                // deal with errors here
            } else {
                additionalTasks();
            }
        });
    } else {
        // there was no data to write, so go right to the additional tasks,
        // but do it asynchronously so the function is consistently asynchronous
        process.nextTick(additionalTasks);
    }
}
Related
I have a Node.js application that works like a task manager. Every second (using an rxjs timer), it fetches the scheduled tasks from the database. Every task has a filename, and that file is loaded with require. Every file must have a run() method that executes the task. After the task completes, the required file is removed from require.cache. The application fails after a couple of days because the heap is full. I've checked with node --inspect and saw that the required files are kept in the strings. Can somebody explain to me why the required files are kept as strings on the heap even though they are removed from require.cache? Even better, can someone explain how to solve it?
Below is a shortened version of my code:
async test() {
    timer(0, 1000).pipe(
        exhaustMap(async () => {
            return await this.executeTasks();
        })
    ).subscribe();
}

async executeTasks(): Promise<void> {
    // get all queue items
    const tasks = await this.getTasks();
    // for each item
    for (const task of tasks) {
        // execute item
        await this.executeTask(task);
    }
}

async executeTask(item: QueueItem) {
    try {
        const Callback: typeof AutomationCallback = require(item.callbackFile);
        const callback = new Callback();
        await callback.run();
    } catch (error) {
        if (error instanceof Error && (error as any).code === 'MODULE_NOT_FOUND') {
            log_message('Failed executing', item.callback, item.id, `Cannot find file ${item.callbackFile}`);
        } else if (error instanceof RescheduleError) {
            log_message('Failed executing', item.callback, item.id, error.message);
        } else {
            log_message('Failed executing', item.callback, item.id, (error as any).toString());
        }
    } finally {
        // remove require from cache
        try {
            delete require.cache[require.resolve(item.callbackFile)];
        } catch (error) {
        }
    }
}
Below is a screenshot of a heap snapshot; the highlighted file is added to the heap every time the file is loaded with require. As you can see on the left side, the heap slowly fills; the time between snapshot 1 and 2 is about 5 minutes.
If everything is defined in advance, why not require all the runners at the top and select them conditionally?
Example:
class TaskRunnerBase {
    async run() { await this.doRun(); }
    async doRun() {} // implemented by specific runners
}

class TaskRunnerA extends TaskRunnerBase {
    async doRun() {...} // custom implementation for TaskRunnerA
}

class TaskRunnerB extends TaskRunnerBase {
    async doRun() {...} // custom implementation for TaskRunnerB
}

class DefaultRunner extends TaskRunnerBase {
    async doRun() { console.log('Default Runner'); }
}
TaskManager
const TaskRunnerA = require('./task-runners/task-runner-a');
const TaskRunnerB = require('./task-runners/task-runner-b');
// more runners
const DefaultRunner = require('./task-runners/default-runner');

...

async executeTask(item: QueueItem) {
    try {
        const Callback: typeof AutomationCallback = this.getRunnerByType(item.callbackFile);
        const callback = new Callback();
        await callback.run();
    } catch (error) {
        ...
    } finally {
        ...
    }
}

getRunnerByType(type) {
    if (type === 'RunnerA') { // "RunnerA" for example
        return TaskRunnerA;
    } else if (type === 'RunnerB') { // "RunnerB" for example
        return TaskRunnerB;
    }
    ....
    else {
        return DefaultRunner; // If an unknown type is passed, keep going with DefaultRunner.
    }
}
This kind of approach is simple, manageable, and a no-brainer in the long run.
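If the list of runners grows, a plain lookup object can keep the dispatch even flatter. Here is a minimal sketch of the same idea; the runner names and paths are carried over from the example above and are assumptions, not a prescribed layout:

// Hypothetical registry mapping a task's type string to its runner class.
const runners = {
    RunnerA: require('./task-runners/task-runner-a'),
    RunnerB: require('./task-runners/task-runner-b'),
};
const DefaultRunner = require('./task-runners/default-runner');

function getRunnerByType(type) {
    // fall back to DefaultRunner for unknown types, as above
    return runners[type] || DefaultRunner;
}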
require.cache is not the only place where loaded modules are cached. There are a few more, depending on the type of module and how it is loaded, and some of these objects are not even exposed by Node.js, meaning you could not delete them even if you wanted to.
A well-documented example is module.children, an array of loaded modules. If you look at the length of this array, you'll see that it gets larger and larger after each require, because you are effectively loading the same modules over and over again instead of reusing the cached objects:
const Callback: typeof AutomationCallback = require(item.callbackFile);
console.log('Loaded modules:', module.children.length); // <-
const callback = new Callback();
await callback.run();
You could of course clear the children list (which is not used as a cache anyway) together with the require cache:
// No guarantee that this will help.
delete require.cache[require.resolve(item.callbackFile)];
module.children.pop();
I even wrote my own tool to do something like that in a way that doesn't affect the rest of the program.
Doing so may or may not help, and in fact it could even be counterproductive if the sole goal is to prevent memory leaks, since it forces Node.js to allocate resources to load the module every time it is required.
Your best option is probably to get rid of the finally block and keep the require.cache unchanged. In this way, each module will be loaded at most once.
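A minimal sketch of that suggestion, reusing the names from the question's shortened code (QueueItem, AutomationCallback, and the error handling are assumed to stay as they were):

async executeTask(item: QueueItem) {
    try {
        // With no cache eviction, require() loads each callback file once
        // and returns the cached module object on every later call.
        const Callback: typeof AutomationCallback = require(item.callbackFile);
        const callback = new Callback();
        await callback.run();
    } catch (error) {
        // same error handling as before
    }
    // no finally block: require.cache is left untouched
}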
I'm just learning JavaScript, and a common task I perform when picking up a new language is to write a hex-dump program. The requirements are: 1. read the file supplied on the command line, 2. be able to read huge files (a buffer at a time), 3. output the hex digits and printable ASCII characters.
Try as I might, I can't get the fs.read(...) function to actually execute. Here's the code I've started with:
const fs = require('fs');

console.log(process.argv);
if (process.argv.length < 3) {
    console.log("usage: node hd <filename>");
    process.exit(1);
}
fs.open(process.argv[2], 'r', (err, fd) => {
    if (err) {
        console.log("Error: ", err);
        process.exit(2);
    } else {
        fs.fstat(fd, (err, stats) => {
            if (err) {
                process.exit(4);
            } else {
                var size = stats.size;
                console.log("size = " + size);
                var going = true;
                var buffer = new Buffer(8192);
                var offset = 0;
                //while( going ){
                while (going) {
                    console.log("Reading...");
                    fs.read(fd, buffer, 0, Math.min(size - offset, 8192), offset, (error_reading_file, bytesRead, buffer) => {
                        console.log("READ");
                        if (error_reading_file) {
                            console.log(error_reading_file.message);
                            going = false;
                        } else {
                            offset += bytesRead;
                            for (let a = 0; a < bytesRead; a++) {
                                var z = buffer[a];
                                console.log(z);
                            }
                            if (offset >= size) {
                                going = false;
                            }
                        }
                    });
                }
                //}
                fs.close(fd, (err) => {
                    if (err) {
                        console.log("Error closing file!");
                        process.exit(3);
                    }
                });
            }
        });
    }
});
If I comment-out the while() loop, the read() function executes, but only once of course (which works for files under 8K). Right now, I'm just not seeing the purpose of a read() function that takes a buffer and an offset like this... what's the trick?
Node v8.11.1, OSX 10.13.6
First of all, if this is just a one-off script that you run now and then, and not code in a server, then there's no need to use the harder asynchronous I/O. You can use synchronous, blocking I/O with calls such as fs.openSync(), fs.statSync(), fs.readSync(), etc., and then things will work inside your while loop because those calls are blocking (they don't return until the result is ready). You can write normal looping and sequential code with them. One should never use synchronous, blocking I/O in a server environment because it ruins the scalability of a server process (its ability to handle requests from multiple clients), but if this is a one-off local script with only one job to do, then synchronous I/O is perfectly appropriate.
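A minimal sketch of that synchronous approach (the hex formatting is elided, and this is an assumption about how you'd wire it up, not a drop-in replacement for your program):

const fs = require('fs');

const fd = fs.openSync(process.argv[2], 'r');
const size = fs.fstatSync(fd).size;
const buffer = Buffer.alloc(8192);
let offset = 0;

while (offset < size) {
    // readSync blocks until the chunk is read, so a plain while loop works
    const bytesRead = fs.readSync(fd, buffer, 0, Math.min(size - offset, 8192), offset);
    for (let i = 0; i < bytesRead; i++) {
        console.log(buffer[i]); // format as hex here
    }
    offset += bytesRead;
}
fs.closeSync(fd);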
Second, here's why your code doesn't work properly. Javascript in node.js is single-threaded and event-driven. That means that the interpreter pulls an event out of the event queue, runs the code associated with that event and does nothing else until that code returns control back to the interpreter. At that point, it then pulls the next event out of the event queue and runs it.
When you do this:
while (going) {
    fs.read(..., (err, data) => {
        // some logic here that may change the value of the going variable
    });
}
You've just created an infinite loop: the while(going) loop runs forever. It never stops looping and never returns control to the interpreter, so the interpreter can never fetch the next event from the event queue. But the completion of the asynchronous, non-blocking fs.read() arrives through the event queue. So you're waiting for the going flag to change while never allowing the system to process the very events that would change it. In your actual case, you will probably run out of some resource from calling fs.read() over and over in a tight loop, or the interpreter will just hang in the infinite loop.
Understanding how to program a repetitive, looping type of task with asynchronous operations involved requires learning some new programming techniques. Since much of the I/O in node.js is asynchronous and non-blocking, this is an essential skill to develop for node.js programming.
There are a number of different ways to solve this:
Use fs.createReadStream() and read the file by listening for the data event. This is probably the cleanest scheme. Since your objective here is a hex outputter, you might even want to learn the stream feature called a transform, where you transform the binary stream into a hex stream.
Use the promise versions of the relevant fs functions together with async/await so your loop waits for each async operation to finish before starting the next iteration. This lets you write synchronous-looking code that still uses async I/O.
Write a different looping construct (not a while loop) that manually starts the next read after the previous fs.read() completes (a sketch of this appears after the async/await example below).
Here's a simple example using fs.createReadStream():
const fs = require('fs');

function convertToHex(val) {
    let str = val.toString(16);
    if (str.length < 2) {
        str = "0" + str;
    }
    return str.toUpperCase();
}

let stream = fs.createReadStream(process.argv[2]);
let outputBuffer = "";
stream.on('data', (data) => {
    // you get an unknown-length chunk of data from the file here in a Buffer object
    for (const val of data) {
        outputBuffer += convertToHex(val) + " ";
        if (outputBuffer.length > 100) {
            console.log(outputBuffer);
            outputBuffer = "";
        }
    }
}).on('error', err => {
    // some sort of error reading the file
    console.log(err);
}).on('end', () => {
    // output any remaining buffer
    console.log(outputBuffer);
});
Hopefully you will notice that, because the stream handles opening, closing, and reading from the file for you, this is a much simpler way to code it. All you have to do is supply event handlers for data being read, a read error, and the end of the operation.
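If you want to pursue the transform idea mentioned in option 1, here is a minimal sketch using stream.Transform and stream.pipeline (the latter requires Node 10+); the chunking and formatting choices are assumptions, not the only way to do it:

const fs = require('fs');
const { Transform, pipeline } = require('stream');

// Transform each incoming binary chunk into a line of space-separated hex bytes.
const toHex = new Transform({
    transform(chunk, encoding, callback) {
        const hex = Array.from(chunk, (byte) => byte.toString(16).padStart(2, '0').toUpperCase());
        callback(null, hex.join(' ') + '\n');
    }
});

pipeline(fs.createReadStream(process.argv[2]), toHex, process.stdout, (err) => {
    if (err) console.log(err);
});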
Here's a version using async/await and the new file interface (where the file descriptor is an object that you call methods on) with promises in node v10.
const fs = require('fs').promises;

function convertToHex(val) {
    let str = val.toString(16);
    if (str.length < 2) {
        str = "0" + str;
    }
    return str.toUpperCase();
}

async function run() {
    const readSize = 8192;
    let cntr = 0;
    const buffer = Buffer.alloc(readSize);
    const fd = await fs.open(process.argv[2], 'r');
    try {
        let outputBuffer = "";
        while (true) {
            let data = await fd.read(buffer, 0, readSize, null);
            for (let i = 0; i < data.bytesRead; i++) {
                cntr++;
                outputBuffer += convertToHex(buffer.readUInt8(i)) + " ";
                if (outputBuffer.length > 100) {
                    console.log(outputBuffer);
                    outputBuffer = "";
                }
            }
            // see if all data has been read
            if (data.bytesRead !== readSize) {
                console.log(outputBuffer);
                break;
            }
        }
    } finally {
        await fd.close();
    }
    return cntr;
}

run().then(cntr => {
    console.log(`done - ${cntr} bytes read`);
}).catch(err => {
    console.log(err);
});
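And here's the option 3 sketch promised above: a manual looping construct that issues the next fs.read() from the previous read's completion callback instead of using a while loop. The structure is an assumption about how you'd adapt your original code:

const fs = require('fs');

fs.open(process.argv[2], 'r', (err, fd) => {
    if (err) return console.log(err);
    const buffer = Buffer.alloc(8192);
    let offset = 0;

    function readNext() {
        fs.read(fd, buffer, 0, buffer.length, offset, (err, bytesRead) => {
            if (err || bytesRead === 0) {
                // done (or failed): close the file here
                return fs.close(fd, () => {});
            }
            for (let i = 0; i < bytesRead; i++) {
                console.log(buffer[i]); // format as hex here
            }
            offset += bytesRead;
            readNext(); // "loop" by scheduling the next read
        });
    }
    readNext();
});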
I'm a beginner when it comes to promises and I'm trying to understand how to work with them.
I have a Firebase trigger where I am performing some validation. If the validation fails, I want to "exit" the trigger, meaning I don't want any code after the validation to execute. But it does execute. Even though I'm sure the validation fails (the "You have been timed out due to inactivity. Please go back to the bed booking map and start again" message is sent to the Android app I'm developing), the code after it keeps executing. I know this because I've placed console logs inside it.
I've put comments in my code for the validation I'm talking about, and what code I don't want executed.
exports.createCharge = functions.database.ref('/customers/{userId}/charges/{id}')
    .onCreate((snap, context) => {
        console.log("Entered createCharge function");
        const val = snap.val();
        return admin.database().ref(`/customers/${context.params.userId}/customer_id`)
            .once('value').then((snapshot) => {
                return snapshot.val();
            }).then((customer) => {
                // Do stuff
                if (val.type === 'beds') {
                    // Check if user email is the same as bed email
                    for (var i = 0; i < val.beds.length; i++) {
                        var bedEmailRef = db.ref(`beds/${val.hid}/${val.beds[i]}/email`);
                        bedEmailRef.on("value", function(bedEmailSnap) {
                            var bedEmail = bedEmailSnap.val();
                            if (val.email !== bedEmail) { // VALIDATION
                                snap.ref.child('error').set("You have been timed out due to inactivity. Please go back to the bed booking map and start again");
                                return null; // Here, I want to exit the whole function.
                            }
                        });
                    }
                    // If val.email !== bedEmail, I NEVER want to reach here!
                    return admin.database().ref(`/hotels/${val.hid}/bedPrice`)
                        .once('value').then((tempBedPrice) => {
                            // do stuff
                            console.log("got here");
                            return stripe.charges.create(charge, {idempotency_key: idempotencyKey});
                        }).then((response) => {
                            // Do more stuff
                            return snap.ref.set(response);
                        }).catch((error) => {
                            snap.ref.child('error').set(userFacingMessage(error));
                            return reportError(error, {user: context.params.userId});
                        })
                } else throw Error('No type');
            });
    });
Why am I getting this behaviour? How can I stop the code after the validation from executing?
The problem is that you are adding a "listener" after checking for the "beds" type. A solution would be to fetch the data once:
// This is a snippet from the Firebase documentation
return firebase.database().ref('/users/' + userId).once('value').then(function(snapshot) {
    var username = (snapshot.val() && snapshot.val().username) || 'Anonymous';
    // ...
});
Or refactor your code so that the validation happens in a callback and you return early when the criteria aren't met.
Additionally, the Cloud Functions best practices documentation can help you write better cloud functions.
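To make that concrete, here is a hedged sketch of what the bed-email validation could look like with once() and a rejected promise to stop the chain. The names (db, val, snap, admin) come from the question; everything else is an assumption:

// Read every bed's email once, in parallel, and fail fast on a mismatch.
const checks = val.beds.map((bed) =>
    db.ref(`beds/${val.hid}/${bed}/email`).once('value').then((emailSnap) => {
        if (val.email !== emailSnap.val()) {
            throw new Error('bed email mismatch'); // rejects the whole chain
        }
    })
);

return Promise.all(checks).then(() => {
    // only reached when every bed email matched; continue with the charge
    return admin.database().ref(`/hotels/${val.hid}/bedPrice`).once('value');
}).catch((error) => {
    return snap.ref.child('error').set("You have been timed out due to inactivity. Please go back to the bed booking map and start again");
});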
In cases where you want to exit and don't want any more events on the listeners, refactor your code as below:
bedEmailRef.on("value", function(bedEmailSnap) {
var bedEmail = bedEmailSnap.val();
if (val.email !== bedEmail) { // VALIDATION
snap.ref.child('error').set("You have been timed out due to inactivity. Please go back to the bed booking map and start again");
bedEmailRef.off('value'); // This will stop listening for further events
return null; // This is useless because returning inside a listener has no effect
}
});
I am using JavaScript and webdriverio (v2.1.2) to perform some data extraction from an internal site. The idea is:
Authenticate
Open the required URL, when authenticated
In the new page, search for an anchor tag having specific keyword
Once found, click on the anchor tag
Below is what I have tried, and it works (the last two points). I had to use both Q and async to achieve it, but I was hoping to use only Q. Can someone help me figure out how to achieve it using Q only?
var EmployeeAllocationDetails = (function () {
    'use strict';
    /*jslint nomen: true */
    var Q = require('q'),
        async = require('async'),
        _ead_name = 'Employee Allocation Details',
        goToEadFromHome;
    goToEadFromHome = function (browserClient) {
        browserClient.pause(500);
        var deferred = Q.defer();
        browserClient.elements('table.rmg td.workListTD div.tab2 div.contentDiv>a', function (err, results) {
            if (err) {
                deferred.reject(new Error('Unable to get EAD page. ' + JSON.stringify(err)));
            } else {
                async.each(results.value, function (oneResult, callback) {
                    console.log('Processing: ' + JSON.stringify(oneResult));
                    browserClient.elementIdText(oneResult.ELEMENT, function (err, result) {
                        if (err) {
                            if (err.message.indexOf('referenced element is no longer attached to the DOM') > -1) {
                                callback();
                            } else {
                                callback('Error while processing: ' + JSON.stringify(oneResult) + '. ' + err);
                            }
                        } else if (!result) {
                            console.log('result undefined. Cannot process: ' + JSON.stringify(oneResult));
                            callback();
                        } else if (result.value.trim() === _ead_name) {
                            deferred.resolve(oneResult);
                            callback();
                        }
                    });
                }, function (err) {
                    // if any of the processing produced an error, err would equal that error
                    if (err) {
                        // One of the iterations produced an error.
                        // All processing will now stop.
                        console.log('A processing step failed. ' + err);
                    } else {
                        console.log('All results have been processed successfully');
                    }
                }); // end of async.each
            }
        });
        return deferred.promise;
    };
    return {
        launchEad: goToEadFromHome
    };
})();

module.exports = EmployeeAllocationDetails;
Related Github Issue link https://github.com/webdriverio/webdriverio/issues/123
I think you should stick with async. Your code is great: it runs everything in parallel and it handles errors well.
If you want to remove async, there are several options:
use Q flow control
copy paste async's implementation
implement it yourself
If you try to use Q's flow control it will look something like this (pseudo-code):
var getTextActions = [];

function createAction(element) {
    return function () {
        return element.getText();
    };
}

each(elements, function (element) { getTextActions.push(createAction(element)); });

Q.all(getTextActions.map(function (action) { return action(); })).then(function (results) {
    // ... iterate all results and resolve the promise with the first matching element ...
});
Note this implementation has worse performance: it will first get the text from all elements, and only then try to resolve your promise. Your implementation is better, since it resolves as soon as a match is found while everything runs in parallel.
I don't recommend implementing it yourself, but if you still want to, it will look something like this (pseudo-code):
var elementsCount = elements.length;
elements.each(function (element) {
    element.getText(function (err, result) {
        elementsCount--;
        if (!!err) { logger.error(err); /** async handles this much better **/ }
        if (isThisTheElement(result)) { deferred.resolve(result); }
        if (elementsCount == 0) { // in case we ran through all elements and didn't find any
            deferred.resolve(null); // since deferred is resolved only once, calling this again if we found the item will have no effect
        }
    });
});
If something is unclear, or if I didn't hit the spot, let me know and I will improve the answer.
I'd like to make a node.js function that, when called, reads a file and returns its contents. I'm having difficulty doing this because 'fs' is evented. Thus, my function has to look like this:
function render_this() {
    fs.readFile('sourcefile', 'binary', function(e, content) {
        if (e) throw e;
        // I have the content here, but how do I tell people?
    });
    return /* oh no I can't access the contents! */;
};
I know that there might be a way to do this using non-evented IO, but I'd prefer an answer that allows me to wait on evented functions so that I'm not stuck again if I come to a situation where I need to do the same thing, but not with IO. I know that this breaks the "everything is evented" idea, and I don't plan on using it very often. However, sometimes I need a utility function that renders a haml template on the fly or something.
Finally, I know that I can call fs.readFile and cache the results early on, but that won't work because in this situation 'sourcefile' may change on the fly.
OK, so you want your development version to automatically load and re-render the file each time it changes, right?
You can use fs.watchFile to monitor the file and re-render the template each time it changes. I suppose you've got some kind of global variable in your app which states whether the server is running in dev or production mode:
var fs = require('fs');
var http = require('http');

var DEV_MODE = true;

// Let's encapsulate all the nasty bits!
function cachedRenderer(file, render, refresh) {
    var cachedData = null;

    function cache() {
        fs.readFile(file, function(e, data) {
            if (e) {
                throw e;
            }
            cachedData = render(data);
        });

        // Watch the file if needed, and re-render + cache it whenever it changes.
        // You may also move cachedRenderer into a different file and then use a
        // global config option instead of the refresh parameter.
        if (refresh) {
            fs.watchFile(file, {'persistent': true, 'interval': 100}, function() {
                cache();
            });
            refresh = false;
        }
    }

    // simple getter
    this.getData = function() {
        return cachedData;
    }

    // initial cache
    cache();
}

var ham = new cachedRenderer('foo.haml',
    // supply your custom render function here
    function(data) {
        return 'RENDER' + data + 'RENDER';
    },
    DEV_MODE
);

// start server
http.createServer(function(req, res) {
    res.writeHead(200);
    res.end(ham.getData());
}).listen(8000);
Create a cachedRenderer and then access its getData property whenever needed; in case you're in development mode, it will automatically re-render the file each time it changes.
function render_this(cb) {
    fs.readFile('sourcefile', 'binary', function(e, content) {
        if (e) throw e;
        cb(content);
    });
};

render_this(function(content) {
    // tell people here
});
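For completeness, here is a hedged sketch of the same idea using promises and async/await via fs.promises, a newer API than the callback style above; note that readFile without an encoding returns a Buffer rather than a string:

const fs = require('fs').promises;

async function render_this() {
    // await suspends this function, not the event loop,
    // so other events keep being processed while the file is read
    const content = await fs.readFile('sourcefile');
    return content;
}

render_this().then((content) => {
    // tell people here
});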