I have this server in Node.js using socket.io.
I have a function
function updatePoints() {
    pool.getConnection(function(err, connection) {
        connection.query("SELECT * FROM `users`", function(error, rows) {
            // here it fetches the users' points and emits to their respective sockets
        });
        connection.release();
    });
}
Alright, when I call this function inside another one, it seems to run before the code that comes before it.
Here:
function() {
    connection.query('UPDATE `users` SET ...');
    connection.query('UPDATE `users` SET ...');
    updatePoints();
}
This second function runs when a user donates points to another user: I decrease the donor's points and increase the receiver's.
The funny thing is that when the function is called, it emits for the donor just as I want it to. The donor sees his points update in real time, while the receiver can't see his update unless the donor donates points a second time; only then does he see the first donation come through. If the function is running before it should, how can I make it work?
connection.query() is asynchronous. That means that calling it just starts the operation and it then runs in the background. Meanwhile, the next line of Javascript runs right away while the connection.query() is going on in the background. So, if you do:
connection.query(...);
someFunction();
Then, as you have discovered, someFunction() will execute before connection.query() finishes.
The solution is to sequence your work using the completion callback of the connection.query() operation by putting anything that should happen after the query completes inside the callback itself.
connection.query(..., function(err, rows) {
    // execute some function after the query has completed
    someFunction();
});
This can be extended to more levels like this:
connection.query(..., function(err, rows) {
    // execute another query after the first one
    connection.query(..., function(err, moreRows) {
        // execute some function after both queries have completed
        someFunction();
    });
});
// code placed right here will execute BEFORE the queries finish
Now standard in ES6 and available in many libraries, the concept of promises can also be used to coordinate or synchronize asynchronous operations. You can read more about promises in zillions of places on the web. One introductory explanation is here on MDN.
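Applied to your donation scenario, a minimal sketch (reusing your own connection and updatePoints(), with the UPDATE statements elided as in your snippet; donatePoints is just a hypothetical name) would look like:
function donatePoints() {
    connection.query('UPDATE `users` SET ...', function(err) {
        // the donor's points have been decreased; now increase the receiver's
        connection.query('UPDATE `users` SET ...', function(err) {
            // both UPDATEs have completed, so the SELECT inside
            // updatePoints() will see the new values for both users
            updatePoints();
        });
    });
}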
I don't really understand what you are trying to say, but it seems like a closures problem.
If that is the case, maybe this
How do JavaScript closures work? can help you out.
It explains closures in detail.
Related
I have a cloud function in Firebase that, among a chain of promise invocations, ends with a call to this function:
function sendEmail() {
    return new Promise((accept) => {
        const Email = require('email-templates');
        const email = new Email({...});
        email.send({...}).then(() => {
            console.log('Email sent');
        }).catch((e) => {
            console.error(e);
        });
        accept();
    });
}
I am well aware that email.send() returns a promise. There is a problem with this approach, however: if I were to change the function to:
function sendEmail() {
    const Email = require('email-templates');
    const email = new Email({...});
    return email.send({...});
}
It usually results in the UI hanging for a significant amount of time (10+ seconds), because the time it takes for the promise to resolve equals the time it takes for the email to send.
That's why I figured the first approach would be better: just call email.send() asynchronously, let it send the email eventually, and return a response to the client whether or not the email has finished its round trip.
The first approach is giving me problems, though. The cloud function finishes execution much faster, and thus ends up being a better experience for the user; however, the email doesn't send for another 15+ minutes.
I am considering another approach where we have a separate cloud function hook that handles the email sending, but I wanted to ask StackOverflow first.
I think there are two aspects being mixed here.
One side of the question deals with promises in the context of Cloud Functions. Promises in a Cloud Function need to be resolved before you call res.send(), because right after that call the function is shut down and there is no guarantee that unresolved promises will complete before the function instance is terminated (see this question). Alternatively, you can skip res.send() and instead return the result of a promise, as shown in the Firebase documentation. The key is to ensure the promise is resolved properly, for example with an idiom like return myPromise().then(console.log);, which forces the promise resolution.
Separately, as Bergi pointed out in the comments, the first snippet uses a promise anti-pattern, and the second one is far more concise and clear. If you're experiencing a delay in the UI, it's likely that execution is frozen waiting for the function's response, and you might consider whether this can be avoided in your particular use case.
All that said, your last idea of creating a separate function to handle the email sending would also likely reduce the response time, and could even make more sense from a separation-of-concerns point of view. To go this route, I would suggest sending a Pub/Sub message from the main function so that a second, Pub/Sub-triggered function sends the email. Moreover, Pub/Sub-triggered functions let you configure retry policies, which can be useful for ensuring the mail is sent even in the face of transient errors. This approach is also suggested in the question linked above.
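A rough sketch of that split (assuming the firebase-functions v1 API and a recent @google-cloud/pubsub client; the topic name send-email and the request fields are hypothetical, and the email-templates options are elided as in your snippets):
const functions = require('firebase-functions');
const { PubSub } = require('@google-cloud/pubsub');

const pubsub = new PubSub();

// Main function: enqueue the work and respond right away.
exports.api = functions.https.onRequest(async (req, res) => {
    // ... the rest of your promise chain ...
    await pubsub.topic('send-email').publishMessage({ json: { to: req.body.to } });
    res.send({ ok: true });
});

// Worker function: triggered by the Pub/Sub message, actually sends the email.
// Returning the promise keeps the instance alive until the send completes.
exports.sendEmailWorker = functions.pubsub.topic('send-email').onPublish((message) => {
    const Email = require('email-templates');
    const email = new Email({ /* ... */ });
    return email.send({ /* ... using message.json.to ... */ });
});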
From what I've heard (and actually know), recursion is often not the best solution when we're talking about synchronous code. But I would like to ask someone more experienced than I am about this solution. What do you think about this code? Will it work okay (I suppose it will, since it is not synchronous), or does it have some significant (or not so significant) drawbacks? Where? Why?
I would really appreciate your help; I'm doubtful about this part of the code.
Maybe there is a better solution for this?
I just want a function that can run a promise-returning function (a class method) every fixed time span, plus the time it takes for the async function to resolve.
If that's not clear enough, step by step it should:
exec target promise fn ->
wait for resolve ->
wait for interval ->
exec target promise fn again.
Additionally, it should stop if the promise fn fails.
Thanks in advance!
export function poll(fn: Function, every: number): number {
    let pollId = 0;

    const pollRecursive = async () => {
        try {
            await fn();
        } catch (e) {
            console.error('Polling was interrupted due to the error', e);
            return;
        }
        pollId = window.setTimeout(() => {
            pollRecursive();
        }, every);
    };

    pollRecursive();
    return pollId;
}
Although you have a call to pollRecursive within the function definition, what you actually do is pass a new anonymous function that will call pollRecursive when triggered by setTimeout.
The timeout sets when to queue the call to pollRecursive, but depending on the contents of that queue, the actual function will run slightly later. In any case, this allows all other items in the queue to be processed in turn, unlike a tight loop / tight recursive calls, which would block the main thread.
One addition you may want to make is more graceful error handling, as transient faults are common on the Internet (i.e. there are a lot of places where something can go missing, which makes the occasional failed request part of "normal business" for a TypeScript app that polls).
In your catch block, you could still re-attempt the call after the next timer, rather than stop processing. This would handle transient faults.
To avoid overloading the server after a fault, you can back off exponentially (i.e. double the every value for each consecutive fault). This reduces server load while still letting your application come back online later.
If you are running at scale, you should also add jitter to this back off, otherwise the server will be flooded 2, 4, 8, 16, 32... seconds after a minor fault. This is called a stampede. By adding a little random jitter, the clients don't all come back at once.
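A minimal sketch of those two ideas combined (plain JavaScript for brevity; the maxDelay cap is an assumption, and the types from your poll() are omitted):
function pollWithBackoff(fn, every, maxDelay = 60000) {
    let delay = every;

    const tick = async () => {
        try {
            await fn();
            delay = every; // success: reset to the base interval
        } catch (e) {
            console.error('Poll failed, backing off', e);
            delay = Math.min(delay * 2, maxDelay); // exponential back-off, capped
        }
        // add up to 20% random jitter so many clients don't retry in lock-step
        const jitter = delay * 0.2 * Math.random();
        setTimeout(tick, delay + jitter);
    };

    tick();
}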
node.js beginner here:
A node.js application scrapes an array of links (linkArray) from a list of ~30 URLs.
Each domain/url has a corresponding (name).json file that is used to check whether the scraped links are new or not.
All pages are fetched, links are scraped into arrays, and then passed to:
function checkLinks(linkArray, name) {
    console.log(name, "checkLinks");
    fs.readFile('json/' + name + '.json', 'utf8', function readFileCallback(err, data) {
        // errno -4058 is ENOENT on Windows: the file does not exist yet
        if (err && err.errno != -4058) throw err;
        if (err && err.errno == -4058) {
            console.log(name + '.json', " is NEW .json");
            compareAndAdd(linkArray, [], name);
        } else {
            // file EXISTS
            compareAndAdd(linkArray, JSON.parse(data).linkArray, name);
        }
    });
}
compareAndAdd() reads:
function compareAndAdd(arrNew, arrOld, name) {
    console.log(name, "compareAndAdd()");
    if (!arrOld) arrOld = [];
    if (!arrNew) arrNew = [];

    // compare and remove dups; new (or longer) links are pushed onto arrOld
    function hasDup(value) {
        for (var i = 0; i < arrOld.length; i++)
            if (value.href == arrOld[i].href)
                if (value.text.length <= arrOld[i].text.length) return false;
        arrOld.push(value);
        return true;
    }

    var rArr = arrNew.filter(hasDup);

    // update existing array;
    if (rArr.length > 0) {
        fs.writeFile('json/' + name + '.json', JSON.stringify({linkArray: arrOld}), function (err) {
            if (err) return console.log(err);
            console.log("  " + name + '.json UPDATED');
        });
    }
    else console.log("  " + name, "no changes, nothing to update");
}
checkLinks() is where the program hangs; it's unbelievably slow. I understand that fs.readFile is being hit multiple times a second, but fewer than 30 hits seems pretty trivial for a function that is, in principle, meant to serve data to (potentially) millions of users. Am I expecting too much from fs.readFile, or (more likely) is there another component (like writeFile, or something else entirely) that's locking everything up?
supplemental:
Using write/readFileSync creates a lot of problems: this program is inherently async, because it begins with requests to external websites with widely varying response times, and reads/writes would frequently collide. The functions above ensure that writing to a given file only happens after it has been read (though it is very slow).
Also, this program does not exit on its own, and I do not know why.
edit
I've reworked the program to read first, then write synchronously last, and the process is down to ~12 seconds. Apparently fs.readFile was getting hung when called multiple times. I don't understand when/how to use asynchronous fs if multiple calls hang the function.
All async fs operations are executed inside the libuv thread pool, which has a default size of 4 (can be changed by setting the UV_THREADPOOL_SIZE environment variable to something different). If all threads in the thread pool are busy, any fs operations will be queued up.
I should also point out that fs is not the only module that uses the thread pool, dns.lookup() (the default hostname resolution method used internally by node), async zlib methods, crypto.randomBytes(), and a couple of other things IIRC also use the libuv thread pool. This is just something to keep in mind.
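For illustration, a sketch of raising the pool size; note the value is only read the first time the pool is used, so it must be set before any fs/dns/zlib/crypto work, and passing it on the command line (UV_THREADPOOL_SIZE=8 node app.js) is the safer route:
// Very first line of the entry script, before anything touches the pool.
process.env.UV_THREADPOOL_SIZE = '8';

const fs = require('fs');

// Up to 8 fs operations can now run concurrently in the thread pool
// instead of the default 4.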
If you read many files (checkLinks) in a loop, first ALL the fs.readFile functions will be called, and only AFTER that will the callbacks be processed (callbacks run only when the main call stack is empty). This would lead to a significant start-up delay. But don't worry about that.
You point out that the program never exits. So make a counter, increment it for each call to checkLinks, and decrement it after the callback function runs. Inside the callback, check the counter against 0 and then run your finalizing logic (I suspect this could be a response to the HTTP request).
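A sketch of that counting pattern (pending and onAllDone are hypothetical names):
var pending = 0;

function checkLinks(linkArray, name) {
    pending++;
    fs.readFile('json/' + name + '.json', 'utf8', function (err, data) {
        // ... the existing compare-and-add logic ...
        if (--pending === 0) onAllDone();
    });
}

function onAllDone() {
    console.log('all files processed');
    // finalizing logic goes here (e.g. respond to the HTTP request)
}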
Actually, it doesn't matter much whether you use the async or the sync version; they will take roughly the same amount of time.
Recently I made a web scraper in Node.js using promises. I created a Promise for each URL I wanted to scrape and then used the all method:
var fetchUrlArray = [];
for (...) {
    var mPromise = new Promise(function(resolve, reject) {
        http.get(...);
    });
    fetchUrlArray.push(mPromise);
}
Promise.all(fetchUrlArray).then(...)
There were thousands of urls but only a few of them got timed out. I got the impression that it was handling 5 promises in parallel at a time.
My question is: how exactly does Promise.all() work? Does it:
Call each promise one by one, switching to the next only once the previous one is resolved?
Or process the promises in batches of a few from the array?
Or fire all the promises at once?
What is the best way to solve this problem in Node.js? Because as it stands, I can solve this problem way faster in Java/C#.
What you pass Promise.all() is an array of promises. It knows absolutely nothing about what is behind those promises. All it knows is that those promises will get resolved or rejected sometime in the future and it will create a new master promise that follows the sum of all the promises you passed it. This is one of the nice things about promises. They are an abstraction that lets you coordinate any type of action (usually asynchronous) without regard for what type of action it is. As such, promises have literally nothing to do with the actual action. All they do is monitor the completion or error of the action and report that back to those agents following the promise. Other code actually runs the action.
In your particular case, you are immediately calling http.get() in a tight loop and your code (nothing to do with promises) is launching a zillion http.get() operations at once. Those will get fired as fast as the underlying transport can do them (likely subject to connection limits).
If you want them to be launched serially or in batches of say 10 at a time, then you have to code it that way yourself. Promises have nothing to do with that.
You could use promises to help you code them to launch serially or in batches, but either way it would take extra code of your own to make that happen.
The Async library is specifically built for running things in parallel with a cap on how many operations are in flight at any given time, because this is a common scheme: you either have connection limits on your end, or you don't want to overwhelm the receiving server. You may be interested in its parallelLimit method, which runs a set of async operations in parallel while never exceeding the specified limit.
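A sketch of that approach, assuming a urls array like yours (the limit of 10 is arbitrary):
var async = require('async');
var http = require('http');

// Build one task per URL; each task is a function that takes a callback.
var tasks = urls.map(function (url) {
    return function (callback) {
        http.get(url, function (res) {
            res.resume(); // drain the response body
            res.on('end', function () { callback(null, res.statusCode); });
        }).on('error', callback);
    };
});

// Run the tasks with at most 10 requests in flight at any given time.
async.parallelLimit(tasks, 10, function (err, results) {
    if (err) return console.error(err);
    console.log('all done', results);
});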
I would do it like this
Personally, I'm not a big fan of Promises. I think the API is extremely verbose and the resulting code is very hard to read. The method defined below results in very flat code and it's much easier to immediately understand what's going on. At least imo.
Here's a little thing I created for an answer to this question
// void asyncForEach(Array arr, Function iterator, Function callback)
//   * iterator(item, done) - done can be called with an err to shortcut to callback
//   * callback(done) - done receives an error if an iterator sent one
function asyncForEach(arr, iterator, callback) {
    // create a cloned queue of arr
    var queue = arr.slice(0);
    // create a recursive iterator
    function next(err) {
        // if there's an error, bubble to callback
        if (err) return callback(err);
        // if the queue is empty, call the callback with no error
        if (queue.length === 0) return callback(null);
        // call the iterator with the next task
        // we pass `next` so the task can tell us when to move on to the next task
        iterator(queue.shift(), next);
    }
    // start the loop
    next();
}
You can use it like this
var urls = [
    "http://example.com/cat",
    "http://example.com/hat",
    "http://example.com/wat"
];

function eachUrl(url, done) {
    http.get(url, function(res) {
        // do something with res
        done();
    }).on("error", function(err) {
        done(err);
    });
}

function urlsDone(err) {
    if (err) throw err;
    console.log("done getting all urls");
}

asyncForEach(urls, eachUrl, urlsDone);
Benefits of this
no external dependencies or beta apis
reusable on any array you want to perform async tasks on
non-blocking, just as you've come to expect with node
could be easily adapted for parallel processing
by writing your own utility, you better understand how this kind of thing works
If you just want to grab a module to help you, look into async and the async.eachSeries method.
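For reference, the drop-in with the async module, reusing eachUrl and urlsDone from above, would look something like:
var async = require('async');

// same shape as asyncForEach above; urls are processed one at a time
async.eachSeries(urls, eachUrl, urlsDone);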
First, a clarification: A promise does represent the future result of a computation, nothing else. It does not represent the task or computation itself, which means it cannot be "called" or "fired".
Your script does create all those thousands of promises immediately, and each of those creations does call http.get immediately. I would suspect that the http library (or something it depends on) has a connection pool with a limit of how many requests to make in parallel, and defers the rest implicitly.
Promise.all does not do any "processing" - it is not responsible for starting the tasks or resolving the passed promises. It only listens to them, checks whether they are all ready, and returns a promise for that eventual result.
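A small sketch illustrating this: the work starts when each promise is created, not when Promise.all() is called.
function delay(ms) {
    return new Promise(function (resolve) { setTimeout(resolve, ms); });
}

var p1 = delay(100).then(function () { return 'a'; }); // timer already running here
var p2 = delay(200).then(function () { return 'b'; });

// Promise.all only observes the two existing promises; both timers run
// concurrently, so this logs ['a', 'b'] after ~200ms, not ~300ms.
Promise.all([p1, p2]).then(function (results) { console.log(results); });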
I am quite new (just started this week) to Node.js and there is a fundamental piece that I am having trouble understanding. I have a helper function which makes a MySQL database call to get a bit of information. I then use a callback function to get that data back to the caller which works fine but when I want to use that data outside of that callback I run into trouble. Here is the code:
/** Helper Function **/
function getCompanyId(token, callback) {
    var query = db.query('SELECT * FROM companies WHERE token = ?', token, function(err, result) {
        var count = Object.keys(result).length;
        if (count == 0) {
            return;
        } else {
            callback(null, result[0].api_id);
        }
    });
}
/*** Function which uses the data from the helper function ***/
api.post('/alert', function(request, response) {
    var data = JSON.parse(request.body.data);
    var token = data.token;
    getCompanyId(token, function(err, result) {
        // this works
        console.log(result);
    });
    // the problem is that I need result here so that I can use it elsewhere in this function.
});
As you can see I have access to the return value from getCompanyId() so long as I stay within the scope of the callback but I need to use that value outside of the callback. I was able to get around this in another function by just sticking all the logic inside of that callback but that will not work in this case. Any insight on how to better structure this would be most appreciated. I am really enjoying Node.js thus far but obviously I have a lot of learning to do.
Short answer - you can't do that without violating the asynchronous nature of Node.js.
Think about the consequences of trying to access result outside of your callback - if you need to use that value, and the callback hasn't run yet, what will you do? You can't sleep and wait for the value to be set - that is incompatible with Node's single threaded, event-driven design. Your entire program would have to stop executing whilst waiting for the callback to run.
Any code that depends on result should be inside the getCompanyId callback:
api.post('/alert', function(request, response) {
    var data = JSON.parse(request.body.data);
    var token = data.token;
    getCompanyId(token, function(err, result) {
        // Any logic that depends on result has to be nested in here
    });
});
One of the hardest parts about learning Node.js (and async programming in general) is learning to think asynchronously. It can be difficult at first, but it is worth persisting. You can try to fight it and code procedurally, but it will inevitably result in unmaintainable, convoluted code.
If you don't like the idea of multiple nested callbacks, you can look into promises, which let you chain methods together instead of nesting them. This article is a good introduction to Q, one implementation of promises.
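For illustration, a sketch of the same helper wrapped in a native ES6 promise (no library needed; renamed getCompanyIdPromise here to avoid clashing with the callback version, and a rejection is added for the no-result case, which the original code silently ignores):
function getCompanyIdPromise(token) {
    return new Promise(function (resolve, reject) {
        db.query('SELECT * FROM companies WHERE token = ?', token, function (err, result) {
            if (err) return reject(err);
            if (Object.keys(result).length === 0) return reject(new Error('company not found'));
            resolve(result[0].api_id);
        });
    });
}

api.post('/alert', function (request, response) {
    var token = JSON.parse(request.body.data).token;
    getCompanyIdPromise(token)
        .then(function (companyId) {
            // everything that depends on companyId still lives here,
            // but further steps chain with .then() instead of nesting
        })
        .catch(function (err) {
            // handle the error, e.g. send an error response
        });
});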
If you are concerned about having everything crammed inside the callback function, you can always name the function, move it out, and then pass the function as the callback:
getCompanyId(token, doSomethingAfter); // Pass the function in

function doSomethingAfter(err, result) {
    // Code here
}
My "aha" moment came when I began thinking of these as "fire and forget" methods. Don't look for return values coming back from the methods, because they don't come back. The calling code should move on, or just end. Yes, it feels weird.
As #joews says, you have to put everything depending on that value inside the callback(s).
This often requires passing down an extra parameter (or several). For example, if you are handling a typical HTTP request/response, plan on passing the response down every step of the callback chain. The final callback will (hopefully) set data on the response, or set an error code, and then send it back to the user.
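A sketch of that threading (lookupAlerts is a hypothetical next step, and the error handling assumes an Express-style response):
api.post('/alert', function (request, response) {
    var token = JSON.parse(request.body.data).token;
    getCompanyId(token, function (err, companyId) {
        if (err) return response.status(500).send(err.message);
        lookupAlerts(companyId, response); // pass the response along the chain
    });
});

function lookupAlerts(companyId, response) {
    // ... the final step sets data or an error code on the response ...
    response.send({ companyId: companyId });
}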
If you want to avoid callback smells, you can use Node's EventEmitter class, like so:
at the top of the file, require the events module -
var EventEmitter = require('events').EventEmitter;
var emitter = new EventEmitter();
then in your callback:
api.post('/alert', function(request, response) {
    var data = JSON.parse(request.body.data);
    var token = data.token;
    getCompanyId(token, function(err, result) {
        // this works
        console.log(result);
        emitter.emit('company:id:returned', result);
    });
    // the problem is that I need result here so that I can use it elsewhere in this function.
});
then register a listener with the on method; just make sure it is attached before the event is emitted, since EventEmitter dispatches synchronously:
emitter.on('company:id:returned', function(result) {
    // do what you need with result
});
Just be careful to set up good namespacing conventions for your events so you don't end up with a mess of on handlers, and also watch the number of listeners you attach. Here is a good link for reference:
http://www.sitepoint.com/nodejs-events-and-eventemitter/