Javascript event sequence, pg, postgresql, why? - javascript

I am trying to write a javascript file in express to talk to a postgresql database. More precisely, I want to write a function that takes SQL as an input parameter and returns the stringified json. I can assume memory is not an issue given these table sizes. This is paid work making an internal use tool for a private business.
My most recent attempt involved the query callback putting the value into a global variable, but even that still fails because the outermost function returns before the json string is defined. Here is the relevant code:
var dbjson;
function callDB(q) {
pg.connect(connectionString, function(err, client, done) {
if (err) {
console.error('error fetching client from pool', err);
} else {
client.query(q, [], function(err, result) {
client.query('COMMIT');
done();
if (err) {
console.error('error calling query ' + q, err);
} else {
dbjson = JSON.stringify(result.rows);
console.log('1 ' + dbjson);
}
console.log('2 ' + dbjson);
});
console.log('3 ' + dbjson);
}
console.log('4 ' + dbjson);
});
console.log('5 ' + dbjson);
}
The SQL in my test is "select id from users".
The relevant console output is:
5 undefined
GET /db/readTable?table=users 500 405.691 ms - 1671
3 undefined
4 undefined
1 [{"id":1},{"id":2},{"id":3},{"id":4}]
2 [{"id":1},{"id":2},{"id":3},{"id":4}]
Why do the console logs occur in the order that they do?
They are consistent in the order.
I attempted to write a polling loop to wait for the global variable to be set using setTimeout in the caller and then clearing the timeout within the callback but that failed, I think, because javascript is single threaded and my loop did not allow other activity to proceed. Perhaps I was doing that wrong.
While I know I could have each function handle its own database connection and error logging, I really hate repeating the same code.
What is a better way to do this?
I am relatively new to express and javascript but considerably more experienced with other languages.

Presence of the following line will break everything for you:
client.query('COMMIT');
You are trying to execute an asynchronous command in a synchronous manner, and you are calling done(), releasing the connection, before that query gets a chance to execute. The result of such an invalid disconnection is unpredictable, especially since you are not handling any error in that case.
And why are you calling COMMIT there in the first place? That in itself looks invalid. COMMIT closes the current transaction, and you never open one there, so there is nothing to commit.
There is a bit of a misunderstanding here about both asynchronous code and the database. If you want a good start at both, I would suggest having a look at pg-promise.
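As for the order of the logs: callDB only schedules the connection and returns, so 5 prints first; the connect callback then schedules the query and finishes (3, then 4); the query callback runs last (1, then 2). A minimal sketch of a "SQL in, JSON string out" helper that avoids the global variable follows. It assumes a newer pg version whose Pool has a query() method that returns a promise when no callback is passed; queryToJson and the pool parameter are hypothetical names, not part of the original code.

```javascript
// Hypothetical helper: runs one query and resolves with the rows as a JSON string.
// `pool` is assumed to be a pg Pool (or anything with a promise-returning query()).
async function queryToJson(pool, sql, params = []) {
  const result = await pool.query(sql, params);
  return JSON.stringify(result.rows);
}

// Usage sketch inside an Express route (app and pool are assumed to exist):
// app.get('/db/readTable', async (req, res, next) => {
//   try {
//     res.type('json').send(await queryToJson(pool, 'select id from users'));
//   } catch (err) {
//     next(err); // let Express report the failure
//   }
// });
```

The caller has to await the result (or use .then); there is no way to make the function return the string synchronously.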

Related

Using recursion in try catch block in JavaScript

I have a nodejs script which creates dynamic tables and views for the temperature recorded for the day. Sometimes it does not create tables if the temperature is not in the normal range. For this I decide to use try catch and call the function recursively. I am not sure if I have done it correctly or if there is another way to call the con.query method, so that tables get created. I encountered this problem for first time in nodejs.
To start with, you have to detect errors and only recurse when there are specific error conditions. If the problem you're trying to solve is one specific error, then you should probably detect that specific error and only repeat the operation when you get that precise error.
Then, some other recommendations for retrying:
Retry only a fixed number of times. It's a sysop's nightmare when some server code gets stuck in a loop banging away over and over on something and just getting the same error every time.
Retry only on certain conditions.
Log every error so you or someone running your server can troubleshoot when something is wrong.
Retry only after some delay.
If you're going to retry more than few times, then implement a back-off delay so it gets longer and longer between retries.
Here's the general idea for some code to implement retries:
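const maxRetries = 5;
const retryDelay = 500;
// Hypothetical predicate: replace the body with the specific error(s) you want to retry on.
function shouldRetry(err) {
return err.code === 'ER_LOCK_DEADLOCK'; // example condition only, an assumption
}
function execute_query(query, callback) {
let retryCntr = 0;
function run() {
con.query(query, function(err, result, fields) {
if (err && shouldRetry(err)) {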
++retryCntr;
if (retryCntr <= maxRetries) {
console.log('Retrying after error: ', err);
setTimeout(run, retryDelay)
} else {
// too many retries, communicate back error
console.log(err);
callback(err);
}
} else if (err) {
console.log(err);
// communicate back error
callback(err);
} else {
// communicate back result
callback(null, result, fields);
}
});
}
run();
}
The idea behind retries and back-offs, if you're going to do lots of retries, is that retry algorithms can lead to what are called avalanche failures. The system gets a little slow or a little too busy and it starts to create a few errors. Your code then starts to retry over and over, which creates more load, which leads to more errors, so more code starts to retry, and the whole thing then fails with lots of code looping and retrying.
So, instead, when there's an error you have to make sure you don't inadvertently overwhelm the system and potentially just make things worse. That's why you implement a short delay, that's why you implement max retries and that's why you may even implement a back-off algorithm to make the delay between retries longer each time. All of this allows a system that has some sort of error causing perturbation to eventually recover on its own rather than just making the problem worse to the point where everything fails.
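The back-off idea can be sketched as follows. This is only an illustration, not the code from the answer above: retryWithBackoff, baseDelayMs and the shouldRetry option are hypothetical names, and the delay simply doubles after each failed attempt.

```javascript
// Hypothetical helper: retries a callback-style operation with exponential back-off.
// operation(cb) must call cb(err, result); shouldRetry(err) decides whether to retry.
function retryWithBackoff(operation, options, callback) {
  const { maxRetries = 5, baseDelayMs = 500, shouldRetry = () => true } = options;
  let attempt = 0;

  function run() {
    operation(function (err, result) {
      if (!err) {
        return callback(null, result);
      }
      if (attempt >= maxRetries || !shouldRetry(err)) {
        return callback(err); // give up and report the last error
      }
      // Delay doubles each time: base, 2*base, 4*base, ...
      const delay = baseDelayMs * Math.pow(2, attempt);
      attempt += 1;
      setTimeout(run, delay);
    });
  }

  run();
}
```

With the query from the question, the operation could be something like (cb) => con.query(query, cb), assuming con is the existing connection.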

Waiting for a response from app.post in another app.post

I'm developing a back-end for a mobile application using express.js for my API.
For this mobile application, the users sign-in using mobile numbers, an OTP code is sent to their mobiles, and they need to send back the OTP they received to the server for verification and validation.
When the users first attempt to sign-in, they POST their mobile number to the server, and then a bunch of processing happens, and an OTP is sent to them through an SMS gateway.
Now while this request is still ongoing, I need to wait for the users to send the OTP through a POST request to another route, verify it, and then proceed on with the appropriate steps in the first, ongoing POST request.
After some search on the net, I eventually decided to wrap the app.post method for the verifyOTP route in a function that creates and returns a new promise, and then resolve it or reject it after verification. This worked wonderfully for the first time I perform this operation after restarting the server, but that's it. It only works the first time, and then for the consecutive times that follow, none of the new promises that should be created are resolved or rejected, and the first request to the sign-in route remains waiting.
I tried a bunch of things like making the function wrapping the verifyOTP route async, and creating promises inside the route instead of wrapping it in one, but still no use. Can you help me?
For the sake of finding a solution for this problem, I've simplified the process and did a simulation of the actual situation using this code, and it simulates the problem well:
This is to simulate the first request:
app.get("/test", async function(req, res) {
console.log("Test route\n");
var otpCode = Math.floor(Math.random() * (9999 - 2)) + 1;
var timestamp = Date.now();
otp = {
code: otpCode,
generated: timestamp
};
console.log("OTP code sent: " + otpCode + "\n");
console.log("OTP sent.\n");
res.end();
/* verifyOTP().then(function() {
console.log("Resolved OTP verification\n\n");
res.end();
}).catch(function() {
console.log("Bad\n\n");
res.end();
});*/
});
This is the verifyOTP route:
var otp;
app.post("/verifyOTP", function(req, res) {
console.log("POST request - verify OTP request\n");
var msg;
if ((Date.now() - otp.generated) / 1000 > 30) {
msg = "OTP code is no longer valid.";
res.status(403).json({
error: msg
});
} else {
var submitted = req.body.otp;
if (submitted !== otp.code) {
msg = "OTP code is incorrect.";
res.status(403).json({
error: msg
});
} else {
msg = "Verified.";
res.end();
}
}
console.log(res.statusCode + " - " + res.statusMessage + "\n");
console.log(msg + "\n");
});
Just to mention, this isn't the only place in my server where I need OTP verification, although what happens after the verification varies. Therefore, I'd appreciate it if the solution could keep the code reusable for multiple instances.
Well, after some more research on my own, I discarded the use of Promises for this use case altogether, and instead used RxJS Observables.
It solved my problem pretty much the way I wanted, although I had to make some slight modifications.
For those who stumble upon my question looking for a solution for the same problem I faced:
Promises can only be resolved or rejected once, and as far as I can tell, unless the Promises function finishes running, you can't create a new one with the same code (please correct me if I'm wrong on this one, I'd really appreciate it, this was only based on my own personal observations and guesswork), and unless you create a brand new Promise, you can't resolve it again.
In this case, we are making a Promise out of a listener (or whatever it's called in js), so unless you delete the listener, the function wrapped inside the promise won't finish running (I think), and you won't get to create a new Promise.
Observables, on the other hand, can be reused as many times as you want, see this for a comparison between Promises and Observables, and this for a nice tutorial that will help you understand Observables and how to use them. See this for how to install RxJS for node.
However, be warned - for some reason, once you subscribe to an observable, the variables used in the function passed to observable.subscribe() remain the same; they don't get updated with every new request you make to the observer route. So unless you find a way to pass the variables that change into the observer.next() function inside the observable definition, you will get the wrong results.
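For comparison, here is a sketch of how this could probably be done with plain Promises. A new Promise can be created as often as needed; what likely went wrong in the original attempt is that app.post was called again on every sign-in, so Express kept running the first registered handler, whose promise had already settled. The sketch keeps one entry per pending verification instead. The names pending, waitForOtp and submitOtp are hypothetical, and the key is assumed to be something like the mobile number.

```javascript
// Hypothetical registry of pending verifications, keyed by e.g. the mobile number.
const pending = new Map();

// Called by the first request: returns a fresh promise for this key.
function waitForOtp(key, expectedCode, timeoutMs) {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => {
      pending.delete(key);
      reject(new Error('OTP code is no longer valid.'));
    }, timeoutMs);
    pending.set(key, { expectedCode, resolve, timer });
  });
}

// Called by the single, permanently registered /verifyOTP handler.
// Returns true when the code matched and the waiting promise was resolved.
function submitOtp(key, submittedCode) {
  const entry = pending.get(key);
  if (!entry || String(submittedCode) !== String(entry.expectedCode)) {
    return false;
  }
  clearTimeout(entry.timer);
  pending.delete(key);
  entry.resolve();
  return true;
}
```

The sign-in route would await waitForOtp(...) and the verifyOTP route, registered once at startup, would call submitOtp(...) and answer with 403 when it returns false.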

JavaScript Why is some code getting executed before the rest?

I've mostly learned coding with OOPs like Java.
I have a personal project where I want to import a bunch of plaintext into a mongodb. I thought I'd try to expand my horizons and do this with using node.js powered JavaScript.
I got the code working fine but I'm trying to figure out why it is executing the way it is.
The output from the console is:
1. done reading file
2. closing db
3. record inserted (n times)
var fs = require('fs'),
readline = require('readline'),
instream = fs.createReadStream(config.file),
outstream = new (require('stream'))(),
rl = readline.createInterface(instream, outstream);
rl.on('line', function (line) {
var split = line.split(" ");
_user = "#" + split[0];
_text = "'" + split[1] + "'";
_addedBy = config._addedBy;
_dateAdded = new Date().toISOString();
quoteObj = { user : _user , text : _text , addedby : _addedBy, dateadded : _dateAdded};
db.collection("quotes").insertOne(quoteObj, function(err, res) {
if (err) throw err;
console.log("record inserted.");
});
});
rl.on('close', function (line) {
console.log('done reading file.');
console.log('closing db.')
db.close();
});
(full code is here: https://github.com/HansHovanitz/Import-Stuff/blob/master/importStuff.js)
When I run it I get the message 'done reading file' and 'closing db' and then all of the 'record inserted' messages. Why is that happening? Is it because of the delay in inserting a record in the db? The fact that I see 'closing db' first makes me think that the db would be getting closed and then how are the records being inserted still?
Just curious to know why the program is executing in this order for my own peace of mind. Thanks for any insight!
In short, it's because of asynchronous nature of I/O operations in the used functions - which is quite common for Node.js.
Here's what happens. First, the script reads all the lines of the file, and for each line initiates db.insertOne() operation, supplying a callback for each of them. Note that the callback will be called when the corresponding operation is finished, not in the middle of this process.
Eventually the script reaches the end of the input file, logs two messages, then invokes db.close() line. Note that even though 'insert' callbacks (that log 'inserted' message) are not called yet, the database interface has already received all the 'insert' commands.
Now the tricky part: whether or not the DB interface manages to store all the records (in other words, whether it waits until all the insert operations are completed before closing the connection) depends on both the DB interface and its speed. If the write op is fast enough (faster than reading a line from the file), you'll probably end up with all the records inserted; if not, you can miss some of them. That's why the safest bet is to close the database connection not in the file's close handler (when the reading is complete), but in the insert callbacks (when the writing is complete):
let linesCount = 0;
let eofReached = false;
rl.on('line', function (line) {
++linesCount;
// parsing skipped for brevity
db.collection("quotes").insertOne(quoteObj, function(err, res) {
--linesCount;
if (linesCount === 0 && eofReached) {
db.close();
console.log('database close');
}
// the rest skipped
});
});
rl.on('close', function() {
console.log('reading complete');
eofReached = true;
});
This question describes the similar problem - and several different approaches to solve it.
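The same "close only after the last write" idea can also be sketched with promises instead of a counter. This assumes a driver version whose insertOne returns a promise when no callback is passed; insertAllThenClose and its parameters are hypothetical names.

```javascript
// Hypothetical helper: inserts every document, then closes the connection
// only after all inserts have settled.
// insertOne(doc) and close() are assumed to return promises.
async function insertAllThenClose(docs, insertOne, close) {
  // allSettled waits for every insert, even if some of them fail.
  const outcomes = await Promise.allSettled(docs.map((doc) => insertOne(doc)));
  await close();
  // Report how many inserts succeeded.
  return outcomes.filter((o) => o.status === 'fulfilled').length;
}
```

In the import script, the 'line' handler would only collect the parsed objects into an array, and the 'close' handler would pass that array to this function.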
Welcome to the world of asynchronicity. Inserting into the DB happens asynchronously. This means that the rest of your (synchronous) code will execute completely before this task is complete. Consider the simplest asynchronous JS function setTimeout. It takes two arguments, a function and a time (in ms) after which to execute the function. In the example below "hello!" will log before "set timeout executed" is logged, even though the time is set to 0. Crazy right? That's because setTimeout is asynchronous.
This is one of the fundamental concepts of JS and it's going to come up all the time, so watch out!
setTimeout(() => {
console.log("set timeout executed")
}, 0)
console.log("hello!")
When you call db.collection("quotes").insertOne you're actually creating an asynchronous request to the database. A good hint that a call may be asynchronous is that one (or more) of its parameters is a callback.
So the order you're running it is actually expected:
1. You instantiate rl
2. You bind your event handlers to rl
3. Your stream starts processing & calling your 'line' handler
4. Your 'line' handler opens asynchronous requests
5. Your stream ends and rl closes
...
4.5. Your asynchronous requests return and execute their callbacks
I labelled the callback execution as 4.5 because technically your requests can return at anytime after step 4.
I hope this is a useful explanation, most modern javascript relies heavily on asynchronous events and it can be a little tricky to figure out how to work with them!
You're on the right track. The key is that the database calls are asynchronous. As the file is being read, it starts a bunch of async calls to the database. Since they are asynchronous, the program doesn't wait for them to complete at the time they are called. The file then closes. As the async calls complete, your callbacks run and the console.logs execute.
Your code reads lines and immediately after that makes a call to the db - both asynchronous processes. When the last line is read, the last request to the db is made, and it takes some time for this request to be processed and the callback of the insertOne to be executed. Meanwhile rl has done its job and triggers the close event.

Insert an array of documents into a model

Here's the relevant code:
var Results = mongoose.model('Results', resultsSchema);
var results_array = [];
_.each(matches, function(match) {
var results = new Results({
id: match.match_id,
... // more attributes
});
results_array.push(results);
});
callback(results_array);
});
}
], function(results_array) {
results_array.insert(function(err) {
// error handling
Naturally, I get a No method found for the results_array. However I'm not sure what else to call the method on.
In other functions I'm passing through the equivalent of the results variable here, which is a mongoose object and has the insert method available.
How can I insert an array of documents here?
** Edit **
function(results_array) {
async.eachLimit(results_array, 20, function(result, callback) {
result.save(function(err) {
callback(err);
});
}, function(err) {
if (err) {
if (err.code == 11000) {
return res.status(409);
}
return next(err);
}
res.status(200).end();
});
});
So what's happening:
When I clear the collection, this works fine.
However when I resend this request I never get a response.
This is happening because I have my schema to not allow duplicates that are coming in from the JSON response. So when I resend the request, it gets the same data as the first request, and thus responds with an error. This is what I believe status code 409 deals with.
Is there a typo somewhere in my implementation?
Edit 2
Error code coming out:
{ [MongoError: insertDocument :: caused by :: 11000 E11000 duplicate key error index:
test.results.$_id_ dup key: { : 1931559 }]
name: 'MongoError',
code: 11000,
err: 'insertDocument :: caused by :: 11000 E11000 duplicate key error index:
test.results.$_id_ dup key: { : 1931559 }' }
So this is as expected.
Mongo is responding with a 11000 error, complaining that this is a duplicate key.
Edit 3
if (err.code == 11000) {
return res.status(409).end();
}
This seems to have fixed the problem. Is this a band-aid fix though?
You seem to be trying to insert various documents at once here. So you actually have a few options.
Firstly, there is no .insert() method in mongoose as this is replaced with other wrappers such as .save() and .create(). The most basic process here is to just call "save" on each document you have just created. Also employing the async library here to implement some flow control so everything just doesn't queue up:
async.eachLimit(results_array,20,function(result,callback) {
result.save(function(err) {
callback(err)
});
},function(err) {
// process when complete or on error
});
Another thing here is that .create() can just take a list of objects as its arguments and simply inserts each one as the document is created:
Results.create(results_array,function(err) {
});
That would actually be with "raw" objects though, as they are essentially all cast as a mongoose document first. You can ask for the documents back as additional arguments in the callback signature, but constructing that is likely overkill.
Either way those shake, the "async" form will process those in parallel and the "create" form will be in sequence, but they are both effectively issuing one "insert" to the database for each document that is created.
For true Bulk functionality you presently need to address the underlying driver methods, and the best place is with the Bulk Operations API:
mongoose.connection.on("open",function(err,conn) {
var bulk = Results.collection.initializeUnorderedBulkOp();
var count = 0;
async.eachSeries(results_array,function(result,callback) {
bulk.insert(result);
count++;
if ( count % 1000 == 0 ) {
bulk.execute(function(err,response) {
// maybe check response
bulk = Results.collection.initializeUnorderedBulkOp();
callback(err);
});
} else {
callback();
}
},function(err) {
// called when done
// Check if there are still writes queued
if ( count % 1000 != 0 )
bulk.execute(function(err,response) {
// maybe check response
});
});
});
Again the array here is raw objects rather than those cast as a mongoose document. There is no validation or other mongoose schema logic implemented here as this is just a basic driver method and does not know about such things.
While the array is processed in series, the above shows that a write operation will only actually be sent to the server once every 1000 entries processed or when the end is reached. So this truly does send everything to the server at once.
Unordered operations means that the err would normally not be set but rather the "response" document would contain any errors that might have occurred. If you want this to fail on the first error then it would be .initializeOrderedBulkOp() instead.
The care to take here is that you must be sure a connection is open before accessing these methods in this way. Mongoose looks after the connection with its own methods, so where a method such as .save() is reached in your code before the actual connection is made to the database, it is "queued" in a sense, awaiting this event.
So either make sure that some other "mongoose" operation has completed first, or otherwise ensure that your application logic only runs once the connection is sure to be made. This is simulated in the example by placing the code within the "connection open" event.
It depends on what you really want to do. Each case has its uses, with of course the last being the fastest possible way to do this, as there are limited "write" and "return result" conversations going back and forth with the server.
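The "one write per 1000 entries" batching described above can also be sketched independently of any particular driver call. insertInBatches and insertBatch are hypothetical names; insertBatch is assumed to return a promise. In newer Mongoose versions, a call such as Results.insertMany(chunk) might serve as insertBatch, but treat that as an assumption to check against your version.

```javascript
// Hypothetical helper: splits docs into chunks and sends one write per chunk.
// insertBatch(chunk) is assumed to return a promise (for example a wrapper
// around a bulk insert call of your driver or ODM).
async function insertInBatches(docs, batchSize, insertBatch) {
  let batches = 0;
  for (let i = 0; i < docs.length; i += batchSize) {
    await insertBatch(docs.slice(i, i + batchSize)); // one round trip per chunk
    batches += 1;
  }
  return batches; // number of writes that were sent
}
```

Because the chunks are awaited one after another, an error in one batch stops the remaining batches from being sent.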

Nodejs asynchronous database function needs synchronous answer

I am new to nodejs and am writing some code that needs to query my MySQL database and return a username from a given user_id. I've been reading that all your functions should be asynchronous. In this case, ideally I would like the server to be able to respond to other event requests while this query is taking place. However, it isn't a particularly large query and only returns a single value. Maybe I should make it synchronous? (If that is your answer, sample code to change it would be great) Anyways, here is my function. It gives an error near the last line "return current_username;" because current_username is undefined at that point. Any suggestions?
function get_current_username(current_user_id) {
console.log(' | Entered get_current_username');
sqlq = 'SELECT username FROM users WHERE id = ' + current_user_id;
connection.query(sqlq, function(err, rows, fields) {
if (err) throw err;
var current_username = rows[0].username;
console.log(' | the current_username =' + current_username);
});
return current_username;
}
Pass in a callback function to get_current_username and then call that callback function from inside of connection.query's callback:
function get_current_username(current_user_id, callback) {
console.log(' | Entered get_current_username');
sqlq = 'SELECT username FROM users WHERE id = ' + current_user_id;
connection.query(sqlq, function(err, rows, fields) {
if (err) throw err;
var current_username = rows[0].username;
console.log(' | the current_username =' + current_username);
callback(current_username);
});
}
When you go to use this function then, you'd do something like:
get_current_username(12345, function(username) {
console.log("I am " + username);
});
You could also check out the use of promises/futures. I have a feeling I won't be able to do an explanation of these justice, so I'll link off to this StackOverflow question about understanding Promises.
This is an architectural decision though - some people would prefer to use callbacks, especially if writing a module intended for re-use by 3rd parties. (And in fact, it's probably best to get your head fully wrapped around callbacks in the learning stages here, before adopting something like Promise.)
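For reference, a promise-returning variant might look like the sketch below. It assumes the connection object accepts query(sql, values, callback) with ? placeholders, as the mysql module does; getCurrentUsername is a hypothetical name, and the connection is passed in as a parameter only to keep the sketch self-contained.

```javascript
// Hypothetical promise-returning variant. `connection` is assumed to expose the
// callback-style query(sql, values, cb) used in the question.
function getCurrentUsername(connection, userId) {
  return new Promise((resolve, reject) => {
    // The ? placeholder lets the driver escape userId instead of concatenating it.
    connection.query('SELECT username FROM users WHERE id = ?', [userId], (err, rows) => {
      if (err) return reject(err);
      if (rows.length === 0) return reject(new Error('No user with id ' + userId));
      resolve(rows[0].username);
    });
  });
}
```

A caller would then write getCurrentUsername(connection, 12345).then(function(username) { ... }) and handle errors with .catch instead of letting them be thrown inside the query callback.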
