Does nodejs make separate database operations in a single function as atomic? - javascript

I've been trying to write some database queries that run on Node.js, and this question came to mind. I sense I'm misunderstanding something, but I can't figure out what it is.
If node is single-threaded, then this means all function calls run one after another. If there are multiple clients accessing the same node server, there will still be a race in which request gets queued first, but as long as each request is processed one by one, doesn't this make all database operations performed in a single request atomic?
For example, let's say I want to 1) count the number of rows matching some value in a table and 2) insert a new row if the count is more than 10. (It is probably possible to do this in one SQL query, but let's assume I run them separately. It's a dumb example, but I hope you get the point.) I'd normally use a transaction to make sure the number of matching rows does not change unexpectedly. Now, I'm thinking of writing a function that runs the two queries without a transaction. Would that be guaranteed to work properly on Node?

I take it you want to run your first call, the count, and then execute the insert in the callback of that call.
Because of the nature of JavaScript, and thus of Node.js, the moment you start waiting for the response to the count, anything else can run. The query has already been sent; while your code waits for the result on the event loop, Node is free to serve other work, including an insert issued by a different user's request.
That second user's insert could therefore land before the insert you issue after your count, which means your insert is acting on a stale idea of the dataset's size.
That is only one of the cases you can run into, but it should be enough to resolve your doubt: single-threaded execution does not make two separate queries atomic.
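The interleaving can be sketched with an in-memory stand-in for the database (all names here, `countRows`, `insertIfBelowLimit`, etc., are invented for illustration; the `await Promise.resolve()` calls mimic the network round trips that let other work run in between):

```javascript
// In-memory stand-in for a table; each element is a row.
const table = [];

// Simulates "SELECT COUNT(*)": the await gives the event loop a chance
// to run other work before the insert, like a real network round trip.
async function countRows() {
  await Promise.resolve();             // yield to the event loop
  return table.length;
}

async function insertRow(row) {
  await Promise.resolve();             // yield again
  table.push(row);
}

// Insert only while the table holds fewer than `limit` rows.
async function insertIfBelowLimit(row, limit) {
  const n = await countRows();         // step 1: count
  if (n < limit) await insertRow(row); // step 2: insert, based on a possibly stale count
}

async function demo() {
  table.push('existing');              // table starts with 1 row; limit is 2
  // Two "users" run concurrently; both observe the count 1 before either inserts.
  await Promise.all([
    insertIfBelowLimit('from user A', 2),
    insertIfBelowLimit('from user B', 2),
  ]);
  return table.length;                 // the limit of 2 has been violated
}
```

Both clients read the same count before either insert lands, so the table ends up with 3 rows despite the limit of 2 — exactly the anomaly a transaction (or a single conditional SQL statement) would prevent.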

Related

In Postgresql is it possible to do a "soft" commit of a transaction so far while retaining a row lock?

When using a transaction and a select * ... FOR UPDATE to lock a row, is it possible to do a "soft" commit that would write the changes so far to the table so they become permanent, while retaining the lock on the row?
In this specific use case, I have a long running function that triggers a series of operations based on a particular record. During that long running function, the row should remain locked for modification by other parts of the application.
However, at different stages of the function there are side-effect triggers that need to be committed to the database (and made permanent).
If anything happens past one of those steps it would only roll back to that point.
If I just COMMIT then my current transaction finishes (and can't run further operations with that transaction) and any other queued operation kicks in.
COMMIT AND CHAIN doesn't prevent existing pending transactions from kicking in first.
Is there a way to do this at the database level?
No, that is not possible. If you need to prevent concurrent data modifications for a longer time, long transactions are not a good solution. You should solve this with application logic, for example by adding a boolean column that indicates that the row is being worked on.
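A sketch of that advisory-flag pattern, using an in-memory row as a stand-in for the table (the names `claimRow`/`releaseRow` and the `in_progress` column are invented; in Postgres the claim would be a short transaction running `UPDATE ... SET in_progress = true WHERE id = $1 AND NOT in_progress` and checking the affected row count):

```javascript
// In-memory stand-in for a table keyed by id.
const rows = new Map([[1, { id: 1, in_progress: false, value: 'initial' }]]);

// Claim the row: succeeds only if nobody else is working on it.
function claimRow(id) {
  const row = rows.get(id);
  if (!row || row.in_progress) return false;
  row.in_progress = true;   // in SQL: a short, immediately committed UPDATE
  return true;
}

// Release the claim when the long-running work is done (or has failed).
function releaseRow(id) {
  const row = rows.get(id);
  if (row) row.in_progress = false;
}

// Long-running work can now commit intermediate results in separate short
// transactions; the flag, not an open transaction, keeps other workers out.
function doStep(id, newValue) {
  rows.get(id).value = newValue;      // imagine: UPDATE ...; COMMIT
}
```

Because each claim, step, and release commits on its own, intermediate results become permanent as soon as they are written, while the flag provides the long-lived exclusion the question wanted from a row lock.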

How does Firebase atomic increment work in race condition?

Firebase atomic increment can be used in update or set. But they don't return the updated value on completion. So, I have to use once('value') immediately after update or set:
var submitref = firebase.database().ref('/sequence/mykey');
return submitref.set(firebase.database.ServerValue.increment(1)).then(_ => {
  return submitref.once('value').then(snap => snap.val());
});
Let's assume two clients are executing this code concurrently. submitref.set() will work fine, because the increment is atomic. But if both complete submitref.set() before either executes submitref.once('value'), both clients will receive the same value, incremented by 2.
Is this a possibility or am I not understanding it correctly?
The increment operation executes atomically on the server. There is no guarantee that all clients (or even any client) will see every intermediate state.
Your use-case to keep a sequential, monotonically incremental counter is better suited to a transaction since with a transaction the client controls the new value based on the current value.
JavaScript is a single-threaded language. Every bit of code executes in some order relative to any other bit of code. There is no threading contention except through any native libraries, which is not the case here. Also, Realtime Database pipelines all of its operations over a single connection in the order they were received, so there is a consistent order to its operations as well.
All that said, I imagine you could get into a situation where the two calls to set() happen before the two calls to once(), which means they would both show the twice-incremented value.
In this case, you might be better off overall using a listener with on() in order to simply know the most recent value at any time, and act on it whenever it's seen to change, regardless of what happens after establishing it.
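The difference between the two approaches can be sketched with an in-memory model of the server-side counter (the `server` object and both function names are invented; they only mimic the semantics described above, not the Firebase API):

```javascript
// In-memory model of the server-side counter.
const server = { value: 0 };

// Models the set(increment(1)) + once('value') pattern for two clients whose
// operations interleave: both atomic increments land on the server before
// either client performs its separate read.
function setThenReadRace() {
  server.value += 1;                   // client A's atomic increment
  server.value += 1;                   // client B's atomic increment
  return [server.value, server.value]; // both reads see the same final value
}

// Models a transaction: each client's read-modify-write runs as one unit,
// so each client learns the exact value its own increment produced.
function transactionIncrement() {
  const a = ++server.value;            // client A: read, add 1, write
  const b = ++server.value;            // client B: read, add 1, write
  return [a, b];
}
```

With set-then-read, both clients can observe 2; with the transactional read-modify-write, one client gets 1 and the other 2, so the sequence stays gap-free and unambiguous.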

What happens if you are simultaneously modifying an array that you are looping through in node js?

I'm new to node js and am just getting used to the asynchronous nature of the language...
I have a function that updates values in an array A every 60 seconds (grabs them from a database).
I also have a function that processes requests from users, and while processing these requests it uses values from A (and so I loop over A).
I'm wondering if this type of thing being asynchronous will crash my app?
For example if I'm in the middle of looping over A and grabbing values but at the same time, the 60 second timer hits and begins to update the values in A, can anything bad happen?
If so, do you have any design suggestions to avoid this?
Thank you very much for any help!
It really depends upon your exact code so no concrete answer can be provided without seeing and understanding your exact code.
Javascript in node.js is single threaded so if you are looping through an array with a synchronous mechanism such as .forEach() or a for loop, then any async calls you make will be initiated, but they cannot finish until after your current thread of execution (including the loop through the array) is completely done. So, no async result can impact the array while you are still looping.
There are less conventional and async ways to iterate through the array. If you were using one of those, we'd have to see your actual code to offer an opinion on whether you have any concurrency issues or not.
For example if I'm in the middle of looping over A and grabbing values
but at the same time, the 60 second timer hits and begins to update
the values in A, can anything bad happen?
node.js is event-driven. This means that when a timer is ready to fire, it inserts an event into the node.js event queue. That event will not be processed until the current running Javascript is done and the node.js JS engine can then pull the next event out of the event queue. So, a timer will not interrupt your current running Javascript, ever.
You may still encounter weird bugs if your iteration is asynchronous (for example, if you await something between iterations), because then the update can run mid-loop. If A gets much shorter while you are looping, the loop simply ends early: say A has 30 elements and is cut down to 15 while you are on index 16; the loop condition fails and the remaining iterations never run. Or suppose A has two elements and both are replaced after you have processed the first index but before the second: you will have read the old value for the first element and the new value for the second, an inconsistent mix.
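The shrinking-array pitfall is easy to reproduce even synchronously, when the loop body itself mutates the array (the function names here are made up for the demonstration):

```javascript
// Looping over an array while removing elements from it skips items: after
// splice(i, 1) the next element shifts down into index i, but the loop moves
// on to i + 1, so that shifted element is never examined.
function removeEvens(arr) {
  for (let i = 0; i < arr.length; i++) {
    if (arr[i] % 2 === 0) arr.splice(i, 1);
  }
  return arr;
}

// Safe alternative: build a new array instead of mutating mid-loop.
function removeEvensSafely(arr) {
  return arr.filter(x => x % 2 !== 0);
}
```

Calling `removeEvens([2, 4, 6, 8])` leaves `[4, 8]` behind: every element that slides into a just-spliced slot is skipped. For the 60-second-update scenario, a common fix is to have the updater build a new array and swap the reference (`A = newValues`) while consumers loop over the reference they grabbed at the start; an in-flight loop then finishes over a consistent snapshot.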

Long load times on website when running R script

I'm attempting to query a MySQL database on a webpage. Within my R script, I have 4 different "query" functions along with multiple calculations which will display statistical graphs to my webpage, all dependent on an "N" variable. I'm using PHP (using shell_exec) to call to R and send "N". I'm using the RMySQL & ggplot2 libraries in R.
Running my R script with just 1 basic query function (includes dbConnect(), dbGetQuery and on.exit(dbDisconnect()), then using png(), plot(), and dev.off() takes ~15 seconds to display the graph on my website.
With 2 functions and 2 plots, I haven't even had the patience to wait it out to see if it works, since the load time is so long. The queries themselves are rather lengthy (they could probably be made shorter with looping), but I've tested them to work through MySQL and I'm not sure how to avoid loop errors with SQL.
Could the long loading time be due to having dbConnect/dbDisconnect in each individual function? Should I only do this once in the script (i.e. create a new "connect" function and then call to the other functions from here)?
Is it the fact I'm running multiple and lengthy query requests? If that's the case, would it be better if I split each "query function" into individual R scripts, then "shell_exec" each and allow the user to select which graphs to display (i.e. check boxes in HTML/PHP that allow for the execution of each script/graph desired)?
Through testing, I know that my logic is there, but I might be totally missing something. I would like to speed up the process so the website user doesn't have to stare at a loading screen forever and I can actually get some tangible results.
Sorry for the lengthy request, I appreciate any help you can give! If you'd like to see the webpage or any of my code to get a better idea, I can upload that and share.
Thanks!
EDIT: It should also be noted that I'm using a while loop (x < 100) for some of the calculations; I know loops in R are typically expensive, but the whole vectorization thing (I think that's the name?) is over my head.
Your requests are probably very demanding and should not be executed synchronously. You could instead use a queue system: when a request is made, it is sent to a queue, and the results are produced asynchronously once the server is ready to run the job. In the meantime, you can redirect your user to another page and notify them when the results are available.
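The queue idea can be sketched in JavaScript (the in-process queue and the names `enqueue`/`getResult` are invented stand-ins; a real deployment would persist jobs in a database table or a message broker and have a worker run the R script):

```javascript
// Minimal in-process job queue: requests are enqueued and processed one at
// a time; callers get back an id they can poll later for the finished result.
const jobs = [];            // pending work
const results = new Map();  // jobId -> finished result
let nextId = 1;
let running = false;

function enqueue(work) {
  const id = nextId++;
  jobs.push({ id, work });
  drain();                  // kick the worker; it's a no-op if already running
  return id;                // caller polls getResult(id) later
}

function getResult(id) {
  return results.get(id);   // undefined until the job has finished
}

async function drain() {
  if (running) return;
  running = true;
  while (jobs.length > 0) {
    const { id, work } = jobs.shift();
    results.set(id, await work());   // run jobs strictly one at a time
  }
  running = false;
}
```

The web request handler only enqueues and returns immediately, so the page loads fast; the slow work (queries, plotting) happens in the background and the page polls for the result.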

Algorithm for setting a handler for when multiple ajax request returns in JS

I am writing a program where I need to put out n number of Ajax requests, I want to write a handler to be executed when all of them returns.
What is a good way to do this?
My initial thought is to create a counter that is incremented when you create a request and decremented when the request returns. When you reach 0, run your handler. However, this creates a race condition: if you don't create your requests fast enough, your earlier requests might return first, decrement the counter to 0, and execute the handler prematurely. Is there a better way to do this?
Your initial thought is actually perfectly fine.
Since Javascript is single-threaded, you won't get any callbacks while your code is running.
As long as you create all of the requests in the same event handler, you're fine.
If you don't create all the requests at once, you'll either need to know in advance how many requests you're going to create, or handle partial responses.
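The counter idea can be sketched as a small latch (the `makeLatch` name is invented; with promise-based requests, `Promise.all(requests).then(handler)` expresses the same thing directly):

```javascript
// Counter-based completion latch: call add() before issuing each request and
// done() in each response handler; onComplete fires when the count hits 0.
function makeLatch(onComplete) {
  let pending = 0;
  let fired = false;
  return {
    add()  { pending += 1; },
    done() {
      pending -= 1;
      if (pending === 0 && !fired) { fired = true; onComplete(); }
    },
  };
}
```

As the answer explains, callbacks cannot run while the synchronous block that issues the requests is still executing, so as long as every `add()` happens in that one block, the counter can only reach 0 after all requests have been created. The early-fire race only appears if requests are issued across multiple turns of the event loop.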
