Long running requests in nodejs - javascript

I was wondering if there is any good practice for handling long-running requests in Node.js?
I am using Express.js and have been wondering if there is any way to, I don't know, move one of the Express requests to a different instance of Node or something?
The reason I am asking is that the request is CPU-intensive and blocks Node's event loop. The main problem is that I first fetch some data, then calculate and insert missing data into the DB, then fetch other data that depends on the result, and then calculate again and update other things in the DB.
I hadn't noticed this problem before because I was the only one using this request so far. But now I am testing it with other users, and if a few of them hit this request, the whole application freezes for a few seconds. In my monitoring tools I see huge CPU usage (100%-200% lol)
So I have tried clustering, but it does not seem to work. I used pm2 and ran the app on all cores of my CPU. Because of the complexity of my algorithm I tried to move several functions to worker threads, but it looks like these threads would be used very, very often and I am afraid of crashing the whole Node instance.
I have no clue which solution would be best, and I don't know if there are any good tools for Express that would help me offload these requests. I have not dived into partitioning yet, but it looks like it might work. Has anyone used it on larger projects, and could it be a way to spread one request over several ticks?
What about a job queue like Bull or Kue? Would it be a good idea to push these tasks into a queue? I have never used such tools, and I am asking because I have no idea whether this would make sense.
Cheers

In case you already have the data present, you can return it to the user directly in the HTTP response.
If no data is found yet, you can send an empty response instead.
Then do the calculating and inserting of the missing data through a queue service; bee-queue is a nice one.
Use Redis with bee-queue to make background jobs faster.
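A rough sketch of what this could look like with bee-queue and an Express route follows; app is the Express application, and fetchData, runHeavyCalculation and saveToDb are placeholders for the poster's own logic, not real functions:

const Queue = require('bee-queue');
const queue = new Queue('calculation', { redis: { host: '127.0.0.1', port: 6379 } });

// Producer (inside the Express route): enqueue the heavy work and respond immediately
app.post('/calculate', async (req, res) => {
  const job = await queue.createJob({ userId: req.body.userId }).save();
  res.status(202).json({ jobId: job.id }); // the client can poll for the result later
});

// Worker (ideally a separate process): runs the CPU-heavy part off the request path
queue.process(async (job) => {
  const data = await fetchData(job.data.userId);
  const result = runHeavyCalculation(data);
  await saveToDb(result);
  return result;
});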

Related

Duplicate websocket subscription in Azure webapp

I have a node.js app running in Azure as a webApp. On startup it connects to an external service using a websocket subscription. Specifically I'm using the reconnecting-websockets NPM package to wrap it to handle disconnects.
The problem I am having is that because there are 2 instances of the app running on Azure (horizontal scaling for failover) I end up with two subscriptions at any one time.
Is there an obvious way to solve this problem?
For extra context, this is a problem for 2 reasons:
I pay for each message received and am over quota
When messages are received I process them and do database updates; these are also being duplicated.
You basically want to have an App Service with potentially multiple instances, but you don't want your application to run in parallel. At least you don't want to have two subscriptions. Ideally you don't want to touch your application code.
An easy way to implement this would be to wrap your application into a continuous WebJob, and set its scale property to singleton.
Here is one tutorial on how to set up a nodejs webjob: https://morshemesh.medium.com/continuous-deployment-of-web-jobs-with-node-js-2308f95e63b1
You can then use a settings.job file to control that your webjob only runs on a single instance at any one time. Or you can use the Azure Portal to set the value when you manually deploy the Webjob.
{
"is_singleton": true
}
https://github.com/projectkudu/kudu/wiki/WebJobs-API#set-a-continuous-job-as-singleton
PS: Don't forget to enable Always On. It is also mentioned in the docs. But you probably already need that for your current deployment.
If you don't want your subscription to be duplicated then it stands to reason that you only want one process subscribing to the external websocket connection.
Since you mentioned that received messages trigger database updates, it makes sense for this to be an isolated backend process, given that you have multiple instances running for the frontend server (whether or not you also run a separate backend).
Of course if you want more redundancy, you could use a load balancer with simple distribution of messages to any number of instances behind. Perhaps some persistent queueing system if you feel that it's needed.
If you want these messages to be propagated to the client (not clear from the question), it gets a bit more involved. If it's a simple one-way channel, you could consider using SSE, which is a rather simple protocol. If it's bidirectional, I would probably run a STOMP server with an intermediary broker (like RabbitMq) and connect directly from the client (i.e. the browser, not the server generating the frontend) to that service.
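As an illustration, a minimal SSE endpoint might look roughly like this; Express and the messageBus emitter are assumptions for the sketch, not part of the original setup:

app.get('/events', (req, res) => {
  res.set({
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    Connection: 'keep-alive',
  });
  res.flushHeaders();

  // messageBus is a hypothetical EventEmitter fed by the websocket subscription
  const onMessage = (msg) => res.write('data: ' + JSON.stringify(msg) + '\n\n');
  messageBus.on('message', onMessage);

  // Stop forwarding when the browser disconnects
  req.on('close', () => messageBus.off('message', onMessage));
});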
Not sure if you're well versed with Java, but I made some app that you could use for reference in case interested when we had to prepare some internal demos: https://github.com/kimgysen/zwoop-backend/tree/develop/stomp-api/src/main/java/be/zwoop
For all intents and purposes, I'm not sure all of this is worth the hassle for you; it sounds like you're on a tight budget and looking for simple solutions without too much complexity. Have you considered giving up on load balancing the website (is the load really that high)? I don't have enough background knowledge on your project to judge. But proper caching optimization and initially scaling vertically may be sufficient at the start (?).
Personally I would start simple and gradually increase complexity when needed.
I'm just throwing ideas at you; hopefully having a few considerations laid out is helpful in some way.
Btw, I don't understand why other answers on this question were all deleted (?).

Different approaches to optimize the response time of an api that does calculation in server side with NodeJs

I am making an API call from the client (Angular v10) to the server side (Node.js v12). This API call performs calculations on a very large set of data and takes around 7-10 minutes to complete.
The objective is to somehow reduce the response time and return the response to clients faster. The calculation algorithm contains lots of for loops which consume most of the time.
Some of the alternatives I have are -
Increase the processing power of the server where the code is going to be deployed.
Integrate AWS Lambda functions with the server and let them do the calculation part for me. I don't know much about this; any feedback on it is really appreciated.
Are there any other approaches/technologies I should look into in advance to avoid performance concerns later?
It's not easy to give advice here, because I would need more specific information about the implementation. So below I'm going to list some tips related to the information you gave, plus some topics you could study more deeply to achieve your goal.
for loops are computationally expensive, so you should reduce them where possible. One good approach could be to push part of the work down into database queries (what database are you using?)
Do you have a monolithic API? Maybe you could split it into microservices; take a look at that.
AWS Lambda is a good idea, but in my view it does not solve the underlying problem by itself.
If the problem is retrieving a massive amount of data, you could consider using an in-memory database such as Redis as a cache; follow this link.
You could also apply the old-fashioned technique of memoization to speed up your calculations; take a look at that, and at the small sketch below.
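A minimal memoization sketch in JavaScript, assuming the expensive function is pure (the same inputs always give the same output); expensiveCalculation is a placeholder for the poster's own function:

// Cache the results of an expensive pure function, keyed by its arguments
function memoize(fn) {
  const cache = new Map();
  return (...args) => {
    const key = JSON.stringify(args);
    if (!cache.has(key)) {
      cache.set(key, fn(...args));
    }
    return cache.get(key);
  };
}

const memoizedCalculation = memoize(expensiveCalculation);
// Repeated calls with the same arguments now return instantly from the cache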

How do I throttle $http requests in angularjs?

I'm building a UI for a data importer using angularjs. The angular app is going to be crunching through the input data source (a spreadsheet, or whatever) and sending GETs/POSTs to an API to create/update records on the server and retrieve changes, etc.
If a user is importing thousands of records, I probably don't want to be opening up thousands of ajax calls at once (not that Angular would be able to get all of the requests sent out before the first finished). My thought was to add some sort of connection pool so that it could be throttled to just 10 or 50 or so ajax calls at once.
Does angular already have a built-in means of throttling ajax calls? I know I could build one without too much trouble, but I don't want to re-invent the wheel if there's already something slick out there. Can anyone recommend any tools/plugins for that? I know there are a few for jquery, but I'm hoping to avoid jquery, as much as possible, for this project.
In investigating this further, I found that, according to this question and this page, browsers automatically throttle http requests to the same server to somewhere between 6 and 13 concurrent requests, and they queue up additional requests until other requests complete. So, apparently, no action is needed, and I can just let my code fly.
I looked at angular-http-throttler (suggested by nathancahill), but (1) it appears to duplicate what browsers are already doing by themselves, and (2) it doesn't (currently) appear to have a mechanism for handling error cases, so if the server doesn't respond, or if it was a bad request, it doesn't decrement the request count, thus permanently clogging up its queue.
Letting the browser queue up too many requests at once can still cause memory/performance issues if dealing with very large amounts of data. I looked at various techniques for creating a fixed-length queue in javascript, where it can fire off a callback when a queue spot becomes available, but the non-blocking nature of javascript makes such a queue tricky & fragile... Hopefully, I won't need to go that direction (I'm not dealing with THAT much data), and the browser's throttling will be sufficient for me.
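For reference, a sketch of the kind of fixed-size queue mentioned above, written as a thin wrapper around $http; this is one assumed way of doing it, not a tested library:

function makeThrottledHttp($http, $q, maxConcurrent) {
  var active = 0;
  var waiting = [];

  function next() {
    if (active >= maxConcurrent || waiting.length === 0) return;
    active++;
    var task = waiting.shift();
    $http(task.config)
      .finally(function () { active--; next(); })  // free the slot even on errors
      .then(task.deferred.resolve, task.deferred.reject);
  }

  // Drop-in replacement for $http(config) that limits concurrency
  return function throttledHttp(config) {
    var deferred = $q.defer();
    waiting.push({ config: config, deferred: deferred });
    next();
    return deferred.promise;
  };
}

Usage would be something like var throttled = makeThrottledHttp($http, $q, 10); and then throttled({ method: 'GET', url: '/api/records' }) wherever $http(...) would normally be called.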

Ajax polling - efficient?

I'm developing a website that uses a notification system (like Facebook's).
For this purpose I think I'll write a jQuery polling function that looks for new notifications on the server side, using ajax.
My question is, is this a good idea?
It will be fine on the client side; it's more likely to be a server-side issue. http://swerl.tudelft.nl/twiki/pub/Main/TechnicalReports/TUD-SERG-2007-016.pdf
In conclusion, if you want high data coherence and good network performance you should use push rather than pull. But push consumes more CPU and therefore has scalability issues.
The efficiency of pull depends on the pull interval versus the publish interval. If they were equal, everything would be perfect, but realistically they will never match.
As an additional personal opinion: if your server script is blocking in nature (like Ruby on Rails with a single-threaded server), you'd better rethink the solution.
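For reference, the kind of jQuery polling loop described in the question could look roughly like this; the /notifications endpoint and renderNotifications are placeholders, not part of the original code:

function pollNotifications() {
  $.getJSON('/notifications', function (data) {
    renderNotifications(data);              // update the UI with any new notifications
  }).always(function () {
    setTimeout(pollNotifications, 10000);   // schedule the next poll, even after an error
  });
}
pollNotifications();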

Node.js web sockets server: Is my idea for data management stable/scalable?

I'm developing an HTML5 browser multi-player RPG with node.js running in the backend and a web sockets plug-in for client data transfer. The problem I'm facing is accessing and updating user data; as you can imagine, this will happen many times a second even with few users connected.
I've done some searching and found only two plug-ins for node.js that enable MySQL capabilities, but they are both in early development, and I've concluded that querying the database for every little action the user makes is not efficient.
My idea is to have node.js access the database through PHP when a user connects and retrieve all the information related to that user. The collected information will then be stored in a JavaScript object in node.js. This will happen for all users playing. Updates will then be applied to the object. When a user logs off, the data stored in the object will be written back to the database and deleted from the object.
A few things to note: I will separate different types of data into different objects so that more commonly accessed data isn't mixed in with data that would slow down lookups. Theoretically, if this project gained a lot of users, I would introduce a cap on how many users can log onto a single server at a time, for obvious reasons.
I would like to know if this is a good idea. Would having large objects considerably slow down the node.js server? If you have ideas for other possible solutions to my situation, I welcome them.
Thanks
As far as your strategy goes, keeping the data in intermediate objects and going through PHP adds a very high level of complexity to your application.
Just the communication between node.js and PHP seems complex, and there is no guarantee this will be any faster than writing straight to MySQL. Putting any unneeded barrier between you and your data is going to make things harder to manage.
It seems like you need a faster data solution. You could consider an asynchronous database like MongoDB, or Redis, which will read and write quickly (Redis writes in memory, so it should be incredibly fast).
Both are commonly used with node.js precisely because they can handle real-time data loads.
Actually, Redis is what you're really asking for: it stores things in memory and persists them to disk periodically. You can't get much faster than that, but you will need enough RAM. If RAM looks like an issue, go with MongoDB, which is still really fast.
The disadvantage is that you will need to relearn your ideas about data persistence, and that is hard. I'm in the process of doing that myself!
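To make the Redis suggestion concrete, here is a rough sketch of caching per-user state in Redis instead of a plain JavaScript object in the Node process; it assumes the node-redis v4+ client, and loadUserFromMysql is a placeholder for the existing database access:

const { createClient } = require('redis');
const client = createClient();

// Call once at startup; node-redis v4+ requires an explicit connect
async function initRedis() {
  await client.connect();
}

async function getUser(userId) {
  const cached = await client.get('user:' + userId);
  if (cached) return JSON.parse(cached);

  const user = await loadUserFromMysql(userId);              // fall back to the database
  await client.set('user:' + userId, JSON.stringify(user));  // cache for next time
  return user;
}

async function saveUser(user) {
  // Write-through to the cache; persist back to MySQL on logout or on a timer
  await client.set('user:' + user.id, JSON.stringify(user));
}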
I have an application doing almost what you describe. I chose to do it that way because the MySQL drivers for Node were unstable/undocumented at the time of development.
I have 200 connected users, requesting data 3-5 times each second, and I fetch entire tables through PHP pages (every 200-800 ms) returning JSON from Apache, with approx. 1000 lines, and put the contents into arrays. I loop through the arrays and find the relevant data on request. It works, and it's fast, putting no significant load on CPU and memory.
All data insertion/updating, which is limited, goes through PHP/MySQL.
Advantages:
1. It's a simple solution, with known stable services.
2. Only one client connects to Apache/PHP/MySQL every 200-800 ms.
3. All Node clients get the benefit of non-blocking I/O.
4. It runs on 2 small "PC-style" servers and handles about 8000 req/second (apache bench).
Disadvantages:
1. Many - but it gets the job done.
I found that my Node script could stop 1-2 times a week, maybe due to some connection problems (unsolved), but combined with Upstart and Monit it restarts and alerts without problems.
