Node.js - file generator architecture - JavaScript

I need to add a file generator REST API endpoint to a web app. So far I've come up with the following idea:
the client sends file parameters to the endpoint
the server receives the request and, using AMQP, sends the parameters to a dedicated service
the dedicated service creates a file, puts it into a server folder, and sends a response with the file name saying the file was created
the endpoint sends the response to the client with the file
I'm not sure if it's a good idea to keep a REST request open on the server for so long, but I still don't want to use email with a generated link, or sockets.
Do I need to set a timeout on the request so it will not be dropped after a long wait?
As far as I know the maximum timeout for a REST API call is 120 seconds. If the service needs more time than that to create a file, then I need to use sockets, is that right?
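For what it's worth, the 120-second figure usually comes from default server/socket timeouts (and from any proxies in between) rather than a hard limit of REST itself, and it can typically be raised. Below is a minimal sketch of adjusting it on a plain Node.js HTTP server; the exact defaults vary between Node versions, so treat the numbers as assumptions:

const http = require('http');

const server = http.createServer((req, res) => {
  // ...long-running file generation would happen here...
  res.end('done');
});

// Allow idle sockets to stay open for 5 minutes instead of the default
// (the default value and the related timeout properties differ across Node versions).
server.setTimeout(5 * 60 * 1000);

server.listen(3000);

Even with a longer timeout, holding the connection open is fragile, because any proxy or load balancer in front of the app may enforce its own, shorter limit.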

The way I've handled similar is to do something like this:
Client sends request for file.
Server adds this to a queue with a 'requested' state, and responds (to the client) almost immediately with a response which includes a URL to retrieve the file.
Some background thread/worker/webJob/etc is running in a separate process from the actual web server and is constantly monitoring the queue - when it sees a new entry appear it updates the queue to a 'being generated' state and begins generating the file. When it finishes, it updates the queue to a 'ready' state and moves on...
When the server receives a request to download the file (via the URL it gave the client), it can check the status of the file on the queue. If it's not ready, it can give a response indicating this. If it IS ready, it can simply respond with the file contents.
The client can use the response to the initial request to re-query the URL it was given after a suitable length of time, or repeatedly query it every couple of seconds or whatever is most suitable.
You need some way to store the queue that is accessible easily by both parts of the system - a database is the obvious one, but there are other things you could use...
This approach avoids either doing too much on a request thread or having the client 'hanging' on a request whilst the server is compiling the file.
That's what I've done (successfully) in these sorts of situations. It also makes it easy to add things like lifetimes to the queue, so a file can automatically 'expire' after a while too...
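As a rough illustration of that flow in Node, here is a minimal Express sketch; the route names, the in-memory jobs map, and the generateFile helper are assumptions for illustration, not part of the original setup (a real system would keep the queue in a database so the worker can run in a truly separate process):

const express = require('express');
const crypto = require('crypto');
const app = express();

// Stand-in for the shared queue; in production this would be a database
// reachable by both the web server and the worker process.
const jobs = new Map();

app.post('/files', express.json(), (req, res) => {
  const id = crypto.randomUUID();
  jobs.set(id, { state: 'requested', params: req.body, path: null });
  // Respond almost immediately with the URL the client should poll.
  res.status(202).json({ statusUrl: `/files/${id}` });
});

app.get('/files/:id', (req, res) => {
  const job = jobs.get(req.params.id);
  if (!job) return res.sendStatus(404);
  if (job.state !== 'ready') return res.status(200).json({ state: job.state });
  res.download(job.path); // file is ready: stream it back
});

app.listen(3000);

// Simulated background worker: picks up 'requested' jobs, marks them
// 'being generated', writes the file (generateFile is hypothetical),
// then marks them 'ready'.
setInterval(() => {
  for (const job of jobs.values()) {
    if (job.state === 'requested') {
      job.state = 'being generated';
      generateFile(job.params).then((path) => {
        job.path = path;
        job.state = 'ready';
      });
    }
  }
}, 1000);

The client takes the statusUrl from the 202 response and re-queries it every couple of seconds until the state flips to 'ready'.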

Related

React - How to call an API multiple times and get data chunk by chunk

Problem
I have a scenario where I need to call an API multiple times until I get the full data from the server. Since the data is huge, the server sends the data chunk by chunk.
Current Implementation
The current implementation is in GWT (Google Web Toolkit), and the scenario is handled like this: if the server is taking time to collect the full data, it sends a response with a token id and the first chunk of data, so that the client does not time out and can make another request with the latest token id. This process continues until the client gets the full data.
Challenge in the React part
Currently I am planning a Redux and Redux-Saga implementation. But the challenge is how to hold the data without updating the state, make the repeated API calls, and wait until the server has responded with the full data.
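One way to do that is to accumulate the chunks in a local variable (or inside the saga) and only dispatch to the Redux store once the server signals that everything has arrived. A rough sketch, assuming a hypothetical /api/data endpoint that responds with { token, chunk, done }; the URL and field names are assumptions about the server contract, not a known API:

// Hypothetical chunked fetch: the endpoint and the token/chunk/done
// fields are assumptions about the server's contract.
async function fetchAllChunks() {
  let token = null;
  const chunks = [];
  do {
    const url = token ? `/api/data?token=${token}` : '/api/data';
    const res = await fetch(url);
    const body = await res.json();
    chunks.push(...body.chunk);   // hold the data locally, no state update yet
    token = body.token;
    if (body.done) break;         // server says there is nothing left
  } while (token);
  return chunks;                  // dispatch to Redux exactly once, with the full result
}

Inside a saga this would typically be wrapped as yield call(fetchAllChunks) followed by a single yield put(...) with the complete data.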

On form submit, does the server ‘directly’ receive req or listen to changes in a particular place?

Please forgive me if my question sounds naive. I researched on Google and several forums, but couldn't find anything that is clear.
Here is my dilemma:
Step 1 -> Node.js Server is listening
Step 2 -> User on page ‘/new-users’. (POST, ‘/signup-controller’)
Step 3 (& maybe Step 4) -> I’d like to know what happens here, before the server decides where to take the data.
In step 1, was the server listening to local storage to see if any new requests are there?
Or, does it ‘directly’ receive the request in step 3?
I’ve always been under the impression that servers just listen for changes, meaning they do not literally ‘receive’ req or res data.
Thanks a lot for reading my question and I look forward to any feedback.
EDIT: to clarify, does the client walk up to the server directly and hand over the data, hand to hand, or does the client store the data at some ‘locker’ or ‘location’, and the server notices a filled locker, hence triggering the subsequent events?
No, it will directly receive the request data. If you are using a framework like Express in Node, you can use middleware to validate or check the request data and move forward.
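For example, a hand-written validation middleware in Express might look roughly like this; the field names are placeholders and the route is taken from the question, so treat the details as illustrative only:

const express = require('express');
const app = express();
app.use(express.json()); // parse the JSON body before it reaches the handlers

// Middleware runs before the route handler and can reject bad requests early.
function requireSignupFields(req, res, next) {
  const { email, password } = req.body || {}; // placeholder field names
  if (!email || !password) {
    return res.status(400).json({ error: 'email and password are required' });
  }
  next(); // hand the request on to the actual handler
}

app.post('/signup-controller', requireSignupFields, (req, res) => {
  res.status(201).json({ ok: true });
});

app.listen(3000);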
The server only listens for a request, not for a response.
When it finds a request (req), it operates on this request and based on that must deliver a response (res) with data, files, an error... whatever.
The server receives a POST or GET (depending on the METHOD attribute in the FORM tag). If you want to implement some logic to decide where to put the data, it should be done by the server, analyzing the data. Hidden input tags (type="hidden") could assist by supplying info, like a hidden input tag saying "NEW" or "EDIT" and the "ID", for example.
Using an AJAX method instead lets you negotiate with the server before the final POST.
hth.
Ole K Hornnes
On step 1, was the server listening to the local storage to see if any new requests are there?
No, the server is not listening to local storage; it is listening on the server port and waiting for a request.
does it ‘directly’ receive the request in step 3?
The server will receive it when the client sends a request - in your case, step 2.
The data from the form is formatted into an HTTP request and sent over the network to the server directly. The server receives it from the network, puts it into memory (RAM), and calls your handler.
A TCP connection (that HTTP is built on) transmits sequences of bytes - that's why it is called a stream-oriented transport. This means you get the bytes in the same order you've sent them. An HTTP request is just a piece of text which looks similar to this:
POST /signup-controller HTTP/1.1
Host: localhost:8080
Content-Type: application/json
Content-Length: 17
{"hello":"world"}
Note the blank line between the headers and the body. This gap is what allows Node.js (and HTTP servers in general) to quickly determine that the request is meant for localhost:8080/signup-controller using the POST method, without looking at the rest of the message! If the body was much larger (a real monster of a JSON), it would not make a difference, because the headers are still just a few short lines.
Thus, Node.js only has to buffer that part until the blank line (formally, \r\n\r\n) in memory. It gets to that point and it knows to call the HTTP request handler function that you've supplied. The rest - after the line break - is then available in the req object as a Readable Stream.
Even though there is some amount of buffering involved at each step (at the client, in switches, at intermediate routers, in the server's kernel, and finally in the server process), the communication is "direct" - one process on one host communicates with another process on another host, without involving the disk at any point.
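You can see this directly with Node's built-in http module: the handler fires once the request line and headers have been parsed, and the body arrives afterwards as a stream on the req object:

const http = require('http');

http.createServer((req, res) => {
  // Called as soon as the request line and headers have been parsed.
  console.log(req.method, req.url, req.headers['content-type']);

  let body = '';
  req.on('data', (chunk) => { body += chunk; }); // the body streams in afterwards
  req.on('end', () => {
    console.log('body:', body);
    res.end('ok');
  });
}).listen(8080);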

Node.js Request drops before Response is received

The project I am working on receives a request where the main and/or largest part of that request consists of data coming from a database. Upon receiving it, my system proceeds with its function, which is to parse all the data and ultimately concatenate the needed information to form a query, then insert that data into my local database using the mentioned query.
It is working fine with no issues at all, except for the fact that it takes too long to process when the request has over 6,000,000 characters and over 200,000 lines (or maybe less, but still large numbers).
I have tested this with my system being used as a server (the supposed setup in production), and with Postman as well, but both drop the connection before the final response is built and sent. I have already tested and seen that although the connection drops, my system still proceeds with processing the data even up to the query, and even until it sends its supposed response. But since the connection dropped somewhere in the middle of the processing, the response is ignored.
Is this about connection timeout in nodejs?
Or the limit in app.use(bodyParser.json({limit: '10mb'}))?
I really only see one way around this. I have done something similar in the past. Allow the client to send as much as you need/want. However, instead of trying to have the client wait around for some undetermined amount of time (at which point the client may time out), send an immediate response that is basically "we got your request and we're processing it".
Now the not-so-great part, but it's the only way I've ever solved this type of issue. In your "processing" response, send back some sort of id. Now the client can check once in a while to see if its request has been finished by sending you that id. On the server end you store the result for the client by the id you gave them. You'll have to make a few decisions about things like how long a response id is kept around and whether it can be requested more than once, things like that.
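On the client, that pattern is usually just a submit followed by polling with the returned id. A rough sketch; the endpoint paths and response shape here are assumptions for illustration:

// Assumed contract: POST /import returns { id }, and
// GET /import/:id returns { state: 'processing' | 'done', result }.
async function submitAndWait(payload) {
  const res = await fetch('/import', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(payload),
  });
  const { id } = await res.json(); // immediate "we're processing it" reply

  // Poll every few seconds until the server reports the job is finished.
  while (true) {
    await new Promise((resolve) => setTimeout(resolve, 3000));
    const status = await (await fetch(`/import/${id}`)).json();
    if (status.state === 'done') return status.result;
  }
}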

Can the same php query handle simultaneous ajax requests from different pages

I have a PHP page named update_details.php?id=xyz which has a query for getting the details and updating the login time of the users.
The users have a profile page named profile.php?id=xyz. So for different users the profile page is different like profile.php?id=abc, profile.php?id=def etc. Now this profile.php has an ajax function that sends the user id to the update_details.php through ajax call so that the update_details.php can update the record.
Now, for example, say I have 2000 users and all of them open their profile page simultaneously. My question is: will the update_details page be able to handle this? I mean, is it one update_details.php, or is each update_details.php?id=abc, update_details.php?id=def etc. considered to be a separate one?
To be more precise, when 2000 users are updating their record through 2000 ajax calls, are the calls going to one update_details.php or to the one according to their ids like update_details.php?id=abc, update_details.php?id=def etc. TIA
Okay, let's check how the request goes from the browser till it's served and the browser gets a response.
The client clicks on a link, maybe a button.
The browser makes an HTTP request and sends it to the server (that may be Apache, nginx, whatever you use).
The server analyzes the request and checks its rules, saying: I found a rule - when I hit a URL with a .php extension, I run a PHP interpreter and pass it the request info.
The server spawns a new process or assigns the request to one of its workers (it depends on the internals of the server).
How many concurrent PHP processes will run? It depends on the web server configuration and design.
So to answer your question: each PHP process that is running has its own isolated memory segment, even if they are all executing the same instructions from update_details.php.
Think of it like 10 workers in a factory crafting a chair following the same instructions, but each one uses a different paint color, wood type, etc.

handle HTTP time out for ajax save

I have a JavaScript application that regularly saves new and updated data. However, I need it to work on slow connections as well.
Data is submitted in one single HTTP POST request. The response will return newly inserted ids for newly created records.
What I'm finding is that the submitted data is fully saved; however, sometimes the response times out. The browser application therefore does not know the data has been saved successfully and will try to save it again.
I know I can detect the timeout in the browser, but how can I make sure the data is saved correctly?
What are some good methods of handling this case?
I see from here https://dba.stackexchange.com/a/94309/2599 that I could include a pending state:
Get transaction number from server
send data, gets saved as pending on server
if pending transaction already exists, do not overwrite data, but send same results back
if success received, commit pending transaction
if error back, retry later
if timeout, retry later
However, I'm looking for a simpler solution.
Really, it seems you need to get to the bottom of why the client thinks the data was not saved, but it actually was. If the issue is purely one of timing, then perhaps a client timeout just needs to be lengthened so it doesn't give up too soon or the amount of data you're sending back in the response needs to be reduced so the response comes back quicker on a slow link.
But, if you can't get rid of the problem that way, there are a bunch of possibilities to program around the issue:
The server can keep track of the last save request from each client (or a hash of such request) and if it sees a duplicate save request come in from the same client, then it can simply return something like "already-saved".
The code flow in the server can be modified so that a small response is sent back to the client immediately after the database operation has committed (no delays for any other types of back-end operations), thus lessening the chance that the client would timeout after the data has been saved.
The client can coin a unique ID for each save request, and if the server sees the same save ID being used on multiple requests, it can know that the client is just trying to save the same data again (see the sketch after this list).
After any type of failure, before retrying, the client can query the server to see if the previous save attempt succeeded or failed.
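The unique-ID suggestion above is essentially an idempotency key. A minimal server-side sketch in Express; the X-Save-Id header name, the in-memory cache, and the placeholder result are assumptions for illustration (a real implementation would persist saved results with an expiry):

const express = require('express');
const app = express();
app.use(express.json());

// Maps a client-generated save id to the response already produced for it.
// In production this would live in a database or cache with an expiry time.
const completedSaves = new Map();

app.post('/save', (req, res) => {
  const saveId = req.get('X-Save-Id'); // assumed header name, coined by the client
  if (saveId && completedSaves.has(saveId)) {
    // The client is retrying a save that already committed: return the same result.
    return res.json(completedSaves.get(saveId));
  }
  const result = { ids: [101, 102], alreadySaved: false }; // placeholder inserted record ids
  if (saveId) completedSaves.set(saveId, { ...result, alreadySaved: true });
  res.json(result);
});

app.listen(3000);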
You can keep a count of retries as a simple global int.
You can also automatically retry, but this isn't good for an auto-save app.
A third option is to use the auto-save plugins for jQuery.
A few suggestions:
Increase the timeout, and don't treat a timeout as success.
You can flush the output of each record as soon as you get it, using ob_flush and flush.
Since you are making requests at a regular interval, check the connection_aborted method on each API call. If the client has disconnected, you can save the response in a temp file, and on the next request you can append the last response to the new response, but this method is more resource-consuming.
