ExpressJS Random timeouts on Heroku

I have an Express Node API on Heroku. It has a few open routes that make upstream requests to another web server, which then executes some queries and returns data.
So: client -> Express -> Apache -> DB2.
This ran just fine for several months on a Heroku Hobby dyno. Now, however, we have exposed another route and more client requests are coming in. Heroku is throwing H12 errors (because the Express app isn't giving back a response within 30 seconds).
I'm using axios in the Express app to make the requests to Apache and get data. However, I'm not seeing anything fail in the logs; no errors are getting caught to give me more detail about why things could be timing out. I investigated the Apache -> DB2 side of things, and the bottleneck doesn't seem to be there; it is almost certainly on the Express side.
Per Heroku's advice, I connected the app to New Relic but haven't gained any insights yet. Could this be a scalability issue with Express when many new requests arrive in a short period of time? There aren't particularly many, i.e. ~50/min at the highest. Would beefing up the Heroku dyno do anything? Are there any other ways to actually see what's going on with Express?
It seems like 10-15% of the client requests are receiving the timeout (and it seems to happen at times when there are lots of incoming requests).
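A sketch of one way to surface these failures: give axios its own timeout, shorter than Heroku's 30-second window, so the error is caught and logged instead of the dyno killing the request silently. The route path and APACHE_URL below are placeholders, not the app's real names:
const axios = require("axios");
const express = require("express");
const app = express();

// Route path and APACHE_URL are placeholders for the real upstream.
app.get("/data", async (req, res) => {
  const started = Date.now();
  try {
    // Give up at 25 s so the failure is caught and logged here,
    // before Heroku returns an H12 at 30 s.
    const upstream = await axios.get(`${process.env.APACHE_URL}/query`, {
      timeout: 25000,
    });
    res.json(upstream.data);
  } catch (err) {
    console.error(`upstream call failed after ${Date.now() - started} ms:`, err.code || err.message);
    res.status(504).json({ error: "upstream timeout" });
  }
});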
Thanks!

Related

Apollo Explorer Failed to load resource: net::ERR_CERT_AUTHORITY_INVALID

I am running an Apollo Server with Express to create an HTTP server:
const express = require("express");
const cors = require('cors');
const http = require("http");
const { ApolloServer } = require("apollo-server-express");

const app = express();
const server = new ApolloServer({ ... });
server.applyMiddleware({ app });
// enable pre-flight for CORS requests
app.options('*', cors());
// Create the HTTP server (config is defined elsewhere in the app)
let httpServer = http.createServer(app);
httpServer.listen({ port: config.port });
Locally I can run the server and query it on Apollo Explorer without any issues.
However, when I deploy this server to the dev environment and try to access the Explorer page with the dev endpoint, I get a few errors.
The app.options() line with the cors argument somehow seems to have solved some of them, but not all.
Errors I am getting (on Dev Tools console):
Failed to load resource: net::ERR_CERT_AUTHORITY_INVALID
POST https://dev.endpoint.service/graphql net::ERR_CERT_AUTHORITY_INVALID
Errors I am getting (as popups on the Explorer page):
Unable to reach server
To diagnose the problem, please run:
npx diagnose-endpoint@1.0.12 --endpoint=https://dev.endpoint.service/graphql
I've tried running the command as instructed in the error and got this result:
Diagnosing https://dev.endpoint.service/graphql
Could not find any problems with the endpoint. Would you please to let us know about this at explorer-feedback@apollographql.com
Frankly, I'm not even sure I understand the problem.
Am I getting these errors because, even though I launch Apollo's HTTP server without certificates, I am trying to access it via an HTTPS endpoint (which requires certificates)? I have to do this: the service is deployed in an AKS cluster, which is only accessible through the endpoint I am calling. But every service already there is also an HTTP service, not HTTPS, and is accessible through this same endpoint.
Also, even though these errors show up frequently, I am able to query the server successfully most of the time on Explorer, and the data returned is exactly what I expect, which makes even less sense.
I am using the Edge browser, but I also tried Chrome and have the same issues.
How can an error like this be intermittent?
Any help, hints, or ideas, please.
Thank you so much.
As much as it pains me to admit, it seems the issue is related to the VPN my company is using.
There were a few tells that pointed in this direction, once I started paying attention:
We can't access the endpoint I mentioned without the VPN turned on.
Other services in the AKS cluster show the same error if they are called constantly through the same endpoint. I did not think to run that test at first, but once I realized that Apollo Server constantly performs introspection to check the schema, it was clear this service is called far more often than the other services, which don't have that functionality.
We have monitoring tools to check pod statuses and so on, and nothing indicated any problem with this service, or that it needed any kind of pod scaling (due to an excessive number of requests).
I also performed a kubectl port-forward test, linking my localhost directly to the AKS cluster. Calling the service this way bypasses the endpoint I am normally forced to go through before a request reaches the cluster. In one Apollo Studio window I called the service the normal way and saw the error; in another window, at the same time, the same request through the port-forward bypass worked just fine. If it really were a problem with the service, it would be down in both windows.
Other colleagues were testing the service at the same time as me and said it was working fine for them, until it wasn't. Every developer on my team could be hitting the service at the same moment, and the error would randomly show up for some but not for others.
There are long periods where the error doesn't occur at all, like during lunch hours or after work hours, when I assume VPN traffic is much lower.

SQL - node.js express - react load behavior

I am relatively new to programming and I have a general question about the relationship between the server and client sides of a React (or any other JS) app.
I have a MySQL DB with a table that I expose as an API (every n seconds) with Node.js/Express running on an AWS instance. That API is pulled as JSON and displayed every n seconds by the React app.
In my head, the connection between SQL and Node.js is separate from the connection between Node.js and React. I think of it like this: the SQL database is only connected to one thing (the Node/Express server) and therefore is never under heavy load. The Node/Express server then exposes the SQL table through a few queries as 3-4 JSONs. And finally, let's say 100 people open my React app and pull those JSONs. So the only loaded part of the system is the Node/Express server.
Am I correct? Or do I completely misunderstand how this works?
Thank you in advance!
Or do I completely misunderstand how this works?
It works the way you make it work, and it seems you are on the right track.
The technique you are describing is called "caching", or at least a form of it, and it is a good way to take load off your database. Instead of piping every request that reaches the Express server through to the database, you store the result of the first request in memory (e.g. in an object) on your Express server. The next request is then served directly from memory, without asking the database.
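A minimal sketch of that idea (queryDatabase stands in for however you actually run your SQL queries; the table name, port, and 10-second interval are made up):
const express = require("express");
const app = express();

let cached = null;

// Refresh the in-memory copy every n seconds. queryDatabase is a
// placeholder for your actual MySQL query code.
async function refresh() {
  cached = await queryDatabase("SELECT * FROM my_table");
}
refresh();
setInterval(refresh, 10 * 1000); // n = 10 seconds here

// However many clients poll this route, the database only ever sees
// the one periodic query above.
app.get("/api/data", (req, res) => {
  res.json(cached);
});

app.listen(3000);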
Apart from polling, you could use other communication channels too, but the same techniques apply to avoid hammering the database:
Server-Sent Events (see the sketch after this list)
Websockets
Streaming (the HTTP response does not close immediately; the server keeps sending data every n seconds)
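As a rough sketch of the first option, reusing the cached object and app from the snippet above (the route path is again made up; on the browser side you would open it with new EventSource("/api/stream")):
// Server-Sent Events: each client keeps one long-lived HTTP connection
// open and the server pushes the cached data down it every n seconds.
app.get("/api/stream", (req, res) => {
  res.set({
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    Connection: "keep-alive",
  });
  const timer = setInterval(() => {
    res.write(`data: ${JSON.stringify(cached)}\n\n`);
  }, 10 * 1000);
  // Stop pushing when the client disconnects.
  req.on("close", () => clearInterval(timer));
});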

HTTP request is being blocked

I am working on multiple apps that communicate with each other. I am using both Chrome and Firefox to test my apps, and the problem occurs in both browsers.
The problem:
I am sending a PUT request from app nr. 1 to the Express/Node server, which essentially writes an update to my Mongo database. Once it's updated, app nr. 2 retrieves the updated value with a GET request. WebSockets are used to notify the apps of changes.
The problem, however, is that the HTTP GET requests on the receiving app nr. 2 take multiple seconds to complete (after a few of them have been done).
To illustrate what the network tab showed: the first few GET requests take 3-5 ms to complete, then the subsequent GET requests take up to 95634 ms to complete...
What could be the cause of this and how could this be fixed?
It is difficult to tell without seeing your whole stack.
Sometimes a reverse proxy that sits in front of your applications can cause issues like this.
It could be trying to route to IPv6 instead of IPv4, especially if your GET requests point at localhost. The fix is to use 127.0.0.1 instead of localhost.
Also, a high keepalive timeout setting on a proxy can cause this.
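A minimal illustration of the localhost fix (the port and path here are placeholders):
const axios = require("axios");

// "localhost" may resolve to the IPv6 address ::1 first; if the server
// only listens on IPv4, each request can stall until the resolver falls
// back. Pointing at 127.0.0.1 skips that lookup entirely.
axios.get("http://127.0.0.1:3000/api/values")
  .then((res) => console.log(res.data))
  .catch((err) => console.error(err.message));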
Good first places to look in a situation like this are:
Proxy logs
Node logs
Server logs (i.e. firewall or throttling)

Node.js: How to determine cause of server hangs

I recently started noticing my Node.js app server hanging after a number of requests. After more investigation I narrowed it down to a specific endpoint. After about 30 hits on this endpoint, the server just stops responding and requests start timing out.
In summary, on each request the client uploads a file, the server does some preliminary checks, creates a new processing job, puts it on a queue using bull, and returns a response to the client. The server then continues to process the job from the queue. The job process involves retrieving job data from a Redis database, clients opening WebSocket connections to check on job status, and the server writing to the database once the job is complete. I understand any one of those things could be causing the hang-ups.
There are no errors or exceptions being thrown or logged; all I see is requests starting to time out. I am using node-inspector to try to figure out what is causing the hang-ups, but I'm not sure what to look for.
My question is: is there a way to determine the root cause of a hang, using a debugger or some other means, and to figure out whether too many ABC instances have been created or too many XYZ connections are open?
In this case it was not actually a Node.js-specific issue. I was leaking a database connection in one code path. I learned the hard way that if you exhaust database connections, your Node server will stop responding and requests just seem to hang.
A good way to track this down, on Postgres at least, is by running the query:
SELECT * FROM pg_stat_activity
which lists all the open database connections, as well as the last query run on each connection. Then you can check your code for where that query is called, which is super helpful in tracking down leaks.
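The leaking code itself isn't shown above, but as an illustration with node-postgres (pg), the classic leak-and-fix pattern looks like this (the table, query, and pool size are made up):
const { Pool } = require("pg");
const pool = new Pool({ max: 10 }); // connection details omitted

async function getUser(id) {
  const client = await pool.connect();
  try {
    const result = await client.query("SELECT * FROM users WHERE id = $1", [id]);
    return result.rows[0];
  } finally {
    // Without this release, every call permanently checks out one of the
    // 10 pooled connections; after 10 calls the pool is exhausted and all
    // later requests hang waiting for a free connection.
    client.release();
  }
}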
A lot of your Node server debugging can be done using the following tools.
Nodemon is an excellent debugging aid because it displays errors just like node, but it also restarts your server on changes, removing a lot of the stop/start hassle.
https://nodemon.io/
Finally, I would recommend Postman. Postman lets you manually send requests to your server and will help you narrow your search.
https://www.getpostman.com/
Related to the OP's issue and solution, from my experience: if you are using the mysql library and check out a specific connection with pool.getConnection(...), you need to remember to release it afterwards with connection.release().
Otherwise your pool will run out of connections and your process will hang.
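A short sketch of that pattern with the mysql library (pool settings and the query are made up):
const mysql = require("mysql");
const pool = mysql.createPool({ connectionLimit: 10 }); // credentials omitted

pool.getConnection((err, connection) => {
  if (err) return console.error(err);
  connection.query("SELECT 1", (queryErr, results) => {
    // Release the connection back to the pool whether or not the query
    // succeeded; forgetting this is what eventually exhausts the pool.
    connection.release();
    if (queryErr) return console.error(queryErr);
    console.log(results);
  });
});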

Intermittent "Cannot GET /" with node.js on Bluemix

For some reason, on all my Bluemix services, I intermittently get the error "Cannot GET /pathname" from my Node.js Express services. It works about 1/3 of the time, and there is no error or logging in the application when it happens (though I assume that response is coming from Express).
Any ideas? I have no idea how to proceed here. The server has ample resources (memory + CPU).
I've seen this happen before when the user accidentally has 2 different applications mapped to the same route/URL. The load balancer hits a different application each time.
Try changing the route to something else, e.g. myappname2.mybluemix.net, and try to recreate the problem.
If that seems to fix it, log in to the UI and confirm that you do not have duplicate applications and that all applications have a unique route.
