Apollo Explorer Failed to load resource: net::ERR_CERT_AUTHORITY_INVALID - javascript

I am running an Apollo Server with express to create an http server:
const express = require("express");
const cors = require('cors');
const http = require('http');
const { ApolloServer } = require('apollo-server-express');

const app = express();
const server = new ApolloServer({ ... }); // typeDefs, resolvers, etc. omitted
server.applyMiddleware({ app });
// enable pre-flight for CORS requests
app.options('*', cors());
// Create the HTTP server (plain HTTP, no TLS certificates)
let httpServer = http.createServer(app);
httpServer.listen({ port: config.port });
Locally I can run the server and query it on Apollo Explorer without any issues.
However, when I deploy this server to the dev environment and try to access the Explorer page with the dev endpoint, I get a few errors.
The app.options() line with the cors() handler seems to have solved some of them, but not all.
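For reference, the usual pattern with the cors package is to register it for normal requests as well, not only for the pre-flight OPTIONS handler. A minimal sketch, assuming the default CORS options are acceptable:
const cors = require('cors');
// Answer pre-flight OPTIONS requests for any route.
app.options('*', cors());
// Also attach the CORS headers (Access-Control-Allow-Origin, etc.) to every other response.
app.use(cors());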
Errors I am getting (on Dev Tools console):
Failed to load resource: net::ERR_CERT_AUTHORITY_INVALID
POST https://dev.endpoint.service/graphql net::ERR_CERT_AUTHORITY_INVALID
Errors I am getting (as popups on the Explorer page):
Unable to reach server
To diagnose the problem, please run:
npx diagnose-endpoint@1.0.12 --endpoint=https://dev.endpoint.service/graphql
I've tried running the command as instructed in the error and got this result:
Diagnosing https://dev.endpoint.service/graphql
Could not find any problems with the endpoint. Would you please to let us know about this at explorer-feedback@apollographql.com
Frankly, I'm not even sure I understand the problem.
Am I getting these errors because I am launching the Apollo server as plain HTTP (no certificates) but accessing it through an HTTPS endpoint (which requires certificates)? I have to do this: the service is deployed in an AKS cluster that is only reachable through the endpoint I am calling. But every service already there is also an HTTP service, not HTTPS, and is accessible through this same endpoint.
Also, even though these errors show up frequently, I can still query the server successfully most of the time in Explorer, and the data returned is exactly what I expect, which makes even less sense.
I am using the Edge browser, but I have also tried Chrome and get the same issues.
How can an error like this be intermittent?
Without any intervention on my part, it sometimes fails like this and sometimes works fine.
Any help, hints, ideas, please.
Thank you so much.

As much as it pains me to admit, it seems the issue is related to the VPN my company is using.
There were a few tells that pointed in this direction, once I started paying attention:
We can't access the endpoint I mentioned without the VPN turned on.
Other services in the AKS cluster show the same error if they are called constantly through the same endpoint. I did not think to run that test at first, but then I realized that with Apollo Server the schema is constantly being checked via introspection, which means this service is called far more often than the other services that do not have this functionality.
We have monitoring tools to check pod statuses and so on, and nothing indicated any problem with this service, or that it needed any kind of pod scaling (due to an excessive number of requests).
I actually performed a kubectl port-forward test, linking my localhost directly to the AKS cluster. Calling the service this way bypasses the endpoint I am normally forced to use before the request reaches the cluster. In one Apollo Studio window I was calling the service the normal way and seeing that error, while at the same time another Apollo Studio window was performing the same request through the port-forward bypass, and the latter was working just fine. If it really were a problem with the service, it would be down for both windows.
Other colleagues were testing the service at the same time as me and they were saying the service was working fine for them, until it wasn't. So every developer on my team could be accessing the service at the same time, and the error would just randomly show up for some, but not for others.
There are long periods where the error doesn't occur at all, like during lunch hours or after work hours, and I assume VPN traffic is much lower during those hours.

Related

ExpressJS Random timeouts on Heroku

I have an Express Node API on Heroku. It has a few open routes that make upstream requests to another web server, which then executes some queries and returns data.
So client->Express->Apache->DB2.
This has been running just fine for several months on a Heroku Hobby dyno. Now, however, we exposed another route and more client requests are coming in. Heroku is then throwing H12 errors (because the Express app isn't returning a response within 30 seconds).
I'm using axios in the Express app to make the requests to Apache and get data. However, I'm not seeing anything fail in the logs, and no errors are being caught that would give me more detail about why things could be timing out. I investigated the Apache->DB2 side of things, and the bottleneck doesn't seem to be there; it is almost certainly on the Express side.
Per Heroku's advice, I connected the app to New Relic but haven't gained any insights yet. Could this be a scalability issue with Express and a high number of new requests coming in over a short period of time? There aren't particularly many, i.e. ~50/min at the highest. Would beefing up the Heroku dyno do anything? Any other ways to actually see what's going on with Express?
It seems like 10-15% of the client requests are receiving the timeout (and it seems to happen at times when there are lots of incoming requests).
Thanks!
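One way to make these upstream failures visible, sketched below under assumptions that are not in the original post (the /report route and the upstream URL are placeholders), is to give the axios call an explicit timeout below Heroku's 30-second H12 cutoff and log the failure before responding:
const express = require("express");
const axios = require("axios");

const app = express();

// Hypothetical route; the real path and upstream Apache URL will differ.
app.get("/report", async (req, res) => {
  try {
    // Fail well before Heroku's 30 s router timeout so the error shows up in our own logs.
    const upstream = await axios.get("https://apache.example.internal/query", { timeout: 25000 });
    res.json(upstream.data);
  } catch (err) {
    // axios timeouts surface with err.code === 'ECONNABORTED'; other failures carry their own codes.
    console.error("Upstream request failed:", err.code || err.message);
    res.status(504).json({ error: "upstream timeout or failure" });
  }
});

app.listen(process.env.PORT || 3000);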

HTTP request is being blocked

I am working on multiple apps that communicate with each other. I am using both Chrome and Firefox to test my apps. The problem occurs in both browsers.
The problem:
I am sending a PUT request from app nr.1 to the Express Node server, which essentially writes an update to my Mongo database. Once updated, app nr.2 retrieves the updated value with a GET request. WebSockets are used to notify the apps of changes.
The problem, however, is that the HTTP GET requests on the receiving app nr.2 take multiple seconds to complete (after a few of them have been made).
To illustrate: the first few GET requests take 3-5 ms to complete, then the subsequent GET requests take up to 95634 ms to complete...
What could be the cause of this and how could this be fixed?
It is difficult to tell without seeing your whole stack.
Sometimes a reverse proxy that sits in front of your applications can cause issues like this.
It could be trying to route to IPv6 instead of IPv4, especially if you are pointing your GET requests at localhost. The fix is to use 127.0.0.1 instead of localhost (see the sketch after the list below).
Also, a high keepalive timeout setting on a proxy can cause this
Good first places to look in a situation like this are:
Proxy logs
Node logs
Server logs (e.g. firewall or throttling)
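A minimal sketch of the localhost vs. 127.0.0.1 point above, assuming app nr.2 polls app nr.1 with axios (the port and path are placeholders):
const axios = require("axios");

async function getLatestValue() {
  // "localhost" may resolve to the IPv6 loopback (::1) first on some systems,
  // which can stall if the server only listens on IPv4.
  // return axios.get("http://localhost:3000/api/value");

  // Using the IPv4 loopback address directly avoids that resolution path.
  return axios.get("http://127.0.0.1:3000/api/value");
}

getLatestValue().then(({ data }) => console.log(data));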

Intermittent "Cannot GET /" with node.js on Bluemix

For some reason, on all my Bluemix services, I intermittently get the error "Cannot GET /pathname" from my Node.js Express services (it works about 1/3 of the time). There is no error or logging shown in the application when this happens (however, that response is coming from Express, I assume).
Any ideas? I have no idea how to progress here. The server has ample resources (memory + CPU).
I've seen this happen before when the user accidentally has 2 different applications mapped to the same route/URL. The load balancer then hits a different application each time.
Try changing the route to something else, e.g. myappname2.mybluemix.net, and try to recreate the problem.
If that seems to fix it, log in to the UI and confirm that you do not have duplicate applications and that all applications have a unique route.

Prevent Express.js from crashing

I'm quite new to Express.js, and one of the things that surprised me most at first, compared to other servers such as Apache or IIS, is that an Express.js server crashes every time it encounters an uncaught exception or some other error, taking down the site and making it inaccessible to users. A terrible thing!
For example, my application is crashing with a JavaScript error because a variable is not defined due to a name change in the database table.
TypeError: Cannot call method 'replace' of undefined
This is not such a good example, because I should solve it before moving the site to production, but sometimes similar errors can occur that shouldn't be causing a server crash.
I would expect to see an error page or just an error in that specific page, but turning down the whole server for these kind of things sounds terrifying.
Express error handlers don't seem to be enough for this purpose.
I've been reading about how to solve this kind of thing in Node.js by using domains, but I found nothing specific to Express.js.
Another option I found, which doesn't seem to be recommended in all cases, is using tools that keep a process running forever, so that after a crash it restarts itself: tools like Forever, Upstart or Monit.
How do you guys deal with this kind of problems in Express.js?
The main difference between Apache and Node.js in general is that Apache forks a process per request, while Node.js is single-threaded. If an error occurs in Apache, only the process handling that request crashes and the others continue to work; in Node.js, the single thread goes down.
In my projects I use monit to check memory/CPU (if Node.js takes up too many resources on my VPS, monit will restart it) and daemontools to be sure Node.js is always up and running.
I would recommend using domains along with cluster. There is an example in the docs themselves at http://nodejs.org/api/domain.html. There are also modules for Express.js, such as https://www.npmjs.org/package/express-domain-middleware.
When such errors occur, using a domain along with cluster helps isolate the context in which the error occurred, so it affects only a single worker in the cluster. We should log the error, disconnect that worker, and fork a replacement. We can then read the logs and fix the errors that need to be fixed in the code.
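A minimal sketch of the cluster half of that recommendation (domains are nowadays soft-deprecated, so this sketch leaves them out; the port and route are placeholders): each worker runs its own Express app, and a crashed worker is replaced so one error never takes the whole site down.
const cluster = require("cluster");
const os = require("os");

if (cluster.isMaster) {
  // Fork one worker per CPU core.
  os.cpus().forEach(() => cluster.fork());
  // Replace any worker that dies so the site stays up.
  cluster.on("exit", (worker, code) => {
    console.error(`worker ${worker.process.pid} exited (code ${code}), forking a new one`);
    cluster.fork();
  });
} else {
  const express = require("express");
  const app = express();

  app.get("/", (req, res) => res.send("ok"));

  // Express error-handling middleware: errors passed to next(err) or thrown
  // synchronously in a handler return a 500 instead of crashing the worker.
  app.use((err, req, res, next) => {
    console.error(err.stack);
    res.status(500).send("Something went wrong");
  });

  app.listen(3000);
}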
I was facing the same issue and fixed it using try/catch like this. I created different route files and wrapped each require in its own try/catch block:
try {
  app.use('/api', require('./routes/user'));
} catch (e) {
  console.log(e);
}

try {
  app.use('/api', require('./routes/customer'));
} catch (e) {
  console.log(e);
}

Websockets not working in my Rails app when I run on Unicorn server, but works on a Thin server

I'm learning Ruby on Rails to build a real-time web app with WebSockets on Heroku, but I can't figure out why the websocket connection fails when running on a Unicorn server. I have my Rails app configured to run on Unicorn both locally and on Heroku using a Procfile...
web: bundle exec unicorn -p $PORT -c ./config/unicorn.rb
...which I start locally with $foreman start. The failure occurs when creating the websocket connection on the client in javascript...
var dispatcher = new WebSocketRails('0.0.0.0:3000/websocket'); //I update the URL before pushing to Heroku
...with the following error in the Chrome JavaScript console: WebSocket connection to 'ws://0.0.0.0:3000/websocket' failed: Connection closed before receiving a handshake response.
...and when I run it on Unicorn on Heroku, I get a similar error in the Chrome JavaScript console: WebSocket connection to 'ws://myapp.herokuapp.com/websocket' failed: Error during WebSocket handshake: Unexpected response code: 500.
The stack trace in the Heroku logs says, RuntimeError (eventmachine not initialized: evma_install_oneshot_timer):
What's strange is that it works fine when I run it locally on a Thin server using the command $rails s.
I've spent the last five hours researching this problem online and haven't found the solution. Any ideas for fixing this, or even ideas for getting more information out of my tools, would be greatly appreciated!
UPDATE: I found it strange that websocket-rails only supported EventMachine-based web servers, while faye-websocket, which websocket-rails is based upon, supports many multithread-capable web servers.
After further investigation and testing, I realised that my earlier assumption had been wrong. Instead of requiring an EventMachine-based web server, websocket-rails appears to require a multithread-capable web server (so no Unicorn) that supports rack.hijack. (Puma meets these criteria while being comparable in performance to Unicorn.)
With this assumption, I tried solving the EventMachine not initialized error using the most direct method, namely, initializing EventMachine, by inserting the following code in an initializer config/initializers/eventmachine.rb:
Thread.new { EventMachine.run } unless EventMachine.reactor_running? && EventMachine.reactor_thread.alive?
and.... success!
I have been able to get Websocket Rails working on my local server over a single port using a non-EventMachine-based server without Standalone Server Mode. (Rails 4.1.6 on ruby 2.1.3p242)
This should be applicable on Heroku as long as you have no restriction in web server choice.
WARNING: This is not an officially supported configuration for websocket-rails. Care must be taken when using multithreaded web servers such as Puma, as your code and that of its dependencies must be thread-safe. A (temporary?) workaround is to limit the maximum threads per worker to one and increase the number of workers, achieving a system similar to Unicorn.
Out of curiosity, I tried Unicorn again after fixing the above issue:
The first websocket connection was received by the web server (Started GET "/websocket" for ...), but the state of the websocket client was stuck on connecting, seeming to hang indefinitely.
A second connection resulted in HTTP error code 500, along with app error: deadlock; recursive locking (ThreadError) showing up in the server console output.
By the (potentially dangerous) action of removing Rack::Lock, the deadlock error can be resolved, but connections still hang, even though the server console shows that the connections were accepted.
Unsurprisingly, this fails. From the error message, I think Unicorn is incompatible for reasons related to its network architecture (threading/concurrency). But then again, it might just be some bug in this particular Rack middleware...
Does anyone know the specific technical reason for why Unicorn is incompatible?
ORIGINAL ANSWER:
Have you checked the ports for both the web server and the WebSocket server and their debug logs? Those error messages sound like they are connecting to something other than a WebSocket server.
A key difference in the two web servers you have used seems to be that one (Thin) is EventMachine-based and one (Unicorn) is not. The Websocket Rails project wiki states that a Standalone Server Mode must be used for non-EventMachine-based web servers such as Unicorn (which would require an even more complex setup on Heroku as it requires a Redis server). The error message RuntimeError (EventMachine not initialized: evma_install_oneshot_timer): suggests that standalone-mode was not used.
Heroku, AFAIK, only exposes one internal port (provided as an environment variable) externally, as port 80. A WebSocket server normally requires its own socket address (port number), although this can be worked around by reverse-proxying the WebSocket server. Websocket-rails appears to get around this limitation by hooking into an existing EventMachine-based web server (which Unicorn does not provide) via Rack hijacking.
