Debugging ENOTFOUND error using aws sdk v3 - javascript

I've been using the AWS JS SDK v3 and have noticed that my Lambdas intermittently hit errors connecting to AWS resources. Below is an example for DynamoDB, but I have also had issues connecting to Secrets Manager. My Lambdas and resources are all contained within a VPC. These issues seem to occur more often during a Lambda cold start, but I'm not entirely sure. If a request is resent (the user refreshes the page on the frontend), the error seems to go away. I was hoping that the built-in client retries would reduce the errors my code sees, but it appears that no retries are attempted.
I am looking for debugging tips that might reveal the cause of these issues. So far I've been looking through CloudWatch logs, which do not appear to offer any good insights. I believe this is being caused by failed DNS resolution, but I am surprised by the frequency of these errors. Short of moving my Lambdas to EC2 and using a DNS cache, what are ways in which I can fix this?
This article: https://aws.amazon.com/premiumsupport/knowledge-center/vpc-find-cause-of-failed-dns-queries/ suggests increasing the DNS retry timer, but I'm also unsure how I would do that.
{
  "errorType": "Error",
  "errorMessage": "getaddrinfo ENOTFOUND dynamodb.us-east-1.amazonaws.com",
  "code": "ENOTFOUND",
  "errno": -3008,
  "syscall": "getaddrinfo",
  "hostname": "dynamodb.us-east-1.amazonaws.com",
  "$metadata": {
    "attempts": 1,
    "totalRetryDelay": 0
  },
  "stack": [
    "Error: getaddrinfo ENOTFOUND dynamodb.us-east-1.amazonaws.com",
    "    at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:71:26)"
  ]
}

How do you connect to DynamoDB from within your VPC?
Are you using a NAT instance or gateway?
Are you using custom DNS resolution?
I would suggest adding a DynamoDB VPC endpoint (VPCe) to your VPC so that you can reach DynamoDB over the AWS private network.
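Separately, the error's $metadata shows "attempts": 1, so the SDK gave up after a single try. Below is a minimal sketch of raising the retry budget on the v3 client; maxAttempts is a standard client option, but whether a given DNS failure is classified as retryable by the default retry strategy can vary, so treat this as a mitigation rather than a fix:
const { DynamoDBClient } = require("@aws-sdk/client-dynamodb");

// Allow up to 5 attempts (1 initial call + up to 4 retries) instead of the default 3.
// This only helps if the SDK classifies the failure as retryable.
const client = new DynamoDBClient({
  region: "us-east-1",
  maxAttempts: 5,
});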

Related

Unit testing with firebase auth emulator requires real service account?

We are using the firebase emulators to write integration tests. One of our functions modifies the claims on a user. As such, our test checks to see if the claim has been added. In our test, we call the following function:
admin.auth().getUser(user.userId)
Our intention is to then check the claims. Unfortunately, when this function is called, we get an error.
(node:96985) UnhandledPromiseRejectionWarning: Error: Credential
implementation provided to initializeApp() via the "credential"
property failed to fetch a valid Google OAuth2 access token with the
following error: "Error fetching access token: Error while making
request: getaddrinfo ENOTFOUND metadata.google.internal. Error code:
ENOTFOUND".
Keep in mind we are running against the local auth emulator, not a cloud service. We found an issue on github which seems to be related: https://github.com/firebase/firebase-tools/issues/1708
Unfortunately, the recommended course of action in that issue is to use an actual service account file from an actual cloud service. We do not check such files into our repos as this would be a security hazard. Does anyone know of a better way to deal with this situation?
In case it is relevant, we also get the following warning:
{"severity":"WARNING","message":"Warning, FIREBASE_CONFIG and
GCLOUD_PROJECT environment variables are missing. Initializing
firebase-admin will fail"}
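For context, here is a minimal sketch of the kind of assertion the test is trying to make; user.userId and someClaim are placeholders from the question, and the Jest-style test/expect is just for illustration (customClaims on the returned UserRecord is where firebase-admin exposes custom claims):
const admin = require('firebase-admin');

test('function adds the claim', async () => {
  // After the function under test has updated the user's claims,
  // fetch the user and inspect customClaims.
  const userRecord = await admin.auth().getUser(user.userId);
  expect(userRecord.customClaims.someClaim).toBe(true);
});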

Sails.js, error when connecting to cloud MongoDB

I used cloud MongoDB with the Sails adapter, but when I run the app it throws an error. Can someone help me figure out how to solve it?
default: {
  adapter: 'sails-mongo',
  url: 'mongodb://USERNAME:PASS@cluster0-shard-00-00.ikncs.mongodb.net:27017,cluster0-shard-00-01.ikncs.mongodb.net:27017,cluster0-shard-00-02.ikncs.mongodb.net:27017/test?ssl=true&replicaSet=atlas-qhs0wy-shard-0&authSource=admin&retryWrites=true&w=majority'
}
error: Error: Consistency violation: Unexpected error creating db connection manager:
MongoError: connection 3 to cluster0-shard-00-01.ikncs.mongodb.net:27017 closed
error: Could not tear down the ORM hook. Error details: Error: Consistency violation: Attempting to tear down a datastore (default) which is not currently registered with this adapter. This is usually due to a race condition in userland code (e.g. attempting to tear down the same ORM instance more than once), or it could be due to a bug in this adapter. (If you get stumped, reach out at http://sailsjs.com/support.)
This looks like you are unable to connect to the cluster hosted on Atlas.
You will need to add your IP to the whitelist on Atlas: in the Security section, under Network Access, add your IP address (or the IP of the server you are connecting from, if you are using a remote server).

Access Firestore from Google Functions - Getting metadata from plugin failed with error: Could not refresh access token

I am deploying a Google Cloud Function that does some server-side computation and writes the results to the Firestore DB in the same project.
I followed the how-tos and configured the function with:
const functions = require('firebase-functions');
const admin = require('firebase-admin');
admin.initializeApp();
and access the Firestore db by using:
admin.firestore().collection('COLLECTION_NAME').add({data: value});
The IAM user ...@gcf-admin-robot.iam.gserviceaccount.com has the role of the Google Cloud Functions Service Agent assigned.
I get the following error:
Error: 500 undefined: Getting metadata from plugin failed with error: Could not refresh access token: Unsuccessful response status code. Request failed with status code 500
    at Object.callErrorFromStatus (/workspace/node_modules/@grpc/grpc-js/build/src/call.js:30:26)
    at Object.onReceiveStatus (/workspace/node_modules/@grpc/grpc-js/build/src/client.js:175:52)
    at Object.onReceiveStatus (/workspace/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:341:141)
    at Object.onReceiveStatus (/workspace/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:304:181)
    at Http2CallStream.outputStatus (/workspace/node_modules/@grpc/grpc-js/build/src/call-stream.js:116:74)
    at Http2CallStream.maybeOutputStatus (/workspace/node_modules/@grpc/grpc-js/build/src/call-stream.js:155:22)
    at Http2CallStream.endCall (/workspace/node_modules/@grpc/grpc-js/build/src/call-stream.js:141:18)
    at Http2CallStream.cancelWithStatus (/workspace/node_modules/@grpc/grpc-js/build/src/call-stream.js:457:14)
    at callStream.filterStack.sendMetadata.then (/workspace/node_modules/@grpc/grpc-js/build/src/channel.js:225:36)
    at process._tickCallback (internal/process/next_tick.js:68:7)
Caused by: Error
    at WriteBatch.commit (/workspace/node_modules/@google-cloud/firestore/build/src/write-batch.js:415:23)
    at DocumentReference.create (/workspace/node_modules/@google-cloud/firestore/build/src/reference.js:283:14)
    at CollectionReference.add (/workspace/node_modules/@google-cloud/firestore/build/src/reference.js:2011:28)
    at exports.parseProduct.functions.region.https.onRequest (/workspace/index.js:55:56)
    at process._tickCallback (internal/process/next_tick.js:68:7)
code: '500',
details:
'Getting metadata from plugin failed with error: Could not refresh access token: Unsuccessful response status code. Request failed with status
metadata: Metadata { internalRepr: Map {}, options: {} },
note:
'Exception occurred in retry method that was not classified as transient'
What am I doing wrong?
For anyone seeing the above error message: it is connected (at least in this case) to service account permissions.
I was adjusting some permissions and deactivated some accounts that I thought I would not need.
Among them was "projectname"@appspot.gserviceaccount.com
Reactivating it solved my problem.
It would be really good if Google showed more meaningful error messages in such cases!
It seems to be an internal issue rather than something you're doing. Doing a quick search, I found a very similar issue on GitHub and SO. From those links, I noticed it was more about a library, and I would suggest reaching out to Firebase Support, since they can check the internals of the environment; it is possible some strange magic is happening behind the scenes.

Error while login into AWS through google in react native

I am working on AWS login through Google. I'm following this link https://github.com/patw0929/react-native-cognito-login-example but I'm having a problem with AWS. I have added the lib aws-sdk-react-native-core as in the link, but I'm getting an error while running the app.
java:45: error: method does not override or implement a method from a supertype
@Override
^
Note: C:\Users\krishna21\Awsslogin\node_modules\aws-sdk-react-native-core\android\src\main\java\com\amazonaws\reactnative\core\BackgroundRunner.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
1 error
FAILURE: Build failed with an exception.
* What went wrong:
Execution failed for task ':aws-sdk-react-native-core:compileDebugJavaWithJavac'.
> Compilation failed; see the compiler error output for details.
* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.
I have added aws-sdk-react-native-core manually.
John,
1. First of all, the library you are using is old; try using AWS Amplify instead, which is far superior in features and has been tested for security and errors (a minimal sketch follows below).
2. Federated login in AWS can be done in two ways: using a Cognito User Pool or using a Cognito Identity Pool. I hope you are using a Cognito Identity Pool.
3. Please add the Google Client ID under Edit identity pool -> Authentication providers -> Google+.
4. Whitelist the domain which you are hitting in your Google Developer account.
Full documentation is here: https://itnext.io/google-sign-in-using-aws-amplify-and-amazon-cognito-69cc3bf219ad
https://aws.amazon.com/blogs/mobile/amplify-framework-adds-authentication-features-and-enhancements-for-ios-and-android-mobile-sdks/
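As a rough illustration of the Amplify route, here is a minimal sketch; it assumes Amplify has already been configured (Amplify.configure) with your Cognito pool, Google client ID and OAuth/Hosted UI settings, and the exact federatedSignIn signature can differ between Amplify versions:
import { Auth } from 'aws-amplify';

// Launches the Cognito Hosted UI flow with Google as the identity provider.
// This only works once the identity/user pool and the Google client ID
// from steps 2-4 above are in place.
async function signInWithGoogle() {
  try {
    await Auth.federatedSignIn({ provider: 'Google' });
  } catch (err) {
    console.error('Google sign-in failed', err);
  }
}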

What's the cause of the error 'getaddrinfo EAI_AGAIN'?

My server threw this today, which is a Node.js error I've never seen before:
Error: getaddrinfo EAI_AGAIN my-store.myshopify.com:443
at Object.exports._errnoException (util.js:870:11)
at errnoException (dns.js:32:15)
at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:78:26)
I'm wondering if this is related to the DynDns DDOS attack which affected Shopify and many other services today. Here's an article about that.
My main question is what does dns.js do? What part of node is it a part of? How can I recreate this error with a different domain?
If you get this error with Firebase Cloud Functions, this is due to the limitations of the free tier (outbound networking only allowed to Google services).
Upgrade to the Flame or Blaze plans for it to work.
EAI_AGAIN is a DNS lookup timed-out error; it means there is a network connectivity problem or a proxy-related error.
My main question is what does dns.js do?
dns.js is the part of Node that resolves a domain name to an IP address (in brief).
Some more info:
http://www.codingdefined.com/2015/06/nodejs-error-errno-eaiagain.html
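To see where the error surfaces (and to answer the "how can I recreate this" part of the question), here is a minimal sketch; producing a genuine EAI_AGAIN on demand depends on your resolver temporarily failing, but the error code arrives through the same dns.lookup() callback:
const dns = require('dns');

// dns.lookup() wraps the OS getaddrinfo() call, which is where
// EAI_AGAIN (temporary resolver failure) and ENOTFOUND come from.
dns.lookup('my-store.myshopify.com', (err, address) => {
  if (err) {
    console.error(err.code); // e.g. 'EAI_AGAIN' or 'ENOTFOUND'
    return;
  }
  console.log(address);
});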
If you get this error from within a docker container, e.g. when running npm install inside of an alpine container, the cause could be that the network changed since the container was started.
To solve this, just stop and restart the container
docker-compose down
docker-compose up
Source: https://github.com/moby/moby/issues/32106#issuecomment-578725551
As xerq's excellent answer explains, this is a DNS timeout issue.
I wanted to contribute another possible answer for those of you using Windows Subsystem for Linux - there are some cases where something seems to be askew in the client OS after Windows resumes from sleep. Restarting the host OS will fix these issues (it's also likely restarting the WSL service will do the same).
For those who perform thousands or millions of requests per day and need a solution to this issue:
It's quite normal to get getaddrinfo EAI_AGAIN errors when performing a lot of requests on your server. Node.js itself doesn't perform any DNS caching; it delegates everything DNS-related to the OS.
Keep in mind that every http/https request performs a DNS lookup, which can become quite expensive. To avoid this bottleneck and getaddrinfo errors, you can implement a DNS cache.
http.request (and https) accepts a lookup property which defaults to dns.lookup()
http.get('http://example.com', { lookup: yourLookupImplementation }, response => {
// do something here with response
});
I strongly recommend using an already-tested module instead of writing a DNS cache yourself, since you'll have to handle TTLs correctly, among other things, to avoid hard-to-track bugs.
I personally use cacheable-lookup, which is the one that got (the HTTP client) uses (see its dnsCache option).
You can use it on specific requests
const http = require('http');
const CacheableLookup = require('cacheable-lookup');
const cacheable = new CacheableLookup();
http.get('http://example.com', {lookup: cacheable.lookup}, response => {
// Handle the response here
});
or globally
const http = require('http');
const https = require('https');
const CacheableLookup = require('cacheable-lookup');
const cacheable = new CacheableLookup();
cacheable.install(http.globalAgent);
cacheable.install(https.globalAgent);
NOTE: keep in mind that if a request is not performed through Node.js's http/https module, using .install on the global agent won't have any effect on that request, for example requests made using undici.
The OP's error specifies a host (my-store.myshopify.com).
The error I encountered is the same in all respects except that no domain is specified.
My solution may help others who are drawn here by the title "Error: getaddrinfo EAI_AGAIN"
I encountered the error when trying to serve a NodeJs & VueJs app from a different VM than the one where the code was originally developed.
The file vue.config.js read:
module.exports = {
  devServer: {
    host: 'tstvm01',
    port: 3030,
  },
};
When served on the original machine, the startup output is:
App running at:
- Local: http://tstvm01:3030/
- Network: http://tstvm01:3030/
Using the same settings on a VM tstvm07 got me a very similar error to the one the OP describes:
INFO Starting development server...
10% building modules 1/1 modules 0 activeevents.js:183
throw er; // Unhandled 'error' event
^
Error: getaddrinfo EAI_AGAIN
at Object._errnoException (util.js:1022:11)
at errnoException (dns.js:55:15)
at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:92:26)
If it ain't already obvious, changing vue.config.js to read ...
module.exports = {
  devServer: {
    host: 'tstvm07',
    port: 3030,
  },
};
... solved the problem.
I started getting this error (different stack trace though) after making a trivial update to my GraphQL API application that is operated inside a docker container. For whatever reason, the container was having difficulty resolving a back-end service being used by the API.
After poking around to see if some change had been made in the docker base image I was building from (node:13-alpine, incidentally), I decided to try the oldest computer science trick of rebooting... I stopped and started the docker container and all went back to normal.
Clearly, this isn't a meaningful solution to the underlying problem - I am merely posting this since it did clear up the issue for me without going too deep down rabbit holes.
I was having this issue with docker-compose. It turns out I had forgotten to add my custom isolated named network to the service that couldn't be found.
TLDR; Make sure, in your compose file, you have your custom networks defined on both services that need to talk to each other.
My error looked like this: Error: getaddrinfo EAI_AGAIN minio-service. The error was coming from my server's backend when making a call to the minio-service using the minio-service hostname. This tells me that minio-service's running service, was not reachable by my server's running service. The way I was able to fix this issue is I changed the minio-service in my docker-compose from this:
docker-compose.yml
version: "3.8"
# ...
services:
  server:
    # ...
    networks:
      my-network:
    # ...
  minio-service:
    # ... (missing networks: section)
# ...
networks:
  my-network:
To include my custom isolated named network, like this:
docker-compose.yml
version: "3.8"
# ...
services:
  server:
    # ...
    networks:
      my-network:
    # ...
  minio-service:
    # ...
    networks:
      my-network:
    # ...
# ...
networks:
  my-network:
More details on docker-compose networking can be found here.
This issue can be related to hosts file setup.
Add the following line to your hosts file
In Ubuntu: /etc/hosts
127.0.0.1 localhost
In windows: c:\windows\System32\drivers\etc\hosts
127.0.0.1 localhost
In my case the problem was the docker networks ip allocation range, see this post for details
@xerq pointed it out correctly; here's some more reference:
http://www.codingdefined.com/2015/06/nodejs-error-errno-eaiagain.html
I got the same error; I solved it by updating the "hosts" file present under this location in Windows OS:
C:\Windows\System32\drivers\etc
Hope it helps!!
In my case (connected to a VPN), the error happens when running Ubuntu from inside Windows Terminal, but doesn't happen when opening Ubuntu directly from Windows (not from inside Windows Terminal).
I had the same problem with AWS and Serverless. I tried with the eu-central-1 region and it didn't work, so I had to change it to us-east-2 for the example.
I was getting this error after I recently added a new network to my docker-compose file.
I initially had these services:
services:
  frontend:
    depends_on:
      - backend
    ports:
      - 3005:3000
  backend:
    ports:
      - 8005:8000
I decided to add a new network which hosts other services I wanted my frontend service to have access to, so I did this:
networks:
  moar:
    name: moar-network
    attachable: true

services:
  frontend:
    networks:
      - moar
    depends_on:
      - backend
    ports:
      - 3005:3000
  backend:
    ports:
      - 8005:8000
Unfortunately, the above made it so that my frontend service was no longer visible on the default network, and only visible in the moar network. This meant that the frontend service could no longer proxy requests to backend, therefore I was getting errors like:
Error occured while trying to proxy to: localhost:3005/graphql/
The solution is to add the default network to the frontend service's network list, like so:
networks:
  moar:
    name: moar-network
    attachable: true

services:
  frontend:
    networks:
      - moar
      - default # here
    depends_on:
      - backend
    ports:
      - 3005:3000
  backend:
    ports:
      - 8005:8000
Now we're peachy!
One last thing, if you want to see which services are running within a given network, you can use the docker network inspect <network_name> command to do so. This is what helped me discover that the frontend service was not part of the default network anymore.
Enabled Blaze and it still doesn't work?
Most probably you need to load .env from the right place: require('dotenv').config({ path: __dirname + './../.env' }); won't work (nor will any other path). Simply put the .env file in the functions directory, from which you deploy to Firebase.
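In other words, with .env sitting in the functions directory (next to index.js), the default path-less call is enough. A sketch, assuming dotenv's default behaviour of resolving .env relative to the process working directory, which is what the advice above relies on:
// functions/index.js
require('dotenv').config(); // loads functions/.env without an explicit path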
