I would like to send ~50,000 SMS with Twilio, and I was just wondering if my requests are going to crash if I loop through a phone number array of this size. The fact is that Twilio only allows 1 message for each request, so I have to make 50,000 of them.
Is it possible to do it this way or do I have to find another way?
50,000 seems like a lot, but I have no idea how many requests I can make.
phoneNumbers.forEach(function(phNb) {
  client.messages.create({
    body: msgCt,
    to: phNb,
    from: ourPhone
  })
  .then((msg) => {
    console.log(msg.sid);
  });
})
Thanks in advance
Twilio developer evangelist here.
API Limits
First up, a quick note on our limits. With a single number, Twilio has a limit of sending one message per second. You can increase that by adding more numbers, so 10 numbers will be able to send 10 messages per second, and a short code can send 100 messages per second. At one message per second, 50,000 messages would take almost 14 hours from a single number; with a pool of 10 numbers that drops to roughly 83 minutes.
We also recommend that you don't send more than 200 messages on any one long code per day.
Either way I recommend using a messaging service to send messages like this.
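For example, once your numbers are pooled in a Messaging Service, a single create call looks roughly like this (the MG... SID is a placeholder for your Messaging Service SID, and msgCt/phNb are the variables from your question):

client.messages.create({
  body: msgCt,
  to: phNb,
  messagingServiceSid: 'MGXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'
})
.then(msg => console.log(msg.sid))
.catch(err => console.error(err));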
Finally, you are also limited to 100 concurrent API requests. It's good to see other answers here talking about making the requests sequentially rather than firing them all at once: looping through 50,000 create calls asynchronously will eat up the memory on your server, and you will also start to see requests turned down by Twilio.
Passthrough API
We now have an API that allows you to send more than one message with a single API call. It's known as the passthrough API, as it lets you pass many numbers through to the Notify service. You need to turn your numbers into "bindings" and send them via a Notify service, which also uses a messaging service for number pooling.
The code looks a bit like this:
const Twilio = require('twilio');
const client = new Twilio(accountSid, authToken);
const service = client.notify.services('ISXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX');
service.notifications
.create({
toBinding: [
JSON.stringify({
binding_type: 'sms',
address: '+15555555555',
}),
JSON.stringify({
binding_type: 'facebook-messenger',
address: '123456789123',
}),
],
body: 'Hello Bob',
})
.then(notification => {
console.log(notification);
})
.catch(error => {
console.log(error);
})
The only drawbacks in your situation are that every message needs to be the same and that the request needs to be less than 1 megabyte in size. We've found that typically means about 10,000 numbers per request, so you might need to break your list up into 5 API calls.
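As a rough sketch of that chunking (service and msgCt are the variables from above, phoneNumbers is your array, and the 10,000 batch size is just the rule of thumb mentioned, not a hard limit):

const BATCH_SIZE = 10000;

// inside an async function, so each batch is sent before the next one starts
for (let i = 0; i < phoneNumbers.length; i += BATCH_SIZE) {
  const batch = phoneNumbers.slice(i, i + BATCH_SIZE);
  await service.notifications.create({
    toBinding: batch.map(number =>
      JSON.stringify({ binding_type: 'sms', address: number })
    ),
    body: msgCt
  });
}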
Let me know if that helps at all.
There are two factors here.
You need to consider Twilio API usage limits.
Performing 50,000 parallel HTTP requests (which is what your code actually does) is not a good idea: you will run into memory problems.
Twilio SMS limits change based on source and destination.
You have two solutions:
Perform the 50k HTTP requests sequentially
// inside an async function, so each request is awaited before the next one starts
for (const phNb of phoneNumbers) {
  try {
    let m = await client.messages.create({
      body: msgCt,
      to: phNb,
      from: ourPhone
    })
    console.log(m.sid)
  } catch(e) {
    console.log(e)
  }
}
Perform the 50k HTTP requests concurrently with a concurrency limit
This is quite easy to do with Bluebird's sugar functions (for example Promise.map with a concurrency option). The twilio package, however, returns native promises, so you can also use the async module's mapLimit method for this purpose.
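A sketch with the async module (the concurrency of 10 is just an example; with async v3 the iteratee can be an async function):

const async = require('async');

async.mapLimit(phoneNumbers, 10, async (phNb) => {
  const msg = await client.messages.create({
    body: msgCt,
    to: phNb,
    from: ourPhone
  });
  return msg.sid;
}, (err, sids) => {
  if (err) return console.error(err);
  console.log(`Sent ${sids.length} messages`);
});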
Your requests are sent asynchronously because the forEach body doesn't block, which I guess is fastest for the client. But the question is: does Twilio allow such a load from a single source? That needs to be tested... If it doesn't, you should build some kind of request queue, e.g. promise based, something like
function sendSync(index = 0) {
if(index === phoneNumbers.length) {
return;
}
client.messages.create({
body: msgCt,
to: phoneNumbers[index],
from: ourPhone
})
.then(function(msg) {
console.log(msg.sid);
sendSync(index + 1);
})
.catch(function(err) {
console.log(err);
});
}
sendSync();
Or if you like async/await –
async function sendSync() {
for (let phNb of phoneNumbers) {
try {
let msg = await client.messages.create({
body: msgCt,
to: phNb,
from: ourPhone
});
console.log(msg);
} catch(err) {
console.log(err);
}
}
}
sendSync();
I have an application in Electron that does facial recognition of people to then decide whether or not they can enter the place and for that I'm using Amazon Rekognition.
Everything was working fine (for a few months) until, two days ago, a customer reported to me that the app was behaving strangely, like it wasn't responding to requests for facial recognition.
After several tests, I discovered that what is happening with it is a timeout error, which occurs in all API calls, whether they are looking for faces (SearchFacesByImage) or registering new faces (IndexFaces).
The error says:
{
"message": "connect ETIMEDOUT 3.226.60.54:443",
"errno": -4039,
"code": "TimeoutError",
"syscall": "connect",
"address": "3.226.60.54",
"port": 443,
"time": "2022-12-14T13:50:10.909Z",
"region": "us-east-1",
"hostname": "rekognition.us-east-1.amazonaws.com",
"retryable": true
}
What intrigued me was the fact that everything was working fine, until this behavior just started happening (and I didn't make any code changes/updates to the app running on my client's computer).
And what makes me even more intrigued is that this behavior occurs completely at random and only on that particular client's machine. Sometimes the API calls work correctly (returning whether the person was recognized or not), but most of the time the calls take about 90 seconds to return the timeout error. When executing the same code on my machine (same methods and same CollectionId) everything runs normally, with no timeout error at any time, while at the exact same moment the behavior continues on my client's machine.
I was using aws-sdk and then switched to @aws-sdk/client-rekognition (thinking that might solve the problem), but the code only worked for a few of the first calls to the API and a few minutes later it got the timeout errors again.
The code I'm using to configure and make calls to Rekognition is basically this:
const { RekognitionClient, IndexFacesCommand, SearchFacesByImageCommand } = require('@aws-sdk/client-rekognition')
const rekognitionClient = new RekognitionClient({
credentials: {
accessKeyId: 'accessKeyId',
secretAccessKey: 'secretAccessKey'
},
region: 'us-east-1'
})
const registerFaceOnRekognition = async (bytes, userId) => {
const params = {
CollectionId: 'collectionId',
Image: { Bytes: bytes },
ExternalImageId: userId,
MaxFaces: 1,
QualityFilter: 'HIGH'
}
const command = new IndexFacesCommand(params)
try {
const { FaceRecords } = await rekognitionClient.send(command)
if (!FaceRecords.length) {
console.log('No faces detected.')
return
}
console.log('Face created:')
console.log(FaceRecords[0].Face.FaceId)
} catch (error) {
console.error(error) // timeout error
}
}
const searchFaceByImageOnRekognition = async (bytes) => {
const params = {
CollectionId: 'collectionId',
Image: { Bytes: bytes },
MaxFaces: 1,
FaceMatchThreshold: 99,
QualityFilter: 'HIGH'
}
const command = new SearchFacesByImageCommand(params)
try {
const { FaceMatches } = await rekognitionClient.send(command)
if (!FaceMatches.length) {
console.log('This face has not been registered yet')
return
}
console.log('Face found:')
console.log(FaceMatches[0].Face.ExternalImageId)
} catch (error) {
console.error(error) // timeout error
}
}
// Method called through the renderer process that has a canvas where the webcam view is reproduced
const onTakePicture = (event, data) => {
const bytes = Buffer.from(data.dataURL.replace('data:image/jpeg;base64,', ''), 'base64')
// If there is a userId, register the face in the image
if (data.userId) {
registerFaceOnRekognition(bytes, data.userId)
return
}
// Else, search for the face in the image
searchFaceByImageOnRekognition(bytes)
}
Just remembering that: during all tests on my client's computer the internet connection was stable and working properly.
What is the best way to investigate and resolve this issue?
UPDATE:
I enabled Rekognition debug logs and they can be found at: https://gist.github.com/IgorSamer/4e58e09f3fa615401f85ca325b794245
In it, the first three requests (2022-12-16T13:48:45.932Z, 2022-12-16T13:53:20.325Z and 2022-12-16T14:19:12.479Z) occur normally. However, all subsequent requests start to give the timeout error; in fact, no data is returned after the "[DEBUG] App: endpoints Resolved endpoint:" step.
As previously mentioned, the internet connection is working fine. I was also able to reproduce the error via remote access, which means the machine's internet connection was working at the time of the error.
Is there a possibility that there is a block made by my client's firewall/network that prevents requests from being sent by the SDK after a few successful requests? If yes, what is the best way to investigate this?
Exploration
This is what I would do initially to gather some info:
Verify if this is happening ALL the time with that specific client.
Verify if this is happening ONLY with one client, or more.
Verify if this is happening in one or multiple regions (e.g., us-east-1).
Verify whether Amazon Rekognition has had, or is having, issues in the affected region during the time window of interest.
Check Rekognition's status in the AWS Health Dashboard in your console.
Use the AWS Rekognition guidelines and quotas as a reference to determine whether your app's usage of Rekognition is under the set limits.
Note there's a limit on TPS per resource (e.g., SearchFacesByImage, IndexFaces) per account.
Possible approaches
Verify if there was a change in the client network/firewall. Just ask.
Replicate your app's API call with AWS CLI and study logs.
Remotely access your client's device.
Set up temporary AWS credentials (remember to remove access after the test).
Send an API call to the Rekognition endpoint. Note that even a 4XX error is good news, as you got at least some response.
Set up proper logging for your app (CloudWatch logs may not be enough to troubleshoot); see the client configuration sketch after this list.
Check Splunk's APM and NewRelic's APM
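Related to the logging point above: a sketch of how the v3 client from the question could be configured to fail fast and log more. This assumes NodeHttpHandler from the @aws-sdk/node-http-handler package, and the timeout and retry values are just examples, not recommendations:

const { RekognitionClient } = require('@aws-sdk/client-rekognition')
const { NodeHttpHandler } = require('@aws-sdk/node-http-handler')

const rekognitionClient = new RekognitionClient({
  credentials: {
    accessKeyId: 'accessKeyId',
    secretAccessKey: 'secretAccessKey'
  },
  region: 'us-east-1',
  maxAttempts: 3,                        // retry transient network failures
  requestHandler: new NodeHttpHandler({
    connectionTimeout: 5000,             // fail fast instead of hanging ~90 s on connect
    socketTimeout: 10000
  }),
  logger: console                        // log request metadata to help troubleshooting
})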
I hope this may be of help, at least to put together a troubleshooting strategy.
I'm running into an issue where I am using Node.js with Express for API calls. I am fetching all documents in a collection using:
export async function main(req, res) {
  try {
    const tokens = await tokenModel.find({}).lean();
    res.json(tokens);
  } catch (err) {
    res.status(500).json({ message: err.message })
    console.log('err', err.message)
  }
  console.log('Get Data')
}
Now this request works great and returns the data I need. The problem is that I have over 10K documents: on a PC it takes about 10 seconds to return that data, and on a mobile phone it takes over 45 seconds. I know the phone's network matters, but is there any way I can speed this up? Nothing I have tried works. I keep reading that lean() is the option to use, and I am already using it with no success or improvement.
Well, it's slow because you are returning all 10k results.
Do you actually need all 10k results? If not, you should consider filtering so you only return the results you actually need.
If you do need them all, I suggest implementing pagination, where you return results in batches (50 per page, for example).
In addition, if you only use some of the fields from the documents, tell MongoDB to return just those fields (a projection) rather than all of them. That also improves performance, since less data is transferred over the network.
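A sketch of the handler with pagination and a projection (the page/limit query parameters and the selected field names are placeholders; adjust them to your schema):

export async function main(req, res) {
  try {
    const page = parseInt(req.query.page, 10) || 1;
    const limit = parseInt(req.query.limit, 10) || 50;

    const tokens = await tokenModel
      .find({})
      .select('name symbol price')       // hypothetical fields: return only what the UI needs
      .skip((page - 1) * limit)
      .limit(limit)
      .lean();

    res.json(tokens);
  } catch (err) {
    res.status(500).json({ message: err.message });
  }
}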
Here is how I write a document and its subcollections:
public async setEvent(event: EventInterface): Promise<void[]> {
return new Promise<void[]>(async (resolve, reject) => {
const writePromises: Promise<void>[] = [];
event.setID(event.getID() || this.afs.createId());
event.getActivities()
.forEach((activity) => {
activity.setID(activity.getID() || this.afs.createId());
writePromises.push(this.afs.collection('events').doc(event.getID()).collection('activities').doc(activity.getID()).set(activity.toJSON()));
activity.getAllStreams().forEach((stream) => {
this.logger.info(`Stream ${stream.type} has size of GZIP ${getSize(this.getBlobFromStreamData(stream.data))}`);
writePromises.push(this.afs
.collection('events')
.doc(event.getID())
.collection('activities')
.doc(activity.getID())
.collection('streams')
.doc(stream.type) // #todo check this how it behaves
.set({
type: stream.type,
data: this.getBlobFromStreamData(stream.data),
}))
});
});
try {
await Promise.all(writePromises);
await this.afs.collection('events').doc(event.getID()).set(event.toJSON());
resolve()
} catch (e) {
Raven.captureException(e);
// Try to delete the parent entity and all subdata
await this.deleteEvent(event.getID());
reject('Something went wrong')
}
})
}
However, when I look at the network tab I see one request firing up. OK so far: req_0 is my activity data. But looking further into the same request I can see that it adds more data, and that should not happen because:
a) I pass the size of the request to Firestore (1 MB), and
b) due to the slow connection I pass the time limit to write.
Most interesting is that this behavior happens when I have a slow network.
EDIT: here is the payload of the request example.
Can anyone explain why this happens?
What happens is the so-called batching: your write operations will not fire immediately, they are aggregated into a single request, because doing network I/O is expensive in terms of time and battery life.
Minimizing network I/O saves battery life (as stated above), and that is actually the main concern.
There's "magic" happening under the hood.
In short, I've run into an issue where multiple parallel GET requests to my Node.js server cause the server to get "clogged up" and hang, thus resulting in timeouts for the clients (503, service unavailable).
After a lot of performance analysis, I've realized it's a CPU issue. The specific request (we'll call it GET /foo) queries data from multiple services over HTTP, and then does a lot of computation, and returns the results to the client, like this:
Client request GET /foo
/foo controller queries data over HTTP from multiple other services
/foo controller then does a bunch of iterations over the data to compile some output for the client
Step 3 takes around 2 seconds to complete. However, if I send 2 requests in parallel to /foo, each client will receive their response in about 4 seconds. When I run the app in a cluster using more cores, the requests run much faster, but not quite what I want.
Seems like I have several options here:
pre-compute the response (ideally would like to avoid this for now, since it will require a whole "cache invalidation" scheme), or
/foo sends the CPU-blocking computation asynchronously to another process (using Heroku, so that would be another dyno), and then I can use a websocket or something to push the results to the client (again, very complex for my situation), or
somehow yield to a child process in the request and return the results to the client
Would love to do something like option 3. Something like this:
get('/foo', function*(request) {
// I/O, so not blocking the event loop (I think)
let data = yield getData(request)
// make this happen in a different process
let response = yield doSomeHeavyProcessing(data)
return response
})
I've omitted a lot of implementation details above, but if it's necessary to know, I'm using Koa and Node.js 6.
Ideally, doSomeHeavyProcessing would do the CPU-intensive computation in some separate process, and when it's done, still send the results back in a "synchronous" fashion to the request client.
I've been trying to wrap my head around child processes, web workers, fibers, etc., and have done some basic "hello world" experiments with them to do basically the above, but to no avail. I can post more details if necessary.
Here are some approaches that you can try:
1.
Split the blocking computation into small chunks and use setImmediate to place the next chunk of work at the end of the event queue. That way the computation no longer blocks the event loop, and other requests can be processed between chunks.
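A sketch of that idea, where processItem and the chunk size are placeholders for your own per-item computation:

function processLargeArray(items, onDone) {
  const CHUNK_SIZE = 1000; // tune so one chunk takes only a few milliseconds
  let index = 0;
  const results = [];

  function nextChunk() {
    const end = Math.min(index + CHUNK_SIZE, items.length);
    for (; index < end; index++) {
      results.push(processItem(items[index])); // processItem is your per-item work
    }
    if (index < items.length) {
      setImmediate(nextChunk); // give other pending requests a chance to run first
    } else {
      onDone(results);
    }
  }

  nextChunk();
}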
2.
Microsoft recently released napajs. As stated in their README
As it evolves, we find it useful to complement Node.js in CPU-bound tasks, with the capability of executing JavaScript in multiple V8 isolates and communicating between them.
I haven't tried it, but it looks very promising:
var napa = require('napajs');
var zone1 = napa.zone.create('zone1', { workers: 4 });
get('/foo', function*(request) {
let data = yield getData(request)
let response = yield zone1.execute(doSomeHeavyProcessing, [data])
return response
})
3. If none of the above is enough and you need to spread the load across multiple machines, then you probably can't avoid using some sort of message queue to distribute work to different servers. In that case check out ZeroMQ. It is extremely easy to use from Node, and you can implement any kind of distributed messaging pattern with it.
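For example, a minimal push/pull sketch with the zeromq npm package (v6-style API; the address, the message shape, and where the heavy work happens are all placeholders):

// producer side: runs inside your web app and hands work off to workers
const zmq = require('zeromq');

const push = new zmq.Push();

async function start() {
  await push.bind('tcp://127.0.0.1:5555');
}

async function enqueue(job) {
  await push.send(JSON.stringify(job));
}

// worker side: runs in another process (or on another machine) and consumes jobs
async function consume() {
  const pull = new zmq.Pull();
  pull.connect('tcp://127.0.0.1:5555');
  for await (const [msg] of pull) {
    const job = JSON.parse(msg.toString());
    // do the heavy computation here, then report the result back
    // (e.g. over another socket or via a shared store)
  }
}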
You could utilize Child process with additional wrapper for convenience.
worker.js - this module will run in a separate process and will do the heavy work
const crypto = require('crypto');
function doHeavyWork(data) {
return crypto.pbkdf2Sync(data, 'salt', 100000, 64, 'sha512');
}
process.on('message', (message) => {
const result = doHeavyWork(message.data);
process.send({ id: message.id, result });
});
client.js - a convenience (but primitive) wrapper for Child process
const cp = require('child_process');
let worker;
const resolves = new Map();
module.exports = {
init(moduleName, errorCallback) {
worker = cp.fork(moduleName);
worker.on('error', errorCallback);
worker.on('message', (message) => {
const resolve = resolves.get(message.id);
resolves.delete(message.id);
if (!resolve) {
errorCallback(new Error(`Got response from worker with unknown id: ${message.id}`));
return;
}
resolve(message.result);
});
console.log(`Service PID: ${process.pid}, Worker PID: ${worker.pid}`);
},
doHeavyWorkRemotly(data) {
const id = `${Date.now()}${Math.random()}`;
return new Promise((resolve) => {
worker.send({ id, data });
resolves.set(id, resolve);
});
}
}
I use fork() to get the additional communication channel, as stated in the docs.
I also keep a record of every request submitted to the worker process (const resolves = new Map();) and resolve each Promise (resolve(message.result);) only when the worker process returns the response for that specific request (const resolve = resolves.get(message.id);).
run.js - a startup module, it utilizes co to 'execute' generators.
const co = require('co');
const client = require('./client');
function errorCallback(error) {
console.log('Got an unexpected error!');
console.log(error);
}
client.init('./worker.js', errorCallback);
function* run() {
while(true) {
yield client.doHeavyWorkRemotly('mydata');
}
}
co(run);
To test it, simply run node run.js. It will print
Service PID: XXXX, Worker PID: XXXX
then take a look at CPU utilization: the worker process will probably sit at around 100% CPU while the service process stays fairly idle.
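In the Koa app from the original question, the wrapper could then be used from the /foo handler, roughly like this (getData and the get() route helper are the question's pseudocode, not real APIs):

const client = require('./client');

client.init('./worker.js', function (error) {
  console.log('Worker error', error);
});

get('/foo', function* (request) {
  let data = yield getData(request)                     // I/O, does not block the event loop
  let response = yield client.doHeavyWorkRemotly(data)  // CPU-heavy part runs in the worker process
  return response
})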
I'm using Nodemailer to send mailings in my NodeJS / Express server. Instead of sending the mail directly I want to wait 20 minutes before sending the mail. I think this feels more personal then sending mail directly.
But I have no idea how to achieve this. I guess I don't need something like a NodeJS cronjob like this NodeCron package, or do I?
router.post('/', (req, res) => {
const transporter = nodemailer.createTransport(smtpTransport({
host: 'smtp.gmail.com',
port: 465,
auth: {
user: 'noreply@domain.nl',
pass: 'pass123'
}
}));
const mailOptions = {
from: `"${req.body.name}" <${req.body.email}>`,
to: 'info@domain.nl',
subject: 'Form send',
html: `Content`
};
transporter.sendMail(mailOptions, (error, info) => {
if (error) return res.status(500).json({ responseText: error });
res.status(200).json({ responseText: 'Message send!' });
});
});
My router looks like as shown above. So if post is called I want this request to wait 20 minutes. Instead of with a cronjob I want to execute the post just once, but with a bit of a delay. Any suggestions on how to do this?
Well, some folks may come here and tell you to use an external queue system and bla bla... But you could simply use plain old JavaScript to schedule the sending 20*60*1000 milliseconds into the future to get things started. :)
There's however a problem with your code: you're waiting for the mailer to succeed before sending the 200 'Message sent' response to the user. Call me a madman, but I'm pretty sure the user won't be staring at the browser window for 20 minutes, so you'll probably have to answer as soon as possible and then schedule the mail. Modifying your code:
router.post('/', (req, res) => {
const DELAY = 20*60*1000 // min * secs * milliseconds
const transporter = nodemailer.createTransport(smtpTransport({
host: 'smtp.gmail.com',
port: 465,
auth: {
user: 'noreply@domain.nl',
pass: 'pass123'
}
}));
const mailOptions = {
from: `"${req.body.name}" <${req.body.email}>`,
to: 'info@domain.nl',
subject: 'Form send',
html: `Content`
};
res.status(200).json({ responseText: 'Message queued for delivery' });
setTimeout(function(){
transporter.sendMail(mailOptions, (error, info) => {
if (error)
console.log('Mail failed!! :(')
else
console.log('Mail sent to ' + mailOptions.to)
    })
  }, DELAY);
});
There are, however, several possible flaws in this solution. If you're expecting heavy traffic on that endpoint, you could end up with a large number of scheduled callbacks held in memory. In addition, if something fails, the user of course won't be able to know.
If this is a big / serious project, consider using that cron-job package, or an external storage mechanism where you can queue these "pending" messages (Redis would do, and it's incredibly simple), and have a different process read tasks from there and perform the email sending.
EDIT: I saw some more things in your code.
1) You probably don't need to create a new transport inside your POST handler; create it outside and reuse it.
2) In addition to the problems mentioned, if your server crashes, no email will ever be sent.
3) If you still want to do it in a single Node.js app, then instead of scheduling an email on every request to this endpoint, you'd be better off storing the email data (from, to, subject, body) somewhere and scheduling a function that runs every 20 minutes, sends all pending emails one by one, and then reschedules itself to run 20 minutes later. This keeps your memory usage low. A server crash still loses all pending emails, but if you add Redis into the mix you can simply reload the pending emails from Redis when your app starts.
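For example, a rough sketch with a plain in-memory array (swap the array for a Redis list if the queue needs to survive restarts; transporter is the one from your code and the field names are just placeholders):

const pendingEmails = [];

// In the POST handler, instead of calling setTimeout, just queue the message:
// pendingEmails.push({ ...mailOptions, notBefore: Date.now() + 20 * 60 * 1000 });

async function drainQueue() {
  const now = Date.now();
  for (const mail of pendingEmails.filter(m => m.notBefore <= now)) {
    const { notBefore, ...options } = mail;
    try {
      await transporter.sendMail(options);
      pendingEmails.splice(pendingEmails.indexOf(mail), 1);
    } catch (err) {
      console.log('Mail failed, will retry on the next run', err);
    }
  }
  setTimeout(drainQueue, 20 * 60 * 1000); // reschedule itself
}

drainQueue();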
Probably too much for an answer, sorry if it wasn't needed! :)
I think CharlieBrown's answer is correct and since I had two answers in my mind while reading the question, I thank him for simplifying my answer to be the alternative of his.
setTimeout is actually a good idea, but it has a drawback: if there is any reason to stop the server process (a server restart, module installation, file management, etc.), the callbacks scheduled with setTimeout will never be executed and some users will not receive their emails.
If that problem is serious enough, you might want to store the scheduled emails in a database or in Redis and use a cron job to periodically check the pending set and send the emails that are due.
I think that either this answer or CharlieBrown's should suffice for you, depending on your preferences and needs.