ElasticSearch AWS request Timeout - javascript

I have an ElasticSearch instance running in AWS which I was able to connect to via the JavaScript client in a MeteorJS application. There was no issue creating mappings (indices and analyzers) or updating mappings.
The problem arises whenever there is an index, update, or delete request to the instance. After serving about 200 requests, the ElasticSearch instance starts throwing a request timeout error with code 408. Initially, I thought making many single requests was the cause, so I decided to do a bulk push. Below is the snippet for the bulk push request.
var bulk = SearchService.ElasticQueue.splice(0, 1000);
console.log('Size: ', bulk.length);
if (bulk.length > 0) {
  EsClient.bulk({
    body: bulk
  }, function (error, response) {
    if (!error) {
      console.log(response);
    } else {
      console.log(error);
    }
  });
}
SearchService.ElasticQueue is a simple queue, and a cron job runs frequently to fetch data from it and issue bulk requests. I also tried reducing the number of documents per bulk request and increasing requestTimeout in the connection config, but neither seems to help. I would appreciate any suggestions.
Thanks.

There is one way you can use:
wait_for_completion=false
This will make the request return a task ID immediately, and you can then pull the result using that task ID.
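Note that wait_for_completion applies to task-based endpoints such as _update_by_query and _reindex rather than the bulk API itself. As a rough sketch with the legacy elasticsearch JavaScript client (the index name and query are placeholders, not from the original post):

  // Hypothetical sketch: start a long-running update-by-query without waiting,
  // then poll the resulting task by its ID.
  EsClient.updateByQuery({
    index: 'myindex',           // placeholder index name
    waitForCompletion: false,   // respond immediately with a task ID
    body: { query: { match_all: {} } }
  }, function (error, response) {
    if (error) return console.log(error);
    var taskId = response.task;
    // Later, poll the task until it reports completed: true
    EsClient.tasks.get({ taskId: taskId }, function (err, task) {
      if (err) return console.log(err);
      console.log('completed:', task.completed);
    });
  });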

Related

Sending Many POST requests simultaneously Nodejs

I am new to Node.js, so excuse me for any mistakes. :)
Let me explain what I am trying to do:
Basically, I am making a push notification service for my platform. I will explain further.
I have two Node.js servers (using Express):
SERVER 1:
It gets everything needed from the database (device registration, identifier, ...) and should send it to the second server.
SERVER 2: This server receives a JSON (containing everything needed) to create the FCM and APNS payloads and then sends them to the appropriate provider (FCM, APNS).
What I am using: axios to send the POST requests.
The issue: since the first server will be sending a large number of requests (usually 5K or more; it's dynamic) at the same time, axios cannot handle that, and I've tried many alternatives to axios but hit the same problem.
My question: how can I send that many requests without any issues?
PS: when I send a few requests (100 or a bit more) I get no errors.
I hope everything is clear, and I would really appreciate any help.
Code example of the request with axios:
PS: it always falls into the "[Request Error] ..." branch.
try {
  axios.post(endpoint, { sendPushRequest })
    .then(response => {
      console.log(response.data);
    })
    .catch(er => {
      if (er.response) {
        console.log("[Response Error] on sending to Dispatcher...");
        // The request was made and the server responded with a status code
        // that falls out of the range of 2xx
        console.log(er.response.data);
        console.log(er.response.status);
        console.log(er.response.headers);
      } else if (er.request) {
        console.log("[Request Error] on sending to Dispatcher...");
        // The request was made but no response was received
        // `er.request` is an instance of XMLHttpRequest in the browser and
        // an instance of http.ClientRequest in Node.js
      } else {
        // Something happened in setting up the request that triggered an Error
        console.log('[Error]', er.message);
      }
      console.log(er.config);
    });
}
catch (e) {
  console.log('[Catch Error]', e);
}
Usually, for this kind of asynchronous work you should use a queuing service: if the second server gets busy, which it might while handling such a huge number of REST calls, your user would otherwise miss the notification.
Your flow should be like:
SERVER 1: it gets everything needed from the database (device registration, identifier, ...) and pushes/publishes it to a queuing service such as RabbitMQ, Google Pub/Sub, etc.
SERVER 2: instead of exposing REST APIs, this server pulls messages from the queue continuously, receives the JSON (containing everything needed) to create the FCM and APNS payloads, and sends them to the appropriate provider (FCM, APNS). A minimal sketch follows below.
This is beneficial because even if anything happens to your server (busy/crashed), the messages persist in the queue, and on restarting the server you can resume your work (sending notifications or whatever).
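As a minimal sketch of that flow with RabbitMQ via the amqplib package (the queue name, connection URL, and prefetch value are placeholder choices, not part of the original setup):

  var amqp = require('amqplib');
  var QUEUE = 'push_requests'; // hypothetical queue name

  // SERVER 1: publish each push request instead of POSTing it with axios.
  function publishAll(requests) { // requests: the JSON objects you already build
    return amqp.connect('amqp://localhost').then(function (conn) {
      return conn.createChannel().then(function (ch) {
        return ch.assertQueue(QUEUE, { durable: true }).then(function () {
          requests.forEach(function (r) {
            ch.sendToQueue(QUEUE, Buffer.from(JSON.stringify(r)), { persistent: true });
          });
        });
      });
    });
  }

  // SERVER 2: consume at its own pace; ack only after the provider call succeeds.
  amqp.connect('amqp://localhost').then(function (conn) {
    return conn.createChannel();
  }).then(function (ch) {
    return ch.assertQueue(QUEUE, { durable: true }).then(function () {
      ch.prefetch(50); // at most 50 unacknowledged messages in flight
      ch.consume(QUEUE, function (msg) {
        var payload = JSON.parse(msg.content.toString());
        // ... build the FCM/APNS payload from `payload` and send it here ...
        ch.ack(msg); // remove the message from the queue once handled
      }, { noAck: false });
    });
  });

Because the consumer acknowledges only after handling a message, a crash before the ack leaves the message in the queue to be redelivered.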

specific response to specific clients while making requests at the same time to the server

Is there any way to send a specific response to a specific client when another client has made a different request to the same server at the same time?
This code snippet is for an exchange server. The function exchange.fetchMarkets() comes from a library named "ccxt"; it calls out to a third-party exchange server such as 'bitfinex', 'crex24', 'binance', etc. The issue I am facing: when one client requests an exchange like 'crex24' at the same time another client requests a different exchange like 'binance', both get the same response, because the function call reflects the most recent exchange.
I want responses to match each client's request, independently of each other.
This is the controller function:
const ccxt = require("ccxt");

// Note: the function must be async for the await below to be valid.
exports.fetchMarkets = async function(req, res) {
  let API = req.params.exchangeId;
  let exchange = new ccxt[API]();
  if (exchange.has["fetchMarkets"]) {
    try {
      var markets = await exchange.fetchMarkets();
      res.send(markets);
    } catch (err) {
      let error = String(err);
      res.send({ failed: error });
    }
  } else {
    res.send({ loadMarkets: "not available" });
  }
}
This is the endpoint for the server request:
app.route('/markets/:exchangeId')
  .get(exchange.fetchMarkets)
Here you can find the ccxt library: https://github.com/ccxt/ccxt/wiki/Manual. It can be added to the project with "npm install ccxt".
I don't see why the code you mentioned wouldn't work the way you expect. I created a small app and it works as expected. You can check it here:
https://repl.it/repls/IllfatedStrangeRepo
I hit four different requests with different ids and I got different responses.
Hope that clears up the doubt.

Sending many requests from Node.js to an API causes error

I have more than 2000 users in my database. When I try to broadcast a message to all of them, it barely sends about 200 requests before my server stops and I get an error like the one below:
{ Error: connect ETIMEDOUT 31.13.88.4:443
    at Object.exports._errnoException (util.js:1026:11)
    at exports._exceptionWithHostPort (util.js:1049:20)
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1090:14)
  code: 'ETIMEDOUT',
  errno: 'ETIMEDOUT',
  syscall: 'connect',
  address: '31.13.88.4',
  port: 443 }
Sometimes I get another error that says :
Error!: Error: socket hang up
This is my request :
function callSendAPI(messageData) {
  request({
    uri: 'https://graph.facebook.com/v2.6/me/messages',
    qs: { access_token: '#####' },
    method: 'POST',
    json: messageData
  }, function (error, response, body) {
    if (!error && response.statusCode == 200) {
      var recipientId = body.recipient_id;
      var messageId = body.message_id;
      if (messageId) {
        console.log("Successfully sent message with id %s to recipient %s",
          messageId, recipientId);
      } else {
        console.log("Successfully called Send API for recipient %s",
          recipientId);
      }
    } else {
      console.error("Failed calling Send API");
      console.log(error);
    }
  });
}
I have tried setTimeout to make the API call wait for a while:
setTimeout(function(){ callSendAPI(data) }, 200);
Can anyone help if he/she has faced a similar error?
EDITED
I'm using the Messenger Platform, which supports a high rate of calls to the Send API and is not limited to 200 calls.
You may be hitting Facebook API limits. To throttle the requests you should send each request after some interval from the previous one. You didn't include the code where you iterate over all the users, but I suspect you do it in a loop, and if you use setTimeout to delay every request with a flat 200ms delay, then all the requests still fire at the same time as before, just 200ms later.
What you can do is:
You can use setTimeout and add a variable delay for every request (not recommended)
You can use the Async module's series or parallelLimit (using callbacks)
You can use Bluebird's Promise.mapSeries or Promise.map with a concurrency limit (using promises)
Option 1 is not recommended because it is still fire-and-forget (unless you add more complexity) and you still risk going over the limit with too much concurrency, since you only control when the requests start, not how many outstanding requests there are.
Options 2 and 3 are mostly the same but differ in using callbacks versus promises. In your example you're using callbacks, but your callSendAPI doesn't take its own callback, which it should for option 2 to work; alternatively, it should return a promise for option 3 to work, as in the sketch after the links below.
For more info see the docs:
https://caolan.github.io/async/docs.html#parallelLimit
https://caolan.github.io/async/docs.html#series
http://bluebirdjs.com/docs/api/promise.map.html
http://bluebirdjs.com/docs/api/promise.mapseries.html
Of course there are more ways to do it but those are the most straightforward.
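For example, a rough sketch of option 3, assuming callSendAPI is rewritten to return a promise (allMessageData and the concurrency value of 5 are hypothetical, not from the original post):

  var Promise = require('bluebird');
  var request = require('request');

  // Promisified variant of callSendAPI so completion can be tracked.
  function callSendAPIAsync(messageData) {
    return new Promise(function (resolve, reject) {
      request({
        uri: 'https://graph.facebook.com/v2.6/me/messages',
        qs: { access_token: '#####' },
        method: 'POST',
        json: messageData
      }, function (error, response, body) {
        if (error || response.statusCode != 200) return reject(error || body);
        resolve(body);
      });
    });
  }

  // allMessageData: a hypothetical array with one messageData object per user.
  // Never more than 5 requests are in flight at once.
  Promise.map(allMessageData, callSendAPIAsync, { concurrency: 5 })
    .then(function () { console.log('All messages sent'); })
    .catch(function (err) { console.error('Broadcast failed:', err); });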
Ideally, if you want to fully utilize the 200-requests-per-hour limit, you should queue the requests yourself and make them at intervals that correspond to that limit. Sometimes, if you haven't made many requests in the past hour, you won't need delays; sometimes you will. What you should really do is queue all requests centrally and drain the queue at intervals corresponding to the portion of the limit already used, which you have to track yourself, but that can be tricky.
It sounds like you are hitting a rate limit.
From the Facebook documentation:
Your app can make 200 calls per hour per user in aggregate.
You can check the dashboard to see if you are hitting the rate limits in these cases.

How to handle the Storage Queue using the WebJobs

I just started using Azure for both my mobile and my web development.
I am using Node.js as my framework for the Azure backend, with Mobile Services and Web Apps in Azure.
Here is the situation: I am using an Azure Storage Queue, and a WebJob from my Web App handles the queue. The messages in the queue are to be sent out to specific users via Notification Hubs (push notifications).
The queue can hold 50,000 or more messages, all of which are used to push a message to users one by one. I tried handling the queue with a WebJob scheduled at a 2-minute interval; I know a scheduled WebJob won't run two instances while one is already running.
Initially I wanted to use a continuously running WebJob, but it goes to pending-restart once the script finishes. My assumption was that a continuous WebJob would run the script in an endless loop, over and over, until it hit an exception or something went wrong. That assumption was wrong: it restarts itself once the whole script succeeds. I know the restart delay can be adjusted to less than 60 seconds, but I am not sure that helps, since I also have a lot of async operations.
My script loops over the 50,000 or more user messages, sends each push message via the Azure Node.js package, and on return deletes the message so it no longer appears in the queue. So there are async operations in each loop iteration.
Everything works fine, except that the WebJob executes for a maximum of 5 minutes and then runs again on the next schedule. That is, it runs for at most 5 minutes regardless of the operation. I tried with 1,000 messages from the queue and everything works, but when the message count goes up to 5,000 and above, the time is not sufficient, so some of the async operations do not complete and those messages are not deleted.
Is there a way to extend the 5-minute execution time, or a better way to handle Storage Queues? I looked into the WebJobs SDK, but it is limited to C# and Visual Studio; I am on Mac OS X with JavaScript, so I cannot use it.
Please advise; I have wasted a lot of time figuring out the best way to handle the Storage Queue with WebJobs, but it does not seem to serve the purpose once the messages grow in number and async operations have to fit into only 5 minutes of execution time. I do not have any VMs at the moment; I only use PaaS in Azure.
According to your description:
All these messages are used to push out the message to the user one by one
it will run 50,000 or more users messages in the loop
So your requirement is to send each message in the queue to a user, and right now you fetch all the messages in the queue at once (even when the count reaches 50,000 or more) and loop over them for further processing?
If there is any misunderstanding, feel free to let me know.
In my opinion, could you get just the top message of the queue each time and send it to your user? That would remarkably reduce the processing time, and it can be done in a continuous WebJob. You can refer to How To: Peek at the Next Message to see how to look at the message at the front of a queue without removing it, as in the sketch below.
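For instance, a small sketch of peeking with the azure-storage Node package (the queue name and account placeholders follow the snippets below; property casing may differ across package versions):

  // Hypothetical sketch: look at the front message without dequeuing it.
  var azureStorage = require('azure-storage');
  var queueSvc = azureStorage.createQueueService('<accountName>', '<accountKey>');

  queueSvc.peekMessages('myqueue', function (error, result) {
    if (!error) {
      // result[0] is the front message; peeking does not remove it or
      // change its visibility for other consumers.
      console.log(result[0].messagetext);
    }
  });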
update
I noticed you mentioned that you also have a Node.js Web App in your overall project architecture.
So I considered whether you could leverage a continuous WebJob in Web Apps to get one message and send it to the Notification Hub at a time.
Here is my test code snippet:
var azureStorage = require('azure-storage'),
    azure = require('azure'),
    accountName = '<accountName>',
    accountKey = '<accountKey>';

var queueSvc = azureStorage.createQueueService(accountName, accountKey);
var notificationHubService = azure.createNotificationHubService('<notificationhub-name>', '<connectionstring>');

queueSvc.getMessages('myqueue', { numOfMessages: 1 }, function(error, result, response) {
  if (!error) {
    // Message text is in result[0].messagetext
    var message = result[0];
    console.log(message.messagetext);
    var payload = {
      data: {
        msg: message.messagetext
      }
    };
    notificationHubService.gcm.send(null, payload, function(error) {
      if (!error) {
        // notification sent
        console.log('notification sent');
        queueSvc.deleteMessage('myqueue', message.messageid, message.popreceipt, function(error, response) {
          if (!error) {
            console.log(response);
            // Message deleted
          } else {
            console.log(error);
          }
        });
      }
    });
  }
});
For details, refer to How to use Notification Hubs from Node.js and https://github.com/Azure/azure-storage-node/blob/master/lib/services/queue/queueservice.js#L727
update2
Taking the idea from the Service Bus demo on GitHub, I modified the code above, which greatly improves the efficiency.
Here is the code snippet, for your information:
var queueName = 'myqueue';

function checkForMessages(queueSvc, queueName, callback) {
  queueSvc.getMessages(queueName, function(err, message) {
    if (err) {
      if (err === 'No messages to receive') {
        console.log('No messages');
      } else {
        console.log(err);
        // callback(err);
      }
    } else {
      callback(null, message[0]);
      console.log(message);
    }
  });
}

function processMessage(queueSvc, err, lockedMsg) {
  if (err) {
    console.log('Error on Rx: ', err);
  } else {
    console.log('Rx: ', lockedMsg);
    var payload = {
      data: {
        msg: lockedMsg.messagetext
      }
    };
    notificationHubService.gcm.send(null, payload, function(error) {
      if (!error) {
        // notification sent
        console.log('notification sent');
        console.log(lockedMsg);
        console.log(lockedMsg.popreceipt);
        queueSvc.deleteMessage(queueName, lockedMsg.messageid, lockedMsg.popreceipt, function(err2) {
          if (err2) {
            console.log('Failed to delete message: ', err2);
          } else {
            console.log('Deleted message.');
          }
        });
      }
    });
  }
}

var t = setInterval(checkForMessages.bind(null, queueSvc, queueName, processMessage.bind(null, queueSvc)), 100);
I set the interval to 100ms in setInterval; now it can process almost 600 messages per minute in my test.
The various configuration settings for WebJobs are explained on this wiki page. In your case you should increase the WEBJOBS_IDLE_TIMEOUT value, which is the time in seconds after which a triggered job times out if it hasn't produced any output. The WEBJOBS_IDLE_TIMEOUT setting needs to be configured in the portal's app settings, not via the app.config file.

Maintaining data across multiple API requests on node server

Here is the problem I am working on.
The client needs to poll the node server for some data using an API. For the node server to respond, it needs a data set (read from the DB). I want to avoid reading the database on every poll. How do I maintain the data set across multiple polls?
And if this is possible, would it have any impact on server performance?
Reducing DB queries is always a complicated question. My solution is to make the DB query a promise and cache that promise until it expires.
For example:
var cache = {};

var cachedQuery = function(id) {
  if (id in cache) return cache[id];
  return cache[id] = new Promise(function(resolve, reject) {
    db.query('select * from test where id=?', id, function(err, rows) {
      delete cache[id];
      if (err) reject(err);
      else resolve(rows);
    });
  });
}
Assume you have 100+ simultaneous queries that share the same request id: those queries will share the same DB request. The request is made when the first query comes in, and all queries return the same result when it completes.
For more generic usage, you can use an lru-cache to store the created promises, giving them an expiration time, as sketched below.
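A small sketch of that idea with lru-cache (the capacity and TTL are arbitrary choices; the option names follow the older v5-style API, where newer versions use `ttl` instead of `maxAge`):

  var LRU = require('lru-cache');

  // Cache up to 500 recent query promises for 30 seconds each.
  var queryCache = new LRU({ max: 500, maxAge: 30 * 1000 });

  function cachedQuery(id) {
    var hit = queryCache.get(id);
    if (hit) return hit; // reuse the in-flight or recent promise
    var p = new Promise(function (resolve, reject) {
      db.query('select * from test where id=?', id, function (err, rows) {
        if (err) reject(err);
        else resolve(rows);
      });
    });
    queryCache.set(id, p);
    return p;
  }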
You can use an in-memory cache. The idea is that before making a trip to the database, you check if the requested value exists in the cache. If it does, serve it; if not, fetch it from the database, save it to the cache, and return it.
Serving from memory is the fastest you can get. There are existing solutions for Node out there, like node-cache; a minimal sketch follows.
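For example, a minimal sketch with node-cache (the 60-second TTL, key prefix, and query are illustrative choices, not from the original post):

  var NodeCache = require('node-cache');
  var cache = new NodeCache({ stdTTL: 60 }); // entries expire after 60 seconds

  // Serves from memory when possible, otherwise reads the DB and fills the cache.
  function getData(id, callback) {
    var key = 'dataset:' + id;
    var cached = cache.get(key);
    if (cached !== undefined) return callback(null, cached); // cache hit
    db.query('select * from test where id=?', id, function (err, rows) {
      if (err) return callback(err);
      cache.set(key, rows); // cache for subsequent polls
      callback(null, rows);
    });
  }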
