Any way to prevent RabbitMQ dead letter queue dropping properties? - javascript

I'm using the dead letter exchange feature on RabbitMQ to perform scheduled RPC calls, but after the message is dead-lettered it drops the replyTo property that was set on the original message. Is there any way to declare the replyTo property so that it will be retained in the "dead" queue?
I'm doing this with amqplib in node.js, by the way.

Unfortunately, RabbitMQ will only preserve the properties that are listed on the Dead Letter Exchanges page:
queue - the name of the queue the message was in before it was dead-lettered,
reason - the reason for the DLX being used,
time - the date and time the message was dead-lettered, as a 64-bit AMQP format timestamp,
exchange - the exchange the message was published to,
routing-keys - the routing keys the message was published with,
count - how many times this message was dead-lettered in this queue for this reason, and
original-expiration - the original expiration property of the message.
There are 2 ways to solve the problem you're seeing, I think.
1) Put the reply-to in your own header or property field, and read it from there / replace it when it's not in the usual spot (see the sketch after this list)
2) Don't use the reply-to field. Instead, use a well-known exchange for the reply at a later point in time.
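For option 1, here's a minimal sketch with amqplib of what carrying the reply queue name in a custom header might look like (the queue names, TTL, and the x-reply-to header name are assumptions for illustration, not part of the original answer):

const amqp = require('amqplib');

async function publishDelayedRpc(payload, replyQueue) {
  const conn = await amqp.connect('amqp://localhost');
  const ch = await conn.createChannel();

  // The "delay" queue dead-letters expired messages into the exchange the worker consumes from
  await ch.assertQueue('rpc.delay', {
    deadLetterExchange: 'rpc.work',
    messageTtl: 30000
  });

  ch.sendToQueue('rpc.delay', Buffer.from(JSON.stringify(payload)), {
    replyTo: replyQueue,                  // the property reported as dropped after dead-lettering
    headers: { 'x-reply-to': replyQueue } // custom headers are carried along with the message
  });
}

// On the consumer side, fall back to the custom header when replyTo is missing:
// const replyTo = msg.properties.replyTo || msg.properties.headers['x-reply-to'];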
Using the reply-to field typically implies a request/response or RPC scenario. These scenarios usually need a response fairly quickly. If a response does not come quickly, the system can usually move forward without it - even if it's just a message to the user saying "X is not available right now".
You say you're using a DLX to do scheduled RPC calls... delayed messages are a common use case for a DLX - nothing wrong with that. But delaying the RPC response can run into some significant challenges beyond what you're already seeing.
For example, what happens when your system has a hiccup and the code that made the original request is no longer there to listen for the response? The answer to this depends on whether or not you really need the response to be handled. If you do need it to be handled - the system will run into serious trouble if it isn't - then RPC can be dangerous.
Instead of relying on RPC and implying a temporal need for a given response, it's often better to use two-way messaging via separate queues. I've written about this in both my managing long running processes post and in my RabbitMQ Patterns email course / ebook.
The gist of it is that you can avoid the need for a reply-to queue by having the original message publisher also be a subscriber with a queue for the responses.
An example from the long running process post:
var DrinkRequestSender = new Sender(/* ... details ... */);
var DrinkRequestReceiver = new Receiver(/* ... details ... */);

var DrinkStation = {
  make: function(drink){
    DrinkRequestReceiver.receive((response) => {
      var drinkResponse = response.body;
      this.trigger("drinkup", drinkResponse);
    });

    var drinkData = drink.toJSON();
    DrinkRequestSender.send(drinkData);
  }
};
In this example, the code is sending out a "request" and later receiving a "response" - but not using a standard RPC setup. It is using a dedicated queue for the response, with the code on the other end sending the reply back via an exchange that routes to that queue.
This allows you to better handle failure scenarios, very long running processes and more.
This style of 2-way messaging does add some additional challenges, though. For one, you'll have to build in the ability to reconstruct the object that made the original request.
You can find this detailed in the long running process post, and there's a bit more info in RMQ Patterns, as well (along with a lot of other patterns).
Hope that helps!

Related

How to remove particular messages in rabbitmq before publishing new messages?

I have a subscriber which pushes data into queues. Now the messages look like this:
{
  "Content": {
    "_id": "5ceya67bbsbag3",
    "dataset": {
      "upper": {},
      "lower": {}
    }
  }
}
Now a new message can be pushed with the same content id but different data. In that case I want to delete the old message with the same id, or replace it, so that only the latest message is retained.
I have not found a direct solution for this in RabbitMQ. Please guide me on how this can be done.
I have already gone through some posts.
Post 1
Post 2
What you are trying to achieve cannot be trivially solved with RabbitMQ (or rather the AMQP protocol).
RabbitMQ queues are simple FIFO structures and don't offer any means of access to the elements beyond publishing at one end and consuming from the other.
Therefore, the only way to "update" an already existing message without relying on another service would be to fetch all the messages until you find the one you are interested in, discard it, and publish the new one together with the other messages you fetched.
Overall, the recommendation when using RabbitMQ with regard to message duplication is to make their consumption idempotent. In other words, the consumption of 2 messages deemed to be the same should lead to the same outcome.
One way to achieve idempotency is to rely on a secondary cache where you store the message identifiers and their validity. Once a consumer fetches a new message from RabbitMQ, it would check the cache to see if it's a valid message or not and act accordingly.
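A rough sketch of that idempotent-consumer idea, assuming an amqplib message and the node-redis v4 client (the key prefix and TTL are arbitrary choices for illustration):

const { createClient } = require('redis');

async function makeHandler() {
  const cache = createClient();
  await cache.connect();

  return async function handleMessage(msg) {
    const body = JSON.parse(msg.content.toString());
    const id = body.Content._id;

    // SET ... NX succeeds only the first time this id is seen; EX expires the marker later
    const firstTime = await cache.set('seen:' + id, '1', { NX: true, EX: 3600 });
    if (!firstTime) {
      return; // duplicate (or stale) message: acknowledge it and skip the work
    }
    // ...process the message, then ack...
  };
}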
I think this is a slightly wrong way to use RabbitMQ.
Only immutable (not intended to change) tasks should be put into the queues that a worker will consume.
An alternative way to implement your particular task is:
just push the immutable part into the queue, e.g. { "content": { "_id": "5ceya67bbsbag3" } }
store the mutable data in a db (mongo) or an in-memory db (something like redis is suggested here)
whenever an update is needed, update it in the db
let your worker fetch the required data from the db using the "_id" reference (a sketch of this flow follows below)
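A minimal sketch of that flow, assuming amqplib for the queue and the node-redis v4 client for the mutable data (key names and the queue name are illustrative only):

const { createClient } = require('redis');

// Producer: store/overwrite the mutable payload under its id, queue only the id
async function publish(channel, db, doc) {
  await db.set('dataset:' + doc.Content._id, JSON.stringify(doc.Content.dataset));
  channel.sendToQueue('work', Buffer.from(JSON.stringify({ _id: doc.Content._id })));
}

// Worker: always read the freshest data at processing time
async function handle(db, msg) {
  const { _id } = JSON.parse(msg.content.toString());
  const dataset = JSON.parse(await db.get('dataset:' + _id));
  // ...process dataset...
}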
I am not sure removing a message is a good idea, if your requirement is just that the latest data is always maintained for a given id. Since messages are consumed in order, the data from the last message will end up being applied anyway, so I don't see an issue here with RabbitMQ.

How can a lambda function that consumes a SQS queue send the message to a Dead Letter Queue?

I have an AWS lambda function that consumes data from an AWS SQS queue. If this lambda finds a problem when processing the data of a message, then this message has to be added to a dead letter queue.
The documentation I found is not clear about how can I make the lambda send the message to the Dead Letter Queue. How is that accomplished?
Should I use the sendMessage() method, like I'd do to insert in a standard queue, or is there a better approach?
AWS will automatically send messages to your dead-letter-queue (DLQ) for you if receiveMessage returns that message too many times (configurable on the queue with maxReceiveCount property) - typically this happens if you receive a message, but don't delete it (if for example, you had some exception in processing it). This is the simplest way to use a DLQ - by letting AWS put messages there for you.
However, there's nothing wrong with manually sending a message to a DLQ. There's nothing special about it - it's just another queue - you can send and receive messages from it, or even give it its own DLQ!
Manually sending messages to a DLQ is useful in several scenarios, the simplest one being your case: when you know the message is broken (and want to save time trying to reprocess it). Another example is if you need to quickly burn through old items in your main queue but still save those messages for processing later - enabling you to catch up from backlog by processing more recent events first.
The key things to remember when manually sending a message to a DLQ are (a sketch of this ordering follows below):
Send the message to the dead-letter queue FIRST
Mark the message as consumed in the original queue (using deleteMessage) so AWS's automatic mechanisms don't put it there for you later.
If you delete the message from the original queue first, there is a small chance the message is lost (i.e. if you crash or have an error before storing the message elsewhere)
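A sketch of that "send first, then delete" ordering inside a Lambda SQS handler, using the aws-sdk v2 client (the queue URLs, env var names, and processRecord are placeholders, not something from the original answer):

const AWS = require('aws-sdk');
const sqs = new AWS.SQS();

const SOURCE_QUEUE_URL = process.env.SOURCE_QUEUE_URL; // assumed configuration
const DLQ_URL = process.env.DLQ_URL;                    // assumed configuration

async function processRecord(record) {
  // ...your real processing; throw on failure...
}

exports.handler = async (event) => {
  for (const record of event.Records) {
    try {
      await processRecord(record);
    } catch (err) {
      // 1) copy the message to the DLQ first...
      await sqs.sendMessage({ QueueUrl: DLQ_URL, MessageBody: record.body }).promise();
      // 2) ...then delete it from the source queue so it won't be redelivered
      await sqs.deleteMessage({
        QueueUrl: SOURCE_QUEUE_URL,
        ReceiptHandle: record.receiptHandle
      }).promise();
    }
  }
};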
You are not supposed to send messages to the dead letter queue yourself; messages that fail to be processed too many times will get there on their own (see here).
The point is you get the message, fail on it, don't delete it, and after maxReceiveCount attempts it will be redriven to the DLQ.
Note that you can simply send it to the DLQ yourself (hinted at by the documentation where it says "The NumberOfMessagesSent and NumberOfMessagesReceived for a Dead-Letter Queue Don't Match"); however, that seems like an abuse, to me at least.
TLDR: You're not supposed to send it yourself; the queue needs to be configured with a DLQ and Amazon will do it for you after a set number of failures.

Need some clarification on nodejs concepts

I am starting to learn more about how this "web world" works and that's why I am taking the free code camp course. I already took front-end development and I really enjoyed it. Now I am on the back end part.
The back end is much more foggy for me. There are many things that I don't get so I would hope that someone could help me out.
First of all I learned about the get method, so I did:
var http = require('http');
and then made a get request:
http.get(url, function callBack(response){
  response.setEncoding("utf8");
  response.on("data", function(data){
    console.log(data);
  });
});
Question 1)
So apparently this code "gets" a response from a certain URL. But what response? I didn't even ask for anything in particular.
Moving on...
The second exercise asks us to listen to a TCP connection and create a server and then write the date and time of that connection. So here's the answer:
var net = require('net');

var server = net.createServer(function listener (socket){
  socket.end(date); // date holds the current date/time as a string
});
server.listen(port);
Question 2)
Okay, so I created a TCP server with net.createServer() and when the connection was successful I outputted the date. But where? What actually happened when I put date inside socket.end()?
Last but not least...
In the last exercise I was told to create an HTTP server (what?) to serve a text file every time it receives a request, and here's what I did:
var http = require('http');
var fs = require('fs');

var server = http.createServer(function callback(request, response){
  var read = fs.createReadStream(location); // location is the path to the text file
  read.pipe(response);
});
server.listen(port);
Question 3)
a) Why did I have to create an HTTP server instead of a regular TCP server? What's the difference?
b) What does createReadStream do?
c) What does pipe() do?
If someone could help me, keeping the explanation simple would help me a lot since I am, as you can see, pretty new to this subject.
Thank you a lot!
This is a little broad for Stack Overflow, which favors focused questions that address specific problems. But I feel your pain, so…
Question 1:
http.get is roughly equivalent to requesting a webpage. The url in the function is the page you are requesting. The response will include several things like the HTTP response code, but also (most importantly) the content of the page, which is what you are probably after. On the backend this is normally used for hitting APIs that return data rather than actual web pages, but the transport mechanism is the same.
Question 2:
When you open a socket, you are waiting for someone else to request a connection (the way you do when you use http.get()). When you output data you are sending them a response like the one you received in question 1.
Question 3:
HTTP is a higher-level protocol than TCP. This basically means it is more specific and TCP is more general (pedants will take issue with that statement, but it's an easy way to understand it). HTTP defines things like the GET and POST requests that you use when you download a webpage. Lower down in the protocol stack, HTTP uses TCP. You could just use TCP, but you would have to do a lot more work to interpret the requests that come in. The HTTP library does that work for you. Other protocols like FTP also use TCP, but they are different protocols than HTTP.
For this answer, you need to understand two things. An IP address is the numeric address of a website; it's the address of the server hosting the site. A domain name is a mapping from that IP to a name, which gives humans an easier way to refer to websites, so instead of typing numbers like 192.168.1.1, we can just type names (www.hotdog.com). That's what your get request is doing: it's requesting the site.
socket.end is a method you're calling. socket.end "half-closes the socket, i.e., it sends a FIN packet; it is possible the server will still send some data" (from the nodejs.org docs). So it writes the parameter you're passing in, which is today's current date, and then half-closes your socket.
HTTP is Hypertext Transfer Protocol; TCP (Transmission Control Protocol) is a connection between two computers.
3a) HTTP is for browsers, so that's why you did it: for a web page you were hosting locally or something.
3b) createReadStream() returns a new ReadStream object (see Readable Stream).
Be aware that, unlike the default value set for highWaterMark on a readable stream (16 KiB), the stream returned by this method has a default value of 64 KiB for the same parameter.
3c) pipe() attaches a writable stream to a readable one, so everything read from the source gets written to the destination. From the docs:
The 'pipe' event is emitted when the stream.pipe() method is called on a readable stream, adding this writable to its set of destinations.
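In practice, pipe() just connects a readable stream to a writable one and moves the data for you, handling pausing and resuming when the destination is busy. A tiny sketch (the file path is a placeholder):

var fs = require('fs');

var read = fs.createReadStream('./hello.txt'); // readable stream over a file
read.pipe(process.stdout);                     // write every chunk read from the file
                                               // to stdout, with backpressure handled for you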

Node.js EventEmitter events not sharing event loop

Perhaps the underlying issue is how the node-kafka module I am using has implemented things, but perhaps not, so here we go...
Using the node-kafka library, I am facing an issue with subscribing to consumer.on('message') events. The library is using the standard events module, so I think this question might be generic enough.
My actual code structure is large and complicated, so here is a pseudo-example of the basic layout to highlight my problem. (Note: This code snippet is untested so I might have errors here, but the syntax is not in question here anyway)
var messageCount = 0;
var queryCount = 0;

// Getting messages via some event emitter
consumer.on('message', function(message) {
  messageCount++;
  console.log('Message #' + messageCount);

  // Making a database call for each message
  mysql.query('SELECT "test" AS testQuery', function(err, rows, fields) {
    queryCount++;
    console.log('Query #' + queryCount);
  });
});
What I am seeing here is when I start my server, there are 100,000 or so backlogged messages that kafka will want to give me and it does so through the event emitter. So I start to get messages. To get and log all the messages takes about 15 seconds.
This is what I would expect to see for an output assuming the mysql query is reasonably fast:
Message #1
Message #2
Message #3
...
Message #500
Query #1
Message #501
Message #502
Query #2
... and so on in some intermingled fashion
I would expect this because my first mysql result should be ready very quickly and I would expect the result(s) to take their turn in the event loop to have the response processed. What I am actually getting is:
Message #1
Message #2
...
Message #100000
Query #1
Query #2
...
Query #100000
I am getting every single message before a mysql response is able to be processed. So my question is, why? Why am I not able to get a single database result until all the message events are complete?
Another note: I set a breakpoint at .emit('message') in node-kafka and at mysql.query() in my code and I am hitting them turn-based. So it appears that all 100,000 emits are not stacking up before getting into my event subscriber. So there went my first hypothesis on the problem.
Ideas and knowledge would be very appreciated :)
The node-kafka driver uses quite a liberal buffer size (1M), which means that it will get as many messages from Kafka as will fit in the buffer. If the server is backlogged, and depending on the message size, this may mean (tens of) thousands of messages coming in with one request.
Because EventEmitter is synchronous (it doesn't use the Node event loop), this means that the driver will emit (tens of) thousands of events to its listeners, and since it's synchronous, it won't yield to the Node event loop until all messages have been delivered.
I don't think you can work around the flood of event deliveries, but I don't think that specifically the event delivery is problematic. The more likely problem is starting an asynchronous operation (in this case a MySQL query) for each event, which may flood the database with queries.
A possible workaround would be to use a queue instead of performing the queries directly from the event handlers. For instance, with async.queue you can limit the number of concurrent (asynchronous) tasks. The "worker" part of the queue would perform the MySQL query, and in the event handlers you'd merely push the message onto the queue.
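A rough sketch of that workaround with the async library, reusing the consumer, mysql, messageCount and queryCount names from the question (the concurrency of 10 is an arbitrary choice):

var async = require('async');

// The worker performs the MySQL query; at most 10 run concurrently
var queryQueue = async.queue(function worker(message, done) {
  mysql.query('SELECT "test" AS testQuery', function(err, rows, fields) {
    queryCount++;
    console.log('Query #' + queryCount);
    done(err);
  });
}, 10);

consumer.on('message', function(message) {
  messageCount++;
  console.log('Message #' + messageCount);
  queryQueue.push(message); // cheap and synchronous; the real work happens in the worker
});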

Ajax polling chat gets duplicates at message receiver front-end on speedy chat

I have developed a javascript chat (php on the backend) using:
1) long-polling to get new messages for the receiver
2) sessionStorage to store the counter of messages
3) setInterval to read new messages and if sessionStorageCounter < setIntervalCounter then the last message is shown to receiver.
4) javascript to create,update and write the chat dialogues
The module is working fine, but when users have a speedy chat the receiver's front end gets the same message two or three times (neither the counter fails, nor does the query produce duplicate inserts).
The code seems to be correct (that's why I don't provide it), so the interval delay might be the reason (though reducing the interval delay changes nothing).
Do you think that the above schema is a bad practice and which schema do you think would eliminate the errors?
My approach, if solving it myself (as opposed to using an existing library that already handles this) would be:
Have the server assign a unique ID (GUID) to each message as it arrives.
On the clients, store the ID of the most recently received message.
When polling for new messages, do so with the ID of the last message successfully received. Server then responds by finding that message in its own queue and replaying all of the subsequent messages.
To guard against 'dropped' messages, each message can also carry the ID of the immediately-previous message (allowing the client to do consistency-checking)
If repolling does cause duplicates to be delivered from server to client, the presence of unique IDs on each message makes eliminating them trivial. Think of the server-side message queue as an event stream, with each client tracking their last-read position. The client makes no guesses about the appropriate order of messages, how many there are, etc - because its state consists entirely of 'what have I seen', there are few opportunities to get out of sync.
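A small sketch of the client side of that scheme (the /messages endpoint, its response shape, and renderMessage are assumptions for illustration):

var lastSeenId = '';
var seen = {};

function renderMessage(msg) {
  // ...append msg to the chat window...
}

function poll() {
  fetch('/messages?after=' + encodeURIComponent(lastSeenId))
    .then(function(res) { return res.json(); })
    .then(function(messages) {
      messages.forEach(function(msg) {
        if (seen[msg.id]) return; // duplicate delivery: ignore it
        seen[msg.id] = true;
        lastSeenId = msg.id;
        renderMessage(msg);
      });
    })
    .finally(poll); // re-poll right away (long polling happens server-side)
}
poll();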
Since it's a real-time chat, the setInterval interval is probably small enough that the client may ask the server for new messages two or three times simultaneously. Make sure that the server handler is synchronized and ignores duplicate queries from the same user.
