Rebus HTTP gateway and MSMQ health state

Question

Let's say we have

Client node with HTTP gateway outbound service
Server node with HTTP gateway inbound service

I'm considering the situation where MSMQ itself stops for some reason on the client node. In the current implementation, the Rebus HTTP gateway will catch the exception.

What do you think about the idea that, instead of just being caught, the MessageQueueException could also be sent to the server node and put on the error queue? (The name of the error queue could be taken from the headers.)

That way, without additional infrastructure, the server would know that the client has a problem, so someone could react.

UPDATE:

I guessed the problems described in the answer would be raised. I should have explained my scenario in more depth :) Sorry about that. Here it is:

I'm going to modify the HTTP gateway so that InboundService is able to both send and receive messages. The OutboundService would then be the only one that initiates the connection (periodically, e.g. once every 5 minutes) in order to get new messages from the server and send its own messages to the server. That is because the client node is not considered a server but one of many clients that are behind NAT.

Indeed, the server itself is not interested in client health, but I thought that instead of creating a separate alerting service on the client side which would use HTTP gateway code, the HTTP gateway itself could do this, since it is very much the HTTP gateway's business to have both sides running.

What if the client can't reach the server at all?

Since MSMQ would be dead, I thought about using an in-process, standalone, persistent queue object like the one described at http://ayende.com/blog/4540/building-a-managed-persistent-transactional-queue (just an example implementation, I'm not sure what kind of license it has) to aggregate exceptions on the client side until the server is reachable.
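The buffering idea above can be sketched in a few lines. This is a language-neutral illustration in Python, not Rebus or HTTP gateway code: the `ErrorBuffer` class, its table layout, and the `send` callback are all hypothetical names, and SQLite merely stands in for whatever persistent in-process queue would actually be used.

```python
import json
import sqlite3
from typing import Callable


class ErrorBuffer:
    """Durable FIFO of error reports kept in a local SQLite file (hypothetical)."""

    def __init__(self, path: str) -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS errors "
            "(id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT NOT NULL)"
        )
        self._db.commit()

    def add(self, error_type: str, message: str) -> None:
        """Persist one error report so it survives a process restart."""
        body = json.dumps({"type": error_type, "message": message})
        self._db.execute("INSERT INTO errors (body) VALUES (?)", (body,))
        self._db.commit()

    def pending(self) -> int:
        """Number of reports not yet delivered to the server."""
        return self._db.execute("SELECT COUNT(*) FROM errors").fetchone()[0]

    def flush(self, send: Callable[[dict], bool]) -> int:
        """Send oldest-first; stop at the first failure and keep the rest."""
        rows = self._db.execute(
            "SELECT id, body FROM errors ORDER BY id"
        ).fetchall()
        sent = 0
        for row_id, body in rows:
            if not send(json.loads(body)):
                break  # server unreachable: retry on the next sync
            self._db.execute("DELETE FROM errors WHERE id = ?", (row_id,))
            self._db.commit()
            sent += 1
        return sent
```

The client would call `add` wherever the exception is caught today and call `flush` from the same loop that synchronizes messages, so nothing is lost while the server is unreachable.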

And how often will the client notify the server that it has experienced an error?

I'm not sure about that part. I thought it could be tied to the scheduled message synchronization time, e.g. once every 5 minutes, but what if there is no schedule, as in the current implementation (a while(true) loop)? Maybe it could just be set by config?

I like to have a consistent strategy about handling errors which usually involves plain old NLog logging

Since the client nodes will be on the Internet behind NAT, standard monitoring techniques won't work. I thought about using a queue as the NLog transport, but since MSMQ would be dead it wouldn't work.

I also thought about using HTTP as the NLog transport, but on the server side it would require a queue (not really, but I would like to store it in a queue), so we are back to sbus and HTTP gateway... that kind of NLog transport would be a de facto clone of HTTP gateway.

UPDATE2: HTTP as the NLog transport (by transport I mean target) would also require a client-side queue like the one I described in the "What if the client can't reach the server at all?" section. It would be a clone of HTTP gateway embedded in NLog. Madness :)

The point is that the client is unreliable, so I want to have all the information about the client on the server side and log it there.

UPDATE3

An alternative solution could be to create a separate service that would nevertheless be part of the HTTP gateway (e.g. OutboundAlertService). Three goals would then be met:

shared sending loop code
no additional server infrastructure required
no negative impact on OutboundService (no complexity of adding in-process queue to it)

It wouldn't take exceptions from OutboundService; instead it would check MSMQ periodically itself.

Yet another alternative would be simply to use a queue other than MSMQ as the NLog target, but that's ugly overkill.
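A minimal sketch of such a periodic check, again in Python rather than against the real Rebus code. The `probe` and `report` callables are assumptions: `probe` stands for whatever MSMQ health check the service would perform, and `report` for however the finding is handed to the sending loop.

```python
import time
from typing import Callable, Optional


def run_alert_loop(
    probe: Callable[[], Optional[str]],
    report: Callable[[str], None],
    interval_seconds: float,
    max_checks: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Call probe() on a fixed interval and report each problem it returns.

    probe returns None when the queue looks healthy, else a description.
    max_checks bounds the loop; None means run until the process stops.
    """
    checks = 0
    while max_checks is None or checks < max_checks:
        try:
            problem = probe()
        except Exception as exc:  # a crashing probe is itself a finding
            problem = f"probe failed: {exc!r}"
        if problem is not None:
            report(problem)
        checks += 1
        # Sleep only if another check is going to follow.
        if max_checks is None or checks < max_checks:
            sleep(interval_seconds)
```

Keeping the probe outside OutboundService means the message loop stays untouched, and the interval can come from config, which also answers the "how often" question for setups that have no sync schedule.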

mookid8000 · Accepted Answer · 2014-09-04T07:07:59.823

Regarding your scenario, my initial thought is that it should never be the server's problem that a client has a problem, so I probably wouldn't send a message to the server when the client fails.

As I see it, there would be multiple problems/obstacles/challenges with that approach because, e.g. what if the client can't reach the server at all? And how often will the client notify the server that it has experienced an error?

Of course I don't know the details of your setup, so it's hard to give specific advice, but in general I like to have a consistent strategy about handling errors which usually involves plain old NLog logging and configuring WARN and ERROR levels to go to the Windows Event Log.

This allows for setting up various tools (like e.g. System Center Operations Manager or similar) to monitor all of your machines' event logs to raise error flags when something goes wrong.

I hope I've said something you can use :)

UPDATE

After thinking about it some more, I think I'm beginning to understand your problem, and I think that I would prefer a solution where the client lets the HTTP listener at the other end know that it's having a problem, and then the HTTP listener at the other end could (maybe?) log that as an error.

Another option is that the HTTP listener at the other end could have an event, ReceivedClientError or something, that one could attach to and then do whatever is right in the given situation.

In your case, you might put a message in an error queue. I would just avoid putting anything in the error queue as a general solution because I think it confuses the purpose of the error queue - the "thing" in the error queue wouldn't be a message, and as such it would not be retryable etc.
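The event-based option might look roughly like this. It is a sketch in Python with hypothetical names (`InboundListener`, `on_received_client_error`, `handle_error_report`); only the event name ReceivedClientError comes from the answer, and none of this is existing Rebus API.

```python
from typing import Callable, Dict, List

# A handler receives the client's id and the error report it sent.
ClientErrorHandler = Callable[[str, Dict[str, str]], None]


class InboundListener:
    """Stand-in for the HTTP listener on the server node (hypothetical)."""

    def __init__(self) -> None:
        self._handlers: List[ClientErrorHandler] = []

    def on_received_client_error(self, handler: ClientErrorHandler) -> None:
        """Attach a handler, playing the role of the ReceivedClientError event."""
        self._handlers.append(handler)

    def handle_error_report(self, client_id: str, report: Dict[str, str]) -> None:
        """Called when a request from a client carries an error report."""
        for handler in self._handlers:
            handler(client_id, report)


# Usage: each deployment decides what "the right thing" is.
seen: List[str] = []
listener = InboundListener()
listener.on_received_client_error(
    lambda client_id, report: seen.append(f"{client_id}: {report['type']}")
)
listener.handle_error_report("client-42", {"type": "MessageQueueException"})
```

The listener itself stays neutral: one handler could log the report, another could forward it somewhere else, and the error queue keeps its meaning as a place for retryable messages.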

Thank you for the update. I thought about it too, and by the end of the week I will try to summarize the information from this post and my other thoughts in the form of a GitHub issue and, after your blessing, in the form of a fork :) - user1121956, Sep 04 '14 at 11:47
