I was doing some research lately into a publish/subscribe scenario and it made me think of a “gotcha” that folks may not think about when building this type of messaging solution.
Specifically, what are the implications of resubmitting a failed transmission to a particular subscriber in a publish/subscribe scenario? For demonstration purposes, let’s say I’ve got a schema defining a purchase request for a stock.
Now let’s say that this is NOT an idempotent message and the subscriber only expects a single delivery. If I happen to send the above message (a request to buy 200 shares) twice, then 400 shares would get bought. So, we need a guaranteed-once delivery. Let’s also assume that we have multiple subscribers of this data who all do different things. In this demonstration, I have a single receive port/location which picks up this message, and two send ports which both subscribe on the message type and transmit the data to different locations.
As you’d expect, if I drop a single file in, I get two files out. Now what if the first send port fails for whatever reason? If I change the endpoint address to something invalid, the first port will fail, and the second will proceed as normal.
You can see that this suspension is directly associated with a particular send port, so resuming this failed message (after correcting the invalid endpoint address) should ONLY target the failed send port, and not put the message in a position to ALSO be processed by the previously-successful send port. This is verified in the scenario above.
So all is good. BUT what happens if you leverage an external system to facilitate the repair and resubmit of failed messages? This could be a SharePoint solution, custom application or the ESB Toolkit. Let’s use the ESB Toolkit here. I went into each send port and checked the Enable routing for failed messages box. This will result in port failures being published back to the bus, where the ESB Toolkit “catch all” exception send port will pick them up.
Before testing this out, make sure you have an HTTP receive location set up. We’ll be using this to send messages back from the ESB portal to BizTalk for reprocessing. I hadn’t set up an HTTP receive location yet on my IIS 7 box and found the instructions here (I used an HTTP receive location instead of the ESB on-ramps because I saw the same ESB Toolkit bug mentioned here).
So once again, I changed a send port’s address to something invalid and published a message to BizTalk. One message succeeded, one failed and there were no suspended messages because I had the failed message routing turned on. When I visit my ESB Toolkit Management Portal I can see the failed message in all its glory.
Clicking on the error drills into the details. From here I can view the message, click Edit and choose to resubmit it back to BizTalk.
This message comes back into BizTalk with no previous context or destination target. Rather, it’s as if I’m dropping this message into BizTalk for the first time. This means that ALL subscribers (in my scenario here), including the send port that already succeeded, will get the message again, which can cause unintended side effects.
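To make the difference concrete, here’s a minimal Python sketch of the two repair paths. It is a toy model only, not BizTalk or ESB Toolkit code: `ToyBus`, `SendPort`, the port names and the `symbol`/`shares` fields are all hypothetical stand-ins I made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SendPort:
    """Hypothetical send port; 'healthy' stands in for a valid endpoint address."""
    name: str
    healthy: bool = True
    delivered: list = field(default_factory=list)

    def send(self, message: dict) -> bool:
        # Return False to simulate a transmission failure.
        if not self.healthy:
            return False
        self.delivered.append(message)
        return True


class ToyBus:
    """Hypothetical bus: every port subscribes to every published message."""

    def __init__(self, ports):
        self.ports = ports
        self.suspended = []  # (port, message) pairs awaiting repair

    def publish(self, message: dict) -> None:
        for port in self.ports:
            if not port.send(message):
                self.suspended.append((port, message))

    def resume_suspended(self) -> None:
        # In-place resume: retry ONLY the port tied to each suspension.
        pending, self.suspended = self.suspended, []
        for port, message in pending:
            if not port.send(message):
                self.suspended.append((port, message))


def shares_bought(port: SendPort) -> int:
    return sum(m["shares"] for m in port.delivered)


order = {"symbol": "XYZ", "shares": 200}  # assumed message shape

# Case 1: resume the suspended instance inside the bus.
a, b = SendPort("PortA", healthy=False), SendPort("PortB")
bus = ToyBus([a, b])
bus.publish(order)
a.healthy = True           # endpoint address corrected
bus.resume_suspended()
resume_result = (shares_bought(a), shares_bought(b))

# Case 2: an external portal resubmits the message as if it were new.
c, d = SendPort("PortA", healthy=False), SendPort("PortB")
bus2 = ToyBus([c, d])
bus2.publish(order)
c.healthy = True
bus2.suspended.clear()     # the failure was routed to the portal instead
bus2.publish(order)        # re-enters through the receive location
resubmit_result = (shares_bought(c), shares_bought(d))

print(resume_result)    # (200, 200)
print(resubmit_result)  # (200, 400)
```

In the first case each port ends up with one 200-share order. In the second, the port that never failed processes the order twice.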
This is a case you may not think of when working primarily in point-to-point solutions. How do you get around it? A few ways I can think of:
- Build your messages and services to be idempotent. Who cares if a message comes once or ten times? Ideally there is a single identifier in each message that can indicate a message is a duplicate, or the message itself is formatted in a way that is immune to retries. For instance, instead of the message saying to buy 200 shares, we could have fields with a “before amount” of 800 and “after amount” of 1000.
- Transform messages at the send port to destination-specific formats. If each send port transforms the message to a destination format, then we could repair and resubmit it and only send ports looking for either the canonical format OR the destination format would pick it up.
- Have indicators in the message to indicate targets/retries and filter those out of send ports. We could add routing instructions to a message that specified a target system and have filters in send ports so only ports listening for that target pick up a message. The ESB Toolkit lets us edit the message itself before resubmitting it, so we could have a field called “target” and manually populate which send port the message should aim for.
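As a rough illustration of the first and third options, here’s a small Python sketch. Everything in it is an assumption for demonstration purposes: the `Subscriber` class, the `order_id` field and the system names are hypothetical, and in a real BizTalk solution the “target” check would more likely live in a send port filter than in the receiving system.

```python
class Subscriber:
    """Hypothetical downstream system that guards against replays."""

    def __init__(self, name: str):
        self.name = name
        self.seen_order_ids = set()
        self.shares_bought = 0

    def handle(self, message: dict) -> str:
        # Option 3: honor an optional "target" routing field.
        target = message.get("target")
        if target is not None and target != self.name:
            return "filtered"
        # Option 1: skip anything whose identifier was already processed.
        if message["order_id"] in self.seen_order_ids:
            return "duplicate"
        self.seen_order_ids.add(message["order_id"])
        self.shares_bought += message["shares"]
        return "processed"


broker = Subscriber("broker")
ledger = Subscriber("ledger")
order = {"order_id": "A-1001", "symbol": "XYZ", "shares": 200}

# Original publish: pretend the broker endpoint was down, so only the ledger ran.
first_pass = ledger.handle(order)

# Repaired message edited to name its target before resubmission.
repaired = dict(order, target="broker")
targeted = (broker.handle(repaired), ledger.handle(repaired))

# A later untargeted replay is harmless because both systems track order_id.
replay = (broker.handle(order), ledger.handle(order))
totals = (broker.shares_bought, ledger.shares_bought)

print(first_pass, targeted, replay, totals)
```

Either guard alone keeps the purchase at 200 shares per system here; combining them just shows that the two approaches don’t conflict.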
So there you go. When working solely within BizTalk for messaging exceptions, whether or not you use pub/sub shouldn’t matter. But if you leverage error handling orchestrations or completely external exception management systems, you need to take into account the side effects of resubmitted messages that could reach multiple subscribers.