BizTalk Orchestration Throttling Pattern

I’m currently architecting a project where one of the requirements is to limit the number of concurrent calls to a web service. I’d covered a similar topic in a previous post, and outlined two ways one could try and configure this behavior.

First, you could limit the number of simultaneous connections for the SOAP adapter by setting the “maxconnection” value in the btsntsvc.exe.config file. The downside to this mechanism is that if you have many messages, and the service takes a while to process each one, you could hit timeouts.
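For reference, a minimal sketch of what that setting might look like, using the standard .NET `connectionManagement` element. The `address` and the limit of 5 are illustrative assumptions, not values from this project; `address="*"` applies the limit to every remote host, and a specific host name can be used instead.

```xml
<configuration>
  <system.net>
    <connectionManagement>
      <!-- Assumed values: cap outbound HTTP connections per remote host at 5 -->
      <add address="*" maxconnection="5" />
    </connectionManagement>
  </system.net>
</configuration>
```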

The second choice is to turn on ordered delivery at the send port. This eliminates the timeout issue, but really slows processing. In our case, the downstream web service (a FirstDoc web service that uploads documents to Documentum) is fairly CPU intensive, and may take upwards of 200 seconds to run (which is why I also asked the developer to consider an asynchronous callback pattern). So we need a few calls happening at once, but not so many that the box collapses.

So, Scott Colestock recommended I take a look at the last issue of the BizTalk HotRod magazine and review the orchestration throttling pattern that he had explained there. Unlike the two options mentioned above, this DOES require development, but it also provides fairly tight control over the number of concurrent orchestrations. Since Scott’s article didn’t have code attached, I figured I’d rebuild the project to help our developer out and learn something myself.

The first step was to define my two BizTalk message schemas. My first is the schema that holds the data used by the FirstDoc web service. It contains a file path which the web service uses to stream the document from disk into Documentum. I didn’t want BizTalk to actually be routing 50MB documents, just the metadata. The second schema is a simple acknowledgement schema that will be used to complete an individual processing instance. Also, I need a property schema that holds a unique correlation ID, and the instance ID of the target convoy. Both properties are set to MessageContextPropertyBase since the values themselves don’t come from the message payload but rather, the context.

The second step was to build a helper component which will dictate which throttled instance will be called. Basically, this pattern uses convoy orchestrations. The key, though, is that you have multiple convoys running, vs. a true singleton that processes ALL messages. The “correlation” used for each convoy is an instance ID that corresponds to a number in the numerical range of running orchestrations allowed. For instance, if I allowed 10 orchestrations to run at once, I’d have 10 convoy orchestrations, each one initializing (and following) an “InstanceID” between 1-10. Each of my calling orchestrations acquires a number between 1-10, and then targets that particular correlation. Make sense? So I may have 500 messages come in at once, and 500 base orchestrations spin up, but each one of those targets a specific throttled (convoy) orchestration.
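To make the idea concrete outside of BizTalk, here is a rough in-process analogy in plain C#. It is not orchestration code, and every name in it (`ConvoyAnalogy`, `Run`, the queue-per-convoy layout) is hypothetical. Each "convoy" is a queue drained by exactly one worker, messages are assigned to queues in round-robin order, and the sketch records the highest number of simultaneous "service calls" it sees, which can never exceed the number of convoys.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ConvoyAnalogy
{
    // Returns the peak number of simultaneous "downstream calls" observed.
    public static int Run(int convoyCount, int messageCount)
    {
        // One queue per "convoy"; each queue is drained by exactly one worker.
        BlockingCollection<int>[] queues = Enumerable.Range(0, convoyCount)
            .Select(_ => new BlockingCollection<int>())
            .ToArray();

        int inFlight = 0; // calls currently inside the simulated service
        int peak = 0;     // highest concurrency seen so far

        Task[] workers = queues.Select(q => Task.Factory.StartNew(() =>
        {
            foreach (int message in q.GetConsumingEnumerable())
            {
                int now = Interlocked.Increment(ref inFlight);

                // Record a new peak if this call pushed concurrency higher.
                int seen;
                do { seen = Volatile.Read(ref peak); }
                while (now > seen &&
                       Interlocked.CompareExchange(ref peak, now, seen) != seen);

                Thread.Sleep(1); // stand-in for the slow web service call
                Interlocked.Decrement(ref inFlight);
            }
        }, TaskCreationOptions.LongRunning)).ToArray();

        // The "calling orchestrations": each message targets one convoy.
        for (int i = 0; i < messageCount; i++)
        {
            queues[i % convoyCount].Add(i);
        }

        foreach (BlockingCollection<int> q in queues) q.CompleteAdding();
        Task.WaitAll(workers);
        return peak;
    }

    public static void Main()
    {
        int peak = Run(3, 60);
        Console.WriteLine("Peak concurrent calls: " + peak);
    }
}
```

However many messages arrive, the peak stays at or below the convoy count, because each convoy handles its messages one at a time.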

So my helper component is responsible for doling out an instance ID to the caller. The code looks like this:

    using System;

    [Serializable]
    public static class RoundRobinHelper
    {
        // member variable holding current selection
        private static int roundRobinSelection = 0;
        private static readonly object sync = new object();

        /// <summary>
        /// Thread-safe retrieval of which
        /// orchestration instance to target
        /// </summary>
        /// <returns>Instance ID between 0 and maxInstances - 1</returns>
        public static int GetNext()
        {
            const int maxInstances = 3;

            lock (sync)
            {
                // increment counter
                roundRobinSelection++;

                // if we've reached the limit, reset
                if (roundRobinSelection == maxInstances)
                {
                    roundRobinSelection = 0;
                }

                // return while still holding the lock, so another
                // thread can't change the value before we read it
                return roundRobinSelection;
            }
        }
    }

Notice that because it’s a static class, and it has a member variable, we have to be very careful to build this in a thread safe manner. This will be called by many orchestrations running on many threads.
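A lock-free variation is also possible. The sketch below is an alternative under stated assumptions, not the component used in this solution: the class names and the `CountAssignments` harness are hypothetical, and it relies on `Interlocked.Increment` returning the new value atomically, so each caller gets its own "ticket" and the modulo maps that ticket onto an instance ID.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LockFreeRoundRobin
{
    public const int MaxInstances = 3; // assumed limit, matching the example above
    private static int counter = 0;

    public static int GetNext()
    {
        // Atomically take the next ticket; no two callers get the same one.
        int ticket = Interlocked.Increment(ref counter);

        // Reinterpret as unsigned so the result stays non-negative even
        // after the counter wraps past int.MaxValue.
        return (int)(unchecked((uint)ticket) % MaxInstances);
    }
}

public static class LockFreeDemo
{
    // Calls GetNext() from many threads and counts how often each ID comes back.
    public static int[] CountAssignments(int calls)
    {
        int[] counts = new int[LockFreeRoundRobin.MaxInstances];
        Parallel.For(0, calls, i =>
        {
            Interlocked.Increment(ref counts[LockFreeRoundRobin.GetNext()]);
        });
        return counts;
    }

    public static void Main()
    {
        int[] counts = CountAssignments(3000);
        Console.WriteLine(string.Join(", ", counts));
    }
}
```

Because the tickets are consecutive, 3000 calls split evenly across the three IDs regardless of how the threads interleave. Note that the counter lives in a single process, so this (like the lock-based version) only balances within one host instance.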

Once this component was built and GAC-ed, I could build my orchestration. The first orchestration is the convoy. The first receive shape (which is direct bound to the MessageBox) initializes the “instance ID” correlation set. Then I have a loop which will run continuously. Inside that loop, I have a placeholder for the logic that actually calls Documentum, and waits for the response. Next I build the “acknowledgement message”, making sure to set the “Correlation ID” context property so that this acknowledgement reaches the orchestration that called it. I then send that message back out through a direct bound send port. Finally, I have a receive shape which follows the “Instance ID” correlation set (thus defining this orchestration as a convoy).

Next, we have the orchestration that spins up for EVERY inbound message. First, it receives a message from a port bound to a file receive location. Next, within an Expression shape, I call out to my “round robin” helper component which gives me the next instance ID in the sequence.

ThrottledInstanceId = BizTalkPattern.OrchHelper.RoundRobinHelper.GetNext();

I then make sure to create a new message with both the “Correlation ID” and “Instance ID” context properties set.

    // create message copy
    Metadata_Output = Metadata_Input;

    // set convoy instance ID
    Metadata_Output(BizTalkPattern.BizTalkBits.InstanceId) = ThrottledInstanceId.ToString();

    // set unique ID using orchestration instance identifier
    Metadata_Output(BizTalkPattern.BizTalkBits.CorrelationID) =
        BizTalkPattern.BizTalkBits.ProcessAllMetadataFiles(Microsoft.XLANGs.BaseTypes.InstanceId);

Finally, I send this message out (via direct bound send port) and wait for the acknowledgement back.

So what I have now is a (very basic) load balancing solution where many inbound messages flow through a narrowed pipe to the destination. The round robin helper component keeps things relatively evenly split between the convoy orchestrations, and I’m not stuck using a singleton that grinds all parallel processing to a halt. Running a few messages through this solution yields the following trace …

If I look in the BizTalk Administration Console, I now have three orchestrations running at all times, since I set up a maximum of three convoys.  Neat.  Thanks to Scott for identifying this pattern.

Any other patterns for this sort of thing that people like?

Author: Richard Seroter

Richard Seroter is currently the Chief Evangelist at Google Cloud and leads the Developer Relations program. He’s also an instructor at Pluralsight, a frequent public speaker, the author of multiple books on software design and development, and a former InfoQ.com editor plus former 12-time Microsoft MVP for cloud. As Chief Evangelist at Google Cloud, Richard leads the team of developer advocates, developer engineers, outbound product managers, and technical writers who ensure that people find, use, and enjoy Google Cloud. Richard maintains a regularly updated blog on topics of architecture and solution design and can be found on Twitter as @rseroter.

36 thoughts

  1. Have a look at my blogpost here :

    http://bloggingabout.net/blogs/wellink/archive/2007/02/12/limit-the-number-of-instances-of-any-biztalk-service.aspx

    or source at :

    http://www.codeplex.com/InstanceController/Release/ProjectReleases.aspx?ReleaseId=1863

    It’s my implementation of an “Instance controller”. See if it will work for your scenario…… I think it does the same.

    There is some work in there to prevent a suspended outbound message from effectively stopping the entire convoy…

  2. Hi Richard,

    As usual another good post, just like to comment on one thing about your design though.

    If you had this running in a multi-server environment, then your singleton-style implementation for generating the round robin instance ID would mean that potentially all servers will be counting from 1 each time. This means that if you had 6 servers all processing a similar load, the likelihood is they could all pipe their messages to the same throttled orchestration instance.

    As result you will be throttling but you could be well below your optimal performance.

    It might be worth considering using a random number between 1 and your max number of instances where you will probably get a more even distribution of messages across the throttled orchestration instances in both single and multi server environments

    Another benefit is that it would allow you to remove the locking required in your helper class which, as you rightly point out, is the type of code that is a little higher risk if not implemented quite correctly, or if changed by a developer who hasn’t considered the multi-threading aspects.

    Hope this is useful
    Mike

  3. Hey there Mike,

    Good suggestions. Scott mentioned the multi-server idea in his article, and you’re right, I’d have to pick a “max instance” which was acceptable if BOTH servers sent that many messages. Your concept of a random number (up to the max) is a pretty good idea.

  4. Well I am not really sure how much they differ.

    The instance controller is a deployed orchestration that can limit every outgoing message. It doesn’t know what it’s limiting.

    So it’s a deploy once limit all solution.

    For sure, in the implementation of your orchestration you have to wrap stuff up and send it to the controller, so you have to do a little work for that.

    If you could send me your solution I could have a look at it. I am very eager to find the best solution to my problem of a year ago.

    The problem was we had to talk to Axapta via a webservice. And only 10 licences were bought. So we could have 50 Services hosted at Axapta and only 10 connections.

    Then batches were spun off and we had to keep the maximum number of connections to about 8 (Keep some open for the interactive users).

    So we had to limit outgoing instances over all those services.
    Complicated stuff altogether…

  5. Patrick

    Did you experiment with the solution of controlling this with the max connections element in the configuration file. If you configured each server in your group to only be allowed 5 connections to the load balanced address of the server hosting your Axapta web services and then configured the BizTalk host which your send ports run in to only run on 2 servers you should not be able to make more than 10 concurrent connections and still have fail over.

    This should also require no coding changes

    The below link discusses it a bit.
    http://msdn.microsoft.com/en-us/library/aa561380.aspx

    If you have tried it would be interested to know how you got on

    HTH
    Mike

  6. Hey Tommy,

    Unfortunately this was written for (and branded with) my company in mind. Hopefully you’d be able to recreate the solution using the steps in the post. If you run into any hurdles, just let me know.

  7. The random number generator does sound good. But if you stick with the round robin, don’t forget your friends in the Interlocked class. This is much faster than locking, and easier for junior programmers to maintain.

    System.Threading.Interlocked.Increment(ref roundRobinSelection); System.Threading.Interlocked.CompareExchange(ref roundRobinSelection, 0, maxInstances);

  8. Any thoughts on querying the biztalk database directly to find out the active count of running orchestrations for a given app?

    It would seem like polling + querying the database and then redirecting the message to another orchestration would be a fairly simple solution.

    Any thoughts on downsides?

    1. It would depend on load, I guess. If this was happening frequently, you could be putting excessive burden on a transactional DB. Plus, depending on how fast an orchestration runs, those numbers could get stale very quickly.

  9. How exactly do you get the response back from the DocumentunSingleProcess orchestration? I tried to reproduce this with my code sample and got stuck on the response part. I’m all fine up to the throttling part, but how do I send a response to the exact calling orchestration?

    It would be great if you could show me exactly how you configured the logical send and receive ports.

    Karthick G.
    karthickg@gmail.com

    1. Got it: the Send shape needs to be placed in a Scope inside the loop in the DocumentunSingleProcess orchestration (throttled singleton orchestration).

      Good Work!!!

      Cheers,
      Karthick G.

  10. Hi Richard,

    Great post. This solves a dilemma that I am currently facing, where a BizTalk application is overloading a downstream service.

    I just wanted to give you a heads up that the images seem to be mis-linked. You documented the process really well, and I get the gist of things from your writing and code examples. However, it would be even more helpful if you could fix the broken links so that the images display properly.

    Thanks!

  11. Hi Richard,

    I have to implement a synchronous call to a Windows service, and I can make a maximum of 5-10 connections. It’s a kind of load balancing across all sessions. The scenario is: poll messages from SQL and pass them to the service over five sessions. Will this pattern work for that or not?

    Please share me your id also

  12. Hi Richard,

    Nice logic. I am trying to implement a sample based on the steps provided. I am using file inbound & outbound ports and implemented two Orchestrations. Correlation Set is correctly defined on InstanceID and initialized and followed in receive shapes in Orchestration2. When I drop one xml file in inbound port. I am getting errors.

    Under suspended service instances, For 1 Inbound message, I have 1 resumable message & 3 nonresumable messages.

    1 Resumable message has the following error:
    Uncaught exception (see the ‘inner exception’ below) has suspended an instance of service ‘SequentialConvoys.Orchestration1(6bb453fd-ea1e-4dd7-d139-ed8a7a31b6d8)’.
    The service instance will remain suspended until administratively resumed or terminated.
    If resumed the instance will continue from its last persisted state and may re-throw the same unexpected exception.
    InstanceId: 4178b6ba-f25b-46de-b324-25ecdff1250d
    Shape name: Send_1
    ShapeId: 732c57fa-1465-4454-a652-43c34ff1b5e6
    Exception thrown from: segment 1, progress 17
    Inner exception: Exception occurred when persisting state to the database.

    Exception type: PersistenceException
    Source: Microsoft.XLANGs.BizTalk.Engine
    Target Site: Void Commit()
    The following is a stack trace that identifies the location where the exception occured

    at Microsoft.BizTalk.XLANGs.BTXEngine.BTXXlangStore.Commit()
    at Microsoft.XLANGs.Core.Service.Persist(Boolean dehydrate, Context ctx, Boolean idleRequired, Boolean finalPersist, Boolean bypassComm

    3 NonResumable messages has the following error:
    Routing Failure Report for “Routing Failure Report for “SequentialConvoys.Orchestration1, SequentialConvoys, Version=1.0.0.0, Culture=neutral, PublicKeyToken=509738c310aa76ef””

    What is wrong with the implementation ? Any ideas ?

    Thanks
    Uday

    1. Hi Uday ,

      When you set the Instance ID in Orchestration 1, it needs to be available as a promoted property, because that is what Orchestration 2 uses to pick up the message. I think you are missing that step. Check your suspended message to see whether it shows up there as promoted or not.

      I hope you understand it .

      1. Hi Shekhar,

        Both properties are set to MessageContextPropertyBase.I have verified them during my debugging. They are not promoted. I believe custom pipeline is required to promote them, which is not outlined in this article or any other way to promote them ?

        Thanks
        Uday

  13. Hi Richard
    Thanks for writing this instructive blog.
    We have tried to use this pattern in BizTalk 2013 to similarly protect a slow downstream web service from being flooded with requests. However we kept getting errors, and found this post from Vincent Scheel
    http://bloggingabout.net/blogs/vincent/archive/2013/12/05/error-when-publishing-a-message-from-a-convoy-in-biztalk-server-2013.aspx?CommentPosted=true#commentmessage
    We seem to be getting the same error as Vincent.
    I know your post is quite old now, but do you have any insights as to why we (and Vincent) are getting this error?
    Thanks
    Julian

      1. Hi Chandra – thanks for responding.

        We have a main orchestration and the convoy orchestration. Our problem seems to be the communication between them.
        We have tried two main approaches, which fail with different error messages;

        a) If the main orchestration sends the message with an associated property schema (that holds a unique correlation ID, and the instance ID of the target convoy) through a send port that is direct bound to the message box….then we get a runtime error stating that the message has been suspended because nothing is subscribing to it [even though our convoy receive port is also direct bound to the message box]
        > Error text: The published message could not be routed because no subscribers were found.

        b) If we create a correlation set, with a correlation type that contains the two IDs, and we initialize it in the main orchestration send port (which we understand causes those two IDs to really be promoted so that routing can occur) then the convoy receives the message successfully. The convoy logic proceeds until it gets to its acknowledgement send shape/port. Here it sends a response message that uses the same correlation type as the main orchestrations send, but again we get an error saying that it is suspended due to no subscribers OR if we initialize the correlation set we get an error saying that the correlation set can only be initialized once.
        > Error text: The published message could not be routed because no subscribers were found.
        or
        > Error text: a correlation may be initialized only once

        So:
        Should we use a correlation set/type in the communication?
        If yes, how do we make both the send and response work.
        If no, how do we make the main orchestration’s send have promoted properties so the routing to the convoy works?

        Thanks

        1. We just figured it out….
          Using approach ( b ) above the send shape in the convoy must be in its own Scope (this is visible in Richard’s orchestration but not explicitly mentioned), that way it can define its own correlation set using the same correlation type that was used in the main orchestration’s send shape, and can initialize it, thus it promotes its properties and can be successfully routed back to the main orchestration.
          Hope this helps someone else suffering from the same problem.
          cheers

      2. Hi Chandra,
        We are also trying to implement this on our biztalk2013 env, is it possible for you to send the sourcecode for reference .

  14. had a quick look at the sample and as a guess i reckon it might relate to the CorrelationType_IncomingSecondMessage correlation type which has a property called FieldUnused which looks like its feasible that the property could never be set which would sound plausible that this would then cause a problem inserting null into that table.

    Wonder if it might be as simple as just taking that out.

    Not sure why that wouldnt work after a migration but im sure there were some convoy changes which this might relate to. Maybe it used to just ignore null properties?

  15. I implemented this the same way Richard Seroter explained. Everything works as expected on the DEV machine, but in QA (multiple servers) it gives a strange error like the one below.

    xlang/s engine event log entry: Uncaught exception (see the ‘inner exception’ below) has suspended an instance of service ‘OrchController(14d6d73a-ba7d-c245-929f-a9041d0a1293)’.
    The service instance will remain suspended until administratively resumed or terminated.
    If resumed the instance will continue from its last persisted state and may re-throw the same unexpected exception.
    InstanceId: 09443659-b90f-4b65-8c8c-7d152517e634
    Shape name: Snd_workerReq
    ShapeId: 00ed2b8b-dced-4bf0-bd3d-cae86b25185f
    Exception thrown from: segment 1, progress 19
    Inner exception: Exception occurred when persisting state to the database.

    Exception type: PersistenceException
    Source: Microsoft.XLANGs.BizTalk.Engine
    Target Site: Void Commit()
    The following is a stack trace that identifies the location where the exception occured

    at Microsoft.BizTalk.XLANGs.BTXEngine.BTXXlangStore.Commit()
    at Microsoft.XLANGs.Core.Service.Persist(Boolean dehydrate, Context ctx, Boolean idleRequired, Boolean finalPersist, Boolean bypassCommit, Boolean terminate)
    at Microsoft.XLANGs.Core.LongRunningTransaction.PendingCommit(Boolean ignore, XMessage msg)
    at Microsoft.BizTalk.XLANGs.BTXEngine.BTXPortBase.SendMessage(Int32 iOperation, XLANGMessage msg, Correlation[] initCorrelations, Correlation[] followCorrelations, Context cxt, Segment seg, ActivityFlags flags)
    at SBC.BizTalk.Contact.Micros.Orchestrations.RelateCRMCustomerUpdateController.segment1(StopConditions stopOn)
    at Microsoft.XLANGs.Core.SegmentScheduler.RunASegment(Segment s, StopConditions stopCond, Exception& exp)
    Additional error information:

    A batch item failed persistence Item-ID 71503ce1-3e29-4beb-9e74-65524bb8892e OperationType MAIO_CommitBatch Status -1061151998 ErrorInfo The published message could not be routed because no subscribers were found. .

    Exception type: PersistenceItemException
    Additional error information:

    Failed to publish (send) a message in the batch. This is usually because there is no one expecting to receive this message. The error was The published message could not be routed because no subscribers were found. with status -1061151998.

    Exception type: PublishMessageException

    Is there any solution for this issue.

  16. Hi Richard,

    With this pattern on a long run, the orchestration holds on to the processed messages with status as “Consumed”. This leads to Orchestration Throttling, slow process and eventual freeze of Orchestration.

    There seems to be 2 solutions,

    1. Restart the Host Instance
    2. Wait until all the “In Process” messages linked to the orchestrations are processed and terminate it to release the Active messages. This will restart the convoy process again.

    Is there a way to release the messages with “Consumed” status automatically after they are process and not to always restart ?

    or

    Exit the receive in the loop to end the Orchestration release the resources ?

    Can anyone please help ? This is a production issue and need to find a solution urgently.

    Thanks and Regards,
    Karthick

    1. I agree with the point here; these endless orchs will eventually lead to database throttling. Additionally, because the convoy pattern is used, instances will not be evenly load balanced across all the runtime servers in a multi-server environment. The first issue can be resolved by moving the range of values returned by the RoundRobin function over time, and adding terminate logic to the orch after a timeout long enough to guarantee that no more messages will be routed to that instance.
