Debatching Inbound Messages From BizTalk SQL Adapter

A buddy of mine asked me this morning how to do debatching with the SQL Adapter.  While I fully understand XML and flat file debatching, the SQL Adapter uses a generated XSD schema, and I wasn’t 110% sure of the best way to handle that.  So, as usual, I figured I’d build it and see what happened.

[04/08/2010 Update: I’ve done a new post showing how to do this with the new WCF-SQL adapter.  You can read that here.]

So let’s start with a database table and stored procedure.  I created a simple “Customers” table and a procedure that grabs every customer flagged as “New” and then sets those values to “Existing” after pulling them.

Next, I constructed a BizTalk project, and did an Add –> Generated Items and chose to build a schema from an adapter.  After picking the SQL adapter, I chose to use the stored proc built above.  The auto-generated schema then looked like this …

Make sure you go back afterwards and remove the XMLDATA clause since it’s only used when you need to generate the schema.  Next I built and deployed the project.  Finally, I set up receive and send ports.  The send port simply has a filter subscription pointing to BTS.ReceivePortName.  The Receive Location uses the XML Receive pipeline and the SQL adapter, configured as such …

Remember that the out-of-the-box XML Receive pipeline will do the debatching for you if the schemas are set up right.  If you use the Passthrough pipeline, nothing’s going to happen.  So what happens when I enable the Receive Location and turn on the Send Port?  I get a single message, holding all three records pulled.  That’s the default behavior here.

So now I went back to my schema to convert it to a recognized “envelope” schema.  You do this by setting the Envelope property to “Yes”, and setting the Body XPath on the root node.  In my case, the Body XPath should point to the root, since we want everything under it (the TempCust node instances) to be yanked off.  I also set the Max Occurs on the TempCust node to 1.

Now after deploying this updated project and resetting the database table, what do you expect will happen?  If you said “you’ll get some beat error message” then you win.

See what happened there?  Each message got debatched, but when trying to find a schema for the TempCust message type, BizTalk failed since no such schema exists.  We only have a schema for the NewCustomers type.

So how do we fix that?  Easy, create a schema for the TempCust body message.  The trick is to not create any more work for ourselves than we have to.  So, I created a brand new schema, and chose the Imports option.  Here I pointed to the “Envelope” schema we created above.

Now I can reuse the previous schema without manually re-creating the TempCust format.  After importing, I pointed to the root node of my new schema and set its Data Structure Type property to the TempCustType option in the drop down list.  Immediately, the type gets loaded into my new schema.  I changed the root node name to “TempCust” and set the Root Reference of the schema to the “TempCust” node (since we now have a multi-root schema).  Now, when the BizTalk engine debatches the NewCustomers message and is looking for a schema that corresponds to the TempCust message, we’ve got one.

Nice!  Now if I deploy, and reset my database, I see three individual messages get sent out of BizTalk, one for each row in the database table.  This model works well because if any changes are made to the auto-generated schema, my “SingleCustomer” message also gets updated.  I don’t have to keep two separate (but related) schemas manually in sync.

Also note that now you’ll want to be binding to the http://%5Bnamespace%5D#TempCust type, not the original schema generated by the SQL adapter.  So an orchestration message would be of the above type, not the envelope.  Or if you have a send port listening for message types, the http://%5Bnamespace%5D#TempCust is the type that matters, since the http://%5Bnamespace%5D#NewCustomers format no longer exists after the pipeline debatches the original message into the resulting individual messages.

There you go.  Any other ways you folks handle this sort of thing?

Technorati Tags:

Author: Richard Seroter

Richard Seroter is currently the Chief Evangelist at Google Cloud and leads the Developer Relations program. He’s also an instructor at Pluralsight, a frequent public speaker, the author of multiple books on software design and development, and a former InfoQ.com editor plus former 12-time Microsoft MVP for cloud. As Chief Evangelist at Google Cloud, Richard leads the team of developer advocates, developer engineers, outbound product managers, and technical writers who ensure that people find, use, and enjoy Google Cloud. Richard maintains a regularly updated blog on topics of architecture and solution design and can be found on Twitter as @rseroter.

65 thoughts

  1. Hi Richard… I actually have a comment about your article entitled “BizTalk Aggregation Pattern for Large Batches” that was on your previous Blog.

    My questions are…

    1) when you promote the fields, are you promoting to a property field or a distinguished field (and does it matter and why)?

    2) could you please send be a sample of what the input document actually looks like… I’m doing something very similar, but I’m having issues on how to create the Envelope and document schema for my input document…

    Hope your new job is going well!

    Thanks in advance, Brian

  2. Hi there Brian,

    I’m promoting to a property field. A distinguished field is pretty much a pointer to a spot in the message. If a value exists when it passes through the pipeline, that value is injected into context. If not, when you reference that distinguished field in the orchestration it uses the XPath to try and find it again.

    A promoted field is actually used when the message arrives to find subscriptions. All promoted values are processed to see if any subscriptions match the message. So, there’s a performance and storage consideration when using promoted fields. That’s why distinguished fields are “cheaper”.

    However, to do routing, the value must be promoted so that it’s visible to the engine. BizTalk doesn’t “see” distinguished fields when considering how to route a message.

    As for a sample, I don’t have my “demo” VM on this new laptop (silly me!), but this post shows you the messages and schema options … http://weblogs.asp.net/jan/archive/2004/03/07/85259.aspx.

    Good luck!

  3. Hi richard,

    I discover that you can simply split the XML file (even through SQL adapter) by simply put the node max min both = 1.

    No envelope /no “extra virtual” body schema is needed.

    Please test it, it should work.

  4. More, sorry I forgot one thing to make this work, discover this a long time ago but never posted it:
    1. set BOTH Min an dMax Occurs on the TempCust node to 1.

    2. Make a receive pipeline of to receive the SQL xml file in, at the Validator stage, pick the schema to validate.

    2 steps above make Biztalk split the file without Envelope/ “body” schema complicated way:)!!!!

  5. Another correction
    You have to receive the xml as normal
    then send to another send port

    . Make a send pipeline of to send the SQL xml file , at the Validator stage, pick the schema to validate. Then this send port will send out the splitted file.

  6. Sorry to clutter your blog but this is another way:
    1. set BOTH Min an dMax Occurs on the TempCust node to 1.

    2. Make a receive pipeline of to receive the SQL xml file in, at the XML DISASSEMBLE stage, pick the schema to validate.

  7. Hung,

    Could you please explain your process above in a little bit more detail? In your step 2 you say pick the schema to vaidate in your disassemble stage- could you explain this part more, like which schema etc? Would be a great help to me….

  8. Hey Hung,

    It’s not “clutter” if it adds value 😉 Thanks for the comments. I’ll have to try your scenario(s) out. One big value of using XML envelopes, in general, is the ability to take “non-body” content and include it with each “body” that gets disassembled. In the example on this post, there is nothing else but the “body” so that wouldn’t apply. That said, to Chris’ comment, I’m not sure which schema you’d use to disassemble. If you only use the schema from the SQL adapter, you have no schema for each split message and therefore couldn’t have subscribers listening on the split “type.”

  9. Sorry guys,
    I had it done differently and never had time to recall.
    This maybe more complex than Richard normal way.
    Here it is:
    1. The file schema has to be flat.
    2. set BOTH Min an dMax Occurs on the TempCust node to 1.
    3. On the flat file receive disassembling stage, put flat file disassembler and pick the schema.

    Step 2 is what you need to do because SQL adapter is XML file.
    I used to clone the XML schema to another schema, rename the namespace for the new schema, make the new schema flat file

    So you receive xml through SQL, immediately map that to to flat file(which is a “clone” beside the fact that it has flat file extenstion and a different namespace. This can be easily done through an inbound port map(no need for orchestration).
    Then send that flat file to a folder.
    Then receive that flat file back, on the receive pipepline, the flat file disassembler will split the file based on MAX and MIN = 1.
    No envelope and body schema is needed.

    Richard, you are very good on the point of “non-body” content.
    That is called debatching while preserving context:). I use another technique when I need that context. Usually I need more than just the parent node context.
    Here is a challenge for you, I solved it through another way.
    How do you preserve context using debatching when the context is huge and many nested level (Ancestor context). Try to split and preserve A and B.
    Example:


    Data1

    Data2



    Data3

    Data4


    How to SPLIT that into:


    Data1


    and


    Data2



    and


    Data3



    and


    Data4


  10. Hung,

    I believe you are missing the point. The purpose of this post is to show how to debatch a dataset coming in from a SQL receive port and NOT a flat file.

    Unless I am missing something…

    .kev

  11. Richard,
    Interesting note…

    It seems that I must 1) set the Envelope 2) set the Body XPath 3) save and exit the .XSD file 4) compile the project 5) open the .xsd

    It is only then that I can add the reference and document element back into the same document.

    Not that clear.

  12. What I meant is that Microsoft programmer split the flat file on Min and Max = 1 inside the flat file disassembler.
    They should do the same thing with XML schema.

    I found this out by accident.

  13. This isn’t working for me. I imported my envelope schema but I cannot select it for my root node like you did above. At best I can type it in, but I don’t get the greyed (imported) elements/nodes in it.

  14. Hi Richard,

    nice article.I have followed the steps for this process,but the xml file is not debatching,still getting the envelope at the output. Any idea’s, do you have the sample code.

    Many thanks

    Paul

  15. Hi Richard,

    pleaseignore my previous posting.I got it working, my target namespaces did not match.

    Nice solution 🙂

    Paul

  16. This isn’t working for me either. I’m new to biztalk so might need a little hand holding

    1) How do you import a schema of an XML file?
    Basically I am doing it the otherway around… Reading an XML file and populating a SQL DB

    2) Is it ok if I just manually created another schema file.

    3) I don’t get the “Document Structure Type” populated other than an xs:anyType. Should I type this in or select? I tried closing the xsd. rebuilding and opening it again but still did not work.

    Regards,
    Rohit

  17. Hey Rohit,

    You import a schema by clicking the topmost node in the schema, and on the properties window, selecting “import”. You do a “include” if the schema is in the same namespace as the one you’re adding, and an “import” if they are different namespaces.

    You don’t want to manually create the SQL schema. Walk through the “add generated items” wizard.

  18. Great walk-through! I do have a question though.

    Is it necessary to return the XML from the database as ELEMENTS, or can you remove that part and simply return attributes? When I tried w/o the ELEMENTS option, I could do everything except change the Group Min/Max on my “TempCust” node. Even if I went in and edited the schema to include it didn’t work. If I ever returned more than one record from the database, it would throw the following error: Unexpected event (“document_start”) in state “processing_empty_document”.

    I was finally able to make it work, but it required the data to be returned as ELEMENTS. Any thoughts???

    TIA – Brian

  19. Hmm. I know that the “ELEMENTS” portion is truly optional, but I got into that habit with this adapter years ago, so I don’t recall the limitations introduced by focusing on attributes instead.

    Any harm in sticking with elements?

  20. Great tutorial, thanks a lot!

    Just a note on little detail that got me crazy. I was getting error message “Unexpected event (“document_start”) in state “processing_empty_document” and it turned out that there’s a bug in BizTalk explained here:

    http://support.microsoft.com/default.aspx/kb/927741

    When I added ELEMENTS to my SQL statement, everything worked as expected.

    thnx again and best regards!

  21. Hi Richard

    Another quick question, lets say if i have schema something like this

    abc

    xyz

    and i want something like this after debatching

    abc
    123

    and

    xyz
    123

    Idea here is that there is something at the envelope level which i wanted at the document level, it doesn’t matter wherever at document level whether attribute of element.
    Any ideas how this can be achieved?

    Thanks in advance
    Arvind

  22. It seems like this post doesn’t shows schemas. Actually there was an attribute named Processid=123 that i wanted to show which is at envelope level and wanted at document level after debatching.

  23. Hey Arvind,

    If you want to push a value from the “envelope” to the debatched body nodes, promote both the envelope value, and the body value and point them to the same property schema node. When debatching, the value will be copied from the envelope to the body for each body that gets debatched.

  24. Thanks Richard but it doesn’t work, i promoted it both at envelope and document schema and pointing to the same property schema but after debatching values from envelope doesn’t gets copied over even though messages are getting bebatched successfully.

  25. Hey Richard Seroter

    hi i want to know how we get original file name processed by biztalkserver,when we process file biztalk generates its own messageid with file name but i want the original file name

    how to do that

    Thanks

  26. If you’re talking about receiving a file (since there is no “original file name” for the SQL input), then you can use a context property such as FILE.ReceivedFileName to extract the name of the file picked up by the adapter.

  27. Hey Richard Seroter

    hi thanks my problem is that i am sending file from one location to another location file is transmitting but the name of file changes when it goes from one place to another place ;I wants that file from which extension it was recieved that extension AND Name of file will same when sending to from one location to another location ;

    You says that you can use a context property such as FILE.ReceivedFileName to extract the name of the file picked up by the adapter but how i will bind that context property at send port

  28. Dear Richard
    I had take file adapter for to recieve file From One Location ,The File May Be any Type like *.xml,*.pdf,*.csv,*.doc ;
    i am sending that file Through File adapter to another location on server;
    Now can you tell me that how the name of file will preserver
    and how can i send multiple extension(*.xml,*.pdf,*.csv,*.doc ) file from one location to another without changing Name of file,Because when i send one file then it changes the name of file with messegeid genereated by BizTalk;
    Please Reply ME with solution on My this id

    1.saurabh.s@gmail.com

  29. Wow, very off-topic for the post itself, but if you’re looking to keep the file name, you’d want to use either an orchestration or send pipeline to extract the FILE.ReceivedFileName and set that value for a dynamic (FILE) send port. That would keep the name of the received file for the sent file. For doing this without orchestration, see Saravana’s post (http://www.digitaldeposit.net/blog/2007/04/can-you-use-dynamic-send-port-without.html).

  30. Hi Richard,

    I am not able to debatch the files. The output is same as the original (whole)SQL-output from the adapter. From list of comments on this blog, I found that it works if the original xsd and the debatched-xsd have the same namespace. But I am not able to modify the namespace of the debatched-xsd, it throws an error ‘Namespace attribute of an import staement must not match the real value of the enclosing targetNamespace of the ‘.
    Could you please help on this?

  31. It worked for me only when I created a new debatched-xsd and then assigned the whole-xsd’s namespace to the debatched-xsd’s targetNamespace.
    Importing the schema would be very convenient option to have but can anyone help on the above mentioned error (that I face while achieving this)?

  32. Hey Karuna,

    So you made sure that your secondary schema (representing the debatched message) had the same namespace as the auto-generated envelope schema, and THAT works? But using different namespaces did not. Which makes sense, since the pipeline breaks the message apart, and looks for the same namespace as the envelope when matching the single-record schema.

  33. Hello Richard,

    Sorry for the delayed reply.
    My problem is actually lies with importing the debatched schema from the envelope schema. While doing an import, I am not able to maintain the SAME namespace across my schemas. As soon as I import the envelope schema into my split-record schema, the split-record schema automatically gets a new targetNamespace (different from the envelope’s targetNamespace). When I try to manually modify this namespace, I face the error explained in my first message.
    To reiterate, this works only after I create a brand-new split-record schema(without any imports). I would like to know if we have a work around for this problem?

  34. hey richard
    i want to read pdf,.csv,.txt file through fileadapter and on the basis of file content i have to update my sqldatabase
    how is it possible;the arcitecture of the file is not fixed
    that is varrying;
    can u send me how to do it

    thanks

  35. well… i would like to ask simple question: does anyone try to promote properties in base (document, not envelope) xml schema and do content based routing?
    problem is that i made filter expresion on active receive shape in orchestration, based on one promoted property. so after xml disaassembler debatch XML message, received from SQL server, i get as many suspended services with routing error messages, as is child records in XML message. and error is about, that BT instances didn’t find any active subscriptions. but if i look to suspended massage, i can see this property, which isn’t promoted…
    so, does anyone has an idea how to solve this stupid problem?

  36. Hey saurabh,

    Sounds like you need to route the message based on the file name suffix. So, you could receive all content from this receive port into an orchestration (using XmlDocument as the message type so that ANY content can come in), and then yank out the FILE.ReceivedFileName, parse the string to extract the suffix, set a custom property schema value indicating the file type, and sending the message out for further processing.

  37. Hi, iam tring to this in a Oracle database. I use a table change event to poll the data from the database. I have information about several custumers and i need to do split them. I tried your solution, but i always i have that error “…verify the schema deployed property.”

    Can you please tell me in more detail i can i do this in a oracle database.

    Thanks.

  38. mclean,

    So you have a promoted property on a regular (XML) message, and use that value in the orchestration subscription? Or are we still talking about adding the promoted value to an envelope schema? If so, that value doesn’t automatically get pushed down to each debatched message. Hence “subscription not found” errors. If you have (a) the field in the envelope “body” and (b) field in the envelope header that both map to the same property schema node, THEN, the value gets pushed down during debatching.

  39. malmeida,

    If you’re doing debatching, you have to make sure you’ve built and deployed a separate schema that reflects the debatched body node. Otherwise when debatching occurs, there’s no schema to match the “new” message.

  40. Richard,

    actually i tried ‘almost’ everything… 🙂
    one of tries, that i promoted same element in both, envelope and body, schemas to the same PropertySchema node. in orchestration i bind only to body schema. one thing is that in body schema is one root node with three elements: one is promoted and all are distinguished. this mean, that one element is promoted and distinguished in the same time. but i doubt this should do any impact to this problem…
    another thing, that when i get suspended services with “routing failure report…”, the message in this suspended service don’t have ‘message parts’, i can see only ‘context’, in wich i can see these body message elements as context elements.

    and i would like to clear some definitions: what schould meant ‘envelope body’ and ‘envelope header’? i guess that ‘envelope body’ is what whill be extracted from original message as a single message. in another words, ‘enevelope body’ is a ‘body schema’. but can’t understand what is ‘envelope header’…

  41. mclean,

    I’m sure you’ve tried lots of configurations. You’ve probably already seen them, but make sure you review some of the envelope examples out there (http://weblogs.asp.net/jan/archive/2004/03/07/85259.aspx, http://msdn2.microsoft.com/en-us/library/aa546772.aspx).

    Having a field as both distinguished and promoted is fine, no problem there. When I say “header”, I’m thinking of something like this (http://msdn2.microsoft.com/en-us/library/ms944739.aspx). The header being the envelope-level data, while the “body” is the repeating structure that you want to process individually.

  42. Hey Richard

    I have to send data from one table to another table using biztalk server;
    The Problem is that in which table does i am sending data there are some data already present so during insertion it is giving primary key voilation error;
    when i applied debatching one by one then its working fine data is inserting in that table but when i am sending that records through batch mode then if any data into that table already exists it revert back complete batch operation;
    how to do that bulk insert operation should not revert back and the record which one is causing the problem that should be identified;
    I Want to do this with biztalk is there any excetion handler which send data and without stopping complete bulk insert operation
    thanksss
    saurabh k

  43. Richard,

    can i ask one example, how to fulfil both conditions (in your post 5136, (a) and (b)).
    I get from SQL adapter something like tihs:

    1
    2

    And i need to promote field2. So how should look mesage, which would fulfil conditions…
    Body xml “…” i form in SQL server…

  44. hm…

    Example, what i get:
    <Requests xmlns=”space1″>
    <ns0:Request xmlns:ns0=”space1″ xmlns=””>
    <ns0:field1>1</ns0:field1>
    <ns0:field2>2</ns0:field2>
    </ns0:Request>

    <Requests>

    And body:
    <ns0:Request xmlns:ns0=”space1″ xmlns=””> … </ns0:Request>

  45. Hey Richard

    I Have to extract pdf file content using biztalk server how is it possible to do this ;
    please help me

    thanks
    saurabh kumar singh

  46. Hi Richard,

    Messages are being debatched, and this is the resulting XML:

    1
    Duck Dodgers
    2008-02-24T00:00:00
    Atlanta
    GA
    12345785894
    New

    Notice the message does not have the ns0: prefix in it (not qualified).

    I created a seperate schema instead of reference/include, and the difference is that I used elements instead of records for the fields.

    So when I attempt to validate instance with the TempCust schema it fails with this message:

    The element ‘TempCust’ in namespace ‘http://SQLDebatch’ has invalid child element ‘CustId’ in namespace ‘http://SQLDebatch’. List of possible elements expected: ‘CustId’

    Basically it’s not finding the CustId element using the default namespace.

    When I created a TempCust schema using Records it parses fine.

    I have the Schema’s Element FormDefault property is set to Unqualified (I tried Qualiied also)

    The only difference is the schema’s fields are of Elements as oppose to Records.

    I’m trying to understand the reason why the document can’t be parsed using the TempCust schema using elments (instead of Records) for fields.

    I would think the default namespace would be inherently applicable for all child nodes.

    Any idea?

    Thanks
    E

  47. Awesome stuff…. i thought you always had to create a pipeline.

    you mean “Include not an “Import” right? when you say the following….

    “So, I created a brand new schema, and chose the Imports option. Here I pointed to the “Envelope” schema we created above”

    Thanks
    -Nen

  48. Hi Richard,

    Yeah, the schema I’m debatching uses elements, and to change would require alot of rework on dependencies.

    Excellent article all the same!!!

    Thanks

  49. Hello Richard,

    I need to debatch a flat file that has messages of different types.
    Eg:
    Header
    Message Type 1
    Message Type 1 – line 1
    Message Type 1 – line 2
    End Message Type 1
    Message Type 2
    Message Type 2 – line 1
    Message Type 2 – line 2
    End Message Type 2
    Trailer

    So far I have seen examples where debatching can be done using flat file disassemblers with one schema definition. Is it possible to achieve the debatching for the above example with in a Receive Pipeline using a flat file disassembler? Any help will be greatly appreciated. Thank you.

  50. Jona,
    Maybe if the flat file has labels for the messages, but I haven’t done that scenario before with a flat file. Doing it with an XML message is a different story.

Comments are closed.