Recently, when calling web services from the BizTalk environment, we were seeing intermittent instances of the “WebException: The request was aborted: The request was canceled” error message in the application Event Log. This occurred mostly under heavy load situations, but could be duplicated even with fairly small load.
If you search online for this exception, you’ll often see folks just say to “turn off keep-alives” which we all agreed was a cheap way to solve a issue while introducing performance problems. To dig further into why this connection to the WebLogic server was getting dropped, we actually began listening in on protocol communication using Wireshark. I started going bleary-eyed looking for ACK and FIN and everything else, so I went and applied .NET tracing to the BizTalk configuration file (btsntsvc.exe.config). The BizTalk documentation shows you how to set up System.Net logging in BizTalk (for more information on the settings, you can read this).
My config is slightly different, but looks like this …
<system.diagnostics> <sources> <source name="System.Net"> <listeners> <add name="System.Net"/> </listeners> </source> <source name="System.Net.Sockets"> <listeners> <add name="System.Net"/> </listeners> </source> <source name="System.Net.Cache"> <listeners> <add name="System.Net"/> </listeners> </source> </sources> <switches> <add name="System.Net" value="Verbose" /> <add name="System.Net.Sockets" value="Error" /> <add name="System.Net.Cache" value="Verbose" /> </switches> <sharedListeners> <add name="System.Net" type="System.Diagnostics.TextWriterTraceListener" initializeData="c:\BizTalkTrace.log" /> </sharedListeners> <trace autoflush="true" /> </system.diagnostics>
What this yielded was a great log file containing a more readable format than reading straight network communication. Specifically, when the request was canceled message showed up in the Event Log, I jumped to the .NET trace log and found this message …
A connection that was expected to be kept alive was closed by the server.
So indeed, keep-alives were causing a problem. Much more useful than the message that pops up in the Event Log. When we did turn keep-alives off on the destination WebLogic server (just as a test), the problem went away. But, that couldn’t be our final solution. We finally discovered that the keep-alive timeouts were different on the .NET box (BizTalk) and the WebLogic server. WebLogic had a keep-alive of 30 seconds, while it appears that the .NET framework uses the same value for keep-alives as for service timeouts (e.g. 90 seconds). The Windows box was attempting to reuse a connection while the WebLogic box had disconnected it. So, we modified the WebLogic application by synchronizing the keep-alive timeout with the Windows/BizTalk box, and the problem went away complete.
Now, we still get the responsible network connection pooling of keep-alives, while still solving the problem of canceled web requests.
Technorati Tags: BizTalk