Serviced component code not re-entrant

by acha11 7. October 2005 02:14

after around 30 person-days of effort spent improving the performance and locking characteristics of a particularly horrendous distributed transaction in our .NET, ServicedComponent-based legacy architecture, we encountered this microsoft hotfix, which resolved the issue instantly.

to be precise, this wasn't actually a performance issue. it was a process-wide hang which snared any thread that made a ServicedComponent call after something, somewhere, happened. we're still not sure what the something is, because the microsoft hotfix bug description basically says "the servicedcomponent code we shipped isn't re-entrant. here's an unsupported, available-by-phone-only hotfix". a shitty level of support, imho, given what it cost us to diagnose the problem.

the top portion of the stack trace of the hung threads looks like this:

system.enterpriseservices.thunk.dll!System.EnterpriseServices.Thunk.Proxy.RevokeObject(int cookie) + 0x80 bytes	
system.enterpriseservices.dll!System.EnterpriseServices.ServicedComponentProxy.CleanupQueues(bool bGit) + 0x71 bytes	
system.enterpriseservices.dll!System.EnterpriseServices.ServicedComponentProxyAttribute.CreateInstance(System.Type serverType) + 0x3b bytes	
mscorlib.dll!System.Runtime.Remoting.Activation.ActivationServices.IsCurrentContextOK(System.Type serverType, System.Object[] props, bool bNewObj) + 0x4b bytes	
mscorlib.dll!System.Activator.CreateInstance(System.Type type, bool nonPublic) + 0x43 bytes	
mscorlib.dll!System.Activator.CreateInstance(System.Type type) + 0x8 bytes	  

the key warning sign is the RevokeObject method at the top. from the "CleanupQueues" method (second in the call stack), i'm inferring that the code does opportunistic clean-up each time ServicedComponentProxyAttribute.CreateInstance() is called. something has hosed the queue structures, or something referenced by them, causing RevokeObject to block indefinitely, loop, or otherwise die. looking at the disassembly for RevokeObject, there are some GIT (global interface table) calls going on which are opaque to me, given my level of understanding. i haven't drilled down any further than that.

other ... people ... have ... encountered the same problem. the cost to developers and clients is significant. this is not the right way to manage an issue of this sort, MS.

update [later the same day]: a little more research turns up a DisableAsyncFinalization registry key:

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\COM3\System.EnterpriseServices]
"DisableAsyncFinalization"=dword:00000001

which is suggested as a possible solution here.
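
if you do set the key, it's worth confirming that it actually landed where you think it did. here's a minimal sketch of a check, assuming only the key path and value name shown above. the class name AsyncFinalizationCheck is made up for illustration, and a 32-bit process on 64-bit windows may be redirected to a different registry view, so treat a "not set" result with some suspicion.

```csharp
using System;
using Microsoft.Win32;

// hypothetical helper, not part of any microsoft tooling
public static class AsyncFinalizationCheck
{
    // key path and value name are taken from the registry snippet above
    private const string KeyPath = @"SOFTWARE\Microsoft\COM3\System.EnterpriseServices";
    private const string ValueName = "DisableAsyncFinalization";

    // returns true only when the DWORD value exists and equals 1
    public static bool IsDisabled()
    {
        using (RegistryKey key = Registry.LocalMachine.OpenSubKey(KeyPath))
        {
            if (key == null)
            {
                return false; // key not present at all
            }

            object value = key.GetValue(ValueName);
            return value is int && (int)value == 1; // DWORD values come back as int
        }
    }

    public static void Main()
    {
        Console.WriteLine(ValueName + " set to 1: " + IsDisabled());
    }
}
```

this only reads the value. it says nothing about whether the running process has picked it up.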

an ms kb article exists, titled "FIX: COM+ application that uses the Global Interface Table (GIT) may deadlock". remember that within RevokeObject, there's a GIT-style method call. two excerpts from the article:

  • "If you experience this issue, multiple threads in the process show call stacks that involve access to the Global Interface Table (GIT)."
  • "When you use COM+ components that are written by using managed code, such as Visual C# or Visual Basic .NET and you do not explicitly call the Dispose method on these objects."

so it's possible that our app is just not being rigorous enough about doing its Dispose()s/using()s. more later.
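
for reference, being rigorous about it would look something like the sketch below. OrderWriter and Caller are hypothetical names for illustration, not our actual code, and a real component would also need a strong-named assembly and COM+ registration before it could run. ServicedComponent implements IDisposable, so a using() block releases the COM+ reference deterministically rather than leaving it to the finalizer thread. whether that avoids the hang in our case is still an assumption.

```csharp
using System;
using System.EnterpriseServices;

// hypothetical serviced component, for illustration only
[Transaction(TransactionOption.Required)]
public class OrderWriter : ServicedComponent
{
    [AutoComplete]
    public void Save(int orderId)
    {
        // transactional work would go here
    }
}

public static class Caller
{
    // preferred form: using() calls Dispose() even if Save() throws
    public static void SaveOrder(int orderId)
    {
        using (OrderWriter writer = new OrderWriter())
        {
            writer.Save(orderId);
        }
    }

    // equivalent explicit form, via the static ServicedComponent.DisposeObject
    public static void SaveOrderExplicit(int orderId)
    {
        OrderWriter writer = new OrderWriter();
        try
        {
            writer.Save(orderId);
        }
        finally
        {
            ServicedComponent.DisposeObject(writer);
        }
    }
}
```

the point of both forms is the same: no ServicedComponent reference is left lying around for the garbage collector to clean up at a time of its own choosing.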
