XmlSerializer and CSC.exe hanging under ASP.NET when running as NT AUTHORITY\SYSTEM

by acha11 3. February 2005 02:44

well, i just went through around 5 hours of serious pain. if you're ever working in ASP.NET and experience hangs while calling the XmlSerializer(Type) constructor, use procexp (from sysinternals) to check whether there's a dangling csc.exe sitting underneath your aspnet_wp.exe instance. if so, you might be having the problem i had. check your machine.config to see what account asp.net is configured to run under. try changing back to "Machine" if appropriate - that fixed the problem for me. the reason i was running under a non-machine account? AQTime profiler (which, for other reasons, and non-sarcastically, is my current favourite application) changes asp.net to run under system automatically.
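for reference, the account is set on the processModel element of machine.config. the fragment below is only a sketch of the two values discussed here, assuming a stock .NET 1.1 machine.config. every other attribute on the element is left out, and your file will have more of them.

```xml
<!-- machine.config, inside <system.web>; only the relevant attributes are shown -->

<!-- default: the worker process runs as the low-privilege local ASPNET account -->
<processModel userName="machine" password="AutoGenerate" />

<!-- the setting that triggered the hang: the worker process runs as NT AUTHORITY\SYSTEM -->
<!-- <processModel userName="SYSTEM" password="AutoGenerate" /> -->
```

do an iisreset after changing it so a fresh aspnet_wp.exe picks up the new account.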

other people who had the same problem: one, two.

and here's a journal of my problem solving efforts:

This is a weird one.

I ran a load test with 1 concurrent user once through a PCM use case.

I went away for a while, then did an iisreset, then tried to run the test again. This time, ACT couldn't get past the first request. I stopped ACT and tried making a request using IE, which didn't get anything back (I was prompted for credentials, and tcptrace showed the request after the Negotiate making it up to the server, but nothing came back). I did an iisreset and tried again with IE. Same story. Scratched my head a bit, and reset the box.

Everything was fine. Then, I started having the same problem again. So I hooked up the debugger, and found what was hanging.

It's the line to create the XmlSerializer. Notice also there are several instances of the C# compiler hanging around. These stay around even after I do an iisreset. The new XmlSerializer line gets called about once for every node in the config xml. Something else weird: in four runthroughs separated by iisreset, the XmlSerializer constructor hung on a different type each time.

So the summary is:

- Problem persists through iisresets.

- Problem doesn't persist across box resets.

- Lots of CSCs hanging around, I assume deadlocked on something

- Hangs on a different type each time.

- Magically went away after I'd learnt this much.

More info:

a) This is still happening.

b) Procexp reveals that of the csc.exe instances, one (with the highest PID) was spawned by the current instance of aspnet_wp.exe.

c) Using procexp to kill that csc.exe instance (because taskmanager gives access denied) successfully kills it. Another is spawned immediately, which then goes away, then another arrives, until one hangs there. Keep killing the ones that hang, and eventually you get to the pcm home screen. Oddly, nothing shows up in the logs when I do this, either. Perhaps the XmlSerializer or whatever it is that's wrapping CSC.exe detects process exit and tries again?

d) Looking at one of the hung csc.exe processes shows two threads.

a. One started inside csc.exe, and is in state Wait:WrLpcReply.

b. One started inside RPCRT4.dll (RPC runtime, I guess), and is in state Wait:WrLpcReceive.

All the hung csc.exe processes I've looked at have the same two threads in the same state.

e) Killing all the csc.exes as well as aspnet_wp.exe and inetinfo.exe, which I'd expect to be a thorough restart, doesn't resolve the problem; the first request causes csc.exe to deadlock.

f) Iisreset also still doesn't fix the problem.

g) When a csc.exe process first starts, it has a third thread in ole32.dll+0x6502c in state Wait:DelayExecution. Soon it changes to Wait:UserRequest, then eventually goes away.

h) It wasn't an interaction with Infrastructure Central; closing it and iisresetting and manually killing dangling aspnet_wp and csc still didn't fix the problem.

Now it's getting seriously fun.

I hooked up filemon to csc.exe to see what was happening. There were a lot of errors occurring, but they mostly looked like the standard file not founds you get when a .NET PE starts up.

So I looked at the last thing csc.exe did, which was to write to a file C:\WINNT\Temp\ttohy1s5.out, which by the looks of the name is a generated XmlSerializer assembly-related file. So I opened that up, and it's stdout for the compiler. There are tens of groups of generated assembly-related files in that Temp folder; based on the dates, they're the ones that were generated and never cleaned up while I was tracking this down.

The .out file includes two compiler errors.

C:\WINNT\system32> "c:\winnt\microsoft.net\framework\v1.1.4322\csc.exe" /t:library /utf8output /R:"c:\winnt\assembly\gac\system.xml\1.0.5000.0__b77a5c561934e089\system.xml.dll" /R:"c:\winnt\microsoft.net\framework\v1.1.4322\temporary asp.net files\abode_pcm\3b1976b1\1cc1142c\assembly\dl2\751be669\ca2bcd37_2808c501\infcore.dll" /R:"c:\winnt\microsoft.net\framework\v1.1.4322\mscorlib.dll" /out:"C:\WINNT\TEMP\ttohy1s5.dll" /debug- /optimize+ /w:1  "C:\WINNT\TEMP\ttohy1s5.0.cs"

Microsoft (R) Visual C# .NET Compiler version 7.10.3052.4

for Microsoft (R) .NET Framework version 1.1.4322

Copyright (C) Microsoft Corporation 2001-2002. All rights reserved.

c:\WINNT\Temp\ttohy1s5.0.cs(51,9): error CS0647: Error emitting 'System.Security.Permissions.PermissionSetAttribute' attribute -- 'Unspecified error '

c:\WINNT\Temp\ttohy1s5.0.cs(156,9): error CS0647: Error emitting 'System.Security.Permissions.PermissionSetAttribute' attribute -- 'Unspecified error '

Here's somebody experiencing exactly this problem:

http://groups.google.com.au/groups?q=CS0647+%22unspecified+error%22&hl=en&lr=&selm=B6390A05-6C3F-4C60-9DBA-02BE86600E66%40microsoft.com&rnum=1

Somebody else:

http://dotnet247.com/247reference/msgs/51/259679.aspx

They both mention rapidly and repeatedly calling the XmlSerializer, and one's using the XmlSerializer(Type) constructor.

No fix mentioned, however.
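Since both reports involve constructing the serializer over and over, one common way to cut down on constructor calls is to build each XmlSerializer once per type and reuse it. The sketch below is not from either report and was not tried during this investigation; SerializerCache is a hypothetical helper name. Every distinct type still needs at least one construction, so this is unlikely to cure the hang by itself.

```csharp
using System;
using System.Collections;
using System.Xml.Serialization;

// Hypothetical helper (not part of the original investigation):
// constructs one XmlSerializer per type and hands back the same instance afterwards.
public sealed class SerializerCache
{
    private static readonly Hashtable cache = new Hashtable();
    private static readonly object sync = new object();

    private SerializerCache() { }

    public static XmlSerializer Get(Type type)
    {
        lock (sync)
        {
            XmlSerializer serializer = (XmlSerializer)cache[type];
            if (serializer == null)
            {
                // The only place the XmlSerializer(Type) constructor is called.
                serializer = new XmlSerializer(type);
                cache[type] = serializer;
            }
            return serializer;
        }
    }
}
```

A caller would then write SerializerCache.Get(typeof(SomeConfigNode)) instead of new XmlSerializer(typeof(SomeConfigNode)), where SomeConfigNode stands in for whatever type is being deserialized.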

So why aren't we seeing this in dev? Well, the production box is W2k Server; maybe it differs somehow?

The web is configured for FullTrust.

Changing the Abode and PCM applications to be in process didn't solve the problem.

Running a test, waiting for a deadlock, and then copying the input files to csc that caused the deadlock to a new folder and manually running csc interactively against those files worked perfectly: no errors from the compiler, only warnings that a few XML-related types were defined in multiple places.

Compiling those files in-place in c:\winnt\temp also worked with the same warnings, and no errors.

Then I iisreset, and tried with IE, and it worked first time.

Then I tried clearing out the temp folder, reasoning that lingering clutter might be what causes the problem to recur, with the very first hang having been an unlikely deadlock.

Then I iisreset and tried with IE, and got an immediate csc hang.

Then I went to change aspnet to run under SYSTEM and found that AQTime had already changed it, so I changed it back to machine, iisreset, and tried with IE. It worked first time.

Closed IE, iisreset, try again. It worked again.

Closed IE, iisreset, try again. It worked again.

Closed IE, iisreset, change aspnet to run under SYSTEM. Iisreset. Try again. It failed first time.

Closed IE, iisreset, kill lingering CSC.EXEs, change aspnet to run under ASPNET. Iisreset. Try again. It worked first time.

Closed IE, iisreset, change aspnet to run under SYSTEM. Iisreset. Try again. It failed first time.

Well, that's it. There's some kind of sporadic bug in csc when run as NT AUTHORITY\SYSTEM which manifests itself just often enough to happen every time we start reading configuration, but not always on the same loaded type.
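A quick way to confirm which account the code is actually running as, rather than trusting what the config file says, is to ask Windows from inside the process. This is a minimal sketch; WhoAmI and CurrentAccount are made-up names, and if impersonation is enabled for the application the call reports the impersonated user rather than the worker process account.

```csharp
using System;
using System.Security.Principal;

// Hypothetical diagnostic helper, not part of the original application.
public class WhoAmI
{
    // Returns the Windows account of the calling thread,
    // e.g. "NT AUTHORITY\SYSTEM" or "<machine name>\ASPNET".
    public static string CurrentAccount()
    {
        return WindowsIdentity.GetCurrent().Name;
    }

    public static void Main()
    {
        Console.WriteLine(CurrentAccount());
    }
}
```

Inside ASP.NET, call WhoAmI.CurrentAccount() from a page or a log statement instead of using Main.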

There goes about 5 hours right there :)
