Unloaded modules

Post by J_R » Wed Jan 06, 2010 9:54 pm

I'll try to be as concise as possible. Any assistance is much appreciated. The problem is a memory leak.

I have a 32-bit DLL written in C# that runs as a COM component on a Windows 2003 server. This component processes BizTalk 2002 messages. The process runs under 200 MB most of the time until it spikes above 1 GB, and it will run out of memory around 2 GB unless it's recycled. The memory is never released, even when the process is idle for some time.
I used perfmon to verify that when the process is using a gig of memory, only 30-40 MB are on the managed heap, so I know it's not .NET memory. !address -summary shows the growth is in the native heap. The !heap command shows that one heap (the first one) contains 95% of the memory used. Typically at this point I would use DebugDiag to give me a summary of allocations. When I ran the process under DebugDiag... no leak for a month. I detached DebugDiag and the process blew up in a day. So I tried umdh.exe this time. I turned on user-mode stack traces with "gflags -i dllhost +ust", restarted the process, and took an initial snapshot with umdh.exe. Again, the process did not leak for a week. I turned off user-mode stack traces and restarted the process 4 hours ago, and it is at 600 MB already. I grabbed a dump of it when it was around 500 MB.

So that's my first issue. Why does it not leak with heap traces on? It's possible that the timing is a coincidence, but the odds of that seem low. I know that turning on heap traces and/or starting under a debugger changes heap behavior: it can disable the LFH and sets extra debug flags on the heap, so technically there is a difference. If it helps, with traces both on and off, the !heap -s command shows "L" in the fast heap column for the large problem heap, not LFH.

I decided to take a look at the 500 MB dump I got without stack traces and poke around in the heap to see what was there. Heap 00090000 has 450+ MB of data.

!heap -stat -h 00090000
heap @ 00090000
group-by: TOTSIZE max-display: 20
size #blocks total ( %) (percent of total busy bytes)
21120 116 - 23e98c0 (13.73)
3c 769ef - 1bcd404 (10.63)
2e 66d0f - 12798b2 (7.06)
1c 667e8 - b35d60 (4.29)
24 4f942 - b30d48 (4.28)
2a 3a03c - 9849d8 (3.64)
3a 27c6d - 9030b2 (3.45)
7c80 116 - 873300 (3.23)
6c5c 116 - 75abe8 (2.81)
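
For anyone reading along: every number in the !heap -stat output above is hexadecimal. The following is a minimal Python sketch (not a debugger feature; the helper name decode_row is made up for illustration, and the rows are copied from the output above) that converts each row to decimal and checks that size times #blocks equals the total column.

```python
# Rows copied from the !heap -stat output: (size, #blocks, total), all in hex.
ROWS = [
    ("21120", "116", "23e98c0"),
    ("3c", "769ef", "1bcd404"),
    ("2e", "66d0f", "12798b2"),
    ("1c", "667e8", "b35d60"),
    ("24", "4f942", "b30d48"),
    ("2a", "3a03c", "9849d8"),
    ("3a", "27c6d", "9030b2"),
    ("7c80", "116", "873300"),
    ("6c5c", "116", "75abe8"),
]


def decode_row(size_hex, blocks_hex, total_hex):
    """Return (size, blocks, total, consistent) in decimal for one hex row.

    'consistent' is True when size * blocks equals the reported total.
    """
    size = int(size_hex, 16)
    blocks = int(blocks_hex, 16)
    total = int(total_hex, 16)
    return size, blocks, total, size * blocks == total


if __name__ == "__main__":
    for row in ROWS:
        size, blocks, total, ok = decode_row(*row)
        print(f"{size:>7} bytes x {blocks:>7} blocks = "
              f"{total / 2**20:6.2f} MB  consistent={ok}")
```

Read in decimal, the 21120 blocks are 135,456 bytes each, and the three large sizes (21120, 7c80, 6c5c) all have the same block count, 0x116 = 278, which may mean they are allocated together.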


I thought those large allocations of 21120 were odd, so I dumped them:
!heap -flt s 21120
_HEAP @ 90000
HEAP_ENTRY Size Prev Flags UserPtr UserSize - state
1bbd0040 4225 0000 [01] 1bbd0048 21120 - (busy)
? <Unloaded_elp.dll>+1b6dc7b7
1be12fc8 4225 4225 [01] 1be12fd0 21120 - (busy)
? <Unloaded_elp.dll>+1b6c21a7
1bf24188 4225 4225 [01] 1bf24190 21120 - (busy)
? <Unloaded_elp.dll>+1b5f68ef
1c011000 4225 4225 [01] 1c011008 21120 - (busy)
? <Unloaded_elp.dll>+1b72542f
1c074320 4225 4225 [01] 1c074328 21120 - (busy)
? <Unloaded_elp.dll>+1b84dc37
1c096fe0 4225 4225 [01] 1c096fe8 21120 - (busy)
? <Unloaded_elp.dll>+1bad1a57
1c0daf98 4225 4225 [01] 1c0dafa0 21120 - (busy)
? <Unloaded_elp.dll>+1b961f7f
1c0fc128 4225 4225 [01] 1c0fc130 21120 - (busy)


I did the same for the next largest allocation, 3c, and got similar results for 90% of the entries:

1b7d61e8 0009 0009 [01] 1b7d61f0 0003c - (busy)
? <Unloaded_dll>+2d0b22


The lm output shows this for unloaded modules:
Unloaded modules:
00320033 00960061 Unknown_Module_00320033
Missing image name, possible paged-out or corrupt data.
00680063 00cc00a8 Unknown_Module_00680063
0000f50f 0075f572 dll
00000001 45d70a37 elp.dll
71af0000 71b12000 ShimEng.dll

Any tips on next steps here? Spot-checking those addresses with the db command doesn't show any strings or recognizable pattern. I cannot find info on this elp.dll anywhere on the internet, and there is no such DLL on the server. When I randomly break into the live process with the debugger and dump modules, elp.dll is always in the unloaded modules area. And what's with the address for elp.dll, 00000001?
J_R
 
Posts: 6
Joined: Mon Nov 24, 2008 9:25 pm

Re: Unloaded modules

Post by J_R » Wed Jan 06, 2010 11:21 pm

Update.

It was a coincidence that the app never leaked with DebugDiag attached. I just watched the process blow up to 650 MB or so, and then I decided to attach DebugDiag to it. In the next 20 minutes it went up another 100+ MB. I got the report, and the leaked memory is from the AD DLL adsldpc.dll. It is easy to leak memory from .NET land when you use this DLL if you do not dispose objects correctly. We suspected this in the beginning and thoroughly checked the code to make sure all native objects were released. Some error condition seems to be preventing the code that frees the memory from running.

Re: Unloaded modules

Post by J_R » Fri Jan 08, 2010 5:43 pm

OK, another update.

If I attach DebugDiag to a running process it will still leak, but if I start the process under a debugger or with gflags +ust, it will not.
There are two servers in the cluster. Yesterday, on one of them I started the COM DLL after turning on user stack traces, and the other I just let run. Same server setup, same code. The one without gflags has already run up to the 800 MB limit I set twice and recycled, while the other one has never broken 200 MB.

I know from DebugDiag and other reports which DLL is leaking. It's always adsldpc.dll.

(from DebugDiag)
Function:          adsldpc!AllocADsMem+10
Allocation type:   Heap allocation(s)
Heap handle:       0x00090000
Allocation count:  2721603 allocation(s)
Allocation size:   188.41 MBytes
Leak probability:  95%
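
As a rough sanity check on the DebugDiag figures above, dividing the reported size by the allocation count gives the average size of a leaked block. This is a minimal Python sketch; it assumes "MBytes" means 2^20 bytes, which the report does not confirm, and the function name is made up for illustration.

```python
# Figures copied from the DebugDiag report for adsldpc!AllocADsMem+10.
ALLOCATION_COUNT = 2_721_603
ALLOCATION_SIZE_MB = 188.41

# Assumption: DebugDiag's "MBytes" is 2**20 bytes (not confirmed by the report).
BYTES_PER_MB = 2**20


def average_allocation_bytes(total_mb, count, bytes_per_mb=BYTES_PER_MB):
    """Average bytes per allocation, given a total in MB and a block count."""
    return total_mb * bytes_per_mb / count


if __name__ == "__main__":
    avg = average_allocation_bytes(ALLOCATION_SIZE_MB, ALLOCATION_COUNT)
    print(f"average leaked allocation: {avg:.1f} bytes (about 0x{round(avg):x})")
```

An average in the tens of bytes would suggest that these leaked allocations are mostly the small blocks (3c, 2e, 1c, ...) seen in the earlier !heap -stat output rather than the large 21120 ones.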

I even know where in the code this gets called from, but on the C# side we are doing it right, with either Dispose in a finally block or the 'using' keyword to make sure unmanaged objects are released.

It also seems that my mystery DLL mentioned above, <Unloaded_elp.dll>, may really be adsldpc.dll. I had a debugger attached and it broke in with an access violation in the mystery DLL.

(90.1e84): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=00000000 ebx=014dd758 ecx=00000000 edx=00000000 esi=014f9c28 edi=014f9c70
eip=22b9396e esp=261af008 ebp=261af028 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010246
<Unloaded_elp.dll>+0x22b9396d:
22b9396e 8b01 mov eax,dword ptr [ecx] ds:0023:00000000=????????


0:048> kb
ChildEBP RetAddr Args to Child
WARNING: Frame IP not in any known module. Following frames may be wrong.
261af028 22b9336b 014dd744 8119adc0 014dcd68 <Unloaded_elp.dll>+0x22b9396d
261af0b0 22b92375 014d6200 00000000 00000000 <Unloaded_elp.dll>+0x22b9336a
261af1fc 79f68c4e 014da358 014d61a4 79ef4a38 <Unloaded_elp.dll>+0x22b92374
261af2e8 79f68d5b 3c231ff8 261af4e0 261af558 mscorwks!COMToCLRWorkerBody+0x1de
261af344 79f68ec4 3c231ff8 261af4e0 261af558 mscorwks!COMToCLRWorkerDebuggerWrapper+0x37
261af518 0127b375 3c231ff8 261af558 8119adc0 mscorwks!COMToCLRWorker+0x157
261af540 77c80193 261af564 261af764 261af9bc <Unloaded_elp.dll>+0x127b374
261af580 77ce33e1 011d423b 261af768 00000003 RPCRT4!Invoke+0x30
261af980 77ce1968 6a0370f0 233a8db8 42f98268 RPCRT4!NdrStubCall2+0x299
261af9d8 77d05644 6a0370f0 42f98268 233a8db8 RPCRT4!CStdStubBuffer_Invoke+0x3f
261af9fc 7778d01b 6a0370f0 42f98268 233a8db8 OLEAUT32!CUnivStubWrapper::Invoke+0xc5
261afa40 7778cfc8 42f98268 000fe6b0 23e7b348 ole32!SyncStubInvoke+0x37
261afa88 776c120b 42f98268 6a9f92e0 6b0ab338 ole32!StubInvoke+0xa7
261afb64 776c0bf5 233a8db8 00000000 6b0ab338 ole32!CCtxComChnl::ContextInvoke+0xec
261afb80 7778d2a7 42f98268 00000001 6b0ab338 ole32!MTAInvoke+0x1a
261afbb0 7778cd66 d0908070 233a8db8 6b0ab338 ole32!AppInvoke+0xa3
261afc84 7778d2c6 42f98210 000a0db0 000fe698 ole32!ComInvokeWithLockAndIPID+0x2c5
261afcd0 77c7ff7a 25f40f04 000fe698 25f40f04 ole32!ThreadInvoke+0x2e3
261afd04 77c8042d 7778d238 25f40f04 261afdec RPCRT4!DispatchToStubInCNoAvrf+0x38
261afd58 77c80353 00000000 00000000 7767bfc8 RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0x11f

When I list modules with lm, adsldpc.dll is not in the loaded or unloaded module list. Of course, when I run the code that allocates the AD objects in a separate Windows Forms app for testing purposes, I can loop through it thousands of times without any significant memory loss. The difference is that the prod app is hosted as a dllhost.exe COM server. I made sure the .NET finalizer thread was healthy. On the .NET side, objects are being finalized and cleared. I suspect heap corruption at this point.

Re: Unloaded modules

Post by J_R » Mon Jan 11, 2010 4:55 pm

Since right around the time of the last post I have been running the dllhost COM process with DebugDiag and full page heap turned on. No access violations occurred at all, and 600+ MB of memory leaked. I am using DebugDiag to monitor all thread creation and deletion in an attempt to find out when and why all the dead threads and the unloaded dll appear in memory dumps.
It doesn't look like it's heap corruption...

Re: Unloaded modules

Post by Ketandp » Thu Apr 15, 2010 4:02 am

Hi J_R,

I have observed some traces of this mystery DLL in a couple of crash dumps.
I have not been able to find any information about elp.dll so far; some websites report that they are still trying to identify it.

Did you ever resolve the memory leak issue? Were you able to figure out what your mystery DLL is? Is it really the AD DLL you mentioned?
Please share your thoughts on it. I will post an update if I find more details.

Thanks
Ketan
Ketandp
 
Posts: 3
Joined: Thu Nov 02, 2006 7:26 am
Location: Bangalore

