Recently a co-worker was complaining about a windows service which took ages to start. Because this exact code runs on a large number of servers I suspected the the fault to be on the machine rather than the code. The task manager yielded no interesting information (memory consumption was ok and so was the CPU usage). So I downloaded and installed the Windows Debugging tools, configured the symbol server and attached the debugger to the process in question. To enable debugging of managed applications, the SOS extension has to be loaded:
The !eestack command lists the stacktraces of all running threads and the –ee option restricts the output to managed methods. This makes the trace much more readable. Anyway, most threads where just sitting around doing nothing, but one got my attention:
1: Thread 6
2: *** WARNING: Unable to verify checksum for C:\Windows\assembly\NativeImages_v2.0.50727_64\System\65f46521e7fca2cd2d216162175f2fd6\System.ni.dll
3: *** WARNING: Unable to verify checksum for C:\Windows\assembly\NativeImages_v2.0.50727_64\System.ServiceModel\e000a1cd822ffb6f6483426a67622d75\System.ServiceModel.ni.dll
4: *** WARNING: Unable to verify checksum for C:\Windows\assembly\NativeImages_v2.0.50727_64\Netatwork.Common\73da8de102292417562adaf43872eec6\Netatwork.Common.ni.dll
5: *** ERROR: Module load completed but symbols could not be loaded for C:\Windows\assembly\NativeImages_v2.0.50727_64\Netatwork.Common\73da8de102292417562adaf43872eec6\Netatwork.Common.ni.dll
6: Child-SP RetAddr Call Site
7: 0000000001f9e900 000007fef1b64304 System_ni!DomainBoundILStubClass.IL_STUB(UInt32, System.Security.Cryptography.SafeLocalAllocHandle, System.Security.Cryptography.OidGroup)+0x77
8: 0000000001f9ea00 000007fef1b2b402 System_ni!System.Security.Cryptography.CAPI.c(UInt32, System.Security.Cryptography.SafeLocalAllocHandle, System.Security.Cryptography.OidGroup)+0x94
9: 0000000001f9eac0 000007fef1b2b28c System_ni!System.Security.Cryptography.X509Certificates.X509Utils.FindOidInfo(UInt32, System.String, System.Security.Cryptography.OidGroup)+0x152
10: 0000000001f9eba0 000007fef1b282ec System_ni!System.Security.Cryptography.Oid..ctor(System.String, System.Security.Cryptography.OidGroup, Boolean)+0x2c
11: 0000000001f9ebe0 000007fef1b27f03 System_ni!System.Security.Cryptography.X509Certificates.X509Certificate2.get_PublicKey()+0x8c
12: 0000000001f9ec40 000007feee6c5425 System_ni!System.Security.Cryptography.X509Certificates.X509Certificate2.get_PrivateKey()+0xb3
13: 0000000001f9eca0 000007feee6c538d System_ServiceModel_ni!System.ServiceModel.Security.SecurityUtils.EnsureCertificateCanDoKeyExchange(System.Security.Cryptography.X509Certificates.X509Certificate2)+0x55
14: 0000000001f9ed20 000007feee6c5349 System_ServiceModel_ni!System.ServiceModel.Security.ServiceCredentialsSecurityTokenManager.CreateServerX509TokenProvider()+0x2d
15: 0000000001f9ed60 000007feee6c52e1 System_ServiceModel_ni!System.ServiceModel.Security.ServiceCredentialsSecurityTokenManager.CreateLocalSecurityTokenProvider(System.ServiceModel.Security.Tokens.RecipientServiceModelSecurityTokenRequirement)+0x49
16: 0000000001f9edc0 000007feef00fcd9 System_ServiceModel_ni!System.ServiceModel.Security.ServiceCredentialsSecurityTokenManager.CreateSecurityTokenProvider(System.IdentityModel.Selectors.SecurityTokenRequirement)+0x41
17: 0000000001f9ee10 000007feef010a0f System_ServiceModel_ni!(System.ServiceModel.Security.ServiceCredentialsSecurityTokenManager.CreateTlsnegoServerX509TokenProvider(System.ServiceModel.Security.Tokens.RecipientServiceModelSecurityTokenRequirement)+0xd9
18: 0000000001f9ee60 000007feee6c5811 System_ServiceModel_ni!System.ServiceModel.Security.ServiceCredentialsSecurityTokenManager.CreateTlsnegoSecurityTokenAuthenticator(System.ServiceModel.Security.Tokens.RecipientServiceModelSecurityTokenRequirement, Boolean, System.IdentityModel.Selectors.SecurityTokenResolver ByRef)+0x23f
19: 0000000001f9eee0 000007feef624a7a System_ServiceModel_ni!System.ServiceModel.Security.ServiceCredentialsSecurityTokenManager.CreateSecurityTokenAuthenticator(System.IdentityModel.Selectors.SecurityTokenRequirement, System.IdentityModel.Selectors.SecurityTokenResolver ByRef)+0x2a1
20: 0000000001f9ef40 000007feee77dc27 System_ServiceModel_ni!System.ServiceModel.Security.SymmetricSecurityProtocolFactory.OnOpen(System.TimeSpan)+0xe9fc7a
21: 0000000001f9efb0 000007feee6ebe43 System_ServiceModel_ni!System.ServiceModel.Security.WrapperSecurityCommunicationObject.OnOpen(System.TimeSpan)+0x17
22: 0000000001f9efe0 000007feef18bf45 System_ServiceModel_ni!System.ServiceModel.Channels.CommunicationObject.Open(System.TimeSpan)+0x233
23: 0000000001f9f0d0 000007feef1881cb System_ServiceModel_ni!System.ServiceModel.Security.SecurityListenerSettingsLifetimeManager.Open(System.TimeSpan)+0x65
24: 0000000001f9f140 000007feee6ebe43 System_ServiceModel_ni!System.ServiceModel.Channels.SecurityChannelListener`1[[System.__Canon, mscorlib]].OnOpen(System.TimeSpan)+0xcb
25: 0000000001f9f1b0 000007feee727aa7 System_ServiceModel_ni!System.ServiceModel.Channels.CommunicationObject.Open(System.TimeSpan)+0x233
26: 0000000001f9f2a0 000007feee6ebe43 System_ServiceModel_ni!System.ServiceModel.Dispatcher.ChannelDispatcher.OnOpen(System.TimeSpan)+0x57
27: 0000000001f9f2f0 000007feee6b936e System_ServiceModel_ni!System.ServiceModel.Channels.CommunicationObject.Open(System.TimeSpan)+0x233
28: 0000000001f9f3e0 000007feee6ebe43 System_ServiceModel_ni!System.ServiceModel.ServiceHostBase.OnOpen(System.TimeSpan)+0x6e
29: 0000000001f9f450 000007fef09bec81 System_ServiceModel_ni!System.ServiceModel.Channels.CommunicationObject.Open(System.TimeSpan)+0x233
30: …
For administrative reasons, this server exposes a number of encrypted WCF endpoints. During startup the WCF runtime examines the certificate and checks whether it is valid. During this operation, it tries to access the private key and calls the Win32 native function CryptFindOidInfo. As it turns out, this function may contact a domain controller to fetch certain information. This “feature” can be disabled by passing the CRYPT_OID_DISABLE_SEARCH_DS_FLAG to the function, but the .NET framework doesn’t do this. Apparently, the function had difficulties contacting its domain controller. A common cause for this is an incorrect DNS configuration, so I checked this first. As it turned out, the DNS configuration was quite complicated and a conditional forwarder for the domain the machine was on was missing. Once this was fixed, the startup speed of the service was back to normal.