More Debugging in CE

So I just posted a blog entry about debugging and lo and behold, not 30 minutes later I his an exception in my platform.  I’ve done this tracing before, but this time I decided I’d document my discovery process and post it here as a real-world example.  It turned out to be an interesting example that I’d not hit before, so all the more fun to post.


So here’s the background: I have a CE 4.2 PXA255-based system.  When I go to sleep and then wake back up with a USB device (keyboard, mouse or mass torage device) inserted I get the following exception:


RaiseException: Thread=81a8db58 Proc=816e7490 ‘device.exe’
AKY=00000005 PC=03fd3f80 RA=800bd398 BVA=00000001 FSR=00000001


This exception only happens at wake – if the device is in during boot there’s no problem.


Nicely, when PB makes an image, it dumps the output into a PLG file in the same folder as your WCE and PBW files.  This output provides the map of where everything gets assembled. 


So let’s take a look at what my exception is telling me.  I like to start with the return address (RA in the exception output) to see who the caller is causing the problem.  From Sue’s entry, we can tell immediately it’s in the kernel because the address is after 80000000. 


We need to find the offset (plus verify this assumption) so we’ll look at the build output.Here’s an excerpt from my myproject.plg file:


MODULES Section
Module                 Section  Start     Length  psize   vsize   Filler
———————- ——– ——— ——- ——- ——- ——
nk.exe                 .text    800b9000  258048  255488  255440 o32_rva=00001000
nk.exe                 .pdata   800f8000    8192    8192    7872 o32_rva=0006a000
cplmain.cpl            .text    800fa000  110592  109056  108976 o32_rva=00001000
cplmain.cpl            .rsrc    80115000  126976  126464  126276 o32_rva=0001f000


We can see that our return address of 800bd398 is in nk.exe at offset 0x4398 (0x800db398-0x800b9000).


Now we look in our _FLATRELEASEDIR and pull out the MAP file for NK.EXE.  This tells us the offsets for the beginnings of every function in the module (keep an eye on statics listed at the bottom of the map file).  Here’s an excerpt from NK.MAP:


 0001:000041f8       UndefException             8c5851f8     nk:armtrap.obj
 0001:00004218       SWIHandler                 8c585218     nk:armtrap.obj
 0001:0000421c       FIQHandler                 8c58521c     nk:armtrap.obj
 0001:00004238       PrefetchAbortEH            8c585238     nk:armtrap.obj
 0001:00004240       PrefetchAbort              8c585240     nk:armtrap.obj
 0001:000044bc       MD_CBRtn                   8c5854bc     nk:armtrap.obj
 0001:00004578       CommonHandler              8c585578     nk:armtrap.obj
 0001:00004584       SaveAndReschedule          8c585584     nk:armtrap.obj


My calculated offset of 0x4398 is after the start of PrefetchAbort and before MD_CBRtn, so my exception is being throw by PrefetchAbort.  That surprises me – why the hell would PrefetchAbort be throwing an exception?  Even better, why is my code in PrefetchAbort in the first place?


Let’s see where the exception is coming from exactly by chasing the program counter (PC=03fd3f80) from the exception.  Again we trun to our PLG file.  Here’s an excerpt:


Module coredll.dll   at offset 01fff000 data, 03f71000 code
Module regenum.dll   at offset 01ffd000 data, 03f61000 code
Module pm.dll        at offset 01ffb000 data, 03f51000 code


Based on this we can see that our program counter address of 0x03fd3f80 is in coredll.dll at offset 0x62f80.  This is odd as I’d expect it to be in phci.dll or device.exe.  It appears that maybe we’re passing some invalid info to a Win32 API, which then causes it to throw an exception?  Let’s do more digging.


We go back to our _FLATRELEASEDIR and open up COREDLL.MAP to find what function in there is the culprit.  Here’s an excerpt:


 0001:00062e40       __fp_mult_uncommon         10063e40     coredll_ALL:mul.obj
 0001:00062f5c       __rt_div0                  10063f5c f   coredll_ALL:__div0.obj
 0001:00062f90       __ld12mul                  10063f90 f   coredll_ALL:tenpow.obj


Ahhh…now things come together.  An offset of 0x62f80 puts us in a function called “__rt_div0”, which I’m going to assume is a runtime divide by zero exception.  That explains why we see an RA leading to PrefetchAbort.  Something in the platform code is causing a divide by zero error, which then causes PrefetchAbort to be called, which in turn enters __rt_div0.


So now what?  I still don’t know what library or function caused the exception, and herein is a flaw in trying to track problems with addresses.  In this case (and this is the first I’ve seen this happen) I’ve got almost nothing.  I know it’s in the USB host driver simply because of the  physical effects (happens only with USB devices, and when it happens the device stops working).  I know it’s a divide by zero error.  That’s all I get from this entire exercise – I wish it was a prettier case study that led me right to the line of code causing it, but hey, if it were simple everyone would do it.

Debugging in CE

Everyone that’s done much development in CE knows about Doug Boling’s definitive article on CE memory management.  If you’ve not read it, you should – if you have read it it doesn’t hurt to read it again.  Go ahead and click the link – I’ll wait here for you….


Done reading?  Good.  Now let’s take a deeper look through the magic of blog voyeurism (yes, I’m not giving any info here other than links – I’m lazy).  Back in February, John Eldridge talked about how to tell if a debug symbol is even correctly being reported, and it good info.  Again, I’ll wait while you read….


Now this got Sue Loh to thinking about what he said and a couple days later she put together a fantastic blog entry on making sense virtual addresses.  This one is short, but the info in it is invaluable if you do any amount of debugging at all.  No clue how I missed it until today, but it’s one that every CE developer should read.


She then followed that up with an entry that shows how to go from an address to a symbol, allowing you to make use of that cryptic crap that Data Aborts give.


So there you go, links to some of the most valuable debugging information I’ve probably ever read, all in one place.  If you have other good debugging links, post them in the comments.

NTLM Domain Authentication in CE

A question came up in todays CF chat about authenticating a CE device to a domain.  While I don’t have ready-made CF code for it (though that might be an interesting task) here’s a C version:

#include <WINDOWS.H>
#include <SECURITY.H>
#include <SSPI.H>
#define SEC_PACKAGE _T(“Microsoft Unified Security Protocol Provider”)
typedef struct _AUTH_SEQ {
BOOL fInitialized;
BOOL fHaveCredHandle;
BOOL fHaveCtxtHandle;
CredHandle hcred;
struct _SecHandle hctxt;
} AUTH_SEQ, *PAUTH_SEQ;
BOOL GenClientContext(PAUTH_SEQ pAS, PSEC_WINNT_AUTH_IDENTITY pAuthIdentity,
PVOID pIn, DWORD cbIn, PVOID pOut, PDWORD pcbOut, PBOOL pfDone);
BOOL GenServerContext(PAUTH_SEQ pAS, PVOID pIn, DWORD cbIn, PVOID pOut,
PDWORD pcbOut, PBOOL pfDone);
BOOL WINAPI SSPLogonUser(LPTSTR szDomain, LPTSTR szUser, LPTSTR szPassword);

TCHAR *lpszUserName = _T(“ctacke”);
TCHAR *lpszDomainName = _T(“mydomain”);
TCHAR *lpszPassword = _T(“mypassword”);

int WINAPI WinMain(HINSTANCE hInst, HINSTANCE hPrevInst, LPWSTR lpCmdLine, int nCmShow)
{
return SSPLogonUser(lpszDomainName, lpszUserName, lpszPassword);
}
BOOL GenClientContext(PAUTH_SEQ pAS, PSEC_WINNT_AUTH_IDENTITY pAuthIdentity,
PVOID pIn, DWORD cbIn, PVOID pOut, PDWORD pcbOut, PBOOL pfDone)
{
DEBUGMSG(TRUE, (_T(“+GenServerContextrn”)));
SECURITY_STATUS ss;
TimeStamp tsExpiry;
SecBufferDesc sbdOut;
SecBuffer sbOut;
SecBufferDesc sbdIn;
SecBuffer sbIn;
ULONG fContextAttr;
if (!pAS->fInitialized)
{
DEBUGMSG(TRUE, (_T(“Calling AcquireCredentialsHandle…”)));
ss = AcquireCredentialsHandle(NULL, SEC_PACKAGE,
SECPKG_CRED_OUTBOUND, NULL, pAuthIdentity, NULL, NULL,
&pAS->hcred, &tsExpiry);

if (ss < 0)
{
DEBUGMSG(TRUE, (_T(“failed with 0x%08xrn”), ss));
return FALSE;
}
DEBUGMSG(TRUE, (_T(“okrn”)));
if(ss < 0)
{
fprintf(stderr, “AcquireCredentialsHandle failed with %08Xn”, ss);
return FALSE;
}
pAS->fHaveCredHandle = TRUE;
}
// Prepare output buffer
sbdOut.ulVersion = 0;
sbdOut.cBuffers = 1;
sbdOut.pBuffers = &sbOut;
sbOut.cbBuffer = *pcbOut;
sbOut.BufferType = SECBUFFER_TOKEN;
sbOut.pvBuffer = pOut;
// Prepare input buffer
if (pAS->fInitialized) {
sbdIn.ulVersion = 0;
sbdIn.cBuffers = 1;
sbdIn.pBuffers = &sbIn;
sbIn.cbBuffer = cbIn;
sbIn.BufferType = SECBUFFER_TOKEN;
sbIn.pvBuffer = pIn;
}
ss = InitializeSecurityContext(&pAS->hcred,
pAS->fInitialized ? &pAS->hctxt : NULL, NULL, 0, 0,
SECURITY_NATIVE_DREP, pAS->fInitialized ? &sbdIn : NULL,
0, &pAS->hctxt, &sbdOut, &fContextAttr, &tsExpiry);
if (ss < 0) {
//
fprintf(stderr, “InitializeSecurityContext failed with %08Xn”, ss);
return FALSE;
}
pAS->fHaveCtxtHandle = TRUE;
// If necessary, complete token
if (ss == SEC_I_COMPLETE_NEEDED || ss == SEC_I_COMPLETE_AND_CONTINUE) {
ss = CompleteAuthToken(&pAS->hctxt, &sbdOut);
if (ss < 0) {
fprintf(stderr, “CompleteAuthToken failed with %08Xn”, ss);
return FALSE;
}
}
*pcbOut = sbOut.cbBuffer;
if (!pAS->fInitialized)
pAS->fInitialized = TRUE;
*pfDone = !(ss == SEC_I_CONTINUE_NEEDED
|| ss == SEC_I_COMPLETE_AND_CONTINUE );
DEBUGMSG(TRUE, (_T(“-GenServerContextrn”)));
return TRUE;
}

///////////////////////////////////////////////////////////////////////////////

BOOL GenServerContext(PAUTH_SEQ pAS, PVOID pIn, DWORD cbIn, PVOID pOut,
PDWORD pcbOut, PBOOL pfDone)
{
DEBUGMSG(TRUE, (_T(“+GenServerContextrn”)));
SECURITY_STATUS ss;
TimeStamp tsExpiry;
SecBufferDesc sbdOut;
SecBuffer sbOut;
SecBufferDesc sbdIn;
SecBuffer sbIn;
ULONG fContextAttr;
if (!pAS->fInitialized)
{
DEBUGMSG(TRUE, (_T(“Calling AcquireCredentialsHandle…”)));
ss = AcquireCredentialsHandle(NULL, SEC_PACKAGE,
SECPKG_CRED_INBOUND, NULL, NULL, NULL, NULL, &pAS->hcred,
&tsExpiry);

if (ss < 0)
{
DEBUGMSG(TRUE, (_T(“failed with 0x%08xrn”), ss));
return FALSE;
}
DEBUGMSG(TRUE, (_T(“okrn”)));
pAS->fHaveCredHandle = TRUE;
}
// Prepare output buffer
sbdOut.ulVersion = 0;
sbdOut.cBuffers = 1;
sbdOut.pBuffers = &sbOut;
sbOut.cbBuffer = *pcbOut;
sbOut.BufferType = SECBUFFER_TOKEN;
sbOut.pvBuffer = pOut;
// Prepare input buffer
sbdIn.ulVersion = 0;
sbdIn.cBuffers = 1;
sbdIn.pBuffers = &sbIn;
sbIn.cbBuffer = cbIn;
sbIn.BufferType = SECBUFFER_TOKEN;
sbIn.pvBuffer = pIn;
DEBUGMSG(TRUE, (_T(“Calling AcceptSecurityContext…”)));
ss = AcceptSecurityContext(&pAS->hcred,
pAS->fInitialized ? &pAS->hctxt : NULL, &sbdIn, 0,
SECURITY_NATIVE_DREP, &pAS->hctxt, &sbdOut, &fContextAttr,
&tsExpiry);
if (ss < 0)
{
DEBUGMSG(TRUE, (_T(“failed with 0x%08xrn”), ss));
return FALSE;
}
DEBUGMSG(TRUE, (_T(“okrn”)));
pAS->fHaveCtxtHandle = TRUE;
// If necessary, complete token
if (ss == SEC_I_COMPLETE_NEEDED || ss == SEC_I_COMPLETE_AND_CONTINUE)
{
DEBUGMSG(TRUE, (_T(“Calling CompleteAuthToken…”)));
ss = CompleteAuthToken(&pAS->hctxt, &sbdOut);
if (ss < 0)
{
DEBUGMSG(TRUE, (_T(“failed with 0x%08xrn”), ss));
return FALSE;
}
DEBUGMSG(TRUE, (_T(“okrn”)));
}
*pcbOut = sbOut.cbBuffer;
if (!pAS->fInitialized)
pAS->fInitialized = TRUE;
*pfDone = !(ss = SEC_I_CONTINUE_NEEDED
|| ss == SEC_I_COMPLETE_AND_CONTINUE);
DEBUGMSG(TRUE, (_T(“-GenServerContextrn”)));
return TRUE;
}

///////////////////////////////////////////////////////////////////////////////

BOOL WINAPI SSPLogonUser(LPTSTR szDomain, LPTSTR szUser, LPTSTR szPassword)
{
DEBUGMSG(TRUE, (_T(“rn+SSPLogonUserrn”)));
AUTH_SEQ asServer = {0};
AUTH_SEQ asClient = {0};
BOOL fDone = FALSE;
BOOL fResult = FALSE;
DWORD cbOut = 0;
DWORD cbIn = 0;
DWORD cbMaxToken = 0;
PVOID pClientBuf = NULL;
PVOID pServerBuf = NULL;
SecPkgInfo *pSPI = NULL;
HMODULE hModule = NULL;
SECURITY_STATUS ss;
SEC_WINNT_AUTH_IDENTITY ai;
ULONG packages = 0;
PSecPkgInfo pPackageInfo = NULL;
__try
{
EnumerateSecurityPackages(&packages, &pPackageInfo);
DEBUGMSG(TRUE, (_T(” Available security packages:rn”)));
for(UINT i = 0 ; i < packages ; i++)
{
DEBUGMSG(TRUE, (_T(“t%srn”), pPackageInfo[i].Name));
}
DEBUGMSG(TRUE, (_T(“rn”)));

DEBUGMSG(TRUE, (_T(“Calling FreeContextBuffer…”)));
ss =FreeContextBuffer(pPackageInfo);
if(ss != SEC_E_OK)
{
DEBUGMSG(TRUE, (_T(“failed with 0x%08xrn”), ss));
__leave;
}
DEBUGMSG(TRUE, (_T(“okrn”)));
// Get max token size
DEBUGMSG(TRUE, (_T(“Calling QuerySecurityPackageInfo…”)));
ss = QuerySecurityPackageInfo(SEC_PACKAGE, &pSPI);
if(ss != SEC_E_OK)
{
DEBUGMSG(TRUE, (_T(“failed with 0x%08xrn”), ss));
__leave;
}
DEBUGMSG(TRUE, (_T(“okrn”)));
cbMaxToken = pSPI->cbMaxToken;
DEBUGMSG(TRUE, (_T(“Calling FreeContextBuffer…”)));
ss =FreeContextBuffer(pSPI);
if(ss != SEC_E_OK)
{
DEBUGMSG(TRUE, (_T(“failed with 0x%08xrn”), ss));
__leave;
}
DEBUGMSG(TRUE, (_T(“okrn”)));
// Allocate buffers for client and server messages
DEBUGMSG(TRUE, (_T(“Allocating heaps…”)));
pClientBuf = HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, cbMaxToken);
pServerBuf = HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, cbMaxToken);
DEBUGMSG(TRUE, (_T(“okrn”)));
// Initialize auth identity structure
ZeroMemory(&ai, sizeof(ai));
ai.Domain = szDomain;
ai.DomainLength = _tcslen(szDomain);
ai.User = szUser;
ai.UserLength = _tcslen(szUser);
ai.Password = szPassword;
ai.PasswordLength = _tcslen(szPassword);
ai.Flags = SEC_WINNT_AUTH_IDENTITY_UNICODE;
// Prepare client message (negotiate) .
cbOut = cbMaxToken;
DEBUGMSG(TRUE, (_T(“Calling GenClientContextrn”)));
if (!GenClientContext(&asClient, &ai, NULL, 0, pClientBuf, &cbOut, &fDone))
__leave;
// Prepare server message (challenge) .
cbIn = cbOut;
cbOut = cbMaxToken;
DEBUGMSG(TRUE, (_T(“Calling GenServerContextrn”)));
if (!GenServerContext(&asServer, pClientBuf, cbIn, pServerBuf, &cbOut, &fDone))
__leave;
// Most likely failure: AcceptServerContext fails with SEC_E_LOGON_DENIED
// in the case of bad szUser or szPassword.
// Unexpected Result: Logon will succeed if you pass in a bad szUser and
// the guest account is enabled in the specified domain.
// Prepare client message (authenticate) .
cbIn = cbOut;
cbOut = cbMaxToken;
DEBUGMSG(TRUE, (_T(“Calling GenClientContextrn”)));
if (!GenClientContext(&asClient, &ai, pServerBuf, cbIn, pClientBuf, &cbOut, &fDone))
__leave;
// Prepare server message (authentication) .
cbIn = cbOut;
cbOut = cbMaxToken;
DEBUGMSG(TRUE, (_T(“Calling GenServerContextrn”)));
if (!GenServerContext(&asServer, pClientBuf, cbIn, pServerBuf, &cbOut, &fDone))
__leave;
fResult = TRUE;
}
__finally
{
// Clean up resources
if (asClient.fHaveCtxtHandle)
{
DEBUGMSG(TRUE, (_T(“Calling DeleteSecurityContextrn”)));
DeleteSecurityContext(&asClient.hctxt);
}
if (asClient.fHaveCredHandle)
{
DEBUGMSG(TRUE, (_T(“Calling FreeCredentialsHandlern”)));
FreeCredentialsHandle(&asClient.hcred);
}
if (asServer.fHaveCtxtHandle)
{
DEBUGMSG(TRUE, (_T(“Calling DeleteSecurityContextrn”)));
DeleteSecurityContext(&asServer.hctxt);
}
if (asServer.fHaveCredHandle)
{
DEBUGMSG(TRUE, (_T(“Calling FreeCredentialsHandlern”)));
FreeCredentialsHandle(&asServer.hcred);
}
DEBUGMSG(TRUE, (_T(“Freeing heaps…”)));
HeapFree(GetProcessHeap(), 0, pClientBuf);
HeapFree(GetProcessHeap(), 0, pServerBuf);
DEBUGMSG(TRUE, (_T(“okrn”)));
}
DEBUGMSG(TRUE, (_T(“-SSPLogonUserrn”)));
return fResult;
}