SOS from Your Production Environment
Developing large enterprise applications is a complex and difficult undertaking. Writing the code is just one of many tasks. You also worry about requirements, design, architecture, unit testing, daily builds, release builds to QC, and much more. All this effort goes into creating a reliable, scalable, well-performing, and functioning application. Then comes the day when you move it into production (if you are lucky and it is hosted by your own organization) or the customer starts installing it on their servers. This is a big day, a day of celebration. You see the fruits of all your labor, and you are excited to watch users work with the application, hear their feedback, and improve the application. But too often, it comes back to haunt you. The customer reports crashes, instability, or unpredictable behavior. You tell yourself: but it works in our environments. What is different between our test environments and the customer's environments?
Gather Data to Make Informed Decisions
Applications can behave very differently in various environments and under load. First, stop worrying about all the shouting. Concentrate on gathering the right data so you can narrow down what is going on. Start with basic information, such as which OS and Windows patches are installed. Look at the event log to find out whether any system or application errors are reported. If it is not done automatically, run a virus check to make sure there is no virus infection. Enable your custom application logs and comb through them to find out what is happening. If all that does not uncover anything, understand how the application is used: which features are used heavily by users, how many concurrent users are on the system, and so on. Then, replicate a similar environment in house and run a load test against it that simulates the usage scenario as closely as possible (see my article about concurrent users stress testing).
If all that does not bring you closer to a resolution, you need to take a snapshot of the application in production and analyze it. This article will introduce you to the basic approach for this and then point you to more advanced articles. It is easier than most people believe. Microsoft has built a very nice debugging story—in the unmanaged as well as managed world.
The "Debugging Tools for Windows"
Microsoft provides debugging tools for Windows NT 4.0, Windows 2000, Windows XP, and Windows 2003. The homepage for the "Debugging Tools for Windows" can be found here. Follow the "Install Debugging Tools for Windows 32-bit Version" link to download the latest version of them (this article uses version 6.4.7.2). The tools, by default, are installed in the "c:\program files\debugging tools for windows" folder. The install also adds a "Debugging Tools for Windows" menu group under "All Programs." This includes a "Debugging Help" that provides some very good information.
There are a number of debuggers that you can use to debug your application. This article concentrates on how you can take a dump of your application and then analyze that dump in another environment rather than in production itself. You will see how you can take a dump when the application hangs, crashes, or just while it is running. These dumps include a complete memory dump so you can see all the threads executing, all the objects on the stack, and the like. This is the least intrusive way to really understand what is happening in your application while it is used in production. It also does not require any files to be registered, which makes it easier to get permission to use it in production and to remove it again when it is no longer needed (which the customer might request). Install the debugging tools on any machine you want and then copy the following five files from the "c:\program files\debugging tools for windows" folder to the production environment:
- adplus.vbs
- cdb.exe
- dbgeng.dll
- dbghelp.dll
- tlist.exe
You don't need to register the DLLs. The cdb.exe file is the "Microsoft Console Debugger" and the adplus.vbs file is a Windows scripting file that is used to automate the CDB debugger. This requires the Windows Scripting Host 5.6 to be installed (run cscript.exe to check the version number). If required, download the version from here and install it on the production server.
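Before you ask the customer for a maintenance window, it can help to confirm that nothing was forgotten during the copy. The following is a minimal sketch, not part of the debugging tools: the function name and the folder you pass in are hypothetical, and the file names are the five named in this section.

```python
from pathlib import Path

# The five files ADPlus needs on the production machine.
REQUIRED_FILES = ["adplus.vbs", "cdb.exe", "dbgeng.dll", "dbghelp.dll", "tlist.exe"]


def missing_debugger_files(folder):
    """Return the required file names that are not present in `folder`.

    The comparison ignores case because Windows file names are
    case-insensitive.
    """
    present = {p.name.lower() for p in Path(folder).iterdir() if p.is_file()}
    return [name for name in REQUIRED_FILES if name not in present]
```

An empty list means all five files are in place; anything else tells you exactly which files still need to be copied.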
Always Create the Symbol Files for Your Binaries
A debugger needs symbol files to show you more than just class, method, and object addresses. Symbols enable debuggers to show you the class names, variable names, and so forth. You can debug an application without symbols, but it is much harder and needs a lot of experience. You want to make your life as easy as possible; therefore, always generate the symbol files. When you compile your application in debug mode, you will also see a PDB file in the same folder where the DLL or EXE gets generated. The PDB file is the symbol file that you need for debugging purposes. Of course, you do not want to release the debugging version of your binaries. You can tell the compiler to generate these symbol files when compiling in release mode as well. Open the project settings in your Visual Studio .NET IDE (menu Project | Settings). Select the Build tab, select "Release" in the Configuration drop-down box if not already selected, and then click on the Advanced button. In the "Debug Info" drop-down box, select "PDB-only." Close your project settings and rebuild your project. You need to do that for all project files. Make it a habit that, when you release your application, you release not just the binaries (DLLs and EXEs) but also all their symbols. That way, you have the symbols ready anytime you need them for debugging purposes.
Symbol files contain information such as all the class names, method names, global and local variable names, as well as source line numbers. They are kept separate so that your binaries are smaller and faster when running. Later in the article, I explain how you can load these symbols into the debugger. You can also obtain all the symbols for the Windows OS, the .NET framework, and many other Microsoft products. You can tell the debugger to download them as needed from the Internet or, if you do not have access to the Internet while debugging, you can download them from the Microsoft site (Windows symbols). The article will explain how to set up your debugger to download Microsoft symbol files as needed.
Using ADPlus to Take Application Dumps
Now, you are ready to take dumps. First, start your application. The article has a ThrowException .NET sample application attached; it allows you to generate two unhandled exceptions. You will use this sample application to walk through all the examples in this article. Next, open the task manager and go to the "Process" tab. Select the "Show processes from all users" check box at the bottom so you can see all processes running. Next, find the process named "ThrowException.exe" and note down the process ID (shown in the PID column).
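If you would rather look up the process ID from a script than from Task Manager, one option is to parse the CSV output of the Windows "tasklist" command. This swaps in a different tool from the one described above, so treat it as a sketch under that assumption. The function name is hypothetical and the sample text is made up purely for illustration.

```python
import csv
import io


def pids_for_image(tasklist_csv, image_name):
    """Return the PIDs of every process whose image name matches.

    `tasklist_csv` is the text printed by `tasklist /FO CSV /NH`, where the
    first column is the image name and the second column is the PID.
    """
    wanted = image_name.lower()
    rows = csv.reader(io.StringIO(tasklist_csv))
    return [int(row[1]) for row in rows
            if len(row) >= 2 and row[0].lower() == wanted]


# Illustrative sample only. On a live machine the text could come from
# subprocess.run(["tasklist", "/FO", "CSV", "/NH"],
#                capture_output=True, text=True).stdout
sample = (
    '"System Idle Process","0","Console","0","16 K"\n'
    '"ThrowException.exe","2828","Console","0","14,212 K"\n'
)
print(pids_for_image(sample, "ThrowException.exe"))
```

The function returns a list because several instances of the same executable may be running; you then decide which of them to attach to.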
ADPlus has a number of command line options. First, you need to decide whether you want to perform a crash dump or hang dump. A crash dump is for situations when your application unexpectedly terminates. Hang dumps can be used to take a dump when your application hangs or any time while it is running. ADPlus cannot be used in scenarios where your application crashes while starting up. It can only be used for applications that are running and then crash. Use the CDB or WinDbg debuggers for scenarios where your application crashes during startup. ADPlus automates the CDB debugger and attaches it to your process. It also can attach to multiple processes; for example, when your application runs under IIS and also uses COM+. When CDB kicks in, it freezes all processes it has been attached to, takes a dump for each asynchronously, and then lets these processes continue to run.
Running ADPlus in Crash Mode
Open a command prompt and go to the folder where you installed or copied the debugging files. You need to provide at a minimum the following command line arguments when running ADPlus:
- Mode: The mode you want the CDB debugger to run in. Add "-crash" for crash mode or "-hang" for hang mode.
- Process to monitor: Add "-p <process id>" to tell CDB which process to attach to. You can repeat that option for each process you want to monitor. For each process, it spawns a separate instance of CDB.
- Quiet mode: When you run ADPlus, it will show a dialog box at the beginning, telling you which mode has been chosen and where the log files will be created. When you run ADPlus on a remote machine, you need to suppress this dialog box; otherwise, ADPlus itself will hang (see later in the article). Add the option "-quiet".
- Location of log files: With the "-o <log file path>" option, you can specify the path where the log file will be created. The CDB debugger creates a unique folder each time it runs under that log file path. The folder name will be a combination of the mode and date and time the CDB has been started, for example:
Crash_Mode__Date_04-01-2005__Time_19-57-18PM
This guarantees that no dump will be overwritten with another dump. In that folder, you find the actual memory dump as well as a number of log files. The file "ADPlus_report.txt" contains information about the configuration the CDB debugger has been started up with. The "Process_List.txt" file lists information about all the processes running when CDB started. The "PID-<process id>__<process name>__<date>__<time>.log" file contains all the output of the CDB debugger while running. The actual dump generated by CDB gets placed in the "PID-<process id>__<process name>__<...>.dmp" file.
- Symbol path: The option "-y <path>" specifies the path where the symbol files can be found. The path contains three pieces of information:
- Symbol server: The symbol server to use. This should always be "srv" unless you have a custom symbol server you utilize.
- Downstream store: The downstream symbol store; for example, "c:\symbols". CDB will cache symbols from the upstream store to the downstream store, providing a cascading symbol store cache.
- Upstream store: the upstream symbol store. This can be a local path, a network path, or a URL.
All three pieces of the path should be separated by a "*". The following example points to the public symbol store from Microsoft and uses a local downstream store:
-y "srv*c:\local symbols*http://msdl.microsoft.com/download/symbols"
This allows CDB to download the symbols to your local store, which makes any subsequent access to the symbol file much faster. Symbols are copied to the downstream store only as CDB requires them; it does not simply copy every symbol file. You also can list multiple symbol stores by separating each with a semicolon. The next example points to the Microsoft public symbol store as well as the symbol files of your application:
-y "srv*c:\local symbols*http://msdl.microsoft.com/download/symbols;srv*c:\local symbols*c:\ThrowException\bin\Release"
You also can use the "_NT_SYMBOL_PATH" environment variable instead of using the "-y" option. As mentioned earlier in the article, you can download all the Microsoft symbols if the production environment does not have Internet access. This also means that all your application symbols should be copied to a folder on the production environment. The following article provides a much more comprehensive explanation of the symbol stores and symbol server.
- Full dump on first-chance exceptions: The "-FullOnFirst" option tells ADPlus to take a full dump for first-chance exceptions.
- No dump on first-chance exceptions: The "-NoDumpOnFirst" option tells ADPlus to take no dumps at all for first-chance exceptions.
- Mini dump for second-chance exceptions: By default, ADPlus takes a full dump for second-chance exceptions. The "-MiniOnSecond" option tells ADPlus to take only mini dumps at second-chance exceptions. This is useful when you need to send the dump to someone to look at. These are small dumps, whereas full dumps can be hundreds of megabytes and are difficult to send around.
- No dump on second-chance exceptions: The "-NoDumpOnSecond" option tells ADPlus not to generate any dumps on second-chance exceptions.
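The symbol path described under the "Symbol path" option is easy to get wrong by hand, because the "*" and ";" separators sit inside one long quoted string. Here is a minimal sketch of how the string can be assembled. The helper names are hypothetical; the paths and the URL are the ones used in this article.

```python
def symbol_store(downstream, upstream, server="srv"):
    """Build one store entry in the form server*downstream*upstream."""
    return "*".join([server, downstream, upstream])


def symbol_path(*stores):
    """Join several store entries with semicolons for the -y option."""
    return ";".join(stores)


MS_PUBLIC = "http://msdl.microsoft.com/download/symbols"

path = symbol_path(
    symbol_store(r"c:\symbols", r"c:\ThrowException"),  # application symbols
    symbol_store(r"c:\symbols", MS_PUBLIC),             # Microsoft public store
)
print(path)
# srv*c:\symbols*c:\ThrowException;srv*c:\symbols*http://msdl.microsoft.com/download/symbols
```

The same string can be passed to "-y" in quotes or assigned to the "_NT_SYMBOL_PATH" environment variable.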
For a complete list of all the ADPlus command line arguments, please refer to the topic "ADPlus Command-Line Options" in the "Debugging Help" section. It also explains how you can create a configuration file with all these settings and tell ADPlus with the "-c <configuration file path>" option to use the configuration file instead. Assuming that the application ThrowException runs under the process ID 2828, here is how to start ADPlus in crash mode, logging all information in the "c:\crashlogs" folder.
ADPlus -crash -p 2828 -o c:\crashlogs -y "srv*c:\symbols*c:\ThrowException;srv*c:\symbols*http://msdl.microsoft.com/download/symbols" -quiet -FullOnFirst
This spawns a new window that shows the CDB debugger attached to your application. You can press Ctrl+C in that window any time to take a hang dump if no crash happens. But, this will terminate the process. ADPlus cannot be run in crash mode through Terminal Server on Windows NT 4.0 and Windows 2000. The following article explains how to run in crash mode remotely. It also contains more detailed information about how to use ADPlus.
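When you run ADPlus repeatedly against different processes, it can be convenient to build the command line from a few parameters instead of retyping it. The following sketch only assembles the argument list and does not start anything. The function name and its parameters are hypothetical, and it uses only the options described in this article.

```python
def adplus_args(mode, pids, output_dir, symbol_path=None, quiet=True, extra=()):
    """Build the argument list for an ADPlus run; nothing is executed here."""
    if mode not in ("crash", "hang"):
        raise ValueError("mode must be 'crash' or 'hang'")
    if not pids:
        raise ValueError("at least one process ID is required")

    args = ["ADPlus", "-" + mode]
    for pid in pids:
        args += ["-p", str(pid)]   # -p is repeated once per process
    args += ["-o", output_dir]
    if symbol_path:
        args += ["-y", symbol_path]
    if quiet:
        args.append("-quiet")      # suppress the initial dialog box
    args += list(extra)            # for example "-FullOnFirst"
    return args


print(adplus_args("crash", [2828], r"c:\crashlogs", extra=["-FullOnFirst"]))
```

Keeping the arguments as a list rather than one string avoids quoting mistakes around the symbol path if you later hand the list to a process launcher.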
Running ADPlus in Hang Mode
A hang dump is taken the moment you run ADPlus in hang mode. The CDB debugger attaches to the process, freezes it, takes a full dump, detaches, and then lets the process resume. This does not terminate the process at all. The hang mode can be run locally or remotely through Terminal Server. All command line options explained in the previous section apply to the hang mode, except the Exception mode and Notification. Here is a sample of a hang dump:
ADPlus -hang -p 2828 -o c:\crashlogs -y "srv*c:\symbols*c:\ThrowException;srv*c:\symbols*http://msdl.microsoft.com/download/symbols" -quiet
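Because every run creates its own uniquely named folder under the "-o" path, you need to find the right folder before you can copy the dump off the production machine. The sketch below picks the most recently modified run folder and lists its ".dmp" files. The function name is hypothetical, and it assumes that the folder modification time is a good enough indicator of the latest run.

```python
from pathlib import Path


def newest_dumps(log_root):
    """Return the .dmp files in the most recently modified run folder.

    `log_root` is the folder passed to ADPlus with -o. An empty list is
    returned when no run folder exists yet.
    """
    run_dirs = [d for d in Path(log_root).iterdir() if d.is_dir()]
    if not run_dirs:
        return []
    latest = max(run_dirs, key=lambda d: d.stat().st_mtime)
    return sorted(latest.glob("*.dmp"))
```

For example, `newest_dumps(r"c:\crashlogs")` would return the dump files of the last run, ready to be copied to the machine where you analyze them.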
Analyzing the Dump File
srv*c:\symbols*c:\ThrowException;
srv*c:\symbols*http://msdl.microsoft.com/download/symbols
Loading the SOS Debugger Extension
- .load: The load command can be used to load debugging extensions. If you have an environment variable that has a path to the .NET framework files, you can simply use ".load sos". You can also be explicit and type in the full path ".load c:\program files\debugging tools for windows\clr10\sos". You will debug a .NET 2.0 Beta 1 application, so you use ".load c:\windows\Microsoft.NET\Framework\v2.0.40607\sos" instead.
- .unload: Unloads a debugger extension. You can provide the name of the extension, for example sos; if no name is provided, it unloads the last extension loaded. You also can use the .unloadall command to unload every extension loaded.
- .chain: Displays the chain of loaded debugger extensions.
Digging into the .NET Dump File Using SOS
ID ThreadOBJ State Domain APT Exception
. 0 1 001530f8 6020 00149ff8 STA System.IO.DirectoryNotFoundException (00c01c3c)
2 2 00161248 b220 00149ff8 MTA (Finalizer)
0:000> !PrintException 00c01c3c
Exception type: System.IO.DirectoryNotFoundException
Message: Could not find a part of the path 'w:\MyFile.txt'.
InnerException: <none>
StackTrace (generated):
<none>
StackTraceString: <none>
HResult: 80070003
ESP EIP
0012f140 77e649d3 [Frame: 12f140]
0012f180 78cb9da9 System.IO.__Error.WinIOError(Int32, System.String),
mdToken: 06003489
0012f1ac 78a8add4 System.IO.FileStream.Init(System.String, ...),
mdToken: 060035d9
0012f258 78a8aa13 System.IO.FileStream..ctor(System.String, ...),
mdToken: 060035d8
0012f288 78a8d295 System.IO.FileStream..ctor(System.String, ...),
mdToken: 060035d5
0012f2b0 78b66e59 System.IO.File.Create(System.String, Int32, Boolean),
mdToken: 0600355e
0012f2c4 78b66e05 System.IO.File.Create(System.String),
mdToken: 0600355c
0012f2c8 78b68250 System.IO.FileInfo.Create(), mdToken: 0600359b
0012f2cc 0520039d ThrowException.ThrowException.UnhandledException2_Click(...),
mdToken: 0600000d
0012f2d8 7b3b5f15 System.Windows.Forms.Control.OnClick(System.EventArgs),
mdToken: 0600126d
0012f2e8 7b3ed65d System.Windows.Forms.Button.OnClick(System.EventArgs),
mdToken: 06001b69
0012f2f4 7b3ed7a9 System.Windows.Forms.Button.OnMouseUp(...),
mdToken: 06001b6b
0012f31c 7b3ba2f7 System.Windows.Forms.Control.WmMouseUp(...),
mdToken: 06001346
0012f358 7b363775 System.Windows.Forms.Control.WndProc(...),
mdToken: 06001359
0012f374 7b371345 [Frame: 12f374]
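Reading the stack from the top down, the frames under "System." belong to the framework, and the first frame outside it is usually a good place to start looking in your own code. As a rough sketch, that first application frame can be pulled out of a listing like the one above with a few lines of Python. The function names are hypothetical, and the sketch assumes the two-column address layout shown here, with wrapped lines belonging to the frame before them.

```python
import re

ADDRESSES = re.compile(r"^[0-9a-f]{8} [0-9a-f]{8} ")
FRAME = re.compile(r"^[0-9a-f]{8} [0-9a-f]{8} (?P<method>[^\[(,]+)\(")


def join_wrapped(lines):
    """Merge continuation lines into the frame line that precedes them."""
    frames = []
    for line in lines:
        text = line.strip()
        if ADDRESSES.match(text) or not frames:
            frames.append(text)
        else:
            frames[-1] += text  # no space: a method name may be split mid-word
    return frames


def first_app_frame(lines, framework_prefixes=("System.",)):
    """Return the first managed method that is not in a framework namespace."""
    for frame in join_wrapped(lines):
        match = FRAME.match(frame)
        if match and not match.group("method").startswith(framework_prefixes):
            return match.group("method")
    return None


listing = """\
0012f140 77e649d3 [Frame: 12f140]
0012f2c4 78b66e05 System.IO.File.Create(System.String),
mdToken: 0600355c
0012f2c8 78b68250 System.IO.FileInfo.Create(), mdToken: 0600359b
0012f2cc 0520039d ThrowException.ThrowException.
UnhandledException2_Click(...), mdToken:0600000d
0012f2d8 7b3b5f15 System.Windows.Forms.Control.OnClick(System.EventArgs),
mdToken: 0600126d
""".splitlines()

print(first_app_frame(listing))
```

For the excerpt above, this points at the click handler in the sample application, which is the method that called System.IO.FileInfo.Create.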
Downloads