Thursday, December 29, 2011

PerfMonitor primer

Perfmonitor is a ETW based tool for 'profiling' managed code, it can be downloaded from CodePlex. The December issue of MSDN magazine has a nice article to get you started: Performance Diagnostics of .NET Applications Using ETW. It works best on .NET 4.0 but some features also work on CLR2.x/.NET 3.5.

One of the great things about PerfMonitor (besides being a free download) is that it is easy to install on a target system (no installers just copy the exe). Just carry it along on your USB drive and within minutes your are ready to gather data which can help you solve a performance problem.

The tool has a build in manual, running the tool without a commandline parameters will start your browser and will show the user guide.

There are two steps to using the tool
1. Collect data
2. Post processing of the data and generating a report.

A typical scenario might look like this:
  1. Get the process ID of the process your are monitoring (tasklist)
  2. start the collection: perfmonitor.exe collect
  3. <perform use case>
  4. <stop collection>
  5. Merge the ETL files that have been generard: performonitor.exe merge
  6. Generate the report: performonitor.exe /process:<ProcID> analyze 
  7. Evaluate the report
A list of command line options and qualifiers can be found below.

Data collection
To collect the data there are a number of options (the complete overview can be found in the table below):
  1. Explicitly start and stop the data collection (perfmonitor.exe start [<filename.etl>] / performonitor.exe stop)
  2. perfmonitor.exe collect
  3. perfmonitor.exe. runAnalyze <executable>
    The 'runAnalyze' option collects the data and automatically generates a report.

Data CollectionDescription
runAnalyze CommandAndArgs ...Performs a run command then the analyze command.
run CommandAndArgs ...Starts data collection, runs a command and stops.
collect [DataFile]Starts data logging, then displays a messagebox. When dismissed, it then stops collection.
start [DataFile]Starts machine wide profile data collection.
stopStop collecting profile data (machine wide).
abortInsures that any active PerfMonitor sessions are stopped.
listSessionsLists active ETW sessions.
Listen Providers ...Turns on ETW logging for EventSources (providers), and then writes text file (by default to stdout).
runDump CommandAndArgs ...Performs a run command then the Dump command.
monitor CommandAndArgs ...Turns on all event sources, performs a run command.
monitorDump CommandAndArgs ...Turns on all event sources, performs a run command then the Dump command.
monitorPrint CommandAndArgs ...Turns on all event sources, performs a run command then the PrintSources command.

Reporting
 After the data has been collected a report can be generated. This can be done using the following command: perfmonitor analyze [<filename.etl>]

There are several options that can be used when generating the report, here is a list of all the options.

ReportingDescription
analyzeCreates a general performance report report (CPU, GC, JIT ...).
merge [DataFile]Combine separate ETL files into a single ETL file (that can be decoded on another machine).
stats [DataFile]Produce an XML report of what events are in an ETL file.
procs [DataFile]Display the processes that started in the trace.
GCTimeCreates a Report on .NET garbage collection
CpuTimeCreates a Report on CPU usage time.
JitTimeCreates a Report on .NET Just In Time compilation.
etlx [DataFile]Create a ETLX file from the ETL files.
PrintSourcesTakes an ETL file and generates a text file that prints EventSource events as formatted strings.
dump [DataFile]Convert the events to an XML file.
rawDump [DataFile]Convert the events with no preprocessing (thus stacks are individual events, and rundown events are shown.

MiscDescription
UsersGuideDisplays the users Guide.
cpuTestRun a simple, CPU bound routine. Useful as a tutorial example.
listSources ExeFileLists all System.Diagnostics.EventSources in a executable (or its static dependencies)
monitorProcsEXPERIMENTAL: Monitor process creation in real time.


Here is an overview of all the qualifiers
Qualifiers Description
-BufferSize:INT32(64)The size the buffers (in MB) the OS should use. Increase this if PerfMonitor warns you that the log lost events.
-Circular:INT32(0)Do Circular logging with a file size in MB. Zero means non-circular.
-MergeDo a merge after stopping collection.
-EtlxConvert to etlx (contains symbolic information) after stopping collection.
-NoRundownDon't request CLR Rundown events.
-RundownTimeout:INT32(30)Maximum number of seconds to wait for CLR rundown to complete.
-MinRundownTime:INT32(0)Minimum number of seconds to wait for CLR rundown to complete.
-SymbolsForDlls:STRING,...A comma separated list of DLL names (without extension) for which to look up symbolic information (PDBs).
-PrimeSymbolsNormally only machine local paths are searched for symbol info. This indicates symbol servers should be searched too.
-DataFile:STRINGFileName of the profile data to generate.
-AdditionalProviders:STRING,...Providers in ADDITION to the kernel and CLR providers. This is comma separated list of ProviderGuid:Keywords:Level specs.
-Providers:STRING,...Providers to turn on INSTEAD of the kernel and CLR providers. This is comma separated list of Provider:Keywords:Level specs were Provider is either a GUID, or @FileName#EventSourceName.
-ClrEvents:KEYWORDS(Default)A comma separated list of .NET CLR events to turn on. Legal values: None, GC, Binder, Loader, Jit, NGen, StartEnumeration, StopEnumeration, Security, AppDomainResourceManagement, JitTracing, Interop, Contention, Exception, Threading, Default, Stack.
-KernelEvents:KEYWORDS(Default)A comma separated list of windows OS kernel events to turn on. See Users guide for details. Legal values: None, Process, Thread, ImageLoad, ProcessCounters, ContextSwitch, DeferedProcedureCalls, Interrupt, SystemCall, DiskIO, DiskFileIO, DiskIOInit, Dispatcher, MemoryPageFaults, MemoryHardFaults, VirtualAlloc, NetworkTCPIP, Registry, AdvancedLocalProcedureCalls, SplitIO, Driver, OS, Profile, Default, FileIO, FileIOInit, Verbose, All.
-NoBrowserDon't launch a browser on an HTML report.
-Process:STRINGFilter events to just one process with the given name (exe without extension) or numeric process ID.
-StartTime:DOUBLE(0)Filter events happening before this time (msec from start of trace.
-EndTime:DOUBLE(Infinity)Filter events happening after this time (msec from start of trace.
-Threshold:SINGLE(5)Stack items less than this % threshold are folded into their parent node.
-NoOSGroupingTurn off the grouping of OS functions in stack traces.
-NoBrowserDon't launch a browser on an HTML report.
-WriteXmlWrite an XML report as well as the HTML report.

Ok, I hope that this gets you started with this great tool!

Debugging Resources

I regularly surf the web to read about debugging tips and tricks. This entry contains a list resources that I found worth wile. I will be updating as I go along.

WinDbg

Visual Studio

Friday, December 23, 2011

A simple approach that paid off

I am currently migrating a relatively large application from VS2008/NET3.5 to VS2010/NET4. During this migration we ran into a problem. After we had migrated the code we wanted to test the application. For this we have a number of automated tests. We use Ranorex to drive part of our application.
While testing we found that the application reacted in a sluggish way and the automated tests failed due to time-outs.  We also observed that when doing the same tests ,manually the problem did occur. So what was happening and why this strange behavior.

I followed a simple approach  which paid of and which gave us enough information pin point the problem.
I fired up WinDbg attached to the process and loaded the sos extension (.loadby sos clr).
I listed all the thread (!threads) and looked up the thread ID of the UI thread.
I then resumed the program. 

I prepared the following command (17 being the ID of the UI thread):  ~17s;!clrstack;g

I then reproduced the problem. At that point I quickly hit break and executed the above command. 
I repeated this a number of times. After doing so I started looking at the different calls stacks.
What I noticed that all the stacks looked similar.

OS Thread Id: 0x1f08 (17)
Child SP IP       Call Site
0a54be30 5c075e00 Ranorex.Libs.WinForms.AssemblyResolver.AssemblyResolveHandler(System.Object, System.ResolveEventArgs)
0a54be34 68db0de1 System.AppDomain.OnAssemblyResolveEvent(System.Reflection.RuntimeAssembly, System.String)
0a54c074 6a2121db [GCFrame: 0a54c074] 
0a54d0ac 6a2121db [HelperMethodFrame_PROTECTOBJ: 0a54d0ac] System.Reflection.RuntimeAssembly._nLoad(System.Reflection.AssemblyName, System.String, System.Security.Policy.Evidence, System.Reflection.RuntimeAssembly, System.Threading.StackCrawlMark ByRef, Boolean, Boolean, Boolean)
0a54d154 68dce713 System.Reflection.RuntimeAssembly.InternalGetSatelliteAssembly(System.String, System.Globalization.CultureInfo, System.Version, Boolean, System.Threading.StackCrawlMark ByRef)
0a54d17c 68dce4ea System.Resources.ManifestBasedResourceGroveler.GetSatelliteAssembly(System.Globalization.CultureInfo, System.Threading.StackCrawlMark ByRef)
0a54d1c0 68dbd314 System.Resources.ManifestBasedResourceGroveler.GrovelForResourceSet(System.Globalization.CultureInfo, System.Collections.Generic.Dictionary`2<System.String,System.Resources.ResourceSet>, Boolean, Boolean, System.Threading.StackCrawlMark ByRef)
0a54d230 68dbcf95 System.Resources.ResourceManager.InternalGetResourceSet(System.Globalization.CultureInfo, Boolean, Boolean, System.Threading.StackCrawlMark ByRef)
0a54d298 68dbcd51 System.Resources.ResourceManager.InternalGetResourceSet(System.Globalization.CultureInfo, Boolean, Boolean)
0a54d2b4 68dd324a System.Resources.ResourceManager.GetResourceSet(System.Globalization.CultureInfo, Boolean, Boolean)
.....

To further investigate the problem I put a breakpoint on the AssemblyResolveHandler. To get the address of the JIT'ed code I used the following command:

0:083> !name2ee Ranorex.Libs.WinForms.dll Ranorex.Libs.WinForms.AssemblyResolver.AssemblyResolveHandler
Module:      5c021000
Assembly:    Ranorex.Libs.WinForms.dll
Token:       060001cd
MethodDesc:  5c0354a4
Name:        Ranorex.Libs.WinForms.AssemblyResolver.AssemblyResolveHandler(System.Object, System.ResolveEventArgs)
JITTED Code Address: 5c075e00

Next I set a breakpoint using : bu 5c075e00 "!clrstack;g"
and opened a logfile (.logopen <filename>) and reran the test.

This way I found that there were indeed a large number of calls to this method. I repeated the experiment on our VS2008 build and here I did not see a large amount of calls to this method. 

The next step in the investigation was to examine the call-stack in more detail. We found that the "System.AppDomain.OnAssemblyResolveEvent" behave differently in framework version 4. MSDN states: 

Beginning with the .NET Framework 4, the ResolveEventHandler event is raised for all assemblies, including resource assemblies. In earlier versions, the event was not raised for resource assemblies. If the operating system is localized, the handler might be called multiple times: once for each culture in the fallback chain.

Debugging along we found that the library that our application is using could not find a resource (tooltip) which resulted in lots of callback to the Ranorex code. 

So using this simple method and the help of google (leading us to MSDN) we were able to pin point the problem in a relatively short time. 

Happy bug hunting

Sunday, December 18, 2011

The right tool for the job

Having the right tools for the job can save your day. Over the years I have been using a number of tools when trouble shooting problems. Here my shopping list of (mostly free) tools that I use in my day to day work.

If you have a favourite must have tool for trouble shooting which is not on the list then please let me know!

Performance

  • Xperf
  • ProcessExplorer: Taks manager on steroids
  • ProcMon: SysInternals tool for monitoring file, registry and network events
  • PerformanceMonitor: including PAL
  • PerfMonitor: .NET/CLR profiling
  • RML: SQL performance problems
  • AQ Time (commercial): profiling C# code
  • V-Tune (commercial)

Debugging

General Purpose Tools

Sunday, December 11, 2011

WinDbg: conditional break-points to the rescue

I work on a large .NET based application and although I have access to all the source code (million lines of source code) it is somethimes not easy to pin point a problem. One approach that has been helpful over time is to use the debugger and set a conditional break-point.

Our test systems are not equiped with Visual Studio so most of the time I use WinDbg (which I alway easily can install from my USB drive). In this particular case I needed to figure out which piece of code was opening a particular file (I knew that the problem I was looking at was related to this file by using one of my other favorite tools Xperf).

So after installing WinDbg and setting up the system I attached the debugger and loaded the sos debugging extension (.loadby sos clr OR .loadby sos mscorwks if you are still on CLR 2.0).

Then I set the break-point:

bu KERNEL32!CreateFileW ";as /mu ${/v:fileName} poi(@esp+4); .if ( $spat( \"${fileName}\", \"*test*\" ) != 0 ) { du poi(esp+4);!clrstack; g } .else { g }"


This breakpoint does the following:
- Break on a call to CreateFileW (also used for opening files)
- stores the name of the file that needs to be opened (including the path) into the alias 'fileName'
- Checks if the filename contains a certain pattern (in this case *test*) and if so:
- show the filename
- shows the calls stack
- continue the execution

I then executed the particular test case to get a repro. The debugger output led me right to the code opening the file.

You can apply this technique to many other problems. For example you can figure out who is creating threads or who is allocating large blocks of memory.

Happy bug hunting!

Hello world

This blog is intended to contain a random selection of posts on performance and debugging problems which might be interesting to the rest of the world. I have learned a lot from blog post from other and like to add my share. So I hope you find some usefull stuff on this blog that can save you day.