SRC Technical Note 1997-016a
Continuous Profiling: Where Have All the Cycles Gone?
Jennifer M. Anderson, Lance M. Berc, Jeffrey Dean, Sanjay Ghemawat, Monika
R. Henzinger, Shun-Tak A. Leung, Richard L. Sites, Mark T. Vandevoorde, Carl A.
Waldspurger, and William E. Weihl.
Note #1997-016a. July 28, 1997. Modified September 3, 1997.
Supersedes SRC Technical Note 1997-016.
This paper describes the Digital Continuous Profiling Infrastructure (DCPI),
a sampling-based profiling system designed to run continuously on production
systems. The system supports multiprocessors, works on unmodified executables,
and collects profiles for entire systems, including user programs, shared
libraries, and the operating system kernel. Samples are collected at a high rate
(over 5200 samples/sec per 333-MHz processor), yet with low overhead (1-3%
slowdown for most workloads).
Analysis tools supplied with the profiling system use the sample data to
produce a precise and accurate accounting, down to the level of pipeline stalls
incurred by individual instructions, of where time is being spent. When
instructions incur stalls, the tools identify possible reasons, such as cache
misses, branch mispredictions, and functional unit contention. The fine-grained
instruction-level analysis guides users and automated optimizers to the causes
of performance problems and provides important insights for fixing them.
Go to the
SRC Technical Notes main page.
Download note as: