SRC Technical Note 1997-016a
Continuous Profiling: Where Have All the Cycles Gone?
Jennifer M. Anderson, Lance M. Berc, Jeffrey Dean, Sanjay Ghemawat, Monika R.
Henzinger, Shun-Tak A. Leung, Richard L. Sites, Mark T. Vandevoorde, Carl A. Waldspurger,
and William E. Weihl.
Note #1997-016a. July 28, 1997. Modified September 3, 1997. Supersedes
SRC Technical Note 1997-016.
This paper describes the Digital Continuous Profiling Infrastructure
(DCPI), a sampling-based profiling system designed to run continuously on production
systems. The system supports multiprocessors, works on unmodified executables, and collects
profiles for entire systems, including user programs, shared libraries, and
the operating system kernel. Samples are collected at a high rate (over 5200 samples/sec per 333-MHz
processor), yet with low overhead (1-3% slowdown for most workloads).
Analysis tools supplied with the profiling system use the sample data to produce a
precise and accurate accounting, down to the level of pipeline stalls incurred by
individual instructions, of where time is being spent. When instructions incur stalls, the
tools identify possible reasons, such as cache misses, branch mispredictions, and
functional unit contention. The fine-grained instruction-level analysis guides users and
automated optimizers to the causes of performance problems and provides important insights
for fixing them.
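
To put the quoted sampling figures in perspective, the following back-of-the-envelope sketch converts the rate and clock speed given above into an average number of cycles between samples, and into a rough per-sample cycle budget. The function names are hypothetical, and the budget calculation assumes that the entire 1-3% slowdown is spent on sample collection, which the abstract does not state.

```python
# Figures quoted in the abstract.
CLOCK_HZ = 333_000_000      # 333-MHz processor
SAMPLES_PER_SEC = 5200      # "over 5200 samples/sec", so this is a lower bound


def mean_cycles_between_samples(clock_hz, samples_per_sec):
    """Average number of processor cycles that elapse between two samples."""
    return clock_hz / samples_per_sec


def cycles_per_sample_budget(clock_hz, samples_per_sec, overhead_fraction):
    """Average cycles available per sample for a given total slowdown.

    Assumption (not from the note): all of the slowdown is attributed
    to the cost of collecting samples.
    """
    return overhead_fraction * clock_hz / samples_per_sec


if __name__ == "__main__":
    interval = mean_cycles_between_samples(CLOCK_HZ, SAMPLES_PER_SEC)
    print(f"about {interval:.0f} cycles between samples")
    for overhead in (0.01, 0.03):
        budget = cycles_per_sample_budget(CLOCK_HZ, SAMPLES_PER_SEC, overhead)
        print(f"{overhead:.0%} slowdown -> about {budget:.0f} cycles per sample")
```

Because 5200 samples/sec is a lower bound on the rate, the resulting interval of roughly 64,000 cycles is an upper bound on the average spacing between samples.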