dcpildlatency(1)
NAME
dcpildlatency - A DCPI value profiler for measuring load latencies.
OVERVIEW
DCPI's value-profiling infrastructure contains experimental support for
measuring the actual latencies experienced by loads in running programs.
A dcpivprofiler(1) value-profiling module
named vp-ldlatency.so is included with the DCPI release.
DESCRIPTION
The DCPI value profiler includes an Alpha interpreter that fetches and
interprets a number of instructions starting with the interrupted PC. As
each instruction is interpreted, values of interest are captured and recorded.
When the interpreter encounters a load instruction, it executes additional
timing code to measure the elapsed time required to complete the load. This
timing code uses Alpha rpcc instructions and is carefully structured
to prevent unwanted out-of-order execution.
The raw latencies captured by the interpreter must be adjusted slightly
to account for the extra cycles consumed by the instructions that perform the
timing and enforce ordering constraints. Although each latency is measured
directly, some sources of error remain, such as cache interference from the
interrupt handler and the performance-counter interrupt code.
DATA COLLECTION
To collect load latency data, start dcpid(1) with
the -vtrace option, specifying the vp-ldlatency.so vprofiler
module. Note that the absolute pathname must be specified. For example:
% dcpid -vtrace /usr/lib/dcpi/vp-ldlatency.so db
This command will start dcpid with load latency value profiling. The
underlying value-profiling infrastructure will store a value hotlist associated
with the PC of each profiled load instruction. Each value hotlist has a fixed
size (currently 16 entries), and is updated using statistical techniques that
maintain the most frequently occurring values and their relative frequencies.
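The exact update rule used by DCPI is not described here. As a rough illustration of how a fixed-size list can track the most frequent values, the sketch below uses the Space-Saving frequent-items scheme. That choice is an assumption made for illustration and is not necessarily what DCPI implements; the class name Hotlist and its methods are hypothetical.

```python
from typing import Dict


class Hotlist:
    """Fixed-size table of approximately most-frequent values (illustrative only)."""

    def __init__(self, capacity: int = 16) -> None:
        # 16 matches the hotlist size quoted in the text.
        self.capacity = capacity
        self.counts: Dict[int, int] = {}

    def record(self, value: int) -> None:
        """Record one observed value, evicting the least-counted entry if full."""
        if value in self.counts:
            self.counts[value] += 1
        elif len(self.counts) < self.capacity:
            self.counts[value] = 1
        else:
            # Space-Saving step (assumed scheme): replace the entry with the
            # smallest count and let the newcomer inherit that count plus one.
            victim = min(self.counts, key=self.counts.get)
            floor = self.counts.pop(victim)
            self.counts[value] = floor + 1

    def frequencies(self) -> Dict[int, float]:
        """Return each retained value's share of the retained counts."""
        total = sum(self.counts.values())
        ordered = sorted(self.counts.items(), key=lambda kv: -kv[1])
        return {value: count / total for value, count in ordered}
```

Because the table holds a fixed number of entries, values that occur rarely can be evicted, and the reported frequencies are estimates rather than exact counts.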
Since individual load latencies will vary, it is sometimes desirable to
cluster raw latency values into histogram bins associated with levels
of the memory hierarchy. This can be accomplished by specifying an optional latency-bins file
name argument with the vp-ldlatency.so module; note the need to
quote the library and its argument together as a single option string:
% dcpid -vtrace '/usr/lib/dcpi/vp-ldlatency.so latency-bins' db
The latency-bins argument names a text file containing mappings from
raw latency values into representative values and associated names. The file
format is simple: blank lines and lines starting with the comment character # are
ignored. Each remaining line must contain four whitespace-separated fields: MIN, MAX, REP,
and NAME. Such a line specifies that raw latency values (measured in processor
cycles) in the closed interval [MIN, MAX] are mapped to the representative
value REP during data collection. The string NAME is used by
tools that report data values for analysis. Raw latency values not covered
by any of the specified intervals are not modified.
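The mapping described above can be sketched in a few lines of Python. This illustrates the file format as documented here and is not the code used by vp-ldlatency.so. The names Bin, parse_bins, map_latency, and name_for are hypothetical, and two behaviors are assumptions: leading whitespace before # is tolerated, and the first matching interval wins if intervals overlap.

```python
from typing import List, NamedTuple, Optional


class Bin(NamedTuple):
    lo: int    # MIN, in processor cycles
    hi: int    # MAX, in processor cycles
    rep: int   # REP, the representative value stored in the profile
    name: str  # NAME, the label shown in reports


def parse_bins(text: str) -> List[Bin]:
    """Parse latency-bins file contents into a list of Bin records."""
    bins = []
    for lineno, line in enumerate(text.splitlines(), 1):
        stripped = line.strip()
        # Blank lines and comment lines are ignored.
        if not stripped or stripped.startswith("#"):
            continue
        fields = stripped.split()
        if len(fields) != 4:
            raise ValueError(f"line {lineno}: expected MIN MAX REP NAME")
        lo, hi, rep = (int(f) for f in fields[:3])
        bins.append(Bin(lo, hi, rep, fields[3]))
    return bins


def map_latency(raw: int, bins: List[Bin]) -> int:
    """Map a raw latency to its REP value; uncovered values pass through."""
    for b in bins:
        if b.lo <= raw <= b.hi:
            return b.rep
    return raw


def name_for(value: int, bins: List[Bin]) -> Optional[str]:
    """Return the NAME whose REP equals value, or None if no bin matches."""
    for b in bins:
        if b.rep == value:
            return b.name
    return None
```

With the sample bins shown under FILES, a raw latency of 15 cycles would be recorded as 7 and reported as S, while a raw latency of 300 cycles would be recorded unchanged.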
Note that the latency-bins file must be manually constructed with
the proper values (measured in processor cycles) for a particular machine
and memory system. As mentioned above, raw latency values need some adjustments;
raw values for the on-chip caches are typically too large due to the cost
of timing code, while raw values for slower memory levels are typically too
small, perhaps due to prefetching. Separate tools can be used to automatically
probe the cycle latencies associated with various levels of the memory hierarchy
(e.g., by repeatedly striding through carefully-sized arrays), but no such
tools are included with the current DCPI release.
DATA REPORTING
The dcpilist(1) command can be used to produce
procedure listings annotated with load latency value profile information
collected using dcpid(1). The same -vtrace option
used with dcpid should be specified to dcpilist. For example,
the following command displays the load latency values alongside each
sampled instruction of a procedure in an image binary, given here as the
placeholder arguments procedure and image:
% dcpilist -vtrace /usr/lib/dcpi/vp-ldlatency.so procedure image
Note that the values reported will be raw latency values if no latency-bins file
was used during data collection, or the representative values if such a file
was used. If the same latency-bins file argument is used with dcpilist,
the string names associated with each bin will also be reported:
% dcpilist -vtrace '/usr/lib/dcpi/vp-ldlatency.so latency-bins' procedure image
FILES
Here is a sample latency-bins file used with an Alpha 21164 workstation.
Note that the bins for main memory are somewhat arbitrary. To ensure that
all collected values are reported, no more than 16 bins (the current hotlist
size) should be used:
# Example Load Latency Bins
# Entry format:
# MIN MAX REP NAME
#
# Maps [MIN, MAX] => REP in profiles.
# NAME is used for reports (dcpilist).
# Miata EV56 @ 600MHz
# lottery.pa.dec.com
# L1 (dcache)
0 10 2 D
# L2 (scache)
11 20 7 S
# L3 (bcache)
21 50 35 B
# (memory)
51 70 60 M1
71 90 80 M1
91 110 100 M1
111 130 120 M2
131 150 140 M2
151 170 160 M2
171 190 180 M2
191 210 200 M2
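Because the file is constructed by hand, a small sanity check before starting dcpid may be useful. The sketch below is a hypothetical helper, not part of DCPI. It checks the four-field format and the 16-bin limit described above; the checks that MIN does not exceed MAX and that intervals do not overlap are assumptions about what a well-formed file should look like, not documented requirements.

```python
from typing import List

HOTLIST_SIZE = 16  # current hotlist size, as stated in the text


def check_bins(text: str) -> List[str]:
    """Return a list of human-readable problems found in a latency-bins file."""
    problems = []
    bins = []  # (MIN, MAX, line number) for each well-formed entry
    for lineno, line in enumerate(text.splitlines(), 1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments are ignored
        fields = stripped.split()
        if len(fields) != 4:
            problems.append(f"line {lineno}: expected 4 fields, found {len(fields)}")
            continue
        try:
            lo, hi, _rep = (int(f) for f in fields[:3])
        except ValueError:
            problems.append(f"line {lineno}: MIN, MAX, and REP must be integers")
            continue
        if lo > hi:
            # Assumed requirement: an interval must not be empty.
            problems.append(f"line {lineno}: MIN {lo} exceeds MAX {hi}")
            continue
        bins.append((lo, hi, lineno))

    if len(bins) > HOTLIST_SIZE:
        problems.append(f"{len(bins)} bins exceed the hotlist size of {HOTLIST_SIZE}")

    # Assumed requirement: intervals should not overlap one another.
    bins.sort()
    for (lo1, hi1, n1), (lo2, hi2, n2) in zip(bins, bins[1:]):
        if lo2 <= hi1:
            problems.append(f"lines {n1} and {n2}: intervals overlap")
    return problems
```

An empty result means no problems were found by these particular checks; it does not confirm that the cycle values suit a given machine.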
CAVEATS
If the same latency-bins file is not specified for both dcpid(1) and dcpilist(1),
the string names reported with values may be incorrect. However, it is acceptable
to use a latency-bins file during data collection with dcpid and to omit
it with dcpilist; in that case, no string names are reported.
SEE ALSO
dcpi(1), dcpi2bb(1), dcpi2pix(1), dcpi2ps(1), dcpicalc(1), dcpicat(1), dcpicc(1), dcpicoverage(1), dcpictl(1), dcpid(1), dcpidiff(1), dcpidis(1), dcpiepoch(1), dcpiflow(1), dcpiflush(1), dcpikdiff(1), dcpilabel(1), dcpilist(1), dcpiprof(1), dcpiprofileme(1), dcpiquit(1), dcpiscan(1), dcpisource(1), dcpistats(1), dcpisumxct(1), dcpitar(1), dcpitopcounts(1), dcpitopstalls(1), dcpiuninstall(1), dcpiupcalls(1), dcpivarg(1), dcpivcat(1), dcpiversion(1), dcpivlst(1), dcpivprofiler(1), dcpiwhatcg(1), dcpix(1), dcpiformat(4), dcpiexclusions(4)
For more information, see the DCPI project home page http://h30097.www3.hp.com/dcpi.
COPYRIGHT
Copyright 1996-2004, Hewlett-Packard Company.
All rights reserved.