dcpivprofiler(1)
NAME
dcpivprofiler - Value profilers
OVERVIEW
DCPI's value-profiling support allows you to specify what values to capture,
how to process those values before merging them into the profile files, and
how to format the values for printing. A value profiler is a dynamically
loadable shared library written by the user. It encapsulates all the code
needed to perform user-specified processing.
You can specify a value profiler to dcpid(1) and dcpilist(1) with
the -vtrace command-line argument. dcpid calls the appropriate
routine in the value profiler to determine what values should be captured
and passes the necessary information to the driver. When dcpid receives
the captured values from the driver, it calls another routine in the value
profiler to process those values and merge the returned values into the profile
database. dcpilist extracts values from the profile database and
calls other routines in the value profiler to format the values for printing.
Two value profilers are included in the DCPI distribution: vp-classic.so collects
the same information as the "classic value profiling" hardwired into DCPI; vp-addr.so collects
the effective addresses of memory operands in load and store instructions.
The library and header files referred to here are installed by default
into /usr/lib/dcpi and /usr/include/dcpi respectively.
INTERFACE
A value profiler must implement an interface consisting of the functions
below. It should be linked with libtrace.a, distributed with DCPI and installed
by default into /usr/lib/dcpi, to produce a shared library (with the -shared switch).
The interface is defined in vprofiler.h. This and related header
files are also in the DCPI distribution, and they are installed by default
into /usr/include/dcpi.
To simplify the following discussion, we give these functions specific
names here. The names themselves are arbitrary: what the value profiler must
provide is a dispatch table called _dispatch containing pointers to these functions.
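The dispatch-table idea can be illustrated with a small self-contained sketch. Everything here is hypothetical: toy_dispatch_t, its three slots, and the toy_host_* routines are stand-ins invented for illustration, not the real vp_dispatch_t from vprofiler.h nor the way dcpid actually drives a profiler. The sketch also assumes that a NULL slot means "nothing to call", which the NULL entries in the EXAMPLE section suggest but this page does not state outright.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced dispatch table. The real vp_dispatch_t is
   defined in vprofiler.h and has more entries than this. */
typedef struct {
    void        (*init)(void);     /* optional: may be NULL */
    void        (*release)(void);  /* optional: may be NULL */
    const char *(*name)(void);     /* required */
} toy_dispatch_t;

static int init_calls = 0;

/* The function names are private to the profiler; the host only
   sees the pointers stored in the table. */
static void my_init(void) { init_calls++; }
static const char *my_name(void) { return "vp-toy"; }

toy_dispatch_t toy_dispatch = { my_init, NULL, my_name };

/* Assumed host behaviour: call init if present, then ask for the name. */
const char *toy_host_load(const toy_dispatch_t *d)
{
    if (d->init != NULL)
        d->init();
    return d->name();
}

/* Assumed host behaviour: call release if present. */
void toy_host_unload(const toy_dispatch_t *d)
{
    if (d->release != NULL)
        d->release();
}
```

Because the host reaches every routine through the table, the routines can be declared static and named freely, as the sample profiler in the EXAMPLE section does.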
- vp_name
- This returns the name of the value profile. The name is used internally
by DCPI to locate profile files and has no other special meaning. Different
value profilers must have different names, and the names "values", "values.replay",
and "values.trace" are reserved. vp_name is called by both dcpid and dcpilist.
Although vp_name may generate the name dynamically, it must return
the same name under dcpid and dcpilist so that dcpilist can find the
profile files created by dcpid.
- vp_init
- This is called after the value profiler is loaded, for example to perform
various initializations.
- vp_release
- This is called before the value profiler is unloaded, for example
to perform various cleanup functions.
- vp_prof_init
- This returns the data structure specifying what values the driver
should capture. This is called before value profiling is performed.
See below for what values may be specified and how to construct the
specification.
- vp_prof_release
- This is called after value profiling is done, for example
to clean up data structures.
- vp_prof_process
- This takes a trace of values captured
by the driver and produces an array of (pc,
context, value) tuples. dcpid calls vp_prof_process with
values coming from the driver and merges
the tuples returned by vp_prof_process into
the profile database.
The value trace from the driver consists of a series of entries, one
for each instruction selected for profiling (see below for how to specify
which instructions). Each entry contains the pc, the 32-bit instruction
code, context values if specified by the -vcontext command-line
argument of dcpid, and zero or more 64-bit values.
vp_prof_process is called with a parser object that helps
in decoding the value trace from the driver. vp_prof_process may
get general information with routines like trace_parser_get_pid and trace_parser_has_context.
Most importantly, it may use trace_parser_next to go through
and extract the information in the trace entries. See pcount/trace-parse.h for
details on the parser interface.
- vp_preface
- This takes a 32-bit instruction code and
returns a string that will be printed before
the list of most frequent values ("hotlist")
by dcpilist. If it returns NULL,
the entire value list is omitted.
- vp_format
- This takes a 32-bit instruction code and
a 64-bit value, and formats the value for
printing by dcpilist. Naturally,
it should return a string that represents
the value in a way most useful to human users.
For example, the result operand of a floating-point
instruction should probably be printed as
a floating-point number rather than a 64-bit
hexadecimal integer.
It can be assumed that for each instruction with a non-empty value
hotlist, dcpilist calls vp_preface exactly once and then vp_format as
many times as necessary before moving on to the next instruction. Therefore, vp_preface can
store information in static data structures for subsequent executions
of vp_format to pick up. This helps to avoid the overhead of
repeatedly parsing an instruction to figure out how the values in the
hotlist should be formatted.
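The preface-then-format calling pattern can be sketched as follows. This is only an illustration under stated assumptions: toy_preface and toy_format are hypothetical names, fixed-width types stand in for the uint and ulong used by DCPI, only Alpha opcode 0x16 (IEEE floating-point operate) is treated as producing a floating-point result, and the captured value is assumed to hold an IEEE double bit pattern.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Decoded once by toy_preface, then reused by every toy_format call
   for the same instruction. */
static int current_is_fp = 0;

/* Assumption: only opcode 0x16 counts as floating point here;
   a real profiler would need to cover more opcodes. */
const char *toy_preface(uint32_t inst)
{
    uint32_t opcode = inst >> 26;      /* 6-bit opcode in bits 31..26 */
    current_is_fp = (opcode == 0x16);
    return current_is_fp ? "fp result" : "int result";
}

const char *toy_format(uint32_t inst, uint64_t value)
{
    static char buffer[64];

    (void) inst;                       /* already decoded by toy_preface */
    if (current_is_fp) {
        double d;
        memcpy(&d, &value, sizeof d);  /* reinterpret the bits as a double */
        snprintf(buffer, sizeof buffer, "%g", d);
    } else {
        snprintf(buffer, sizeof buffer, "%" PRIx64, value);
    }
    return buffer;
}
```

The static flag is safe only because of the call ordering described above; if vp_format could be called for an instruction without a preceding vp_preface, it would have to decode the instruction itself.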
TRACE SPECIFICATIONS
You can specify what values to capture based on an instruction's 6-bit
opcode. A vlist is the list of values that the driver should capture for
all instructions having the same opcode. A trace specification is the set
of vlists for all valid opcodes.
For a particular opcode, you may specify that the instruction be ignored.
The instruction is still executed, of course, but the value trace from the
driver will contain no record of it.
If the instruction is not ignored, the driver records some basic information:
the pc, the 32-bit instruction code, and two context values if dcpid is
called with the -vcontext command-line argument. Capturing this
information typically requires only 8 bytes per instruction to be passed
from the driver to dcpid because the data are encoded incrementally.
However, to minimize overhead, you may still want to ignore instructions
whose execution is of no interest at all, depending on the particular value-profiling
application.
In addition, you may ask the driver to record zero or more values in the
trace, up to seven in the current implementation. Possible values include
- a specific integer or floating-point register (say, r17, or f4)
- register operands Ra, Rb, and Rc
- the effective address for any memory operand
- load instruction latency
Of course, this list may grow as other useful values are identified. All
values are captured after the instruction has been executed. Currently
there is no check to determine whether the specified value makes sense for
the instruction. For example, if Rc is specified for an instruction that
does not have an Rc operand, the driver will capture some undetermined value
without any warning.
Typically, a vlist is constructed by adding values to an initially empty
vlist, and similarly a trace specification is constructed by adding vlists
to an initially empty trace specification. The following routines can be
used for this purpose. See trace-vlist.h for the function prototypes.
This is only an experimental interface. It will be revised based on more
usage experience.
- trace_vlist_table_init(spec)
- Initialize an empty trace specification (i.e., the driver keeps no record
of any instruction execution).
- trace_vlist_init(vlist)
- Initialize a vlist that records only the basic information about the
instruction, namely the pc, 32-bit instruction code, and context values
(if dcpid is called with -vcontext).
- trace_add_value_to_vlist(vlist, value)
- Add value to vlist. vlist is not (yet) associated
with any trace specification.
- trace_add_vlist_by_opc(spec, opc, vlist)
- Add the values in vlist to the vlist for instructions having
the opcode opc.
- trace_set_vlist_by_pick_value(spec, func)
- The function func should take an opcode as an argument
and return a value type. It is called for each valid opcode.
The result that it returns is the only value that
will be captured for instructions having that opcode.
- trace_add_vlist_by_select(spec, selector, vlist)
- The function selector should take an opcode as
an argument and return a boolean result. It is called for
each valid opcode. The values in vlist are added to
the vlist for that opcode if and only if selector returns
true.
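To make the relationship between vlists and a trace specification concrete, here is a toy model of the semantics described above. It is not the DCPI implementation: every toy_* name and the data layout are invented for illustration, the selector takes only an opcode, and the loop visits all 64 possible 6-bit opcodes rather than only the valid ones. The real types and prototypes are in trace-vlist.h.

```c
#include <stddef.h>

#define TOY_NUM_OPCODES 64  /* a 6-bit opcode has 64 possible values */
#define TOY_MAX_VALUES  7   /* limit quoted for the current implementation */

/* Hypothetical value kinds; the real constants (such as TRACE_REGB)
   come from the DCPI headers. */
typedef enum { TOY_REGA, TOY_REGB, TOY_REGC, TOY_EFF_ADDR } toy_value_t;

typedef struct {
    int         recorded;   /* 0 = instruction ignored by the driver */
    size_t      nvalues;
    toy_value_t values[TOY_MAX_VALUES];
} toy_vlist_t;

typedef struct {
    toy_vlist_t by_opcode[TOY_NUM_OPCODES];
} toy_spec_t;

/* Empty specification: no instruction is recorded. */
void toy_spec_init(toy_spec_t *spec)
{
    for (unsigned opc = 0; opc < TOY_NUM_OPCODES; opc++) {
        spec->by_opcode[opc].recorded = 0;
        spec->by_opcode[opc].nvalues = 0;
    }
}

/* Fresh vlist: basic information only, no extra values. */
void toy_vlist_init(toy_vlist_t *vlist)
{
    vlist->recorded = 1;
    vlist->nvalues = 0;
}

/* Returns 0 on success, -1 if the vlist is already full. */
int toy_vlist_add(toy_vlist_t *vlist, toy_value_t value)
{
    if (vlist->nvalues >= TOY_MAX_VALUES)
        return -1;
    vlist->values[vlist->nvalues++] = value;
    return 0;
}

/* Append vlist's values to the entry of every opcode the selector
   accepts. Returns the number of opcodes updated. Values that do not
   fit are dropped in this toy model. */
int toy_spec_add_by_select(toy_spec_t *spec,
                           int (*selector)(unsigned opc),
                           const toy_vlist_t *vlist)
{
    int updated = 0;

    for (unsigned opc = 0; opc < TOY_NUM_OPCODES; opc++) {
        if (!selector(opc))
            continue;
        toy_vlist_t *dst = &spec->by_opcode[opc];
        dst->recorded = 1;
        for (size_t i = 0; i < vlist->nvalues; i++)
            (void) toy_vlist_add(dst, vlist->values[i]);
        updated++;
    }
    return updated;
}

/* Example selector: Alpha ldq (0x29) and ldq_u (0x0B). */
int toy_select_ldq(unsigned opc)
{
    return opc == 0x29 || opc == 0x0B;
}
```

In this model, opcodes that no call ever touches stay in the "ignored" state, which mirrors the statement that an empty trace specification keeps no record of any instruction.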
EXAMPLE
Here is a sample value profiler. It captures the effective addresses of
the memory operands in all ldq and ldq_u instructions.
#include <stdio.h>
#include <stdlib.h>
#include <machine/inst.h>
#include <vprofiler.h>

#define VSAMPLE_BUFFER_SIZE 1024

/* Tuples handed back to dcpid by vp_prof_process. */
static vsample_t vsample_buffer[VSAMPLE_BUFFER_SIZE];

/* Select the opcodes to profile: ldq and ldq_u only. */
static int select_loads_stores(uchar opc, uchar func)
{
    switch (opc) {
    case op_ldq:
    case op_ldq_u:
        return 1;
    }
    return 0;
}

static trace_vlist_table_t* table;

/* Build the trace specification: capture Rb for the selected opcodes. */
static trace_vlist_table_t* vp_prof_init(void)
{
    trace_vlist_t* vlist;

    table = trace_vlist_table_alloc();
    trace_vlist_table_init(table);
    vlist = trace_vlist_alloc();
    trace_vlist_init(vlist);
    trace_add_value_to_vlist(vlist, TRACE_REGB);
    trace_add_vlist_by_select(table, select_loads_stores, vlist);
    free(vlist);
    return table;
}

/* Convert trace entries into (pc, context, value) tuples. */
static int vp_prof_process(uint pid,
                           trace_parser_t* parser,
                           vsample_t** vsamples)
{
    int n = 0, nvalues, no_context;
    ulong pc, c0, c1, rb;
    union alpha_instruction inst;

    no_context = (! trace_parser_has_context(parser));
    while ((nvalues =
            trace_parser_next(parser, (uint*) &inst, &pc, &c0, &c1, 1, &rb)) >= 0) {
        if (nvalues == 1) {
            if (no_context) {
                c0 = c1 = 0;
            }
            vsample_buffer[n].pc = pc;
            /* Effective address = Rb + sign-extended displacement. */
            vsample_buffer[n].value = rb + (ulong) inst.m_format.memory_displacement;
            vsample_buffer[n].context0 = c0;
            vsample_buffer[n].context1 = c1;
            n++;
            /* Stop when the buffer is full; remaining entries are dropped. */
            if (n >= VSAMPLE_BUFFER_SIZE) {
                break;
            }
        }
    }
    *vsamples = vsample_buffer;
    return n;
}

static const char* vp_name(void)
{
    return "vp-ldq-addr";
}

static const char* vp_preface(uint inst)
{
    return "addr";
}

/* Print the low 48 bits of the address in hexadecimal. */
static const char* vp_format(uint inst, ulong value)
{
    static char buffer[32];

    sprintf(buffer, "%lx", value & ((1UL << 48) - 1));
    return buffer;
}

/* Entry points this profiler does not implement (vp_init, vp_release,
   vp_prof_release) are left NULL. */
vp_dispatch_t _dispatch = {
    NULL,
    NULL,
    vp_prof_init,
    NULL,
    vp_prof_process,
    vp_name,
    vp_preface,
    vp_format
};
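The address computation in vp_prof_process relies on the Alpha memory instruction format, in which the low 16 bits hold a signed displacement that is added to the value of Rb. The following standalone sketch shows the same arithmetic without the DCPI or machine/inst.h headers. The toy_* names are hypothetical, and fixed-width types replace uint and ulong so that the sketch also compiles on non-Alpha systems.

```c
#include <stdint.h>

/* Alpha memory format:
   opcode[31:26]  ra[25:21]  rb[20:16]  displacement[15:0] (signed) */

/* Sign-extend the 16-bit displacement field. */
int64_t toy_displacement(uint32_t inst)
{
    int32_t disp = (int32_t) (inst & 0xFFFFu);

    if (disp & 0x8000)
        disp -= 0x10000;
    return disp;
}

/* Effective address as computed by the sample profiler:
   captured Rb value plus the sign-extended displacement. */
uint64_t toy_effective_address(uint32_t inst, uint64_t rb)
{
    return rb + (uint64_t) toy_displacement(inst);
}
```

For instance, an ldq with displacement -8 and an Rb value of 0x1000 yields the address 0xff8. Capturing Rb and adding the displacement in the profiler is one way to obtain the address; the list of capturable values above also mentions the effective address itself.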
SEE ALSO
dcpi(1), dcpi2bb(1), dcpi2pix(1), dcpi2ps(1), dcpicalc(1), dcpicat(1), dcpicc(1), dcpicoverage(1), dcpictl(1), dcpid(1), dcpidiff(1), dcpidis(1), dcpiepoch(1), dcpiflow(1), dcpiflush(1), dcpikdiff(1), dcpilabel(1), dcpildlatency(1), dcpilist(1), dcpiprof(1), dcpiprofileme(1), dcpiquit(1), dcpiscan(1), dcpisource(1), dcpistats(1), dcpisumxct(1), dcpitar(1), dcpitopcounts(1), dcpitopstalls(1), dcpiuninstall(1), dcpiupcalls(1), dcpivarg(1), dcpivcat(1), dcpiversion(1), dcpivlst(1), dcpiwhatcg(1), dcpix(1), dcpiformat(4), dcpiexclusions(4)
For more information, see the DCPI project home page http://h30097.www3.hp.com/dcpi.
COPYRIGHT
Copyright 1996-2004, Hewlett-Packard Company.
All rights reserved.