Most modern processors support hardware performance counters, which measure various aspects of the processor's operation. These counters provide information ranging from the number of cache misses at each cache level, through the number of retired instructions, to (on some processors) the number of memory operations queued on the processor. Needless to say, these counters are processor dependent. Moreover, they may not even be backward compatible with earlier models of the same architecture.
To make Klogger as efficient as possible, all code that accesses the hardware performance counters is inlined. As a result, all abstractions are resolved at compile time and incur no runtime overhead from indirection (as opposed to the approach taken by other tools, such as PAPI [2]). Users should be warned, however, that because the counters are processor-model dependent, code compiled with performance counter support for one model may not run on another! Simply put, kernels compiled to use the PentiumIII's performance counters might not even boot on a PentiumIV machine.
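The compile-time approach described above can be illustrated with a minimal C sketch. It is not Klogger's actual code: the macro names (HPC_MODEL_A, HPC_MODEL_B), the counter counts, and the functions hpc_read, hpc_delta and hpc_fake_tick are all hypothetical, and the hardware registers are replaced by an in-memory array so that the example runs in user space. The point is only that the processor model is chosen by the preprocessor and the accessors are static inline, so no function-pointer table or runtime model check remains in the compiled code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical build-time model selection. A real build would define
 * exactly one HPC_MODEL_* macro; the names and counter counts below are
 * illustrative assumptions, not taken from Klogger. */
#if defined(HPC_MODEL_A)
#define HPC_NUM_COUNTERS 2u
#elif defined(HPC_MODEL_B)
#define HPC_NUM_COUNTERS 18u
#else
#define HPC_NUM_COUNTERS 2u   /* default used when no model is selected */
#endif

/* Stand-in for the hardware registers so the sketch runs in user space.
 * Real code would issue the model-specific counter-read instruction. */
static uint64_t hpc_fake_regs[HPC_NUM_COUNTERS];

/* Test helper (hypothetical): advance a simulated counter by n events. */
static inline void hpc_fake_tick(unsigned idx, uint64_t n)
{
    assert(idx < HPC_NUM_COUNTERS);
    hpc_fake_regs[idx] += n;
}

/* Inlined accessor: the model-specific body is fixed at compile time,
 * so a call site compiles to a direct read with no indirect call. */
static inline uint64_t hpc_read(unsigned idx)
{
    assert(idx < HPC_NUM_COUNTERS);
    return hpc_fake_regs[idx];
}

/* Events counted between two samples of the same counter. */
static inline uint64_t hpc_delta(uint64_t start, uint64_t end)
{
    return end - start;
}
```

A caller would sample hpc_read before and after the code of interest and pass both values to hpc_delta. The cost of this design is the one noted above: a binary built for one model contains only that model's access code, so it cannot adapt if it is run on a different processor.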