When running benchmarks, the user may choose to request collection of performance counters. This can be useful when investigating performance, for example when narrowing down the cause of a regression, or when verifying that the underlying cause of a performance improvement matches expectations.
This feature is available if:
The feature does not require modifying benchmark code. Counter collection is handled at the boundaries where timer collection is also handled.
To opt in:
* If using a Bazel build, add --define pfm=1 to your build flags.
* If using CMake, install libpfm4-dev, e.g. apt-get install libpfm4-dev, and enable BENCHMARK_ENABLE_LIBPFM in CMakeLists.txt.
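As a rough sketch, the two opt-in paths might look like the commands below. The Bazel label //my:benchmark_target and the build directory name are hypothetical placeholders, and passing the option with -D on the cmake command line (instead of editing CMakeLists.txt) is an assumption about how you prefer to set it.

```shell
# Bazel: enable libpfm support via the define described above.
# //my:benchmark_target is a hypothetical label; substitute your own target.
bazel build --define pfm=1 //my:benchmark_target

# CMake on a Debian-like system: install the libpfm development package,
# then turn on the BENCHMARK_ENABLE_LIBPFM option when configuring.
sudo apt-get install libpfm4-dev
cmake -S . -B build -DBENCHMARK_ENABLE_LIBPFM=ON
cmake --build build
```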
To use, pass a comma-separated list of counter names through the --benchmark_perf_counters flag. The names are decoded through libpfm, meaning they are platform specific, but some (e.g. CYCLES or INSTRUCTIONS) are mapped by libpfm to platform-specifics. See the libpfm documentation for more details.
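A minimal sketch of an invocation, assuming a benchmark binary named ./my_benchmark (a hypothetical name) that was built with libpfm support. The script only builds the flag and falls back to printing the command if the binary is not present.

```shell
# Comma-separated counter names; CYCLES and INSTRUCTIONS are the generic
# names mentioned above. Other names depend on what libpfm knows for your CPU.
counters="CYCLES,INSTRUCTIONS"
flag="--benchmark_perf_counters=${counters}"

# ./my_benchmark is a hypothetical binary name; substitute your own.
bench="./my_benchmark"
if [ -x "$bench" ]; then
  "$bench" "$flag"
else
  echo "would run: $bench $flag"
fi
```

No change to the benchmark source is needed for this; the flag alone selects which counters are collected.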
The counter values are reported back through the User Counters mechanism, so they are available in all the output formats (e.g. JSON) supported by User Counters.
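For instance, to get the counters in JSON, the perf counter flag can be combined with the library's --benchmark_format=json flag. This is a sketch only: ./my_benchmark is again a hypothetical binary name, and the exact field layout of the JSON is not shown here because it depends on the User Counters output.

```shell
# Request one counter and ask for JSON output on stdout.
perf_flag="--benchmark_perf_counters=CYCLES"
format_flag="--benchmark_format=json"

# ./my_benchmark is a hypothetical binary name; substitute your own.
bench="./my_benchmark"
if [ -x "$bench" ]; then
  "$bench" "$perf_flag" "$format_flag"
else
  echo "would run: $bench $perf_flag $format_flag"
fi
```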