Capturing Results
ReBench is designed to be used with existing benchmarking harnesses, which means we need to capture the measurements created by them. For this purpose, ReBench uses what we call 'gauge adapters'. They typically parse the output generated by harnesses.
Available Harness Support
ReBench currently provides builtin support for the following benchmark harnesses:
JMH
: JMH, Java's microbenchmark harness
PlainSecondsLog
: a plain seconds log, i.e., a floating point number per line
ReBenchLog
: the ReBench log format, which indicates benchmark name and run time in milliseconds or microseconds
SavinaLog
: the harness of the Savina benchmarks
ValidationLog
: the format used by SOMns's ImpactHarness
Time
: a harness that uses /usr/bin/time automatically
PlainSecondsLog
This adapter attempts to read every line of program output as a millisecond
measurement. Lines which cannot be parsed as floats are skipped, e.g., "1.0"
and "2" are valid, while "out: 1" and "1, 2, 3" are not and would be ignored.
Example output from a harness or benchmark:
342
543
100.23
54.12
Implementation Notes:
- Python's float() function is used for parsing
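The skipping behavior described above can be illustrated with a minimal sketch. It is not ReBench's adapter code: the function name `parse_plain_log` and the sample input are assumptions made for illustration, and only the use of Python's `float()` comes from the notes above.

```python
def parse_plain_log(output):
    """Return one float per line that float() accepts; skip all other lines."""
    values = []
    for line in output.splitlines():
        try:
            values.append(float(line))
        except ValueError:
            # Not a number on its own, e.g. "out: 1" or "1, 2, 3"
            continue
    return values


# Hypothetical harness output mixing valid and invalid lines
sample = "342\n543\nout: 1\n100.23\n1, 2, 3\n54.12\n"
parsed = parse_plain_log(sample)
```

Here `parsed` contains only the four numeric lines; the two lines that `float()` rejects are dropped silently.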
ReBenchLog
The ReBenchLog parser is the most commonly used and has the most features. It supports parsing of microsecond and millisecond values. Internally, however, ReBench stores all values as milliseconds using Python's floating point numbers, i.e., as 64-bit values.
Furthermore, it supports capturing other criteria in addition to the overall
run time, which can be useful to measure the time of subtasks or metrics such
as memory use. When other criteria are provided, the total time is expected to
be the last in the output, concluding the overall data point.
The approximate format that ReBenchLog parses is as follows:
optional_prefix benchmark_name: iterations=123 runtime: 1000[ms|us]
Example output from a harness or benchmark, each a different value for total run time:
Dispatch: iterations=1 runtime: 557ms
LanguageFeatures.Dispatch: iterations=1 runtime: 309557us
LanguageFeatures.Dispatch total: iterations=2342 runtime: 557ms
Example output with additional criteria that indicate memory use:
Savina.Chameneos: trace size: 3903398byte
Savina.Chameneos: external data: 40byte
Savina.Chameneos: iterations=1 runtime: 64208us
Savina.Chameneos: trace size: 3903414byte
Savina.Chameneos: external data: 40byte
Savina.Chameneos: iterations=1 runtime: 48581us
Implementation Notes:
- For parsing of the total run time, the following regular expression is used:
  ^(?:.*: )?([^\s]+)( [\w\.]+)?: iterations=([0-9]+) runtime: (?P<runtime>(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?)(?P<unit>[mu])s
- For arbitrary criteria, which may also be used for the total criterion, the following regular expression should match:
  ^(?:.*: )?([^\s]+): (?P<criterion>[^:]{1,30}):\s*(?P<value>(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?)(?P<unit>[a-zA-Z]+)
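To see how the two regular expressions behave on the example output, the following sketch applies them with Python's `re` module. The patterns are copied from the notes above; the helper `classify`, its return shape, and the conversion of microseconds to milliseconds are assumptions for illustration and do not show ReBench's actual code. The sketch tries the run-time pattern first, because the criteria pattern also matches the sample run-time lines.

```python
import re

# Patterns copied from the implementation notes
RUNTIME_RE = re.compile(
    r"^(?:.*: )?([^\s]+)( [\w\.]+)?: iterations=([0-9]+) "
    r"runtime: (?P<runtime>(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?)(?P<unit>[mu])s")
CRITERION_RE = re.compile(
    r"^(?:.*: )?([^\s]+): (?P<criterion>[^:]{1,30}):\s*"
    r"(?P<value>(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?)(?P<unit>[a-zA-Z]+)")


def classify(line):
    """Hypothetical helper: return (criterion, value, unit) or None."""
    m = RUNTIME_RE.match(line)
    if m:
        value = float(m.group("runtime"))
        if m.group("unit") == "u":
            value /= 1000  # microseconds to milliseconds
        return ("total", value, "ms")
    m = CRITERION_RE.match(line)
    if m:
        return (m.group("criterion"), float(m.group("value")), m.group("unit"))
    return None  # line matches neither pattern


memory = classify("Savina.Chameneos: trace size: 3903398byte")
runtime = classify("Savina.Chameneos: iterations=1 runtime: 64208us")
```

With these inputs, `memory` carries the criterion name "trace size" with the unit "byte", and `runtime` is reported under the criterion "total" in milliseconds.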
Time
The Time adapter uses Unix's /usr/bin/time command.
If the time program supports it, we will also use the -f switch of the time
command to record the maximum resident set size,
i.e., the maximum amount of memory the program used.
Example configuration for a suite:
Suite:
  gauge_adapter: Time
  benchmarks:
    - Bench1
Note:
Compatible time binaries are looked for in:
- /usr/bin/time
- /opt/local/bin/gtime
On macOS, a GNU time command can be installed, for instance, with Homebrew or MacPorts.
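To check which of the listed locations is available on a given machine, a small lookup such as the following can help. This is a sketch only: the function name `find_time_binary` is hypothetical, and the code does not reproduce how ReBench itself performs the lookup.

```python
import os

# Locations taken from the note above
CANDIDATES = ["/usr/bin/time", "/opt/local/bin/gtime"]


def find_time_binary(candidates=CANDIDATES):
    """Return the first candidate that is an executable file, else None."""
    for path in candidates:
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None


found = find_time_binary()  # a path from CANDIDATES, or None if neither exists
```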
Supporting other Benchmark Harnesses
To add support for your own harness, you can implement your own gauge adapter and use it in the configuration by naming the class and giving a path to the file implementing it. The path is expected to be relative to the configuration file.
Example configuration with a custom adapter:
Suite:
  gauge_adapter:
    MyAdapter: my_adapter.py
  benchmarks:
    - Bench1
A custom adapter is expected to behave like a GaugeAdapter object,
and it is recommended to inherit from the GaugeAdapter base class.
The simplest adapter would look like this:
    from rebench.interop.adapter import GaugeAdapter
    from rebench.model.data_point import DataPoint
    from rebench.model.measurement import Measurement


    class MyAdapter(GaugeAdapter):

        def parse_data(self, data, run_id, invocation):
            # This minimal adapter ignores the harness output in `data`
            # and always reports a single fixed measurement.
            iteration = 1
            data_points = []

            current = DataPoint(run_id)
            data_points.append(current)

            # One measurement of 1.0 ms under the 'total' criterion
            measure = Measurement(invocation, iteration, 1.0, 'ms', run_id, 'total')
            current.add_measurement(measure)

            return data_points
The key method to implement is parse_data(self, data, run_id, invocation).
The method is expected to return a list of DataPoint objects.
Each data point can contain a number of Measurement objects, where one of
them needs to be indicated as the total value.
The criterion identifies what is measured. This can be different phases of a
benchmark or different properties, for instance memory usage.
Each criterion is encoded as a separate measurement. The overall run time is
assumed to be the final measurement to conclude the information for a single
iteration of a benchmark, using the total criterion.
Other criterion names are not standardized.
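The rule that a total measurement concludes a data point can be sketched independently of ReBench. The classes `SketchMeasurement` and `SketchDataPoint` below are stand-ins invented for this illustration and are not ReBench's Measurement and DataPoint classes; the function name and the sample values are assumptions as well.

```python
from dataclasses import dataclass, field


@dataclass
class SketchMeasurement:
    """Stand-in for illustration only; not ReBench's Measurement class."""
    value: float
    unit: str
    criterion: str


@dataclass
class SketchDataPoint:
    """Stand-in for illustration only; not ReBench's DataPoint class."""
    measurements: list = field(default_factory=list)


def group_into_data_points(measurements):
    """Collect measurements until a 'total' one concludes the data point."""
    data_points = []
    current = SketchDataPoint()
    for m in measurements:
        current.measurements.append(m)
        if m.criterion == "total":
            data_points.append(current)
            current = SketchDataPoint()
    # In this sketch, trailing measurements without a 'total' are dropped.
    return data_points


stream = [
    SketchMeasurement(3903398, "byte", "trace size"),
    SketchMeasurement(40, "byte", "external data"),
    SketchMeasurement(64.208, "ms", "total"),
    SketchMeasurement(3903414, "byte", "trace size"),
    SketchMeasurement(40, "byte", "external data"),
    SketchMeasurement(48.581, "ms", "total"),
]
points = group_into_data_points(stream)
```

The six measurements end up in two data points of three measurements each, with the total run time last in both.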
For more examples, see the rebench.interop module.
In there, the adapter module contains the GaugeAdapter base class.
A good example to study is the rebench_log_adapter implementation.