Capturing Results

ReBench is designed to be used with existing benchmarking harnesses, which means we need to capture the measurements created by them. For this purpose, ReBench uses what we call 'gauge adapters'. They typically parse the output generated by harnesses.

Available Harness Support

ReBench currently provides built-in support for the following benchmark harnesses:

  • JMH: JMH, Java's microbenchmark harness
  • PlainSecondsLog: a plain seconds log, i.e., a floating point number per line
  • ReBenchLog: the ReBench log format, which indicates benchmark name and run time in milliseconds or microseconds
  • SavinaLog: the harness of the Savina benchmarks
  • ValidationLog: the format used by SOMns's ImpactHarness
  • Time: a harness that uses /usr/bin/time automatically

PlainSecondsLog

This adapter attempts to read every line of program output as a millisecond measurement. Lines that cannot be parsed as floats are skipped. For example, the lines "1.0" and "2" are valid, while "out: 1" and "1, 2, 3" are not and would be ignored.

Example output from a harness or benchmark:

342
543
100.23
54.12

Implementation Notes:

  • Python's float() function is used for parsing
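The line-by-line behavior described above can be sketched as follows. This is a minimal illustration rather than ReBench's actual code, and the function name parse_plain_log is hypothetical.

```python
def parse_plain_log(output):
    """Collect every line of `output` that float() can parse.

    Hypothetical helper for illustration; not part of ReBench.
    """
    values = []
    for line in output.split("\n"):
        try:
            values.append(float(line))
        except ValueError:
            # Lines such as "out: 1" or "1, 2, 3" are silently skipped.
            continue
    return values


if __name__ == "__main__":
    print(parse_plain_log("342\n543\n100.23\n54.12\nout: 1\n1, 2, 3\n"))
```

Because float() is used, surrounding whitespace is tolerated and strings such as "1e3", "inf", and "nan" also parse, so a harness should avoid printing such tokens on a line of their own unless they are meant as measurements.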

ReBenchLog

The ReBenchLog parser is the most commonly used and has the most features. It supports parsing of microsecond and millisecond values. Internally, however, ReBench stores all values as milliseconds using Python's floating point numbers, i.e., as 64-bit values.

Furthermore, it supports capturing other criteria in addition to the overall run time, which can be useful to measure the time of subtasks or metrics such as memory use. When other criteria are provided, the total time is expected to be the last in the output, concluding the overall data point.

The approximate format that ReBenchLog parses is as follows:

optional_prefix benchmark_name: iterations=123 runtime: 1000[ms|us]

Example output from a harness or benchmark, with each line reporting a total run time:

Dispatch: iterations=1 runtime: 557ms
LanguageFeatures.Dispatch: iterations=1 runtime: 309557us
LanguageFeatures.Dispatch total: iterations=2342 runtime: 557ms

Example output with additional criteria that indicate memory use:

Savina.Chameneos: trace size:    3903398byte
Savina.Chameneos: external data: 40byte
Savina.Chameneos: iterations=1 runtime: 64208us
Savina.Chameneos: trace size:    3903414byte
Savina.Chameneos: external data: 40byte
Savina.Chameneos: iterations=1 runtime: 48581us
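On the harness side, output of this shape could be produced with a few lines of Python. The sketch below assumes a harness written in Python; format_data_point and run_and_report are hypothetical names, and the extra criteria are passed in by the caller rather than measured here.

```python
import time


def format_data_point(benchmark, runtime_us, extra_criteria=()):
    """Return the log lines for one data point.

    `extra_criteria` is a sequence of (criterion, value, unit) tuples.
    The total run time is emitted last, since it concludes the data point.
    """
    lines = [f"{benchmark}: {criterion}: {value}{unit}"
             for criterion, value, unit in extra_criteria]
    lines.append(f"{benchmark}: iterations=1 runtime: {runtime_us}us")
    return lines


def run_and_report(benchmark, func):
    """Time one call of `func` and print the result in microseconds."""
    start = time.perf_counter_ns()
    func()
    elapsed_us = (time.perf_counter_ns() - start) // 1000
    for line in format_data_point(benchmark, elapsed_us):
        print(line)


if __name__ == "__main__":
    run_and_report("Example.Sum", lambda: sum(range(100_000)))
```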

Implementation Notes:

  • For parsing of the total run time, the following regular expression is used: ^(?:.*: )?([^\s]+)( [\w\.]+)?: iterations=([0-9]+) runtime: (?P<runtime>(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?)(?P<unit>[mu])s

  • For arbitrary criteria, which may also be used for the total criterion, the following regular expression should match: ^(?:.*: )?([^\s]+): (?P<criterion>[^:]{1,30}):\s*(?P<value>(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?)(?P<unit>[a-zA-Z]+)
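To see how the two expressions behave, they can be tried against the example lines with Python's re module. The following is a sketch and not the adapter's implementation: parse_line is a hypothetical helper, and reading the optional second group of the run-time expression as a criterion name is an assumption. The run-time expression is tried first because the more general criterion expression would otherwise also match run-time lines.

```python
import re

# The two regular expressions quoted in the notes above.
RUNTIME_RE = re.compile(
    r"^(?:.*: )?([^\s]+)( [\w\.]+)?: iterations=([0-9]+) "
    r"runtime: (?P<runtime>(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?)"
    r"(?P<unit>[mu])s")
CRITERION_RE = re.compile(
    r"^(?:.*: )?([^\s]+): (?P<criterion>[^:]{1,30}):\s*"
    r"(?P<value>(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?)"
    r"(?P<unit>[a-zA-Z]+)")


def parse_line(line):
    """Return (benchmark, criterion, value, unit), or None if nothing matches."""
    match = RUNTIME_RE.match(line)
    if match:
        value = float(match.group("runtime"))
        if match.group("unit") == "u":
            value /= 1000.0  # convert microseconds to milliseconds
        # Assumption: the optional second group names the criterion.
        criterion = (match.group(2) or "total").strip()
        return (match.group(1), criterion, value, "ms")

    match = CRITERION_RE.match(line)
    if match:
        return (match.group(1), match.group("criterion"),
                float(match.group("value")), match.group("unit"))
    return None


if __name__ == "__main__":
    for text in ["Dispatch: iterations=1 runtime: 557ms",
                 "Savina.Chameneos: trace size:    3903398byte",
                 "Savina.Chameneos: iterations=1 runtime: 64208us"]:
        print(parse_line(text))
```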

Time

The Time adapter uses Unix's /usr/bin/time command. If the time program supports it, we will also use the -f switch of the time command to record the maximum resident set size, i.e., the maximum amount of memory the program used.

Example configuration for a suite:

    Suite:
        gauge_adapter: Time
        benchmarks:
          - Bench1

Note:

Compatible time binaries are looked for in:

  • /usr/bin/time
  • /opt/local/bin/gtime

On macOS, a GNU time command can be installed, for instance, with Homebrew or MacPorts.
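The lookup and the use of the -f switch might be sketched roughly as below. This is an illustration under assumptions, not ReBench's own logic: find_time_binary and build_command are hypothetical helpers, and the format string is only an example built from GNU time's %M (maximum resident set size in KB) and %e (elapsed wall-clock seconds) specifiers.

```python
import os

# Candidate locations named in the note above.
CANDIDATES = ["/usr/bin/time", "/opt/local/bin/gtime"]


def find_time_binary(candidates=CANDIDATES):
    """Return the first candidate that exists and is executable, else None."""
    for path in candidates:
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None


def build_command(time_binary, benchmark_cmd, use_format=True):
    """Prefix the benchmark command with the time binary.

    With use_format=True, a GNU-style -f format string is added so that
    the maximum resident set size is reported alongside the wall time.
    """
    cmd = [time_binary]
    if use_format:
        cmd += ["-f", "max rss (kb): %M\nwall-time (secs): %e"]
    return cmd + list(benchmark_cmd)


if __name__ == "__main__":
    binary = find_time_binary()
    if binary is not None:
        print(build_command(binary, ["./bench", "10"]))
```

Whether a particular binary accepts -f would still have to be checked separately, since finding an executable at one of these paths does not by itself guarantee GNU-compatible options.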

Supporting other Benchmark Harnesses

To add support for your own harness, you can implement your own gauge adapter and use it in the configuration by naming the class and giving a path to the file implementing it. The path is expected to be relative to the configuration file.

Example configuration with a custom adapter:

    Suite:
        gauge_adapter:
           MyAdapter: my_adapter.py
        benchmarks:
          - Bench1

A custom adapter is expected to behave like a GaugeAdapter object, and it is recommended to inherit from the GaugeAdapter base class.

The simplest adapter would look like this:

from rebench.interop.adapter   import GaugeAdapter
from rebench.model.data_point  import DataPoint
from rebench.model.measurement import Measurement


class MyAdapter(GaugeAdapter):

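    def parse_data(self, data, run_id, invocation):
        # `data` holds the output to be parsed; this minimal adapter
        # ignores it and always reports a fixed value.
        iteration = 1
        data_points = []
        current = DataPoint(run_id)
        data_points.append(current)

        # Record 1.0 ms under the 'total' criterion, which marks the
        # measurement as the total value of the data point.
        measure = Measurement(invocation, iteration, 1.0, 'ms', run_id, 'total')
        current.add_measurement(measure)

        return data_points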

The key method to implement is parse_data(self, data, run_id, invocation). The method is expected to return a list of DataPoint objects. Each data point can contain a number of Measurement objects, where one of them needs to be indicated as the total value.

The criterion identifies what is measured. This can be different phases of a benchmark or different properties, for instance memory usage. Each criterion is encoded as a separate measurement. The overall run time is assumed to be the final measurement to conclude the information for a single iteration of a benchmark, using the total criterion. Other criterion names are not standardized.
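The rule that the total criterion concludes a data point can be illustrated without any ReBench classes. In the sketch below, group_into_data_points is a hypothetical helper working on plain (criterion, value, unit) tuples; inside a real parse_data, each tuple would instead become a Measurement added to the current DataPoint.

```python
def group_into_data_points(records):
    """Group (criterion, value, unit) tuples into per-iteration data points.

    Each 'total' record closes the current data point. Records that are
    not followed by a 'total' are treated as incomplete and dropped.
    Returns a list of (iteration, measurements) pairs, counting from 1.
    """
    data_points = []
    current = []
    iteration = 1
    for criterion, value, unit in records:
        current.append((criterion, value, unit))
        if criterion == "total":
            data_points.append((iteration, current))
            current = []
            iteration += 1
    return data_points


if __name__ == "__main__":
    records = [
        ("trace size", 3903398.0, "byte"),
        ("total", 64.208, "ms"),
        ("trace size", 3903414.0, "byte"),
        ("total", 48.581, "ms"),
    ]
    for iteration, measurements in group_into_data_points(records):
        print(iteration, measurements)
```

Dropping trailing records without a total is a design choice of this sketch; an adapter could equally report them as an error.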

For more examples, see the rebench.interop module. In there, the adapter module contains the GaugeAdapter base class. A good example to study is the rebench_log_adapter implementation.