Introduction

running-ng is a collection of scripts that help people run workloads in methodologically sound settings.

Disclaimer

At this stage, the focus of this project is driven by the internal use of members of Steve Blackburn's lab. If you are a member of the lab, you know what to do if you encounter any issues, and you can ignore the rest of this section.

If you are a member of the public, please kindly note that the project is open-source and documented on a "good-faith" basis. We might not have the time to consider your feature requests, so please don't be offended if we ignore them. Having said that, you are very welcome to use it, and we will be very pleased if it helps anyone. In particular, we are grateful if you report any bugs you find, with steps to reproduce them.

⚠️ Warning

The syntax (of configuration files and command line arguments) of running-ng is not yet stable. When you use it, expect breaking changes, although we will try to minimize them where possible.

running-ng has been tested by a few people, and we think it is stable enough to use for your experiments. However, there are probably still a few wrinkles to be ironed out. Please file any bug reports or feature requests on the issue tracker.

You are also welcome to implement new features and/or fix bugs by opening pull requests. Before you do so, please discuss major design changes with Steve first. For non-user-facing changes, please discuss with the maintainers first.

History

The predecessor of running-ng is running, a set of scripts written in Perl, dating back to 2005. However, the type of workloads we are evaluating has changed a bit, and we wanted a new set of scripts that fit our needs better.

Two major sources of inspiration are mu-perf-benchmarks and menthol.

mu-perf-benchmarks is a performance regression framework built for The Mu Micro Virtual Machine. Zixian coauthored the framework with John Zhang in 2017. It features a web frontend for displaying results. You can see the live instance here.

menthol is a benchmarking framework built for running benchmarks in high-performance computing (HPC) settings. Zixian built it for his research project about evaluating Chapel's performance in 2018. The framework can run benchmarks in different languages on either single node or on a cluster through PBS job scheduler.

Maintainers

Installation

pip3 install --user -U running-ng

The base configuration files can usually be found in paths like ~/.local/lib/python3.6/site-packages/running/config/base. The exact path might differ depending on your Python version, etc.

Adding running to PATH

You will need to add the folder where running is installed to your PATH. On a typical Linux installation, running is installed to ~/.local/bin.

You will need to refer to the documentation of the shell you are using.

Here is an example for bash.

# Add the following to ~/.bashrc
PATH=$PATH:$HOME/.local/bin

You don't need to use export. Generally, $PATH already exists and is exported to child processes.

Please check whether your ~/.bash_profile or ~/.profile sources ~/.bashrc. If not, when you use a login shell (e.g., in the case of tmux), the content of ~/.bashrc might not be applied.

To ensure ~/.bashrc is always sourced, you can add the following to ~/.bash_profile.

if [ -f ~/.bashrc ]; then
  . ~/.bashrc
fi

If you are a moma user, please change these dotfiles on squirrel.moma, and then run sudo /moma-admin/config/update_self.fish. Note that you should run this command in an SSH session on a standard terminal rather than the integrated terminal in VSCode Remote. Please check here for how to set up a UNIX password for sudo.

Quickstart

This guide will show you how to use running-ng to compare two different builds of JVMs.

Note that for each occurrence in the form /path/to/*, you need to replace it with the real path of the respective item in the filesystem.

Installation

Please follow the installation guide to install running-ng. You will need Python 3.6+.

Then, create a file two_builds.yml with the following content.

includes:
  - "$RUNNING_NG_PACKAGE_DATA/base/runbms.yml"

The YAML file represents a dictionary (key-value pairs) that defines the experiments you are running. The includes directive here will populate the dictionary with some default values shipped with running-ng.

If you use moma machines, please substitute runbms.yml with runbms-anu.yml.

Prepare Benchmarks

Add the following to two_builds.yml.

benchmarks:
  dacapochopin-29a657f:
    - avrora
    - batik
    - biojava
    - cassandra
    - eclipse
    - fop
    - graphchi
    - h2
    - h2o
    - jme
    - jython
    - luindex
    - lusearch
    - pmd
    - sunflow
    - tradebeans 
    - tradesoap
    - tomcat
    - xalan
    - zxing

This specifies the list of benchmarks used in this experiment from the benchmark suite dacapochopin-29a657f. The benchmark suite is defined in $RUNNING_NG_PACKAGE_DATA/base/dacapo.yml. By default, the minimum heap sizes of the dacapochopin-29a657f benchmarks were measured with AdoptOpenJDK 15 using G1 GC. If you are using OpenJDK 11 or 17, you can override the value of suites.dacapochopin-29a657f.minheap to temurin-11-G1 or temurin-17-G1 respectively. That is, you can, for example, add "suites.dacapochopin-29a657f.minheap": "temurin-17-G1" to overrides, as sketched below.

Then, add the following to two_builds.yml.

overrides:
  "suites.dacapochopin-29a657f.timing_iteration": 5
  "suites.dacapochopin-29a657f.callback": "probe.DacapoChopinCallback"

That is, we want to run five iterations for each invocation, and use DacapoChopinCallback because it is the appropriate callback for this release of DaCapo.
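
For example, if your builds are based on OpenJDK 17, the same overrides section can also carry the minheap override mentioned above. This is only a sketch; it assumes that temurin-17-G1 is one of the minheap_values sets defined in $RUNNING_NG_PACKAGE_DATA/base/dacapo.yml.

overrides:
  "suites.dacapochopin-29a657f.timing_iteration": 5
  "suites.dacapochopin-29a657f.callback": "probe.DacapoChopinCallback"
  # assumption: temurin-17-G1 is one of the minheap_values sets of this suite
  "suites.dacapochopin-29a657f.minheap": "temurin-17-G1"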

Prepare Your Builds

In this guide, we assume you use mmtk-openjdk. Please follow its build guide.

We assume you have produced two different builds that you want to compare. Add the following to two_builds.yml.

runtimes:
  build1:
    type: OpenJDK
    release: 11
    home: "/path/to/build1/jdk" # make sure /path/to/build1/jdk/bin/java exists
  build2:
    type: OpenJDK
    release: 11
    home: "/path/to/build2/jdk" # make sure /path/to/build2/jdk/bin/java exists

This defines two builds of runtimes.

We recommend that you use absolute paths for the builds, although relative paths will work and will be resolved relative to where you run running.

We strongly recommend that you rename the builds (both the name in the configuration file and the folder name) to something more sensible, preferably including the commit hash, for easier troubleshooting and performance debugging later.

Prepare Probes

Please clone probes, and run make.

Add the following to two_builds.yml.

modifiers:
  probes_cp:
    type: JVMClasspath
    val: "/path/to/probes/out /path/to/probes/out/probes.jar"
  probes:
    type: JVMArg
    val: "-Djava.library.path=/path/to/probes/out -Dprobes=RustMMTk"

This defines two modifiers, which will be used later to modify the JVM command line arguments.

Please only use absolute paths for all the above.

Prepare Configs

Finally, add the following to two_builds.yml.

configs:
  - "build1|ms|s|c2|mmtk_gc-SemiSpace|tph|probes_cp|probes"
  - "build2|ms|s|c2|mmtk_gc-SemiSpace|tph|probes_cp|probes"

The syntax is described here.

Sanity Checks

The basic form of usage looks like this.

running runbms /path/to/log two_builds.yml 8

That is, run the experiments specified by two_builds.yml, store the results in /path/to/log, and explore a range of heap sizes determined by the argument 8 (with careful arrangement of which size to run first and which to run later).

See here for a complete reference of runbms.

Dry run

A dry run (by supplying -d to running, NOT runbms) allows you to see the commands that would be executed.

running -d runbms /path/to/log two_builds.yml 8 -i 1

Make sure it looks like what you want.

Single Invocation

Now, actually run the experiment, but only for one invocation (by supplying -i 1 to runbms).

running runbms /path/to/log two_builds.yml 8 -i 1

This allows you to catch any issues before wasting several days only to realize that something didn't work.

Run It

Once you are happy with everything, run the experiments.

running runbms /path/to/log two_builds.yml 8 -p "two_builds"

Don't forget to give the results folder a prefix so that you can later tell what the experiment was for.

Analysing Results

This is outside the scope of this quickstart guide.

Basics

Briefly talk about how basic concepts fit together here...

Before diving into the details, please read the design principles to help you better understand why things are organized the way they are.

Design Principles

Sound methodology

Sound methodology is crucial for the type of performance analysis work we do. Please see the documentation for each of the commands for details. We also try to include sensible default values in the base configuration files.

Reproducibility

It should be easy to reproduce a set of experiments. To this end, various commands save as much metadata as possible with the results. For example, runbms saves the flattened configuration file and the command line arguments in the results folder. For each log, basic information about the execution environment, such as uname, the model name of the CPU, and the frequencies of CPU cores, is saved as well.

Extensibility

Broadly, the project consists of two parts: the core and the commands. The core provides abstractions for core concepts, such as benchmarks and execution environments, and can be extended through class inheritance.

The commands are the user-facing parts that use the core to provide concrete functionality.

Reusability

The configuration files can be easily reused through the includes and overrides mechanisms. For example, people might want to run multiple sets of experiments with minor tweaks, and being able to share a common base configuration file is ergonomic. This is also crucial to the first point that people can get a set of sensible default values by including base configuration files shipped with the project.

Human-readable syntax

We use YAML as the format for the configuration files. Please read the syntax reference for more details.

Configuration File Syntax

The configuration file is in YAML format. You can find a good YAML tutorial here. Below is the documentation for all the top-level keys that are common to all commands.

benchmarks

A YAML dictionary mapping each benchmark suite to the list of benchmarks to run from that suite.

For example:

benchmarks:
  dacapo2006:
    - eclipse
  dacapobach:
    - avrora
    - fop

tells running to run the eclipse benchmark from the dacapo2006 benchmark suite, and the avrora and fop benchmarks from the dacapobach benchmark suite. These benchmark suites have to be defined beforehand (usually through an includes key).

Note that each benchmark of a benchmark suite can either be a string or a suite-specific dictionary. For example, for the DaCapo benchmark suite, the following two snippets are equivalent.

benchmarks:
  dacapo2006:
    - eclipse

benchmarks:
  dacapo2006:
    - {name: eclipse, bm_name: eclipse, size: default}

configs

A YAML list of configuration strings to be used to run the benchmarks. These are specified as a runtime followed by a '|' separated list of modifiers, i.e. "<runtime>|<modifier>|...|<modifier>".

For example:

configs:
  - "openjdk11|ms|s|c2"
  - "openjdk15|ms|s"

tells running to use the openjdk11 runtime with the ms, s, and c2 modifiers, and the openjdk15 runtime with the ms and s modifiers. In the example above, we assume that both the runtimes and the modifiers have been previously defined (in either the current configuration file or an included file).

Each segment in a configuration string can contain whitespace, which makes multi-line editing and visual alignment easier.

For example:

configs:
  - "openjdk8 |foo-1 |bar|buzz"
  - "openjdk15|foo-16|   |buzz"

includes

A YAML list of paths to YAML files that are to be included into the current configuration file for definitions of some keys.

This is primarily used to provide reusability and extensibility of configuration files. A pre-processor step in running takes care of including all the specified files. A flattened version of the final configuration file is also generated and placed in the results folder for reproducibility.

The paths can be either absolute or relative. Relative paths are resolved relative to the current file. For example, if $HOME/configs/foo.yml has an include line ../bar.yml, the line is interpreted as $HOME/bar.yml. Similarly,

includes:
 - "./base/suites.yml"
 - "./base/modifiers.yml"

includes the suites.yml and modifiers.yml files located under ./base, relative to the current file.

Any environment variables in the paths are also resolved before any further processing. This includes a special environment variable $RUNNING_NG_PACKAGE_DATA that allows you to refer to various configuration files shipped with running-ng, regardless of how you installed running-ng. For example, in a global pip installation, $RUNNING_NG_PACKAGE_DATA will look like /usr/local/lib/python3.10/dist-packages/running/config.

overrides

Under construction 🚧.
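
Although this section is still being written, the quickstart and the companion example elsewhere in this documentation already show the pattern: overrides is a YAML dictionary whose keys are dot-separated paths into the configuration and whose values replace whatever the included files define. A minimal sketch, reusing keys that appear in those examples:

includes:
  - "$RUNNING_NG_PACKAGE_DATA/base/runbms.yml"

overrides:
  # replace a key of a previously defined benchmark suite
  "suites.dacapo2006.timing_iteration": 1
  # replace a top-level key
  "invocations": 1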

modifiers

A YAML dictionary of program arguments or environment variables that are to be used with config strings. The key (name) of a modifier cannot contain -. Each modifier requires a type key, with other keys being specific to that type. For more information regarding the different types of modifiers, please refer to this page.

Warning preview feature ⚠️. We can exclude certain benchmarks from using a specific modifier by using an excludes key along with a YAML list of benchmarks to be excluded from each benchmark suite.

For example:

modifiers:
  s:
    type: JVMArg
    val: "-server"
  c2:
    type: JVMArg
    val: "-XX:-TieredCompilation -Xcomp"
    excludes:
      dacapo2006:
        - eclipse

specifies two modifiers, s and c2, both of type JVMArg with their respective values. Here, the eclipse benchmark from the dacapo2006 benchmark suite has been excluded from the c2 modifier.

Warning preview feature ⚠️. Similarly, we can attach the modifier only to specific benchmarks by using an includes key.

For example:

modifiers:
  c2:
    type: JVMArg
    val: "-XX:-TieredCompilation -Xcomp"
    includes:
      dacapo2006:
        - eclipse

The c2 modifier will only be attached when running the eclipse benchmark from the dacapo2006 benchmark suite.

excludes has a higher priority than includes.

For example:

modifiers:
  c2:
    type: JVMArg
    val: "-XX:-TieredCompilation -Xcomp"
    includes:
      dacapo2006:
        - eclipse
        - fop
    excludes:
      dacapo2006:
        - fop

The c2 modifier will only be attached when running the eclipse benchmark from the dacapo2006 benchmark suite; no other benchmark will run with this modifier (not even fop, even though it appears in includes).

Value Options

These are special modifiers whose values can be specified through their use in a configuration string. Concrete values are specified as - separated values after the modifier's name in a configuration string. These values will be indexed by the modifier through syntax similar to Python format strings.

This is best understood via an example:

modifiers:
  env_var:
    type: EnvVar
    var: "FOO{0}"
    val: "{1}"

[...]

configs:
  - "openjdk11|env_var-42-43"

tells running to run the openjdk11 runtime with the environment variable FOO42 set to 43. Note that value options are not limited to environment variables, and can be used for all modifier types.

runtimes

A YAML dictionary of runtime definitions that are to be used with config strings. Each runtime requires a type key with other keys being specific to that type. For more information regarding the different types of runtimes, please refer to this page.

suites

A YAML dictionary of benchmark suite definitions that are to be used as keys of benchmarks. Each benchmark suite requires a type key with other keys being specific to that type. For more information regarding the different types of benchmark suites, please refer to this page.

Benchmark Suite

BinaryBenchmarkSuite (preview ⚠️)

A BinaryBenchmarkSuite is a suite of programs that can be used to run binary benchmarks, for example for C/C++ benchmarking.

Keys

programs: a YAML dictionary of benchmarks in the format:

programs:
  <BM_NAME_1>:
    path: /full/path/to/benchmark/binary_1
    args: "Any arguments to binary_1"
  <BM_NAME_2>:
    path: /full/path/to/benchmark/binary_2
    args: "Any arguments to binary_2"
  [...]

A possible use-case could use wrapper shell scripts around the benchmark to output timing and other information in a tab-separated table.

DaCapo

DaCapo benchmark suite.

Keys

release: one of the possible values ["2006", "9.12", "evaluation"]. The value is required.

path: path to the DaCapo jar. The value is required. Environment variables will be expanded.

minheap: a string that selects one of the minheap_values sets to use.

minheap_values: a dictionary containing multiple named sets of minimal heap sizes that are enough for a benchmark from the suite to run without triggering OutOfMemoryError. Each size is measured in MiB. The default value is an empty dictionary. The minheap values are used only when running runbms with a valid N value. If the minheap value for a benchmark is not specified, a default of 4096 is used. An example looks like this.

minheap_values:
  adoptopenjdk-15-G1:
    avrora: 7
    batik: 253
  temurin-17-G1:
    avrora: 7
    batik: 189

timing_iteration: specifying the timing iteration. It can either be a number, which is passed to DaCapo as -n, or a string converge. The default value is 3.

callback: the class (possibly within some packages) for the DaCapo callback. The value is passed to DaCapo as -c. The default value is null.

timeout: timeout for one invocation of a benchmark in seconds. The default value is null.
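
Putting the keys above together, a DaCapo suite definition might look like the following sketch. The suite name, the jar path, the heap sizes, and the timeout are placeholders for illustration; the type name is assumed to match this section's heading.

suites:
  dacapochopin-29a657f:
    type: DaCapo                    # assumed to match the heading of this section
    release: "evaluation"
    path: "/path/to/dacapo-evaluation-git-29a657f.jar"   # placeholder path
    minheap: adoptopenjdk-15-G1     # selects a set from minheap_values below
    minheap_values:
      adoptopenjdk-15-G1:
        avrora: 7                   # placeholder sizes, in MiB
        batik: 253
    timing_iteration: 3
    callback: "probe.DacapoChopinCallback"
    timeout: 1800                   # placeholder, in seconds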

wrapper (preview ⚠️): specifying a wrapper (i.e., extra stuff on the command line before java) when running benchmarks. The default value is null, a no-op. There are two possible ways to specify a wrapper. First, as a single string with shell-like syntax, where multiple arguments are space separated; this wrapper is used for all benchmarks in the benchmark suite. Second, as a dictionary of such strings, to specify possibly different wrappers for different benchmarks; if a benchmark doesn't have a wrapper in the dictionary, it is treated as null.

companion (preview ⚠️): the syntax is similar to wrapper. The companion program starts before the main program; the main program starts two seconds after the companion program to make sure the companion is fully initialized. Once the main program finishes, we wait for the companion program to finish. Therefore, companion programs should have appropriate timeouts or detect when the main program finishes. Here is an example of using companion to launch bpftrace in the background to count system calls.

includes:
  - "$RUNNING_NG_PACKAGE_DATA/base/runbms.yml"

overrides:
  "suites.dacapo2006.timing_iteration": 1
  "suites.dacapo2006.companion": "sudo bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @syscall[args->id] = count(); @process[comm] = count();} interval:s:10 { printf(\"Goodbye world!\\n\"); exit(); }'"
  "invocations": 1

benchmarks:
  dacapo2006:
    - fop

configs:
  - "temurin-17"

In the log file, the output from the main program and the output from the companion program are separated by *****.

size: specifying the size of input data. Note that the names of the sizes are subject to change depending on the DaCapo releases. The default value is null, which means DaCapo will use the default size unless you override that for individual benchmarks.

Benchmark Specification

Some of the suite-wide keys can be overridden on a per-benchmark basis. The keys currently supported are timing_iteration, size, and timeout. Note that, within a suite, your choice of name should uniquely identify a particular way of running a benchmark of name bm_name. The name is used to look up the minheap value, etc., which can depend on the size of the input data and/or the timing iteration. Therefore, it is highly recommended that you give a name different from the bm_name.

Note that you might need to adjust various other values, including but not limited to the minheap value dictionary and the modifier exclusion dictionary.

The following is an example.

benchmarks:
  dacapo2006:
    - {name: eclipse_large, bm_name: eclipse, size: large}

SPECjbb2015 (preview ⚠️)

SPECjbb2015.

Keys

release: one of the possible values ["1.03"]. The value is required.

path: path to the jar. The value is required. Note that the property file should reside in path/../config/specjbb2015.props per the standard folder structure of the ISO image provided by SPEC. Environment variables will be expanded.

Benchmark Specification

Only strings are allowed, which should correspond to the mode of the SPECjbb2015 controller. Right now, only "composite" is supported.
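
A minimal sketch of using this suite; the suite name and the path are placeholders, and the type name is assumed to match this section's heading.

suites:
  specjbb2015:
    type: SPECjbb2015      # assumed type name
    release: "1.03"
    path: "/path/to/specjbb2015/specjbb2015.jar"   # placeholder; per the note above, the props file is then expected at /path/to/specjbb2015/config/specjbb2015.props

benchmarks:
  specjbb2015:
    - composite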

SPECjvm98 (preview ⚠️)

SPECjvm98.

Note that you will need to prepend probes to the classpath, so that the modified SpecApplication can be used.

Here is an example configuration file.

includes:
  - "/home/zixianc/running-ng/src/running/config/base/runbms.yml"

modifiers:
  probes_cp:
    type: JVMClasspathPrepend
    val: "/home/zixianc/MMTk-Dev/evaluation/probes /home/zixianc/MMTk-Dev/evaluation/probes/probes.jar"

benchmarks:
  specjvm98:
    - _213_javac

configs:
  - "adoptopenjdk-8|probes_cp"

Keys

release: one of the possible values ["1.03_05"]. The value is required.

path: path to the SPECjvm98 folder, where you can find SpecApplication.class. The value is required. Environment variables will be expanded.

timing_iteration: specifying the timing iteration. It can only be a number, which is passed to SpecApplication as -i. The value is required.
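
For completeness, a sketch of how the specjvm98 suite used in the example above might be defined; the path and the iteration count are placeholders, and the type name is assumed to match this section's heading.

suites:
  specjvm98:
    type: SPECjvm98            # assumed type name
    release: "1.03_05"
    path: "/path/to/SPECjvm98" # placeholder; folder containing SpecApplication.class
    timing_iteration: 10       # placeholder; passed to SpecApplication as -i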

Benchmark Specification

Only strings are allowed, which should correspond to a benchmark program of SPECjvm98. The benchmarks are:

  • _200_check
  • _201_compress
  • _202_jess
  • _209_db
  • _213_javac
  • _222_mpegaudio
  • _227_mtrt
  • _228_jack

Octane (preview ⚠️)

Keys

path: path to the Octane benchmark folder. The value is required. Environment variables will be expanded.

wrapper: path to the Octane wrapper written by Wenyu Zhao. The value is required.

timing_iteration: specifying the timing iteration using an integer. The value is required.

minheap: a string that selects one of the minheap_values sets to use.

minheap_values: a dictionary containing multiple named sets of minimal heap sizes that are enough for a benchmark from the suite to run without triggering Fatal javascript OOM in .... Each size is measured in MiB. The default value is an empty dictionary. The minheap values are used only when running runbms with a valid N value. If the minheap value for a benchmark is not specified, a default of 4096 is used. An example looks like this.

minheap_values:
  d8:
    octane:
      box2d: 5
      codeload: 159
      crypto: 3
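
A sketch of a full Octane suite definition using the keys above; the paths and the iteration count are placeholders, and the type name is assumed to match this section's heading.

suites:
  octane:
    type: Octane                       # assumed type name
    path: "/path/to/octane"            # placeholder: Octane benchmark folder
    wrapper: "/path/to/wrapper.js"     # placeholder: path to the wrapper script
    timing_iteration: 10               # placeholder
    minheap: d8                        # selects the d8 set from minheap_values above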

JuliaGCBenchmarks (preview ⚠️)

GC benchmarks for Julia: https://github.com/JuliaCI/GCBenchmarks

Keys

path: path to the GCBenchmarks folder. The value is required. Environment variables will be expanded.

minheap: a string that selects one of the minheap_values sets to use.

minheap_values: a dictionary containing multiple named sets of minimal heap sizes that are enough for a benchmark from the suite to run without triggering Out of Memory!. An example looks like this:

minheap_values:
  julia-mmtk-immix:
    multithreaded/binary_tree/tree_immutable: 225
    multithreaded/binary_tree/tree_mutable: 384
    multithreaded/bigarrays/objarray: 9225
    serial/TimeZones: 5960
    serial/append: 1563
    serial/bigint/pollard: 198
    serial/linked/list: 4325
    serial/linked/tree: 216
    serial/strings/strings: 2510
    slow/bigint/pidigits: 198
    slow/rb_tree/rb_tree: 8640

Runtime

JikesRVM

NativeExecutable (preview ⚠️)

A NativeExecutable runtime tells runbms to run the benchmarks directly on native hardware. This is supposed to be used in tandem with BinaryBenchmarkSuite.

OpenJDK

D8 (preview ⚠️)

Keys

executable: path to the d8 executable. Environment variables will be expanded.
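
For example, a d8 runtime definition might look like this; the runtime name and the path are placeholders, and the type name is assumed to match this section's heading.

runtimes:
  v8-d8:
    type: D8                                        # assumed type name
    executable: "/path/to/v8/out/x64.release/d8"    # placeholder path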

SpiderMonkey (preview ⚠️)

Keys

executable: path to the js executable. Environment variables will be expanded.

JavaScriptCore (preview ⚠️)

Keys

executable: path to the jsc executable. Environment variables will be expanded.

JuliaMMTK (preview ⚠️)

Keys

executable: path to the julia executable. Environment variables will be expanded.

JuliaStock (preview ⚠️)

Julia with the stock GC. It does not allow setting a heap size, and will not throw OOM unless killed by the operating system.

Keys

executable: path to the julia executable. Environment variables will be expanded.

Modifier

EnvVar

Keys

var: name of the variable.

val: value of the variable. Environment variables will be expanded.

Description

Set an environment variable. Might override an environment variable inherited from the parent process.

JVMArg

JVM specific.

Keys

val: a single string with shell-like syntax. Multiple arguments are space separated. Environment variables will be expanded.

Description

Specify arguments to a JVM, as opposed to the program.

JSArg (preview ⚠️)

JavaScriptRuntime specific.

Keys

val: a single string with shell-like syntax. Multiple arguments are space separated. Environment variables will be expanded.

Description

Specify arguments to a JavaScript runtime (e.g., d8), as opposed to the program.

JVMClasspathAppend

JVM specific.

Keys

val: a single string with shell-like syntax. Multiple classpaths are space separated. Environment variables will be expanded.

Description

Append a list of classpaths to the existing classpaths.

JVMClasspathPrepend

JVM specific.

Keys

val: a single string with shell-like syntax. Multiple classpaths are space separated. Environment variables will be expanded.

Description

Prepend a list of classpaths to the existing classpaths.

JVMClasspath

A backward-compatibility alias of JVMClasspathAppend. Environment variables will be expanded.

ProgramArg

Keys

val: a single string with shell-like syntax. Multiple arguments are space separated. Environment variables will be expanded.

Description

Specify arguments to a program, as opposed to the runtime.
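
For example, assuming the benchmark program itself accepts a -t flag for the number of worker threads (as DaCapo does), a ProgramArg modifier could pass it to the benchmark rather than the JVM:

modifiers:
  threads_4:
    type: ProgramArg
    val: "-t 4"   # appended to the program's arguments, not the JVM's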

ModifierSet (preview ⚠️)

Keys

val: | separated values, with possible value options. See here for details.

Description

Specify a set of modifiers, including other ModifierSets. That is, you can use ModifierSet recursively.

Wrapper (preview ⚠️)

Keys

val: a single string with shell-like syntax. Multiple arguments are space separated. Environment variables will be expanded.

Description

Specify a wrapper. If a wrapper also exists for the benchmark suite you use, this wrapper will follow it.
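
A sketch, assuming taskset is available on the machine and you want to pin the whole runtime to a subset of cores:

modifiers:
  pin_cores:
    type: Wrapper
    val: "taskset -c 0-7"   # placed on the command line before the runtime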

Companion (preview ⚠️)

Keys

val: a single string with shell-like syntax. Multiple arguments are space separated.

Description

Specify a companion program. If a companion program also exists for the benchmark suite you use, this companion program will follow it.

JuliaArg (preview ⚠️)

JuliaMMTk and JuliaStock specific.

Keys

val: a single string with shell-like syntax. Multiple arguments are space separated. Environment variables will be expanded.

Command References

Please see the sections in this chapter for the references for each of the subcommands.

Usage

running [-h|--help] [-v|--verbose] [-d|--dry-run] [--version] subcommand

-h: print help message.

-v: use DEBUG for logging level. The default logging level is INFO.

-d: enable dry run. Each of the subcommands that respects this flag will print out the commands to be executed in child processes instead of actually executing them.

--version: print the version number of running-ng.

Convention

For each subcommand, the documentation can roughly be divided into two parts, the command line usage and the keys in the config file.

Unless otherwise specified, the keys described here are common to all subcommands, with the keys specific to each subcommand documented in their respective sections.

runbms

This subcommand runs benchmarks with different configs, possibly with varying heap sizes.

Usage

runbms [-h|--help] [-i|--invocations INVOCATIONS] [-s|--slice SLICE] [-p|--id-prefix ID_PREFIX] [-m|--minheap-multiplier MINHEAP_MULTIPLIER] [--skip-oom SKIP_OOM] [--skip-timeout SKIP_TIMEOUT] [--resume RESUME] [--workdir WORKDIR] [--skip-log-compression] LOG_DIR CONFIG [N] [n ...]

-h: print help message.

-i: set the number of invocations. Overrides invocations in the config file.

-s: only use the specified heap sizes. This is a comma-separated string of integers or floating point numbers. For each slice s in SLICE, we run benchmarks at s * minheap. N and ns are ignored.

-p: add a prefix to the folder names where the results are stored. By default, the folder that stores the result is named using the host name and the timestamp. However, you can add a prefix to the folder name to signify which experiments the results belong to.

-m (preview ⚠️): multiply the minheap value of each benchmark by MINHEAP_MULTIPLIER. Do NOT use this unless you know what you are doing. Overrides minheap_multiplier in the config file.

--skip-oom (preview ⚠️): skip the remaining invocations if a benchmark under a config has run out of memory more than SKIP_OOM times.

--skip-timeout (preview ⚠️): skip the remaining invocations if a benchmark under a config has timed out more than SKIP_TIMEOUT times.

--resume (preview ⚠️): resume a previous run under LOG_DIR/RESUME. If a .log.gz already exists for a group of invocations, they will be skipped. Remember to clean up the partial *.log files before resuming.

--workdir (preview ⚠️): use the specified directory as the working directory for benchmarks. If not specified, a temporary directory will be created under an OS-dependent location with a runbms- prefix.

--skip-log-compression: skip compressing log files with gzip.

LOG_DIR: where to store the results. This is required.

CONFIG: the path to the configuration file. This is required.

N: the number of different heap sizes to explore. Must be a power of two. Explore heap sizes denoted by 0, 1, ..., and N (N + 1 different sizes in total). The heap size 0 represents 1.0 * minheap, and the heap size N represents heap_range * minheap (by default, 6.0 * minheap). If N is omitted, then the script will run benchmarks without explicitly setting heap sizes, unless you specify -s or use a modifier that sets the heap size.

n: the heap sizes to explore. Instead of exploring 0, 1, ..., and N, only explore the ns specified.
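
For example, continuing the quickstart configuration, the following only runs the heap sizes denoted by 0, 4, and 8 out of the sizes defined by N = 8 (the log path is a placeholder as before):

running runbms /path/to/log two_builds.yml 8 0 4 8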

Keys

invocations: see above.

minheap_multiplier: see above.

heap_range: the heap size relative to the minheap when n = N.

spread_factor: changes how 0, 1, ..., and N are spread out. When spread_factor is zero, the differences between consecutive heap sizes are the same. The larger the spread_factor, the coarser the spacing at the large end relative to the small end. Please do NOT change this unless you understand how it works.

remote_host: the remote host to rsync the results to. The exact absolute path of LOG_DIR is used on both the local and the remote machine.

plugins (preview ⚠️): plugins of this command. Must be a dictionary, similar to how modifiers are declared.
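
A sketch of how these keys might appear at the top level of a runbms configuration file; the values are illustrative only, and the host name is a placeholder:

invocations: 20               # can be overridden by -i
heap_range: 6.0               # heap size relative to the minheap when n = N
remote_host: "some.remote.host"   # placeholder: results are rsynced to this host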

Plugins (preview ⚠️)

Zulip

Zulip integration for notifying you when experiments start or end. No message will be sent if it's a dry run.

Here is an example.

plugins:
  zulip:
    type: Zulip
    request:
      type: private
      to: ["your user id here"]

Keys

request: please follow the Zulip API documentation. Note that you don't need to put in content here. Please contact the administrators of your organization for your user ID. If you use a bot user and want to post to a channel, please subscribe the bot user to the channel so that messages can be edited.

config_file: an optional string specifying the path of the config file. If not specified, the default is ~/.zuliprc. Please make sure that this file can only be accessed by you (e.g., chmod 600 ~/.zuliprc). If you are a moma user, please create this file on squirrel, and it will then be synced to other machines. Please follow the Zulip documentation for the syntax of the config file and for obtaining an API key. If you can't create a new bot, please contact the administrators of your organization.

CopyFile

Copying files from the working directory.

Here is an example.

plugins:
  dacapo_latency:
    type: CopyFile
    patterns:
      - "scratch/dacapo-latency-*.csv"

Keys

patterns: a list of patterns following the Python 3 pathlib.Path.glob syntax. Files matching the patterns will be copied to LOG_DIR, where a different subfolder will be created for each invocation.

skip_failed: don't copy files from failed runs. The default value is true.

Interpreting the Outputs

Under construction 🚧.

Console Outputs

Log directory

Heap Size Calculations

Please refer to the source code like here and here for the actual algorithm.

But the basic idea is as follows. First, we start with the ends and the middle and gradually fill in the gaps, so that you can see the big-picture trend early. Second, the differences between sizes are smaller for smaller sizes and larger for larger sizes, because performance is much more sensitive to changes in heap size when the heap is small.

Best Practices

Under construction 🚧.

Continuously Monitor Your Experiments

The results are rsynced to remote_host once all invocations for a benchmark at a heap size are finished. You shouldn't log into the experiment machine, so as not to disturb the experiments. Instead, log into the remote host, check the LOG_DIR there, and watch the new results as they come in.

minheap

This subcommand runs benchmarks with different configs while varying heap sizes in a binary search fashion in order to determine the minimum heap required to run each benchmark.

The result is stored in a YAML file. The dictionary keys are encoded config strings. For each config, there is one dictionary per benchmark suite, where the minimum heap size for each benchmark is stored. An example is as follows.

temurin-17.openjdk_common.hotspot_gc-G1:
  dacapochopin-69a704e:
    avrora: 7
    batik: 189
temurin-17.openjdk_common.hotspot_gc-Parallel:
  dacapochopin-69a704e:
    avrora: 5
    batik: 235

At the end of each run, minheap will print out the configuration that achieves the smallest minheap size for the most benchmarks. The minheap values for that configuration will be printed out, which can then be used to populate the minheap values of a benchmark suite, such as a DaCapo benchmark suite. An example is as follows.

temurin-17.openjdk_common.hotspot_gc-G1 obtained the most number of smallest minheap sizes: 8
Minheap configuration to be copied to runbms config files
dacapochopin-69a704e:
  avrora: 7
  batik: 189
  biojava: 95
  eclipse: 411
  fop: 15
  graphchi: 255
  h2: 773
  jme: 29
  jython: 25
  luindex: 42
  lusearch: 21
  pmd: 156
  sunflow: 29
  tomcat: 21
  tradebeans: 131
  tradesoap: 103
  xalan: 8
  zxing: 97

Usage

minheap [-h] [-a|--attempts ATTEMPTS] CONFIG RESULT

-h: print help message.

-a (preview ⚠️): set the number of attempts. Overrides attempts in the config file.

CONFIG: the path to the configuration file. This is required.

RESULT: where to store the results. This file contains both the interim results and the final result. An interrupted execution can be resumed by using the same RESULT path. This is required.
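
For example, reusing the quickstart configuration, a minheap run might look like this (the result path is a placeholder):

running minheap two_builds.yml /path/to/minheap_results.yml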

Keys

maxheap: the upper bound of the search.

attempts (preview ⚠️): for a particular heap size, if an invocation passes or fails with OOM (timeout treated as OOM), the binary search will continue with the next appropriate heap size. If an invocation crashes and if the total number of invocations has not exceeded ATTEMPTS, the same heap size will be repeated. If all ATTEMPTS invocations crash, the binary search for this config will stop, and minheap will report inf.

fillin

Recipes

Here are some recipes for common tasks.

Whole-Process Performance Event Monitoring

JVMTI

Please clone and build probes, and then build distillation. You might need to change the paths referred to in the Makefiles to match your environment.

Under the distillation folder, you will find a JVMTI agent, libperf_statistics.so. You can check the source code here. To use the agent, there are four things you need to do.

First, you will need to tell the dynamic linker to load the shared library before the VM boots. This ensures that the inherit flag of perf_event_attr works properly and all child threads subsequently spawned are included in the results.

modifiers:
  jvmti_env:
    type: EnvVar
    var: "LD_PRELOAD"
    val: "/path/to/distillation/libperf_statistics.so"

Second, you need to specify a list of events you want to measure.

modifiers:
  perf:
    type: EnvVar
    var: "PERF_EVENTS"
    val: "PERF_COUNT_HW_CPU_CYCLES,PERF_COUNT_HW_INSTRUCTIONS,PERF_COUNT_HW_CACHE_LL:MISS,PERF_COUNT_HW_CACHE_L1D:MISS,PERF_COUNT_HW_CACHE_DTLB:MISS"

If you want to get a full list of events you can use on a particular machine, you can clone and build libpfm4 and run the showevtinfo program.

Third, you need to tell the JVM to load the agent. Note that you need to specify the absolute path.

modifiers:
  jvmti:
    type: JVMArg
    val: "-agentpath:/path/to/distillation/libperf_statistics.so"

Finally, you need to let the DaCapo benchmark inform the agent of the start and the end of a benchmark iteration. We will reuse the RustMMTk probe here, as the callback functions in the JVMTI agent are also called harness_begin and harness_end.

modifiers:
  probes_cp:
    type: JVMClasspath
    val: "/path/to/probes/out /path/to/probes/out/probes.jar"
  probes:
    type: JVMArg
    val: "-Djava.library.path=/path/to/probes/out -Dprobes=RustMMTk"

Now, putting it all together, you can define a set of modifiers, and use that set in your config strings.

modifiers:
  jvmti_common:
    type: ModifierSet
    val: "probes|probes_cp|jvmti|jvmti_env|perf"

MMTk

Please clone and build probes. You will need to build mmtk-core with the perf_counter feature.

First, you need to let the DaCapo benchmark inform MMTk of the start and the end of a benchmark iteration.

modifiers:
  probes_cp:
    type: JVMClasspath
    val: "/path/to/probes/out /path/to/probes/out/probes.jar"
  probes:
    type: JVMArg
    val: "-Djava.library.path=/path/to/probes/out -Dprobes=RustMMTk"

Then, you can specify a list of events you want to measure.

modifiers:
  mmtk_perf:
    type: EnvVar
    var: "MMTK_PHASE_PERF_EVENTS"
    val: "PERF_COUNT_HW_CPU_CYCLES,0,-1;PERF_COUNT_HW_INSTRUCTIONS,0,-1;PERF_COUNT_HW_CACHE_LL:MISS,0,-1;PERF_COUNT_HW_CACHE_L1D:MISS,0,-1;PERF_COUNT_HW_CACHE_DTLB:MISS,0,-1"

Note that the list is semicolon-separated. Each entry consists of three parts, separated by commas. The first part is the name of the event. Please refer to the previous section for details. The second part and the third part are pid and cpu, per man perf_event_open. In most cases, you want to use 0,-1, that is, measuring the calling thread (the results will be combined later through the inherit flag) on any CPU. For some events, such as RAPL, only package-wide measurement is supported, and you will have to adjust the values accordingly.

Note that you might have to increase the value of MAX_PHASES in crate::util::statistics::stats to a larger value, e.g., 1 << 14, so that the array storing the per-phase values will not overflow.

Work-Packet Performance Event Monitoring

It's similar to the whole-process performance event monitoring for MMTk. Just use MMTK_WORK_PERF_EVENTS instead of MMTK_PHASE_PERF_EVENTS.

Machine-Specific Known Problems

On Xeon D-1540 Broadwell, the PERF_COUNT_HW_CACHE_LL:MISS event is always zero.

perf stat -e LLC-load-misses,cycles /bin/ls

 Performance counter stats for '/bin/ls':

                 0      LLC-load-misses
         1,729,786      cycles

       0.001135511 seconds time elapsed

       0.001180000 seconds user
       0.000000000 seconds sys

On AMD machines, the PERF_COUNT_HW_CACHE_LL:MISS event fails to open; the perf_event_open syscall fails with No such file or directory.

Frequently Asked Questions

Changelog

Unreleased

Added

Changed

Deprecated

Removed

Fixed

Security

v0.4.4 (2023-11-23)

Fixed

Benchmark Suites

  • DaCapo correctly accepts the 23.11 release specified in dacapo.yml.

v0.4.3 (2023-11-20)

Added

Base Configurations

  • DaCapo 23.11-Chopin available as dacapochopin. Please use dacapochopin_jdk9, dacapochopin_jdk11, dacapochopin_jdk17, and dacapochopin_jdk21 modifiers for JDK 9, 11, 17, and 21 respectively when you use this suite with these JDK versions.
  • Temurin 21

Changed

Base Configurations

  • Environment variables are expanded when resolving paths of runtimes and benchmark suites.
  • --add-exports java.base/jdk.internal.ref=ALL-UNNAMED is no longer automatically added when running DaCapo benchmarks on JDK 9 or later. This approach doesn't scale now that we have more workarounds specific to different JDK versions, and it was also too opaque: it was not clear how it was implemented. New modifiers are introduced to address this issue.

Modifiers

  • EnvVar val is expanded using the outside environment prior to benchmark execution.

Deprecated

  • Deprecating Python 3.7 support for users. Python 3.7 was last released on June 6, 2023 (3.7.17), and no new release has been made since.

Removed

  • Dropping Python 3.7 support for developers (NOT users). pytest 7.4+ requires at least Python 3.8 (still supported by Ubuntu 20.04 LTS).

v0.4.2 (2023-09-10)

Changed

  • All Modifier instances now support includes for only attaching them to certain benchmarks.

Fixed

Runtimes

  • D8 now detects new JavaScript OOM error pattern.

Security

v0.4.1 (2023-08-22)

Fixed

Commands

  • runbms: apply modifiers in the config file.
  • minheap: apply modifiers in the config file.

v0.4.0 (2023-08-17)

Added

Modifiers

  • JuliaArg

Runtimes

  • JuliaMMTK
  • JuliaStock

Benchmark Suites

  • JuliaGCBenchmarks

Commands

  • runbms gains an extra argument, --skip-log-compression, to skip compressing log files with gzip.

Changed

Base Configurations

  • runbms: don't sync to squirrel.moma for the default runbms.yml. moma machine users should use runbms-anu.yml for the old behaviour.

Fixed

  • Gracefully handle empty modifier strings in configs, such as openjdk7||foobar.

Benchmark Suites

  • DaCapo-specific workarounds are now handled by the DaCapo class rather than the JavaBenchmark class to avoid confusion.

v0.3.9 (2023-08-02)

Fixed

Benchmark Suites

  • DaCapo: don't explicitly pass -s default to DaCapo unless the user requests so by setting the size key of DaCapo or overriding the sizes for individual benchmarks using the benchmark specification syntax. This is so that users can override the size via ProgramArg.

v0.3.8 (2023-02-21)

Changed

Commands

  • runbms: companion programs are now expected to self-terminate.

v0.3.7 (2023-02-14)

Fixed

Commands

  • runbms: better heuristics to detect whether a host is in the moma subnet.

v0.3.6 (2023-01-16)

Added

Base Configurations

  • DaCapo Chopin Snapshot-6e411f33

Fixed

  • Fixed type annotations in untyped functions and made Optionals explicit.

v0.3.5 (2022-10-13)

Changed

Commands

  • runbms: when a companion program exits with a non-zero code, a warning is generated instead of an exception to prevent stopping the entire experiment.

v0.3.4 (2022-10-13)

Fixed

Commands

  • runbms: fix the file descriptor leak when running benchmarks with companion programs.

v0.3.3 (2022-10-12)

Changed

Commands

  • runbms prints out the logged-in users when emitting warnings about the machine having more than one logged-in user.

Fixed

Modifiers

  • Companion: skip value options expansion if no value option is provided to avoid interpreting bpftrace syntax as replacement fields.

v0.3.2 (2022-10-12)

Added

Modifiers

  • Companion

v0.3.1 (2022-09-18)

Added

Base Syntax

  • Use the $RUNNING_NG_PACKAGE_DATA environment variable to refer to base configurations shipped with running-ng, such as $RUNNING_NG_PACKAGE_DATA/base/runbms.yml, regardless of how you installed running-ng.

Benchmark Suites

  • DaCapo gains an extra key companion to facilitate eBPF tracing programs.

Changed

  • Overhauled Python packaging with PEP 517
  • zulip is now an optional Python dependency. Use pip install running-ng[zulip] if you want to use the Zulip runbms plugin.

Removed

  • Dropping Python 3.6 support for users.

Base Configurations

  • Removing AdoptOpenJDK from the base configuration files. AdoptOpenJDK is now replaced by Temurin.

v0.3.0 (2022-03-19)

Added

Modifiers

  • JVMClasspathAppend
  • JVMClasspathPrepend

Benchmark Suites

  • SPECjvm98

Changed

Modifiers

  • JVMClasspath is now an alias of JVMClasspathAppend. This is backward compatible.

Commands

  • runbms prints out the version number of running-ng in log files.

Deprecated

  • Deprecating Python 3.6 support for users. Python 3.6 will NOT be supported once moma machines are upgraded to the latest Ubuntu LTS.

Removed

  • Dropping Python 3.6 support for developers (NOT users). pytest 7.1+ requires at least Python 3.7.

v0.2.2 (2022-03-07)

Fixed

Benchmark Suites

  • JavaBenchmarkSuite: Some DaCapo benchmarks refer to internal classes (e.g., under jdk.internal.ref), and DaCapo implemented a workaround for this behaviour in the jar. However, since we are invoking DaCapo using -cp and the name of the main class, that workaround is bypassed. The workaround is now reimplemented in running-ng through an extra JVM argument --add-exports.

v0.2.1 (2022-03-05)

Changed

Commands

  • runbms now skips printing CPU frequencies if the system doesn't support it, e.g., when using Docker Desktop on Mac.

Fixed

Benchmark Suites

  • BinaryBenchmarkSuite: fixes a missing parameter when constructing BinaryBenchmark due to a bug in a previous refactoring.

v0.2.0 (2022-02-20)

Added

Base Configurations

  • AdoptOpenJDK 16
  • DaCapo Chopin Snapshot-29a657f, Chopin Snapshot-f480064
  • Temurin 8, 11, 17
  • SPECjbb 2015, 1.03

Commands

  • minheap gains an extra key attempts (can be overridden by --attempts) so that crashes don't cause bogus minheap measurements.
  • minheap stores results in a YAML file, which is also used to resume an interrupted execution.
  • minheap prints the minheap values of the best config at the end.
  • runbms gains an extra argument, --resume, to resume an interrupted execution from a log folder.
  • runbms gains an extra argument, --workdir, to override the default working directory.
  • runbms adds more information about the environment to the log file, including the date, logged-in users, system load, and top processes.
  • runbms gains a callback-based plugin system, and an extra key plugins is added.
  • runbms gains a plugin CopyFile to copy files from the working directory.
  • runbms gains a plugin Zulip, which sends messages about the progress of the experiments, and warns about reservation expiration on moma machines.
  • runbms outputs a warning message if more than one user is logged in.
  • runbms uses uppercase letters if there are more than 26 configs.

Modifiers

  • ModifierSet
  • Wrapper
  • JSArg

Runtimes

  • D8
  • SpiderMonkey
  • JavaScriptCore
  • JVM now detects OOM generated in the form of Rust panic from mmtk-core.

Benchmark Suites

  • DaCapo gains an extra key size, which is used to specify the size of the input.
  • DaCapo now allows individual benchmarks to override the timing iteration, input size, and timeout of the suite.
  • SPECjbb2015: basic support for running SPECjbb 2015 in composite mode.
  • Octane: basic support for running Octane using Wenyu's wrapper script.

Changed

Benchmark Suites

  • The minheap key of DaCapo changes from a dictionary to a string. The string is used to look up minheap_values, which are collections of minheap values. This makes it easier to store multiple sets of minheap values for the same benchmark suite measured using different runtimes.

Base Syntax

  • Whitespaces can be used in config strings for visual alignment. They are ignored when parsed.

Commands

  • The --slice argument of runbms now accepts multiple comma-separated floating point numbers.

Removed

Base Configurations

  • DaCapo Chopin Snapshot-69a704e

Fixed

Commands

  • Resolving relative paths of runtimes before running. Otherwise, they would be resolved relative to the runbms working directory.
  • Use the BinaryIO interface for file IO and interprocess communication to avoid invalid UTF-8 characters crashing the script.
  • Subprocesses now inherit environment variables from the parent process.
  • minheap now runs in a temporary working directory to avoid file-based conflicts between concurrent executions. Note that network-port-based conflicts can still happen.

v0.1.0 (2021-08-09)

Initial release.

Added

Commands

  • fillin
  • minheap
  • runbms

Modifiers

  • JVMArg
  • JVMClasspath
  • EnvVar
  • ProgramArg

Runtimes

  • NativeExecutable
  • OpenJDK
  • JikesRVM

Benchmark Suites

  • BinaryBenchmarkSuite
  • DaCapo

Base Configurations

  • AdoptOpenJDK 8, 11, 12, 13, 14, 15
  • DaCapo 2006, 9.12 (Bach), 9.12 MR1, 9.12 MR1 for Java 6, Chopin Snapshot-69a704e