https://github.com/cwida/benchmarker

https://github.com/cwida/benchmarker

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.5%) to scientific vocabulary
Last synced: 9 months ago · JSON representation

Repository

Basic Info
  • Host: GitHub
  • Owner: cwida
  • Language: Python
  • Default Branch: main
  • Size: 316 KB
Statistics
  • Stars: 3
  • Watchers: 1
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Created over 1 year ago · Last pushed about 1 year ago
Metadata Files
Readme

README.md

Benchmarker

This is a simple benchmarking tool for SQL-based systems.

Installation

This project uses poetry for dependency management. First, install poetry: bash pip install poetry Then, clone the repository and install the dependencies: bash git clone https://github.com/cwida/benchmarker.git cd benchmarker poetry install You can also directly run the project with poetry: bash poetry run python experiments/tpcds_nightly_experiment.py

Experiment Configuration

You can create an experiment by adding a new pythonscript in the experiments directory. For an easy start, just copy an existing experiment and adjust the parameters. All output will be stored in the `output/results///directory, with the raw run data in theoutput/runs///` directory and the summary in the

Example:

```python from config.benchmark.tpcds import gettpcdsbenchmark from config.systems.duckdb import DUCKDBV113, DUCKDBNIGHTLY from src.models import RunConfig from src.runner.experiment_runner import run

def main(): sfs = [1, 3] config: RunConfig = { 'name': 'tpcdsnightlyexperiment', 'runsettings': { 'nparallel': 1, 'nruns': 3, }, 'systemsettings': [ {'nthreads': 1}, {'nthreads': 2}, {'nthreads': 4}, {'nthreads': 8}, ], 'systems': [DUCKDBV113, DUCKDBNIGHTLY], 'benchmarks': gettpcdsbenchmark(sfs), } run(config)

if name == "main": main() ```

Each experiment should have a main() that calls the run() function with the experiment configuration. The configuration is structured as follows:

  1. name: The name of the experiment. This is used to create a directory in the _output/results directory. Additionally, there will also be one folder per run in this directory so you can run the same experiment multiple times and compare the results.
  2. run_settings: The settings for the experiment run:
    1. n_parallel: The number of parallel runs for running multiple runs in parallel to speed up the experiment. Make sure that your number of cores is not exceeded. Defaults to 1.
    2. n_runs: The number of runs per system setting. This is useful if you want to run multiple runs to get a more stable result. Defaults to 5.
    3. seed: The seed for the random number generator during data generation. Defaults to 0.42.
    4. timeout: The timeout for each run in seconds. Defaults to 60.
    5. offset: The offset for the run number if you want to continue an experiment that was interrupted. Defaults to 0.
    6. max_n_experiments: The maximum number of experiments to run. Defaults to None.
  3. system_settings: Can be a single dictionary or a list of dictionaries. If multiple dictionaries are provided, the experiment will be run for each system setting. Each dictionary should contain the system setting parameters:
    1. n_threads: The number of threads to use for the system. Defaults to 1.
  4. systems: A list of system configurations to run the experiment on. See more at System Configuration.
  5. benchmarks: A list of benchmarks to run. See more at Benchmark Configuration.

System Configuration

You can configure the systems in the config/systems directory. Each system should have a python file with a dictionary. For DuckDB, there is already a configuration in config/systems/duckdb.py:

python DUCK_DB_MAIN: System = { 'version': 'v1.0.0', 'name': 'duckdb', 'build_config': { 'build_command': 'GEN=ninja BUILD_HTTPFS=1 BUILD_TPCH=1 BUILD_TPCDS=1 make', 'location': { 'location': 'github', 'github_url': 'https://github.com/duckdb/duckdb/commit/1f98600c2cf8722a6d2f2d805bb4af5e701319fc', }, }, 'run_config': { 'run_file': 'build/release/duckdb <', 'run_file_relative_to_build': True, }, 'setup_script': '', 'set_threads_command': lambda n_threads: f"PRAGMA threads = {n_threads};", 'get_start_profiler_command': get_duckdb_profile_script, 'get_metrics': get_duckdb_runtime_and_cardinality, }

The name and version are used for identifying the system in the output.

Build Configuration

To build a system from source, you can pass a build_command together with the location of the source code. The location can either be a github repository or a local directory.

GitHub Repository:

  • location: Must be set to github.
  • github_url: The URL of the github repository. Can be a link to a) a specific commit, b) a branch, or c) the root of the repository.

Local Directory:

  • location: Must be set to local.
  • local_path: The path to the local directory.

At the start of the experiment, the system will be cloned into the _output/systems directory. The build command will be executed in the cloned directory. Also, when on a branch or repository, the newest commit will be fetched. So one workflow could be to push a new commit to the repository and then re-run the experiment to test the new changes.

Run Configuration

The run_config contains the command to run the system. The run_file is the command to run the system. If the command is relative to the build directory, set run_file_relative_to_build to True. This is necessary e.g. when building from GitHub.

The run command must be able to take a SQL file as input. This is why for DuckDB, the command is build/release/duckdb <. This will be then used as build/release/duckdb < <SQL_FILE>.

Setup Script

Can be used to set up the system before running the benchmarks. Not necessary for DuckDB. But here e.g. one could install a DuckDB community extension.

Set Threads Command

This callback function is used to set the number of threads for the system. For DuckDB, this is done by setting the PRAGMA threads command.

Get Start Profiler Command

This callback function is used to start the profiler for the system. As experiments can be run in parallel, the profiler takes the thread_index as an argument so you can distinguish between the different threads. For DuckDB, this looks like:

python def get_duckdb_profile_script(thread: int) -> str: path = get_profile_path_duckdb(thread) string = f"PRAGMA enable_profiling = 'json';pragma profile_output='{path}';" return string

Get Metrics

Here we need to return the metrics per query. We collect the cardinality for consistency checks and the runtime. For DuckDB, this is a bit complicated as we need to parse the JSON output of the profiler we just configured above.

```python def getduckdbruntimeandcardinality(thread: int) -> Optional[Tuple[float, int]]: # load the json jsonpath = getprofilepathduckdb(thread)

# if the query crashed, the file does not exist
if not os.path.exists(json_path):
    return None

with open(json_path, 'r') as f:
    profile = json.load(f)

# get the runtime, can be either in the timing or operator_timing field because of different DuckDB versions
if 'timing' in profile:
    runtime = profile['timing']
    cardinality = profile['children'][0]['children'][0]['cardinality']
elif 'operator_timing' in profile:
    runtime = profile['operator_timing']
    try:
        cardinality = profile['children'][0]['cardinality']
    except KeyError:  # Sometimes operator cardinality is not found for whatever reason
        cardinality = profile['children'][0]['operator_cardinality']

elif 'latency' in profile:
    runtime = profile['latency']
    cardinality = profile['result_set_size']

else:
    # throw an error if the runtime is not found
    raise ValueError(f'Runtime not found in profile: {profile}')
# delete the file
os.remove(json_path)
return runtime, cardinality

```

Benchmark Configuration

The Benchmarks consist of a list of queries and a list of datasets.

```python class Benchmark(TypedDict): name: str datasets: List[DataSet] queries: List[Query]

class DataSet(TypedDict): name: str setup_script: Script config: Dict[str, any]

class Query(TypedDict): name: str index: int run_script: Script `` For each benchmark, *eachQuerywill be run on eachDataSet`*. Therefore, the must be compatible (all the tables that the query needs must be present in the dataset).

For TPC-DS and TPC-H, there are already configurations in config/benchmark/tpcds.py and config/benchmark/tpch.py for DuckDB.

Example: TPC-H

python TPC_H_QUERIES: List[Query] = [ { 'name': f'tpch{i + 1}', 'index': i, 'run_script': { "duckdb": f"PRAGMA tpch({i + 1});", } } for i in range(22) ] Running the TPC-H queries is as simple as setting the run_script to the PRAGMA command that runs the query. But if you would like to run a custom query, you have to see the duckdb implementation of the run script: python 'run_script': { "duckdb": f"SELECT * FROM WHATEVER WHERE SOMETHING = {i + 1};", } For TPC-H, the data is also generated and stored under output/data/tpch/. To generate data for your custom query, look at the tpch.py file in the config/benchmark directory and adjust it for your needs.

Evaluation

Aggregated and raw results of the experiment are stored in the _output/results/<experiment_name>/<timestamp>/ directory. It also contains a Summary.md file that contains the aggregated results and plots for the experiment and some basic plots. The raw data of each run is stored in the output/runs/<experiment_name>/<timestamp>/ directory.

Example Plot: Runtime per System and Query

System

Evaluation

Aggregated and raw results of the experiment are stored in the _output/results/<experiment_name>/<timestamp>/ directory. It also contains a Summary.md file that contains the aggregated results and plots for the experiment and some basic plots. The raw data of each run is stored in the output/runs/<experiment_name>/<timestamp>/ directory.

Example Plot: Runtime per System and Query

System

Owner

  • Name: CWI Database Architectures Group
  • Login: cwida
  • Kind: organization
  • Location: Amsterdam, The Netherlands

GitHub Events

Total
  • Watch event: 3
  • Push event: 11
  • Fork event: 1
Last Year
  • Watch event: 3
  • Push event: 11
  • Fork event: 1

Dependencies

poetry.lock pypi
  • colorama 0.4.6
  • contourpy 1.3.1
  • cycler 0.12.1
  • duckdb 1.1.3
  • fonttools 4.55.8
  • kiwisolver 1.4.8
  • logging 0.4.9.6
  • matplotlib 3.10.0
  • narwhals 1.24.1
  • numpy 2.2.2
  • packaging 24.2
  • pandas 2.2.3
  • pillow 11.1.0
  • plotly 6.0.0
  • psutil 6.1.1
  • pyparsing 3.2.1
  • python-dateutil 2.9.0.post0
  • pytz 2025.1
  • six 1.17.0
  • tqdm 4.67.1
  • tzdata 2025.1
pyproject.toml pypi
  • duckdb ^1.1.3
  • logging ^0.4.9.6
  • matplotlib ^3.10.0
  • numpy ^2.2.2
  • pandas ^2.2.3
  • plotly ^6.0.0
  • psutil ^6.1.1
  • python ^3.10
  • tqdm ^4.67.1