https://github.com/cwida/benchmarker
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
✓codemeta.json file
Found codemeta.json file -
○.zenodo.json file
-
○DOI references
-
○Academic publication links
-
○Academic email domains
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (12.5%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: cwida
- Language: Python
- Default Branch: main
- Size: 316 KB
Statistics
- Stars: 3
- Watchers: 1
- Forks: 1
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Benchmarker
This is a simple benchmarking tool for SQL-based systems.
Installation
This project uses poetry for dependency management. First, install poetry:
bash
pip install poetry
Then, clone the repository and install the dependencies:
bash
git clone https://github.com/cwida/benchmarker.git
cd benchmarker
poetry install
You can also directly run the project with poetry:
bash
poetry run python experiments/tpcds_nightly_experiment.py
Experiment Configuration
You can create an experiment by adding a new pythonscript in the experiments directory. For an easy start,
just copy an existing experiment and adjust the parameters. All output will be stored in the
`output/results/directory,
with the raw run data in theoutput/runs/
Example:
```python from config.benchmark.tpcds import gettpcdsbenchmark from config.systems.duckdb import DUCKDBV113, DUCKDBNIGHTLY from src.models import RunConfig from src.runner.experiment_runner import run
def main(): sfs = [1, 3] config: RunConfig = { 'name': 'tpcdsnightlyexperiment', 'runsettings': { 'nparallel': 1, 'nruns': 3, }, 'systemsettings': [ {'nthreads': 1}, {'nthreads': 2}, {'nthreads': 4}, {'nthreads': 8}, ], 'systems': [DUCKDBV113, DUCKDBNIGHTLY], 'benchmarks': gettpcdsbenchmark(sfs), } run(config)
if name == "main": main() ```
Each experiment should have a main() that calls the run() function with the experiment configuration. The configuration is structured as follows:
name: The name of the experiment. This is used to create a directory in the_output/resultsdirectory. Additionally, there will also be one folder per run in this directory so you can run the same experiment multiple times and compare the results.run_settings: The settings for the experiment run:n_parallel: The number of parallel runs for running multiple runs in parallel to speed up the experiment. Make sure that your number of cores is not exceeded. Defaults to1.n_runs: The number of runs per system setting. This is useful if you want to run multiple runs to get a more stable result. Defaults to5.seed: The seed for the random number generator during data generation. Defaults to0.42.timeout: The timeout for each run in seconds. Defaults to60.offset: The offset for the run number if you want to continue an experiment that was interrupted. Defaults to0.max_n_experiments: The maximum number of experiments to run. Defaults toNone.
system_settings: Can be a single dictionary or a list of dictionaries. If multiple dictionaries are provided, the experiment will be run for each system setting. Each dictionary should contain the system setting parameters:n_threads: The number of threads to use for the system. Defaults to1.
systems: A list of system configurations to run the experiment on. See more at System Configuration.benchmarks: A list of benchmarks to run. See more at Benchmark Configuration.
System Configuration
You can configure the systems in the config/systems directory. Each system should have a python file with a
dictionary. For DuckDB, there is already a configuration in config/systems/duckdb.py:
python
DUCK_DB_MAIN: System = {
'version': 'v1.0.0',
'name': 'duckdb',
'build_config': {
'build_command': 'GEN=ninja BUILD_HTTPFS=1 BUILD_TPCH=1 BUILD_TPCDS=1 make',
'location': {
'location': 'github',
'github_url': 'https://github.com/duckdb/duckdb/commit/1f98600c2cf8722a6d2f2d805bb4af5e701319fc',
},
},
'run_config': {
'run_file': 'build/release/duckdb <',
'run_file_relative_to_build': True,
},
'setup_script': '',
'set_threads_command': lambda n_threads: f"PRAGMA threads = {n_threads};",
'get_start_profiler_command': get_duckdb_profile_script,
'get_metrics': get_duckdb_runtime_and_cardinality,
}
The name and version are used for identifying the system in the output.
Build Configuration
To build a system from source, you can pass a build_command together with the location of the source code.
The location can either be a github repository or a local directory.
GitHub Repository:
location: Must be set togithub.github_url: The URL of the github repository. Can be a link to a) a specific commit, b) a branch, or c) the root of the repository.
Local Directory:
location: Must be set tolocal.local_path: The path to the local directory.
At the start of the experiment, the system will be cloned into the _output/systems directory. The build command will
be executed in the cloned directory.
Also, when on a branch or repository, the newest commit will be fetched. So one workflow could be to push a new commit
to the repository and then re-run the experiment to test the new changes.
Run Configuration
The run_config contains the command to run the system. The run_file is the command to run the system.
If the command is relative to the build directory, set run_file_relative_to_build to True. This is necessary e.g.
when building from GitHub.
The run command must be able to take a SQL file as input. This is why for DuckDB, the command is
build/release/duckdb <. This will be then used as build/release/duckdb < <SQL_FILE>.
Setup Script
Can be used to set up the system before running the benchmarks. Not necessary for DuckDB. But here e.g. one could install a DuckDB community extension.
Set Threads Command
This callback function is used to set the number of threads for the system. For DuckDB, this is done by setting the
PRAGMA threads command.
Get Start Profiler Command
This callback function is used to start the profiler for the system. As experiments can be run in parallel, the profiler
takes the thread_index as an argument so you can distinguish between the different threads. For DuckDB, this looks
like:
python
def get_duckdb_profile_script(thread: int) -> str:
path = get_profile_path_duckdb(thread)
string = f"PRAGMA enable_profiling = 'json';pragma profile_output='{path}';"
return string
Get Metrics
Here we need to return the metrics per query. We collect the cardinality for consistency checks and the runtime. For DuckDB, this is a bit complicated as we need to parse the JSON output of the profiler we just configured above.
```python def getduckdbruntimeandcardinality(thread: int) -> Optional[Tuple[float, int]]: # load the json jsonpath = getprofilepathduckdb(thread)
# if the query crashed, the file does not exist
if not os.path.exists(json_path):
return None
with open(json_path, 'r') as f:
profile = json.load(f)
# get the runtime, can be either in the timing or operator_timing field because of different DuckDB versions
if 'timing' in profile:
runtime = profile['timing']
cardinality = profile['children'][0]['children'][0]['cardinality']
elif 'operator_timing' in profile:
runtime = profile['operator_timing']
try:
cardinality = profile['children'][0]['cardinality']
except KeyError: # Sometimes operator cardinality is not found for whatever reason
cardinality = profile['children'][0]['operator_cardinality']
elif 'latency' in profile:
runtime = profile['latency']
cardinality = profile['result_set_size']
else:
# throw an error if the runtime is not found
raise ValueError(f'Runtime not found in profile: {profile}')
# delete the file
os.remove(json_path)
return runtime, cardinality
```
Benchmark Configuration
The Benchmarks consist of a list of queries and a list of datasets.
```python class Benchmark(TypedDict): name: str datasets: List[DataSet] queries: List[Query]
class DataSet(TypedDict): name: str setup_script: Script config: Dict[str, any]
class Query(TypedDict):
name: str
index: int
run_script: Script
``
For each benchmark, *eachQuerywill be run on eachDataSet`*. Therefore, the must be compatible (all the tables that
the query needs must be present in the dataset).
For TPC-DS and TPC-H, there are already configurations in config/benchmark/tpcds.py and config/benchmark/tpch.py for
DuckDB.
Example: TPC-H
python
TPC_H_QUERIES: List[Query] = [
{
'name': f'tpch{i + 1}',
'index': i,
'run_script': {
"duckdb": f"PRAGMA tpch({i + 1});",
}
} for i in range(22)
]
Running the TPC-H queries is as simple as setting the run_script to the PRAGMA command that runs the query. But
if you would like to run a custom query, you have to see the duckdb implementation of the run script:
python
'run_script': {
"duckdb": f"SELECT * FROM WHATEVER WHERE SOMETHING = {i + 1};",
}
For TPC-H, the data is also generated and stored under output/data/tpch/. To generate data for your custom query,
look at the tpch.py file in the config/benchmark directory and adjust it for your needs.
Evaluation
Aggregated and raw results of the experiment are stored in the _output/results/<experiment_name>/<timestamp>/ directory.
It also contains a Summary.md file that contains the aggregated results and plots for the experiment and some
basic plots.
The raw data of each run is stored in the output/runs/<experiment_name>/<timestamp>/ directory.
Example Plot: Runtime per System and Query

Evaluation
Aggregated and raw results of the experiment are stored in the _output/results/<experiment_name>/<timestamp>/ directory.
It also contains a Summary.md file that contains the aggregated results and plots for the experiment and some
basic plots.
The raw data of each run is stored in the output/runs/<experiment_name>/<timestamp>/ directory.
Example Plot: Runtime per System and Query

Owner
- Name: CWI Database Architectures Group
- Login: cwida
- Kind: organization
- Location: Amsterdam, The Netherlands
- Website: https://www.cwi.nl/research/groups/database-architectures
- Twitter: cwi_da
- Repositories: 19
- Profile: https://github.com/cwida
GitHub Events
Total
- Watch event: 3
- Push event: 11
- Fork event: 1
Last Year
- Watch event: 3
- Push event: 11
- Fork event: 1
Dependencies
- colorama 0.4.6
- contourpy 1.3.1
- cycler 0.12.1
- duckdb 1.1.3
- fonttools 4.55.8
- kiwisolver 1.4.8
- logging 0.4.9.6
- matplotlib 3.10.0
- narwhals 1.24.1
- numpy 2.2.2
- packaging 24.2
- pandas 2.2.3
- pillow 11.1.0
- plotly 6.0.0
- psutil 6.1.1
- pyparsing 3.2.1
- python-dateutil 2.9.0.post0
- pytz 2025.1
- six 1.17.0
- tqdm 4.67.1
- tzdata 2025.1
- duckdb ^1.1.3
- logging ^0.4.9.6
- matplotlib ^3.10.0
- numpy ^2.2.2
- pandas ^2.2.3
- plotly ^6.0.0
- psutil ^6.1.1
- python ^3.10
- tqdm ^4.67.1