join-order-benchmark

The Join Order Benchmark (JOB) queries from "How Good Are Query Optimizers, Really?"

https://github.com/olros/join-order-benchmark

Repository

Basic Info

Host: GitHub
Owner: olros
License: mit
Language: Python
Default Branch: main
Homepage:
Size: 19.3 MB

Statistics

Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Created about 2 years ago · Last pushed almost 2 years ago

Join-Order-Benchmark

Based on https://github.com/winkyao/join-order-benchmark

This package contains the Join Order Benchmark (JOB) queries from "How Good Are Query Optimizers, Really?" by Viktor Leis, Andrey Gubichev, Atanas Mirchev, Peter Boncz, Alfons Kemper, and Thomas Neumann. PVLDB Volume 9, No. 3, 2015.

The files csv_files/imdb-create-tables.sql and queries/*.sql have been modified to use MySQL syntax.

Quick Start

1. Obtain the data:

       cd csv_files/
       wget http://homepages.cwi.nl/~boncz/job/imdb.tgz
       tar -xvzf imdb.tgz

2. Launch the database server with local-infile turned on, then connect to it.
3. Create the IMDb tables in MySQL (the absolute paths in these steps come from the author's machine; replace them with the location of your own checkout):

       mysql> SOURCE /Users/olafrosendahl/Documents/GitHub/join-order-benchmark/csv_files/imdb-create-tables.sql

4. Load the data in MySQL:

       mysql> SOURCE /Users/olafrosendahl/Documents/GitHub/join-order-benchmark/csv_files/imdb-load-data.sql

5. Add indexes to the IMDb database in MySQL:

       mysql> SOURCE /Users/olafrosendahl/Documents/GitHub/join-order-benchmark/csv_files/imdb-index-tables.sql
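
The SOURCE commands above point at an absolute path. As a convenience, the following minimal sketch prints the same three commands for an arbitrary checkout location. The helper name `source_statements` and the example path are hypothetical; only the script names and their order come from the steps above.

```python
from pathlib import Path

# Script names are taken from the setup steps; they must run in this order.
SETUP_SCRIPTS = (
    "imdb-create-tables.sql",
    "imdb-load-data.sql",
    "imdb-index-tables.sql",
)


def source_statements(repo_root):
    """Return one `SOURCE <path>` line per setup script, in execution order."""
    csv_dir = Path(repo_root) / "csv_files"
    return [f"SOURCE {(csv_dir / name).as_posix()}" for name in SETUP_SCRIPTS]


if __name__ == "__main__":
    # Hypothetical checkout location; replace it with your own.
    for line in source_statements("/path/to/join-order-benchmark"):
        print(line)
```

The printed lines can be pasted into the mysql prompt one after another.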

Afterwards, copy the data directory so that you can restore the database later without loading the data again. The directory to copy is: /build/mysql-test/var/mysqld.1/data
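
One way to take and restore such a copy is sketched below using only the Python standard library. The function names and the backup location are hypothetical, and the sketch assumes the database server is shut down while files are copied, since copying a live data directory can yield an inconsistent snapshot.

```python
import shutil
from pathlib import Path


def snapshot_data_dir(data_dir, backup_dir):
    """Copy the data directory to backup_dir (assumes the server is stopped)."""
    data_dir, backup_dir = Path(data_dir), Path(backup_dir)
    if backup_dir.exists():
        raise FileExistsError(f"{backup_dir} already exists; refusing to overwrite")
    shutil.copytree(data_dir, backup_dir)
    return backup_dir


def restore_data_dir(backup_dir, data_dir):
    """Replace data_dir with the contents of a previously taken snapshot."""
    data_dir, backup_dir = Path(data_dir), Path(backup_dir)
    if not backup_dir.is_dir():
        raise FileNotFoundError(f"no snapshot at {backup_dir}")
    if data_dir.exists():
        shutil.rmtree(data_dir)  # discard the current (possibly modified) data
    shutil.copytree(backup_dir, data_dir)
    return data_dir
```

For example, `snapshot_data_dir("/build/mysql-test/var/mysqld.1/data", "/some/backup/location")` would be called once after loading, and `restore_data_dir` with the arguments swapped whenever a clean state is needed.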

Running the queries

We use hyperfine as the benchmarking tool to measure the queries, so you need to install it before running them. To run all queries, run the following in your terminal:

bash ./run_queries.sh

This runs the queries in the queries folder one by one, first without re-optimization and then with re-optimization, using different variables for the re-optimization hint. The results are written as JSON files, one per query, to separate folders inside the results folder. Progress is shown in the terminal while the queries execute.
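
How run_queries.sh calls hyperfine is not shown here, so the following is only a sketch of what timing one query file could look like. It relies on the hyperfine options `--warmup`, `--runs`, and `--export-json`; the database name `imdb`, the run counts, and the output path are assumptions, and the re-optimization hint is deliberately left out.

```python
import shlex


def hyperfine_command(query_file, json_out, database="imdb", runs=10, warmup=1):
    """Build the argv list for timing one query file with hyperfine."""
    # hyperfine runs the benchmarked command through a shell by default,
    # so the `<` redirection below is interpreted as usual.
    benchmarked = f"mysql {shlex.quote(database)} < {shlex.quote(query_file)}"
    return [
        "hyperfine",
        "--warmup", str(warmup),
        "--runs", str(runs),
        "--export-json", json_out,
        benchmarked,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Building it as a list keeps the benchmarked command a single argument, which is what hyperfine expects.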

Run single query

You can also run a single query, both without and with re-optimization, by running the following in your terminal. Replace <query> with the name of the query you want to run:

sh ./run_query.sh <query>

The result will be written to a JSON file in the results folder and will also be visible in the terminal.
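
Assuming the JSON files are plain hyperfine exports (a top-level `results` list whose entries carry `command` and `mean`, among other fields), a result file could be read as sketched below. The function name `load_timings` is hypothetical, and the exact file layout inside the results folder is not assumed.

```python
import json
from pathlib import Path


def load_timings(json_path):
    """Return {command: mean_seconds} from one hyperfine JSON export."""
    data = json.loads(Path(json_path).read_text())
    return {entry["command"]: entry["mean"] for entry in data["results"]}
```

hyperfine reports times in seconds, so the values can be compared directly across files.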

Order Problem

Please note that queries/17b.sql and queries/8d.sql may exhibit order issues due to the use of different order rules from MySQL. This is not a real bug.
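
The note does not say what kind of ordering is involved. One plausible reading, which is an assumption rather than something stated here, is that the JOB queries report MIN(...) over text columns, so the smallest string depends on the comparison rules in effect. The sketch below illustrates that general effect in plain Python with made-up values; it does not reproduce MySQL's collation behaviour.

```python
def smallest(values, case_sensitive=True):
    """Return the minimum string under one of two comparison rules."""
    # Case-sensitive: compare by code point, so uppercase sorts first.
    # Case-insensitive: compare the case-folded form of each string.
    key = None if case_sensitive else str.casefold
    return min(values, key=key)


if __name__ == "__main__":
    names = ["apple", "Banana", "cherry"]  # made-up example values
    print(smallest(names, case_sensitive=True))
    print(smallest(names, case_sensitive=False))
```

The same input yields a different minimum under each rule, which is why such a difference would not indicate a bug in the query itself.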

Analyze results

We've created a Python script, visulize-info.py, with many different methods for visualizing the results. Open it to choose which results you want visualized before running it.
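
The contents of visulize-info.py are not reproduced here. As an independent illustration of one comparison such an analysis might make, the sketch below computes per-query speedups from two mappings of query name to mean runtime in seconds. The function name and the input shape are assumptions.

```python
def speedups(baseline, reoptimized):
    """Return {query: baseline_mean / reoptimized_mean} for shared queries."""
    return {
        query: baseline[query] / reoptimized[query]
        for query in sorted(baseline.keys() & reoptimized.keys())
        if reoptimized[query] > 0  # skip entries that would divide by zero
    }


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    base = {"1a": 2.0, "2b": 1.0}
    reopt = {"1a": 1.0, "2b": 2.0}
    for query, factor in speedups(base, reopt).items():
        print(f"{query}: {factor:.2f}x")
```

A value above 1 means the re-optimized run was faster than the baseline, and a value below 1 means it was slower.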

Owner

Name: Olaf Rosendahl
Login: olros
Kind: user
Location: Trondheim, Norway
Company: Kantega

Website: https://olafros.com
Repositories: 2
Profile: https://github.com/olros

Computer Engineering student at NTNU Trondheim

Citation (CITATION.cff)

cff-version: 1.2.0
title: Join Order Benchmark
message: >-
  If you use this software in scientific
  publications, please consider citing it using the
  metadata from this file.
type: software
authors:
  - given-names: Olaf
    family-names: Rosendahl
    email: olafrosendahl@gmail.com
repository-code: 'https://github.com/olros/join-order-benchmark'
abstract: The Join Order Benchmark in MySQL with scripts for running the benchmark.
license: MIT
