join-order-benchmark

The Join Order Benchmark (JOB) queries from "How Good Are Query Optimizers, Really?"

https://github.com/olros/join-order-benchmark


Repository

Basic Info
  • Host: GitHub
  • Owner: olros
  • License: mit
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 19.3 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created about 2 years ago · Last pushed almost 2 years ago
Metadata Files
Readme License Citation

README.md

Join-Order-Benchmark

Based on https://github.com/winkyao/join-order-benchmark

This package contains the Join Order Benchmark (JOB) queries from "How Good Are Query Optimizers, Really?" by Viktor Leis, Andrey Gubichev, Atanas Mirchev, Peter Boncz, Alfons Kemper, and Thomas Neumann, PVLDB Volume 9, No. 3, 2015.

The csv_files/imdb-create-tables.sql file and the queries in queries/*.sql have been adapted to MySQL syntax.

Quick Start

  1. Obtain the data:

     cd csv_files/
     wget http://homepages.cwi.nl/~boncz/job/imdb.tgz
     tar -xvzf imdb.tgz

  2. Launch the database server and connect (with local-infile turned on in the database server)

  3. Create the IMDb tables in MySQL (adjust the path to match your own checkout):

     mysql> SOURCE /Users/olafrosendahl/Documents/GitHub/join-order-benchmark/csv_files/imdb-create-tables.sql

  4. Load the data in MySQL:

     mysql> SOURCE /Users/olafrosendahl/Documents/GitHub/join-order-benchmark/csv_files/imdb-load-data.sql

  5. Add indexes to the IMDb database in MySQL:

     mysql> SOURCE /Users/olafrosendahl/Documents/GitHub/join-order-benchmark/csv_files/imdb-index-tables.sql

Afterwards, copy the data directory so that you can restore the database later without loading the data again. The data directory to copy is: /build/mysql-test/var/mysqld.1/data
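
The backup and restore step can be scripted. The following is a minimal sketch and not part of the repository: the function names backup_data_dir and restore_data_dir and the backup location are hypothetical, and it assumes the MySQL server is stopped while the directory is copied, since copying a data directory that is in use may produce an inconsistent copy.

```python
import shutil
from pathlib import Path


def backup_data_dir(data_dir: Path, backup_dir: Path) -> Path:
    """Copy the MySQL data directory to backup_dir and return the backup path."""
    if not data_dir.is_dir():
        raise FileNotFoundError(f"data directory not found: {data_dir}")
    if backup_dir.exists():
        # Refuse to overwrite an earlier backup by accident.
        raise FileExistsError(f"backup already exists: {backup_dir}")
    # copytree recreates the whole tree, including subdirectories.
    shutil.copytree(data_dir, backup_dir)
    return backup_dir


def restore_data_dir(backup_dir: Path, data_dir: Path) -> None:
    """Replace data_dir with a fresh copy of the backup."""
    if not backup_dir.is_dir():
        raise FileNotFoundError(f"backup not found: {backup_dir}")
    if data_dir.exists():
        # Discard the current state before copying the backup in.
        shutil.rmtree(data_dir)
    shutil.copytree(backup_dir, data_dir)
```

With the server stopped, a call such as backup_data_dir(Path("/build/mysql-test/var/mysqld.1/data"), Path("/build/imdb-data-backup")) would take the copy; the second path is only an example location.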

Running the queries

We use hyperfine as the benchmarking tool to measure the queries, so you need to install it before running them. To run all queries, run the following in your terminal:

bash ./run_queries.sh

This runs the queries in the queries folder one by one, first without re-optimization and then with re-optimization using different variables for the re-optimization hint. The results are written as one JSON file per query to separate folders inside the results folder. Progress is shown in the terminal while the queries execute.
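
To give an idea of how these result files can be consumed, here is a minimal sketch that collects the mean runtime from every JSON file in one result folder. It relies on the layout of hyperfine's JSON export (a top-level "results" list whose entries carry a "mean" in seconds). The function name load_hyperfine_means is hypothetical, and the assumption that each file is named after its query and holds a single benchmarked command is a guess about this repository's output, not something the README confirms.

```python
import json
from pathlib import Path


def load_hyperfine_means(results_dir: Path) -> dict[str, float]:
    """Map each result file's stem (for example '1a') to its mean runtime in seconds."""
    means: dict[str, float] = {}
    for path in sorted(results_dir.glob("*.json")):
        with path.open() as fh:
            data = json.load(fh)
        # hyperfine writes one entry per benchmarked command under "results".
        runs = data.get("results", [])
        if runs:
            # Assumption: one command per file, so the first entry is the query.
            means[path.stem] = runs[0]["mean"]
    return means
```

Calling load_hyperfine_means on one of the folders inside the results folder returns a dictionary that can be printed directly or passed on to pandas for further analysis.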

Run single query

You can also run a single query, both without and with re-optimization, by running the following in your terminal. Replace <query> with the name of the query you want to run:

sh ./run_query.sh <query>

The result is written to a JSON file in the results folder and is also shown in the terminal.

Order Problem

Please note that queries/17b.sql and queries/8d.sql may exhibit order issues due to the use of different order rules from MySQL. This is not a real bug.

Analyze results

We've created a Python script, visulize-info.py, with many different methods for visualizing the results. Open it to choose which results you want visualized before running it.
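
As an illustration of the kind of comparison such an analysis might make, the sketch below computes per-query speedups from two sets of mean runtimes. It is not taken from visulize-info.py: the function name speedups and the input dictionaries are hypothetical, and it assumes mean runtimes in seconds keyed by query name.

```python
def speedups(
    baseline: dict[str, float], reoptimized: dict[str, float]
) -> list[tuple[str, float]]:
    """Return (query, baseline / reoptimized) pairs, largest speedup first."""
    # Only compare queries that were measured in both configurations.
    common = baseline.keys() & reoptimized.keys()
    ratios = [
        (query, baseline[query] / reoptimized[query])
        for query in common
        if reoptimized[query] > 0  # skip degenerate measurements
    ]
    # A ratio above 1 means the re-optimized run was faster.
    return sorted(ratios, key=lambda item: item[1], reverse=True)


def format_speedups(rows: list[tuple[str, float]]) -> str:
    """Render the speedups as a small plain-text table."""
    lines = [f"{'query':<8}{'speedup':>10}"]
    lines += [f"{query:<8}{ratio:>10.2f}" for query, ratio in rows]
    return "\n".join(lines)
```

For example, print(format_speedups(speedups(baseline_means, reopt_means))) would list the queries that benefit most from re-optimization first; both argument names are placeholders for dictionaries you build from the result files.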

Owner

  • Name: Olaf Rosendahl
  • Login: olros
  • Kind: user
  • Location: Trondheim, Norway
  • Company: Kantega

Computer Engineering student at NTNU Trondheim

Citation (CITATION.cff)

cff-version: 1.2.0
title: Join Order Benchmark
message: >-
  If you use this software in scientific
  publications, please consider citing it using the
  metadata from this file.
type: software
authors:
  - given-names: Olaf
    family-names: Rosendahl
    email: olafrosendahl@gmail.com
repository-code: 'https://github.com/olros/join-order-benchmark'
abstract: The Join Order Benchmark in MySQL with scripts for running the benchmark.
license: MIT

Dependencies

requirements.txt pypi
  • matplotlib *
  • pandas *
  • seaborn *