geospatial-benchmark

Benchmark of geospatial query performance on NoSQL(-like) databases.

https://github.com/guiyunbao/geospatial-benchmark

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.8%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: guiyunbao
  • License: cc-by-sa-4.0
  • Language: TypeScript
  • Default Branch: main
  • Size: 18.9 MB
Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 1
  • Open Issues: 0
  • Releases: 1
Created almost 3 years ago · Last pushed almost 3 years ago

README.md

Benchmark geospatial query performance on NoSQL(-like) databases

The full paper can be found in Releases. The paper's source is in report.typ, which you can compile with Typst.

Test Environment

Run Benchmarks

```log
npm run start [-- [options]]

-t, --type    type of dataset "inat2017", "random", "grid", "cluster"
-c, --count   number of data points (default: "100000")
-r, --repeat  number of repeat (default: "1")
```

Test dataset

By default, this benchmark uses iNaturalist 2017's Fine Grained Geolocation Datasets (visipedia/fg_geo), which contain 654,818 geolocation records.

The file is placed at datasets/inat2017/inat2017_file_name_to_geo.csv and includes a header row.

Format (CSV header):

    filename,latitude,longitude
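As a rough illustration of how rows in this format could be turned into typed records, here is a minimal sketch. The names `GeoRecord` and `parseGeoCsv` are hypothetical and not taken from the repository, and the sketch assumes fields never contain quoted commas.

```typescript
// One row of the dataset, matching the header filename,latitude,longitude.
interface GeoRecord {
  filename: string;
  latitude: number;
  longitude: number;
}

// Parse CSV text with a header row into typed records.
// Assumption: fields contain no quoted commas, so a plain split is enough.
function parseGeoCsv(text: string): GeoRecord[] {
  const lines = text.split(/\r?\n/).filter((line) => line.trim() !== "");
  const records: GeoRecord[] = [];
  // Start at 1 to skip the header row.
  for (let i = 1; i < lines.length; i++) {
    const fields = lines[i].split(",");
    if (fields.length !== 3 || fields[1].trim() === "" || fields[2].trim() === "") {
      continue; // skip malformed rows
    }
    const latitude = Number(fields[1]);
    const longitude = Number(fields[2]);
    if (isNaN(latitude) || isNaN(longitude)) {
      continue; // skip rows with non-numeric coordinates
    }
    records.push({ filename: fields[0], latitude, longitude });
  }
  return records;
}
```

Skipping malformed rows instead of throwing is a design choice made for this sketch only; a benchmark loader may prefer to fail fast so that record counts stay comparable between runs.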

We also provide three other datasets that are generated at runtime:

  • Random
    • Points are placed entirely at random by an RNG.
  • Grid
  • Cluster
    • Every 50 points are placed together as a cluster, each with a small offset, and the clusters themselves are distributed randomly.
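
The Random and Cluster descriptions above can be sketched as follows. This is not the repository's generator: the function names, the uniform-in-degrees sampling, and the default offset of 0.01 degrees are all assumptions made for illustration. The Grid dataset is left out because its layout is not described here.

```typescript
interface Point {
  latitude: number;
  longitude: number;
}

// Source of uniform numbers in [0, 1); Math.random by default.
type Rng = () => number;

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// Assumption: uniform in degrees, which is not uniform over the sphere's surface.
function randomPoint(rng: Rng): Point {
  return { latitude: rng() * 180 - 90, longitude: rng() * 360 - 180 };
}

// "Random" dataset: every point is drawn independently.
function randomDataset(count: number, rng: Rng = Math.random): Point[] {
  const points: Point[] = [];
  for (let i = 0; i < count; i++) {
    points.push(randomPoint(rng));
  }
  return points;
}

// "Cluster" dataset: each run of clusterSize points shares one random center,
// and every point is nudged by at most maxOffsetDeg degrees on each axis.
function clusterDataset(
  count: number,
  clusterSize: number = 50,
  maxOffsetDeg: number = 0.01, // hypothetical offset size
  rng: Rng = Math.random
): Point[] {
  const points: Point[] = [];
  let center: Point = { latitude: 0, longitude: 0 };
  for (let i = 0; i < count; i++) {
    if (i % clusterSize === 0) {
      center = randomPoint(rng); // start a new cluster
    }
    points.push({
      latitude: clamp(center.latitude + (rng() * 2 - 1) * maxOffsetDeg, -90, 90),
      longitude: clamp(center.longitude + (rng() * 2 - 1) * maxOffsetDeg, -180, 180),
    });
  }
  return points;
}
```

Passing the RNG as a parameter makes it possible to plug in a seeded generator, so repeated runs can use the same points.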

Test queries

  • Data must be loaded into the database before the queries run.
    • Warming up the index is allowed.
    • Warming up queries is not allowed.
  • Storage cost will be calculated.
    • For in-memory databases, both memory and persistent storage costs will be calculated.
  • In theory, all databases should return the same results.

The basic test requires all queries to run one by one in a single process/thread.
The advanced test allows queries to run in parallel and allows optimizations specific to the test host.

Query A: Find nearest location

Pick a random location and find the closest location in the dataset.
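
A brute-force reference for this query can be useful for checking what a database returns. The sketch below is not the benchmark's implementation. It assumes a spherical Earth with a mean radius of 6,371,008.8 m and uses the haversine formula; the names `haversineMeters` and `findNearest` are hypothetical.

```typescript
interface Point {
  latitude: number;
  longitude: number;
}

// Mean Earth radius in meters; treating the Earth as a sphere is an assumption.
const EARTH_RADIUS_M = 6371008.8;

function toRadians(degrees: number): number {
  return (degrees * Math.PI) / 180;
}

// Great-circle distance between two points using the haversine formula.
function haversineMeters(a: Point, b: Point): number {
  const sinLat = Math.sin(toRadians(b.latitude - a.latitude) / 2);
  const sinLon = Math.sin(toRadians(b.longitude - a.longitude) / 2);
  const h =
    sinLat * sinLat +
    Math.cos(toRadians(a.latitude)) * Math.cos(toRadians(b.latitude)) * sinLon * sinLon;
  // Clamp to 1 to guard against rounding error before taking the arcsine.
  return 2 * EARTH_RADIUS_M * Math.asin(Math.min(1, Math.sqrt(h)));
}

// Linear scan over every point: O(n) per query, with no index.
function findNearest(points: Point[], target: Point): Point | undefined {
  let best: Point | undefined;
  let bestDistance = Infinity;
  for (const candidate of points) {
    const d = haversineMeters(candidate, target);
    if (d < bestDistance) {
      bestDistance = d;
      best = candidate;
    }
  }
  return best;
}
```

Databases may use a different Earth model or distance function, so results near ties can differ slightly from this reference.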

Query B: Find locations within a radius

Pick a random location from the dataset and find all locations within a certain distance of it.
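
For comparison, a radius query can be written as a plain filter over the whole dataset. This is a minimal sketch under the same spherical-Earth assumption (mean radius 6,371,008.8 m, haversine distance); `withinRadius` is a hypothetical name, and the distance helper is included so the block runs on its own. The sketch keeps the picked location itself in the result (distance 0); the text does not say whether the benchmark excludes it.

```typescript
interface Point {
  latitude: number;
  longitude: number;
}

// Mean Earth radius in meters; treating the Earth as a sphere is an assumption.
const EARTH_RADIUS_M = 6371008.8;

// Great-circle distance between two points using the haversine formula.
function haversineMeters(a: Point, b: Point): number {
  const rad = Math.PI / 180;
  const sinLat = Math.sin(((b.latitude - a.latitude) * rad) / 2);
  const sinLon = Math.sin(((b.longitude - a.longitude) * rad) / 2);
  const h =
    sinLat * sinLat +
    Math.cos(a.latitude * rad) * Math.cos(b.latitude * rad) * sinLon * sinLon;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.min(1, Math.sqrt(h)));
}

// Keep every point whose distance from center is at most radiusMeters.
function withinRadius(points: Point[], center: Point, radiusMeters: number): Point[] {
  return points.filter((p) => haversineMeters(center, p) <= radiusMeters);
}
```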

Query C: Find locations within a radius and order them

Pick a random location from the dataset, find all locations within a certain distance of it, and order them by distance.
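
The ordered variant differs only in the final sort. The sketch below computes each distance once, filters, and then sorts ascending. It is generic over the item type and takes the distance function as a parameter, so any great-circle distance can be plugged in. `Ranked` and `withinRadiusOrdered` are hypothetical names, not part of the repository.

```typescript
// A candidate paired with its precomputed distance from the query center.
interface Ranked<T> {
  item: T;
  distance: number;
}

// Return all items within radius of center, nearest first.
function withinRadiusOrdered<T>(
  items: T[],
  center: T,
  radius: number,
  distance: (a: T, b: T) => number
): Ranked<T>[] {
  const ranked: Ranked<T>[] = [];
  for (const item of items) {
    const d = distance(center, item); // computed once per item
    if (d <= radius) {
      ranked.push({ item, distance: d });
    }
  }
  // Ascending by distance; the order of exact ties depends on the engine's sort.
  ranked.sort((x, y) => x.distance - y.distance);
  return ranked;
}
```

Storing the distance alongside each item avoids recomputing it inside the comparator, which would otherwise run on the order of n log n times.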

License and Citation

This project is licensed under the terms of the CC BY-SA 4.0 license.

To cite this report, check CITATION.cff.

Owner

  • Name: 桂运宝
  • Login: guiyunbao
  • Kind: organization
  • Location: China

广西桂运宝科技有限公司 (Guangxi GuiYunBao Tech Inc.)

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this result, please cite it as below."
authors:
- family-names: "Shen"
  given-names: "LiangXiang"
  url: "https://github.com/kj415j45"
  email: "kj415j45@gmail.com"
  affiliation: "Guangxi GuiYunBao Tech Inc."
- family-names: "Zhou"
  given-names: "Feng"
  url: "https://github.com/guimasharing"
  affiliation: "Guangxi GuiYunBao Tech Inc."
title: "Benchmark geospatial query performance on NoSQL(-like) databases"
version: 1.0.0
url: "https://github.com/guiyunbao/geospatial-benchmark"
