zCurve

zCurve maps multidimensional data to one dimension while preserving locality of the data points.

https://github.com/resc2801/zCurve

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.1%) to scientific vocabulary
Last synced: 9 months ago · JSON representation

Repository

zCurve maps multidimensional data to one dimension while preserving locality of the data points.

Basic Info
  • Host: GitHub
  • Owner: resc2801
  • License: other
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 31.3 KB
Statistics
  • Stars: 9
  • Watchers: 2
  • Forks: 2
  • Open Issues: 1
  • Releases: 0
Created about 5 years ago · Last pushed about 5 years ago
Metadata Files
Readme License

README.md

zCurve

DOI

zCurve is a Python module with methods to efficiently map multidimensional data to a single dimension while preserving locality of the data points.

This mapping is commonly known as Z-order, Lebesgue curve, Morton space filling curve, Morton order or Morton code.

Image by David Eppstein, 2008

The Morton code of a multi-dimensional data point is calculated by bitwise interlacing the binary representations of its coordinate values.

zCurve provides two functions for handling the encoding and decoding of data points with arbitrary dimensionality and arbitrary coordinate size:

python interlace(*data_point: int, dims: int = None, bits_per_dim: int = None) -> int python deinterlace(code_point: int, dims: int = 3) -> List[int]

When handling large multi-dimensional dataset (n > 10.000), zCurve offers some simple but convenient means of parallelizing the Morton encoding and decoding:

python par_interlace(data_points: List[List[int]], dims: int = None, bits_per_dim: int = None) -> List[int] python par_deinterlace(code_points: List[int], dims: int = 3) -> List[List[int]]

Given the Morton codes of a multi-dimensional dataset, we can perform multi-dimensional range search using only a one-dimensional data structure. For range searching, zCurve offers two functions for calculating the necesaary LITMAX and BIGMIN values: python prev_morton(code_point: int, rmin_code: int, rmax_code: int, dims: int = 3) -> int python next_morton(code_point: int, rmin_code: int, rmax_code: int, dims: int = 3) -> int

This implementation is based on the following paper

Tropf, Herbert, and Helmut Herzog. "Multidimensional Range Search in Dynamically Balanced Trees." ANGEWANDTE INFO. 2 (1981): 71-77.

and it makes heavy use of the excellent gmpy2 module.

Installation

bash pip install zCurve

Usage

Basics

python import zCurve as z imports the module. python code = z.interlace(2,16,8) interlaces the 3D data point (2,16,8) into Morton code point 10248.

When explicitly specify dimensionality and bits per dimension of your data point python code = z.interlace(2,16,8, dims=3, bits_per_dim=5) performance will benefit substantially.

python z.deinterlace(4711) deinterlaces the Morton code point 4711 into the 3D data point (29,1,3).

Parallel interlacing/deinterlacing

Given a potentially large list of n-dimensional data_points ````python from random import randrange

bitsize = 16 maxval = 2bitsize - 1 nosamples = 106 datapoints = [(randrange(0, maxval), randrange(0, maxval), randrange(0, maxval)) for i in range(nosamples)] ``` we can speed up things by usingparinterlaceandpardeinterlace ``python mortoncodes = z.parinterlace(datapoints, dims=3, bitsperdim=16) datapoints == z.pardeinterlaces(morton_codes, dims=3) ````

Range searching

Image by Tropf and Herzog, 1981

When range searching, we can prune the search space by calculating BIGMIN (aka "GetNextZ-address") and LITMAX (aka "GetPrevZ-address") values.
```python point = z.interlace(6, 3, dims=2) # => 30 rmin = z.interlace(5, 3, dims=2) # => 27 rmax = z.interlace(10, 5, dims=2) # => 102

BIGMIN = z.nextmorton(point, rmin, rmax, dims=2) # => 31 LITMAX = z.prevmorton(point, rmin, rmax, dims=2) # => 27 In addition, we can easily check if a given Morton code point is within a specified range python z.inrange(58,27,102, dims=2) # => False z.inrange(49,27,102, dims=2) # => True ```

Citation

bibtex @misc{rmrschub_2021_zCurve, author = {René Schubotz}, title = {{zCurve: Multi-dimensional indexing using Morton space filling curves.}}, month = may, year = 2021, doi = {10.5281/zenodo.4777584}, version = {0.0.4}, publisher = {Zenodo}, url = {https://github.com/rmrschub/zCurve} }

License

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Owner

  • Login: resc2801
  • Kind: user
  • Location: Germany

GitHub Events

Total
  • Watch event: 1
  • Pull request event: 1
  • Fork event: 1
Last Year
  • Watch event: 1
  • Pull request event: 1
  • Fork event: 1

Dependencies

setup.py pypi
  • gmpy2 *
  • typing *