https://github.com/fandreuz/parallel-mapped-distance-matrix

Parallel mapped distance matrix with NumPy and Numba


Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (6.1%) to scientific vocabulary

Keywords

hacktoberfest hpc numba numpy
Last synced: 6 months ago

Repository

Parallel mapped distance matrix with NumPy and Numba

Basic Info
  • Host: GitHub
  • Owner: fandreuz
  • License: mit
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 43.9 KB
Statistics
  • Stars: 3
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 3 years ago · Last pushed about 1 year ago
Metadata Files
Readme License

README.md

Parallel MDM

Mapped distance matrix

The Mapped Distance Matrix (MDM) of two sets $\mathcal{X}, \mathcal{Y}$ of $n$-dimensional points is, given a mapping $f$, the matrix defined entrywise as follows:

$$\mathbf{M}(\mathcal{X}, \mathcal{Y}, f)_{i,j} := f(\Vert \mathcal{X}_i - \mathcal{Y}_j\Vert)$$

where $\Vert \cdot \Vert$ is an appropriate distance notion on the space of definition of $\mathcal{X}$ and $\mathcal{Y}$.
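As a minimal sketch of this definition (not the repository's implementation), the dense matrix can be built with NumPy broadcasting. The function name `mapped_distance_matrix`, the Euclidean norm, and the choice $f(d) = e^{-d}$ are assumptions made for illustration only.

```python
import numpy as np

def mapped_distance_matrix(X, Y, f):
    """Dense MDM: M[i, j] = f(||X[i] - Y[j]||), Euclidean norm assumed."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Pairwise differences, shape (len(X), len(Y), n)
    diff = X[:, None, :] - Y[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # f must accept an array and act elementwise
    return f(dist)

# Illustrative data: two points in X, three points in Y, all in 2D
X = np.array([[0.0, 0.0], [3.0, 4.0]])
Y = np.array([[0.0, 0.0], [0.0, 4.0], [3.0, 0.0]])
M = mapped_distance_matrix(X, Y, lambda d: np.exp(-d))
```

The intermediate array of differences has `len(X) * len(Y) * n` entries, so this direct form is only practical for small inputs.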

The definition can be extended by weighting each contribution with a matrix of weights $\mathbf{W}$; the updated definition is then:

$$\mathbf{M}(\mathcal{X}, \mathcal{Y}, f)_{i,j} := \mathbf{W}_{i,j} f(\Vert \mathcal{X}_{i} - \mathcal{Y}_{j}\Vert)$$

A particularly popular form of the problem (which is also what we treat in this repository) occurs when weights are defined individually for the members of $\mathcal{Y}$, i.e. each column of $\mathbf{W}$ is constant, so that $\mathbf{W}_{i,j} = \mathbf{W}_{j}$ for every $i$:

$$\mathbf{M}(\mathcal{X}, \mathcal{Y}, f)_{i,j} := \mathbf{W}_{j} f(\Vert \mathcal{X}_{i} - \mathcal{Y}_{j}\Vert)$$
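With one weight per member of $\mathcal{Y}$, the weights reduce to a vector that scales the columns of the unweighted matrix. The sketch below assumes the Euclidean norm and uses a hypothetical helper `weighted_mdm` with $f(d) = 1/(1+d)$; none of these names come from the repository.

```python
import numpy as np

def weighted_mdm(X, Y, w, f):
    """Weighted MDM: M[i, j] = w[j] * f(||X[i] - Y[j]||)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    w = np.asarray(w, dtype=float)
    dist = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)
    # Broadcasting w over rows scales column j by w[j]
    return f(dist) * w[None, :]

X = np.array([[0.0, 0.0], [3.0, 4.0]])
Y = np.array([[0.0, 0.0], [0.0, 4.0], [3.0, 0.0]])
w = np.array([1.0, 2.0, 0.5])  # one weight per point of Y
M_w = weighted_mdm(X, Y, w, lambda d: 1.0 / (1.0 + d))
```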

A notable case: uniform grid

In general $\mathcal{X}, \mathcal{Y}$ are two arbitrary sets of points. Some applications allow further assumptions on the two sets. For instance, $\mathcal{X}$ might be taken to be a uniform grid. In this case a few interesting optimizations become available for the computation of the matrix.
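The README does not say which optimizations are meant. As one illustration of what a uniform grid makes possible (an assumption, not necessarily what this repository does): if $f$ vanishes beyond some radius, the grid indices affected by each point of $\mathcal{Y}$ follow from simple arithmetic, with no search over $\mathcal{X}$. The sketch is one-dimensional, and the function `grid_row_sums_1d`, the compact support of $f$, and the parameter `radius` are all hypothetical.

```python
import numpy as np

def grid_row_sums_1d(x0, h, n, Y, w, f, radius):
    """Row sums of the weighted MDM on the grid x_i = x0 + i*h, i = 0..n-1.

    Assumes f(d) == 0 whenever d > radius, so each y only touches
    the grid points inside [y - radius, y + radius].
    """
    out = np.zeros(n)
    for y, wj in zip(Y, w):
        # Index window obtained by arithmetic on the grid spacing
        lo = max(int(np.ceil((y - radius - x0) / h)), 0)
        hi = min(int(np.floor((y + radius - x0) / h)), n - 1)
        if lo > hi:
            continue  # y is too far from the grid to contribute
        idx = np.arange(lo, hi + 1)
        out[idx] += wj * f(np.abs(x0 + idx * h - y))
    return out

radius = 0.1
def f(d):
    # Compactly supported mapping, continuous at d == radius
    return np.where(d <= radius, (1.0 - d / radius) ** 2, 0.0)

rng = np.random.default_rng(0)
Y = rng.random(200)
w = rng.random(200)
x0, h, n = 0.0, 0.01, 101
sums = grid_row_sums_1d(x0, h, n, Y, w, f, radius)

# Brute-force reference over every (grid point, y) pair
grid = x0 + np.arange(n) * h
reference = (f(np.abs(grid[:, None] - Y[None, :])) * w[None, :]).sum(axis=1)
```

Each point of $\mathcal{Y}$ then costs work proportional to `radius / h` rather than to the full grid size.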

More assumptions

Practical applications usually involve huge sets of points, and storing the full $|\mathcal{X}| \times |\mathcal{Y}|$ matrix $\mathbf{M}$ can exhaust the memory of commonly used devices. This is why it is preferable to compute the vector $\tilde{\mathbf{M}}$ defined below instead of $\mathbf{M}$:

$$\tilde{\mathbf{M}}_{i} := \sum_{j} \mathbf{M}_{i,j}$$

For most use cases this is enough.
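A minimal sketch of how $\tilde{\mathbf{M}}$ can be accumulated without ever holding $\mathbf{M}$ in memory: process $\mathcal{Y}$ in chunks and add each chunk's weighted contribution to the running row sums. The function `mdm_row_sums`, the parameter `chunk_size`, and the Gaussian-like $f$ are illustrative assumptions and are not taken from the repository.

```python
import numpy as np

def mdm_row_sums(X, Y, w, f, chunk_size=1024):
    """Compute M~[i] = sum_j w[j] * f(||X[i] - Y[j]||) chunk by chunk."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    w = np.asarray(w, dtype=float)
    out = np.zeros(len(X))
    for start in range(0, len(Y), chunk_size):
        Yc = Y[start:start + chunk_size]
        wc = w[start:start + chunk_size]
        # Only a (len(X), len(Yc)) block of distances exists at a time
        dist = np.linalg.norm(X[:, None, :] - Yc[None, :, :], axis=-1)
        out += f(dist) @ wc
    return out

rng = np.random.default_rng(0)
X = rng.random((50, 3))
Y = rng.random((200, 3))
w = rng.random(200)
f = lambda d: np.exp(-d ** 2)
M_tilde = mdm_row_sums(X, Y, w, f, chunk_size=64)
```

Peak memory is governed by `len(X) * chunk_size` instead of `len(X) * len(Y)`, at the cost of a Python-level loop over chunks.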

Roadmap

  • Algorithms
    • [x] Uniform grid algorithm
    • [x] Scattered points algorithm
    • [ ] Fourier-transform-based algorithm
  • [ ] Backends
    • [ ] NumPy/Numba
    • [ ] PyTorch
    • [ ] JAX(?)
  • [ ] Parallelization
    • [x] Multithreading/Multiprocessing
    • [ ] GPU w/ PyTorch
    • [ ] GPU w/ JAX
    • [ ] CUDA kernels(?)
  • [ ] Tests
  • [ ] Documentation
  • [ ] Benchmark (+comparison with competitors)
    • [ ] CPU
    • [ ] GPU
    • [ ] Several different bin sizes
    • [ ] pts_per_future
  • [ ] Future
    • [ ] Periodicity
    • [ ] More general distance definitions

Owner

  • Name: Francesco Andreuzzi
  • Login: fandreuz
  • Kind: user
  • Location: Geneva, Switzerland
  • Company: CERN

CSE MSc student | SWE @cern

GitHub Events

Total
  • Push event: 1
Last Year
  • Push event: 1

Issues and Pull Requests

Last synced: 9 months ago

All Time
  • Total issues: 0
  • Total pull requests: 4
  • Average time to close issues: N/A
  • Average time to close pull requests: 14 minutes
  • Total issue authors: 0
  • Total pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 4
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • fandreuz (4)
Top Labels
Issue Labels
Pull Request Labels