https://github.com/fandreuz/parallel-mapped-distance-matrix

Parallel mapped distance matrix with NumPy and Numba

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: found codemeta.json file
○ .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: low similarity (6.1%) to scientific vocabulary

Keywords

hacktoberfest hpc numba numpy

Last synced: 6 months ago

Repository

Parallel mapped distance matrix with NumPy and Numba

Basic Info

Host: GitHub
Owner: fandreuz
License: mit
Language: Python
Default Branch: master
Homepage:
Size: 43.9 KB

Statistics

Stars: 3
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Topics

hacktoberfest hpc numba numpy

Created over 3 years ago · Last pushed about 1 year ago

Metadata Files

Readme License

Parallel MDM

Mapped distance matrix

Given a mapping $f$, the Mapped Distance Matrix (MDM) of two sets $\mathcal{X}, \mathcal{Y}$ of $n$-dimensional points is the matrix defined by:

$$\mathbf{M}(\mathcal{X}, \mathcal{Y}, f)_{i,j} := f(\Vert \mathcal{X}_i - \mathcal{Y}_j\Vert)$$

where $\Vert \cdot \Vert$ is a suitable norm on the space containing $\mathcal{X}$ and $\mathcal{Y}$.

The problem can be extended by weighting each contribution with a matrix of weights $\mathbf{W}$; the definition then becomes:

$$\mathbf{M}(\mathcal{X}, \mathcal{Y}, f)_{i,j} := \mathbf{W}_{i,j} f(\Vert \mathcal{X}_{i} - \mathcal{Y}_{j}\Vert)$$

A particularly popular form of the problem (and the one treated in this repository) occurs when the weights are defined individually for the members of $\mathcal{Y}$, i.e. each column of $\mathbf{W}$ is constant:

$$\mathbf{M}(\mathcal{X}, \mathcal{Y}, f)_{i,j} := \mathbf{W}_{j} f(\Vert \mathcal{X}_{i} - \mathcal{Y}_{j}\Vert)$$
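
As an illustration of this last definition, here is a minimal NumPy sketch that builds the dense matrix using the Euclidean norm. The function name `mapped_distance_matrix` and its signature are hypothetical and are not taken from this repository's API; the Gaussian mapping is only an example choice of $f$.

```python
import numpy as np

def mapped_distance_matrix(X, Y, weights, f):
    """Dense M[i, j] = weights[j] * f(||X[i] - Y[j]||), Euclidean norm (assumed)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Pairwise differences via broadcasting: shape (len(X), len(Y), n_dims)
    diff = X[:, None, :] - Y[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # weights has shape (len(Y),), so it scales each column j by weights[j]
    return np.asarray(weights, dtype=float)[None, :] * f(dist)

X = np.array([[0.0, 0.0], [1.0, 0.0]])
Y = np.array([[0.0, 0.0], [0.0, 1.0], [3.0, 4.0]])
w = np.array([1.0, 2.0, 0.5])
M = mapped_distance_matrix(X, Y, w, lambda d: np.exp(-d**2))
```

The intermediate array `diff` has `len(X) * len(Y) * n_dims` entries, so this direct approach is only practical for small point sets.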

A notable case: uniform grid

In general, $\mathcal{X}$ and $\mathcal{Y}$ are arbitrary sets of points. Some applications allow stronger assumptions on the two sets. For instance, $\mathcal{X}$ might be a uniform grid. In this case a few interesting optimizations become available for the computation of the matrix.
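
One optimization that a uniform grid makes possible can be sketched under an assumption that is not stated in this README: $f$ vanishes beyond some cutoff radius $r$. In that case the grid points affected by a given $\mathcal{Y}_j$ can be located by index arithmetic instead of a search over all of $\mathcal{X}$. The one-dimensional helper below is hypothetical (its name and parameters are illustrative only) and is not necessarily how this repository implements its uniform grid algorithm.

```python
import numpy as np

def grid_indices_within(x0, h, n, y, r):
    """Indices i of grid points x_i = x0 + i*h (0 <= i < n) with |x_i - y| <= r."""
    # Solve x0 + i*h >= y - r and x0 + i*h <= y + r for the integer i
    lo = int(np.ceil((y - r - x0) / h))
    hi = int(np.floor((y + r - x0) / h))
    # Clip to the valid index range of the grid
    lo = max(lo, 0)
    hi = min(hi, n - 1)
    return np.arange(lo, hi + 1)  # empty array when no grid point is in range

# Grid 0.0, 0.1, ..., 1.0 and a point at 0.42 with cutoff radius 0.15
idx = grid_indices_within(x0=0.0, h=0.1, n=11, y=0.42, r=0.15)

# Brute-force check over the whole grid, for comparison
grid = 0.0 + 0.1 * np.arange(11)
brute = np.flatnonzero(np.abs(grid - 0.42) <= 0.15)
```

The index computation costs a constant number of operations per point, whereas the brute-force check scans every grid point.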

More assumptions

Practical applications usually involve very large sets of points, so the full matrix, which has $|\mathcal{X}| \times |\mathcal{Y}|$ entries, often does not fit in memory on commonly used devices. It is therefore preferable to compute the vector $\tilde{\mathbf{M}}$ defined below instead of $\mathbf{M}$:

$$\tilde{\mathbf{M}}_{i} := \sum_{j} \mathbf{M}_{i,j}$$

For most use cases this is enough.
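
A minimal sketch of how $\tilde{\mathbf{M}}$ can be accumulated without ever storing the full matrix: the points of $\mathcal{Y}$ are processed in chunks, and each chunk's contribution is summed into the output vector. The function name, the `chunk_size` parameter, and the Euclidean norm are assumptions made for illustration and are not taken from this repository.

```python
import numpy as np

def mapped_distance_row_sums(X, Y, weights, f, chunk_size=1024):
    """Return m[i] = sum_j weights[j] * f(||X[i] - Y[j]||), processing Y in chunks."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    weights = np.asarray(weights, dtype=float)
    out = np.zeros(len(X))
    for start in range(0, len(Y), chunk_size):
        Yc = Y[start:start + chunk_size]
        wc = weights[start:start + chunk_size]
        # Only a (len(X), len(Yc)) block of distances exists at any time
        dist = np.linalg.norm(X[:, None, :] - Yc[None, :, :], axis=-1)
        out += (wc[None, :] * f(dist)).sum(axis=1)
    return out

rng = np.random.default_rng(0)
X = rng.random((50, 2))
Y = rng.random((200, 2))
w = rng.random(200)
f = lambda d: np.exp(-d**2)
m_tilde = mapped_distance_row_sums(X, Y, w, f, chunk_size=64)
```

Peak memory is governed by `len(X) * chunk_size * n_dims` rather than by the size of the full matrix, at the cost of a Python-level loop over the chunks.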

Roadmap

- Algorithms
  - [x] Uniform grid algorithm
  - [x] Scattered points algorithm
  - [ ] Fourier-transform-based algorithm
- [ ] Backends
  - [ ] NumPy/Numba
  - [ ] PyTorch
  - [ ] JAX(?)
- [ ] Parallelization
  - [x] Multithreading/Multiprocessing
  - [ ] GPU w/ PyTorch
  - [ ] GPU w/ JAX
  - [ ] CUDA kernels(?)
- [ ] Tests
- [ ] Documentation
- [ ] Benchmark (+ comparison with competitors)
  - [ ] CPU
  - [ ] GPU
  - [ ] Several different bin sizes
  - [ ] pts_per_future
- [ ] Future
  - [ ] Periodicity
  - [ ] More general distance definitions

Owner

Name: Francesco Andreuzzi
Login: fandreuz
Kind: user
Location: Geneva, Switzerland
Company: CERN

Repositories: 1
Profile: https://github.com/fandreuz

CSE MSc student | SWE @cern

GitHub Events

Total

Push event: 1

Last Year

Push event: 1

Issues and Pull Requests

Last synced: 9 months ago

All Time

Total issues: 0
Total pull requests: 4
Average time to close issues: N/A
Average time to close pull requests: 14 minutes
Total issue authors: 0
Total pull request authors: 1
Average comments per issue: 0
Average comments per pull request: 0.0
Merged pull requests: 4
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0
