sparsestack
Memory efficient stack of multiple 2D sparse arrays.
Repository
Basic Info
- Host: GitHub
- Owner: matchms
- License: mit
- Language: Python
- Default Branch: main
- Size: 257 KB
Statistics
- Stars: 4
- Watchers: 3
- Forks: 5
- Open Issues: 8
- Releases: 13
Metadata Files
README.md

Memory efficient stack of multiple 2D sparse arrays.

Installation
Requirements
Python 3.10 or higher
Pip Install
Simply install using pip: pip install sparsestack
First code example
```python
import numpy as np
from sparsestack import StackedSparseArray

# Create some fake data
scores1 = np.random.random((12, 10))
scores1[scores1 < 0.9] = 0  # make "sparse"
scores2 = np.random.random((12, 10))
scores2[scores2 < 0.75] = 0  # make "sparse"
sparsestack = StackedSparseArray(12, 10)
sparsestack.add_dense_matrix(scores1, "scores_1")

# Add second scores and filter
sparsestack.add_dense_matrix(scores2, "scores_2", join_type="left")

# Scores can be accessed using (limited) slicing capabilities
sparsestack[3, 4]  # => scores_1 and scores_2 at position row=3, col=4
sparsestack[3, :]  # => tuple with row, col, scores for all entries in row=3
sparsestack[:, 2]  # => tuple with row, col, scores for all entries in col=2
sparsestack[3, :, 0]  # => tuple with row, col, scores_1 for all entries in row=3
sparsestack[3, :, "scores_1"]  # => same as the one before

# Scores can also be converted to a dense numpy array:
scores2_after_merge = sparsestack.to_array("scores_2")
```
Adding data to a sparsestack-array
Sparsestack provides three options to add data to a new layer.
1) .add_dense_matrix(input_array)
Adds all non-zero elements of input_array to the sparsestack. Depending on the chosen join_type, either all such values are added (join_type="outer" or join_type="right"), or only those at positions already present in the underlying layers (join_type="left" or join_type="inner").
2) .add_sparse_matrix(input_coo_matrix)
This method expects a COO-style matrix (e.g. from scipy) that has the attributes .row, .col and .data. The join type can again be specified using join_type.
3) .add_sparse_data(row, col, data)
This does essentially the same as .add_sparse_matrix(input_coo_matrix), but can be more flexible in some cases because row, col and data are separate input arguments.
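To make the relationship between the three options more concrete, here is a minimal sketch that uses only NumPy and plain Python, not sparsestack itself. It extracts the (row, col, data) triplets of the non-zero entries from a dense array, which is the information a COO-style input carries, and then shows which positions would be kept under each join type. The helper `kept_positions` is hypothetical and assumes standard relational-join semantics; it is not the library's implementation, and how sparsestack treats each join_type in detail should be checked against its own documentation.

```python
import numpy as np

# A small dense layer in which most entries are zero.
dense = np.array([
    [0.0, 0.9, 0.0],
    [0.0, 0.0, 0.8],
    [0.7, 0.0, 0.0],
])

# COO-style triplets: coordinates and values of the non-zero entries.
row, col = np.nonzero(dense)
data = dense[row, col]


def kept_positions(existing, new, join_type="left"):
    """Return the (row, col) positions kept after a join.

    Hypothetical helper assuming standard relational-join semantics;
    it does not call or reproduce sparsestack internals.
    """
    existing, new = set(existing), set(new)
    if join_type == "left":
        return existing
    if join_type == "right":
        return new
    if join_type == "inner":
        return existing & new
    if join_type == "outer":
        return existing | new
    raise ValueError(f"Unknown join_type: {join_type!r}")


# Positions already present in the stack (made up for illustration).
existing = {(0, 1), (2, 2)}
new = set(zip(row.tolist(), col.tolist()))

for join_type in ("left", "right", "inner", "outer"):
    print(join_type, sorted(kept_positions(existing, new, join_type)))
```

Under these assumptions, "left" and "inner" never introduce positions that are missing from the existing layers, while "right" and "outer" accept every non-zero entry of the new layer.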
Accessing data from sparsestack-array
The collected sparse data can be accessed in multiple ways.
1) Slicing.
sparsestack allows multiple types of slicing (see also code example above).
```python
sparsestack[3, 4]  # => tuple with all scores at position row=3, col=4
sparsestack[3, :]  # => tuple with row, col, scores for all entries in row=3
sparsestack[:, 2]  # => tuple with row, col, scores for all entries in col=2
sparsestack[3, :, 0]  # => tuple with row, col, scores_1 for all entries in row=3
sparsestack[3, :, "scores_1"]  # => same as the one before
```
2) .to_array()
Creates and returns a dense numpy array of size .shape. Can also be used to create a dense numpy array of only a single layer when used like .to_array(name="layerX").
Careful: converting to a dense array discards the sparse structure, and all empty positions in the stack are filled with zeros.
3) .to_coo(name="layerX")
Returns a scipy sparse COO-matrix of the specified layer.
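The memory cost of a dense conversion can be illustrated without the library. The following is only a NumPy sketch under stated assumptions, not sparsestack code: the function name `triplets_to_dense` is hypothetical, and it assumes that each (row, col) pair occurs at most once. It rebuilds a dense array from COO-style triplets and compares the two memory footprints.

```python
import numpy as np


def triplets_to_dense(row, col, data, shape):
    """Fill a zero array of the given shape with the sparse entries.

    Hypothetical helper; assumes each (row, col) pair occurs at most once.
    """
    dense = np.zeros(shape, dtype=data.dtype)
    dense[row, col] = data
    return dense


# Three stored entries in a 1000 x 1000 layer.
row = np.array([0, 5, 999], dtype=np.int64)
col = np.array([3, 7, 999], dtype=np.int64)
data = np.array([0.9, 0.8, 0.7], dtype=np.float64)

dense = triplets_to_dense(row, col, data, shape=(1000, 1000))

# Bytes needed for the triplets versus the dense array.
sparse_bytes = row.nbytes + col.nbytes + data.nbytes
dense_bytes = dense.nbytes
print(f"triplets: {sparse_bytes} bytes, dense: {dense_bytes} bytes")
```

For very sparse layers the dense form can be several orders of magnitude larger, which is why a COO representation is usually preferable when only the stored entries are needed.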
Owner
- Name: matchms
- Login: matchms
- Kind: organization
- Repositories: 4
- Profile: https://github.com/matchms
Citation (CITATION.cff)
# YAML 1.2
---
abstract: "Memory efficient stack of multiple 2D sparse arrays."
authors:
  -
    affiliation: "Centre for Digitalisation and Digitality, University of Applied Sciences Düsseldorf"
    family-names: Huber
    given-names: Florian
    orcid: https://orcid.org/0000-0002-3535-9406
cff-version: 1.2.0
license: "MIT Licence"
message: "If you use this software, please cite it using these metadata."
repository-code: "https://github.com/florian-huber/sparsestack"
title: sparsestack
GitHub Events
Total
- Create event: 7
- Release event: 7
- Issues event: 6
- Watch event: 1
- Delete event: 2
- Issue comment event: 5
- Member event: 1
- Push event: 15
- Pull request review event: 1
- Pull request event: 28
Last Year
- Create event: 7
- Release event: 7
- Issues event: 6
- Watch event: 1
- Delete event: 2
- Issue comment event: 5
- Member event: 1
- Push event: 15
- Pull request review event: 1
- Pull request event: 28
Dependencies
- actions/checkout v2 composite
- actions/setup-python v1 composite
- conda-incubator/setup-miniconda v2 composite
- actions/checkout v2 composite
- actions/setup-python v1 composite
- pypa/gh-action-pypi-publish master composite
- decorator ^5.1.1 develop
- isort ^5.13.2 develop
- poetry-bumpversion ^0.3.2 develop
- prospector ^1.12.1 develop
- pytest ^8.3.3 develop
- pytest-cov ^6.0.0 develop
- testfixtures ^8.3.0 develop
- yapf ^0.40.2 develop
- numba ^0.60.0
- numpy >1.24
- python >=3.10,<3.13
- scipy ^1.14.1