sparsestack

Memory efficient stack of multiple 2D sparse arrays.

https://github.com/matchms/sparsestack

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (12.5%) to scientific vocabulary

Last synced: 10 months ago

Repository


Basic Info

Host: GitHub
Owner: matchms
License: mit
Language: Python
Default Branch: main
Size: 257 KB

Statistics

Stars: 4
Watchers: 3
Forks: 5
Open Issues: 8
Releases: 13

Created almost 4 years ago · Last pushed about 1 year ago

Metadata Files

Readme License Citation

README.md

Memory efficient stack of multiple 2D sparse arrays.

[Figure: sparsestack overview]

Installation

Requirements

Python 3.10 or higher (the pyproject.toml dependencies listed further down specify `>=3.10,<3.13`)

Pip Install

Install using pip: `pip install sparsestack`

First code example

```python
import numpy as np
from sparsestack import StackedSparseArray

# Create some fake data
scores1 = np.random.random((12, 10))
scores1[scores1 < 0.9] = 0  # make "sparse"
scores2 = np.random.random((12, 10))
scores2[scores2 < 0.75] = 0  # make "sparse"
sparsestack = StackedSparseArray(12, 10)
sparsestack.add_dense_matrix(scores1, "scores_1")

# Add second scores and filter
sparsestack.add_dense_matrix(scores2, "scores_2", join_type="left")

# Scores can be accessed using (limited) slicing capabilities
sparsestack[3, 4]  # => scores_1 and scores_2 at position row=3, col=4
sparsestack[3, :]  # => tuple with row, col, scores for all entries in row=3
sparsestack[:, 2]  # => tuple with row, col, scores for all entries in col=2
sparsestack[3, :, 0]  # => tuple with row, col, scores_1 for all entries in row=3
sparsestack[3, :, "scores_1"]  # => same as the one before

# Scores can also be converted to a dense numpy array:
scores2_after_merge = sparsestack.to_array("scores_2")
```

Adding data to a `sparsestack`-array

Sparsestack provides three options to add data to a new layer.

1) `.add_dense_matrix(input_array)`
   Adds all non-zero elements of `input_array` to the sparsestack. Depending on the chosen `join_type`, either all such values are added (`join_type="outer"` or `join_type="right"`), or only those already present in the underlying layers (`"left"` or `"inner"` join).
2) `.add_sparse_matrix(input_coo_matrix)`
   Expects a COO-style matrix (e.g. from scipy) with the attributes `.row`, `.col` and `.data`. The join type can again be specified using `join_type`.
3) `.add_sparse_data(row, col, data)`
   Does essentially the same as `.add_sparse_matrix(input_coo_matrix)`, but can be more flexible in some cases because `row`, `col` and `data` are separate input arguments.
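To make the effect of `join_type` more concrete, here is a minimal sketch in plain Python. It is not sparsestack's implementation: the helper names `join_positions` and `values_added` are hypothetical, and the mapping of the four join types onto set operations is an assumption based on conventional join semantics. It only illustrates which non-zero entries of a new layer would be added, given the positions already present in the stack.

```python
def join_positions(existing, incoming, join_type="left"):
    """Return the (row, col) positions kept in the stack after a join.

    Assumption: conventional join semantics, not taken from sparsestack's code.
    """
    if join_type == "left":
        return set(existing)        # keep only positions already in the stack
    if join_type == "inner":
        return existing & incoming  # keep positions present on both sides
    if join_type == "right":
        return set(incoming)        # keep the positions of the new layer
    if join_type == "outer":
        return existing | incoming  # keep every position from either side
    raise ValueError(f"Unknown join_type: {join_type!r}")


def values_added(existing, new_layer, join_type="left"):
    """Return the entries of new_layer that end up in the stack."""
    kept = join_positions(existing, set(new_layer), join_type)
    return {pos: val for pos, val in new_layer.items() if pos in kept}


# Positions already filled by earlier layers (hypothetical example data)
existing = {(0, 1), (2, 3)}
# Non-zero entries of the layer to be added, keyed by (row, col)
new_layer = {(0, 1): 0.9, (1, 1): 0.8}

for join_type in ("left", "inner", "right", "outer"):
    print(join_type, values_added(existing, new_layer, join_type))
```

In this sketch, `"left"` and `"inner"` only add the value at `(0, 1)`, because `(1, 1)` is not yet present in the stack, while `"right"` and `"outer"` add both values.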

Accessing data from `sparsestack`-array

The collected sparse data can be accessed in multiple ways.

1) Slicing
   `sparsestack` allows multiple types of slicing (see also the code example above).
   ```python
   sparsestack[3, 4]  # => tuple with all scores at position row=3, col=4
   sparsestack[3, :]  # => tuple with row, col, scores for all entries in row=3
   sparsestack[:, 2]  # => tuple with row, col, scores for all entries in col=2
   sparsestack[3, :, 0]  # => tuple with row, col, scores_1 for all entries in row=3
   sparsestack[3, :, "scores_1"]  # => same as the one before
   ```
2) `.to_array()`
   Creates and returns a dense numpy array of size `.shape`. Can also be used to create a dense numpy array of only a single layer when used like `.to_array(name="layerX")`.
   Careful: converting to a dense array loses the sparse nature, and all empty positions in the stack are filled with zeros.
3) `.to_coo(name="layerX")`
   Returns a scipy sparse COO matrix of the specified layer.
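The relation between the sparse (row, col, data) representation and the dense output can be sketched in plain Python. This is a conceptual illustration only, not sparsestack's code: the helpers `coo_to_dense` and `select_row` are hypothetical, and nested lists stand in for numpy arrays.

```python
def coo_to_dense(shape, row, col, data):
    """Build a dense nested list from COO-style triplets; empty positions become 0.0."""
    n_rows, n_cols = shape
    dense = [[0.0] * n_cols for _ in range(n_rows)]
    for r, c, v in zip(row, col, data):
        dense[r][c] = v
    return dense


def select_row(row, col, data, r):
    """Return (rows, cols, values) for all stored entries in row r."""
    picked = [(i, j, v) for i, j, v in zip(row, col, data) if i == r]
    rows = [i for i, _, _ in picked]
    cols = [j for _, j, _ in picked]
    vals = [v for _, _, v in picked]
    return rows, cols, vals


# Three stored entries of a single 2 x 4 layer (hypothetical example data)
row = [0, 1, 1]
col = [2, 0, 3]
data = [0.9, 0.8, 0.95]

dense = coo_to_dense((2, 4), row, col, data)
row1 = select_row(row, col, data, 1)
print(dense)
print(row1)
```

The dense result stores all 8 positions even though only 3 carry a score, which is why the dense conversion is best reserved for small stacks or single layers.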

Owner

Name: matchms
Login: matchms
Kind: organization

Repositories: 4
Profile: https://github.com/matchms

Citation (CITATION.cff)

# YAML 1.2
---
abstract: "Memory efficient stack of multiple 2D sparse arrays."
authors:
  -
    affiliation: "Centre for Digitalisation and Digitality, University of Applied Sciences Düsseldorf"
    family-names: Huber
    given-names: Florian
    orcid: https://orcid.org/0000-0002-3535-9406

cff-version: 1.2.0
license: "MIT Licence"
message: "If you use this software, please cite it using these metadata."
repository-code: "https://github.com/florian-huber/sparsestack"
title: sparsestack

GitHub Events

Total

Create event: 7
Release event: 7
Issues event: 6
Watch event: 1
Delete event: 2
Issue comment event: 5
Member event: 1
Push event: 15
Pull request review event: 1
Pull request event: 28

Last Year

Create event: 7
Release event: 7
Issues event: 6
Watch event: 1
Delete event: 2
Issue comment event: 5
Member event: 1
Push event: 15
Pull request review event: 1
Pull request event: 28

Dependencies

.github/workflows/CI_build.yml actions

actions/checkout v2 composite
actions/setup-python v1 composite
conda-incubator/setup-miniconda v2 composite

.github/workflows/CI_publish_pypi.yml actions

actions/checkout v2 composite
actions/setup-python v1 composite
pypa/gh-action-pypi-publish master composite

pyproject.toml pypi

decorator ^5.1.1 develop
isort ^5.13.2 develop
poetry-bumpversion ^0.3.2 develop
prospector ^1.12.1 develop
pytest ^8.3.3 develop
pytest-cov ^6.0.0 develop
testfixtures ^8.3.0 develop
yapf ^0.40.2 develop
numba ^0.60.0
numpy >1.24
python >=3.10,<3.13
scipy ^1.14.1
