annsel

A user-friendly library that brings familiar DataFrame-style operations to AnnData objects with a simple and expressive API.

https://github.com/srivarra/annsel

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (5.1%) to scientific vocabulary

Keywords

anndata narwhals

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: srivarra
License: mit
Language: Python
Default Branch: main
Homepage: https://annsel.readthedocs.io/en/latest/
Size: 23.2 MB

Statistics

Stars: 10
Watchers: 1
Forks: 1
Open Issues: 5
Releases: 12

Topics

anndata narwhals

Created over 1 year ago · Last pushed 10 months ago

Metadata Files

Readme Changelog Contributing License Citation

annsel

Annsel is a user-friendly library that brings familiar dataframe-style operations to AnnData objects.

It's built on the narwhals compatibility layer for dataframes.

Take a look at the GitHub Projects board for features and future plans: Annsel Features

Getting started

Please refer to the documentation, in particular, the API documentation.

There's also a brief tutorial on how to use all the features of annsel: All of Annsel.

Installation

You need to have Python 3.10 or newer installed on your system. If you don't have Python installed, we recommend installing uv. There are several ways to install annsel:

Install the most recent release:

With uv:

```zsh
uv add annsel
```

With pip:

```zsh
pip install annsel
```

Install the latest development version:

With uv:

```zsh
uv add git+https://github.com/srivarra/annsel
```

With pip:

```zsh
pip install git+https://github.com/srivarra/annsel.git@main
```
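Before installing, it can help to confirm that the interpreter meets the Python 3.10 minimum stated above. The following is a small sketch using only the standard library; the helper name `meets_minimum` is made up for illustration and is not part of annsel.

```python
import sys

MINIMUM = (3, 10)  # minimum Python version stated in the installation notes


def meets_minimum(version_info=sys.version_info, minimum=MINIMUM) -> bool:
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


# Report a problem instead of raising, so the check is safe to run anywhere.
if not meets_minimum():
    found = sys.version.split()[0]
    print(f"Python {MINIMUM[0]}.{MINIMUM[1]}+ is required, found {found}")
```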

Examples

annsel comes with a small dataset from Cell X Gene to help you get familiar with the API.

```python
import annsel as an

adata = an.datasets.leukemic_bone_marrow_dataset()
```

The dataset looks like this:

```shell
AnnData object with n_obs × n_vars = 31586 × 458
    obs: 'Cluster_ID', 'donor_id', 'Sample_Tag', 'Cell_label', 'is_primary_data', 'organism_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'assay_ontology_term_id', 'tissue_ontology_term_id', 'Genotype', 'development_stage_ontology_term_id', 'sex_ontology_term_id', 'disease_ontology_term_id', 'cell_type_ontology_term_id', 'suspension_type', 'tissue_type', 'cell_type', 'assay', 'disease', 'organism', 'sex', 'tissue', 'self_reported_ethnicity', 'development_stage', 'observation_joinid'
    var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable', 'feature_is_filtered', 'Unnamed: 0', 'feature_name', 'feature_reference', 'feature_biotype', 'feature_length', 'feature_type'
    uns: 'cell_type_ontology_term_id_colors', 'citation', 'default_embedding', 'schema_reference', 'schema_version', 'title'
    obsm: 'X_bothumap', 'X_pca', 'X_projected', 'X_projectedmean', 'X_tsneni', 'X_umapni'
```

Filter

You can filter on obs, var, var_names, obs_names, X and its layers, as well as obsm and varm matrices, which are passed as key-value pairs of the attribute's key name and the predicate to filter on. Currently, the column names for obsm and varm matrices are numerical indices.

```python
adata.an.filter(
    obs=(
        an.col(["Cell_label"]).is_in(["Classical Monocytes", "CD8+CD103+ tissue resident memory T cells"]),
        an.col(["sex"]) == "male",
    ),
    var=an.col(["vst.mean"]) >= 3,
    obsm={"X_pca": an.col([0]) > 0},  # PC1 values greater than 0
    copy=False,  # Whether to return a copy of the AnnData object or just a view of it.
)
```

```shell
View of AnnData object with n_obs × n_vars = 736 × 67
    obs: 'Cluster_ID', 'donor_id', 'Sample_Tag', 'Cell_label', 'is_primary_data', 'organism_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'assay_ontology_term_id', 'tissue_ontology_term_id', 'Genotype', 'development_stage_ontology_term_id', 'sex_ontology_term_id', 'disease_ontology_term_id', 'cell_type_ontology_term_id', 'suspension_type', 'tissue_type', 'cell_type', 'assay', 'disease', 'organism', 'sex', 'tissue', 'self_reported_ethnicity', 'development_stage', 'observation_joinid'
    var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable', 'feature_is_filtered', 'Unnamed: 0', 'feature_name', 'feature_reference', 'feature_biotype', 'feature_length', 'feature_type'
    uns: 'cell_type_ontology_term_id_colors', 'citation', 'default_embedding', 'schema_reference', 'schema_version', 'title'
    obsm: 'X_bothumap', 'X_pca', 'X_projected', 'X_projectedmean', 'X_tsneni', 'X_umapni'
```
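To make the idea of per-axis predicates concrete, here is a library-free sketch of what filtering along obs and var amounts to. It is not annsel's implementation: the helpers `axis_mask` and `filter_matrix` are hypothetical, the data is a toy, and combining several predicates on one axis with a logical AND is an assumption.

```python
from typing import Callable

Row = dict[str, object]
Predicate = Callable[[Row], bool]


def axis_mask(table: list[Row], predicates: list[Predicate]) -> list[bool]:
    """True where every predicate holds for the row (assumed logical AND)."""
    return [all(p(row) for p in predicates) for row in table]


def filter_matrix(X, obs, var, obs_predicates, var_predicates):
    """Subset rows of X by obs predicates and columns of X by var predicates."""
    keep_rows = axis_mask(obs, obs_predicates)
    keep_cols = axis_mask(var, var_predicates)
    new_X = [
        [value for value, keep in zip(row, keep_cols) if keep]
        for row, keep_row in zip(X, keep_rows)
        if keep_row
    ]
    new_obs = [row for row, keep in zip(obs, keep_rows) if keep]
    new_var = [row for row, keep in zip(var, keep_cols) if keep]
    return new_X, new_obs, new_var


# Toy data: 3 observations (rows) by 2 variables (columns).
obs = [
    {"Cell_label": "Classical Monocytes", "sex": "male"},
    {"Cell_label": "Classical Monocytes", "sex": "female"},
    {"Cell_label": "Lymphomyeloid prog", "sex": "male"},
]
var = [{"vst.mean": 4.2}, {"vst.mean": 0.5}]
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

X_f, obs_f, var_f = filter_matrix(
    X,
    obs,
    var,
    obs_predicates=[
        lambda r: r["Cell_label"] in {"Classical Monocytes"},
        lambda r: r["sex"] == "male",
    ],
    var_predicates=[lambda r: r["vst.mean"] >= 3],
)
print(X_f)  # [[1.0]]
```

Only the first observation passes both obs predicates and only the first variable passes the var predicate, so a single value remains.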

Select

You can select on obs, var, var_names, obs_names, X and its layers. Selecting returns a new AnnData object. It's useful if you don't need all the columns in obs or var and just want to work with a few.

```python
adata.an.select(
    obs=an.col(["Cell_label"]),
    var=an.col(["vst.mean", "vst.variance"]),
)
```
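As a rough mental model, selecting obs or var columns is a column projection over a metadata table. The sketch below shows that projection on plain Python dictionaries. The function `select_columns` is hypothetical, and treating selection as narrowing columns while leaving the rows alone is an assumption rather than a description of annsel's internals.

```python
def select_columns(table, columns):
    """Keep only the named columns of each row, in the requested order."""
    missing = [c for c in columns if table and c not in table[0]]
    if missing:
        raise KeyError(f"unknown columns: {missing}")
    return [{c: row[c] for c in columns} for row in table]


# Toy obs table with three metadata columns.
obs = [
    {"Cell_label": "Classical Monocytes", "sex": "male", "donor_id": "d1"},
    {"Cell_label": "Lymphomyeloid prog", "sex": "female", "donor_id": "d2"},
]

slim_obs = select_columns(obs, ["Cell_label"])
print(slim_obs)
# [{'Cell_label': 'Classical Monocytes'}, {'Cell_label': 'Lymphomyeloid prog'}]
```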

Group By

You can group over obs and var columns. This returns a generator of objects, each containing the grouped data and the grouping parameters.

```python
gb_adata_result = adata.an.group_by(
    obs=an.col(["Cell_label"]),
    var=an.col(["feature_type"]),
    copy=False,
)
```

Here's what the first group looks like:

```python
next(adata.an.group_by(
    obs=an.col(["Cell_label"]),
    copy=False,
))
```

```shell
GroupByAnnData:
├── Observations:
│   └── Cell_label: Lymphomyeloid prog
├── Variables:
│   └── (all variables)
└── AnnData:
    View of AnnData object with n_obs × n_vars = 913 × 458
        obs: 'Cluster_ID', 'donor_id', 'Sample_Tag', 'Cell_label', 'is_primary_data', 'organism_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'assay_ontology_term_id', 'tissue_ontology_term_id', 'Genotype', 'development_stage_ontology_term_id', 'sex_ontology_term_id', 'disease_ontology_term_id', 'cell_type_ontology_term_id', 'suspension_type', 'tissue_type', 'cell_type', 'assay', 'disease', 'organism', 'sex', 'tissue', 'self_reported_ethnicity', 'development_stage', 'observation_joinid'
        var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable', 'feature_is_filtered', 'Unnamed: 0', 'feature_name', 'feature_reference', 'feature_biotype', 'feature_length', 'feature_type'
        uns: 'cell_type_ontology_term_id_colors', 'citation', 'default_embedding', 'schema_reference', 'schema_version', 'title'
        obsm: 'X_bothumap', 'X_pca', 'X_projected', 'X_projectedmean', 'X_tsneni', 'X_umapni'
```
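The generator behaviour described above can be sketched without annsel: groups are produced one at a time, each carrying the grouping value and the matching slice of the data. The function `group_rows` below is a hypothetical stand-in on toy data. Yielding groups in order of first appearance is a choice made for this sketch and may differ from the order annsel uses.

```python
from collections import defaultdict
from typing import Iterator


def group_rows(
    obs: list[dict], X: list[list[float]], key: str
) -> Iterator[tuple[object, list[list[float]]]]:
    """Yield (group value, rows of X belonging to that group), one group at a time."""
    # Nothing below runs until the first next() call on the generator.
    rows_by_group: dict[object, list[int]] = defaultdict(list)
    for i, row in enumerate(obs):
        rows_by_group[row[key]].append(i)
    for value, indices in rows_by_group.items():
        yield value, [X[i] for i in indices]


# Toy data: 3 observations by 2 variables.
obs = [
    {"Cell_label": "Lymphomyeloid prog"},
    {"Cell_label": "Classical Monocytes"},
    {"Cell_label": "Lymphomyeloid prog"},
]
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

groups = group_rows(obs, X, key="Cell_label")
first_label, first_rows = next(groups)
print(first_label, first_rows)  # Lymphomyeloid prog [[1.0, 2.0], [5.0, 6.0]]
```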

Pipe

There's also a small utility method, pipe, which lets you chain operations together as in Xarray and Pandas.

```python
import scanpy as sc

adata.an.pipe(sc.pl.embedding, basis="X_tsneni", color="Cell_label")
```
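The pipe pattern itself is small: `obj.pipe(func, *args, **kwargs)` calls `func(obj, *args, **kwargs)`, which is how Pandas documents `DataFrame.pipe`. The sketch below assumes annsel follows the same convention. The `Accessor` class and `scale` function are toy stand-ins, not annsel code.

```python
from typing import Any, Callable


class Accessor:
    """Toy stand-in for an accessor such as `adata.an` (hypothetical)."""

    def __init__(self, obj: Any) -> None:
        self._obj = obj

    def pipe(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        # Call func with the wrapped object as the first positional argument.
        return func(self._obj, *args, **kwargs)


def scale(values, factor=1.0):
    """Multiply every value by `factor`."""
    return [v * factor for v in values]


result = Accessor([1.0, 2.0, 3.0]).pipe(scale, factor=2.0)
print(result)  # [2.0, 4.0, 6.0]
```

Written this way, any function that takes the object as its first argument can be dropped into a method chain without a temporary variable.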

Release notes

See the changelog.

Contact

For questions and help requests, you can reach out in the scverse discourse or the discussions tab. If you found a bug, please use the issue tracker.

Citation

Varra, S. R. annsel [Computer software]. https://github.com/srivarra/annsel

Owner

Name: Sricharan Reddy Varra
Login: srivarra
Kind: user
Location: Menlo Park, California
Company: @angelolab

Repositories: 1
Profile: https://github.com/srivarra

Interested in: Data Science, Computer Graphics, Procedural Generation, and Computational Biology.

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: annsel
message: >-
    If you find this software helpful, please cite it using
    the metadata from this file.
type: software
authors:
    - given-names: Sricharan Reddy
      family-names: Varra
      email: srivarra@stanford.edu
      affiliation: "Department of Pediatrics, Stanford Medicine"
      orcid: "https://orcid.org/0009-0000-5757-6818"
repository-code: "https://github.com/srivarra/annsel"
repository: "https://annsel.readthedocs.io/en/latest/"
abstract: >+
    annsel brings familiar DataFrame-style operations to
    AnnData objects, making filtering and selection intuitive
    and straightforward. Built on the narwhals library, it
    provides a seamless interface for manipulating complex
    biological datasets stored in AnnData format.

keywords:
    - AnnData
    - Data Manipulation
license: MIT

GitHub Events

Total

Create event: 57
Release event: 28
Issues event: 31
Watch event: 13
Delete event: 29
Issue comment event: 36
Push event: 155
Pull request event: 69
Fork event: 1

Last Year

Create event: 57
Release event: 28
Issues event: 31
Watch event: 13
Delete event: 29
Issue comment event: 36
Push event: 155
Pull request event: 69
Fork event: 1

Issues and Pull Requests

Last synced: 10 months ago

All Time

Total issues: 9
Total pull requests: 22
Average time to close issues: 5 days
Average time to close pull requests: 3 days
Total issue authors: 1
Total pull request authors: 2
Average comments per issue: 0.0
Average comments per pull request: 0.73
Merged pull requests: 18
Bot issues: 0
Bot pull requests: 9

Past Year

Issues: 9
Pull requests: 22
Average time to close issues: 5 days
Average time to close pull requests: 3 days
Issue authors: 1
Pull request authors: 2
Average comments per issue: 0.0
Average comments per pull request: 0.73
Merged pull requests: 18
Bot issues: 0
Bot pull requests: 9

Top Authors

Issue Authors

srivarra (14)
pre-commit-ci[bot] (1)

Pull Request Authors

srivarra (20)
pre-commit-ci[bot] (14)

Top Labels

Issue Labels

enhancement (4) bug (2) github-actions (1) documentation (1)

Pull Request Labels

documentation (4) enhancement (4) dependencies (2) github-actions (1) bug (1)

Packages

Total packages: 1
Total downloads:
- pypi 108 last-month

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 11
Total maintainers: 1

pypi.org: annsel

A Narwhals powered DataFrame-style selection, filtering and indexing operations on AnnData Objects.

Homepage: https://github.com/srivarra/annsel
Documentation: https://annsel.readthedocs.io/
License: MIT License Copyright (c) 2024, Sricharan Reddy Varra Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Latest release: 0.1.1
published about 1 year ago

Versions: 11
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 108 Last month

Rankings

Dependent packages count: 10.1%

Forks count: 31.9%

Average: 35.1%

Stargazers count: 41.6%

Dependent repos count: 56.6%

Maintainers (1)

srivarra

Last synced: 10 months ago

Dependencies

.github/workflows/build.yaml actions

actions/checkout v4 composite
actions/upload-artifact v4 composite
astral-sh/setup-uv v3 composite

.github/workflows/release.yaml actions

actions/checkout v4 composite
actions/download-artifact v4 composite
actions/upload-artifact v4 composite
astral-sh/setup-uv v3 composite
pypa/gh-action-pypi-publish release/v1.9 composite
sigstore/gh-action-sigstore-python v3.0.0 composite

.github/workflows/test.yaml actions

actions/checkout v4 composite
astral-sh/setup-uv v3 composite
codecov/codecov-action v4 composite
codecov/test-results-action v1 composite

pyproject.toml pypi

anndata *
more-itertools >=10.5
narwhals >=1.13.2
session-info *
