annsel

A user-friendly library that brings familiar DataFrame-style operations to AnnData objects with a simple and expressive API.

https://github.com/srivarra/annsel

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (5.1%) to scientific vocabulary

Keywords

anndata narwhals
Last synced: 10 months ago

Repository

Basic Info
Statistics
  • Stars: 10
  • Watchers: 1
  • Forks: 1
  • Open Issues: 5
  • Releases: 12
Topics
anndata narwhals
Created over 1 year ago · Last pushed 10 months ago
Metadata Files
Readme Changelog Contributing License Citation

README.md

annsel

|               |     |
| :-----------: | :-: |
| **Status**    | [![Build][badge-build]][link-build] [![Tests][badge-test]][link-test] [![Documentation][badge-docs]][link-docs] [![codecov][badge-codecov]][link-codecov] [![pre-commit][badge-pre-commit]][link-pre-commit] |
| **Meta**      | [![Hatch project][badge-hatch]][link-hatch] [![Ruff][badge-ruff]][link-ruff] [![uv][badge-uv]][link-uv] [![License][badge-license]][link-license] [![gitmoji][badge-gitmoji]][link-gitmoji] |
| **Package**   | [![PyPI][badge-pypi]][link-pypi] [![PyPI][badge-python-versions]][link-pypi] |
| **Ecosystem** | [![scverse][badge-scverse]][link-scverse] |

Annsel is a user-friendly library that brings familiar dataframe-style operations to AnnData objects.

It's built on the narwhals compatibility layer for dataframes.

Take a look at the GitHub Projects board for features and future plans: Annsel Features

Getting started

Please refer to the documentation, in particular, the API documentation.

There's also a brief tutorial on how to use all the features of annsel: All of Annsel.

Installation

You need to have Python 3.10 or newer installed on your system. If you don't have Python installed, we recommend installing uv. There are several ways to install annsel:

  1. Install the most recent release:

    With uv:

    ```zsh
    uv add annsel
    ```

    With pip:

    ```zsh
    pip install annsel
    ```

  2. Install the latest development version:

    With uv:

    ```zsh
    uv add git+https://github.com/srivarra/annsel
    ```

    With pip:

    ```zsh
    pip install git+https://github.com/srivarra/annsel.git@main
    ```
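After installing, a quick check like the one below can confirm that the environment meets the Python 3.10 requirement and that the package is visible. This is a minimal sketch using only the standard library; the helper name `check_annsel_install` is hypothetical and not part of annsel.

```python
import sys
from importlib import metadata


def check_annsel_install(package: str = "annsel") -> str:
    """Return a short status message about the Python version and the package."""
    # The installation notes above state that Python 3.10 or newer is required.
    if sys.version_info < (3, 10):
        return (
            f"Python {sys.version_info.major}.{sys.version_info.minor} "
            "is too old; 3.10+ is required."
        )
    try:
        # importlib.metadata reads the version of the installed distribution.
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package} is not installed in this environment."
    return f"{package} {version} is installed."


print(check_annsel_install())
```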

Examples

annsel comes with a small dataset from Cell X Gene to help you get familiar with the API.

```python
import annsel as an

adata = an.datasets.leukemic_bone_marrow_dataset()
```

The dataset looks like this:

```shell
AnnData object with n_obs × n_vars = 31586 × 458
    obs: 'Cluster_ID', 'donor_id', 'Sample_Tag', 'Cell_label', 'is_primary_data', 'organism_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'assay_ontology_term_id', 'tissue_ontology_term_id', 'Genotype', 'development_stage_ontology_term_id', 'sex_ontology_term_id', 'disease_ontology_term_id', 'cell_type_ontology_term_id', 'suspension_type', 'tissue_type', 'cell_type', 'assay', 'disease', 'organism', 'sex', 'tissue', 'self_reported_ethnicity', 'development_stage', 'observation_joinid'
    var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable', 'feature_is_filtered', 'Unnamed: 0', 'feature_name', 'feature_reference', 'feature_biotype', 'feature_length', 'feature_type'
    uns: 'cell_type_ontology_term_id_colors', 'citation', 'default_embedding', 'schema_reference', 'schema_version', 'title'
    obsm: 'X_bothumap', 'X_pca', 'X_projected', 'X_projectedmean', 'X_tsneni', 'X_umapni'
```

Filter

You can filter on obs, var, var_names, obs_names, X and its layers. You can also filter on obsm and varm matrices, which are passed as a key-value pair containing the matrix's key name and the predicate to filter on. Currently, the column names for obsm and varm matrices are numerical indices.

```python
adata.an.filter(
    obs=(
        an.col(["Cell_label"]).is_in(["Classical Monocytes", "CD8+CD103+ tissue resident memory T cells"]),
        an.col(["sex"]) == "male",
    ),
    var=an.col(["vst.mean"]) >= 3,
    obsm={"X_pca": an.col([0]) > 0},  # PC1 values greater than 0
    copy=False,  # Whether to return a copy of the AnnData object or just a view of it.
)
```

```shell
View of AnnData object with n_obs × n_vars = 736 × 67
    obs: 'Cluster_ID', 'donor_id', 'Sample_Tag', 'Cell_label', 'is_primary_data', 'organism_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'assay_ontology_term_id', 'tissue_ontology_term_id', 'Genotype', 'development_stage_ontology_term_id', 'sex_ontology_term_id', 'disease_ontology_term_id', 'cell_type_ontology_term_id', 'suspension_type', 'tissue_type', 'cell_type', 'assay', 'disease', 'organism', 'sex', 'tissue', 'self_reported_ethnicity', 'development_stage', 'observation_joinid'
    var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable', 'feature_is_filtered', 'Unnamed: 0', 'feature_name', 'feature_reference', 'feature_biotype', 'feature_length', 'feature_type'
    uns: 'cell_type_ontology_term_id_colors', 'citation', 'default_embedding', 'schema_reference', 'schema_version', 'title'
    obsm: 'X_bothumap', 'X_pca', 'X_projected', 'X_projectedmean', 'X_tsneni', 'X_umapni'
```
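To make the filtering semantics concrete, here is a library-free sketch of how such predicates could narrow both axes of a matrix. It assumes that several obs predicates are combined with a logical AND (as narwhals-style `filter` calls do) and that the obs and var filters are applied independently; annsel's internals may differ. The helper `keep_indices` and the toy values are hypothetical.

```python
from typing import Callable

Row = dict[str, object]
Predicate = Callable[[Row], bool]


def keep_indices(rows: list[Row], predicates: tuple[Predicate, ...]) -> list[int]:
    """Return positions of rows for which every predicate is true (logical AND)."""
    return [i for i, row in enumerate(rows) if all(p(row) for p in predicates)]


# Toy stand-ins for adata.obs, adata.var and adata.X (values are made up).
obs = [
    {"Cell_label": "Classical Monocytes", "sex": "male"},
    {"Cell_label": "Classical Monocytes", "sex": "female"},
    {"Cell_label": "Lymphomyeloid prog", "sex": "male"},
]
var = [{"vst.mean": 4.2}, {"vst.mean": 0.5}, {"vst.mean": 3.0}]
X = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Two obs predicates: both must hold for a row to be kept.
obs_idx = keep_indices(
    obs,
    (
        lambda r: r["Cell_label"] in {"Classical Monocytes"},
        lambda r: r["sex"] == "male",
    ),
)
# One var predicate, analogous to an.col(["vst.mean"]) >= 3.
var_idx = keep_indices(var, (lambda r: r["vst.mean"] >= 3,))

# Subset the matrix along both axes with the surviving positions.
X_sub = [[X[i][j] for j in var_idx] for i in obs_idx]
print(obs_idx, var_idx, X_sub)
```

In this toy case only the first observation satisfies both obs predicates, and the first and third variables pass the var predicate, so the result is a 1 × 2 submatrix.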

Select

You can select on obs, var, var_names, obs_names, X and its layers. Selecting returns a new AnnData object, which is useful when you only need a few of the columns in obs or var.

```python
adata.an.select(
    obs=an.col(["Cell_label"]),
    var=an.col(["vst.mean", "vst.std"]),
)
```

Group By

You can group by obs and var columns. Grouping returns a generator of objects, each containing the grouped data and the grouping parameters.

```python
gb_adata_result = adata.an.group_by(
    obs=an.col(["Cell_label"]),
    var=an.col(["feature_type"]),
    copy=False,
)
```

Here's what the first group looks like:

```python
next(adata.an.group_by(
    obs=an.col(["Cell_label"]),
    copy=False,
))
```

```shell
GroupByAnnData:
├── Observations:
│   └── Cell_label: Lymphomyeloid prog
├── Variables:
│   └── (all variables)
└── AnnData:
    View of AnnData object with n_obs × n_vars = 913 × 458
        obs: 'Cluster_ID', 'donor_id', 'Sample_Tag', 'Cell_label', 'is_primary_data', 'organism_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'assay_ontology_term_id', 'tissue_ontology_term_id', 'Genotype', 'development_stage_ontology_term_id', 'sex_ontology_term_id', 'disease_ontology_term_id', 'cell_type_ontology_term_id', 'suspension_type', 'tissue_type', 'cell_type', 'assay', 'disease', 'organism', 'sex', 'tissue', 'self_reported_ethnicity', 'development_stage', 'observation_joinid'
        var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable', 'feature_is_filtered', 'Unnamed: 0', 'feature_name', 'feature_reference', 'feature_biotype', 'feature_length', 'feature_type'
        uns: 'cell_type_ontology_term_id_colors', 'citation', 'default_embedding', 'schema_reference', 'schema_version', 'title'
        obsm: 'X_bothumap', 'X_pca', 'X_projected', 'X_projectedmean', 'X_tsneni', 'X_umapni'
```
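Because the result is a generator, groups are produced one at a time and can only be iterated once. The sketch below mimics that access pattern for a single obs column in plain Python. `ObsGroup` and `group_rows` are hypothetical names, and yielding groups in order of first appearance is an assumption of this sketch, not a documented property of annsel.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class ObsGroup:
    """Hypothetical stand-in for one yielded group: the key plus the matching rows."""

    key: str
    row_indices: list[int]


def group_rows(labels: list[str]) -> Iterator[ObsGroup]:
    """Yield one ObsGroup per distinct label, in order of first appearance."""
    seen: dict[str, list[int]] = {}
    for i, label in enumerate(labels):
        seen.setdefault(label, []).append(i)
    for key, idx in seen.items():
        yield ObsGroup(key=key, row_indices=idx)


cell_labels = ["Lymphomyeloid prog", "Classical Monocytes", "Lymphomyeloid prog"]
groups = group_rows(cell_labels)

first = next(groups)    # take the first group, as with next(...) on the real generator
rest = list(groups)     # consume the remaining groups
leftover = list(groups) # the generator is now exhausted, so this is empty
print(first, rest, leftover)
```

If the groups are needed more than once, collecting them into a list first avoids re-running the grouping.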

Pipe

There's also a small utility method called pipe, which lets you chain operations together as in Xarray and Pandas.

```python
import scanpy as sc

adata.an.pipe(sc.pl.embedding, basis="X_tsneni", color="Cell_label")
```
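For readers unfamiliar with the pattern, a `pipe` method in the pandas style passes the wrapped object as the first argument of the given function and forwards any remaining arguments. The following is a minimal sketch of that idea. The `Accessor` class and `scale` function are hypothetical, and annsel's own implementation may differ.

```python
from typing import Any, Callable


class Accessor:
    """Hypothetical accessor wrapping an object, standing in for `adata.an`."""

    def __init__(self, obj: Any) -> None:
        self._obj = obj

    def pipe(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        # Pass the wrapped object as the first argument, then forward the rest.
        return func(self._obj, *args, **kwargs)


def scale(values: list[float], factor: float) -> list[float]:
    """Multiply every value by a constant factor."""
    return [v * factor for v in values]


result = Accessor([1.0, 2.0, 3.0]).pipe(scale, factor=2.0)
print(result)  # [2.0, 4.0, 6.0]
```

The benefit is readability: a call such as `pipe(f, x=1)` reads left to right in a method chain, whereas `f(obj, x=1)` wraps the chain from the outside.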

Release notes

See the changelog.

Contact

For questions and help requests, you can reach out in the scverse discourse or the discussions tab. If you found a bug, please use the issue tracker.

Citation

Varra, S. R. annsel [Computer software]. https://github.com/srivarra/annsel

Owner

  • Name: Sricharan Reddy Varra
  • Login: srivarra
  • Kind: user
  • Location: Menlo Park, California
  • Company: @angelolab

Interested in: Data Science, Computer Graphics, Procedural Generation, and Computational Biology.

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: annsel
message: >-
    If you find this software helpful, please cite it using
    the metadata from this file.
type: software
authors:
    - given-names: Sricharan Reddy
      family-names: Varra
      email: srivarra@stanford.edu
      affiliation: "Department of Pediatrics, Stanford Medicine"
      orcid: "https://orcid.org/0009-0000-5757-6818"
repository-code: "https://github.com/srivarra/annsel"
repository: "https://annsel.readthedocs.io/en/latest/"
abstract: >+
    annsel brings familiar DataFrame-style operations to
    AnnData objects, making filtering and selection intuitive
    and straightforward. Built on the narwhals library, it
    provides a seamless interface for manipulating complex
    biological datasets stored in AnnData format.

keywords:
    - AnnData
    - Data Manipulation
license: MIT

GitHub Events

Total
  • Create event: 57
  • Release event: 28
  • Issues event: 31
  • Watch event: 13
  • Delete event: 29
  • Issue comment event: 36
  • Push event: 155
  • Pull request event: 69
  • Fork event: 1
Last Year
  • Create event: 57
  • Release event: 28
  • Issues event: 31
  • Watch event: 13
  • Delete event: 29
  • Issue comment event: 36
  • Push event: 155
  • Pull request event: 69
  • Fork event: 1

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 9
  • Total pull requests: 22
  • Average time to close issues: 5 days
  • Average time to close pull requests: 3 days
  • Total issue authors: 1
  • Total pull request authors: 2
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.73
  • Merged pull requests: 18
  • Bot issues: 0
  • Bot pull requests: 9
Past Year
  • Issues: 9
  • Pull requests: 22
  • Average time to close issues: 5 days
  • Average time to close pull requests: 3 days
  • Issue authors: 1
  • Pull request authors: 2
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.73
  • Merged pull requests: 18
  • Bot issues: 0
  • Bot pull requests: 9
Top Authors
Issue Authors
  • srivarra (14)
  • pre-commit-ci[bot] (1)
Pull Request Authors
  • srivarra (20)
  • pre-commit-ci[bot] (14)
Top Labels
Issue Labels
enhancement (4) bug (2) github-actions (1) documentation (1)
Pull Request Labels
documentation (4) enhancement (4) dependencies (2) github-actions (1) bug (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 108 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 11
  • Total maintainers: 1
pypi.org: annsel

A Narwhals powered DataFrame-style selection, filtering and indexing operations on AnnData Objects.

  • Homepage: https://github.com/srivarra/annsel
  • Documentation: https://annsel.readthedocs.io/
  • License: MIT License Copyright (c) 2024, Sricharan Reddy Varra Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
  • Latest release: 0.1.1
    published about 1 year ago
  • Versions: 11
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 108 Last month
Rankings
Dependent packages count: 10.1%
Forks count: 31.9%
Average: 35.1%
Stargazers count: 41.6%
Dependent repos count: 56.6%
Maintainers (1)
Last synced: 10 months ago

Dependencies

.github/workflows/build.yaml actions
  • actions/checkout v4 composite
  • actions/upload-artifact v4 composite
  • astral-sh/setup-uv v3 composite
.github/workflows/release.yaml actions
  • actions/checkout v4 composite
  • actions/download-artifact v4 composite
  • actions/upload-artifact v4 composite
  • astral-sh/setup-uv v3 composite
  • pypa/gh-action-pypi-publish release/v1.9 composite
  • sigstore/gh-action-sigstore-python v3.0.0 composite
.github/workflows/test.yaml actions
  • actions/checkout v4 composite
  • astral-sh/setup-uv v3 composite
  • codecov/codecov-action v4 composite
  • codecov/test-results-action v1 composite
pyproject.toml pypi
  • anndata *
  • more-itertools >=10.5
  • narwhals >=1.13.2
  • session-info *