annsel
A user-friendly library that brings familiar DataFrame-style operations to AnnData objects with a simple and expressive API.
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (5.1%) to scientific vocabulary
Keywords
Repository
Basic Info
- Host: GitHub
- Owner: srivarra
- License: mit
- Language: Python
- Default Branch: main
- Homepage: https://annsel.readthedocs.io/en/latest/
- Size: 23.2 MB
Statistics
- Stars: 10
- Watchers: 1
- Forks: 1
- Open Issues: 5
- Releases: 12
Topics
Metadata Files
README.md
annsel
Annsel is a user-friendly library that brings familiar dataframe-style operations to AnnData objects.
It's built on the narwhals compatibility layer for dataframes.
Take a look at the GitHub Projects board for features and future plans: Annsel Features
Getting started
Please refer to the documentation, in particular, the API documentation.
There's also a brief tutorial on how to use all the features of annsel: All of Annsel.
Installation
You need to have Python 3.10 or newer installed on your system. If you don't have
Python installed, we recommend installing uv.
There are several ways to install annsel:
Install the most recent release:
With uv:
```zsh
uv add annsel
```
With pip:
```zsh
pip install annsel
```
Install the latest development version:

With uv:
```zsh
uv add git+https://github.com/srivarra/annsel
```
With pip:
```zsh
pip install git+https://github.com/srivarra/annsel.git@main
```
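To confirm that the environment meets the Python 3.10 requirement and that the package was actually installed, a small check like the following can help. This is a minimal sketch that uses only the standard library; the helper names `python_is_supported` and `installed_version` are hypothetical and are not part of annsel.

```python
import sys
from importlib import metadata
from typing import Optional


def python_is_supported(minimum: tuple = (3, 10)) -> bool:
    """Return True if the running interpreter is at least `minimum` (major, minor)."""
    return sys.version_info[:2] >= tuple(minimum)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python 3.10+:", python_is_supported())
    print("annsel version:", installed_version("annsel"))
```

If the second line prints `None`, the install step did not land in the interpreter you are running.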
Examples
annsel comes with a small dataset from Cell X Gene to help you get familiar with the API.
```python
import annsel as an

adata = an.datasets.leukemic_bone_marrow_dataset()
```
The dataset looks like this:
```shell
AnnData object with n_obs × n_vars = 31586 × 458
    obs: 'Cluster_ID', 'donor_id', 'Sample_Tag', 'Cell_label', 'is_primary_data', 'organism_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'assay_ontology_term_id', 'tissue_ontology_term_id', 'Genotype', 'development_stage_ontology_term_id', 'sex_ontology_term_id', 'disease_ontology_term_id', 'cell_type_ontology_term_id', 'suspension_type', 'tissue_type', 'cell_type', 'assay', 'disease', 'organism', 'sex', 'tissue', 'self_reported_ethnicity', 'development_stage', 'observation_joinid'
    var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable', 'feature_is_filtered', 'Unnamed: 0', 'feature_name', 'feature_reference', 'feature_biotype', 'feature_length', 'feature_type'
    uns: 'cell_type_ontology_term_id_colors', 'citation', 'default_embedding', 'schema_reference', 'schema_version', 'title'
    obsm: 'X_bothumap', 'X_pca', 'X_projected', 'X_projectedmean', 'X_tsneni', 'X_umapni'
```
Filter
You can filter on obs, var, var_names, obs_names, X and its layers. You can also filter on obsm and varm matrices by passing a key-value pair: the attribute's key name and the predicate to filter on. Currently, the columns of obsm and varm matrices are addressed by numerical index.
```python
adata.an.filter(
    obs=(
        an.col(["Cell_label"]).is_in(["Classical Monocytes", "CD8+CD103+ tissue resident memory T cells"]),
        an.col(["sex"]) == "male",
    ),
    var=an.col(["vst.mean"]) >= 3,
    obsm={"X_pca": an.col([0]) > 0},  # PC1 values greater than 0
    copy=False,  # Whether to return a copy of the AnnData object or just a view of it.
)
```
```shell
View of AnnData object with n_obs × n_vars = 736 × 67
    obs: 'Cluster_ID', 'donor_id', 'Sample_Tag', 'Cell_label', 'is_primary_data', 'organism_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'assay_ontology_term_id', 'tissue_ontology_term_id', 'Genotype', 'development_stage_ontology_term_id', 'sex_ontology_term_id', 'disease_ontology_term_id', 'cell_type_ontology_term_id', 'suspension_type', 'tissue_type', 'cell_type', 'assay', 'disease', 'organism', 'sex', 'tissue', 'self_reported_ethnicity', 'development_stage', 'observation_joinid'
    var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable', 'feature_is_filtered', 'Unnamed: 0', 'feature_name', 'feature_reference', 'feature_biotype', 'feature_length', 'feature_type'
    uns: 'cell_type_ontology_term_id_colors', 'citation', 'default_embedding', 'schema_reference', 'schema_version', 'title'
    obsm: 'X_bothumap', 'X_pca', 'X_projected', 'X_projectedmean', 'X_tsneni', 'X_umapni'
```
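To make the semantics of the filter call more concrete, here is a conceptual sketch in plain Python. It is not how annsel is implemented (annsel builds on narwhals expressions). It assumes that several predicates on the same axis are combined with a logical AND, and that obs predicates pick rows of X while var predicates pick columns. The function names and the toy data, including the "B cells" label, are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Row = Dict[str, object]
Predicate = Callable[[Row], bool]
Matrix = List[List[float]]


def build_mask(rows: Sequence[Row], predicates: Sequence[Predicate]) -> List[bool]:
    """True where a row satisfies every predicate (assumed AND semantics)."""
    return [all(pred(row) for pred in predicates) for row in rows]


def filter_axes(
    X: Matrix,
    obs: Sequence[Row],
    var: Sequence[Row],
    obs_predicates: Sequence[Predicate] = (),
    var_predicates: Sequence[Predicate] = (),
) -> Tuple[Matrix, List[Row], List[Row]]:
    """Subset X by row (obs) and column (var) masks; no predicates keeps everything."""
    keep_obs = build_mask(obs, obs_predicates)
    keep_var = build_mask(var, var_predicates)
    new_X = [
        [value for value, keep_col in zip(row, keep_var) if keep_col]
        for row, keep_row in zip(X, keep_obs)
        if keep_row
    ]
    new_obs = [row for row, keep in zip(obs, keep_obs) if keep]
    new_var = [col for col, keep in zip(var, keep_var) if keep]
    return new_X, new_obs, new_var


# Toy stand-ins for adata.obs, adata.var and adata.X (hypothetical values).
obs = [
    {"Cell_label": "Classical Monocytes", "sex": "male"},
    {"Cell_label": "Classical Monocytes", "sex": "female"},
    {"Cell_label": "B cells", "sex": "male"},
]
var = [{"vst.mean": 4.0}, {"vst.mean": 1.5}]
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

new_X, new_obs, new_var = filter_axes(
    X,
    obs,
    var,
    obs_predicates=[
        lambda row: row["Cell_label"] in {"Classical Monocytes"},
        lambda row: row["sex"] == "male",
    ],
    var_predicates=[lambda col: col["vst.mean"] >= 3],
)
print(new_X)  # [[1.0]]
```

Only the first observation passes both obs predicates and only the first variable passes the var predicate, so a single value remains.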
Select
You can select on obs, var, var_names, obs_names, X and its layers. Selecting returns a new AnnData object. It's useful when you don't need all the columns in obs or var and only want to work with a few.
```python
adata.an.select(
    obs=an.col(["Cell_label"]),
    var=an.col(["vst.mean", "vst.std"]),
)
```
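As a rough illustration of what column selection means, the sketch below keeps only the named columns of a column-oriented table and leaves the number of rows untouched. It is a plain-Python analogy under stated assumptions, not annsel's implementation; `select_columns` and the toy table are hypothetical, and raising on an unknown column is a design choice of this sketch.

```python
from typing import Dict, List, Sequence


def select_columns(table: Dict[str, List], columns: Sequence[str]) -> Dict[str, List]:
    """Return a new table holding only `columns`, in the order requested."""
    missing = [name for name in columns if name not in table]
    if missing:
        # Failing loudly makes typos in column names easy to spot.
        raise KeyError(f"columns not found: {missing}")
    return {name: list(table[name]) for name in columns}


# Toy stand-in for adata.obs with three observations (hypothetical values).
obs_table = {
    "Cell_label": ["Classical Monocytes", "Lymphomyeloid prog", "Classical Monocytes"],
    "sex": ["male", "female", "male"],
    "tissue": ["bone marrow", "bone marrow", "bone marrow"],
}

selected = select_columns(obs_table, ["Cell_label"])
print(list(selected))               # ['Cell_label']
print(len(selected["Cell_label"]))  # 3
```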
Group By
You can group over obs and var columns. Grouping returns a generator of objects, each containing the grouped data and the grouping parameters.
```python
gb_adata_result = adata.an.group_by(
    obs=an.col(["Cell_label"]),
    var=an.col(["feature_type"]),
    copy=False,
)
```
Here's what the first group looks like:
```python
next(adata.an.group_by(
    obs=an.col(["Cell_label"]),
    copy=False,
))
```
```shell
GroupByAnnData:
├── Observations:
│   └── Cell_label: Lymphomyeloid prog
├── Variables:
│   └── (all variables)
└── AnnData:
    View of AnnData object with n_obs × n_vars = 913 × 458
        obs: 'Cluster_ID', 'donor_id', 'Sample_Tag', 'Cell_label', 'is_primary_data', 'organism_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'assay_ontology_term_id', 'tissue_ontology_term_id', 'Genotype', 'development_stage_ontology_term_id', 'sex_ontology_term_id', 'disease_ontology_term_id', 'cell_type_ontology_term_id', 'suspension_type', 'tissue_type', 'cell_type', 'assay', 'disease', 'organism', 'sex', 'tissue', 'self_reported_ethnicity', 'development_stage', 'observation_joinid'
        var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable', 'feature_is_filtered', 'Unnamed: 0', 'feature_name', 'feature_reference', 'feature_biotype', 'feature_length', 'feature_type'
        uns: 'cell_type_ontology_term_id_colors', 'citation', 'default_embedding', 'schema_reference', 'schema_version', 'title'
        obsm: 'X_bothumap', 'X_pca', 'X_projected', 'X_projectedmean', 'X_tsneni', 'X_umapni'
```
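Because group_by returns a generator, groups are consumed one at a time with `next` or a `for` loop. The sketch below shows that pattern in plain Python: it yields each distinct label together with the row indices that belong to it. It is only an analogy; `group_rows` is a hypothetical helper, and the order in which annsel yields groups is not assumed to match the first-appearance order used here.

```python
from typing import Dict, Iterator, List, Sequence, Tuple


def group_rows(labels: Sequence[str]) -> Iterator[Tuple[str, List[int]]]:
    """Yield (label, row_indices) pairs, one group at a time."""
    groups: Dict[str, List[int]] = {}
    for index, label in enumerate(labels):
        groups.setdefault(label, []).append(index)
    # dicts preserve insertion order, so groups come out in first-appearance order.
    for label, indices in groups.items():
        yield label, indices


# Toy stand-in for adata.obs["Cell_label"] (hypothetical values).
cell_labels = ["Lymphomyeloid prog", "Classical Monocytes", "Lymphomyeloid prog"]

grouped = group_rows(cell_labels)
first_label, first_rows = next(grouped)
print(first_label, first_rows)  # Lymphomyeloid prog [0, 2]

for label, rows in grouped:  # the generator continues from the second group
    print(label, rows)       # Classical Monocytes [1]
```

A generator can be iterated only once, so wrap it in `list(...)` if the groups are needed more than once.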
Pipe
There's also a small utility method, pipe, which lets you chain operations together, as in Xarray and pandas.
```python
import scanpy as sc

adata.an.pipe(sc.pl.embedding, basis="X_tsneni", color="Cell_label")
```
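The pipe pattern itself is small. A minimal sketch, assuming the accessor simply passes the wrapped object as the first positional argument to the given function (which is how `DataFrame.pipe` behaves in pandas), might look like this. The `Accessor` class is hypothetical and stands in for the `adata.an` namespace; it is not annsel's actual code.

```python
from typing import Any, Callable


class Accessor:
    """Hypothetical stand-in for an accessor namespace that wraps an object."""

    def __init__(self, obj: Any) -> None:
        self._obj = obj

    def pipe(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        """Call func(obj, *args, **kwargs) and return its result."""
        return func(self._obj, *args, **kwargs)


# A plain list plays the role of the AnnData object in this toy example.
values = Accessor([3, 1, 2])
print(values.pipe(sorted, reverse=True))  # [3, 2, 1]
print(values.pipe(len))                   # 3
```

Written this way, a call such as `adata.an.pipe(f, x=1)` would be shorthand for `f(adata, x=1)`, which keeps long chains readable from left to right.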
Release notes
See the changelog.
Contact
For questions and help requests, you can reach out in the scverse discourse or the discussions tab. If you found a bug, please use the issue tracker.
Citation
Varra, S. R. annsel [Computer software]. https://github.com/srivarra/annsel
Owner
- Name: Sricharan Reddy Varra
- Login: srivarra
- Kind: user
- Location: Menlo Park, California
- Company: @angelolab
- Repositories: 1
- Profile: https://github.com/srivarra
Interested in: Data Science, Computer Graphics, Procedural Generation, and Computational Biology.
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: annsel
message: >-
  If you find this software helpful, please cite it using
  the metadata from this file.
type: software
authors:
  - given-names: Sricharan Reddy
    family-names: Varra
    email: srivarra@stanford.edu
    affiliation: "Department of Pediatrics, Stanford Medicine"
    orcid: "https://orcid.org/0009-0000-5757-6818"
repository-code: "https://github.com/srivarra/annsel"
repository: "https://annsel.readthedocs.io/en/latest/"
abstract: >+
  annsel brings familiar DataFrame-style operations to
  AnnData objects, making filtering and selection intuitive
  and straightforward. Built on the narwhals library, it
  provides a seamless interface for manipulating complex
  biological datasets stored in AnnData format.
keywords:
  - AnnData
  - Data Manipulation
license: MIT
GitHub Events
Total
- Create event: 57
- Release event: 28
- Issues event: 31
- Watch event: 13
- Delete event: 29
- Issue comment event: 36
- Push event: 155
- Pull request event: 69
- Fork event: 1
Last Year
- Create event: 57
- Release event: 28
- Issues event: 31
- Watch event: 13
- Delete event: 29
- Issue comment event: 36
- Push event: 155
- Pull request event: 69
- Fork event: 1
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 9
- Total pull requests: 22
- Average time to close issues: 5 days
- Average time to close pull requests: 3 days
- Total issue authors: 1
- Total pull request authors: 2
- Average comments per issue: 0.0
- Average comments per pull request: 0.73
- Merged pull requests: 18
- Bot issues: 0
- Bot pull requests: 9
Past Year
- Issues: 9
- Pull requests: 22
- Average time to close issues: 5 days
- Average time to close pull requests: 3 days
- Issue authors: 1
- Pull request authors: 2
- Average comments per issue: 0.0
- Average comments per pull request: 0.73
- Merged pull requests: 18
- Bot issues: 0
- Bot pull requests: 9
Top Authors
Issue Authors
- srivarra (14)
- pre-commit-ci[bot] (1)
Pull Request Authors
- srivarra (20)
- pre-commit-ci[bot] (14)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
- Total downloads:
  - pypi: 108 last month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 11
- Total maintainers: 1
pypi.org: annsel
A Narwhals powered DataFrame-style selection, filtering and indexing operations on AnnData Objects.
- Homepage: https://github.com/srivarra/annsel
- Documentation: https://annsel.readthedocs.io/
- License: MIT License Copyright (c) 2024, Sricharan Reddy Varra Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- Latest release: 0.1.1 (published about 1 year ago)
Rankings
Maintainers (1)
Dependencies
- actions/checkout v4 composite
- actions/upload-artifact v4 composite
- astral-sh/setup-uv v3 composite
- actions/checkout v4 composite
- actions/download-artifact v4 composite
- actions/upload-artifact v4 composite
- astral-sh/setup-uv v3 composite
- pypa/gh-action-pypi-publish release/v1.9 composite
- sigstore/gh-action-sigstore-python v3.0.0 composite
- actions/checkout v4 composite
- astral-sh/setup-uv v3 composite
- codecov/codecov-action v4 composite
- codecov/test-results-action v1 composite
- anndata *
- more-itertools >=10.5
- narwhals >=1.13.2
- session-info *