mudata

Multimodal Data (.h5mu) implementation for Python

https://github.com/scverse/mudata

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 7 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
    2 of 15 committers (13.3%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (4.4%) to scientific vocabulary

Keywords

anndata data-analysis genomics mudata multi-omics multimodal-omics-analysis muon scverse

Keywords from Contributors

scanpy scrna-seq cite-seq multimodal-data scatac-seq bioinformatics transcriptomics single-cell zarr omics
Last synced: 4 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: scverse
  • License: bsd-3-clause
  • Language: Python
  • Default Branch: main
  • Homepage: https://mudata.rtfd.io
  • Size: 1.05 MB
Statistics
  • Stars: 99
  • Watchers: 6
  • Forks: 20
  • Open Issues: 14
  • Releases: 10
Topics
anndata data-analysis genomics mudata multi-omics multimodal-omics-analysis muon scverse
Created over 4 years ago · Last pushed 8 months ago
Metadata Files
Readme Contributing License Citation

README.md


MuData – multimodal data

Documentation | Publication

For using MuData in multimodal omics applications see muon.

Data structure

Just as AnnData is designed to represent unimodal annotated datasets in Python, MuData is designed to load, process, and store multimodal omics data.

MuData
  .obs -- annotation of observations (cells, samples)
  .var -- annotation of features (genes, genomic loci, etc.)
  .obsm -- multidimensional cell annotation, incl. a boolean for each modality that links .obs to the cells of that modality
  .varm -- multidimensional feature annotation, incl. a boolean vector for each modality that links .var to the features of that modality
  .mod
    AnnData
      .X -- data matrix (cells x features)
      .obs -- cell metadata (assay-specific)
      .var -- annotation of features (genes, peaks, genomic sites)
      .obsm
      .varm
      .uns
  .uns
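The per-modality boolean vectors described above can be illustrated with a small, library-free sketch. It is a conceptual illustration only, written in plain Python: the helper `link_observations` and the cell names are hypothetical, and the code is not how mudata itself computes these annotations.

```python
# Conceptual sketch only: plain Python, not the mudata implementation.
def link_observations(modalities):
    """Return the union of cell names and one boolean mask per modality.

    `modalities` maps a modality name to the list of cell names it contains.
    """
    # Build the global list of cells, keeping first-seen order.
    global_obs = []
    seen = set()
    for cells in modalities.values():
        for cell in cells:
            if cell not in seen:
                seen.add(cell)
                global_obs.append(cell)

    # For each modality, mark which global cells it actually contains.
    masks = {}
    for name, cells in modalities.items():
        present = set(cells)
        masks[name] = [cell in present for cell in global_obs]
    return global_obs, masks


obs_names, obsm_masks = link_observations({
    "rna": ["cell1", "cell2", "cell3"],
    "atac": ["cell2", "cell3", "cell4"],
})
print(obs_names)           # all cells seen in any modality
print(obsm_masks["rna"])   # True where the cell is present in 'rna'
print(obsm_masks["atac"])  # True where the cell is present in 'atac'
```

The same idea applies to features: a mask per modality over the global feature list plays the role of the boolean vectors in .varm.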

Overview

Input

MuData can be thought of as a multimodal container, in which every modality is an AnnData object:

```py
from mudata import MuData

# adata_rna and adata_atac are existing AnnData objects, one per modality
mdata = MuData({'rna': adata_rna, 'atac': adata_atac})
```

To read multimodal data from 10X Genomics, muon provides convenient readers that return a MuData object containing one AnnData object per modality:

```py
import muon as mu

mu.read_10x_h5("filtered_feature_bc_matrix.h5")
# MuData object with n_obs × n_vars = 10000 × 80000
# 2 modalities
#   rna: 10000 x 30000
#     var: 'gene_ids', 'feature_types', 'genome', 'interval'
#   atac: 10000 x 50000
#     var: 'gene_ids', 'feature_types', 'genome', 'interval'
#     uns: 'atac', 'files'
```

I/O with .h5mu files

MuData objects represent modalities as collections of AnnData objects. These collections can be saved to disk and retrieved using HDF5-based .h5mu files, whose design is based on the .h5ad file structure.

```py
import mudata as md

# mdata_pbmc is an existing MuData object
mdata_pbmc.write("pbmc_10k.h5mu")
mdata = md.read("pbmc_10k.h5mu")
```

This format makes effective use of the hierarchical nature of HDF5 files: individual AnnData objects can be read from and written to .h5mu files directly:

```py
adata = md.read("pbmc_10k.h5mu/rna")
md.write("pbmc_10k.h5mu/rna", adata)
```
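Paths such as "pbmc_10k.h5mu/rna" combine a file name with the name of a modality stored inside it. The sketch below shows one way such a path could be separated into its two parts. The helper `split_h5mu_path` is hypothetical and only illustrates the addressing scheme; it is not taken from mudata and makes no claim about how the library parses these paths.

```python
from pathlib import PurePosixPath


def split_h5mu_path(path):
    """Split 'file.h5mu/rna' into ('file.h5mu', 'rna').

    The second element is None when the path names only the file.
    Hypothetical helper for illustration; not part of mudata.
    """
    parts = PurePosixPath(path).parts
    for i, part in enumerate(parts):
        if part.endswith(".h5mu"):
            # Everything up to and including the .h5mu component is the file.
            filename = str(PurePosixPath(*parts[: i + 1]))
            # Anything after it addresses a location inside the file.
            inner = parts[i + 1:]
            modality = "/".join(inner) if inner else None
            return filename, modality
    raise ValueError(f"no .h5mu file in path: {path!r}")


print(split_h5mu_path("pbmc_10k.h5mu/rna"))
print(split_h5mu_path("pbmc_10k.h5mu"))
```

Because HDF5 files are hierarchical, the part after the file name corresponds naturally to a group inside the file.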

Citation

If you use mudata in your work, please cite the publication as follows:

MUON: multimodal omics analysis framework

Danila Bredikhin, Ilia Kats, Oliver Stegle

Genome Biology 2022 Feb 01. doi: 10.1186/s13059-021-02577-8.

You can cite the scverse publication as follows:

The scverse project provides a computational ecosystem for single-cell omics data analysis

Isaac Virshup, Danila Bredikhin, Lukas Heumos, Giovanni Palla, Gregor Sturm, Adam Gayoso, Ilia Kats, Mikaela Koutrouli, Scverse Community, Bonnie Berger, Dana Pe’er, Aviv Regev, Sarah A. Teichmann, Francesca Finotello, F. Alexander Wolf, Nir Yosef, Oliver Stegle & Fabian J. Theis

Nat Biotechnol. 2023 Apr 10. doi: 10.1038/s41587-023-01733-8.

mudata is part of the scverse® project (website, governance) and is fiscally sponsored by NumFOCUS. If you like scverse® and want to support our mission, please consider making a tax-deductible donation to help the project pay for developer time, professional services, travel, workshops, and a variety of other needs.

Owner

  • Name: scverse
  • Login: scverse
  • Kind: organization

Foundational tools for omics data in the life sciences

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Bredikhin"
  given-names: "Danila"
  orcid: "https://orcid.org/0000-0001-8089-6983"
- family-names: "Kats"
  given-names: "Ilia"
  orcid: "https://orcid.org/0000-0001-5220-5671"
title: "muon"
version: 1.0.0
date-released: 2021-06-01
url: "https://github.com/scverse/muon"
preferred-citation:
  type: article
  authors:
  - family-names: "Bredikhin"
    given-names: "Danila"
    orcid: "https://orcid.org/0000-0001-8089-6983"
  - family-names: "Kats"
    given-names: "Ilia"
    orcid: "https://orcid.org/0000-0001-5220-5671"
  - family-names: "Stegle"
    given-names: "Oliver"
    orcid: "https://orcid.org/0000-0002-8818-7193"
  doi: "10.1186/s13059-021-02577-8"
  journal: "Genome Biology"
  month: 2
  title: "MUON: multimodal omics analysis framework"
  year: 2022

GitHub Events

Total
  • Create event: 2
  • Release event: 1
  • Issues event: 25
  • Watch event: 22
  • Issue comment event: 22
  • Push event: 19
  • Pull request event: 8
  • Fork event: 4
Last Year
  • Create event: 2
  • Release event: 1
  • Issues event: 25
  • Watch event: 22
  • Issue comment event: 22
  • Push event: 19
  • Pull request event: 8
  • Fork event: 4

Committers

Last synced: 6 months ago

All Time
  • Total Commits: 570
  • Total Committers: 15
  • Avg Commits per committer: 38.0
  • Development Distribution Score (DDS): 0.204
Past Year
  • Commits: 47
  • Committers: 5
  • Avg Commits per committer: 9.4
  • Development Distribution Score (DDS): 0.255
Top Committers
Name Email Commits
Danila Bredikhin d****n@e****e 454
Ilia Kats i****s@g****t 83
mkeller 7****k 5
ilan-gold i****d@g****m 5
Max Frank m****k@g****m 4
Wouter-Michiel Vierdag w****v@h****m 3
Isaac Virshup i****p@g****m 3
bv2 b****n@g****m 3
Isaac E m****y@g****m 2
Philipp Weiler w****p@g****m 2
Lukas Heumos l****s@p****t 2
Jeongbin Park p****7@g****m 1
Michaela Müller 5****e 1
Robrecht Cannoodt r****d@g****m 1
mikelkou m****i@c****k 1
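The all-time summary figures can be recomputed from the commit counts in the table above. The sketch assumes that the Development Distribution Score is defined as one minus the top committer's share of all commits. That definition is not stated on this page, but it reproduces the reported all-time value. Commit counts are listed in table order, without names.

```python
# Commit counts per committer, copied from the Top Committers table.
commit_counts = [454, 83, 5, 5, 4, 3, 3, 3, 2, 2, 2, 1, 1, 1, 1]

total_commits = sum(commit_counts)
total_committers = len(commit_counts)
avg_commits = total_commits / total_committers

# Assumed definition: DDS = 1 - (commits by top committer / total commits).
dds = 1 - max(commit_counts) / total_commits

print(total_commits)         # total commits across all committers
print(round(avg_commits, 1)) # average commits per committer
print(round(dds, 3))         # development distribution score
```

A score near 0 means a single committer accounts for almost all commits; higher values indicate that work is spread across more people.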

Issues and Pull Requests

Last synced: 5 months ago

All Time
  • Total issues: 68
  • Total pull requests: 39
  • Average time to close issues: 8 months
  • Average time to close pull requests: 3 months
  • Total issue authors: 41
  • Total pull request authors: 13
  • Average comments per issue: 2.44
  • Average comments per pull request: 0.92
  • Merged pull requests: 27
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 17
  • Pull requests: 12
  • Average time to close issues: 2 months
  • Average time to close pull requests: 12 days
  • Issue authors: 12
  • Pull request authors: 3
  • Average comments per issue: 1.41
  • Average comments per pull request: 0.83
  • Merged pull requests: 7
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • ivirshup (9)
  • gtca (7)
  • grst (5)
  • Zethson (4)
  • scverse-bot (4)
  • emdann (2)
  • racng (2)
  • bio-la (2)
  • ouyaqing (1)
  • mumichae (1)
  • Imipenem (1)
  • joshchiou (1)
  • martinkim0 (1)
  • raozuming (1)
  • danli349 (1)
Pull Request Authors
  • gtca (26)
  • ilan-gold (7)
  • ilia-kats (4)
  • votti (4)
  • IsaacUtah1379 (2)
  • martinkim0 (2)
  • keller-mark (1)
  • mffrank (1)
  • Zethson (1)
  • rcannood (1)
  • mumichae (1)
  • ivirshup (1)
Top Labels
Issue Labels
enhancement (29) bug (29) good first issue (1)
Pull Request Labels

Packages

  • Total packages: 2
  • Total downloads:
    • pypi 65,477 last-month
  • Total docker downloads: 7,246
  • Total dependent packages: 26
    (may contain duplicates)
  • Total dependent repositories: 12
    (may contain duplicates)
  • Total versions: 17
  • Total maintainers: 1
pypi.org: mudata

Multimodal data

  • Versions: 14
  • Dependent Packages: 24
  • Dependent Repositories: 12
  • Downloads: 65,477 Last month
  • Docker Downloads: 7,246
Rankings
Dependent packages count: 0.7%
Docker downloads count: 1.5%
Downloads: 1.9%
Dependent repos count: 4.3%
Average: 4.7%
Stargazers count: 9.1%
Forks count: 10.6%
Maintainers (1)
Last synced: 5 months ago
conda-forge.org: mudata
  • Versions: 3
  • Dependent Packages: 2
  • Dependent Repositories: 0
Rankings
Dependent packages count: 19.5%
Dependent repos count: 34.0%
Average: 35.2%
Stargazers count: 41.4%
Forks count: 46.0%
Last synced: 5 months ago

Dependencies

docs/source/notebooks/requirements.txt pypi
  • anndata *
  • mudata *
  • numpy *
  • pandas *
.github/workflows/black.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • psf/black stable composite
.github/workflows/codecov.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • codecov/codecov-action v2 composite
.github/workflows/dev.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v1 composite
.github/workflows/pythonpackage.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v1 composite
pyproject.toml pypi