https://github.com/alleninstitute/mmc_gene_mapper

Gene ID mapper/ortholog finder for MapMyCells

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.0%) to scientific vocabulary
Last synced: 9 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: AllenInstitute
  • License: other
  • Language: Python
  • Default Branch: main
  • Size: 427 KB
Statistics
  • Stars: 0
  • Watchers: 3
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created about 1 year ago · Last pushed 9 months ago
Metadata Files
Readme Contributing License

README.md

MapMyCells Gene Mapper

Overview

This code provides a tool to map genes from one species/authority (where by "authority" we mean an institution like ENSEMBL or NCBI) to another species/authority using documented cross-authority and cross-species (orthologous) gene equivalencies. It was originally written to allow users to map data collected from an arbitrary species to the species-specific cell type taxonomies supported by the Allen Institute's MapMyCells tool and celltypemapper library.

It can, in principle, be applied to other use cases where it would be useful to map genes from one identification scheme to another.

This is a re-implementation in Python of the functionality available through the R GeneOrthology package.

Installation

To install this library, either

A) Clone the repository and run pip install . from the root directory of the repository

or

B) Run pip install "mmc_gene_mapper @ git+https://github.com/AllenInstitute/mmc_gene_mapper"

To install a specific version, you can run pip install "mmc_gene_mapper @ git+https://github.com/AllenInstitute/mmc_gene_mapper@{VERSION}" where {VERSION} is one of the valid tags listed in this repository.
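
After either installation route, it can be handy to confirm which version ended up in the active environment. The following is a minimal sketch using only the Python standard library; it assumes the installed distribution is registered under the name mmc_gene_mapper (the name used in the pip commands above), and the helper name installed_version is made up for this illustration.

```python
from importlib import metadata


def installed_version(dist_name="mmc_gene_mapper"):
    """Return the installed version string, or None if not installed.

    Assumption: the distribution name matches the name used in the
    pip install commands above.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("mmc_gene_mapper is not installed in this environment")
    else:
        print(f"mmc_gene_mapper {version} is installed")
```

If you installed a specific tag, the reported version can be compared against it, although whether tags and package versions match is not stated here.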

Use

This Jupyter notebook demonstrates how to use the code in this repository.

In broad strokes, you must first create a sqlite database file containing the valid gene mappings as determined from data published by NCBI and ENSEMBL. This can be done either programmatically, according to cell [5] of the notebook referenced above, or using the command line tool

    python -m mmc_gene_mapper.cli.create_db_file --help

The --help flag causes the arguments expected by the command line tool to be printed to stdout. A good default would be

    python -m mmc_gene_mapper.cli.create_db_file \
        --db_path data/my_gene_mapper.db \
        --local_dir data/downloaded_files/ \
        --ensembl_version 114

Before you do this, you must download the model that bkbit (a data modelling package written for BICAN) uses to parse ENSEMBL genome annotations. Do this by running the command bkbit download-ncbi-taxonomy (bkbit will already be installed as part of installing this package).

Note: This will produce a 15 GB file on disk.
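
The command-line invocation above can also be driven from a Python script, which is convenient when the database build is one step in a larger pipeline. The sketch below simply wraps the documented command with the standard library's subprocess module; the function names build_create_db_command and create_db are hypothetical helpers, not part of mmc_gene_mapper, and only the three arguments shown above are used.

```python
import shlex
import subprocess
import sys
from pathlib import Path


def build_create_db_command(db_path, local_dir, ensembl_version):
    """Assemble the create_db_file command line as a list of arguments."""
    return [
        sys.executable, "-m", "mmc_gene_mapper.cli.create_db_file",
        "--db_path", str(db_path),
        "--local_dir", str(local_dir),
        "--ensembl_version", str(ensembl_version),
    ]


def create_db(db_path, local_dir, ensembl_version):
    """Run the command line tool; raises CalledProcessError on failure.

    Assumption: mmc_gene_mapper is installed in the current interpreter
    and the bkbit download step described above has already been run.
    """
    cmd = build_create_db_command(db_path, local_dir, ensembl_version)
    subprocess.run(cmd, check=True)
    return Path(db_path)


if __name__ == "__main__":
    # Only print the command here; call create_db(...) to actually run it.
    cmd = build_create_db_command(
        "data/my_gene_mapper.db", "data/downloaded_files/", 114
    )
    print(shlex.join(cmd))
```

Using sys.executable rather than a bare "python" keeps the subprocess in the same environment in which the package was installed.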

Once that file has been created, you can instantiate the class MMCGeneMapper (again, see cell [5] of the example notebook), and you are ready to map genes.
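
Before instantiating MMCGeneMapper, it may be worth checking that the database file exists and opens as a sqlite file. The sketch below uses only the standard library's sqlite3 module and does not touch the mmc_gene_mapper API; the helper name list_tables is hypothetical, and because the table layout is not documented here, it simply reports whatever tables it finds.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return the names of the tables in a sqlite database file."""
    db_path = Path(db_path)
    if not db_path.is_file():
        raise FileNotFoundError(f"{db_path} does not exist")
    # Open read-only so that this check cannot modify the database.
    uri = db_path.resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [row[0] for row in rows]
```

For example, list_tables("data/my_gene_mapper.db") should return a non-empty list if the database was built successfully; an empty list or an sqlite3.DatabaseError suggests the build did not complete.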

Level of support

We are providing this tool to the community and any and all who want to use it. Issues and pull requests are welcome. However, this code is also intended as part of the backend for the Allen Institute Brain Knowledge Platform. As such, issues and pull requests may be declined if they interfere with the functionality required to support that service.

Owner

  • Name: Allen Institute
  • Login: AllenInstitute
  • Kind: organization
  • Location: Seattle, WA

Please visit http://alleninstitute.github.io/ for more information.

GitHub Events

Total
  • Push event: 33
  • Create event: 5
Last Year
  • Push event: 33
  • Create event: 5

Dependencies

pyproject.toml pypi
  • abc_atlas_access @ git+https://github.com/AllenInstitute/abc_atlas_access
  • bkbit @ git+https://github.com/danielsf/bkbit@generalize/gff/reader/20250624
  • h5py *
  • jupyter *
  • numpy *
  • pandas *
  • pytest *
  • pytest-xdist *
  • schemasheets @ git+https://github.com/linkml/schemasheets
  • scipy <1.15.0