https://github.com/alleninstitute/mmc_gene_mapper
Gene ID mapper/ortholog finder for MapMyCells
Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found)
- ✓ .zenodo.json file (found)
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (14.0%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: AllenInstitute
- License: other
- Language: Python
- Default Branch: main
- Size: 427 KB
Statistics
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
MapMyCells Gene Mapper
Overview
This code provides a tool to map genes from one species/authority (where "authority" means an institution such as ENSEMBL or NCBI) to another species/authority using documented cross-authority and cross-species (orthologous) gene equivalencies. It was originally written to allow users to map data collected from an arbitrary species to the species-specific cell type taxonomies supported by the Allen Institute's MapMyCells tool and celltypemapper library.
It can, in principle, be applied to other use cases where it would be useful to map genes from one identification scheme to another.
This is a re-implementation in Python of the functionality available through the R GeneOrthology package.
Installation
To install this library, either
A) Clone the repository and run pip install . from the root directory of the
repository
or
B) Run
pip install "mmc_gene_mapper @ git+https://github.com/AllenInstitute/mmc_gene_mapper"
To install a specific version, you can run
pip install "mmc_gene_mapper @ git+https://github.com/AllenInstitute/mmc_gene_mapper@{VERSION}"
where {VERSION} is one of the
valid tags listed in this repository.
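After either installation route, it can be useful to confirm that the package is visible to the interpreter you intend to use. The following is a minimal sketch using only the Python standard library. It assumes the distribution name and the importable module name are both mmc_gene_mapper, as the pip and python -m commands in this README suggest; the helper names installed_version and is_importable are illustrative and not part of this repository.

```python
from importlib import metadata, util


def installed_version(dist_name="mmc_gene_mapper"):
    """Return the installed version string of dist_name, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def is_importable(module_name="mmc_gene_mapper"):
    """Return True if module_name can be located on the current import path."""
    return util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Both names are assumptions taken from the install commands above.
    print("version:", installed_version())
    print("importable:", is_importable())
```

If installed_version() returns None, the package is not installed in the active environment.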
Use
This Jupyter notebook demonstrates how to use the code in this repository.
In broad strokes, you must first create a sqlite database file containing the valid gene mappings as determined from data published by NCBI and ENSEMBL. This can be done either programmatically, as shown in cell [5] of the notebook
referenced above, or using the command line tool
python -m mmc_gene_mapper.cli.create_db_file --help
The --help flag prints the arguments expected by the command line tool
to stdout. A good default would be
python -m mmc_gene_mapper.cli.create_db_file \
--db_path data/my_gene_mapper.db \
--local_dir data/downloaded_files/ \
--ensembl_version 114
Before you do this you must download the model that
bkbit (a data modelling package
written for BICAN) uses to parse ENSEMBL genome annotations. Do this by running
the command
bkbit download-ncbi-taxonomy
(bkbit will already be installed as a part of installing this package)
Note: This will produce a 15 GB file on disk.
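The two command line steps described in this section (the bkbit download, then the database creation) can also be driven from a script. The sketch below only assembles and optionally runs the exact commands and flags quoted in this README; it does not use the package's Python API. The function names build_create_db_command and create_db, and the dry_run parameter, are hypothetical helpers introduced for illustration.

```python
import subprocess
import sys


def build_create_db_command(db_path, local_dir, ensembl_version):
    """Assemble the argv list for the create_db_file command line tool."""
    return [
        sys.executable, "-m", "mmc_gene_mapper.cli.create_db_file",
        "--db_path", str(db_path),
        "--local_dir", str(local_dir),
        "--ensembl_version", str(ensembl_version),
    ]


def create_db(db_path, local_dir, ensembl_version, dry_run=True):
    """Return the prerequisite and database-creation commands, in order.

    With dry_run=True (the default) nothing is executed; the commands
    are only returned so they can be inspected first.
    """
    commands = [
        # Prerequisite step quoted from this README.
        ["bkbit", "download-ncbi-taxonomy"],
        build_create_db_command(db_path, local_dir, ensembl_version),
    ]
    if not dry_run:
        for cmd in commands:
            # check=True raises CalledProcessError if a step fails.
            subprocess.run(cmd, check=True)
    return commands
```

Calling create_db("data/my_gene_mapper.db", "data/downloaded_files/", 114) returns the commands without running them; pass dry_run=False to execute them, keeping in mind the download size noted above.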
Once the database file has been created, you can instantiate the class
MMCGeneMapper (again, see cell [5] of the example notebook), and you are
ready to map genes.
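Before instantiating the mapper, a quick sanity check of the database file may help catch a failed or partial build. The sketch below uses only the standard sqlite3 module and makes no assumption about the schema this package writes; it simply lists whatever tables are present. The function name list_tables is illustrative and not part of this repository.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return the names of all tables in the sqlite file at db_path."""
    db_path = Path(db_path)
    if not db_path.is_file():
        raise FileNotFoundError(f"{db_path} does not exist; create it first")
    # Open read-only so that inspection cannot modify the database.
    uri = f"{db_path.resolve().as_uri()}?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

For example, list_tables("data/my_gene_mapper.db") should return a non-empty list if the creation step completed.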
Level of support
We are providing this tool to the community and to anyone who wants to use it. Issues and pull requests are welcome. However, this code is also intended as part of the backend for the Allen Institute Brain Knowledge Platform, so issues and pull requests may be declined if they interfere with the functionality required to support that service.
Owner
- Name: Allen Institute
- Login: AllenInstitute
- Kind: organization
- Location: Seattle, WA
- Website: https://alleninstitute.org
- Repositories: 184
- Profile: https://github.com/AllenInstitute
Please visit http://alleninstitute.github.io/ for more information.
GitHub Events
Total
- Push event: 33
- Create event: 5
Last Year
- Push event: 33
- Create event: 5
Dependencies
- abc_atlas_access @ git+https://github.com/AllenInstitute/abc_atlas_access
- bkbit @ git+https://github.com/danielsf/bkbit@generalize/gff/reader/20250624
- h5py *
- jupyter *
- numpy *
- pandas *
- pytest *
- pytest-xdist *
- schemasheets @ git+https://github.com/linkml/schemasheets
- scipy <1.15.0