orthogene

🧬 o r t h o g e n e 🧬✨✨✨✨✨✨✨ Interspecies gene mapping✨✨✨✨✨ 🦠 πŸ” 🌱 πŸ” 🌳 πŸ” 🍎 πŸ” 🍊 πŸ” πŸͺ± πŸ” πŸͺ° πŸ” 🐟 πŸ” 🦎 πŸ” πŸ“ πŸ” πŸ¦‡ πŸ” πŸ„ πŸ” πŸ– πŸ” 🐐 πŸ” 🐎 πŸ” 🐈 πŸ” πŸ• πŸ” 🐁 πŸ” πŸ’ πŸ” 🦧 πŸ” 🦍 πŸ” πŸƒβ€β™€οΈ

https://github.com/neurogenomics/orthogene

Science Score: 33.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • β—‹
    CITATION.cff file
  • βœ“
    codemeta.json file
    Found codemeta.json file
  • β—‹
    .zenodo.json file
  • β—‹
    DOI references
  • βœ“
    Academic publication links
    Links to: ncbi.nlm.nih.gov
  • βœ“
    Committers with academic emails
    2 of 5 committers (40.0%) from academic institutions
  • β—‹
    Institutional organization owner
  • β—‹
    JOSS paper metadata
  • β—‹
    Scientific vocabulary similarity
    Low similarity (11.6%) to scientific vocabulary

Keywords

animal-models bioconductor bioconductor-package bioinformatics biomedicine comparative-genomics evolutionary-biology genes genomics ontologies r r-package translational-research

Keywords from Contributors

grna-sequence immune-repertoire sequencing biology taxonomy proteomics statistical-analysis data-access microbiome-analysis looking-for-maintainer
Last synced: 6 months ago · JSON representation

Repository

🧬 o r t h o g e n e 🧬✨✨✨✨✨✨✨ Interspecies gene mapping✨✨✨✨✨ 🦠 πŸ” 🌱 πŸ” 🌳 πŸ” 🍎 πŸ” 🍊 πŸ” πŸͺ± πŸ” πŸͺ° πŸ” 🐟 πŸ” 🦎 πŸ” πŸ“ πŸ” πŸ¦‡ πŸ” πŸ„ πŸ” πŸ– πŸ” 🐐 πŸ” 🐎 πŸ” 🐈 πŸ” πŸ• πŸ” 🐁 πŸ” πŸ’ πŸ” 🦧 πŸ” 🦍 πŸ” πŸƒβ€β™€οΈ

Basic Info
Statistics
  • Stars: 44
  • Watchers: 2
  • Forks: 4
  • Open Issues: 12
  • Releases: 2
Topics
animal-models bioconductor bioconductor-package bioinformatics biomedicine comparative-genomics evolutionary-biology genes genomics ontologies r r-package translational-research
Created over 4 years ago · Last pushed 12 months ago
Metadata Files
Readme Changelog

README.Rmd

---
title: "`orthogene`: Interspecies gene mapping"  
author: "`r rworkflows::use_badges(branch='main', add_bioc_release = TRUE, add_bioc_download_month = TRUE, add_bioc_download_rank = TRUE, add_bioc_download_total = TRUE)`" 
date: "

README updated: `r format( Sys.Date(), '%b-%d-%Y')`

" output: github_document --- ```{r, echo=FALSE, include=FALSE} pkg <- read.dcf("DESCRIPTION", fields = "Package")[1] title <- read.dcf("DESCRIPTION", fields = "Title")[1] description <- read.dcf("DESCRIPTION", fields = "Description")[1] URL <- read.dcf('DESCRIPTION', fields = 'URL')[1] owner <- tolower(strsplit(URL,"/")[[1]][4]) ``` # Intro `r description` In brief, `orthogene` lets you easily: - [**`convert_orthologs`** between any two species.](https://neurogenomics.github.io/orthogene/articles/orthogene#convert-orthologs) - [**`map_species`** names onto standard taxonomic ontologies.](https://neurogenomics.github.io/orthogene/articles/orthogene#map-species) - [**`report_orthologs`** between any two species.](https://neurogenomics.github.io/orthogene/articles/orthogene#report-orthologs) - [**`map_genes`** onto standard ontologies](https://neurogenomics.github.io/orthogene/articles/orthogene#map-genes) - [**`aggregate_mapped_genes`** in a matrix.](https://neurogenomics.github.io/orthogene/articles/orthogene#aggregate-mapped-genes) - [get **`all_genes`** from any species.](https://neurogenomics.github.io/orthogene/articles/orthogene#get-all-genes) - [**`infer_species`** from gene names.](https://neurogenomics.github.io/orthogene/articles/infer_species.html) - [**`create_background`** gene lists based one, two, or more species.](https://neurogenomics.github.io/orthogene/reference/create_background.html) - [**`get_silhouettes`** of each species from phylopic.](https://neurogenomics.github.io/orthogene/reference/get_silhouettes.html) - [**`prepare_tree`** with evolutionary divergence times across >147,000 species.](https://neurogenomics.github.io/orthogene/reference/prepare_tree.html) ## Citation If you use ``r pkg``, please cite: > `r citation(pkg)$textVersion` ## [Documentation website](https://neurogenomics.github.io/orthogene/) ## [PDF manual](https://github.com/neurogenomics/orthogene/blob/main/inst/orthogene_1.5.1.pdf) # Installation ```{r, eval=FALSE} if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager") # orthogene is only available on Bioconductor>=3.14 if(BiocManager::version()<"3.14") BiocManager::install(update = TRUE, ask = FALSE) BiocManager::install("orthogene") ``` ## Docker `orthogene` can also be installed via a [Docker](https://hub.docker.com/repository/docker/neurogenomicslab/orthogene) or [Singularity](https://sylabs.io/guides/2.6/user-guide/singularity_and_docker.html) container with Rstudio pre-installed. Further [instructions provided here](https://neurogenomics.github.io/orthogene/articles/docker). # Methods ```{r setup} library(orthogene) data("exp_mouse") # Setting to "homologene" for the purposes of quick demonstration. # We generally recommend using method="gprofiler" (default). method <- "homologene" ``` For most functions, `orthogene` lets users choose between different methods, each with complementary strengths and weaknesses: `"gprofiler"`, `"homologene"`, and `"babelgene"` In general, we recommend you use `"gprofiler"` when possible, as it tends to be more comprehensive. While `"babelgene"` contains less species, it queries a wide variety of orthology databases and can return a column "support_n" that tells you how many databases support each ortholog gene mapping. This can be helpful when you need a semi-quantitative measure of mapping quality. It's also worth noting that for smaller gene sets, the speed difference between these methods becomes negligible. ```{r pros_cons, echo=FALSE} pros_cons <- data.frame( gprofiler=c("Reference organisms"="700+", "Gene mappings"="More comprehensive", "Updates"="Frequent", "Orthology databases"=paste("Ensembl", "HomoloGene", "WormBase",sep = ", "), "Data location"="Remote", "Internet connection"="Required", "Speed"="Slower"), homologene=c("# reference organisms"="20+", "Gene mappings"="Less comprehensive", "Updates"="Less frequent", "Orthology databases"="HomoloGene", "Data location"="Local", "Internet connection"="Not required", "Speed"="Faster"), babelgene=c("# reference organisms"="19 (but cannot convert between pairs of non-human species)", "Gene mappings"="More comprehensive", "Updates"="Less frequent", "Orthology databases"="HGNC Comparison of Orthology Predictions (HCOP), which includes predictions from eggNOG, Ensembl Compara, HGNC, HomoloGene, Inparanoid, NCBI Gene Orthology, OMA, OrthoDB, OrthoMCL, Panther, PhylomeDB, TreeFam and ZFIN", "Data location"="Local", "Internet connection"="Not required", "Speed"="Medium") ) knitr::kable(pros_cons) ``` # Quick example ## Convert orthologs [`convert_orthologs`](https://neurogenomics.github.io/orthogene/reference/convert_orthologs.html) is very flexible with what users can supply as `gene_df`, and can take a `data.frame`/`data.table`/`tibble`, (sparse) `matrix`, or `list`/`vector` containing genes. Genes, transcripts, proteins, SNPs, or genomic ranges will be recognised in most formats (HGNC, Ensembl, RefSeq, UniProt, etc.) and can even be a mixture of different formats. All genes will be mapped to gene symbols, unless specified otherwise with the `...` arguments (see `?orthogene::convert_orthologs` or [here ](https://neurogenomics.github.io/orthogene/reference/convert_orthologs.html) for details). ### Note on non-1:1 orthologs A key feature of [`convert_orthologs`](https://neurogenomics.github.io/orthogene/reference/convert_orthologs.html) is that it handles the issue of genes with many-to-many mappings across species. This can occur due to evolutionary divergence, and the function of these genes tend to be less conserved and less translatable. Users can address this using different strategies via `non121_strategy=`. ```{r convert_orthologs} gene_df <- orthogene::convert_orthologs(gene_df = exp_mouse, gene_input = "rownames", gene_output = "rownames", input_species = "mouse", output_species = "human", non121_strategy = "drop_both_species", method = method) knitr::kable(as.matrix(head(gene_df))) ``` `convert_orthologs` is just one of the many useful functions in `orthogene`. Please see the [documentation website](https://neurogenomics.github.io/orthogene/articles/orthogene) for the full vignette. # Additional resources ## [Hex sticker creation](https://github.com/neurogenomics/orthogene/blob/main/inst/hex/hexSticker.Rmd) ## [Benchmarking methods](https://github.com/neurogenomics/orthogene/blob/main/inst/benchmark/benchmarks.Rmd) # Session Info
```{r Session Info} utils::sessionInfo() ```
# Related projects ## Tools - [`gprofiler2`](https://cran.r-project.org/web/packages/gprofiler2/vignettes/gprofiler2.html): `orthogene` uses this package. `gprofiler2::gorth()` pulls from [many orthology mapping databases](https://biit.cs.ut.ee/gprofiler/page/organism-list). - [`homologene`](https://github.com/oganm/homologene): `orthogene` uses this package. Provides API access to NCBI [HomoloGene](https://www.ncbi.nlm.nih.gov/homologene) database. - [`babelgene`](https://cran.r-project.org/web/packages/babelgene/vignettes/babelgene-intro.html): `orthogene` uses this package. `babelgene::orthologs()` pulls from [many orthology mapping databases](https://cran.r-project.org/web/packages/babelgene/vignettes/babelgene-intro.html). - [`annotationTools`](https://www.bioconductor.org/packages/release/bioc/html/annotationTools.html): For interspecies microarray data. - [`orthology`](https://www.leibniz-hki.de/en/orthology-r-package.html): R package for ortholog mapping (deprecated?). - [`hpgltools::load_biomart_orthologs()`](https://rdrr.io/github/elsayed-lab/hpgltools/man/load_biomart_orthologs.html): Helper function to get orthologs from biomart. - [`JustOrthologs`](https://github.com/ridgelab/JustOrthologs/): Ortholog inference from multi-species genomic sequences. - [`orthologr`](https://github.com/drostlab/orthologr): Ortholog inference from multi-species genomic sequences. - [`OrthoFinder`](https://github.com/davidemms/OrthoFinder): Gene duplication event inference from multi-species genomics. ## Databases - [HomoloGene](https://www.ncbi.nlm.nih.gov/homologene): NCBI database that the R package [homologene](https://github.com/oganm/homologene) pulls from. - [gProfiler](https://biit.cs.ut.ee/gprofiler): Web server for functional enrichment analysis and conversions of gene lists. - [OrtholoGene](http://orthologene.org/resources.html): Compiled list of gene orthology resources. ## Contact ### [Neurogenomics Lab](https://www.neurogenomics.co.uk/) UK Dementia Research Institute Department of Brain Sciences Faculty of Medicine Imperial College London [GitHub](https://github.com/neurogenomics) [DockerHub](https://hub.docker.com/orgs/neurogenomicslab)

Owner

  • Name: neurogenomics
  • Login: neurogenomics
  • Kind: organization
  • Location: United Kingdom

Neurogenomics Lab, UK Dementia Research Institute at Imperial College London

GitHub Events

Total
  • Issues event: 5
  • Watch event: 3
  • Issue comment event: 11
  • Push event: 4
  • Create event: 2
Last Year
  • Issues event: 5
  • Watch event: 3
  • Issue comment event: 11
  • Push event: 4
  • Create event: 2

Committers

Last synced: over 2 years ago

All Time
  • Total Commits: 265
  • Total Committers: 5
  • Avg Commits per committer: 53.0
  • Development Distribution Score (DDS): 0.14
Past Year
  • Commits: 43
  • Committers: 3
  • Avg Commits per committer: 14.333
  • Development Distribution Score (DDS): 0.093
Top Committers
Name Email Commits
bschilder b****r@a****u 228
Brian M. Schilder 3****r 29
Nitesh Turaga n****a@g****m 4
J Wokaty j****y 2
J Wokaty j****y@s****u 2
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 7 months ago

All Time
  • Total issues: 65
  • Total pull requests: 2
  • Average time to close issues: about 2 months
  • Average time to close pull requests: 28 days
  • Total issue authors: 13
  • Total pull request authors: 1
  • Average comments per issue: 1.54
  • Average comments per pull request: 0.0
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 4
  • Pull requests: 0
  • Average time to close issues: 9 days
  • Average time to close pull requests: N/A
  • Issue authors: 4
  • Pull request authors: 0
  • Average comments per issue: 2.5
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • bschilder (28)
  • sagarutturkar (2)
  • igordot (1)
  • mkutmon (1)
  • Al-Murphy (1)
  • willgearty (1)
  • cjuigne (1)
  • shanie-landen (1)
  • forraving (1)
  • agata-sm (1)
  • quelobs (1)
  • ehickman0817 (1)
  • mdsutcliffe (1)
Pull Request Authors
  • Al-Murphy (2)
Top Labels
Issue Labels
enhancement (21) bug (16) GitHub Actions (2) documentation (1)
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • bioconductor 20,880 total
  • Total dependent packages: 1
  • Total dependent repositories: 0
  • Total versions: 5
  • Total maintainers: 1
bioconductor.org: orthogene

Interspecies gene mapping

  • Versions: 5
  • Dependent Packages: 1
  • Dependent Repositories: 0
  • Downloads: 20,880 Total
Rankings
Dependent repos count: 0.0%
Dependent packages count: 0.0%
Average: 22.2%
Downloads: 66.6%
Last synced: 7 months ago

Dependencies

DESCRIPTION cran
  • R >= 4.1 depends
  • DelayedArray * imports
  • Matrix * imports
  • Matrix.utils * imports
  • babelgene * imports
  • data.table * imports
  • dplyr * imports
  • ggplot2 * imports
  • ggpubr * imports
  • ggtree * imports
  • gprofiler2 * imports
  • grr * imports
  • homologene * imports
  • jsonlite * imports
  • methods * imports
  • parallel * imports
  • patchwork * imports
  • repmis * imports
  • stats * imports
  • tools * imports
  • utils * imports
  • BiocStyle * suggests
  • Cairo * suggests
  • GenomeInfoDbData * suggests
  • OmaDB * suggests
  • RColorBrewer * suggests
  • TreeTools * suggests
  • ape * suggests
  • badger * suggests
  • covr * suggests
  • desc * suggests
  • ggimage * suggests
  • haven * suggests
  • here * suggests
  • hrbrthemes * suggests
  • knitr * suggests
  • magick * suggests
  • markdown * suggests
  • phytools * suggests
  • piggyback * suggests
  • remotes * suggests
  • rmarkdown * suggests
  • rphylopic * suggests
  • testthat >= 3.0.0 suggests
  • yulab.utils * suggests
.github/workflows/rworkflows.yml actions
  • neurogenomics/rworkflows master composite