gggenomes

A grammar of graphics for comparative genomics

https://github.com/thackl/gggenomes

Science Score: 39.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.5%) to scientific vocabulary

Keywords

biological-data comparative-genomics genomics-visualization ggplot-extension ggplot2
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 690
  • Watchers: 13
  • Forks: 68
  • Open Issues: 51
  • Releases: 0
Created about 8 years ago · Last pushed 8 months ago
Metadata Files
Readme License

README.md

gggenomes

A grammar of graphics for comparative genomics

gggenomes is a versatile graphics package for comparative genomics. It extends the popular R visualization package ggplot2 with dedicated plot functions for genes, syntenic regions, etc., and with verbs to manipulate the plot, for example to quickly zoom in on gene neighborhoods.

A realistic use case comparing six viral genomes

gggenomes makes it easy to combine data and annotations from different sources into one comprehensive and elegant plot. Here we compare the genomic architecture of six viral genomes initially described in Hackl et al.: Endogenous virophages populate the genomes of a marine heterotrophic flagellate.

```R
library(gggenomes)

# to inspect the example data shipped with gggenomes
data(package="gggenomes")

gggenomes(
  genes = emale_genes, seqs = emale_seqs, links = emale_ava,
  feats = list(emale_tirs, ngaros=emale_ngaros, gc=emale_gc)) |>
  add_sublinks(emale_prot_ava) |>
  sync() + # synchronize genome directions based on links
  geom_feat(position="identity", size=6) +
  geom_seq() +
  geom_link(data=links(2)) +
  geom_bin_label() +
  geom_gene(aes(fill=name)) +
  geom_gene_tag(aes(label=name), nudge_y=0.1, check_overlap = TRUE) +
  geom_feat(data=feats(ngaros), alpha=.3, size=10, position="identity") +
  geom_feat_note(aes(label="Ngaro-transposon"), data=feats(ngaros),
    nudge_y=.1, vjust=0) +
  geom_wiggle(aes(z=score, linetype="GC-content"), feats(gc),
    fill="lavenderblush4", position=position_nudge(y=-.2), height = .2) +
  scale_fill_brewer("Genes", palette="Dark2", na.value="cornsilk3")

ggsave("emales.png", width=8, height=4)
```

For a reproducible recipe that traces the full evolution of an earlier version of this plot, made with an older version of gggenomes, starting from a mere set of contigs and including the bioinformatics analysis workflow, have a look at "From a few sequences to a complex map in minutes".

Motivation & concept

Visualization is a cornerstone of both exploratory analysis and science communication. Bioinformatics workflows, unfortunately, tend to generate a plethora of data products, often in adventurous formats, which makes it quite difficult to integrate and co-visualize the results. Instead of trying to cater to all these different formats explicitly, gggenomes embraces the simple tidyverse-inspired credo:

  • Any data set can be transformed into one (or a few) tidy data tables
  • Any data set in a tidy data table can be easily and elegantly visualized

As a result, gggenomes helps bridge the gap between data generation, visual exploration, interpretation and communication, thereby accelerating biological research.

Under the hood, gggenomes uses a light-weight track system to accommodate a mix of related data sets, essentially implementing ggplot2 with multiple tidy tables instead of just one. The data in the different tables are tied together through a global genome layout that is automatically computed from the input and defines the positions of genomic sequences (chromosomes/contigs) and their associated features in the plot.
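As a minimal sketch of the multi-table idea, the snippet below builds two small tidy tables by hand, one for sequences and one for genes, and passes them to gggenomes only if the package is installed. The toy values are invented for illustration, and the column names (`seq_id`, `length`, `start`, `end`) are assumed to match what gggenomes expects for its sequence and gene tracks.

```R
# Toy data, invented for illustration: two sequences and three genes.
seqs <- data.frame(
  seq_id = c("a", "b"),
  length = c(600, 550)
)
genes <- data.frame(
  seq_id = c("a", "a", "b"),
  start  = c(50, 350, 80),
  end    = c(250, 500, 450)
)

# The two tables are tied together through the shared seq_id column.
stopifnot(all(genes$seq_id %in% seqs$seq_id))

# Build the plot only if gggenomes is available.
if (requireNamespace("gggenomes", quietly = TRUE)) {
  library(gggenomes)
  p <- gggenomes(
    genes = tibble::as_tibble(genes),
    seqs  = tibble::as_tibble(seqs)
  ) +
    geom_seq() +   # sequence backbones, drawn from the seqs track
    geom_gene()    # gene arrows, drawn from the genes track
  # print(p) renders the plot on the active graphics device
}
```

Each geom reads from its own track, so further tables (links, additional features) can be added without reshaping the ones already there.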

Inspiration

gggenomes draws inspiration from some brilliant packages, in particular:

Installation

gggenomes is available as a stable release on CRAN (from v1.0.1). The latest development versions are available on GitHub.

```R
# Install from CRAN
install.packages("gggenomes")

# optionally install ggtree to plot genomes next to trees
# https://bioconductor.org/packages/release/bioc/html/ggtree.html
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("ggtree")

# Install latest development version from GitHub
devtools::install_github("thackl/gggenomes")
```
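To confirm which version ended up installed, a small check along these lines may help. It is a sketch using only base R; the helper name `has_min_version` is hypothetical, and the 1.0.1 threshold is simply the first CRAN release mentioned above.

```R
# Hypothetical helper: TRUE if `pkg` is installed at version >= `min_version`.
has_min_version <- function(pkg, min_version) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(FALSE)
  }
  utils::packageVersion(pkg) >= package_version(min_version)
}

# FALSE if gggenomes is missing or older than the first CRAN release
has_min_version("gggenomes", "1.0.1")
```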

Owner

  • Name: Thomas Hackl
  • Login: thackl
  • Kind: user
  • Location: Groningen, Netherlands
  • Company: Max Planck Institute for Medical Research

GitHub Events

Total
  • Issues event: 13
  • Watch event: 97
  • Delete event: 7
  • Member event: 1
  • Issue comment event: 23
  • Push event: 21
  • Pull request event: 6
  • Fork event: 7
  • Create event: 2
Last Year
  • Issues event: 13
  • Watch event: 97
  • Delete event: 7
  • Member event: 1
  • Issue comment event: 23
  • Push event: 21
  • Pull request event: 6
  • Fork event: 7
  • Create event: 2

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 5
  • Total pull requests: 5
  • Average time to close issues: 8 months
  • Average time to close pull requests: 2 months
  • Total issue authors: 3
  • Total pull request authors: 4
  • Average comments per issue: 1.4
  • Average comments per pull request: 0.8
  • Merged pull requests: 4
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 4
  • Pull requests: 4
  • Average time to close issues: 1 day
  • Average time to close pull requests: 18 days
  • Issue authors: 3
  • Pull request authors: 3
  • Average comments per issue: 1.5
  • Average comments per pull request: 0.0
  • Merged pull requests: 3
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • thackl (7)
  • iimog (2)
  • waltercostamb (2)
  • SergejRuff (2)
  • 22paulae (1)
  • shaodongyan (1)
  • danpal96 (1)
  • m-bogaerts (1)
  • tania-k (1)
  • Jigyasa3 (1)
  • yzhong005 (1)
  • TlaskalV (1)
  • xvtyzn (1)
  • xiahui625649 (1)
  • AlexSCFraser (1)
Pull Request Authors
  • iimog (3)
  • thackl (3)
  • SergejRuff (1)
  • cramsing (1)
  • olivroy (1)
Top Labels
Issue Labels
bug (1)
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • cran 392 last-month
  • Total docker downloads: 149
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 2
  • Total maintainers: 1
cran.r-project.org: gggenomes

A Grammar of Graphics for Comparative Genomics

  • Versions: 2
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 392 Last month
  • Docker Downloads: 149
Rankings
Stargazers count: 0.6%
Forks count: 1.1%
Dependent packages count: 28.6%
Average: 30.4%
Dependent repos count: 35.3%
Downloads: 86.5%
Maintainers (1)
Last synced: 6 months ago

Dependencies

.github/workflows/pkgdown.yaml actions
  • actions/cache v1 composite
  • actions/checkout v2 composite
  • r-lib/actions/setup-pandoc master composite
  • r-lib/actions/setup-r master composite
DESCRIPTION cran
  • R >= 3.4.2 depends
  • dplyr * depends
  • gggenes * depends
  • ggplot2 * depends
  • jsonlite * depends
  • purrr * depends
  • readr >= 2.0.0 depends
  • snakecase * depends
  • stringr * depends
  • thacklr * depends
  • tibble * depends
  • tidyr * depends
  • vctrs * depends
  • Hmisc * suggests
  • ggtree * suggests
  • patchwork * suggests
  • rtracklayer * suggests
  • testthat * suggests