cogeqc

An R package to perform systematic quality checks on comparative genomics analyses

https://github.com/almeidasilvaf/cogeqc

Science Score: 33.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: ncbi.nlm.nih.gov
  • Committers with academic emails
    1 of 4 committers (25.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (17.5%) to scientific vocabulary

Keywords

comparative-genomics evolutionary-genomics rstats

Keywords from Contributors

bioconductor-package immune-repertoire grna-sequence gene ontology sequencing genomics proteomics bioinformatics microbiome-data

Repository

Basic Info
Statistics
  • Stars: 8
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 4 years ago · Last pushed over 2 years ago

https://github.com/almeidasilvaf/cogeqc/blob/devel/



# cogeqc 



[![GitHub
issues](https://img.shields.io/github/issues/almeidasilvaf/cogeqc)](https://github.com/almeidasilvaf/cogeqc/issues)
[![Lifecycle:
stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)
[![R-CMD-check-bioc](https://github.com/almeidasilvaf/cogeqc/workflows/R-CMD-check-bioc/badge.svg)](https://github.com/almeidasilvaf/cogeqc/actions)
[![Codecov test
coverage](https://codecov.io/gh/almeidasilvaf/cogeqc/branch/devel/graph/badge.svg)](https://codecov.io/gh/almeidasilvaf/cogeqc?branch=devel)


The goal of `cogeqc` is to facilitate systematic quality checks on
standard comparative genomics analyses to help researchers detect issues
and select the most suitable parameters for each data set. Currently,
cogeqc can be used to assess:

1.  **Genome assembly and annotation quality**, using two approaches:

    - *Statistics in a context:* users can extract summary assembly and
      annotation statistics for genomes on NCBI (via the [NCBI Datasets
      API](https://www.ncbi.nlm.nih.gov/datasets/)) and compare their
      observed values (e.g., genome size, number of genes, contiguity
      measures) with previously reported values on NCBI genomes.

    - *Gene space completeness with BUSCOs:* users can assess gene space
      completeness using Benchmarking Universal Single-Copy Orthologs (BUSCOs)
      through wrapper functions that run
      [BUSCO](https://doi.org/10.1093/bioinformatics/btv351) from the
      comfort of an R session and create publication-ready plots with
      summary statistics.

2.  **Orthogroup inference:** orthogroups are assessed based on the
    percentage of shared protein domains in all orthogroups. The
    rationale for this approach is that genes in the same orthogroup
    evolved from a common ancestor, so the percentage of conserved
    protein domains in an orthogroup should be as high as possible.

3.  **Synteny detection:** synteny detection is assessed using
    network-based approaches, namely the clustering coefficient and
    degree of a synteny network.
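
To make the ideas behind items 2 and 3 concrete, here is a minimal base R sketch on toy data. It is not `cogeqc`'s implementation and does not call the package: the data frames, the helper `shared_domain_pct()`, and the simplified score (percentage of genes carrying the most frequent domain of their orthogroup) are all assumptions made for illustration.

``` r
# --- Orthogroups: percentage of genes sharing a protein domain (toy data) ---
# One row per gene-domain annotation; all identifiers are made up
annotation <- data.frame(
    Orthogroup = c("OG1", "OG1", "OG1", "OG2", "OG2", "OG2", "OG2"),
    Gene       = c("g1", "g2", "g3", "g4", "g5", "g6", "g6"),
    Domain     = c("PF00001", "PF00001", "PF00001",
                   "PF00002", "PF00003", "PF00002", "PF00004"),
    stringsAsFactors = FALSE
)

# Hypothetical helper: % of genes in an orthogroup that carry its most
# frequent domain (a simplified stand-in for a domain-based score)
shared_domain_pct <- function(og) {
    n_genes <- length(unique(og$Gene))
    # Count each domain at most once per gene
    genes_per_domain <- tapply(og$Gene, og$Domain, function(g) length(unique(g)))
    100 * max(genes_per_domain) / n_genes
}

pct <- vapply(
    split(annotation, annotation$Orthogroup), shared_domain_pct, numeric(1)
)
print(round(pct, 1))

# --- Synteny network: degree and clustering coefficient (toy data) ---
# Undirected edge list; each row links two syntenic regions (made-up names)
edges <- data.frame(
    from = c("A", "A", "B", "C"),
    to   = c("B", "C", "C", "D"),
    stringsAsFactors = FALSE
)

# Build a symmetric adjacency matrix from the edge list
nodes <- sort(unique(c(edges$from, edges$to)))
adj <- matrix(0, length(nodes), length(nodes), dimnames = list(nodes, nodes))
adj[cbind(edges$from, edges$to)] <- 1
adj[cbind(edges$to, edges$from)] <- 1

# Degree: number of neighbors of each node
degree <- rowSums(adj)

# Local clustering coefficient: triangles through a node divided by the
# number of possible pairs among its neighbors (0 when degree < 2)
triangles <- diag(adj %*% adj %*% adj) / 2
clustering <- ifelse(degree > 1, triangles / (degree * (degree - 1) / 2), 0)

print(degree)
print(round(clustering, 2))
```

In this toy example, `OG1` scores 100% because all three genes share one domain, while `OG2` scores about 66.7%. In the network, nodes `A` and `B` have a clustering coefficient of 1, `C` has 1/3, and `D` (a single neighbor) has 0.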

## Installation instructions

Get the latest stable `R` release from
[CRAN](http://cran.r-project.org/). Then install `cogeqc` from
[Bioconductor](http://bioconductor.org/) using the following code:

``` r
if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}

BiocManager::install("cogeqc")
```

Install the development version from
[GitHub](https://github.com/almeidasilvaf/cogeqc) with:

``` r
BiocManager::install("almeidasilvaf/cogeqc")
```
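
After either installation route, it can be useful to confirm that the package is available and which version was installed. The sketch below uses only base R functions; `pkg_status()` is a hypothetical helper written for this example and is not part of `cogeqc`.

``` r
# Hypothetical helper: report whether a package can be loaded and,
# if so, which version is installed
pkg_status <- function(pkg) {
    if (requireNamespace(pkg, quietly = TRUE)) {
        paste0(pkg, " ", as.character(utils::packageVersion(pkg)), " is installed")
    } else {
        paste0(pkg, " is not installed")
    }
}

message(pkg_status("cogeqc"))
```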

## Citation

Below is the citation output from using `citation('cogeqc')` in R.
Please run this yourself to check for any updates on how to cite
**cogeqc**.

``` r
print(citation('cogeqc'), bibtex = TRUE)
#> 
#> To cite package 'cogeqc' in publications use:
#> 
#>   Almeida-Silva F, Van de Peer Y (2022). _cogeqc: Systematic quality
#>   checks on comparative genomics analyses_. R package version 1.3.1,
#>   <https://github.com/almeidasilvaf/cogeqc>.
#> 
#> A BibTeX entry for LaTeX users is
#> 
#>   @Manual{,
#>     title = {cogeqc: Systematic quality checks on comparative genomics analyses},
#>     author = {Fabrício Almeida-Silva and Yves {Van de Peer}},
#>     year = {2022},
#>     note = {R package version 1.3.1},
#>     url = {https://github.com/almeidasilvaf/cogeqc},
#>   }
```

Please note that `cogeqc` was only made possible thanks to many
other R and bioinformatics software authors, who are cited in
the vignettes and/or the paper(s) describing this package.

## Code of Conduct

Please note that the `cogeqc` project is released with a [Contributor
Code of Conduct](http://bioconductor.org/about/code-of-conduct/). By
contributing to this project, you agree to abide by its terms.

## Development tools

- Continuous code testing is possible thanks to [GitHub
  actions](https://www.tidyverse.org/blog/2020/04/usethis-1-6-0/)
  through *[usethis](https://CRAN.R-project.org/package=usethis)*,
  *[remotes](https://CRAN.R-project.org/package=remotes)*, and
  *[rcmdcheck](https://CRAN.R-project.org/package=rcmdcheck)* customized
  to use [Bioconductor's docker
  containers](https://www.bioconductor.org/help/docker/) and
  *[BiocCheck](https://bioconductor.org/packages/3.15/BiocCheck)*.
- Code coverage assessment is possible thanks to
  [codecov](https://codecov.io/gh) and
  *[covr](https://CRAN.R-project.org/package=covr)*.
- The [documentation website](http://almeidasilvaf.github.io/cogeqc) is
  automatically updated thanks to
  *[pkgdown](https://CRAN.R-project.org/package=pkgdown)*.
- The documentation is formatted thanks to
  *[devtools](https://CRAN.R-project.org/package=devtools)* and
  *[roxygen2](https://CRAN.R-project.org/package=roxygen2)*.

For more details, check the `dev` directory.

This package was developed using
*[biocthis](https://bioconductor.org/packages/3.15/biocthis)*.

Owner

  • Name: Fabrício Almeida-Silva
  • Login: almeidasilvaf
  • Kind: user
  • Location: Technologiepark 71, Ghent, Belgium
  • Company: VIB-UGent Center for Plant Systems Biology

Bioinformatics. Network biology. Plant genomics and evolution. #rstats addict.

GitHub Events

Total
  • Watch event: 3
Last Year
  • Watch event: 3

Committers

Last synced: over 2 years ago

All Time
  • Total Commits: 83
  • Total Committers: 4
  • Avg Commits per committer: 20.75
  • Development Distribution Score (DDS): 0.072
Past Year
  • Commits: 11
  • Committers: 3
  • Avg Commits per committer: 3.667
  • Development Distribution Score (DDS): 0.364
Top Committers
Name            Email            Commits
almeidasilvaf   f****a@h****m    77
J Wokaty        j****y@s****u    2
Nitesh Turaga   n****a@g****m    2
J Wokaty        j****y           2

Packages

  • Total packages: 1
  • Total downloads:
    • bioconductor 6,841 total
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 6
  • Total maintainers: 1
bioconductor.org: cogeqc

Systematic quality checks on comparative genomics analyses

  • Versions: 6
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 6,841 Total
Rankings
Dependent repos count: 0.0%
Dependent packages count: 0.0%
Average: 29.8%
Downloads: 89.4%
Last synced: 10 months ago