QuaC

QuaC: A Pipeline Implementing Quality Control Best Practices for Genome Sequencing and Exome Sequencing Data - Published in JOSS (2023)

https://github.com/uab-cgds-worthey/quac

Science Score: 98.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 9 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org, zenodo.org
  • Committers with academic emails
    2 of 5 committers (40.0%) from academic institutions
  • Institutional organization owner
    Organization uab-cgds-worthey has institutional domain (sites.uab.edu)
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

pipeline

Keywords from Contributors

mesh
Last synced: 6 months ago

Repository

🦆 Quality Control of WGS and exome samples 🦆

Basic Info
  • Host: GitHub
  • Owner: uab-cgds-worthey
  • License: gpl-3.0
  • Language: Python
  • Default Branch: master
  • Homepage: https://quac.readthedocs.io
  • Size: 37.5 MB
Statistics
  • Stars: 6
  • Watchers: 3
  • Forks: 1
  • Open Issues: 15
  • Releases: 10
Topics
pipeline
Created about 3 years ago · Last pushed 10 months ago
Metadata Files
Readme Contributing License

README.md


QuaC

🦆🦆 Don't duck that QC thingy 🦆🦆

NOTE: QuaC was previously hosted on a different remote Git provider, UAB GitLab. It was migrated to GitHub in Jan 2023, and the GitLab version has been archived.

What is QuaC?

QuaC is a Snakemake-based pipeline that runs several QC tools on WGS/WES samples and then summarizes their results using pre-defined, configurable QC thresholds.

In summary, QuaC performs the following:

  • Runs several QC tools using BAM and VCF files as input. At our center, CGDS, these files are produced as part of the small variant caller pipeline.
  • Uses the QuaC-Watch tool to check selected QC metrics against their expected thresholds and summarizes the results for easier human review.
  • Aggregates the QC output as well as the QuaC-Watch output using MultiQC, at both the sample level and the project level.
  • Optionally, when run with the flag --include_prior_qc, the QuaC-Watch and QC aggregation steps above can accept pre-run results from a few QC tools (fastqc, fastq-screen, picard's markduplicates).
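
The threshold check described above can be pictured with a minimal Python sketch. It is not QuaC-Watch's actual code or config format: the metric names, the limits, and the `Threshold` and `check_sample` helpers are all hypothetical and chosen only to illustrate comparing QC metrics against configurable ranges.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Threshold:
    """Acceptable range for one QC metric; None means unbounded on that side."""
    minimum: Optional[float] = None
    maximum: Optional[float] = None

    def passes(self, value: float) -> bool:
        if self.minimum is not None and value < self.minimum:
            return False
        if self.maximum is not None and value > self.maximum:
            return False
        return True


# Hypothetical metric names and limits, for illustration only.
EXAMPLE_THRESHOLDS = {
    "mean_coverage": Threshold(minimum=30.0),
    "pct_duplication": Threshold(maximum=20.0),
    "ti_tv_ratio": Threshold(minimum=1.9, maximum=2.2),
}


def check_sample(metrics: Dict[str, float],
                 thresholds: Dict[str, Threshold]) -> Dict[str, str]:
    """Return a pass/fail/missing status for every metric that has a threshold."""
    report = {}
    for name, threshold in thresholds.items():
        if name not in metrics:
            report[name] = "missing"
        else:
            report[name] = "pass" if threshold.passes(metrics[name]) else "fail"
    return report


# Made-up metric values for a single sample.
sample_metrics = {"mean_coverage": 34.2, "pct_duplication": 27.5}
report = check_sample(sample_metrics, EXAMPLE_THRESHOLDS)
for name, status in report.items():
    print(f"{name}: {status}")
```

Keeping the thresholds in a plain mapping, separate from the checking logic, reflects the README's point that the thresholds are configurable and should be adjusted for non-human data.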

NOTE: QuaC is built for use with human WGS/WES data. If you would like to use it with non-human data, please modify the pipeline as needed, especially the thresholds used in the QuaC-Watch configs.

Documentation

Full documentation, including installation and how to run QuaC, is available at https://quac.readthedocs.io.

Citing QuaC

If you use QuaC, please cite:

Gajapathy et al., (2023). QuaC: A Pipeline Implementing Quality Control Best Practices for Genome Sequencing and Exome Sequencing Data. Journal of Open Source Software, 8(90), 5313, https://doi.org/10.21105/joss.05313

Repo owner

  • Manavalan Gajapathy

License

GNU GPLv3

Contributing

See here for contributing guidelines.

Changelog

See here

Owner

  • Name: Center for Computational Genomics and Data Science (CGDS)
  • Login: uab-cgds-worthey
  • Kind: organization

JOSS Publication

QuaC: A Pipeline Implementing Quality Control Best Practices for Genome Sequencing and Exome Sequencing Data
Published
October 23, 2023
Volume 8, Issue 90, Page 5313
Authors
Manavalan Gajapathy ORCID
Center for Computational Genomics and Data Science, The University of Alabama at Birmingham, Birmingham, Alabama, United States of America; Department of Genetics, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, United States of America
Brandon M. Wilk ORCID
Center for Computational Genomics and Data Science, The University of Alabama at Birmingham, Birmingham, Alabama, United States of America; Department of Genetics, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, United States of America
Elizabeth A. Worthey ORCID
Center for Computational Genomics and Data Science, The University of Alabama at Birmingham, Birmingham, Alabama, United States of America; Department of Genetics, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, United States of America
Editor
Lorena Pantano ORCID
Tags
snakemake, quality control, genome sequencing, exome sequencing, QC review, multiqc, singularity, bam, vcf

GitHub Events

Total
  • Release event: 1
  • Delete event: 1
  • Issue comment event: 2
  • Push event: 20
  • Pull request review event: 4
  • Pull request event: 4
  • Create event: 3
Last Year
  • Release event: 1
  • Delete event: 1
  • Issue comment event: 2
  • Push event: 20
  • Pull request review event: 4
  • Pull request event: 4
  • Create event: 3

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 614
  • Total Committers: 5
  • Avg Commits per committer: 122.8
  • Development Distribution Score (DDS): 0.186
Past Year
  • Commits: 21
  • Committers: 1
  • Avg Commits per committer: 21.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Manavalan Gajapathy m****g@u****u 500
Manavalan Gajapathy m****g@g****m 110
Brandon M. Wilk w****7@g****m 2
dependabot[bot] 4****] 1
Manavalan Gajapathy - UAB u****1@d****u 1
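
The all-time figures above are consistent with computing the Development Distribution Score as one minus the top committer's share of all commits. That definition is an assumption, since the page does not state the formula, and the helper name below is hypothetical. The sketch treats each row of the table as a separate committer identity, as the page does.

```python
def development_distribution_score(commit_counts):
    """Assumed definition: 1 - (commits by the top committer / total commits)."""
    total = sum(commit_counts)
    if total == 0:
        return 0.0
    return 1 - max(commit_counts) / total


# Commit counts per committer identity, copied from the table above.
all_time = [500, 110, 2, 1, 1]
# Past year: 21 commits from a single committer.
past_year = [21]

print(sum(all_time))                                        # total commits
print(round(sum(all_time) / len(all_time), 1))              # average per committer
print(round(development_distribution_score(all_time), 3))   # all-time DDS
print(round(development_distribution_score(past_year), 3))  # past-year DDS
```

Under this assumed definition, a score near 0 means almost all commits come from a single identity, which matches the past-year value of 0.0 with one committer.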

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 73
  • Total pull requests: 26
  • Average time to close issues: about 1 year
  • Average time to close pull requests: 5 days
  • Total issue authors: 4
  • Total pull request authors: 3
  • Average comments per issue: 2.3
  • Average comments per pull request: 0.65
  • Merged pull requests: 24
  • Bot issues: 0
  • Bot pull requests: 1
Past Year
  • Issues: 0
  • Pull requests: 3
  • Average time to close issues: N/A
  • Average time to close pull requests: 16 days
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 1.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • ManavalanG (68)
  • Redmar-van-den-Berg (2)
  • wilkb777 (1)
  • brentp (1)
Pull Request Authors
  • ManavalanG (23)
  • wilkb777 (2)
  • dependabot[bot] (1)
Top Labels
Issue Labels
  • bug (7)
  • enhancement (6)
  • has attachment (3)
Pull Request Labels
  • documentation (2)
  • dependencies (1)
  • enhancement (1)