cnr-flow
CUT&RUN-Flow, A Nextflow pipeline for QC, tag trimming, normalization, and peak calling for data from CUT&RUN experiments.
Keywords
chip-seq-pipelines
cutandrun
cutandrun-seq
cutrun
genomics
nextflow
peak-calling
Basic Info
Statistics
- Stars: 5
- Watchers: 2
- Forks: 4
- Open Issues: 0
- Releases: 1
Created over 5 years ago
· Last pushed almost 4 years ago
README.rst
***********************
CUT&RUN-Flow (CnR-flow)
***********************
.. image:: https://img.shields.io/github/v/release/rennelab/cnr-flow?include_prereleases&logo=github
   :target: https://github.com/rennelab/cnr-flow/releases
   :alt: GitHub release (latest by date including pre-releases)

.. image:: https://circleci.com/gh/RenneLab/CnR-flow.svg?style=shield&circle-token=0c2e0d49a95709cbb3f0bb8b7d8d05ffa4547d14
   :target: https://app.circleci.com/pipelines/github/RenneLab/CnR-flow
   :alt: CircleCI Build Status

.. image:: https://img.shields.io/readthedocs/cnr-flow?logo=read-the-docs
   :target: https://CnR-flow.readthedocs.io/en/latest/?badge=latest
   :alt: ReadTheDocs Documentation Status

.. image:: https://img.shields.io/badge/nextflow-%3E%3D20.10.6-green
   :target: https://www.nextflow.io/
   :alt: Nextflow Version Required >= 20.10.6

.. image:: https://img.shields.io/badge/License-GPLv3+-blue?logo=GNU
   :target: https://www.gnu.org/licenses/gpl-3.0.en.html
   :alt: GNU GPLv3+ License

.. image:: https://zenodo.org/badge/DOI/10.5281/zenodo.4015698.svg
   :target: https://doi.org/10.5281/zenodo.4015698
   :alt: Zenodo DOI:10.5281/zenodo.4015698

| Welcome to *CUT&RUN-Flow* (*CnR-flow*), a Nextflow pipeline for QC, tag
trimming, normalization, and peak calling for paired-end sequencing
data from CUT&RUN experiments.
| This software is available via GitHub, at
http://www.github.com/RenneLab/CnR-flow .
| Full project documentation is available at |docs_link|_.
Pipeline Design:
| CUT&RUN-Flow is built using `Nextflow`_, a powerful
domain-specific workflow language built to create flexible and
efficient bioinformatics pipelines.
Nextflow provides extensive flexibility in utilizing cluster
computing environments such as `PBS`_ and `SLURM`_,
and in automated and compartmentalized handling of dependencies using
`Conda`_ / `Bioconda`_, `Docker`_, `Singularity`_ or `Environment Modules`_.
Dependencies:
| In addition to local configurations, Nextflow handles
dependencies in separated working environments within the same pipeline
using `Conda`_ or `Environment Modules`_ within your working environment,
or using container-encapsulated execution with `Docker`_ or `Singularity`_.
**CnR-flow is pre-configured to auto-acquire dependencies with no additional setup,
either using Conda recipes from the Bioconda project,
or by using Docker or Singularity to execute Docker images hosted by the
BioContainers project** (`Bioconda`_; `BioContainers`_).
| CUT&RUN-Flow utilizes
`UCSC Genome Browser Tools`_ and `Samtools`_
for reference library preparation,
`FastQC`_ for tag quality control,
`Trimmomatic`_ for tag trimming, `Bowtie2`_ for tag alignment,
`Samtools`_, `bedtools`_ and `UCSC Genome Browser Tools`_
for alignment manipulation, and `MACS2`_ and/or `SEACR`_
for peak calling, as well as their associated language subdependencies of
Java, Python2/3, R, and C++.
Pipeline Features:
* One-step reference database preparation using a path (or URL)
to a FASTA file.
* Ability to specify groups
of samples containing both treatment (Ex: H3K4me3) and
control (Ex: IgG) antibody
groups, with automated association of each control sample with the
respective treatment samples during the peak calling step
* Built-in normalization
protocol to normalize to a sequence library of the user's choice
when spike-in DNA is used in the CUT&RUN Protocol (Optional, includes an
*E. coli* reference genome for utilization of *E. coli*
as a spike-in control as described by |Meers2019|
[see the |References| section of |docs_link|_])
* OR: CPM-normalization to normalize total read counts between samples (beta).
* SLURM, PBS... and many other job scheduling environments
enabled natively by Nextflow
* Output of memory-efficient CRAM (alignment),
bedgraph (genome coverage),
and bigWig (genome coverage) file formats
|pipe_dotgraph|
| For a full list of required dependencies and tested versions, see
the |Dependencies| section of |docs_link|_, and for dependency
configuration options see the |Dependency Config| section.
.. _Quickstart:
Quickstart
------------
Here is a brief introduction on how to install and get started using the pipeline.
For full details, see |docs_link|_.
Prepare Task Directory:
| Create a task directory, and navigate to it.
.. code-block:: bash

    $ mkdir ./my_task   # (Example)
    $ cd ./my_task      # (Example)

Install Nextflow (if necessary):
| Download the nextflow executable to your current directory.
| (You can move the nextflow executable and add to $PATH for
future usage)
.. code-block:: bash

    $ curl -s https://get.nextflow.io | bash

    # For the following steps, use:
    nextflow    # If nextflow executable on $PATH (assumed)
    ./nextflow  # If running nextflow executable from local directory

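To make the ``nextflow`` command available in later sessions, the downloaded executable can be moved to a directory on ``$PATH``. The following is a minimal sketch, assuming a per-user ``~/bin`` directory; the ``install_nextflow`` helper and the default target directory are illustrative choices, not part of Nextflow or CnR-flow.

```bash
# Hypothetical helper: move a downloaded executable into a bin directory.
# Usage: install_nextflow <path-to-nextflow> [target-dir]
install_nextflow() {
    src="$1"
    dest="${2:-$HOME/bin}"          # assumed target directory
    mkdir -p "$dest" || return 1
    chmod +x "$src" || return 1     # make sure the file is executable
    mv "$src" "$dest/nextflow" || return 1
    # Prepend the directory to PATH for the current shell if it is missing.
    case ":$PATH:" in
        *":$dest:"*) ;;
        *) PATH="$dest:$PATH"; export PATH ;;
    esac
}
```

After ``install_nextflow ./nextflow``, the change to ``$PATH`` only lasts for the current shell; to keep it, add the directory to ``$PATH`` in your shell startup file.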
Download and Install CnR-flow:
| Nextflow will download and store the pipeline in the
user's Nextflow info directory (Default: ``~/.nextflow/``)
.. code-block:: bash

    $ nextflow run RenneLab/CnR-flow --mode initiate

Configure, Validate, and Test:
Conda:
* Install miniconda (if necessary).
`Installation instructions `_
* The CnR-flow configuration with Conda should then work "out-of-the-box."
Docker:
* Add '-profile docker' to all nextflow commands
Singularity:
* Add '-profile singularity' to all nextflow commands
| If using an alternative configuration, see the |Dependency Config|
section of |docs_link|_ for dependency configuration options.
|
| Once dependencies have been configured, validate all dependencies:
.. code-block:: bash

    # Conda or other configs:
    $ nextflow run CnR-flow --mode validate_all

    # OR Docker Configuration:
    $ nextflow run CnR-flow -profile docker --mode validate_all

    # OR Singularity Configuration:
    $ nextflow run CnR-flow -profile singularity --mode validate_all

| Fill in the required task input parameters in "nextflow.config".
For detailed setup instructions, see the |Task Setup|
section of |docs_link|_.
*Additionally, for usage on a SLURM, PBS, or other cluster systems,
configure your system executor, time, and memory settings.*
.. code-block:: bash

    # Configure:
    # Edit nextflow.config in a text editor   # Task Input, Steps, etc. Configuration

    # REQUIRED values to enter (all others *should* work as default):
    # ref_fasta         (or some other ref-mode/location)
    # treat_fastqs      (input paired-end fastq[.gz] file paths)
    #   [OR fastq_groups] (multi-group input paired-end .fastq[.gz] file paths)

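For the cluster settings mentioned above, Nextflow reads executor, time, and memory options from the ``process`` scope of a configuration file. The following is a minimal sketch, assuming a SLURM cluster; the file name ``cluster_example.config`` and the resource values are placeholders rather than CnR-flow defaults, and the same settings could instead be placed in the task's "nextflow.config".

```bash
# Write an example Nextflow configuration fragment (all values are placeholders).
cat > cluster_example.config <<'EOF'
process {
    executor = 'slurm'   // or 'pbs', etc., depending on the scheduler
    time     = '4h'      // wall-time limit per process (placeholder)
    memory   = '8 GB'    // memory request per process (placeholder)
}
EOF
```

A separate file like this can be passed to Nextflow with its ``-c`` option, for example ``nextflow run CnR-flow -c cluster_example.config --mode dry_run``.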
Prepare and Execute Pipeline:
| Prepare your reference database (and normalization reference) from .fasta[.gz]
file(s):

.. code-block:: bash

    $ nextflow run CnR-flow --mode prep_fasta

| Perform a test run to check inputs, parameter setup, and process execution:

.. code-block:: bash

    $ nextflow run CnR-flow --mode dry_run

| If satisfied with the pipeline setup, execute the pipeline:

.. code-block:: bash

    $ nextflow run CnR-flow --mode run

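The modes above can also be chained in a small wrapper so that a failure in one stage stops the later ones. The following is a minimal sketch, assuming dependencies and "nextflow.config" are already set up; the ``cnr_flow_all`` function and the ``NXF`` and ``CNR_PROFILE`` variables are illustrative names, not part of CnR-flow.

```bash
# Hypothetical wrapper: run the CnR-flow modes in order, stopping at the first failure.
# NXF:         nextflow executable to call (default: nextflow on $PATH)
# CNR_PROFILE: optional profile name, e.g. docker or singularity
cnr_flow_all() {
    nxf="${NXF:-nextflow}"
    for mode in validate_all prep_fasta dry_run run; do
        echo ">>> CnR-flow mode: $mode"
        if [ -n "${CNR_PROFILE:-}" ]; then
            "$nxf" run CnR-flow -profile "$CNR_PROFILE" --mode "$mode" || return 1
        else
            "$nxf" run CnR-flow --mode "$mode" || return 1
        fi
    done
}
```

Call ``cnr_flow_all`` for a Conda or other local configuration, or set the profile first, for example ``CNR_PROFILE=docker cnr_flow_all``.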
| Further documentation on CUT&RUN-Flow components, setup, and usage can
be found in |docs_link|_.
.. |References| replace:: *References*
.. |Meers2019| replace:: *Meers et al. (eLife 2019)*
.. |Dependency Config| replace:: *Dependency Configuration*
.. |Dependencies| replace:: *Dependencies*
.. |Task Setup| replace:: *Task Setup*
.. |pipe_dotgraph| image:: build_info/dotgraph_parsed.png
   :alt: CUT&RUN-Flow Pipe Flowchart
.. |docs_link| replace:: CUT&RUN-Flow's ReadTheDocs
.. _docs_link: https://cnr-flow.readthedocs.io#
.. _Nextflow: http://www.nextflow.io
.. _Bioconda: https://bioconda.github.io/
.. _CUTRUNTools: https://bitbucket.org/qzhudfci/cutruntools/src
.. _SEACR: https://github.com/FredHutch/SEACR
.. _R: https://www.r-project.org/
.. _Bowtie2: http://bowtie-bio.sourceforge.net/bowtie2/index.shtml
.. _faCount: https://hgdownload.cse.ucsc.edu/admin/exe/
.. _Samtools: http://www.htslib.org/
.. _FastQC: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/
.. _Trimmomatic: http://www.usadellab.org/cms/?page=trimmomatic
.. _bedtools: https://bedtools.readthedocs.io/en/latest/
.. _bedGraphToBigWig: https://hgdownload.cse.ucsc.edu/admin/exe/
.. _MACS2: https://github.com/macs3-project/MACS
.. _PBS: https://www.openpbs.org/
.. _SLURM: https://slurm.schedmd.com/
.. _CONDA: https://anaconda.org/
.. _Environment Modules: http://modules.sourceforge.net/
.. _Docker: http://www.docker.com/
.. _Singularity: https://sylabs.io/
.. _BioContainers: https://biocontainers.pro/
.. _UCSC Genome Browser Tools: https://hgdownload.cse.ucsc.edu/admin/exe/
.. _kseq_test: https://bitbucket.org/qzhudfci/cutruntools/src
.. _CUT&RUN-Tools: https://bitbucket.org/qzhudfci/cutruntools/src
Owner
- Name: Renne Lab
- Login: RenneLab
- Kind: organization
- Location: Gainesville, FL
- Website: https://www.rennelab.com/
- Repositories: 2
- Profile: https://github.com/RenneLab
Citation (Citations.bib)
@article{di2017nextflow,
title={Nextflow enables reproducible computational workflows},
author={Di Tommaso, Paolo and Chatzou, Maria and Floden, Evan W and Barja, Pablo Prieto and Palumbo, Emilio and Notredame, Cedric},
journal={Nature biotechnology},
volume={35},
number={4},
pages={316--319},
year={2017},
publisher={Nature Publishing Group}
}
@article{gruning2018bioconda,
title={Bioconda: sustainable and comprehensive software distribution for the life sciences},
author={Gr{\"u}ning, Bj{\"o}rn and Dale, Ryan and Sj{\"o}din, Andreas and Chapman, Brad A and Rowe, Jillian and Tomkins-Tinch, Christopher H and Valieris, Renan and K{\"o}ster, Johannes},
journal={Nature methods},
volume={15},
number={7},
pages={475--476},
year={2018},
publisher={Nature Publishing Group}
}
@article{zhu2019cut,
title={CUT\&RUNTools: a flexible pipeline for CUT\&RUN processing and footprint analysis},
author={Zhu, Qian and Liu, Nan and Orkin, Stuart H and Yuan, Guo-Cheng},
journal={Genome biology},
volume={20},
number={1},
pages={192},
year={2019},
publisher={Springer}
}
@article{meers2019peak,
title={Peak calling by Sparse Enrichment Analysis for CUT\&RUN chromatin profiling},
author={Meers, Michael P and Tenenbaum, Dan and Henikoff, Steven},
journal={Epigenetics \& chromatin},
volume={12},
number={1},
pages={42},
year={2019},
publisher={Springer}
}
@Manual{,
title = {R: A Language and Environment for Statistical Computing},
author = {{R Core Team}},
organization = {R Foundation for Statistical Computing},
address = {Vienna, Austria},
year = {2017},
url = {https://www.R-project.org/},
}
@article{10.1093/bioinformatics/btx192,
author = {da Veiga Leprevost, Felipe and Grüning, Björn A and Alves Aflitos, Saulo and Röst, Hannes L and Uszkoreit, Julian and Barsnes, Harald and Vaudel, Marc and Moreno, Pablo and Gatto, Laurent and Weber, Jonas and Bai, Mingze and Jimenez, Rafael C and Sachsenberg, Timo and Pfeuffer, Julianus and Vera Alvarez, Roberto and Griss, Johannes and Nesvizhskii, Alexey I and Perez-Riverol, Yasset},
title = "{BioContainers: an open-source and community-driven framework for software standardization}",
journal = {Bioinformatics},
volume = {33},
number = {16},
pages = {2580-2582},
year = {2017},
month = {03},
abstract = "{BioContainers (biocontainers.pro) is an open-source and community-driven framework which provides platform independent executable environments for bioinformatics software. BioContainers allows labs of all sizes to easily install bioinformatics software, maintain multiple versions of the same software and combine tools into powerful analysis pipelines. BioContainers is based on popular open-source projects Docker and rkt frameworks, that allow software to be installed and executed under an isolated and controlled environment. Also, it provides infrastructure and basic guidelines to create, manage and distribute bioinformatics containers with a special focus on omics technologies. These containers can be integrated into more comprehensive bioinformatics pipelines and different architectures (local desktop, cloud environments or HPC clusters).The software is freely available at github.com/BioContainers/.}",
issn = {1367-4803},
doi = {10.1093/bioinformatics/btx192},
url = {https://doi.org/10.1093/bioinformatics/btx192},
eprint = {https://academic.oup.com/bioinformatics/article-pdf/33/16/2580/25163480/btx192.pdf},
}
@article{langmead2012fast,
title={Fast gapped-read alignment with Bowtie 2},
author={Langmead, Ben and Salzberg, Steven L},
journal={Nature methods},
volume={9},
number={4},
pages={357},
year={2012},
publisher={Nature Publishing Group}
}
@article{kent2002human,
title={The human genome browser at UCSC},
author={Kent, W James and Sugnet, Charles W and Furey, Terrence S and Roskin, Krishna M and Pringle, Tom H and Zahler, Alan M and Haussler, David},
journal={Genome research},
volume={12},
number={6},
pages={996--1006},
year={2002},
publisher={Cold Spring Harbor Lab}
}
@article{li2009sequence,
title={The sequence alignment/map format and SAMtools},
author={Li, Heng and Handsaker, Bob and Wysoker, Alec and Fennell, Tim and Ruan, Jue and Homer, Nils and Marth, Gabor and Abecasis, Goncalo and Durbin, Richard},
journal={Bioinformatics},
volume={25},
number={16},
pages={2078--2079},
year={2009},
publisher={Oxford University Press}
}
@misc{andrews2015quality,
title={A quality control tool for high throughput sequence data. 2010},
author={Andrews, Simon and FastQC, A},
year={2015}
}
@article{bolger2014trimmomatic,
title={Trimmomatic: a flexible trimmer for Illumina sequence data},
author={Bolger, Anthony M and Lohse, Marc and Usadel, Bjoern},
journal={Bioinformatics},
volume={30},
number={15},
pages={2114--2120},
year={2014},
publisher={Oxford University Press}
}
@article{quinlan2010bedtools,
title={BEDTools: a flexible suite of utilities for comparing genomic features},
author={Quinlan, Aaron R and Hall, Ira M},
journal={Bioinformatics},
volume={26},
number={6},
pages={841--842},
year={2010},
publisher={Oxford University Press}
}
@article{kent2010bigwig,
title={BigWig and BigBed: enabling browsing of large distributed datasets},
author={Kent, W James and Zweig, Ann S and Barber, G and Hinrichs, Angie S and Karolchik, Donna},
journal={Bioinformatics},
volume={26},
number={17},
pages={2204--2207},
year={2010},
publisher={Oxford University Press}
}
@article{zhang2008model,
title={Model-based analysis of ChIP-Seq (MACS)},
author={Zhang, Yong and Liu, Tao and Meyer, Clifford A and Eeckhoute, J{\'e}r{\^o}me and Johnson, David S and Bernstein, Bradley E and Nusbaum, Chad and Myers, Richard M and Brown, Myles and Li, Wei and others},
journal={Genome biology},
volume={9},
number={9},
pages={1--9},
year={2008},
publisher={BioMed Central}
}
Committers
Last synced: 9 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Dan Stribling | ds@u****u | 227 |
Committer Domains (Top 20 + Academic)
ufl.edu: 1
Issues and Pull Requests
Last synced: about 1 year ago
All Time
- Total issues: 6
- Total pull requests: 0
- Average time to close issues: 3 months
- Average time to close pull requests: N/A
- Total issue authors: 4
- Total pull request authors: 0
- Average comments per issue: 5.5
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- mhriris (2)
- EllieDuan (2)
- zakiF (1)
- Xinming-W (1)
Top Labels
Issue Labels
bug (1)