TwistMethylFlow

A NextFlow pipeline for Twist NGS DNA Methylation data analysis

https://github.com/JD2112/TwistMethylFlow

Keywords

dna dna-methylation docker nextflow singularity twist

README.md

Overview

This Nextflow pipeline is designed for the analysis of Twist NGS Methylation data, including quality control, alignment, methylation calling, differential methylation analysis, and post-processing. It integrates various tools and custom scripts to provide a comprehensive analysis workflow.

Features

| Step | Workflow |
| ------------------------------------------ | ----------------- |
| Generate Reference Genome Index (optional) | Bismark |
| Raw data QC | FastQC |
| Adapter sequence trimming | Trim Galore |
| Align Reads | Bismark (bowtie2) |
| Deduplicate Alignments | Bismark |
| Sort and indexing | Samtools |
| Extract Methylation Calls | Bismark |
| Sample Report | Bismark |
| Summary Report | Bismark |
| Alignment QC | Qualimap |
| QC Reporting | MultiQC |
| Differential Methylation Analysis | EdgeR/MethylKit |
| Post processing | ggplot2 |
| GO analysis | Gene Ontology |

Pipeline Schema

Requirements

Usage

  1. Start from either FASTQ files or Bismark-aligned BAM files. See the manual for details.

  2. Run the differential methylation analysis with EdgeR, MethylKit, or both. See the manual for details.

  3. Pass --skip_diff_meth to skip the differential methylation analysis.
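Item 3 has no command of its own, so here is a minimal sketch of a run that starts from aligned BAM files and skips the differential methylation analysis. The flag spellings follow the Options table; the sample sheet name, the BAM location, and the output directory are hypothetical placeholders, and whether --aligned_bams expects a directory or a glob pattern is an assumption to check against the manual. The snippet only prints the command so it can be reviewed first.

```shell
# Assemble the arguments first so the full command can be reviewed before launching.
# Samplesheet.csv, /data/aligned_bams and the output directory are hypothetical placeholders.
set -- run JD2112/TwistMethylFlow \
  -profile singularity \
  --sample_sheet Samplesheet.csv \
  --aligned_bams /data/aligned_bams \
  --skip_diff_meth \
  --outdir Results/TwistMethylFlow_qc

# Dry run: print the command. Remove the leading `echo` to launch the pipeline.
echo nextflow "$@"
```

Because --aligned_bams is supplied, the Options table suggests that neither a Bismark index nor a genome FASTA is needed for this kind of run.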

--run_both_methods (default: true)

```
# When the reference genome still needs indexing, pass --genome_fasta
nextflow run JD2112/TwistMethylFlow \
  -profile singularity \
  --sample_sheet Samplesheettwist.csv \
  --genome_fasta /data/referencegenome/hg38/hg38.fa \
  --run_both_methods \
  --gtf_file /data/Homosapiens.GRCh38.104.gtf \
  --refseq_file /data/hg38RefSeq.bed.gz \
  --outdir Results/TwistMethylFlowboth

# If you already have the bisulfite genome index, pass --bismark_index
nextflow run JD2112/TwistMethylFlow \
  -profile singularity \
  --sample_sheet Samplesheettwist.csv \
  --bismark_index /data/referencegenome/hg38/ \
  --run_both_methods \
  --gtf_file /data/Homosapiens.GRCh38.104.gtf \
  --refseq_file /data/hg38RefSeq.bed.gz \
  --outdir /mnt/Results/TwistMethylFlowboth
```

--diff_meth_method: EdgeR

```
# When the reference genome still needs indexing, pass --genome_fasta
nextflow run JD2112/TwistMethylFlow \
  -profile singularity \
  --sample_sheet Samplesheettwist.csv \
  --genome_fasta /data/referencegenome/hg38/hg38.fa \
  --diff_meth_method edger \
  --refseq_file /data/hg38RefSeq.bed.gz \
  --outdir Results/TwistMethylFlowedgeR

# If you already have the bisulfite genome index, pass --bismark_index
nextflow run JD2112/TwistMethylFlow \
  -profile singularity \
  --sample_sheet Samplesheettwist.csv \
  --bismark_index /data/referencegenome/hg38/ \
  --diff_meth_method edger \
  --refseq_file /data/hg38RefSeq.bed.gz \
  --outdir /mnt/Results/TwistMethylFlowedgeR
```

--diff_meth_method: MethylKit

```
# When the reference genome still needs indexing, pass --genome_fasta
nextflow run JD2112/TwistMethylFlow \
  -profile singularity \
  --sample_sheet Samplesheettwist.csv \
  --genome_fasta /data/referencegenome/hg38/hg38.fa \
  --diff_meth_method methylkit \
  --gtf_file /data/Homosapiens.GRCh38.104.gtf \
  --outdir Results/TwistMethylFlowmethylKit

# If you already have the bisulfite genome index, pass --bismark_index
nextflow run JD2112/TwistMethylFlow \
  -profile singularity \
  --sample_sheet Samplesheettwist.csv \
  --bismark_index /data/referencegenome/hg38/ \
  --diff_meth_method methylkit \
  --gtf_file /data/Homosapiens.GRCh38.104.gtf \
  --outdir Results/TwistMethylFlowmethylKit
```

[!TIP] Demo data check: the demo data runs with the hg19 reference genome. Remember to update the GTF/RefSeq file accordingly.

Options:

| Option | Description |
|--------|-----------------------------------------------------------|
| --sample_sheet | Path to the sample sheet CSV file (required) |
| --bismark_index | Path to the Bismark index directory (required unless --genome or --aligned_bams is provided) |
| --genome | Path to the reference genome FASTA file (required if --bismark_index is not provided) |
| --aligned_bams | Path to aligned BAM files (use this to start from aligned BAM files instead of FASTQ files) |
| --refseq_file | Path to RefSeq file for annotation (required to run both or methylkit) |
| --gtf_file | Path to GTF file for annotation (required to run both or edger) |
| --outdir | Output directory (default: ./results) |
| --diff_meth_method | Differential methylation method to use: 'edger' or 'methylkit' (default: edger) |
| --run_both_methods | Run both edgeR and methylkit for differential methylation analysis (default: false) |
| --skip_diff_meth | Skip differential methylation analysis (default: false) |
| --coverage_threshold | Minimum read coverage to consider a CpG site (default: 10) |
| --logfc_cutoff | Differential methylation cut-off for Volcano or MA plot (default: 1.5) |
| --pvalue_cutoff | Differential methylation P-value cut-off for Volcano or MA plot (default: 0.05) |
| --hyper_color | Hypermethylation color for Volcano or MA plot (default: red) |
| --hypo_cutoff | Hypomethylation color for Volcano or MA plot (default: blue) |
| --nonsig_color | Non-significant color for Volcano or MA plot (default: black) |
| --compare_str | Comparison string for differential analysis (e.g. "Group1-Group2") |
| --top_n_genes | Number of top differentially methylated genes to report for GOplot (default: 100) |
| --help | Show this help message and exit |
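As an illustration of how the tuning options above might be combined, the sketch below overrides the coverage, fold-change, and P-value defaults for a run using both methods. Every flag comes from the Options table and the reference paths are the ones used in the usage examples; the numeric values, the sample sheet name, and the output directory are arbitrary placeholders, and the assumption that the names in --compare_str must match the groups in the sample sheet should be verified in the manual. The snippet prints the command rather than running it.

```shell
# Assemble the arguments first so the full command can be reviewed before launching.
# The threshold values, Samplesheet.csv and the output directory are illustrative placeholders.
set -- run JD2112/TwistMethylFlow \
  -profile singularity \
  --sample_sheet Samplesheet.csv \
  --bismark_index /data/referencegenome/hg38/ \
  --run_both_methods \
  --gtf_file /data/Homosapiens.GRCh38.104.gtf \
  --refseq_file /data/hg38RefSeq.bed.gz \
  --coverage_threshold 20 \
  --logfc_cutoff 2 \
  --pvalue_cutoff 0.01 \
  --compare_str "Group1-Group2" \
  --top_n_genes 50 \
  --outdir Results/TwistMethylFlow_strict

# Dry run: print the command. Remove the leading `echo` to launch the pipeline.
echo nextflow "$@"
```

Options left out of the command keep the defaults listed in the table.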

HELP

`nextflow run JD2112/TwistMethylFlow --help --outdir .`

See the manual for details.

Credits

Citation

Das, J. (2024). TwistMethylFlow (v1.0.0). Zenodo. https://doi.org/10.5281/zenodo.14204261

HELP/FAQ/Troubleshooting

Please check the manual for details.

To report a problem, please open an issue on GitHub.

License(s)

GNU-3 public license.

Acknowledgement

We would like to acknowledge the Core Facility, Faculty of Medicine and Health Sciences, Linköping University, Linköping, Sweden and Clinical Genomics Linköping, Science for Life Laboratory, Sweden for their support. We are also grateful to PDC (KTH, Sweden) for the computational support used to test and validate the pipeline on the Dardel HPC.

Owner

  • Name: Jyotirmoy Das
  • Login: JD2112
  • Kind: user
  • Location: Linköping, Sweden
  • Company: Linköping University

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: Das
    given-names: Jyotirmoy
    orcid: https://orcid.org/0000-0002-5649-4658
title: "TwistNext"
version: 1.0.0
identifiers:
  - type: doi
    value: 10.5281/zenodo.14204261
date-released: 2024-11-22

Dependencies

env/Dockerfile docker
  • continuumio/miniconda3 4.10.3 build
env/environment.yml pypi