rsvrecon

A pipeline for assembling genomic sequences of respiratory syncytial virus (RSV) from NGS data

https://github.com/stjudecab/rsvrecon

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README
  • Academic publication links
    Links to: biorxiv.org, ncbi.nlm.nih.gov
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (8.7%) to scientific vocabulary
Last synced: 6 months ago · JSON representation

Repository


Basic Info
  • Host: GitHub
  • Owner: stjudecab
  • License: mit
  • Language: Python
  • Default Branch: main
  • Size: 53 MB
Statistics
  • Stars: 1
  • Watchers: 2
  • Forks: 0
  • Open Issues: 1
  • Releases: 0
Created 12 months ago · Last pushed 9 months ago
Metadata Files
Readme Changelog Contributing License Citation

README.md

stjudecab/rsvrecon



Introduction

stjudecab/rsvrecon is a bioinformatics workflow developed to assemble and analyze genomic sequences of Respiratory Syncytial Virus (RSV) from Next-Generation Sequencing (NGS) data. It identifies genomic variations within RSV samples and highlights clinically relevant genomic features. To simplify interpretation, the workflow generates easy-to-understand HTML and PDF reports summarizing the results.

Built using Nextflow, the pipeline offers scalability, portability, and reproducibility across diverse computational infrastructures. Dependency management is simplified by employing containerization technologies such as Docker, Singularity, and Conda.

This pipeline utilizes the Nextflow DSL2 framework, featuring modularized processes with independent software environments, thereby making updates and maintenance straightforward. Processes are also integrated, whenever feasible, with the nf-core/modules repository to enhance usability and foster community contributions.

The schematic overview of the stjudecab/rsvrecon workflow is shown below:

[Figure: rsvrecon workflow schematic]

Pipeline Overview

Briefly, the rsvrecon pipeline performs the following major steps:

  1. Merge Raw Reads (optional): Concatenate re-sequenced FASTQ files (cat).

  2. Quality Control of Raw Reads: Evaluate sequencing quality (FastQC).

  3. Adapter Trimming: Remove adapters and low-quality bases (fastp).

  4. RSV Database Mapping: Map reads against RSV-specific databases (KMA).

  5. Sequence Alignment

    • Align reads to a reference (BWA (default) or STAR).
    • Sort and index aligned sequences (SAMtools).
    • Assess alignment quality (SAMtools).

  6. Genome Assembly

  7. Variant Identification: Determine viral clade assignments, mutations, and sequence quality (Nextclade).

  8. Genotyping (Whole Genome & G-gene)

    • Perform BLAST search against reference genomes (BLAST).
    • Execute multiple sequence alignment (MAFFT).
    • Generate a phylogenetic tree (FastTree).

Pipeline Usage

[!NOTE] If you're new to Nextflow or nf-core, please refer to the nf-core installation guide. Run the workflow with -profile test first to confirm that it works before applying it to your actual data.

Prepare your sample metadata in a CSV format (samplesheet.csv) with FASTQ files:

```csv
sample,fastq_1,fastq_2
sample_1,sample_1_R1_001.fastq.gz,sample_1_R2_001.fastq.gz
sample_2,sample_2_R1_001.fastq.gz,sample_2_R2_001.fastq.gz
```

[!NOTE] If a sample has multiple sequencing lanes or replicates, list each replicate in a separate row with the same sample ID. The pipeline will automatically merge these reads before analysis. Spaces in sample IDs will be automatically converted to underscores (_).
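The grouping behavior described in the note can be illustrated with a minimal Python sketch. This is not the pipeline's own code: the function name `group_samplesheet` and the FASTQ file names are hypothetical, and only the column names (`sample`, `fastq_1`, `fastq_2`) come from the example above. It simply shows how rows sharing a sample ID would end up grouped together, with spaces in IDs replaced by underscores.

```python
import csv
import io
from collections import defaultdict

# Hypothetical samplesheet content; column names follow the README example.
SAMPLESHEET = """sample,fastq_1,fastq_2
sample 1,s1_L001_R1.fastq.gz,s1_L001_R2.fastq.gz
sample 1,s1_L002_R1.fastq.gz,s1_L002_R2.fastq.gz
sample_2,s2_R1.fastq.gz,s2_R2.fastq.gz
"""


def group_samplesheet(text):
    """Group FASTQ pairs by sample ID, replacing spaces in IDs with underscores."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        sample_id = row["sample"].strip().replace(" ", "_")
        groups[sample_id].append((row["fastq_1"], row["fastq_2"]))
    return dict(groups)


groups = group_samplesheet(SAMPLESHEET)
for sample_id, pairs in groups.items():
    # Samples with more than one pair are the ones that would be merged.
    print(sample_id, len(pairs))
```

A check like this can be handy before launching a run, to confirm that replicates intended for merging really do share the same sample ID.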

Run the workflow using the following command structure:

```bash
# For St. Jude HPC users, specify the institutional profile, e.g., '-profile stjude'
nextflow run stjudecab/rsvrecon \
    -profile <PROFILE> \
    --input samplesheet.csv \
    --outdir <OUTDIR>
```

[!WARNING] Pipeline parameters should only be provided via CLI arguments or Nextflow's -params-file option. Custom configuration files (-c) can specify system and pipeline configurations except for parameters. For detailed guidance, consult the configuration documentation.
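As an illustration of the `-params-file` route, the sketch below writes the two parameters used in the command above (`input` and `outdir`) to a JSON file and assembles the matching `nextflow` invocation as an argument list. The helper names, the `results` output directory, and the `docker` profile are assumptions chosen for the example; the command is only printed, not executed.

```python
import json
import shlex
import tempfile
from pathlib import Path


def write_params_file(path, input_csv, outdir):
    """Write pipeline parameters to a JSON file suitable for -params-file."""
    params = {"input": input_csv, "outdir": outdir}
    Path(path).write_text(json.dumps(params, indent=2) + "\n")
    return params


def build_command(params_file, profile):
    """Build the nextflow invocation as an argument list (not executed here)."""
    return [
        "nextflow", "run", "stjudecab/rsvrecon",
        "-profile", profile,
        "-params-file", str(params_file),
    ]


# Hypothetical values: "results" as the output directory, "docker" as the profile.
params_path = Path(tempfile.mkdtemp()) / "params.json"
params = write_params_file(params_path, "samplesheet.csv", "results")
command = build_command(params_path, "docker")
print(shlex.join(command))
```

Keeping parameters in a file rather than on the command line makes a run easier to repeat and to record alongside its results.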

For further information, please check the complete usage documentation and pipeline output descriptions.

Pipeline Output

Pipeline results are organized per sample ID within the specified output directory (<OUTDIR>), structured by analysis stages:

  • <OUTDIR>/<sample_id>/fastq: Contains trimmed FASTQ files ready for alignment.
  • <OUTDIR>/<sample_id>/bam: Includes aligned BAM files and index files (BAI).
  • <OUTDIR>/<sample_id>/qc: Quality control outputs for raw and aligned reads.
  • <OUTDIR>/<sample_id>/reference: Reference genomes and database files used for the analysis.
  • <OUTDIR>/<sample_id>/variant_calling: Variant calling and identification results.
  • <OUTDIR>/<sample_id>/phylogeny_tree: Results related to phylogenetic analyses.
  • <OUTDIR>/<sample_id>/assembly: Genome assemblies and coverage summaries.
  • (and more)

In addition to sample-level results, our pipeline delivers batch-level results as well, structured as follows:

  • <OUTDIR>/batch_qcs/: Contains MultiQC-based QC reports for both raw and filtered (trimmed) data.
  • <OUTDIR>/batch_reports/: Contains combined results across samples as clinical reports in both PDF and HTML formats.
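To get a quick overview of a finished run, the layout described above can be walked with a short script. The sketch below is an assumption-laden illustration: `summarize_outdir` is a hypothetical helper, it treats every top-level directory other than `batch_qcs` and `batch_reports` as a sample directory (a real output directory may hold other folders), and it runs against a mock tree rather than real pipeline results.

```python
import tempfile
from pathlib import Path

# Batch-level directories named in the README; everything else at the top
# level is assumed here to be a per-sample directory.
BATCH_DIRS = {"batch_qcs", "batch_reports"}


def summarize_outdir(outdir):
    """Map each sample ID to the sorted names of its stage subdirectories."""
    summary = {}
    for entry in sorted(Path(outdir).iterdir()):
        if entry.is_dir() and entry.name not in BATCH_DIRS:
            summary[entry.name] = sorted(
                child.name for child in entry.iterdir() if child.is_dir()
            )
    return summary


# Build a small mock output tree so the sketch runs without real results.
mock_outdir = Path(tempfile.mkdtemp())
for stage in ("fastq", "bam", "qc", "assembly"):
    (mock_outdir / "sample_1" / stage).mkdir(parents=True)
(mock_outdir / "batch_qcs").mkdir()
(mock_outdir / "batch_reports").mkdir()

summary = summarize_outdir(mock_outdir)
print(summary)
```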

Credits

stjudecab/rsvrecon was developed by Haidong Yi (@HaidYi) and Lei Li (@LeiLi-Uchicago) at the Center for Applied Bioinformatics (CAB), St. Jude Children's Research Hospital. The pipeline design incorporates community-driven best practices, drawing in particular on nf-core. We also thank the wider CAB team for their valuable input and feedback.


Contributions and Support

We encourage community contributions. Please adhere to nf-core guidelines for maintaining consistency. Suggestions and improvements are welcome through pull requests or issues on our GitHub repository.

Disclaimer

The pipeline logo was initially generated with ChatGPT's 4o image generation feature, using the pipeline introduction as the prompt.

Citations

```bibtex
@article{Li2025.rsvrecon,
  author = {Li, Lei and Yi, Haidong and Brazelton, Jessica N. and Webby, Richard and Hayden, Randall T. and Wu, Gang and Hijano, Diego R.},
  title = {Bridging Genomics and Clinical Medicine: RSVrecon Enhances RSV Surveillance with Automated Genotyping and Clinically-important Mutation Reporting},
  elocation-id = {2025.06.03.657184},
  year = {2025},
  doi = {10.1101/2025.06.03.657184},
  publisher = {Cold Spring Harbor Laboratory},
  URL = {https://www.biorxiv.org/content/early/2025/06/09/2025.06.03.657184},
  journal = {bioRxiv}
}
```

Owner

  • Name: St. Jude Center for Applied Bioinformatics (CAB)
  • Login: stjudecab
  • Kind: organization
  • Location: Memphis, TN

GitHub Events

Total
  • Release event: 1
  • Watch event: 1
  • Public event: 1
  • Push event: 3
  • Pull request event: 1
Last Year
  • Release event: 1
  • Watch event: 1
  • Public event: 1
  • Push event: 3
  • Pull request event: 1

Dependencies

.github/workflows/branch.yml actions
  • mshick/add-pr-comment b8f338c590a895d50bcbfa6c5859251edc8952fc composite
.github/workflows/ci.yml actions
  • actions/checkout 0ad4b8fadaa221de15dcec353f45205ec38ea70b composite
  • conda-incubator/setup-miniconda a4260408e20b96e80095f42ff7f1a15b27dd94ca composite
  • eWaterCycle/setup-apptainer main composite
  • jlumbroso/free-disk-space 54081f138730dfa15788a46383842cd2f914a1be composite
  • nf-core/setup-nextflow v2 composite
.github/workflows/clean-up.yml actions
  • actions/stale 28ca1036281a5e5922ead5184a1bbf96e5fc984e composite
.github/workflows/download_pipeline.yml actions
  • actions/setup-python 82c7e631bb3cdc910f68e0081d67478d79c6982d composite
  • eWaterCycle/setup-apptainer 4bb22c52d4f63406c49e94c804632975787312b3 composite
  • jlumbroso/free-disk-space 54081f138730dfa15788a46383842cd2f914a1be composite
  • nf-core/setup-nextflow v2 composite
.github/workflows/fix-linting.yml actions
  • actions/checkout 0ad4b8fadaa221de15dcec353f45205ec38ea70b composite
  • actions/setup-python 82c7e631bb3cdc910f68e0081d67478d79c6982d composite
  • peter-evans/create-or-update-comment 71345be0265236311c031f5c7866368bd1eff043 composite
.github/workflows/linting.yml actions
  • actions/checkout 0ad4b8fadaa221de15dcec353f45205ec38ea70b composite
  • actions/setup-python 82c7e631bb3cdc910f68e0081d67478d79c6982d composite
  • actions/upload-artifact 65462800fd760344b1a7b4382951275a0abb4808 composite
  • nf-core/setup-nextflow v2 composite
  • pietrobolcato/action-read-yaml 1.1.0 composite
.github/workflows/linting_comment.yml actions
  • dawidd6/action-download-artifact bf251b5aa9c2f7eeb574a96ee720e24f801b7c11 composite
  • marocchino/sticky-pull-request-comment 331f8f5b4215f0445d3c07b4967662a32a2d3e31 composite
.github/workflows/template_version_comment.yml actions
  • actions/checkout 0ad4b8fadaa221de15dcec353f45205ec38ea70b composite
  • mshick/add-pr-comment b8f338c590a895d50bcbfa6c5859251edc8952fc composite
  • nichmor/minimal-read-yaml v0.0.2 composite
modules/nf-core/blast/blastn/meta.yml cpan
modules/nf-core/blast/makeblastdb/meta.yml cpan
modules/nf-core/bwa/index/meta.yml cpan
modules/nf-core/bwa/mem/meta.yml cpan
modules/nf-core/cat/cat/meta.yml cpan
modules/nf-core/cat/fastq/meta.yml cpan
modules/nf-core/fastp/meta.yml cpan
modules/nf-core/fastqc/meta.yml cpan
modules/nf-core/fasttree/meta.yml cpan
modules/nf-core/gunzip/meta.yml cpan
modules/nf-core/kma/index/meta.yml cpan
modules/nf-core/mafft/align/meta.yml cpan
modules/nf-core/multiqc/meta.yml cpan
modules/nf-core/nextclade/datasetget/meta.yml cpan
modules/nf-core/nextclade/run/meta.yml cpan
modules/nf-core/samtools/flagstat/meta.yml cpan
modules/nf-core/samtools/idxstats/meta.yml cpan
modules/nf-core/samtools/index/meta.yml cpan
modules/nf-core/samtools/mpileup/meta.yml cpan
modules/nf-core/samtools/sort/meta.yml cpan
modules/nf-core/samtools/stats/meta.yml cpan
modules/nf-core/star/align/meta.yml cpan
modules/nf-core/star/genomegenerate/meta.yml cpan
modules/nf-core/untar/meta.yml cpan
subworkflows/nf-core/utils_nextflow_pipeline/meta.yml cpan
subworkflows/nf-core/utils_nfcore_pipeline/meta.yml cpan
subworkflows/nf-core/utils_nfschema_plugin/meta.yml cpan
modules/local/generate_report/environment.yml pypi
modules/local/visualize_tree/environment.yml pypi
modules/nf-core/blast/blastn/environment.yml pypi
modules/nf-core/blast/makeblastdb/environment.yml pypi
modules/nf-core/bwa/index/environment.yml pypi
modules/nf-core/bwa/mem/environment.yml pypi
modules/nf-core/cat/cat/environment.yml pypi
modules/nf-core/cat/fastq/environment.yml pypi
modules/nf-core/fastp/environment.yml pypi
modules/nf-core/fastqc/environment.yml pypi
modules/nf-core/fasttree/environment.yml pypi
modules/nf-core/gunzip/environment.yml pypi
modules/nf-core/kma/index/environment.yml pypi
modules/nf-core/mafft/align/environment.yml pypi
modules/nf-core/multiqc/environment.yml pypi
modules/nf-core/nextclade/datasetget/environment.yml pypi
modules/nf-core/nextclade/run/environment.yml pypi
modules/nf-core/samtools/flagstat/environment.yml pypi
modules/nf-core/samtools/idxstats/environment.yml pypi
modules/nf-core/samtools/index/environment.yml pypi
modules/nf-core/samtools/mpileup/environment.yml pypi
modules/nf-core/samtools/sort/environment.yml pypi
modules/nf-core/samtools/stats/environment.yml pypi
modules/nf-core/star/align/environment.yml pypi
modules/nf-core/star/genomegenerate/environment.yml pypi
modules/nf-core/untar/environment.yml pypi