https://github.com/czbiohub-sf/rnaseq

RNA sequencing analysis pipeline using STAR or HISAT2, with gene counts and quality control

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.8%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: czbiohub-sf
  • License: mit
  • Language: Nextflow
  • Default Branch: master
  • Homepage: http://nf-co.re
  • Size: 49.6 MB
Statistics
  • Stars: 3
  • Watchers: 2
  • Forks: 0
  • Open Issues: 12
  • Releases: 0
Archived Fork of kerimoff/rnaseq
Created about 7 years ago · Last pushed over 6 years ago

# ![czbiohub/rnaseq](docs/images/czbiohub-rnaseq_logo.png)

[![Build Status](https://travis-ci.org/czbiohub/rnaseq.svg?branch=master)](https://travis-ci.org/czbiohub/rnaseq)
[![Nextflow](https://img.shields.io/badge/nextflow-%E2%89%A50.32.0-brightgreen.svg)](https://www.nextflow.io/)
[![DOI](https://zenodo.org/badge/127293091.svg)](https://zenodo.org/badge/latestdoi/127293091)
[![Gitter](https://img.shields.io/badge/gitter-%20join%20chat%20%E2%86%92-4fb99a.svg)](https://gitter.im/czbiohub/Lobby)

[![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg)](http://bioconda.github.io/)
[![Docker Container available](https://img.shields.io/docker/automated/czbiohub/rnaseq.svg)](https://hub.docker.com/r/czbiohub/rnaseq/)
![Singularity Container available](https://img.shields.io/badge/singularity-available-7E4C74.svg)


### Introduction

**czbiohub/rnaseq** is a bioinformatics analysis pipeline used for RNA sequencing data.

The workflow processes raw data from FastQ inputs ([FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/), [fastp](https://github.com/OpenGene/fastp)), aligns the reads ([STAR](https://github.com/alexdobin/STAR) or [HISAT2](https://ccb.jhu.edu/software/hisat2/index.shtml)), generates gene counts ([htseq-count](https://htseq.readthedocs.io/en/release_0.11.1/count.html), [StringTie](https://ccb.jhu.edu/software/stringtie/)) and performs extensive quality control on the results ([RSeQC](http://rseqc.sourceforge.net/), [dupRadar](https://bioconductor.org/packages/release/bioc/html/dupRadar.html), [Preseq](http://smithlabresearch.org/software/preseq/), [edgeR](https://bioconductor.org/packages/release/bioc/html/edgeR.html), [MultiQC](http://multiqc.info/)). See the [output documentation](docs/output.md) for more details on the results.

The pipeline has also been extended to quantify expression at the transcript, exon, alternative splicing, and txrevise levels. See [optional quantification methods](docs/extra_phenotype_quantification.md) for details.

The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool that runs tasks across multiple compute infrastructures in a portable manner. It comes with Docker / Singularity containers, which simplify installation and make results highly reproducible.
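
As a rough illustration of what launching the pipeline with a container profile might look like, the sketch below assembles a `nextflow run` command and prints it for inspection. The `--reads` and `--genome` parameter names, the `docker` profile name, the FastQ glob, and the `GRCh37` genome key are all assumptions based on common nf-core conventions and are not confirmed for this fork; check [docs/usage.md](docs/usage.md) for the options the pipeline actually accepts.

```shell
# Hypothetical inputs: adjust to your own data and reference genome.
READS='data/*_R{1,2}.fastq.gz'   # assumed paired-end FastQ glob
GENOME='GRCh37'                  # assumed reference genome key

# Assemble the command. The pipeline parameters (--reads, --genome) and the
# profile name (docker) are assumptions; see docs/usage.md for the real ones.
# The glob is wrapped in single quotes so the shell passes it to Nextflow unexpanded.
CMD="nextflow run czbiohub/rnaseq -profile docker --reads '${READS}' --genome ${GENOME}"

# Print the command rather than running it, so it can be reviewed first.
echo "$CMD"
```

Once the printed command matches the documented usage, paste it into a terminal to start the run.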

### Documentation
The czbiohub/rnaseq pipeline comes with documentation about the pipeline, found in the `docs/` directory:

1. [Installation](docs/installation.md)
2. Pipeline configuration
    * [Local installation](docs/configuration/local.md)
    * [Amazon Web Services (aws)](docs/configuration/aws.md)
    * [Swedish UPPMAX clusters](docs/configuration/uppmax.md)
    * [Swedish C3SE Hebbe cluster](docs/configuration/c3se.md)
    * [Tübingen QBiC](docs/configuration/qbic.md)
    * [CCGA Kiel](docs/configuration/ccga.md)
    * [Adding your own system](docs/configuration/adding_your_own.md)
3. [Running the pipeline (Gene expression)](docs/usage.md)
4. [Running the pipeline (With additional quantification methods)](docs/extra_phenotype_quantification.md)
5. [Output and how to interpret the results](docs/output.md)
6. [Troubleshooting](docs/troubleshooting.md)

### General overview
The schema below shows the high-level structure of the pipeline.
# ![czbiohub/rnaseq](docs/images/pipeline_high_level_schema.svg)

### Credits
These scripts were originally written for use at the [National Genomics Infrastructure](https://portal.scilifelab.se/genomics/), part of [SciLifeLab](http://www.scilifelab.se/) in Stockholm, Sweden, by Phil Ewels ([@ewels](https://github.com/ewels)) and Rickard Hammarén ([@Hammarn](https://github.com/Hammarn)). They have since taken on a life of their own at Chan Zuckerberg Biohub, where they are maintained by Olga Botvinnik.

Many thanks to others who have helped out along the way too, including (but not limited to):
[@Galithil](https://github.com/Galithil),
[@pditommaso](https://github.com/pditommaso),
[@orzechoj](https://github.com/orzechoj),
[@apeltzer](https://github.com/apeltzer),
[@colindaven](https://github.com/colindaven).

Owner

  • Name: Chan Zuckerberg Biohub San Francisco
  • Login: czbiohub-sf
  • Kind: organization
  • Location: San Francisco
