https://github.com/a-slide/pycosnake

pycoSnake is a neatly wrapped collection of snakemake workflows for analysing nanopore and Illumina sequencing data

Repository

Basic Info
  • Host: GitHub
  • Owner: a-slide
  • License: mit
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 94.8 MB
Statistics
  • Stars: 8
  • Watchers: 1
  • Forks: 1
  • Open Issues: 1
  • Releases: 0
Archived
Created over 7 years ago · Last pushed over 5 years ago
Metadata Files
Readme Contributing License Code of conduct

README.md

pycoSnake v0.2.6

pycoSnake is a neatly wrapped collection of snakemake workflows for analysing Illumina and nanopore sequencing datasets. It is easy to install with conda and simple to run on a local computer or in a cluster environment.


Installation

Install conda package manager

Conda is the only dependency that you need to install the package.

All the other packages and external programs needed for this pipeline will then be handled automatically by conda itself.

Install conda following the official documentation for your system

https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html

Install the package in a conda environment

Recommended installation from Anaconda Cloud

You might not need all the extra channels depending on your conda configuration, but you do need to add my channel: aleg

```
conda create -y -n pycoSnake -c aleg -c anaconda -c bioconda -c conda-forge python=3.6 pycosnake

conda activate pycoSnake
```

To update the package

```
conda activate pycoSnake

conda update pycoSnake -c aleg
```

Local installation in develop mode

Clone pycoSnake repository to your local machine

```
git clone git@github.com:a-slide/pycoSnake.git

cd pycoSnake
```

Create a new conda environment

```
conda create -y -n pycoSnake python=3.6

conda activate pycoSnake
```

Install pycoSnake with pip in develop mode

pip install -e ./

pycoSnake workflows

At the moment there are only 2 workflows available in pycoSnake:

  • DNA_ONT : Analyse Basecalled Nanopore sequencing data.

    • Download and cleanup genome
    • Fastq merging and filtering
    • Alignment with minimap2
    • Alignment cleaning
    • Generate coverage plots IGV and bedgraph (optional)
    • Run Nanopore QC with pycoQC (optional)
    • Run DNA methylation analysis with Nanopolish + pycoMeth (optional)
    • Run SV analysis with Sniffles + filtering + multi-sample merging (optional)
  • RNA_illumina : Analyse Illumina long RNA-seq sequencing data.

    • Download and cleanup genome, transcriptome and annotations
    • Fastq filtering / control / pre-alignment quality control
    • Genome alignment with STAR
    • Summarize STAR counts for all samples
    • Alignment cleaning
    • Transcriptome pseudo-alignment and transcripts quantification with Salmon
    • Summarize Salmon counts for all samples
    • Generate coverage plots IGV and bedgraph (optional)
    • Count reads per genes with featurecounts and summarize counts for all samples (optional)
    • Calculation of FPKM with cufflinks and summarize FPKM for all samples (optional)

Configure a workflow

Generate the sample sheet and config template files required for the workflow you want to run.

```
conda activate pycoSnake

pycoSnake {WORKFLOW NAME} --generate_template config sample_sheet --overwrite_template

# Or for a cluster environment

pycoSnake {WORKFLOW NAME} --generate_template cluster_config sample_sheet --overwrite_template
```

The samples.tsv file needs to be filled with the required information detailed in the file header and passed to pycoSnake (--sample_sheet).
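Before launching a run it can help to check that the sample sheet has no missing fields. The sketch below is not part of pycoSnake. It assumes that explanatory header lines start with `#` and that the first remaining line holds the column names. The column names `sample_id` and `fastq` are hypothetical placeholders; the real ones are listed in the generated template.

```python
import csv
import io


def check_sample_sheet(handle):
    """Return (columns, rows); raise ValueError on an empty or missing field."""
    # Assumption: explanatory header lines start with '#'.
    lines = [ln for ln in handle if ln.strip() and not ln.startswith("#")]
    reader = csv.reader(lines, delimiter="\t")
    columns = next(reader)  # first remaining line = column names
    rows = []
    for n, row in enumerate(reader, start=1):
        if len(row) != len(columns) or any(not field.strip() for field in row):
            raise ValueError(f"sample row {n} is incomplete: {row}")
        rows.append(dict(zip(columns, row)))
    return columns, rows


# Hypothetical content, only to exercise the check.
EXAMPLE = (
    "# fill one line per sample\n"
    "sample_id\tfastq\n"
    "S1\tdata/S1.fastq.gz\n"
    "S2\tdata/S2.fastq.gz\n"
)
columns, rows = check_sample_sheet(io.StringIO(EXAMPLE))
print(columns, len(rows))
```

To check a real file, pass an open file handle instead of the `io.StringIO` object.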

The config.yaml can be modified and passed to pycoSnake (--config). It is generally recommended to stick to the default parameters.

The cluster_config.yaml can be modified and passed to pycoSnake (--cluster_config). Use the file instead of config.yaml if you are executing the pipeline in a cluster environment. By default the file is for an LSF cluster, but it can be modified for other HPC platforms.

Execute a workflow

Call pycoSnake and choose your workflow

```
conda activate pycoSnake

pycoSnake {WORKFLOW NAME} {OPTIONS}
```

Example for the DNA workflow

Usage on a local machine

```
conda activate pycoSnake

pycoSnake DNA_ONT -r ref.fa -s sample_sheet.tsv --config config.yaml --cores 10
```

Usage in an LSF cluster environment

Use the cluster_config option instead of the config file. The cluster_config file provided with pycoSnake is configured to work in an LSF cluster environment. It contains the "prototype bsub command" to be used by Snakemake, bsub -q {cluster.queue} -n {cluster.threads} -M {cluster.mem} -J {cluster.name} -oo {cluster.output} -eo {cluster.error}, as well as the maximal number of cores and nodes to use.

```
conda activate pycoSnake

pycoSnake DNA_ONT -r ref.fa -s sample_sheet.tsv --cluster_config cluster_config.yaml
```
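To make the prototype bsub command quoted above more concrete, the following sketch shows how its `{cluster.*}` placeholders could be filled for a single job. It only illustrates the substitution with Python's standard `str.format` and is not Snakemake's own code. The job values (queue name, memory, log paths) are hypothetical; in a real run they would come from the cluster configuration file.

```python
from types import SimpleNamespace

# Prototype command as quoted in this README.
BSUB_TEMPLATE = (
    "bsub -q {cluster.queue} -n {cluster.threads} -M {cluster.mem} "
    "-J {cluster.name} -oo {cluster.output} -eo {cluster.error}"
)


def render_bsub(template, **settings):
    """Fill the {cluster.*} placeholders with the settings of one job."""
    # str.format resolves "{cluster.queue}" as attribute access on `cluster`.
    return template.format(cluster=SimpleNamespace(**settings))


# Hypothetical per-job values, for illustration only.
example = render_bsub(
    BSUB_TEMPLATE,
    queue="standard",
    threads=4,
    mem=8000,
    name="minimap2_align",
    output="logs/minimap2_align.out",
    error="logs/minimap2_align.err",
)
print(example)
```

Changing the template string is where one would adapt the submission command to a scheduler other than LSF.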

Wrapper library

This repository contains snakemake wrappers for pycoSnake.

Using the wrappers library

Wrappers can also be used outside of pycoSnake in any Snakemake file by using the following option:

--wrapper-prefix https://raw.githubusercontent.com/a-slide/pycoSnake/master/pycoSnake/wrappers/
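As a rough illustration of what the prefix does, the sketch below joins it with a wrapper name to form the URL of the wrapper script. It assumes that each wrapper lives in its own directory containing a `wrapper.py` file, which is the usual Snakemake wrapper layout. The wrapper name `get_annotation` is used as an example only, and the directory layout of this repository is an assumption rather than something stated here.

```python
WRAPPER_PREFIX = (
    "https://raw.githubusercontent.com/a-slide/pycoSnake/master/pycoSnake/wrappers/"
)


def wrapper_script_url(prefix, wrapper_name, script="wrapper.py"):
    """Join prefix, wrapper directory and script name into one URL."""
    # Assumption: one directory per wrapper, holding a script named wrapper.py.
    return prefix.rstrip("/") + "/" + wrapper_name.strip("/") + "/" + script


# Example wrapper name; check the repository for the names that actually exist.
url = wrapper_script_url(WRAPPER_PREFIX, "get_annotation")
print(url)
```

In a Snakefile, the same name would be given to the `wrapper:` directive of a rule, with the prefix supplied on the command line.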

Testing Wrappers

The package contains test data and integrated tests for all the wrappers.

```
# Test all the wrappers

pycoSnake test_wrappers --cores 8

# Test individual wrappers

pycoSnake test_wrappers --wrappers get_annotation star_count_merge pycometh_comp_report --cores 8

# Keep the test output files generated by the wrappers

pycoSnake test_wrappers --keep_output --cores 8
```

Command line help message

```
usage: pycoSnake [-h] [--version] {test_wrappers,DNA_ONT,RNA_illumina} ...

pycoSnake is a neatly wrapped collection of snakemake workflows for analysing nanopore sequencing data

optional arguments:
  -h, --help  show this help message and exit
  --version   show program's version number and exit

subcommands:
  pycoSnake implements the following subcommands

  {test_wrappers,DNA_ONT,RNA_illumina}
```

Classifiers

  • Development Status :: 3 - Alpha
  • Intended Audience :: Science/Research
  • Topic :: Scientific/Engineering :: Bio-Informatics
  • License :: OSI Approved :: MIT License
  • Programming Language :: Python :: 3

citation

Adrien Leger. a-slide/pycoSnake: (2020). doi:10.5281/zenodo.4110611

licence

MIT

Copyright © 2020 Adrien Leger

Authors

  • Adrien Leger / aleg@ebi.ac.uk / https://adrienleger.com

Owner

  • Name: Adrien Leger
  • Login: a-slide
  • Kind: user
  • Location: Oxford, UK
  • Company: @nanoporetech

Research scientist at Oxford Nanopore Technologies

Issues and Pull Requests

Last synced: over 1 year ago

All Time
  • Total issues: 2
  • Total pull requests: 88
  • Average time to close issues: 7 months
  • Average time to close pull requests: 6 minutes
  • Total issue authors: 2
  • Total pull request authors: 1
  • Average comments per issue: 1.5
  • Average comments per pull request: 0.0
  • Merged pull requests: 88
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • BenjaminSchwessinger (1)
  • a-slide (1)
Pull Request Authors
  • a-slide (58)