https://github.com/a-slide/pycosnake
pycoSnake is a neatly wrapped collection of snakemake workflows for analysing nanopore and Illumina sequencing data
Statistics
- Stars: 8
- Watchers: 1
- Forks: 1
- Open Issues: 1
- Releases: 0
Metadata Files
README.md
pycoSnake v0.2.6

pycoSnake is a neatly wrapped collection of snakemake workflows for analysing Illumina and nanopore sequencing datasets. It is easy to install with conda and simple to run on a local computer or in a cluster environment.
Installation
Install conda package manager
Conda is the only dependency that you need to install the package.
All other packages and external programs needed for this pipeline are then handled automatically by conda.
Install conda following the official documentation for your system
https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html
Install the package in a conda environment
Recommended installation from Anaconda Cloud
You might not need all the extra channels depending on your conda configuration, but you do need to add my channel: aleg
```
conda create -y -n pycoSnake -c aleg -c anaconda -c bioconda -c conda-forge python=3.6 pycosnake
conda activate pycoSnake
```
To update the package
```
conda activate pycoSnake
conda update pycoSnake -c aleg
```
Local installation in develop mode
Clone pycoSnake repository to your local machine
```
git clone git@github.com:a-slide/pycoSnake.git
cd pycoSnake
```
Create a new conda environment
```
conda create -y -n pycoSnake python=3.6
conda activate pycoSnake
```
Install pycoSnake with pip in develop mode
pip install -e ./
pycoSnake workflows
At the moment, only two workflows are available in pycoSnake:
DNA_ONT : Analyse Basecalled Nanopore sequencing data.
- Download and cleanup genome
- Fastq merging and filtering
- Alignment with minimap2
- Alignment cleaning
- Generate coverage plots IGV and bedgraph (optional)
- Run Nanopore QC with pycoQC (optional)
- Run DNA methylation analysis with Nanopolish + pycoMeth (optional)
- Run SV analysis with Sniffles + filtering + multi-sample merging (optional)
RNA_illumina : Analyse Illumina long RNA-seq sequencing data.
- Download and cleanup genome, transcriptome and annotations
- Fastq filtering / control / pre-alignment quality control
- Genome alignment with STAR
- Summarize STAR counts for all samples
- Alignment cleaning
- Transcriptome pseudo-alignment and transcripts quantification with Salmon
- Summarize Salmon counts for all samples
- Generate coverage plots IGV and bedgraph (optional)
- Count reads per genes with featurecounts and summarize counts for all samples (optional)
- Calculation of FPKM with cufflinks and summarize FPKM for all samples (optional)
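For readers unfamiliar with the FPKM unit reported in the last step, the following is a minimal Python sketch of the textbook definition (fragments per kilobase of transcript per million mapped fragments). It is not how pycoSnake or cufflinks computes the value: cufflinks estimates abundances with its own statistical model and effective transcript lengths. The function name and the numbers are illustrative assumptions.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Textbook FPKM: fragments per kilobase of transcript per million mapped fragments."""
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length_bp and total_mapped_fragments must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2 kb transcript,
# in a library with 20 million mapped fragments
value = fpkm(500, 2000, 20_000_000)
print(value)  # 12.5
```

Because the value is normalised by both transcript length and library size, it can be compared across genes and samples more readily than raw counts.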
Configure a workflow
Generate the sample sheet and config template files required for the workflow you want to run.
```
conda activate pycoSnake
pycoSnake {WORKFLOW NAME} --generate_template config sample_sheet --overwrite_template

# Or for a cluster environment
pycoSnake {WORKFLOW NAME} --generate_template cluster_config sample_sheet --overwrite_template
```
The samples.tsv file needs to be filled with the required information detailed in the file header and passed to pycoSnake (--sample_sheet).
The config.yaml can be modified and passed to pycoSnake (--config). It is generally recommended to stick to the default parameters.
The cluster_config.yaml can be modified and passed to pycoSnake (--cluster_config). Use this file instead of config.yaml when executing the pipeline in a cluster environment. By default the file is set up for an LSF cluster, but it can be adapted to other HPC platforms.
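Before passing a filled sample sheet to pycoSnake, it can help to check that no required field was left empty. The sketch below is one possible way to do that in Python and is not part of pycoSnake. It assumes that the explanatory header lines of the template start with `#` and that the first remaining line holds the column names; the column names in the example (`sample_id`, `fastq`) are hypothetical.

```python
import csv
import io


def find_incomplete_rows(handle):
    """Return (line_number, missing_columns) for every row with empty fields."""
    # Assumption: header comments start with '#'. Blank lines are skipped too.
    kept = [
        (number, line)
        for number, line in enumerate(handle, start=1)
        if line.strip() and not line.startswith("#")
    ]
    if not kept:
        return []
    reader = csv.reader([line for _, line in kept], delimiter="\t")
    columns = next(reader)  # first non-comment line: column names
    problems = []
    for (number, _), row in zip(kept[1:], reader):
        # Pad short rows so that trailing missing fields are reported as well
        padded = row + [""] * (len(columns) - len(row))
        missing = [col for col, val in zip(columns, padded) if not val.strip()]
        if missing:
            problems.append((number, missing))
    return problems


# Hypothetical sample sheet content, for illustration only
example = io.StringIO(
    "# Hypothetical header comment\n"
    "sample_id\tfastq\n"
    "S1\tS1.fastq.gz\n"
    "S2\t\n"
)
problems = find_incomplete_rows(example)
print(problems)  # [(4, ['fastq'])]
```

With a real file, `open("samples.tsv", newline="")` can be passed in place of the `StringIO` object.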
Execute a workflow
Call pycoSnake and choose your workflow
```
conda activate pycoSnake
pycoSnake {WORKFLOW NAME} {OPTIONS}
```
Example for the DNA workflow
Usage on a local machine
```
conda activate pycoSnake
pycoSnake DNA_ONT -r ref.fa -s sample_sheet.tsv --config config.yaml --cores 10
```
Usage in an LSF cluster environment
Use the --cluster_config option instead of the config file.
The cluster_config file provided with pycoSnake is configured to work in an LSF cluster environment.
It contains the prototype `bsub` command used by Snakemake, `bsub -q {cluster.queue} -n {cluster.threads} -M {cluster.mem} -J {cluster.name} -oo {cluster.output} -eo {cluster.error}`, as well as the maximum number of cores and nodes to use.
```
conda activate pycoSnake
pycoSnake DNA_ONT -r ref.fa -s sample_sheet.tsv --cluster_config cluster_config.yaml
```
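To illustrate how the placeholders in the prototype `bsub` command quoted above end up as a concrete submission command, here is a minimal Python sketch. It is not pycoSnake or Snakemake code; Snakemake performs an equivalent substitution internally from the cluster configuration. The helper name and all values (queue, threads, memory, job name, log paths) are made-up assumptions.

```python
from types import SimpleNamespace

# Prototype command as quoted in the text above
BSUB_TEMPLATE = (
    "bsub -q {cluster.queue} -n {cluster.threads} -M {cluster.mem} "
    "-J {cluster.name} -oo {cluster.output} -eo {cluster.error}"
)


def render_bsub(template, **settings):
    """Fill the {cluster.*} placeholders with the settings of one job."""
    # str.format resolves "{cluster.queue}" as attribute access on `cluster`
    return template.format(cluster=SimpleNamespace(**settings))


# Hypothetical per-job values, for illustration only
command = render_bsub(
    BSUB_TEMPLATE,
    queue="research",
    threads=4,
    mem=8000,
    name="minimap2_align",
    output="logs/minimap2_align.out",
    error="logs/minimap2_align.err",
)
print(command)
```

Adapting the file to another scheduler would mean replacing the `bsub` prototype with that scheduler's submission command while keeping the same kind of placeholders.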
Wrapper library
This repository contains snakemake wrappers for pycoSnake.
Using the wrappers library
Wrappers can also be used outside of pycoSnake in any Snakemake file by using the following option:
--wrapper-prefix https://raw.githubusercontent.com/a-slide/pycoSnake/master/pycoSnake/wrappers/
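As a rough illustration of what this option does, Snakemake prepends the prefix to the value given in a rule's `wrapper:` directive and then looks for the wrapper script under that location. The Python sketch below only reproduces that string logic. It assumes the usual Snakemake layout of one directory per wrapper containing a `wrapper.py` file, and it uses `get_annotation` as an example wrapper name; neither has been checked against the repository contents.

```python
WRAPPER_PREFIX = (
    "https://raw.githubusercontent.com/a-slide/pycoSnake/master/pycoSnake/wrappers/"
)


def wrapper_script_url(wrapper_name, prefix=WRAPPER_PREFIX):
    """Build the URL where a Python wrapper script would be looked up."""
    # Assumption: each wrapper lives in its own directory holding wrapper.py
    base = prefix if prefix.endswith("/") else prefix + "/"
    return base + wrapper_name.strip("/") + "/wrapper.py"


url = wrapper_script_url("get_annotation")
print(url)
```

In a Snakefile, the corresponding rule would then only need the short wrapper name in its `wrapper:` directive.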
Testing Wrappers
The package contains test data and integrated tests for all the wrappers.
```
# Test all the wrappers
pycoSnake test_wrappers --cores 8

# Test individual wrappers
pycoSnake test_wrappers --wrappers get_annotation star_count_merge pycometh_comp_report --cores 8

# Keep the test output files generated by the wrappers
pycoSnake test_wrappers --keep_output --cores 8
```
Command line help message
```
usage: pycoSnake [-h] [--version] {test_wrappers,DNA_ONT,RNA_illumina} ...

pycoSnake is a neatly wrapped collection of snakemake workflows for analysing nanopore sequencing data

optional arguments:
  -h, --help  show this help message and exit
  --version   show program's version number and exit

subcommands:
  pycoSnake implements the following subcommands

  {test_wrappers,DNA_ONT,RNA_illumina}
```
Classifiers
- Development Status :: 3 - Alpha
- Intended Audience :: Science/Research
- Topic :: Scientific/Engineering :: Bio-Informatics
- License :: OSI Approved :: MIT License
- Programming Language :: Python :: 3
citation
Adrien Leger. a-slide/pycoSnake: (2020). doi:10.5281/zenodo.4110611
licence
MIT
Copyright © 2020 Adrien Leger
Authors
- Adrien Leger / aleg@ebi.ac.uk / https://adrienleger.com
Owner
- Name: Adrien Leger
- Login: a-slide
- Kind: user
- Location: Oxford, UK
- Company: @nanoporetech
- Website: https://adrienleger.com/
- Twitter: AdrienLeger2
- Repositories: 50
- Profile: https://github.com/a-slide
Research scientist at Oxford Nanopore Technologies
Issues and Pull Requests
Last synced: over 1 year ago
All Time
- Total issues: 2
- Total pull requests: 88
- Average time to close issues: 7 months
- Average time to close pull requests: 6 minutes
- Total issue authors: 2
- Total pull request authors: 1
- Average comments per issue: 1.5
- Average comments per pull request: 0.0
- Merged pull requests: 88
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- BenjaminSchwessinger (1)
- a-slide (1)
Pull Request Authors
- a-slide (58)