https://github.com/charlesfoster/mitowrap

A snakemake pipeline wrapping MitoZ and getOrganelle for de novo mitogenome assembly using short reads and subsequent QC.

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
○ codemeta.json file
○ .zenodo.json file
○ DOI references
○ Academic publication links
✓ Committers with academic emails: 1 of 2 committers (50.0%) from academic institutions
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (13.7%) to scientific vocabulary

Keywords

genome-assembly mitochondria mitochondrial-genome-assembly mitogenome snakemake

Last synced: 9 months ago

Repository

Basic Info

Host: GitHub
Owner: charlesfoster
License: mit
Language: Python
Default Branch: main
Homepage:
Size: 10.7 KB

Statistics

Stars: 0
Watchers: 1
Forks: 1
Open Issues: 0
Releases: 0

Created over 3 years ago · Last pushed over 3 years ago

Metadata Files

Readme License

mitowrap

A snakemake pipeline wrapping MitoZ and getOrganelle for de novo mitogenome assembly using short reads and subsequent QC.

Installation

Firstly, clone this repository:

```
git clone https://github.com/charlesfoster/mitowrap.git
cd mitowrap
```

Create a new conda environment:

```
# install mamba if not already installed
conda install -c conda-forge mamba
mamba env create -f environment.yml
```

If you are using Linux and want to run programs using Singularity containers (see sections below), you will also need to take the following step:

conda activate mitowrap
mamba install -c conda-forge singularity==3.7.1

Usage

Each time you wish to run the program, make sure to activate the conda environment first:

conda activate mitowrap

Then, update the paths and parameters in config.yaml as you see fit. At minimum, you will likely want to update:

reference: the path to a file with reference mitochondrial genome(s) in fasta format. Used to extract mitochondrial reads from your input data.
suffix: the suffix in your reads filenames used to determine sample names properly. For example, if your forward and reverse reads are named "Sample1_R1_001.fastq.gz" and "Sample1_R2_001.fastq.gz", respectively, then a suffix of "_R1_001.fastq.gz" will allow snakemake to recognise the sample name as "Sample1".
reads_dir: the path to the directory containing all sequencing reads to be used in the analysis
outdir: the path to the directory where you want your results to be saved
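
To illustrate how a suffix can map read filenames to sample names, here is a minimal Python sketch. It is not the pipeline's actual code: the helper name `sample_names` and the example filenames (which follow a common Illumina-style naming pattern) are assumptions made for illustration.

```python
from pathlib import Path


def sample_names(filenames, suffix):
    """Return sorted sample names for files whose name ends with `suffix`.

    Hypothetical helper: strips `suffix` from each matching file name and
    ignores files (e.g. reverse reads) that do not end with it.
    """
    names = set()
    for filename in filenames:
        base = Path(filename).name
        if suffix and base.endswith(suffix):
            names.add(base[: -len(suffix)])
    return sorted(names)


# Example file names are made up for illustration.
reads = [
    "reads/Sample1_R1_001.fastq.gz",
    "reads/Sample1_R2_001.fastq.gz",
    "reads/Sample2_R1_001.fastq.gz",
    "reads/Sample2_R2_001.fastq.gz",
]
print(sample_names(reads, "_R1_001.fastq.gz"))
```

With these made-up filenames, only the forward-read files match the suffix, so each sample is listed once.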

Minimal instructions to run:

snakemake -j 16

Note 1: Update the value after -j to reflect how many threads/cores you would like to allow snakemake to use.

Note 2: This method assumes all dependencies are installed correctly in your path. An easier option is to let the snakemake pipeline take care of all software dependencies for you by creating internal conda environments or using containers with Singularity.

Note 3: This method assumes all paths etc. for your run are defined in config.yaml. You can choose to have several config files instead, each with a different name. In that case, you will always need to pass the relevant config file to snakemake, e.g. `--configfile otherconfig.yaml`.

Using conda

Running the pipeline with conda is currently more stable than with Singularity, and is supported on both macOS and Linux. Usage:

snakemake -j 16 --configfile config.yaml --use-conda

Using singularity

Note: I am currently experiencing some issues with the getOrganelle container. Try using conda instead.

Running the pipeline with Singularity is only supported on Linux. Usage:

snakemake -j 16 --use-singularity

Sometimes if you have necessary data stored on mounted drives, you need to tell Singularity to bind those drives. Example:

snakemake -j 16 --use-singularity --singularity-args '--bind /home:/home,/data:/data'
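
If you launch runs from a script, the commands above can be assembled programmatically. The following is a rough Python sketch, not part of mitowrap: the function `build_snakemake_command` and its parameters are hypothetical, and it only uses the snakemake flags already shown in this README.

```python
import shlex


def build_snakemake_command(cores, configfile=None, use_conda=False,
                            use_singularity=False, bind_paths=(),
                            print_shell_cmds=False):
    """Assemble a snakemake command line as a list of arguments (hypothetical helper)."""
    cmd = ["snakemake", "-j", str(cores)]
    if configfile:
        cmd += ["--configfile", configfile]
    if use_conda:
        cmd.append("--use-conda")
    if use_singularity:
        cmd.append("--use-singularity")
        if bind_paths:
            # Singularity takes comma-separated src:dest pairs after --bind.
            binds = ",".join(f"{src}:{dest}" for src, dest in bind_paths)
            cmd += ["--singularity-args", f"--bind {binds}"]
    if print_shell_cmds:
        cmd.append("--printshellcmds")
    return cmd


cmd = build_snakemake_command(
    16,
    use_singularity=True,
    bind_paths=[("/home", "/home"), ("/data", "/data")],
)
# shlex.join quotes the --bind argument so the line can be pasted into a shell.
print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with the `--singularity-args` value.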

Other snakemake options

There are plenty of other options available within snakemake that I haven't touched on here. For example, you can save an image of the directed acyclic graph (DAG) describing the workflow:

snakemake -c 1 --dag | dot -Tpdf > dag.pdf

Note: this requires the external dot program, which is part of Graphviz.

You can also choose to print all commands being run by snakemake during the workflow by appending one simple flag to your command:

snakemake -j 16 --use-conda --printshellcmds

For all other options, look at snakemake -h or the snakemake website.

What does the pipeline do?

For each sample:

Raw reads are quality/adapter trimmed with fastp
Clean reads are mapped to the specified reference mitochondrial genome, and mitochondrial reads are extracted
Reads are de novo assembled into mitogenomes using (a) MitoZ, and (b) getOrganelle (then annotated using MitoZ)
QC metrics are calculated for each sample, and saved in an overall file
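
As an illustration of the last step, here is a minimal Python sketch of how per-sample QC metrics could be collated into one overall table. It is not the pipeline's implementation: the function name `collate_qc`, the metric names, and the values are all hypothetical, and the real output format may differ.

```python
import csv
import io


def collate_qc(per_sample_metrics):
    """Merge {sample: {metric: value}} into CSV text, one row per sample."""
    # Use the union of all metric names so samples with missing metrics still fit.
    columns = sorted({m for metrics in per_sample_metrics.values() for m in metrics})
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["sample"] + columns,
        restval="NA",  # written when a sample lacks a metric
        lineterminator="\n",
    )
    writer.writeheader()
    for sample in sorted(per_sample_metrics):
        writer.writerow({"sample": sample, **per_sample_metrics[sample]})
    return buffer.getvalue()


# Metric names and values below are invented for illustration only.
qc = {
    "Sample2": {"assembly_length": 16480},
    "Sample1": {"assembly_length": 16500, "mean_depth": 250.5},
}
print(collate_qc(qc), end="")
```

Writing missing values as "NA" keeps every row the same width, so the overall file stays easy to load into a spreadsheet or data frame.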

What do you need to run the program?

Check the config file. At minimum:

* Paired-end short sequencing reads for one or more samples
* File with mitochondrial genomes in fasta format
* An up to date config.yaml file
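
A quick pre-flight check can catch an incomplete config before a run starts. The sketch below is an assumption-laden example rather than part of mitowrap: the helper `missing_config_keys` is hypothetical, it only checks the four config keys named in the Usage section, and the example values are made up. In practice the dictionary might come from parsing config.yaml with a YAML library.

```python
# Keys described in the Usage section of this README.
REQUIRED_KEYS = ("reference", "suffix", "reads_dir", "outdir")


def missing_config_keys(config):
    """Return the required keys that are absent or empty in `config`."""
    return [key for key in REQUIRED_KEYS if not config.get(key)]


# Example values are invented; reads_dir is deliberately left empty.
config = {
    "reference": "refs/mito_refs.fasta",
    "suffix": "_R1_001.fastq.gz",
    "reads_dir": "",
    "outdir": "results",
}
problems = missing_config_keys(config)
if problems:
    print("Please set these config values:", ", ".join(problems))
```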

Citations

If you use this program and find it useful, I'd appreciate some kind of attribution, such as a link to this GitHub repo. Please also cite the programs used within this pipeline.

Owner

Login: charlesfoster
Kind: user

Repositories: 2
Profile: https://github.com/charlesfoster

GitHub Events

Total

Watch event: 1

Last Year

Watch event: 1

Committers

Last synced: over 2 years ago

All Time

Total Commits: 4
Total Committers: 2
Avg Commits per committer: 2.0
Development Distribution Score (DDS): 0.25

Past Year

Commits: 0
Committers: 0
Avg Commits per committer: 0.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
Charles Foster	c**s@p**n	3
charlesfoster	c**r@u**u	1

Committer Domains (Top 20 + Academic)

unsw.edu.au: 1
pop-os.localdomain: 1

Issues and Pull Requests

Last synced: over 2 years ago

All Time

Total issues: 0
Total pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Total issue authors: 0
Total pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0
