https://github.com/beardymcjohnface/trimnami

Trim lots of metagenomics samples all at once

Repository

Basic Info
  • Host: GitHub
  • Owner: beardymcjohnface
  • Language: Python
  • Default Branch: main
  • Size: 36.6 MB
Statistics
  • Stars: 10
  • Watchers: 3
  • Forks: 1
  • Open Issues: 2
  • Releases: 13
Created about 3 years ago · Last pushed about 2 years ago

README.md


Trim lots of metagenomics samples all at once.

Motivation

We keep writing pipelines that start with read trimming. Rather than copy-pasting code each time, this standalone Snaketool handles our trimming needs. The tool will collect sample names and files from a directory or TSV file, optionally remove host reads, and trim with your favourite read trimmer. Read trimming methods supported so far:

  • Fastp
  • Prinseq++
  • BBtools for Round A/B viral metagenomics
  • Filtlong + Rasusa for longreads

Install

Trimnami is still in development but can be easily installed with pip:

Easy install

```shell
pip install trimnami
```

Developer install

```shell
git clone https://github.com/beardymcjohnface/Trimnami.git
cd Trimnami/
pip install -e .
```

Test

Trimnami comes with inbuilt tests which you can run to check everything works fine.

```shell
# test fastp only (default method)
trimnami test

# test all SR methods
trimnami test fastp prinseq roundAB

# test all SR methods with host removal
trimnami testhost fastp prinseq roundAB

# test nanopore method (with host removal)
trimnami testnp
```

Usage

Trim reads with Fastp or Prinseq++

```shell
# Fastp (default)
trimnami run --reads reads/

# Prinseq++
trimnami run --reads reads/ prinseq

# Why not both!
trimnami run --reads reads/ fastp prinseq
```

Include host removal

```shell
trimnami run --reads reads/ --host host_genome.fasta
```

Longreads with host removal. Specify 'nanopore' for targets and use the appropriate minimap preset.

```shell
trimnami run \
    --reads reads/ \
    --host host_genome.fasta \
    --minimap map-ont \
    nanopore
```

Parsing samples with --reads

You can pass either a directory of reads or a TSV file to `--reads`.

  • Directory: Trimnami will infer sample names and `_R1`/`_R2` pairs from the filenames.
  • TSV file: Trimnami expects 2 or 3 columns, with column 1 being the sample name and columns 2 and 3 the reads files.
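If you would rather hand Trimnami an explicit TSV than rely on filename inference, a sample sheet in the layout described above can be generated with a few lines of Python. This is a minimal sketch, not Trimnami's own parser: the function names `collect_samples` and `write_samples_tsv` and the filename patterns are assumptions made for illustration, and your files may follow a different naming scheme.

```python
import csv
import re
from pathlib import Path

# Assumed naming for this sketch only: "<sample>_R1.fastq.gz" / "<sample>_R2.fastq.gz"
# for pairs, and "<sample>.fastq.gz" for single-end files.
PAIRED_PATTERN = re.compile(r"^(?P<sample>.+?)_R(?P<mate>[12])(?:_\d+)?\.f(?:ast)?q(?:\.gz)?$")
SINGLE_PATTERN = re.compile(r"^(?P<sample>.+?)\.f(?:ast)?q(?:\.gz)?$")


def collect_samples(reads_dir):
    """Map each sample name to its read files (one file if single-end, two if paired)."""
    samples = {}
    for path in sorted(Path(reads_dir).iterdir()):
        if not path.is_file():
            continue
        paired = PAIRED_PATTERN.match(path.name)
        single = SINGLE_PATTERN.match(path.name)
        if paired:
            samples.setdefault(paired["sample"], {})[paired["mate"]] = str(path)
        elif single:
            samples.setdefault(single["sample"], {})["1"] = str(path)
    # Order the files so that R1 always comes before R2.
    return {name: [files[k] for k in sorted(files)] for name, files in samples.items()}


def write_samples_tsv(samples, tsv_path):
    """Write one row per sample: name, then one or two read files, tab-separated."""
    with open(tsv_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for name in sorted(samples):
            writer.writerow([name, *samples[name]])
```

Calling `write_samples_tsv(collect_samples("reads/"), "samples.tsv")` would produce a file that can then be passed as `--reads samples.tsv`.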

More information and examples here

Configure trimming parameters

You can customise the trimming parameters via the config file. Copy the default config file.

```shell
trimnami config
```

Then edit the config file `trimnami.out/trimnami.config.yaml` in your favourite text editor. Run Trimnami as normal, or point it to your custom config file if you've moved it.

```shell
trimnami run ... --configfile /my/awesome/config.yaml
```
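When Trimnami is launched from another script, it can help to assemble the command in one place. The sketch below only builds an argument list from the options shown in this README (`--reads`, `--host`, `--minimap`, `--configfile`, and the trailing targets). The helper name `build_trimnami_command` is hypothetical and not part of Trimnami, and the sketch does not run anything itself.

```python
import shlex


def build_trimnami_command(reads, targets=(), host=None, minimap=None, configfile=None):
    """Assemble a `trimnami run` argument list using only options shown in this README."""
    command = ["trimnami", "run", "--reads", str(reads)]
    if host is not None:
        command += ["--host", str(host)]
    if minimap is not None:
        command += ["--minimap", str(minimap)]
    if configfile is not None:
        command += ["--configfile", str(configfile)]
    # Targets go last, e.g. fastp, prinseq, or nanopore.
    command += list(targets)
    return command


# Long reads with host removal, mirroring the example in the Usage section.
cmd = build_trimnami_command(
    "reads/",
    targets=["nanopore"],
    host="host_genome.fasta",
    minimap="map-ont",
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where Trimnami is installed.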

Outputs

Trimmed reads will be saved in subfolders of the output directory, one per trimming method. For example, if trimming with Fastp or Prinseq++, trimmed reads will be in `trimnami.out/fastp/` or `trimnami.out/prinseq/`. Paired reads will yield three files: the R1 and R2 paired reads, and any singletons from trimming or host removal. Subsampling will produce extra files of subsampled trimmed reads. MultiQC-FastQC reports for any runs will be available in `trimnami.out/reports/`.

Example outputs

prinseq

```text
trimnami.out/
└── prinseq
    ├── A13-04-182-06_TAGCTT.paired.R1.fastq.gz
    ├── A13-04-182-06_TAGCTT.paired.R2.fastq.gz
    ├── A13-04-182-06_TAGCTT.paired.S.fastq.gz
    ├── A13-12-250-06_GGCTAC.paired.R1.fastq.gz
    ├── A13-12-250-06_GGCTAC.paired.R2.fastq.gz
    ├── A13-12-250-06_GGCTAC.paired.S.fastq.gz
    └── A13-135-177-06_AGTTCC.single.fastq.gz
```

prinseq with fastqc reports

```text
trimnami.out/
├── prinseq
│   ├── A13-04-182-06_TAGCTT.paired.R1.fastq.gz
│   ├── A13-04-182-06_TAGCTT.paired.R2.fastq.gz
│   ├── A13-04-182-06_TAGCTT.paired.S.fastq.gz
│   ├── A13-12-250-06_GGCTAC.paired.R1.fastq.gz
│   ├── A13-12-250-06_GGCTAC.paired.R2.fastq.gz
│   ├── A13-12-250-06_GGCTAC.paired.S.fastq.gz
│   └── A13-135-177-06_AGTTCC.single.fastq.gz
└── reports
    ├── prinseq.fastqc.html
    └── untrimmed.fastqc.html
```

prinseq with host removal

```text
trimnami.out/
└── prinseq
    ├── A13-04-182-06_TAGCTT.host_rm.paired.R1.fastq.gz
    ├── A13-04-182-06_TAGCTT.host_rm.paired.R2.fastq.gz
    ├── A13-04-182-06_TAGCTT.host_rm.paired.S.fastq.gz
    ├── A13-12-250-06_GGCTAC.host_rm.paired.R1.fastq.gz
    ├── A13-12-250-06_GGCTAC.host_rm.paired.R2.fastq.gz
    ├── A13-12-250-06_GGCTAC.host_rm.paired.S.fastq.gz
    └── A13-135-177-06_AGTTCC.host_rm.single.fastq.gz
```

prinseq with host removal and subsampling

```text
trimnami.out/
└── prinseq
    ├── A13-04-182-06_TAGCTT.host_rm.paired.R1.fastq.gz
    ├── A13-04-182-06_TAGCTT.host_rm.paired.R1.subsampled.fastq.gz
    ├── A13-04-182-06_TAGCTT.host_rm.paired.R2.fastq.gz
    ├── A13-04-182-06_TAGCTT.host_rm.paired.R2.subsampled.fastq.gz
    ├── A13-04-182-06_TAGCTT.host_rm.paired.S.fastq.gz
    ├── A13-04-182-06_TAGCTT.host_rm.paired.S.subsampled.fastq.gz
    ├── A13-12-250-06_GGCTAC.host_rm.paired.R1.fastq.gz
    ├── A13-12-250-06_GGCTAC.host_rm.paired.R1.subsampled.fastq.gz
    ├── A13-12-250-06_GGCTAC.host_rm.paired.R2.fastq.gz
    ├── A13-12-250-06_GGCTAC.host_rm.paired.R2.subsampled.fastq.gz
    ├── A13-12-250-06_GGCTAC.host_rm.paired.S.fastq.gz
    ├── A13-12-250-06_GGCTAC.host_rm.paired.S.subsampled.fastq.gz
    ├── A13-135-177-06_AGTTCC.host_rm.single.fastq.gz
    └── A13-135-177-06_AGTTCC.host_rm.single.subsampled.fastq.gz
```
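Downstream pipelines usually need to map these files back to samples. The following sketch groups output filenames by sample, assuming the naming seen in the example listings above: an optional `.host_rm` tag, then `paired.R1`, `paired.R2`, `paired.S` or `single`, then an optional `.subsampled` tag. The pattern is inferred from those examples only, and `group_outputs` is a hypothetical helper rather than something Trimnami provides.

```python
import re
from collections import defaultdict

# Pattern inferred from the example listings; other methods may name files differently.
OUTPUT_PATTERN = re.compile(
    r"^(?P<sample>.+?)"
    r"(?P<host_rm>\.host_rm)?"
    r"\.(?P<kind>paired\.R1|paired\.R2|paired\.S|single)"
    r"(?P<subsampled>\.subsampled)?"
    r"\.fastq\.gz$"
)


def group_outputs(filenames):
    """Return {sample: {kind: filename}}, skipping names that do not fit the pattern."""
    grouped = defaultdict(dict)
    for name in filenames:
        match = OUTPUT_PATTERN.match(name)
        if match is None:
            continue
        # Keep subsampled files apart from the full trimmed files.
        kind = match["kind"] + (".subsampled" if match["subsampled"] else "")
        grouped[match["sample"]][kind] = name
    return dict(grouped)


files = [
    "A13-04-182-06_TAGCTT.host_rm.paired.R1.fastq.gz",
    "A13-04-182-06_TAGCTT.host_rm.paired.R2.fastq.gz",
    "A13-04-182-06_TAGCTT.host_rm.paired.S.fastq.gz",
    "A13-135-177-06_AGTTCC.host_rm.single.fastq.gz",
]
for sample, kinds in group_outputs(files).items():
    print(sample, sorted(kinds))
```

To apply it to a real run, pass the names found in a method's output folder, for example `[p.name for p in Path("trimnami.out/prinseq").iterdir()]` after importing `Path` from `pathlib`.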

Owner

  • Name: Michael Roach
  • Login: beardymcjohnface
  • Kind: user
  • Company: Flinders University

GitHub Events

Total
  • Watch event: 2
  • Fork event: 1
Last Year
  • Watch event: 2
  • Fork event: 1

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 4
  • Total pull requests: 15
  • Average time to close issues: 2 months
  • Average time to close pull requests: about 4 hours
  • Total issue authors: 3
  • Total pull request authors: 1
  • Average comments per issue: 2.0
  • Average comments per pull request: 0.87
  • Merged pull requests: 15
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • npbhavya (2)
  • linsalrob (1)
  • beardymcjohnface (1)
Pull Request Authors
  • beardymcjohnface (18)
Top Labels
Issue Labels
enhancement (1)
Pull Request Labels

Dependencies

setup.py pypi
  • Click >=8.1.3
  • metasnek >=0.0.1
  • pyyaml >=6.0
  • snakemake >=7.14.0
.github/workflows/codecov.yml actions
  • actions/checkout v3 composite
  • codecov/codecov-action v3 composite
  • conda-incubator/setup-miniconda v2 composite
.github/workflows/python-app.yml actions
  • actions/checkout v3 composite
  • conda-incubator/setup-miniconda v2 composite
.github/workflows/python-publish.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v3 composite
  • pypa/gh-action-pypi-publish 27b31702a0e7fc50959f5ad993c78deac1bdfc29 composite
.github/workflows/trimnami-build-envs.yml actions
  • actions/checkout v3 composite
  • conda-incubator/setup-miniconda v3 composite