lorikeet-genome

Strain resolver for metagenomics

https://github.com/rhysnewell/lorikeet

Science Score: 77.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
    8 of 13 committers (61.5%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.1%) to scientific vocabulary
Last synced: 6 months ago

Repository

Strain resolver for metagenomics

Basic Info
  • Host: GitHub
  • Owner: rhysnewell
  • License: agpl-3.0
  • Language: Rust
  • Default Branch: master
  • Size: 31.3 MB
Statistics
  • Stars: 78
  • Watchers: 5
  • Forks: 10
  • Open Issues: 14
  • Releases: 20
Created almost 7 years ago · Last pushed over 1 year ago
Metadata Files
Readme License Citation

README.md


Lorikeet

Lorikeet is a within-species variant analysis pipeline for metagenomic communities that uses both long and short reads. It uses a re-implementation of the GATK HaplotypeCaller algorithm to perform local re-assembly of potentially active regions within candidate genomes. Called variants can then be clustered into likely strains using a combination of UMAP and HDBSCAN.

Documentation

For detailed documentation of Lorikeet and the algorithms and concepts it touches on, please visit the Lorikeet Docs.

Quick Start

Installation

Lorikeet is distributed via Crates.io https://crates.io/crates/lorikeet-genome. Additional packages can be downloaded via conda using the lorikeet.yml file provided. Ensure that cargo is installed on your system:

curl https://sh.rustup.rs -sSf | sh

Then install lorikeet:

cargo install lorikeet-genome

Alongside required packages:

conda env create -f lorikeet.yml -n lorikeet
conda activate lorikeet
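Before running Lorikeet, it can help to confirm that the tools installed above are on the `PATH`. The following is a minimal sketch and not part of the Lorikeet documentation. The helper name `require_tool` is hypothetical, and it only checks that each command can be found, not that the version is suitable.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report whether a command is available on PATH.
require_tool() {
    local tool="$1"
    if command -v "$tool" >/dev/null 2>&1; then
        echo "found: $tool"
    else
        echo "missing: $tool" >&2
        return 1
    fi
}

# Check the tools mentioned in the installation steps above.
for tool in cargo conda lorikeet; do
    require_tool "$tool" || echo "install $tool before continuing" >&2
done
```

If `lorikeet` is reported missing right after `cargo install`, a likely cause is that cargo's binary directory has not yet been added to the `PATH` of the current shell.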

Usage

Input can be either reads and a reference genome (or MAG), or a BAM file and its associated genome.

```
Strain genotyping analysis for metagenomics

Usage: lorikeet ...

Main subcommands:
    genotype    Experimental. Resolve strain-level genotypes of MAGs from microbial communities
    consensus   Creates consensus genomes for each input reference and for each sample
    call        Performs variant calling with no downstream analysis
    evolve      Calculate dN/dS values for genes from read mappings

Other options:
    -V, --version   Print version information

Rhys J. P. Newell
```

Call variants from BAM files:

lorikeet call --bam-files my.bam --longread-bam-files my-longread.bam --genome-fasta-directory genomes/ -x fna --bam-file-cache-directory saved_bam_files --output-directory lorikeet_out/ --threads 10 --plot

Call variants from short reads and long-read BAM files:

lorikeet call -r input_genome.fna -1 forward_reads.fastq -2 reverse_reads.fastq -l longread.bam
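When several samples share one set of reference genomes, the BAM-based `lorikeet call` invocation shown earlier can be repeated per BAM file. The sketch below is an illustration under stated assumptions. The directory layout (`bams/`, `genomes/`), the `DRY_RUN` variable, and the `build_call_cmd` helper are hypothetical, and only flags that already appear in this README are used. By default it prints the commands rather than running them.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a `lorikeet call` command for one BAM file,
# using only flags that appear in the README examples.
build_call_cmd() {
    local bam="$1" genomes="$2" outdir="$3" threads="${4:-10}"
    local sample
    sample="$(basename "$bam" .bam)"
    printf 'lorikeet call --bam-files %s --genome-fasta-directory %s -x fna --output-directory %s/%s --threads %s\n' \
        "$bam" "$genomes" "$outdir" "$sample" "$threads"
}

# Assumed layout: one BAM per sample in bams/, reference genomes in genomes/.
DRY_RUN="${DRY_RUN:-1}"
for bam in bams/*.bam; do
    [ -e "$bam" ] || continue          # skip when the glob matches nothing
    cmd="$(build_call_cmd "$bam" genomes/ lorikeet_out)"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$cmd"                    # show what would be run
    else
        eval "$cmd"                    # run it; paths must not contain spaces
    fi
done
```

Setting `DRY_RUN=0` executes each command in turn. Writing each sample to its own subdirectory of `lorikeet_out` is a choice made here to keep outputs separate, not something the README prescribes.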

Shell completion

Completion scripts can be generated for various shells, e.g. bash. For example, to install the bash completion script system-wide (this requires root privileges):

lorikeet shell-completion --output-file lorikeet --shell bash
mv lorikeet /etc/bash_completion.d/

It can also be installed into a user's home directory (root privileges not required):

lorikeet shell-completion --shell bash --output-file /dev/stdout >>~/.bash_completion

In both cases, to take effect, the terminal will likely need to be restarted. To test, type lorikeet ca and it should complete after pressing the TAB key.

License

Code is GPL-3.0

Owner

  • Name: Rhys Newell
  • Login: rhysnewell
  • Kind: user
  • Location: Sydney, Australia
  • Company: Microba LifeSciences

Bioinformatics Software Engineer @ Microba. Waiting for my PhD to be examined. Specialises in software development and analysis of big genomic data.

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: Newell
    given-names: Rhys J. P.
    orcid: https://orcid.org/0000-0002-1300-6116
  - family-names: McMaster
    given-names: Eilish S.
    orcid: https://orcid.org/0000-0002-7415-8690
  - family-names: Craig
    given-names: Philip
  - family-names: Boden
    given-names: Mikael
    orcid: https://orcid.org/0000-0003-3548-268X
  - family-names: Tyson
    given-names: Gene W.
    orcid: https://orcid.org/0000-0001-8559-9427
  - family-names: Woodcroft
    given-names: Ben J.
    orcid: https://orcid.org/0000-0003-0670-7480
title: "Lorikeet: strain-resolved metagenome analysis using local reassembly"
version: 0.8.2
doi: 10.5281/zenodo.10275469
date-released: 2023-11-05

GitHub Events

Total
  • Issues event: 1
  • Watch event: 9
  • Issue comment event: 1
Last Year
  • Issues event: 1
  • Watch event: 9
  • Issue comment event: 1

Committers

Last synced: almost 3 years ago

All Time
  • Total Commits: 1,867
  • Total Committers: 13
  • Avg Commits per committer: 143.615
  • Development Distribution Score (DDS): 0.446
Past Year
  • Commits: 76
  • Committers: 2
  • Avg Commits per committer: 38.0
  • Development Distribution Score (DDS): 0.079
Top Committers
Name Email Commits
rhysnewell r****l@u****u 1,034
Rhys Newell r****l@h****u 743
Rhys Newell 4****l@u****m 54
rhysnewell r****l@u****u 11
Rhys Newell r****l@m****m 6
Rhys Newell n****9@c****u 4
Rhys Newell s****1@p****u 4
Ben Woodcroft b****t@g****m 2
Rhys Newell s****1@h****u 2
Rhys Newell n****9@c****u 2
Rhys Newell s****1@c****u 2
Ben Woodcroft d****n@g****m 2
Stefaan Verwimp 5****p@u****m 1

Issues and Pull Requests

Last synced: 7 months ago

All Time
  • Total issues: 13
  • Total pull requests: 48
  • Average time to close issues: 7 months
  • Average time to close pull requests: 11 days
  • Total issue authors: 5
  • Total pull request authors: 4
  • Average comments per issue: 3.08
  • Average comments per pull request: 0.27
  • Merged pull requests: 30
  • Bot issues: 0
  • Bot pull requests: 17
Past Year
  • Issues: 4
  • Pull requests: 1
  • Average time to close issues: about 14 hours
  • Average time to close pull requests: N/A
  • Issue authors: 2
  • Pull request authors: 1
  • Average comments per issue: 2.25
  • Average comments per pull request: 1.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • rhysnewell (4)
  • AroneyS (4)
  • jianshu93 (3)
  • rwolfe45 (1)
Pull Request Authors
  • rhysnewell (24)
  • dependabot[bot] (17)
  • wwood (7)
  • StefaanVerwimp (1)
Top Labels
Issue Labels
  • bug (3)
  • enhancement (2)
  • help wanted (1)
Pull Request Labels
  • dependencies (17)

Packages

  • Total packages: 2
  • Total downloads:
    • cargo 3,279 total
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 0
    (may contain duplicates)
  • Total versions: 2
  • Total maintainers: 1
crates.io: lorikeet-rs

Strain resolver for metagenomics

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 1,836 Total
Rankings
Stargazers count: 15.7%
Forks count: 16.0%
Dependent repos count: 29.3%
Average: 33.0%
Dependent packages count: 33.8%
Downloads: 70.0%
Maintainers (1)
Last synced: 7 months ago
crates.io: lorikeet-genome

Strain resolver and variant caller via local reassembly for metagenomics

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 1,443 Total
Rankings
Stargazers count: 16.3%
Forks count: 16.9%
Dependent repos count: 30.7%
Dependent packages count: 36.2%
Average: 39.7%
Downloads: 98.4%
Maintainers (1)
Last synced: 7 months ago

Dependencies

.github/workflows/deploy-docs.yaml actions
  • actions-rs/toolchain v1 composite
  • actions/checkout v2 composite
  • crazy-max/ghaction-github-pages v3.0.0 composite
  • docker://pandoc/core 2.9 composite
.github/workflows/release-lorikeet.yml actions
  • actions/checkout main composite
  • actions/upload-artifact v2 composite
  • marvinpinto/action-automatic-releases latest composite
  • master-atul/tar-action v1.0.2 composite
  • rhysnewell/rust-cargo-musl-action v0.1.20 composite
.github/workflows/test-lorikeet.yml actions
  • actions/checkout v2 composite
  • conda-incubator/setup-miniconda v2 composite
Cargo.toml cargo
  • assert_cli 0.6.* development
  • ansi_term 0.12.1
  • approx 0.5.0
  • bio 0.41.0
  • bio-types 0.13.0
  • bird_tool_utils-man ^0.4
  • bstr ^0.2.17
  • clap ^3
  • clap_complete ^3
  • compare 0.1.0
  • enum-ordinalize ^3.1
  • env_logger 0.6
  • finch 0.4.3
  • gkl ^0.1.1
  • glob ^0.3
  • hashlink ^0.7
  • hts-sys 2.0.3
  • indexmap ^1.7
  • indicatif ^0.17.1
  • itertools ^0.8
  • lazy_static ^1.3
  • libm 0.2.1
  • log ^0.4
  • mathru 0.9.1
  • multimap 0.8.3
  • ndarray ^0.15.3
  • ndarray-npy ^0.8.0
  • needletail ^0.4.1
  • num 0.4.0
  • openssl ^0.10
  • openssl-sys ^0.9
  • ordered-float 1
  • partitions ^0.2
  • petgraph ^0.6.0
  • pyo3 ^0.17
  • rand 0.6
  • rayon ^1.5.1
  • roff ^0.2
  • rust-htslib 0.39.*
  • scoped_threadpool ^0.1.9
  • serde ^1
  • serde_derive ^1
  • statrs 0.13
  • strum ^0.17.1
  • strum_macros ^0.17.1
  • tempdir ^0.3
  • tempfile ^3
  • term 0.7.0