https://github.com/acg-team/rust-phylo

Phylo library in Rust

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 7 DOI reference(s) in README
  • Academic publication links
    Links to: pubmed.ncbi, ncbi.nlm.nih.gov, springer.com
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.6%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: acg-team
  • License: apache-2.0
  • Language: Rust
  • Default Branch: develop
  • Size: 3.57 MB
Statistics
  • Stars: 8
  • Watchers: 6
  • Forks: 2
  • Open Issues: 29
  • Releases: 1
Created over 2 years ago · Last pushed 6 months ago
Metadata Files
Readme Contributing License

README.md

Phylo

A high-performance Rust library for phylogenetic analysis and multiple sequence alignment under maximum likelihood and parsimony optimality criteria.

Current Functionality · Getting Started · Crate Features · Roadmap · Contributing · Related Projects · Support · Citation · Licence and Attributions

Current Functionality

  • Maximum Likelihood Phylogenetic Analysis: Efficient phylogenetic tree inference using subtree pruning and regrafting (SPR) moves with likelihood or parsimony cost functions;
  • Multiple Sequence Alignment (MSA): Alignment using the IndelMaP algorithm (paper, Python implementation);
  • Sequence Evolution Models: Support for various DNA (JC69, K80, TN93, HKY, GTR) and protein (WAG, HIVB, BLOSUM62) substitution models, as well as the Poisson Indel Process (PIP) model (paper);
  • High Performance: Optimised tree search with optional parallel processing capabilities.
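To make the parsimony criterion mentioned above concrete, here is a minimal sketch of Fitch's algorithm, which counts the minimum number of substitutions one alignment column needs on a fixed rooted binary tree. It is a standalone illustration with unit substitution costs, not the crate's implementation; the `Node` type and the `encode`, `fitch`, `leaf` and `join` helpers are hypothetical names introduced only for this example.

```rust
/// A rooted binary tree whose leaves carry a single nucleotide.
/// (Hypothetical type for illustration; not part of the phylo crate.)
enum Node {
    Leaf(char),
    Internal(Box<Node>, Box<Node>),
}

/// Encode a nucleotide as a one-hot bitmask (A=1, C=2, G=4, T=8).
fn encode(base: char) -> u8 {
    match base.to_ascii_uppercase() {
        'A' => 0b0001,
        'C' => 0b0010,
        'G' => 0b0100,
        'T' => 0b1000,
        _ => 0b1111, // assumption: anything else is treated as fully ambiguous
    }
}

/// Fitch's bottom-up pass: returns the candidate state set of the node
/// and the minimum number of substitutions in the subtree below it.
fn fitch(node: &Node) -> (u8, u32) {
    match node {
        Node::Leaf(base) => (encode(*base), 0),
        Node::Internal(left, right) => {
            let (left_set, left_cost) = fitch(left);
            let (right_set, right_cost) = fitch(right);
            let common = left_set & right_set;
            if common != 0 {
                // The children can agree on a state: no extra substitution.
                (common, left_cost + right_cost)
            } else {
                // The children disagree: take the union and pay one substitution.
                (left_set | right_set, left_cost + right_cost + 1)
            }
        }
    }
}

fn leaf(base: char) -> Box<Node> {
    Box::new(Node::Leaf(base))
}

fn join(left: Box<Node>, right: Box<Node>) -> Box<Node> {
    Box::new(Node::Internal(left, right))
}

fn main() {
    // One alignment column on the tree ((A,C),(A,(C,G))): 5 leaves, 4 internal nodes.
    let tree = join(
        join(leaf('A'), leaf('C')),
        join(leaf('A'), join(leaf('C'), leaf('G'))),
    );
    let (_, score) = fitch(&tree);
    println!("parsimony score: {score}"); // prints "parsimony score: 3"
}
```

A tree search would sum this score over all alignment columns and compare the totals of candidate topologies, preferring the one with the lowest total.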

Getting Started

Note: This crate is not yet published on crates.io. To use it directly from GitHub, add this to your Cargo.toml:

```toml
[dependencies]
phylo = { git = "https://github.com/acg-team/rust-phylo", package = "phylo" }
```

Once published on crates.io, you'll be able to use:

```toml
[dependencies]
phylo = "0.1.0"
```

Minimum Supported Rust Version: 1.82.0

MSRV detected using cargo-msrv.

Example

```rust
use std::path::Path;

use phylo::likelihood::TreeSearchCost;
use phylo::optimisers::TopologyOptimiser;
use phylo::phylo_info::PhyloInfoBuilder;
use phylo::substitution_models::{SubstModel, SubstitutionCostBuilder, K80};

fn main() -> std::result::Result<(), anyhow::Error> {
    // Note: This example uses test data from the repository
    let info = PhyloInfoBuilder::new(Path::new("./examples/data/K80.fasta").to_path_buf()).build()?;
    let k80 = SubstModel::<K80>::new(&[], &[4.0, 1.0]);
    let c = SubstitutionCostBuilder::new(k80, info).build()?;
    let unopt_cost = c.cost();
    let optimiser = TopologyOptimiser::new(c);
    let result = optimiser.run()?;
    assert_eq!(unopt_cost, result.initial_cost);
    assert!(result.final_cost > result.initial_cost);
    assert!(result.iterations <= 100);
    // The initial tree has 9 nodes, 5 leaves and 4 internal nodes, and so should the resulting tree.
    assert_eq!(result.cost.tree().len(), 9);
    Ok(())
}
```
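The example above scores trees under a K80 model. As a rough illustration of what such a model computes, the sketch below evaluates the closed-form K80 transition probabilities for one branch. It is independent of the crate: the functions `k80_probs` and `jc69_probs` are hypothetical, and the parameterisation (a single transition/transversion rate ratio `kappa`, with branch length `d` in expected substitutions per site) is an assumption that may differ from how the crate interprets the `[4.0, 1.0]` parameters.

```rust
/// Closed-form K80 transition probabilities (standalone sketch, not the crate's API).
///
/// `d` is the branch length in expected substitutions per site and `kappa`
/// is the transition/transversion rate ratio. Returns
/// `(p_same, p_transition, p_transversion)`, where `p_transversion` is the
/// probability of each of the two possible transversions.
fn k80_probs(d: f64, kappa: f64) -> (f64, f64, f64) {
    let e1 = (-4.0 * d / (kappa + 2.0)).exp();
    let e2 = (-2.0 * d * (kappa + 1.0) / (kappa + 2.0)).exp();
    let p_same = 0.25 + 0.25 * e1 + 0.5 * e2;
    let p_transition = 0.25 + 0.25 * e1 - 0.5 * e2;
    let p_transversion = 0.25 - 0.25 * e1;
    (p_same, p_transition, p_transversion)
}

/// JC69 transition probabilities: `(p_same, p_each_different_base)`.
fn jc69_probs(d: f64) -> (f64, f64) {
    let e = (-4.0 * d / 3.0).exp();
    (0.25 + 0.75 * e, 0.25 - 0.25 * e)
}

fn main() {
    let (same, ts, tv) = k80_probs(0.1, 4.0);
    // Each row of the transition matrix holds one "same" entry,
    // one transition and two transversions, and must sum to 1.
    println!("same = {same:.4}, transition = {ts:.4}, transversion = {tv:.4}");
    println!("row sum = {:.4}", same + ts + 2.0 * tv);

    // With kappa = 1 the K80 model reduces to JC69.
    let (k_same, k_ts, _) = k80_probs(0.3, 1.0);
    let (j_same, j_diff) = jc69_probs(0.3);
    println!("K80(kappa = 1): {k_same:.6} {k_ts:.6}; JC69: {j_same:.6} {j_diff:.6}");
}
```

A likelihood cost multiplies such probabilities along the branches of the tree and sums over the unobserved ancestral states, which is why the optimised cost in the example is expected to be higher (less negative on a log scale) than the initial one.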

Crate Features

This crate supports several optional features:

  • par-regraft: Enable parallel regrafting operations using Rayon;
  • par-regraft-chunk: Enable chunked parallel regrafting;
  • par-regraft-manual: Enable manual parallel regrafting control;
  • precomputed-test-results: Speed up test runs with precomputed results (for local development).

Enable features in your Cargo.toml:

```toml
[dependencies]
phylo = { git = "https://github.com/acg-team/rust-phylo", package = "phylo", features = ["par-regraft"] }
```
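As a rough picture of what chunked parallel regrafting could look like, the sketch below scores candidate regraft positions in chunks, one worker thread per chunk, and keeps the cheapest candidate. It is not the crate's code: the crate's parallel features are described as using Rayon, whereas this example uses `std::thread::scope` to stay dependency-free, and `candidate_cost` and `best_regraft_chunked` are hypothetical stand-ins.

```rust
use std::thread;

/// Hypothetical stand-in for scoring one candidate regraft position.
/// A real cost would re-evaluate the tree likelihood or parsimony score.
fn candidate_cost(position: usize) -> f64 {
    let x = position as f64;
    (x - 6.0).powi(2) + 1.0
}

/// Score candidates in chunks, one scoped thread per chunk, and return
/// the `(position, cost)` pair with the lowest cost, if any.
fn best_regraft_chunked(candidates: &[usize], chunk_size: usize) -> Option<(usize, f64)> {
    let chunk_size = chunk_size.max(1); // `chunks` panics on a chunk size of 0
    thread::scope(|scope| {
        // Spawn one worker per chunk; each returns its local best candidate.
        let handles: Vec<_> = candidates
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk
                        .iter()
                        .map(|&p| (p, candidate_cost(p)))
                        .min_by(|a, b| a.1.total_cmp(&b.1))
                })
            })
            .collect();

        // Reduce the per-chunk winners to a single global best.
        handles
            .into_iter()
            .filter_map(|handle| handle.join().expect("worker thread panicked"))
            .min_by(|a, b| a.1.total_cmp(&b.1))
    })
}

fn main() {
    let candidates: Vec<usize> = (0..12).collect();
    match best_regraft_chunked(&candidates, 4) {
        Some((position, cost)) => println!("best position {position} with cost {cost}"),
        None => println!("no candidates"),
    }
}
```

Chunking trades scheduling overhead against load balance: larger chunks mean fewer threads to coordinate, while smaller chunks spread uneven work more evenly.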

Roadmap

This crate is new and in active development. The existing functionality is listed above; the following features are currently being implemented or are planned:

  • Simultaneous tree and alignment estimation under the PIP model (paper);
  • Maximum likelihood tree search using NNI moves under the TKF92 long indel model (paper);
  • Extension to the PIP model that includes long insertions (manuscript in preparation);
  • Ancestral state reconstruction using PIP (paper), TKF92 and IndelMaP (paper);
  • Randomised starting trees for tree inference;
  • Generalisation of the tree structure for easier use in other crates.

Other minor features/improvements are documented on the GitHub issues page.

Contributing

This is a new library that is currently in active development. Contributions are highly welcome!

API Stability: As this crate is in active development, the API may change between versions until we reach 1.0. We'll follow semantic versioning and document breaking changes in release notes.

New Contributors

Please read our contributor guide!

Current Contributors:

Related Projects

Support

For questions, bug reports, or feature requests, please visit the rust-phylo discussion page and/or open an issue on GitHub.

Citation

If you use this library in your research, please consider citing:

```bibtex
@software{phylo_rust,
  title = {Phylo: A Rust library for phylogenetic analysis},
  author = {Pečerska, Jūlija and Mrzik, Mattes and Iartsev, Dmitrii and Gil, Manuel and Anisimova, Maria},
  url = {https://github.com/acg-team/rust-phylo},
  year = {2025}
}
```

Licence and Attributions

Licence

This project is licensed under either of

  • Apache Licence, Version 2.0 (LICENSE-APACHE or www.apache.org/licenses/LICENSE-2.0), or
  • MIT Licence (LICENSE-MIT or opensource.org/licenses/MIT)

at your option.

Benchmarking Datasets

Datasets for benchmarking were taken from:

  • Zhou, Xiaofan (2017). Single-gene alignments. figshare. Dataset. Link;
  • Zhou, Xiaofan (2017). Supermatrices. figshare. Dataset. Link.

The datasets are licensed under CC BY 4.0.

The datasets were modified by normalising invalid/unrecognised sequence characters since the exact sequences are less relevant for pure performance measurements.
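The exact normalisation applied to the datasets is not specified here. As one possible reading, the sketch below upper-cases a DNA sequence and replaces every character outside a small accepted alphabet with the wildcard `N`. The function name `normalise_dna`, the accepted alphabet (`A`, `C`, `G`, `T`, `N`, `-`) and the choice of `N` as the replacement are all assumptions made for illustration.

```rust
/// Upper-case a DNA sequence and replace unrecognised characters with 'N'.
/// (Hypothetical helper; the accepted alphabet and the replacement
/// character are assumptions, not the procedure used for the benchmarks.)
fn normalise_dna(seq: &str) -> String {
    seq.chars()
        .map(|c| match c.to_ascii_uppercase() {
            // Keep the four bases, the wildcard and the gap character.
            upper @ ('A' | 'C' | 'G' | 'T' | 'N' | '-') => upper,
            // Everything else, including IUPAC ambiguity codes, becomes 'N'.
            _ => 'N',
        })
        .collect()
}

fn main() {
    let raw = "acgt?R-x";
    println!("{raw} -> {}", normalise_dna(raw)); // prints "acgt?R-x -> ACGTNN-N"
}
```

Because the replacement keeps sequence lengths and gap positions intact, alignment dimensions are unchanged, which is what matters for performance measurements.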

Owner

  • Name: Applied Computational Genomics Team
  • Login: acg-team
  • Kind: organization
  • Location: Wädenswil, Switzerland

Computational Genomics tools from Maria Anisimova and collaborators

GitHub Events

Total
  • Create event: 36
  • Issues event: 25
  • Watch event: 8
  • Delete event: 24
  • Member event: 3
  • Issue comment event: 63
  • Push event: 267
  • Pull request review event: 205
  • Pull request review comment event: 235
  • Pull request event: 58
  • Fork event: 1
Last Year
  • Create event: 36
  • Issues event: 25
  • Watch event: 8
  • Delete event: 24
  • Member event: 3
  • Issue comment event: 63
  • Push event: 267
  • Pull request review event: 205
  • Pull request review comment event: 235
  • Pull request event: 58
  • Fork event: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 18
  • Total pull requests: 39
  • Average time to close issues: 9 days
  • Average time to close pull requests: 9 days
  • Total issue authors: 3
  • Total pull request authors: 3
  • Average comments per issue: 0.22
  • Average comments per pull request: 1.08
  • Merged pull requests: 25
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 17
  • Pull requests: 31
  • Average time to close issues: 12 days
  • Average time to close pull requests: 7 days
  • Issue authors: 3
  • Pull request authors: 3
  • Average comments per issue: 0.18
  • Average comments per pull request: 1.16
  • Merged pull requests: 18
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • junniest (11)
  • MattesMrzik (6)
  • kalxed (1)
Pull Request Authors
  • junniest (32)
  • merlinio2000 (5)
  • MattesMrzik (2)
Top Labels
Issue Labels
  • enhancement (7)
  • bug (2)
  • documentation (2)
  • good first issue (1)
Pull Request Labels

Dependencies

.github/workflows/test_coverage.yml actions
  • EnricoMi/publish-unit-test-result-action v1 composite
  • actions-rs/toolchain v1 composite
  • actions/cache v2 composite
  • actions/checkout v3 composite
  • codecov/codecov-action v3 composite
phylo/Cargo.toml cargo