phylodm

Efficient calculation of phylogenetic distance matrices.

https://github.com/aaronmussig/phylodm

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.0%) to scientific vocabulary

Keywords from Contributors

mesh sequences interactive hacking network-simulation
Last synced: 6 months ago

Repository

Efficient calculation of phylogenetic distance matrices.

Basic Info
  • Host: GitHub
  • Owner: aaronmussig
  • License: gpl-3.0
  • Language: Rust
  • Default Branch: main
  • Homepage:
  • Size: 653 KB
Statistics
  • Stars: 48
  • Watchers: 2
  • Forks: 3
  • Open Issues: 3
  • Releases: 21
Created almost 6 years ago · Last pushed over 1 year ago
Metadata Files
Readme Changelog License Citation

README.md

🌲 PhyloDM

PhyloDM is a high-performance library that converts a phylogenetic tree into a pairwise distance matrix.

For a tree with 30,000 taxa, PhyloDM will use:

  • ~14GB of memory (94% less than DendroPy)
  • ~1 minute of CPU time (183x faster than DendroPy).
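To give a sense of why memory grows quickly with the number of taxa, the following back-of-the-envelope sketch computes the size of a dense square matrix. The 8-byte element width (64-bit floats) is an assumption rather than something stated in this README, and the result is not a breakdown of the ~14GB figure above, which covers the whole process.

```python
def dense_matrix_bytes(n_taxa: int, bytes_per_value: int = 8) -> int:
    """Bytes needed to store a full n_taxa x n_taxa matrix.

    bytes_per_value=8 assumes 64-bit floats; this width is an assumption,
    not a documented property of PhyloDM.
    """
    return n_taxa * n_taxa * bytes_per_value


# Storage grows quadratically: tripling the taxa costs nine times the memory.
for n in (1_000, 10_000, 30_000):
    print(f"{n:>6} taxa -> {dense_matrix_bytes(n) / 1e9:.3f} GB")
```

Under that assumption, the matrix for 30,000 taxa alone occupies 7.2 GB.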

PhyloDM is written in Rust and exposed to Python via PyO3, so it can be used from either language. The documentation below covers Python usage. For Rust documentation, see Crates.io.

⚙ Installation

Requires Python 3.9+

PyPI

Pre-compiled binaries are packaged for most 64-bit Unix platforms. On any other platform, Rust must be installed so that the binaries can be compiled during installation.

```shell
python -m pip install phylodm
```

Conda

```shell
conda install -c bioconda phylodm
```

🐍 Quick-start

A pairwise distance matrix can be created from either a Newick file or a DendroPy tree.

```python
from phylodm import PhyloDM

# PREPARATION: Create a test tree
with open('/tmp/newick.tree', 'w') as fh:
    fh.write('(A:4,(B:3,C:4):1);')

# 1a. From a Newick file
pdm = PhyloDM.load_from_newick_path('/tmp/newick.tree')

# 1b. From a DendroPy tree
import dendropy
tree = dendropy.Tree.get_from_path('/tmp/newick.tree', schema='newick')
pdm = PhyloDM.load_from_dendropy(tree)

# 2. Calculate the PDM
dm = pdm.dm(norm=False)
labels = pdm.taxa()

# /------------[4]------------ A
# +
# |          /---------[3]--------- B
# \---[1]---+
#            \------------[4]------------- C
#
# labels = ('A', 'B', 'C')
# dm = [[0. 8. 9.]
#       [8. 0. 7.]
#       [9. 7. 0.]]
```
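The values in `dm` are path lengths between leaves: each entry is the sum of the branch lengths on the path connecting two taxa. The pure-Python sketch below recomputes the example matrix naively to make that concrete. It is an illustration only and not PhyloDM's algorithm; the node names `root` and `n1` are hypothetical labels for the unnamed internal nodes.

```python
# child -> (parent, branch length) for the tree (A:4,(B:3,C:4):1);
# 'root' and 'n1' are hypothetical names for the unnamed internal nodes.
PARENT = {
    "A": ("root", 4.0),
    "n1": ("root", 1.0),
    "B": ("n1", 3.0),
    "C": ("n1", 4.0),
}


def path_to_root(node):
    """Map every node on the path from `node` to the root to its distance from `node`."""
    dist = {node: 0.0}
    total = 0.0
    while node in PARENT:
        parent, length = PARENT[node]
        total += length
        dist[parent] = total
        node = parent
    return dist


def patristic(a, b):
    """Sum of branch lengths on the path between leaves a and b."""
    up_a = path_to_root(a)
    up_b = path_to_root(b)
    # With non-negative branch lengths, the shared ancestor that minimises
    # the combined distance is the lowest common ancestor.
    return min(up_a[n] + up_b[n] for n in up_a if n in up_b)


labels = ("A", "B", "C")
dm = [[patristic(a, b) for b in labels] for a in labels]
for row in dm:
    print(row)
```

This walks to the root once per leaf in every pair, which is fine for three taxa but is exactly the kind of work a dedicated library avoids on large trees.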

Accessing data

The `dm` method returns a symmetric NumPy matrix, and the `taxa` method returns a tuple of labels in the matrix row/column order.

```python
# Calculate the PDM
dm = pdm.dm(norm=False)
labels = pdm.taxa()

# /------------[4]------------ A
# +
# |          /---------[3]--------- B
# \---[1]---+
#            \------------[4]------------- C
#
# labels = ('A', 'B', 'C')
# dm = [[0. 8. 9.]
#       [8. 0. 7.]
#       [9. 7. 0.]]

# e.g. The following (equivalent) commands get the distance between A and B
dm[0, 1]  # 8
dm[labels.index('A'), labels.index('B')]  # 8
```
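Because `labels.index(...)` scans the tuple on every call, repeated lookups on a large tree may be easier with a label-to-position mapping built once. The sketch below shows one way to do that; `build_index` and `distance` are hypothetical helper names, not part of PhyloDM, and nested lists stand in for the NumPy matrix so the snippet runs on its own.

```python
def build_index(labels):
    """Map each label to its row/column position in the matrix."""
    return {label: i for i, label in enumerate(labels)}


def distance(dm, index, a, b):
    """Look up the distance between taxa a and b by label.

    dm[i][j] works for nested lists and for 2-D NumPy arrays alike.
    """
    return dm[index[a]][index[b]]


# Values copied from the example above.
labels = ("A", "B", "C")
dm = [[0.0, 8.0, 9.0],
      [8.0, 0.0, 7.0],
      [9.0, 7.0, 0.0]]

index = build_index(labels)
print(distance(dm, index, "A", "B"))
```

Building the dictionary costs one pass over the labels, after which each lookup is a constant-time operation.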

Normalisation

If the `norm` argument of `dm` is set to `True`, the distances are normalised by the sum of all edge lengths in the tree.
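As a worked example on the three-taxon tree used earlier, the sketch below assumes that "normalised by" means every distance is divided by the total edge length. That reading is an assumption; the code reproduces the arithmetic by hand rather than calling PhyloDM.

```python
# Branch lengths of (A:4,(B:3,C:4):1); in the order A, B, C, internal branch.
edge_lengths = [4.0, 3.0, 4.0, 1.0]
total = sum(edge_lengths)

# Unnormalised distances from the earlier example.
dm = [[0.0, 8.0, 9.0],
      [8.0, 0.0, 7.0],
      [9.0, 7.0, 0.0]]

# Assumed normalisation: divide every entry by the total edge length.
dm_norm = [[d / total for d in row] for row in dm]

print(total)
for row in dm_norm:
    print([round(d, 4) for d in row])
```

Here the total edge length is 12, so the distance between A and C becomes 9 / 12 = 0.75.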

⏱ Performance

Tests were executed using scripts/performance/Snakefile on an Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz.

PhyloDM is the better choice for trees with a large number of taxa. For trees with only a few taxa, DendroPy may be preferable because of the additional features it provides.

PhyloDM vs DendroPy resource usage

Owner

  • Name: Aaron Mussig
  • Login: aaronmussig
  • Kind: user
  • Location: Brisbane, Australia
  • Company: Australian Centre for Ecogenomics

Bioinformatics PhD student at the University of Queensland. Python and Rust enthusiast.

Citation (CITATION.cff)

cff-version: 1.2.0
abstract: "Efficient calculation of phylogenetic distance matrices."
message: "If you use this software, please cite it as below."
authors:
- family-names: "Mussig"
  given-names: "Aaron J."
  orcid: "https://orcid.org/0000-0002-9988-0866"
title: "PhyloDM"
version: 3.2.0
doi: 10.5281/zenodo.3998716
url: "https://github.com/aaronmussig/PhyloDM"
date-released: 2024-10-08
keywords:
 - phylogenetic
 - tree
 - "distance matrix"
license: GPL-3.0

GitHub Events

Total
  • Watch event: 2
  • Fork event: 1
Last Year
  • Watch event: 2
  • Fork event: 1

Committers

Last synced: over 2 years ago

All Time
  • Total Commits: 95
  • Total Committers: 3
  • Avg Commits per committer: 31.667
  • Development Distribution Score (DDS): 0.137
Past Year
  • Commits: 14
  • Committers: 2
  • Avg Commits per committer: 7.0
  • Development Distribution Score (DDS): 0.214
Top Committers
Name Email Commits
Aaron Mussig a****g@g****m 82
semantic-release-bot s****t@m****t 11
dependabot[bot] 4****] 2

Issues and Pull Requests

Last synced: 7 months ago

All Time
  • Total issues: 7
  • Total pull requests: 13
  • Average time to close issues: 7 months
  • Average time to close pull requests: about 1 month
  • Total issue authors: 6
  • Total pull request authors: 3
  • Average comments per issue: 2.29
  • Average comments per pull request: 1.31
  • Merged pull requests: 11
  • Bot issues: 0
  • Bot pull requests: 4
Past Year
  • Issues: 0
  • Pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: 1 minute
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 1.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • CNwangbin (2)
  • FinnOD (1)
  • drdna (1)
  • jianshu93 (1)
  • aaronmussig (1)
  • wanxn518 (1)
Pull Request Authors
  • aaronmussig (8)
  • dependabot[bot] (4)
  • FinnOD (2)
Top Labels
Issue Labels
enhancement (2) help wanted (1) bug (1)
Pull Request Labels
released (6) dependencies (4)

Packages

  • Total packages: 3
  • Total downloads:
    • pypi 707 last-month
    • cargo 11,090 total
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 2
    (may contain duplicates)
  • Total versions: 28
  • Total maintainers: 2
pypi.org: phylodm

Efficient calculation of phylogenetic distance matrices.

  • Versions: 17
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 690 Last month
Rankings
Dependent packages count: 10.0%
Stargazers count: 10.2%
Average: 17.4%
Dependent repos count: 21.7%
Downloads: 22.5%
Forks count: 22.6%
Maintainers (1)
Last synced: 7 months ago
pypi.org: metatree

Visualisation of polyphyletic groups between phylogenetic trees to a reference tree.

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 17 Last month
Rankings
Dependent packages count: 10.0%
Stargazers count: 10.3%
Dependent repos count: 21.7%
Forks count: 22.6%
Average: 26.2%
Downloads: 66.3%
Maintainers (1)
Last synced: 7 months ago
crates.io: phylodm

Efficient calculation of phylogenetic distance matrices.

  • Versions: 10
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 11,090 Total
Rankings
Stargazers count: 18.5%
Dependent repos count: 29.3%
Dependent packages count: 33.8%
Average: 35.6%
Forks count: 37.5%
Downloads: 59.1%
Maintainers (1)
Last synced: 7 months ago

Dependencies

Cargo.lock cargo
  • aho-corasick 0.7.18
  • ansi_term 0.12.1
  • atty 0.2.14
  • autocfg 1.1.0
  • bitflags 1.3.2
  • cfg-if 1.0.0
  • clap 2.34.0
  • convert_case 0.4.0
  • derive_more 0.99.17
  • either 1.7.0
  • env_logger 0.8.4
  • getopt 1.1.3
  • getrandom 0.1.16
  • heck 0.3.3
  • hermit-abi 0.1.19
  • humantime 2.1.0
  • indoc 1.0.6
  • itertools 0.10.3
  • lazy_static 1.4.0
  • libc 0.2.126
  • light_phylogeny 1.4.5
  • lock_api 0.4.7
  • log 0.4.17
  • matrixmultiply 0.3.2
  • memchr 2.5.0
  • ndarray 0.15.4
  • num-complex 0.4.2
  • num-integer 0.1.45
  • num-traits 0.2.15
  • numpy 0.16.2
  • once_cell 1.13.0
  • parking_lot 0.12.1
  • parking_lot_core 0.9.3
  • ppv-lite86 0.2.16
  • proc-macro-error 1.0.4
  • proc-macro-error-attr 1.0.4
  • proc-macro2 1.0.40
  • pyo3 0.16.5
  • pyo3-build-config 0.16.5
  • pyo3-ffi 0.16.5
  • pyo3-macros 0.16.5
  • pyo3-macros-backend 0.16.5
  • quote 1.0.20
  • rand 0.7.3
  • rand_chacha 0.2.2
  • rand_core 0.5.1
  • rand_hc 0.2.0
  • rand_pcg 0.2.1
  • random_color 0.5.1
  • rawpointer 0.2.1
  • redox_syscall 0.2.13
  • regex 1.6.0
  • regex-syntax 0.6.27
  • roxmltree 0.14.1
  • rustc_version 0.4.0
  • scopeguard 1.1.0
  • semver 1.0.12
  • smallvec 1.9.0
  • strsim 0.8.0
  • structopt 0.3.26
  • structopt-derive 0.4.18
  • svg 0.8.2
  • syn 1.0.98
  • target-lexicon 0.12.4
  • termcolor 1.1.3
  • textwrap 0.11.0
  • unicode-ident 1.0.2
  • unicode-segmentation 1.9.0
  • unicode-width 0.1.9
  • unindent 0.1.9
  • vec_map 0.8.2
  • version_check 0.9.4
  • wasi 0.9.0+wasi-snapshot-preview1
  • winapi 0.3.9
  • winapi-i686-pc-windows-gnu 0.4.0
  • winapi-util 0.1.5
  • winapi-x86_64-pc-windows-gnu 0.4.0
  • windows-sys 0.36.1
  • windows_aarch64_msvc 0.36.1
  • windows_i686_gnu 0.36.1
  • windows_i686_msvc 0.36.1
  • windows_x86_64_gnu 0.36.1
  • windows_x86_64_msvc 0.36.1
  • xmlparser 0.13.3
package-lock.json npm
  • 457 dependencies
package.json npm
  • @semantic-release/changelog ^6.0.1 development
  • @semantic-release/commit-analyzer ^9.0.2 development
  • @semantic-release/exec ^6.0.3 development
  • @semantic-release/git ^10.0.1 development
  • @semantic-release/github ^8.0.5 development
  • @semantic-release/release-notes-generator ^10.0.3 development
  • semantic-release ^19.0.3 development
setup.py pypi
  • numpy *
.github/workflows/build-publish.yml actions
  • actions-rs/cargo v1 composite
  • actions-rs/toolchain v1 composite
  • actions/checkout v3 composite
  • actions/download-artifact v3 composite
  • actions/setup-node v3 composite
  • actions/setup-python v4 composite
  • actions/upload-artifact v3 composite
  • pypa/cibuildwheel v2.10.0 composite
  • pypa/gh-action-pypi-publish release/v1 composite
Cargo.toml cargo
pyproject.toml pypi