https://github.com/huttleylab/phylim

Python project to validate that a fitted phylogenetic model is within the limits of inference

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.5%) to scientific vocabulary

Keywords

cogent3-apps phylogenetics
Last synced: 9 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: HuttleyLab
  • License: bsd-3-clause
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 255 KB
Statistics
  • Stars: 8
  • Watchers: 1
  • Forks: 2
  • Open Issues: 2
  • Releases: 9
Topics
cogent3-apps phylogenetics
Created over 3 years ago · Last pushed 9 months ago
Metadata Files
Readme License

README.md

phylim: a phylogenetic limit evaluation library built on cogent3

phylim evaluates identifiability when estimating a phylogenetic tree under a Markov model. Identifiability is a key condition that Markov models used in phylogenetics must satisfy for estimation to be consistent.

Establishing identifiability relies on the arrangement of specific types of transition probability matrices (e.g., DLC and sympathetic) while avoiding other types. A key concern arises when a tree does not meet the condition that, for each node, a path to a tip must exist where all matrices along the path are DLC. Such trees are not identifiable 🪚🎄! For instance, in the figure below, tree T' contains a node surrounded by a specific type of non-DLC matrix, rendering it non-identifiable. In contrast, compare T' with tree T.
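The path condition described above can be illustrated with a small, self-contained sketch. This is not phylim's implementation: the function name `nodes_without_dlc_path`, the edge-dictionary layout, and the plain-string labels are all assumptions made for illustration, and the tree is treated as an undirected graph for simplicity.

```python
from collections import deque

# Illustrative label only; phylim exports its own category constants.
DLC = "DLC"


def nodes_without_dlc_path(edges):
    """Return internal nodes with no path to a tip made only of DLC edges.

    `edges` maps a (node_a, node_b) pair to the category label of the
    matrix on that edge (a hypothetical layout, not phylim's).
    """
    neighbours = {}
    for (a, b), category in edges.items():
        neighbours.setdefault(a, []).append((b, category))
        neighbours.setdefault(b, []).append((a, category))
    # A tip is a node attached to exactly one edge.
    tips = {node for node, adj in neighbours.items() if len(adj) == 1}

    failing = []
    for start in neighbours:
        if start in tips:
            continue
        seen = {start}
        queue = deque([start])
        found = False
        # Breadth-first search that only walks along DLC edges.
        while queue and not found:
            node = queue.popleft()
            for nxt, category in neighbours[node]:
                if category != DLC or nxt in seen:
                    continue
                if nxt in tips:
                    found = True
                    break
                seen.add(nxt)
                queue.append(nxt)
        if not found:
            failing.append(start)
    return sorted(failing)


# "inner" is surrounded by non-DLC matrices, so it has no all-DLC path to a tip.
bad_tree = {
    ("root", "A"): DLC,
    ("root", "inner"): "sympathetic",
    ("inner", "B"): "sympathetic",
    ("inner", "C"): "sympathetic",
}
good_tree = {edge: DLC for edge in bad_tree}

print(nodes_without_dlc_path(bad_tree))   # ['inner']
print(nodes_without_dlc_path(good_tree))  # []
```

An empty result means every internal node in the toy tree satisfies the path condition; any node listed is one that would make the tree fail it.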

phylim provides a quick way to check the identifiability of a model fit through its main cogent3 app, phylim. phylim is compatible with piqtree, a Python library that exposes features from iqtree2.

The sections below show how to set up phylim and walk through the main identifiability check app and its associated apps.

tree1

Installation

pip install phylim

To check that the installation succeeded, run the tests from the package directory:

pytest

Hope all tests passed! ✅ 😊

Run the check of identifiability

If you fit a model to an alignment and get the model result:

```python
>>> from cogent3 import get_app, make_aligned_seqs

>>> aln = make_aligned_seqs(
...     {
...         "Human": "ATGCGGCTCGCGGAGGCCGCGCTCGCGGAG",
...         "Gorilla": "ATGCGGCGCGCGGAGGCCGCGCTCGCGGAG",
...         "Mouse": "ATGCCCGGCGCCAAGGCAGCGCTGGCGGAG",
...     },
...     info={"moltype": "dna", "source": "foo"},
... )

>>> app_fit = get_app("model", "GTR")
>>> result = app_fit(aln)
```

You can easily check the identifiability by:

```python
>>> checker = get_app("phylim")

>>> checked = checker(result)
>>> checked.is_identifiable
True
```

The phylim app wraps all information about phylogenetic limits.

```python
>>> checked
```

| Source | Model Name | Identifiable | Has Boundary Values | Version |
|---|---|---|---|---|
| brca1.fasta | GTR | True | True | 2025.1.12 |

You can also use features like classifying all matrices or checking boundary values in a model fit.

Label all transition probability matrices in a model fit

You can call `classify_model_psubs` to give the category of all the matrices:

```python
>>> from phylim import classify_model_psubs

>>> labelled = classify_model_psubs(result)
>>> labelled
```

Substitution Matrices Categories

| edge name | matrix category |
|---|---|
| Gorilla | DLC |
| Human | DLC |
| Mouse | DLC |

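For readers unfamiliar with the DLC label: it stands for "diagonal largest in column", meaning each diagonal entry of the transition probability matrix is the strictly largest value in its column. The check below is a minimal sketch of that definition on plain nested lists. The helper name `is_dlc` and the 2x2 example matrices are hypothetical (nucleotide matrices are 4x4), and this is not how phylim classifies matrices internally.

```python
def is_dlc(matrix):
    """True if every diagonal entry is strictly the largest in its column."""
    size = len(matrix)
    for col in range(size):
        diagonal = matrix[col][col]
        for row in range(size):
            if row != col and matrix[row][col] >= diagonal:
                return False
    return True


# Hypothetical 2-state matrices; each row sums to 1.
mostly_unchanged = [
    [0.9, 0.1],
    [0.2, 0.8],
]
mostly_changed = [
    [0.4, 0.6],
    [0.7, 0.3],
]

print(is_dlc(mostly_unchanged))  # True
print(is_dlc(mostly_changed))    # False
```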
Check if all parameter fits are within the boundary

```python
>>> from phylim import check_fit_boundary

>>> violations = check_fit_boundary(result)
>>> violations
BoundsViolation(source='foo', vio=[{'par_name': 'C/T', 'init': np.float64(1.0000000147345554e-06), 'lower': 1e-06, 'upper': 50}, {'par_name': 'A/T', 'init': np.float64(1.0000000625906854e-06), 'lower': 1e-06, 'upper': 50}])
```
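In the output above, the flagged parameters sit essentially on their lower bound of 1e-06. The idea can be sketched with the standard library alone. The helper `near_bounds`, the tuple layout, and the relative tolerance are assumptions for illustration, not phylim's actual rule; the "A/G" entry is a made-up value added for contrast.

```python
import math


def near_bounds(params, rel_tol=1e-6):
    """Return names of parameters whose value is close to either bound.

    `params` maps a parameter name to (value, lower, upper).
    The tolerance is an assumed value, not the one phylim uses.
    """
    flagged = []
    for name, (value, lower, upper) in params.items():
        at_lower = math.isclose(value, lower, rel_tol=rel_tol)
        at_upper = math.isclose(value, upper, rel_tol=rel_tol)
        if at_lower or at_upper:
            flagged.append(name)
    return flagged


fits = {
    "C/T": (1.0000000147345554e-06, 1e-06, 50),  # value taken from the output above
    "A/G": (2.5, 1e-06, 50),                     # hypothetical interior value
}

print(near_bounds(fits))  # ['C/T']
```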

❗For users who want to check identifiability on a model with multiple likelihood functions (e.g. split codon model), please check https://github.com/HuttleyLab/PhyLim/issues/23#issuecomment-3125670158

Check identifiability for piqtree

phylim provides an app, phylim_to_model_result, which allows you to build the likelihood function from a piqtree output tree.

```python
>>> phylo = get_app("piq_build_tree", model="GTR")
>>> tree = phylo(aln)

>>> lf_from = get_app("phylim_to_model_result")
>>> result = lf_from(tree)

>>> checker = get_app("phylim")
>>> checked = checker(result)
>>> checked.is_identifiable
True
```

Colour the edges for a phylogenetic tree based on matrix categories

If you obtain a model fit, phylim can visualise the tree with labelled matrices.

phylim provides an app, phylim_style_tree, which takes an edge-matrix category map and colours the edges:

```python
>>> from phylim import classify_model_psubs

>>> edge_to_cat = classify_model_psubs(result)
>>> tree = result.tree

>>> tree_styler = get_app("phylim_style_tree", edge_to_cat)
>>> tree_styler(tree)
```

tree1

You can also colour edges using a user-defined edge-matrix category map, applicable to any tree object!

```python
>>> from cogent3 import make_tree
>>> from phylim import SYMPATHETIC, DLC

>>> tree = make_tree("(A, B, C);")
>>> edge_to_cat = {"A": SYMPATHETIC, "B": SYMPATHETIC, "C": DLC}

>>> tree_styler = get_app("phylim_style_tree", edge_to_cat)
>>> tree_styler(tree)
```

tree1

Owner

  • Name: HuttleyLab
  • Login: HuttleyLab
  • Kind: organization

GitHub Events

Total
  • Create event: 9
  • Issues event: 4
  • Release event: 8
  • Watch event: 8
  • Issue comment event: 7
  • Public event: 1
  • Push event: 20
  • Pull request review event: 2
  • Pull request event: 32
  • Fork event: 2
Last Year
  • Create event: 9
  • Issues event: 4
  • Release event: 8
  • Watch event: 8
  • Issue comment event: 7
  • Public event: 1
  • Push event: 20
  • Pull request review event: 2
  • Pull request event: 32
  • Fork event: 2

Issues and Pull Requests

Last synced: 9 months ago

All Time
  • Total issues: 3
  • Total pull requests: 12
  • Average time to close issues: about 1 month
  • Average time to close pull requests: about 1 hour
  • Total issue authors: 2
  • Total pull request authors: 3
  • Average comments per issue: 1.67
  • Average comments per pull request: 0.08
  • Merged pull requests: 8
  • Bot issues: 0
  • Bot pull requests: 2
Past Year
  • Issues: 3
  • Pull requests: 12
  • Average time to close issues: about 1 month
  • Average time to close pull requests: about 1 hour
  • Issue authors: 2
  • Pull request authors: 3
  • Average comments per issue: 1.67
  • Average comments per pull request: 0.08
  • Merged pull requests: 8
  • Bot issues: 0
  • Bot pull requests: 2
Top Authors
Issue Authors
  • GavinHuttley (2)
  • YapengLang (1)
Pull Request Authors
  • YapengLang (15)
  • dependabot[bot] (2)
  • GavinHuttley (2)
Top Labels
Issue Labels
Pull Request Labels
dependencies (2) github_actions (2)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 133 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 9
  • Total maintainers: 2
pypi.org: phylim

A library for checking the limits of phylogenetic tree estimation.

  • Versions: 9
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 133 Last month
Rankings
Dependent packages count: 9.9%
Average: 32.9%
Dependent repos count: 56.0%
Maintainers (2)
Last synced: 9 months ago