treespace-metrics

Python code to compute metrics on Rooted Phylogenetic Networks

https://github.com/andrewquijano/treespace_reu_2017

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.8%) to scientific vocabulary

Keywords

bioinformatics computational-biology phylogenetics treespace
Last synced: 6 months ago

Repository

Python code to compute metrics on Rooted Phylogenetic Networks

Basic Info
  • Host: GitHub
  • Owner: AndrewQuijano
  • License: mit
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 503 KB
Statistics
  • Stars: 3
  • Watchers: 1
  • Forks: 0
  • Open Issues: 1
  • Releases: 4
Topics
bioinformatics computational-biology phylogenetics treespace
Created almost 5 years ago · Last pushed 8 months ago
Metadata Files
Readme License Citation

README.md

TreespaceREU2017

Installation

This code has been tested on Ubuntu 22 LTS. After downloading the repository, run the installation script to obtain all the required packages:
bash install.sh

Documentation

You can generate the Python documentation as follows:
```bash
pip install pdoc3

# Creates an html folder with all generated documentation in .html format
pdoc3 --html treespace_metrics

# To serve the documentation on localhost instead
pdoc3 --http localhost:8080 treespace_metrics
```
The documentation can be found here. You can also install the package via pip.

Usage — Metrics on Adjacency Lists/Newick Graphs

I would like to thank Professor van Iersel for this link containing the phylogenetic networks used to test the code; they are stored in the Phylo directory. The name of each text file identifies the paper it came from, so you can cite that paper if you use these networks as well. Please note that I had to use the Newick format with internal node names so that each network can be easily converted into a DAG in networkx and remain compatible with the algorithms.
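To illustrate why internal node names matter, here is a minimal sketch that turns a Newick string in which every node is named into a list of directed (parent, child) edges. It assumes that a name appearing more than once refers to the same node, which is one way a reticulation (a node with two parents) can be written. This is a hand-rolled, standard-library-only parser for illustration, not the library's own converter; `newick_to_edges` and the example string are hypothetical.

```python
def newick_to_edges(newick):
    """Return (parent, child) edges for a Newick string whose nodes are all named.

    Assumption: a repeated name denotes the same node, so a node listed
    under two different parents ends up with in-degree 2.
    """
    text = newick.strip().rstrip(";").replace(" ", "")
    edges = []
    pos = 0

    def parse_node():
        nonlocal pos
        child_names = []
        if pos < len(text) and text[pos] == "(":
            pos += 1  # consume "("
            while True:
                child_names.append(parse_node())
                if pos >= len(text):
                    raise ValueError("unbalanced parentheses")
                separator = text[pos]
                pos += 1
                if separator == ")":
                    break
                if separator != ",":
                    raise ValueError(f"unexpected character {separator!r}")
        # Read the node label that follows the (optional) child list.
        start = pos
        while pos < len(text) and text[pos] not in ",():":
            pos += 1
        name = text[start:pos]
        # Skip an optional ":branch_length" suffix; lengths are ignored here.
        if pos < len(text) and text[pos] == ":":
            while pos < len(text) and text[pos] not in ",()":
                pos += 1
        if not name:
            raise ValueError("every node, including internal ones, needs a name")
        for child in child_names:
            if (name, child) not in edges:
                edges.append((name, child))
        return name

    parse_node()
    if pos != len(text):
        raise ValueError("trailing characters after the root node")
    return edges


# Hypothetical example: node "h" is listed under both "x" and "y".
example_edges = newick_to_edges("((a,(b)h)x,(h,c)y)r;")
in_degree_of_h = sum(1 for _, child in example_edges if child == "h")
print(example_edges)
print(in_degree_of_h)  # 2, so "h" is a reticulation node
```

The resulting edge list can be passed to `networkx.DiGraph(example_edges)` to obtain a directed graph.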

To run the test cases and confirm that the metrics work on pre-defined graphs, run:
pytest test

Add the following arguments as needed:
* --dir, the input directory containing text files with Newick graphs or adjacency lists of phylogenetic networks
* -n, indicates that the text files in the input directory contain Newick-formatted phylogenetic trees
* -d, draw the trees, bipartite graphs, etc.

After placing the networks you want metrics for in the input directory, execute the code as follows:
python3 run_treespace.py --dir <directory> -d
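For readers wiring a similar command line into their own scripts, the flags above could be declared with `argparse` roughly as follows. This is a hypothetical sketch, not the contents of `run_treespace.py`: only the flag spellings `--dir`, `-n` and `-d` come from the text above, while the destination names (`directory`, `newick`, `draw`), the help strings and the choice to treat `-n` and `-d` as boolean switches are assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the three documented flags (names are assumptions)."""
    parser = argparse.ArgumentParser(
        description="Compute metrics on rooted phylogenetic networks"
    )
    parser.add_argument(
        "--dir",
        dest="directory",
        help="input directory with Newick graphs or adjacency lists",
    )
    parser.add_argument(
        "-n",
        dest="newick",
        action="store_true",
        help="treat the input files as Newick-formatted",
    )
    parser.add_argument(
        "-d",
        dest="draw",
        action="store_true",
        help="draw the trees, bipartite graphs, etc.",
    )
    return parser


# Parse an explicit argument list instead of sys.argv, for demonstration.
args = build_parser().parse_args(["--dir", "Phylo", "-d"])
print(args.directory, args.newick, args.draw)
```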

Usage — Testing on Generated Networks

Louxin Zhang provided me with the source code for generating random binary phylogenetic networks, located in the phylo_generator directory. Feel free to see his original code here.

After compiling the C code, run the following example to generate 12 graphs with 3 leaves and 15 reticulation nodes. Once the graphs are generated, the command computes the metrics and stores them, together with images, in a directory for further analysis.
python3 run_treespace.py --generate -l 3 -r 15 -g 12 -d
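As a sanity check on generated output, the size of a binary network follows from its leaf and reticulation counts. The sketch below assumes the usual binary convention: a root with out-degree 2, tree nodes with in-degree 1 and out-degree 2, reticulations with in-degree 2 and out-degree 1, and leaves with in-degree 1. Whether the generator follows exactly this convention is an assumption, and `binary_network_size` is a hypothetical helper, not part of the library. Counting edges once by out-degree and once by in-degree gives 2 + 2t + r = t + 2r + n, so there are t = n + r - 2 tree nodes besides the root.

```python
def binary_network_size(leaves, reticulations):
    """Node and edge counts of a rooted binary network under the assumed convention."""
    if leaves < 1 or reticulations < 0 or leaves + reticulations < 2:
        raise ValueError("need at least one leaf and leaves + reticulations >= 2")
    # Tree nodes other than the root, from 2 + 2t + r = t + 2r + n.
    tree_nodes = leaves + reticulations - 2
    nodes = 1 + tree_nodes + reticulations + leaves
    # Sum of out-degrees: root (2) + tree nodes (2 each) + reticulations (1 each).
    edges = 2 + 2 * tree_nodes + reticulations
    return {"tree_nodes": tree_nodes, "nodes": nodes, "edges": edges}


# The example command above uses 3 leaves and 15 reticulation nodes.
size = binary_network_size(3, 15)
print(size)  # {'tree_nodes': 16, 'nodes': 35, 'edges': 49}
```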

Authors and Acknowledgment

Code Author: Andrew Quijano
This work was funded by a Research Experience for Undergraduates (REU) grant from the U.S. National Science Foundation (#1461094 to St. John and Owen).

Please cite the papers from which the algorithms are derived if you use this library.

Jetten and van Iersel's Algorithm (jettan.py):
Laura Jetten and Leo van Iersel. Nonbinary tree-based phylogenetic networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 1:205–217, 2018. On-line publication: October 2016.

Francis et al.'s Spanning Tree Algorithm (francis.py):
Andrew Francis, Charles Semple, and Mike Steel. New characterisations of tree-based networks and proximity measures. Advances in Applied Mathematics, 93:93–107, 2018.

Maximum Covering Subtrees for Phylogenetic Networks (max-cst.py):
Davidov, N., Hernandez, A., Jian, J., McKenna, P., Medlin, K.A., Mojumder, R., Owen, M., Quijano, A., Rodriguez, A., St. John, K., Thai, K. and Uraga, M., 2020. Maximum Covering Subtrees for Phylogenetic Networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics.

License

MIT

Project status

The project is currently fully tested and functional for rooted phylogenetic networks. If you want to extend this for unrooted networks and have funding, please feel free to reach out.

Currently, I am working on the following:
* Solve the other problem about the minimum number of trees spanning a network N, see create_trees.py
* Use OOP to create a PyPI package for this library

Owner

  • Name: Andrew
  • Login: AndrewQuijano
  • Kind: user
  • Location: New York
  • Company: New York University

I am an NYU adjunct faculty member and PhD working for the MESS Lab, headed by Professor Brendan Dolan-Gavitt.

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Davidov"
  given-names: "Nathan"
- family-names: "Hernandez"
  given-names: "Amanda"
- family-names: "Jian"
  given-names: "Justin"
- family-names: "McKenna"
  given-names: "Patrick"
- family-names: "Medlin"
  given-names: "K.A."
- family-names: "Mojumder"
  given-names: "Roadra"
- family-names: "Owen"
  given-names: "Megan"
- family-names: "Quijano"
  given-names: "Andrew"
- family-names: "Rodriguez"
  given-names: "Amanda"
- family-names: "St. John"
  given-names: "Katherine"
- family-names: "Thai"
  given-names: "Katherine"
- family-names: "Uraga"
  given-names: "Meliza"
title: "Maximum Covering Subtrees for Phylogenetic Networks"
version: 2.0.0
doi: 10.1109/tcbb.2020.3040910
date-released: 2021-11-01
url: "https://github.com/AndrewQuijano/Treespace_REU_2017"
preferred-citation:
  type: article
  authors:
  - family-names: "Davidov"
    given-names: "Nathan"
  - family-names: "Hernandez"
    given-names: "Amanda"
  - family-names: "Jian"
    given-names: "Justin"
    orcid: "https://orcid.org/0000-0002-2585-5537"
  - family-names: "McKenna"
    given-names: "Patrick"
  - family-names: "Medlin"
    given-names: "K.A."
  - family-names: "Mojumder"
    given-names: "Roadra"
  - family-names: "Owen"
    given-names: "Megan"
  - family-names: "Quijano"
    given-names: "Andrew"
    orcid: "https://orcid.org/0000-0002-6673-4934"
  - family-names: "Rodriguez"
    given-names: "Amanda"
    orcid: "https://orcid.org/0000-0001-8098-1096"
  - family-names: "St. John"
    given-names: "Katherine"
    orcid: "https://orcid.org/0000-0003-1657-8301"
  - family-names: "Thai"
    given-names: "Katherine"
  - family-names: "Uraga"
    given-names: "Meliza"
  doi: "10.1109/tcbb.2020.3040910"
  journal: "IEEE/ACM Transactions on Computational Biology and Bioinformatics"
  month: 11
  start: 2823 # First page number
  end: 2827 # Last page number
  title: "Maximum Covering Subtrees for Phylogenetic Networks"
  issue: 1
  volume: 1
  year: 2021

GitHub Events

Total
  • Release event: 3
  • Watch event: 3
  • Delete event: 6
  • Issue comment event: 3
  • Push event: 36
  • Pull request event: 10
  • Create event: 10
Last Year
  • Release event: 3
  • Watch event: 3
  • Delete event: 6
  • Issue comment event: 3
  • Push event: 36
  • Pull request event: 10
  • Create event: 10

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 0
  • Total pull requests: 5
  • Average time to close issues: N/A
  • Average time to close pull requests: 23 minutes
  • Total issue authors: 0
  • Total pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.4
  • Merged pull requests: 4
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 5
  • Average time to close issues: N/A
  • Average time to close pull requests: 23 minutes
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.4
  • Merged pull requests: 4
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • AndrewQuijano (5)
Top Labels
Issue Labels
Pull Request Labels
Github-files (2)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 23 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 4
  • Total maintainers: 1
pypi.org: treespace-metrics

Python Interface to collect treespace metrics

  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 23 Last month
Rankings
Dependent packages count: 9.2%
Average: 30.5%
Dependent repos count: 51.9%
Maintainers (1)
Last synced: 6 months ago

Dependencies

.github/workflows/codeql.yml actions
  • actions/checkout v3 composite
  • github/codeql-action/analyze v2 composite
  • github/codeql-action/autobuild v2 composite
  • github/codeql-action/init v2 composite
.github/workflows/treespace-test.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v3 composite
requirements.txt pypi
  • biopython *
  • matplotlib *
  • networkx *
  • pydot *
  • pygraphviz *
setup.py pypi