phylogan
phyloGAN is a Generative Adversarial Network (GAN) that infers phylogenetic relationships.
phyloGAN
Introduction
phyloGAN is a Generative Adversarial Network (GAN) that infers phylogenetic relationships. phyloGAN takes as input either a concatenated alignment or a set of gene alignments, and then infers a phylogenetic tree, either considering or ignoring gene tree heterogeneity.
Installation
Dependencies
phyloGAN requires several Python packages, along with AliSim.
Python packages
Handling trees and alignments: Biopython, dendropy, ete3
Machine learning: tensorflow
Miscellaneous utilities: matplotlib, numpy, scipy, and the standard-library modules copy, datetime, io, itertools, os, random, re, sys (these ship with Python and need no separate installation)
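Before a first run, it can help to confirm that the third-party packages listed above are importable. The following is a minimal sketch, not part of phyloGAN: the helper name `find_missing_modules` is hypothetical, and the list uses the usual import names (Biopython is imported as `Bio`).

```python
import importlib.util

# Import names for the third-party packages listed above.
# Biopython is imported as "Bio"; the others import under their package names.
REQUIRED_MODULES = ["Bio", "dendropy", "ete3", "tensorflow", "matplotlib", "numpy", "scipy"]


def find_missing_modules(module_names):
    """Return the names in module_names that cannot be located for import."""
    missing = []
    for name in module_names:
        # find_spec returns None when a top-level module cannot be found.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

This only checks that each module can be located; it does not check versions.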
AliSim
phyloGAN was developed using the version of AliSim distributed with IQ-TREE v2.2.0 (Beta). A version of IQ-TREE with AliSim must be installed, and the user must provide the path to the executable.
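Since the path to the IQ-TREE executable has to be supplied by the user, a quick check that the path points at an executable file can catch typos early. This is a sketch under stated assumptions: `resolve_executable` is a hypothetical helper, and the name `iqtree2` is only a placeholder for wherever your IQ-TREE binary lives.

```python
import os
import shutil


def resolve_executable(path_or_name):
    """Return an absolute path to an executable file, or None if not found.

    Accepts either a path to a file or a command name to look up on PATH.
    """
    if os.path.isfile(path_or_name) and os.access(path_or_name, os.X_OK):
        return os.path.abspath(path_or_name)
    found = shutil.which(path_or_name)
    return os.path.abspath(found) if found else None


if __name__ == "__main__":
    # "iqtree2" is a placeholder; substitute the path to your own IQ-TREE binary.
    exe = resolve_executable("iqtree2")
    print(exe if exe else "IQ-TREE executable not found")
```

The check does not confirm that the binary includes AliSim; it only confirms that something executable exists at that location.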
phyloGAN
To install phyloGAN, clone the GitHub repository:
git clone https://github.com/meganlsmith/phyloGAN.git
phyloGAN (concatenation version)
The original version of phyloGAN takes as input a concatenated alignment and infers a phylogenetic tree.
Input Files
Concatenated alignment
The concatenated alignment should be provided in phylip format. For an example see test_data/concatenated_test.phy.
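A malformed alignment can be caught before training starts with a simple consistency check. The sketch below is not part of phyloGAN and assumes a sequential PHYLIP layout (a header line with the number of taxa and sites, then one `name sequence` row per taxon); interleaved files would need a different check. The function name `check_sequential_phylip` is hypothetical.

```python
def check_sequential_phylip(path):
    """Check a sequential PHYLIP file (one 'name sequence' row per taxon).

    Returns (n_taxa, n_sites) on success and raises ValueError otherwise.
    """
    with open(path) as handle:
        lines = [line.strip() for line in handle if line.strip()]
    if not lines:
        raise ValueError("file is empty")

    header = lines[0].split()
    if len(header) != 2:
        raise ValueError("expected '<n_taxa> <n_sites>' on the first line")
    n_taxa, n_sites = int(header[0]), int(header[1])

    rows = lines[1:]
    if len(rows) != n_taxa:
        raise ValueError(f"header says {n_taxa} taxa, found {len(rows)} rows")

    for row in rows:
        parts = row.split(None, 1)  # name, then the rest of the line
        if len(parts) != 2:
            raise ValueError(f"row has no sequence: {row!r}")
        name, sequence = parts[0], "".join(parts[1].split())
        if len(sequence) != n_sites:
            raise ValueError(f"{name}: expected {n_sites} sites, found {len(sequence)}")
    return n_taxa, n_sites
```

For example, `check_sequential_phylip("test_data/concatenated_test.phy")` would return the taxon and site counts if the example file follows the sequential layout assumed here.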
Parameters file
The only other input to phyloGAN is the parameters file. For an example see test_data/params_concatenated.txt. In the example file, each line is described in a comment (following '#'). A few things to note:
- The temporary folder provided will be deleted during the run. DO NOT use an existing folder. This must be a new directory. If using an HPC, scratch spaces may be ideal because a lot of I/O to this directory will occur.
- The pseudoobserved setting is recommended only for specific development contexts. With this setting, datasets are simulated using branch lengths drawn from an exponential distribution with mean lambda, which is unlikely to be what most simulation studies need.
- It is recommended that users begin with a 'Random' start tree, rather than a 'NJ' start tree. Beginning with the 'NJ' tree seems to cause issues because the generated data seen early in training is too good.
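Because the temporary folder is deleted during the run, it is worth confirming that the chosen path does not already exist before launching. A minimal sketch follows; `check_temp_dir_is_new` is a hypothetical helper, not a phyloGAN function, and it expects you to pass the same path you wrote in the parameters file.

```python
import os


def check_temp_dir_is_new(temp_dir):
    """Return the absolute path of temp_dir, or raise if it already exists."""
    path = os.path.abspath(temp_dir)
    if os.path.exists(path):
        raise FileExistsError(f"{path} already exists; choose a new directory name")
    return path
```

Calling this on the intended temporary path before starting a fresh run guards against pointing phyloGAN at a folder that holds data you want to keep.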
Running phyloGAN
To run phyloGAN:
python ./phyloGAN/scripts/phyloGAN.py ./test_data/params_concatenated.txt
To continue a run for which checkpoint files have previously been generated:
python ./phyloGAN/scripts/phyloGAN.py ./test_data/params_concatenated.txt checkpoint
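When runs are launched from another Python script (for example, on a cluster), the two commands above can be wrapped in a small helper. This is a sketch only: `build_phylogan_command` and `run_phylogan` are hypothetical names, and the sole behavior assumed of phyloGAN is the command-line form shown above (script, parameters file, optional `checkpoint` argument).

```python
import subprocess
import sys


def build_phylogan_command(script, params_file, resume=False):
    """Build the argument list for a phyloGAN run.

    When resume is True, the trailing 'checkpoint' argument is appended,
    matching the command used to continue a previous run.
    """
    command = [sys.executable, script, params_file]
    if resume:
        command.append("checkpoint")
    return command


def run_phylogan(script, params_file, resume=False):
    """Run phyloGAN and raise CalledProcessError if it exits with an error."""
    command = build_phylogan_command(script, params_file, resume=resume)
    return subprocess.run(command, check=True)


if __name__ == "__main__":
    print(build_phylogan_command(
        "./phyloGAN/scripts/phyloGAN.py",
        "./test_data/params_concatenated.txt",
        resume=True,
    ))
```

Using `sys.executable` keeps the run in the same Python environment as the calling script, which matters when the dependencies are installed in a virtual environment.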
Output
For an example of phyloGAN (concatenation) output, see example_results.
- Lambdas from stage 1 are recorded in Lambdas.txt.
- Discriminator accuracies are recorded in DiscriminatorRealAcc.txt and DiscriminatorFakeAcc.txt.
- Generator accuracies are recorded in GeneratorFakeAcc.txt.
- Discriminator losses are recorded in DiscriminatorLoss.txt.
- Generator losses are recorded in GeneratorLoss.txt.
- The trees at each iteration are recorded in Trees.txt.
- Robinson–Foulds distances between each tree and the true tree (when provided) are recorded in RFdistances.txt.
- Various plots are provided in .png format.
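After a run, a quick way to confirm that it wrote everything is to check the results folder against the file names listed above. The sketch below is not part of phyloGAN; `missing_outputs` is a hypothetical helper, and it assumes that RFdistances.txt is written only when a true tree is provided.

```python
import os

# File names taken from the output list above.
EXPECTED_OUTPUTS = [
    "Lambdas.txt",
    "DiscriminatorRealAcc.txt",
    "DiscriminatorFakeAcc.txt",
    "GeneratorFakeAcc.txt",
    "DiscriminatorLoss.txt",
    "GeneratorLoss.txt",
    "Trees.txt",
]
# Assumption: this file appears only when a true tree was supplied.
TRUE_TREE_OUTPUTS = ["RFdistances.txt"]


def missing_outputs(results_dir, expect_true_tree=False):
    """Return the expected output files that are absent from results_dir."""
    expected = list(EXPECTED_OUTPUTS)
    if expect_true_tree:
        expected += TRUE_TREE_OUTPUTS
    return [
        name for name in expected
        if not os.path.isfile(os.path.join(results_dir, name))
    ]
```

For instance, `missing_outputs("example_results", expect_true_tree=True)` returns an empty list when every listed file is present.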
phyloGAN-ILS
phyloGAN-ILS takes as input a folder with gene alignments and infers a species tree.
Input Files
Gene alignments
A directory containing single-copy gene alignments should be provided to phyloGAN-ILS. If a species is missing from an alignment, include it as a sequence consisting entirely of 'N's.
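The padding of missing species can be sketched as follows. This is an illustration rather than phyloGAN code: `pad_missing_taxa` is a hypothetical helper that works on an alignment held as a dictionary of name-to-sequence pairs, and reading or writing the alignment files is left to whatever library you already use.

```python
def pad_missing_taxa(alignment, all_taxa, fill="N"):
    """Return a copy of alignment with an all-`fill` row for each absent taxon.

    alignment: dict mapping taxon name -> aligned sequence (equal lengths).
    all_taxa: every taxon that should appear in each gene alignment.
    """
    lengths = {len(sequence) for sequence in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("alignment must be non-empty with equal-length sequences")
    n_sites = lengths.pop()

    padded = dict(alignment)
    for taxon in all_taxa:
        # Existing rows are kept; only absent taxa receive a row of 'N's.
        padded.setdefault(taxon, fill * n_sites)
    return padded


if __name__ == "__main__":
    gene = {"A": "ACGT", "B": "AC-T"}
    print(pad_missing_taxa(gene, ["A", "B", "C"]))
```

In the usage example, species `C` is absent from the gene alignment and is added as `NNNN`, so every alignment ends up with the same set of species.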
Parameters file
The parameters file is similar to the one used by the concatenation version. For an example, see test_data/params_coalescent.txt.
Running phyloGAN-ILS
Unpack test gene alignments:
cd test_data
tar -xzf gene_alignments.tar.gz
cd ../
To run phyloGAN-ILS:
python ./phyloGAN_ils/scripts/phyloGAN.py ./test_data/params_coalescent.txt
To continue a run for which checkpoint files have previously been generated:
python ./phyloGAN_ils/scripts/phyloGAN.py ./test_data/params_coalescent.txt checkpoint
Owner
- Name: Megan Smith
- Login: meganlsmith
- Kind: user
- Location: Bloomington, Indiana
- Company: Indiana University
- Website: meganlsmith.org
- Twitter: snaild_it
- Repositories: 1
- Profile: https://github.com/meganlsmith
Citation (CITATION.cff)
cff-version: 1.0.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Smith"
    given-names: "M.L."
  - family-names: "Hahn"
    given-names: "M.W."
title: "phyloGAN"
version: 1.0.0
identifiers:
  - type: doi
    value: 10.5281/zenodo.12688252
date-released: 2023