phylogan

phyloGAN is a Generative Adversarial Network (GAN) that infers phylogenetic relationships.

https://github.com/meganlsmith/phylogan

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.1%) to scientific vocabulary
Last synced: 8 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: meganlsmith
  • Language: Python
  • Default Branch: main
  • Size: 4.52 MB
Statistics
  • Stars: 9
  • Watchers: 1
  • Forks: 3
  • Open Issues: 1
  • Releases: 1
Created over 3 years ago · Last pushed almost 2 years ago

README.md

phyloGAN

Introduction

phyloGAN is a Generative Adversarial Network (GAN) that infers phylogenetic relationships. It takes as input either a concatenated alignment or a set of gene alignments and infers a phylogenetic tree, either accounting for or ignoring gene tree heterogeneity.

Installation

Dependencies

phyloGAN requires several Python packages, along with AliSim.

Python packages

Handling trees and alignments: Biopython, dendropy, ete3

Machine learning: tensorflow

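Miscellaneous utilities: copy, datetime, io, itertools, matplotlib, numpy, os, random, re, scipy, sys (of these, only matplotlib, numpy, and scipy need to be installed; the rest are part of the Python standard library)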
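
Before a first run, it can help to confirm that the third-party packages listed above are importable. The following is a minimal sketch, not part of phyloGAN itself; the function name missing_modules is hypothetical. Note that Biopython is imported under the name Bio.

```python
import importlib.util

# Import names for the third-party packages listed above.
# Biopython installs under the import name "Bio".
REQUIRED_MODULES = ["Bio", "dendropy", "ete3", "tensorflow",
                    "matplotlib", "numpy", "scipy"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```

The check only looks the modules up; it does not import them, so it runs quickly even when tensorflow is installed.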

AliSim

phyloGAN was developed using the version of AliSim distributed with IQ-TREE v2.2.0 (Beta). A version of IQ-TREE with AliSim must be installed, and the user must provide the path to the executable.
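
Because the path to the IQ-TREE executable must be supplied by the user, a quick check that the path resolves to something runnable can save a failed run. This is a sketch under assumptions: the helper name resolve_executable is hypothetical, and "iqtree2" is only an assumed command name that may differ on your system.

```python
import os
import shutil


def resolve_executable(path_or_name):
    """Return an absolute path to an executable, or None if it cannot be found.

    Accepts either an explicit path or a bare command name to look up on PATH.
    """
    found = shutil.which(path_or_name)
    return os.path.abspath(found) if found else None


if __name__ == "__main__":
    # "iqtree2" is an assumed command name; substitute the path used on your system.
    print(resolve_executable("iqtree2") or "IQ-TREE executable not found")
```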

phyloGAN

To install phyloGAN, clone the GitHub repository:

git clone https://github.com/meganlsmith/phyloGAN.git

phyloGAN (concatenation version)

The original version of phyloGAN takes as input a concatenated alignment and infers a phylogenetic tree.

Input Files

Concatenated alignment

The concatenated alignment should be provided in phylip format. For an example see test_data/concatenated_test.phy.
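
To sanity-check an alignment before a run, the file can be parsed and its dimensions compared with the header. The sketch below is a simplified reader written for illustration, not the parser phyloGAN uses. It assumes a sequential, relaxed PHYLIP layout (one taxon per line, name and sequence separated by whitespace) and does not handle interleaved files.

```python
def read_phylip(path):
    """Parse a sequential PHYLIP file into a {taxon_name: sequence} dict.

    Assumes one taxon per line, with the name and sequence separated by
    whitespace. Raises ValueError if the content disagrees with the header.
    """
    with open(path) as handle:
        lines = [line.strip() for line in handle if line.strip()]
    # Header line: number of taxa, then number of sites.
    n_taxa, n_sites = (int(value) for value in lines[0].split()[:2])
    alignment = {}
    for line in lines[1:]:
        name, sequence = line.split(None, 1)
        # Drop any whitespace used to group sequence blocks.
        alignment[name] = "".join(sequence.split())
    if len(alignment) != n_taxa:
        raise ValueError(f"expected {n_taxa} taxa, found {len(alignment)}")
    for name, sequence in alignment.items():
        if len(sequence) != n_sites:
            raise ValueError(f"{name}: expected {n_sites} sites, found {len(sequence)}")
    return alignment
```

For example, read_phylip("test_data/concatenated_test.phy") would return the sequences keyed by taxon name, provided the file follows the layout assumed here.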

Parameters file

The only other input to phyloGAN is the parameters file. For an example see test_data/params_concatenated.txt. In the example file, each line is described in a comment (following '#'). A few things to note:

  • The temporary folder named in the parameters file is deleted during the run. DO NOT use an existing folder; specify a new directory. On an HPC system, scratch space may be ideal because the run performs heavy I/O in this directory.
  • The pseudoobserved setting is recommended only for specific development contexts. It simulates datasets using branch lengths drawn from an exponential distribution with mean lambda, which is likely not what most simulation studies want.
  • Users are advised to begin with a 'Random' start tree rather than an 'NJ' start tree. Beginning with the 'NJ' tree seems to cause issues because the generated data seen early in training are too good.
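
Two of the notes above can be illustrated with a short sketch. It is not taken from phyloGAN; both function names are hypothetical. The first function guards against pointing the temporary folder at a directory that already exists. The second shows what "exponential distribution with mean lambda" means in practice: Python's random.expovariate takes a rate, so the mean must be inverted.

```python
import os
import random


def require_new_directory(path):
    """Raise if `path` already exists, since the temporary folder is deleted during a run."""
    if os.path.exists(path):
        raise FileExistsError(f"{path} already exists; choose a new directory name")
    return path


def draw_branch_lengths(n_branches, mean, seed=None):
    """Draw branch lengths from an exponential distribution with the given mean.

    random.expovariate expects a rate, which is 1 / mean.
    """
    rng = random.Random(seed)
    return [rng.expovariate(1.0 / mean) for _ in range(n_branches)]
```

Calling require_new_directory on the intended temporary path before launching a run gives an early error instead of a deleted folder.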

Running phyloGAN

To run phyloGAN:

python ./phyloGAN/scripts/phyloGAN.py ./test_data/params_concatenated.txt

To continue a run for which checkpoint files have previously been generated:

python ./phyloGAN/scripts/phyloGAN.py ./test_data/params_concatenated.txt checkpoint
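
When runs are launched from a script (for example, inside a job submission wrapper), the two invocations above differ only by the trailing checkpoint argument. A minimal sketch, with hypothetical helper names, could build either command as follows.

```python
import subprocess
import sys


def build_command(script, params_file, resume=False):
    """Build the argument list for a fresh run or, if resume is True, a checkpoint restart."""
    command = [sys.executable, script, params_file]
    if resume:
        # The literal word "checkpoint" is passed as the second script argument.
        command.append("checkpoint")
    return command


def run_phylogan(script, params_file, resume=False):
    """Run the script and raise CalledProcessError if it exits with an error."""
    return subprocess.run(build_command(script, params_file, resume), check=True)
```

For instance, run_phylogan("./phyloGAN/scripts/phyloGAN.py", "./test_data/params_concatenated.txt", resume=True) corresponds to the checkpoint command shown above.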

Output

For an example of phyloGAN (concatenation) output, see example_results.

  • Lambdas from stage 1 are recorded in Lambdas.txt.
  • Discriminator accuracies are recorded in DiscriminatorRealAcc.txt and DiscriminatorFakeAcc.txt.
  • Generator accuracies are recorded in GeneratorFakeAcc.txt.
  • Discriminator losses are recorded in DiscriminatorLoss.txt.
  • Generator losses are recorded in GeneratorLoss.txt.
  • The trees at each iteration are recorded in Trees.txt.
  • Robinson–Foulds distances between each tree and the true tree (when provided) are recorded in RFdistances.txt.
  • Various plots are provided in .png format.
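
For readers unfamiliar with the Robinson–Foulds distance reported in RFdistances.txt, the sketch below computes the unnormalized distance between two unrooted topologies as the number of bipartitions found in one tree but not the other. It is an illustration only: the nested-tuple tree representation and the function names are assumptions made for this example, and it does not show how phyloGAN itself computes or scales the values.

```python
def leaf_set(tree):
    """Return the set of leaf names in a tree given as nested tuples of strings."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def bipartitions(tree):
    """Return the non-trivial bipartitions (splits) of the leaves induced by the tree."""
    all_leaves = leaf_set(tree)
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(visit(child) for child in node))
        other = all_leaves - leaves
        # Keep only splits with at least two leaves on each side.
        if len(leaves) >= 2 and len(other) >= 2:
            # Store the split as an unordered pair so rooting does not matter.
            splits.add(frozenset([leaves, other]))
        return leaves

    visit(tree)
    return splits


def rf_distance(tree_a, tree_b):
    """Count the bipartitions present in exactly one of the two trees."""
    return len(bipartitions(tree_a) ^ bipartitions(tree_b))


if __name__ == "__main__":
    true_tree = ((("A", "B"), "C"), ("D", "E"))
    inferred = ((("A", "C"), "B"), ("D", "E"))
    print(rf_distance(true_tree, inferred))  # 2
```

A distance of 0 means the two trees share every internal split, so a decreasing series of distances indicates that the inferred trees are approaching the true topology.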

phyloGAN-ILS

phyloGAN-ILS takes as input a folder with gene alignments and infers a species tree.

Input Files

Gene alignments

A directory containing single-copy gene alignments should be provided to phyloGAN-ILS. If a species is missing from an alignment, it should still be included, with its sequence filled with 'N's.
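
Padding absent species with 'N's can be done programmatically before the alignments are written out. The following is a minimal sketch with a hypothetical function name; it assumes each alignment is held as a dictionary mapping species names to equal-length sequences.

```python
def pad_missing_species(alignment, all_species, fill="N"):
    """Return a copy of the alignment in which absent species are all-'N' sequences.

    alignment: dict mapping species name to an aligned sequence string.
    all_species: iterable of every species expected in each gene alignment.
    """
    lengths = {len(sequence) for sequence in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("sequences in an alignment must all have the same length")
    n_sites = lengths.pop()
    padded = dict(alignment)
    for species in all_species:
        # Existing sequences are left untouched; only absent species are added.
        padded.setdefault(species, fill * n_sites)
    return padded
```

For example, an alignment with sequences for sp1 and sp2 only, padded against the species list sp1, sp2, sp3, gains an sp3 entry consisting entirely of 'N's.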

Parameters file

The parameters file is similar to the one used by the concatenation version. For an example see test_data/params_coalescent.txt.

Running phyloGAN-ILS

Unpack test gene alignments:

cd test_data
tar -xzf gene_alignments.tar.gz
cd ../

To run phyloGAN-ILS:

python ./phyloGAN_ils/scripts/phyloGAN.py ./test_data/params_coalescent.txt

To continue a run for which checkpoint files have previously been generated:

python ./phyloGAN_ils/scripts/phyloGAN.py ./test_data/params_coalescent.txt checkpoint

Owner

  • Name: Megan Smith
  • Login: meganlsmith
  • Kind: user
  • Location: Bloomington, Indiana
  • Company: Indiana University

Citation (CITATION.cff)

cff-version: 1.0.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Smith"
  given-names: "M.L."
- family-names: "Hahn"
  given-names: "M.W."
title: "phyloGAN"
version: 1.0.0
identifiers:
  - type: doi
    value: 10.5281/zenodo.12688252
date-released: 2023

GitHub Events

Total
  • Issues event: 1
  • Watch event: 3
Last Year
  • Issues event: 1
  • Watch event: 3

Issues and Pull Requests

All Time
  • Total issues: 1
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 1
  • Total pull request authors: 0
  • Average comments per issue: 0.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 0.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • AntonioBaeza (1)