network-tree-reconciliation
Code and data for "An efficient algorithm for the reconciliation of a gene network and species tree"
Repository
Basic Info
- Host: GitHub
- Owner: yao-ban
- Language: C++
- Default Branch: main
- Size: 57.6 KB
Statistics
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Supplementary material for the paper "An efficient algorithm for the reconciliation of a gene network and species tree"
simulate_species.cpp
Usage:
./simulate_species <number of species>
This produces a species tree via a pure birth process with birth rate Srate (a macro defined in the code, defaulting to 1). The tree is output in the format:
<number of nodes> 1
<lines>
where each line represents one node in bottom-up order, in the format:
<node id> <type> <id of parent> <label if a leaf> <time>
Details:
- node ids start at 0 and increase by 1 with no gaps, so the largest id is one less than the number of nodes; they are in bottom-up order, so the node id of a child < the node id of its parent
- type is 0 = root, 1 = divergence, 2 = reticulation (does not happen for species trees), 3 = leaf
- time starts at 0 for leaf nodes (current day) and increases going backwards in time
- the root node always has no parent and exactly 1 child (not 2 children)
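The invariants listed above can be checked mechanically once a tree has been read into memory. The following is a minimal sketch, not part of the repository: the `Node` record, the use of `None` for the root's parent, and the function name `check_species_tree` are all assumptions made for illustration, and the sketch deliberately does not guess how the fields are laid out in the file.

```python
from typing import NamedTuple, Optional

ROOT, DIVERGENCE, RETICULATION, LEAF = 0, 1, 2, 3  # type codes from the README


class Node(NamedTuple):
    node_id: int
    type: int
    parent: Optional[int]  # None for the root (assumed in-memory convention)
    time: float


def check_species_tree(nodes):
    """Return a list of violated invariants (empty if none are found)."""
    problems = []
    # Ids must be 0, 1, 2, ... with no gaps, in the order listed.
    if [n.node_id for n in nodes] != list(range(len(nodes))):
        return ["node ids are not consecutive from 0"]
    child_count = [0] * len(nodes)
    for n in nodes:
        if n.type == RETICULATION:
            problems.append(f"node {n.node_id}: reticulation in a species tree")
        if n.type == LEAF and n.time != 0:
            problems.append(f"node {n.node_id}: leaf time is not 0")
        if n.type == ROOT:
            if n.parent is not None:
                problems.append(f"node {n.node_id}: root has a parent")
            continue
        # Bottom-up order: a child's id is smaller than its parent's id.
        if n.parent is None or not (n.node_id < n.parent < len(nodes)):
            problems.append(f"node {n.node_id}: parent id must exceed child id")
            continue
        child_count[n.parent] += 1
        # Time increases going backwards, so a parent is never younger.
        if nodes[n.parent].time < n.time:
            problems.append(f"node {n.node_id}: parent is younger than child")
    for n in nodes:
        if n.type == ROOT and child_count[n.node_id] != 1:
            problems.append(f"node {n.node_id}: root must have exactly 1 child")
    return problems


# Hypothetical two-leaf tree: leaves 0 and 1, divergence 2, root 3.
example = [
    Node(0, LEAF, 2, 0.0),
    Node(1, LEAF, 2, 0.0),
    Node(2, DIVERGENCE, 3, 0.7),
    Node(3, ROOT, None, 1.2),
]
print(check_species_tree(example))
```

The example tree satisfies every listed invariant, so the returned list is empty.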
simulate_gene.cpp
Usage:
./simulate_gene <species tree file> [--Drate <D rate>] [--Lrate <L rate>] [--Rrate <R rate>]
This produces a gene network via a birth-and-death process inside the species tree given by <species tree file>, which must be in the format output by simulate_species above.
The birth-and-death process simulates D events at a rate of <D rate> per lineage, L events at a rate of <L rate> per lineage, and R events at a rate of <R rate> per lineage past the first in a species. Each rate defaults to 0.5.
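To make the rate description concrete, here is a small sketch of how the total event rates could scale with the number of gene lineages in one species. It reflects one reading of the sentence above (R events occur at <R rate> times the number of lineages beyond the first) and is not taken from simulate_gene.cpp; the function names `event_rates` and `next_event` are hypothetical, and the sampling step is a standard Gillespie-style draw that the actual simulator may or may not use.

```python
import random


def event_rates(num_lineages, d_rate=0.5, l_rate=0.5, r_rate=0.5):
    """Total D, L and R rates for `num_lineages` gene lineages in one species."""
    return {
        "D": d_rate * num_lineages,
        "L": l_rate * num_lineages,
        # R is charged per lineage past the first, so a lone lineage has rate 0.
        "R": r_rate * max(num_lineages - 1, 0),
    }


def next_event(num_lineages, rng, **rates):
    """Sample (waiting time, event type), or None if no event can occur."""
    totals = event_rates(num_lineages, **rates)
    total = sum(totals.values())
    if total == 0:
        return None
    wait = rng.expovariate(total)  # exponential waiting time with the total rate
    kind = rng.choices(list(totals), weights=list(totals.values()))[0]
    return wait, kind


print(event_rates(1))  # a single lineage cannot undergo an R event
print(event_rates(3))
```

Under this reading, a species holding a single gene lineage has an R rate of 0, which is why the R rate only matters once a duplication has occurred.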
The network is output in the format:
<number of nodes> 0
<lines>
where each line represents one node in bottom-up order, in the format:
<node id> <type> <id of parent> <id of second parent if a reticulation> <label if a leaf> <breakpoint if a reticulation>
Then a reconciliation is output in the format:
<lines>
where each line represents the mapping of one gene node in id order, in the format:
<gene node id> <species node id> <event type>
Details:
- event type is C = current, S = speciation, D = duplication, R = reticulation, T = rooT
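The reconciliation lines can be loaded into a dictionary for downstream checks. This is a hedged sketch rather than code from the repository: it assumes the three fields are separated by whitespace, and the names `parse_reconciliation` and `event_counts` as well as the sample lines are invented for illustration.

```python
from collections import Counter

# Event codes as listed in the README.
EVENT_NAMES = {
    "C": "current",
    "S": "speciation",
    "D": "duplication",
    "R": "reticulation",
    "T": "root",
}


def parse_reconciliation(lines):
    """Parse '<gene node id> <species node id> <event type>' lines."""
    mapping = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        gene_id, species_id, event = int(fields[0]), int(fields[1]), fields[2]
        if event not in EVENT_NAMES:
            raise ValueError(f"unknown event type {event!r}")
        mapping[gene_id] = (species_id, event)
    return mapping


def event_counts(mapping):
    """Count how many gene nodes carry each event type."""
    return Counter(event for _, event in mapping.values())


# Hypothetical sample lines, not real program output.
sample = ["0 0 C", "1 1 C", "2 2 D", "3 3 T"]
mapping = parse_reconciliation(sample)
print(event_counts(mapping))
```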
perecon.cpp
Usage:
./perecon <gene network file> <species tree file>
This produces the most parsimonious reconciliation (MPR) between the gene network given by <gene network file> and the species tree given by <species tree file>. These must be in the formats output by simulate_gene and simulate_species, respectively.
The MPR is calculated with each D event costing Dcost (default 1), and each L event costing Lcost (default 1), where Dcost and Lcost are macros defined in the code.
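As a small illustration of the cost model described above, the sketch below scores a reconciliation from its D and L counts and picks the cheapest of several candidates. It assumes that only D and L events contribute to the cost, as the text states; the function names and the candidate counts are hypothetical, and this says nothing about how perecon actually searches for the MPR.

```python
def reconciliation_cost(num_duplications, num_losses, d_cost=1, l_cost=1):
    """Cost of a reconciliation: each D costs d_cost, each L costs l_cost."""
    return d_cost * num_duplications + l_cost * num_losses


def most_parsimonious(candidates, d_cost=1, l_cost=1):
    """Return the (num_duplications, num_losses) pair with the lowest cost."""
    return min(candidates, key=lambda c: reconciliation_cost(c[0], c[1], d_cost, l_cost))


# Hypothetical candidates given as (number of D events, number of L events).
candidates = [(2, 5), (3, 1), (4, 2)]
print(most_parsimonious(candidates))
```

With the default unit costs the candidates above cost 7, 4 and 6, so the second one is selected.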
The reconciliation is output in the format:
<lines>
where each line represents the mapping of one gene node in id order, in the format:
<gene node id> <species node id> <event type>
Details:
- event type is C = current, S = speciation, D = duplication, R = reticulation, T = rooT
- the program currently also prints additional diagnostic information: the species tree, the gene network, the LCA reconciliation, the LCA-HCA reconciliation, the BCC decomposition, the tree-child-ness of each BCC, intermediate cost calculations, and the final MPR cost
wrapper.py
Usage:
python3 wrapper.py [-s <number of species>] [-d <D rate>] [-l <L rate>] [-r <R rate>] [--replicates <number of replicates>]
This runs simulate_species with <number of species> species, then simulate_gene with the given rates, then perecon to calculate an MPR. It then outputs one line of comma-separated values:
- number of D events with correct type and location;
- number of D events with correct location only;
- number of D events with correct type only;
- number of D events with neither correct type nor location;
- number of S events with correct type and location;
- number of S events with correct location only;
- number of S events with correct type only;
- number of S events with neither correct type nor location;
- number of R events with correct type and location;
- number of R events with correct location only (always 0);
- number of R events with correct type only;
- number of R events with neither correct type nor location (always 0);
- proportion of sequence with correct paralogy.
This is then repeated a total of <number of replicates> times.
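The thirteen values in each output line can be given names when the results are loaded. The following sketch assumes the field order listed above, that the twelve counts are printed as integers, and that the final proportion is a decimal number; the key names (such as `D_both`) and the sample line are invented for illustration and do not come from wrapper.py.

```python
EVENTS = ("D", "S", "R")
OUTCOMES = ("both", "location_only", "type_only", "neither")


def parse_wrapper_line(line):
    """Turn one comma-separated output line into a dict of named values."""
    fields = [f.strip() for f in line.split(",")]
    if len(fields) != 13:
        raise ValueError(f"expected 13 fields, got {len(fields)}")
    # Field order: all four outcomes for D, then for S, then for R.
    names = [f"{event}_{outcome}" for event in EVENTS for outcome in OUTCOMES]
    result = {name: int(value) for name, value in zip(names, fields)}
    result["paralogy"] = float(fields[12])
    return result


# Hypothetical line, not a real result.
row = parse_wrapper_line("5,1,0,0,9,0,1,0,2,0,1,0,0.93")
print(row["D_both"], row["S_type_only"], row["paralogy"])
```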
simulate.py
Usage:
python3 simulate.py
This runs wrapper.py for the range of parameters shown in the paper.
process.py
Usage:
python3 process.py
This opens all results/results-*.txt files and amalgamates them into the single file results/results.csv. Each line is copied as
<line>, <D rate>, <L rate>, <R rate>
analysis.R
This produces (from the results/results.csv file) all plots used in the paper.
Owner
- Login: yao-ban
- Kind: user
- Repositories: 1
- Profile: https://github.com/yao-ban
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: An efficient algorithm for the reconciliation of a gene network and species tree
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Yao-ban
    family-names: Chan
    email: yaoban@unimelb.edu.au
    affiliation: The University of Melbourne
    orcid: 'https://orcid.org/0000-0002-8425-8775'
repository-code: 'https://github.com/yao-ban/network-tree-reconciliation'
abstract: >-
  Supplementary material for the paper "An efficient
  algorithm for the reconciliation of a gene network and
  species tree"
keywords:
  - Reconciliation
  - Phylogenetic network
  - Paralog exchange
license: GPL-3.0