treemix
Scripts to analyze data using TreeMix. This pipeline runs TreeMix with bootstrapping, helps choose number of migration events and creates a consensus tree. It plots the maximum likelihood tree with bootstrap values, drift and residuals and calculates statistics for every migration event, such as migration support, standard error and p-values.
Repository
Basic Info
- Host: GitHub
- Owner: carolindahms
- Language: R
- Default Branch: main
- Size: 38.1 KB
Statistics
- Stars: 34
- Watchers: 1
- Forks: 1
- Open Issues: 8
- Releases: 0
Metadata Files
README.md
TreeMix
Scripts to infer population splits and mixture events from allele frequency data using TreeMix by Pickrell & Pritchard (2012). This pipeline runs TreeMix with bootstrapping, helps choose the number of migration events and creates a consensus tree. It plots a maximum likelihood tree with bootstrap values, drift and residuals, and calculates statistics for every migration event, such as migration support, standard error and p-values.
Based on scripts written by Vajana and Milanesi (2017) and R functions by Zecca, Labra and Grassi (2020).
Pipeline
1. Build consensus tree with multiple migration events
Assumes a TreeMix input file, which can be created from a PLINK file using the scripts provided with the TreeMix software, or from a VCF using the Stacks populations module (--treemix).
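Before launching the pipeline it can help to confirm that the input has the layout TreeMix expects: a gzipped text file whose first line lists the population names and whose remaining lines each describe one SNP as comma-separated allele-count pairs, one pair per population. The sketch below is a hypothetical helper and not part of this repository; `check_treemix_input` and the toy file are illustrative names only.

```shell
#!/bin/sh
# Hypothetical sanity check for a TreeMix input file (not shipped with this repository).
# Assumed layout: gzipped text, header line of population names, then one SNP per line
# with one "count,count" pair per population.

tmpdir=$(mktemp -d)

# Toy input with three populations and two SNPs, used only to demonstrate the check.
printf 'popA popB popC\n1,9 4,6 10,0\n3,7 5,5 8,2\n' | gzip > "$tmpdir/toy.treemix.gz"

check_treemix_input() {
    # Exit status 0 if every SNP line has as many well-formed pairs as the header has names.
    gzip -dc "$1" | awk '
        NR == 1 { n = NF; next }
        NF != n { printf "line %d: %d fields, expected %d\n", NR, NF, n; bad = 1 }
        {
            for (i = 1; i <= NF; i++)
                if ($i !~ /^[0-9]+,[0-9]+$/) {
                    printf "line %d: malformed entry %s\n", NR, $i
                    bad = 1
                }
        }
        END { exit (bad ? 1 : 0) }'
}

if check_treemix_input "$tmpdir/toy.treemix.gz"; then
    result=ok
else
    result=bad
fi
echo "$result"
```

To check a real file, call `check_treemix_input input.treemix.gz`; any reported line numbers refer to the decompressed file.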
Run Step1_TreeMix.sh on the command line, providing an input file, maximum number of cores, block size, outgroup (or 'noRoot' for unrooted trees), number of bootstrap replicates, path to the PHYLIP consense program, output file name, range of migration events (m) and the number of replicates per m, for example:
sh Step1_TreeMix.sh input.treemix.gz 10 100 Nipponicus 500 /appl/soft/phylip-3.697/exe/consense 3spine 1 10 10
This builds a consensus tree from the bootstrap replicates and adds migration events across the specified range of m. Tree replicates are stored in the test_migrations folder.
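Because Step1_TreeMix.sh takes ten positional arguments, naming them in a small launcher makes the call easier to read and adapt. The following is a minimal sketch using the values from the example above; the variable names are illustrative, and the reading of the last three values as minimum m, maximum m and replicates per m is an assumption based on the argument description, not something confirmed by the script itself.

```shell
#!/bin/sh
# Illustrative launcher for Step1_TreeMix.sh; variable names are hypothetical.
input=input.treemix.gz                          # TreeMix input file
ncores=10                                       # maximum number of cores
blocksize=100                                   # block size
outgroup=Nipponicus                             # or 'noRoot' for an unrooted tree
nboot=500                                       # number of bootstrap replicates
consense=/appl/soft/phylip-3.697/exe/consense   # path to PHYLIP consense
outname=3spine                                  # output file name
min_m=1                                         # assumed: lower end of the m range
max_m=10                                        # assumed: upper end of the m range
m_reps=10                                       # assumed: replicates per value of m

step1_cmd="sh Step1_TreeMix.sh $input $ncores $blocksize $outgroup $nboot $consense $outname $min_m $max_m $m_reps"

# Print the command for inspection; run it with: eval "$step1_cmd"
echo "$step1_cmd"
```

Printing the command first gives a chance to check the argument order before committing to a long bootstrap run.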
2. Test migration edges with OptM
Set the working directory to the test_migrations folder and run the R package OptM (step A) from the R script Step2&4_TreeMix.R.
This helps identify the optimum number of migration edges (m).
3. Final runs with optimum number of migration edges
Run Step3_TreeMix.sh, providing an input file, maximum number of cores, block size, outgroup (or 'noRoot' for unrooted trees), number of bootstrap replicates, number of migrations, output file name, number of independent runs (N), name of the consensus tree built in Step 1, and path to the consense program, for example:
sh Step3_TreeMix.sh input.treemix.gz 10 100 Nipponicus 500 3 3spine 30 3spine_constree.newick /appl/soft/phylip-3.697/exe/consense
Returns trees from the chosen number of independent runs, each with the optimum number of m. The final runs are stored in the final_runs folder.
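The argument order of Step3_TreeMix.sh differs from Step 1 (the consense path moves to the end and the number of migrations, output name, number of runs and consensus tree sit in between), so a mistyped call is easy to make. The sketch below is a hypothetical pre-flight wrapper, not part of this repository: `step3_preflight` only checks the argument count and that the numeric fields are non-negative integers, then prints the command it would run.

```shell
#!/bin/sh
# Hypothetical pre-flight check for Step3_TreeMix.sh arguments (illustrative only).
# Expected order, following the example above:
#   $1 input  $2 cores  $3 block size  $4 outgroup  $5 bootstraps
#   $6 migrations  $7 output name  $8 independent runs (N)
#   $9 consensus tree from Step 1  ${10} path to consense
step3_preflight() {
    if [ "$#" -ne 10 ]; then
        echo "expected 10 arguments, got $#" >&2
        return 2
    fi
    # Cores, block size, bootstraps, migrations and runs must be plain integers.
    for n in "$2" "$3" "$5" "$6" "$8"; do
        case $n in
            ''|*[!0-9]*)
                echo "not a non-negative integer: $n" >&2
                return 2
                ;;
        esac
    done
    # Only print the command; the caller decides whether to run it.
    echo "sh Step3_TreeMix.sh $*"
}

step3_cmd=$(step3_preflight input.treemix.gz 10 100 Nipponicus 500 3 3spine 30 \
    3spine_constree.newick /appl/soft/phylip-3.697/exe/consense)
echo "$step3_cmd"
```

The wrapper deliberately does not test whether the input, consensus tree or consense binary exist, since those paths depend on the local setup.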
4. Tree visualization + Migration stats and support
This step requires a saved copy of the file TreeMix_functions.R.
Set the working directory to the final_runs folder and run steps B and C from the Step2&4_TreeMix.R script.
These steps compare tree likelihoods across the final runs and plot the ML tree with bootstrap values and migration weights. They return the Migration Support (MS), the exact MS (MSE), and statistics for each migration event, such as the least significant p-value from all runs, the standard error and the migration weight averaged over N runs.
References
Milanesi, M., Capomaccio, S., Vajana, E., Bomba, L., Garcia, J. F., Ajmone-Marsan, P., & Colli, L. (2017). BITE: an R package for biodiversity analyses. bioRxiv, 181610. doi:10.1101/181610
Pickrell, J., & Pritchard, J. (2012). Inference of population splits and mixtures from genome-wide allele frequency data. Nature Precedings, 1-1.
Zecca, G., Labra, M., & Grassi, F. (2020). Untangling the Evolution of American Wild Grapes: Admixed Species and How to Find Them. Frontiers in Plant Science, 10, 1814.
Owner
- Login: carolindahms
- Kind: user
- Repositories: 1
- Profile: https://github.com/carolindahms
Citation (CITATION.cff)
cff-version: 1.2.0
title: TreeMix pipeline
message: >-
  If you use this dataset, please cite it using the
  metadata from this file.
type: dataset
authors:
  - given-names: Carolin
    family-names: Dahms
    orcid: 'https://orcid.org/0000-0002-3283-7820'
    email: carolin.dahms.ac@gmail.com
date-released: "2021-07-18"
identifiers:
  - type: url
    value: 'https://github.com/carolindahms/TreeMix'
abstract: >-
  Scripts to infer population splits and mixture
  events from allele frequency data using TreeMix by
  Pickrell & Pritchard (2012).
commit: d01820453dd7cfc9bfd4a0bd8252b1048fbf7130
keywords:
  - Phylogeny
  - SNP
GitHub Events
Total
- Issues event: 2
- Watch event: 13
- Issue comment event: 4
Last Year
- Issues event: 2
- Watch event: 13
- Issue comment event: 4