homoplasyfinder
A tool to identify and annotate homoplasies on a phylogeny and sequence alignment
Statistics
- Stars: 18
- Watchers: 3
- Forks: 3
- Open Issues: 6
- Releases: 0

HomoplasyFinder
Author: Joseph Crispell
Licence: GPL-3
Requires: R (>= v3.3.3) & rJava (>= v10.0.1)
Description
HomoplasyFinder is an open-source tool designed to identify homoplasies on a phylogeny and its nucleotide alignment. HomoplasyFinder uses the consistency index to identify sites in the nucleotide alignment that are inconsistent with the phylogeny provided. The current R package was written to allow easy use of the Java code (which HomoplasyFinder uses) in R. Full documentation is provided on the HomoplasyFinder wiki.
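As a rough illustration of the consistency index mentioned above, the sketch below uses the textbook definition: the minimum number of changes a site could require divided by the number of changes it needs on the tree. This is an assumption for illustration only; the function name `consistencyIndex` and its arguments are hypothetical, and the Java code behind HomoplasyFinder may compute the index differently.

```r
# Consistency index for a single alignment site (textbook definition).
# siteStates: character vector with one nucleotide per isolate.
# observedChanges: number of state changes the site needs on the tree,
#   assumed here to be supplied by the caller.
consistencyIndex <- function(siteStates, observedChanges) {
  # Keep only unambiguous nucleotides; gaps and N are ignored (assumption)
  states <- unique(toupper(siteStates))
  states <- states[states %in% c("A", "C", "G", "T")]

  # The fewest changes that can explain k distinct states is k - 1
  minimumChanges <- max(length(states) - 1, 0)

  # Constant sites need no changes and are treated as consistent
  if (minimumChanges == 0) {
    return(1)
  }
  if (observedChanges < minimumChanges) {
    stop("observedChanges cannot be smaller than the minimum number of changes")
  }
  minimumChanges / observedChanges
}

# Two states explained by a single change: consistent with the tree
consistencyIndex(c("A", "A", "G", "G"), observedChanges = 1)

# Two states that need two changes on the tree: a potential homoplasy
consistencyIndex(c("A", "G", "A", "G"), observedChanges = 2)
```

Under this definition a value of 1 means the site is consistent with the phylogeny, and values below 1 flag sites that changed more often than strictly necessary.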
Installation
install.packages("devtools")
devtools::install_github("JosephCrispell/homoplasyFinder")
devtools::install_github("JosephCrispell/basicPlotteR") # Makes annotated plotted phylogeny prettier :-)
library(homoplasyFinder)
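Before installing, it may help to confirm that the requirements are met. The following is a minimal sketch, not part of the package: `checkRequirements` is a hypothetical helper, and the rJava minimum of 0.9 is an assumption taken from the Dependencies list at the end of this page.

```r
# Hypothetical helper (not part of homoplasyFinder): report whether the
# stated requirements appear to be satisfied on this machine.
checkRequirements <- function(minimumR = "3.3.3", minimumRJava = "0.9") {
  # Compare the running R version against the stated minimum
  rOk <- getRversion() >= minimumR

  # rJava is only version-checked if it can be loaded; otherwise FALSE
  rJavaOk <- requireNamespace("rJava", quietly = TRUE) &&
    utils::packageVersion("rJava") >= minimumRJava

  c(R = rOk, rJava = rJavaOk)
}

checkRequirements()
```

A `FALSE` for `rJava` only says the package could not be loaded or is too old; it does not diagnose the underlying Java installation.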
Executing
```
# Find the FASTA and tree files attached to package
fastaFile <- system.file("extdata", "example.fasta", package = "homoplasyFinder")
treeFile <- system.file("extdata", "example.tree", package = "homoplasyFinder")

# Get the current working directory
workingDirectory <- paste0(getwd(), "/")

# Run the HomoplasyFinder jar tool
inconsistentPositions <- runHomoplasyFinderInJava(treeFile=treeFile, fastaFile=fastaFile, path=workingDirectory)

# Get the current date
date <- format(Sys.Date(), "%d-%m-%y")

# Read in the output table
resultsFile <- paste0(workingDirectory, "consistencyIndexReport_", date, ".txt")
results <- read.table(resultsFile, header=TRUE, sep="\t", stringsAsFactors=FALSE)

# Read in the annotated tree
tree <- readAnnotatedTree(workingDirectory)

# Plot the annotated tree
plotAnnotatedTree(tree, inconsistentPositions, fastaFile)
```
You should get the following plot:

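The report file name in the example above embeds the run date in `%d-%m-%y` form, so reading a report produced on an earlier day requires that date rather than today's. Below is a small sketch of a helper that rebuilds the path for any date. `buildReportPath` is a hypothetical name, and the file-name pattern is taken directly from the example.

```r
# Hypothetical helper: rebuild the path of a consistency index report
# for a given run date, following the naming used in the example above.
buildReportPath <- function(directory, runDate = Sys.Date()) {
  # The directory is expected to end with "/" as in the example
  paste0(directory, "consistencyIndexReport_", format(runDate, "%d-%m-%y"), ".txt")
}

# Path of a report written on 5 January 2019
buildReportPath("results/", as.Date("2019-01-05"))
```

Passing the result to `read.table` with `header=TRUE` and `sep="\t"` would then load that day's report in the same way as the example.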
Now extended to deal with the presence/absence of INDELs
HomoplasyFinder can now calculate the consistency of INDELs (or any regions) on a phylogeny. To do this, simply replace the FASTA file with a CSV-formatted table reporting the presence/absence of regions, with one row per region and one column per isolate. Here is an example of the format:
start,end,isolateA,isolateB,isolateC
34802,35208,0,1,0
39068,39069,0,0,1
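A table in this format can be written from R with base functions. The sketch below reproduces the example rows above; the object and file names are illustrative, and reading 1 as presence and 0 as absence is an assumption.

```r
# Presence (1) / absence (0) of two regions across three isolates.
# The 0/1 meaning is an assumption; the values mirror the example above.
regions <- data.frame(
  start = c(34802L, 39068L),
  end = c(35208L, 39069L),
  isolateA = c(0L, 0L),
  isolateB = c(1L, 0L),
  isolateC = c(0L, 1L)
)

# Write without row names or quotes so the header matches the format shown
presenceAbsenceFile <- file.path(tempdir(), "presenceAbsence.csv")
write.csv(regions, presenceAbsenceFile, row.names = FALSE, quote = FALSE)

# Print the file to confirm its layout
cat(readLines(presenceAbsenceFile), sep = "\n")
```

The isolate column names presumably need to match the tip labels in the tree file, so it is worth checking them before running the tool.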
Test it out using the following:
```
# Find the presence/absence and tree files attached to package
presenceAbsenceFile <- system.file("extdata", "presenceAbsence_INDELs.csv", package = "homoplasyFinder")
treeFile <- system.file("extdata", "example.tree", package = "homoplasyFinder")

# Get the current working directory
workingDirectory <- paste0(getwd(), "/")

# Run the HomoplasyFinder jar tool
inconsistentPositions <- runHomoplasyFinderInJava(treeFile=treeFile, presenceAbsenceFile=presenceAbsenceFile, path=workingDirectory)

# Get the current date
date <- format(Sys.Date(), "%d-%m-%y")

# Read in the output table
resultsFile <- paste0(workingDirectory, "consistencyIndexReport_", date, ".txt")
results <- read.table(resultsFile, header=TRUE, sep="\t", stringsAsFactors=FALSE)
```
Source code
Java source code is available here and R package (wrapper) code here.
Citation
If you use HomoplasyFinder in your research, it would be great if you could cite the following article: Crispell, J., Balaz, D., & Gordon, S. V. (2019). HomoplasyFinder: a simple tool to identify homoplasies on a phylogeny. Microbial Genomics. https://doi.org/10.1099/mgen.0.000245
Owner
- Name: Joseph Crispell
- Login: JosephCrispell
- Kind: user
- Website: https://josephcrispell.github.io/
- Twitter: josephcrispell
- Repositories: 25
- Profile: https://github.com/JosephCrispell
I'm a data scientist helping to use and promote reproducible data science
Citation (CITATION)
bibentry(
  bibtype = "Article",
  title = "HomoplasyFinder: a simple tool to identify homoplasies on a phylogeny",
  author = c(person("Joseph", "Crispell"),
             person("D.", "Balaz"),
             person("S. V.", "Gordon")),
  doi = "10.1099/mgen.0.000245",
  issn = "2057-5858",
  journal = "Microbial Genomics",
  publisher = "Microbiology Society",
  url = "http://www.microbiologyresearch.org/content/journal/mgen/10.1099/mgen.0.000245.v1",
  year = 2019
)
GitHub Events
Total
- Watch event: 1
- Issue comment event: 1
Last Year
- Watch event: 1
- Issue comment event: 1
Committers
Last synced: over 2 years ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| JosephCrispell | c****h@g****m | 112 |
Dependencies
- R >= 3.3.3 depends
- rJava >= 0.9 depends
- ape * imports
- rJava * imports