treemaker
treemaker: A Python tool for constructing a Newick formatted tree from a set of classifications. - Published in JOSS (2018)
Published in Journal of Open Source Software
Repository
A Python library for creating a Newick formatted tree from a set of classification strings
Statistics
- Stars: 10
- Watchers: 3
- Forks: 2
- Open Issues: 1
- Releases: 2
Metadata Files
README.md
treemaker
A Python library for creating a Newick formatted tree from a set of classification strings (e.g. a taxonomy)
treemaker is a Python library to convert a text-based classification schema into a Newick file for use in phylogenetic and bioinformatic programs.
Research in linguistics or cultural evolution often produces or uses tree taxonomies or classifications. However, these are usually not in a format that programs for reading and manipulating trees can load directly. For example, the global taxonomy of languages published by the Ethnologue classifies languages into families and subgroups using a taxonomy string: the language Kalam is classified as "Trans-New Guinea, Madang, Kalam-Kobon", while Mauwake is classified as "Trans-New Guinea, Madang, Croisilles, Pihom", and Kare is "Trans-New Guinea, Madang, Croisilles, Kare". This classification indicates that, while all these languages are part of the Madang subgroup of the Trans-New Guinea language family, Kare and Mauwake are more closely related to each other, as both belong to the Croisilles subgroup.
Other publications use a tabular indented format to demarcate relationships, such as the example in Figure 1 from Stephen Wurm's classification of his proposed Yele-Solomons language phylum (Wurm 1975).
Both the taxonomy string and the tabular format, however, are hard to load into software packages that can analyse, compare, visualise and manipulate trees. treemaker aims to make this easy by converting taxonomic data into the Newick and Nexus (Maddison et al. 1997) formats commonly used by phylogenetic manipulation programs.
Converting a Taxonomy to a Tree:
treemaker can convert a text file with a taxonomy to a tree. These taxonomies can easily be obtained from Ethnologue or manually entered, such as this example from Wurm's (outdated) classification of Yele-Solomons in Figure 1:
```text
Bilua Yele-Solomons, Central Solomon
Baniata Yele-Solomons, Central Solomon
Lavukaleve Yele-Solomons, Central Solomon
Savosavo Yele-Solomons, Central Solomon
Kazukuru Yele-Solomons, Kazukuru
Guliguli Yele-Solomons, Kazukuru
Dororo Yele-Solomons, Kazukuru
Yele Yele-Solomons
```
treemaker can then generate a Newick representation:
```text
((Baniata,Bilua,Lavukaleve,Savosavo),(Dororo,Guliguli,Kazukuru),Yele);
```
...which can then be loaded into phylogenetic programs to visualise or manipulate as in Figure 2.
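To make the conversion concrete, here is a minimal pure-Python sketch of the idea. It is not treemaker's actual implementation: the function names `new_node` and `classifications_to_newick` are hypothetical, and the rules it applies (sort children by name, collapse groups that have a single child) are assumptions inferred from the example outputs in this README.

```python
def new_node():
    """Create an empty tree node: named subgroups plus directly attached taxa."""
    return {"groups": {}, "taxa": []}


def classifications_to_newick(records):
    """Build a Newick string from (taxon, classification) pairs.

    Hypothetical helper for illustration only; assumes the classification
    is a comma-separated path from the largest group to the smallest.
    """
    root = new_node()
    for taxon, classification in records:
        node = root
        # Walk (and create) the chain of groups, e.g. family -> subgroup.
        for group in classification.split(","):
            group = group.strip()
            if group:
                node = node["groups"].setdefault(group, new_node())
        node["taxa"].append(taxon)

    def render(node):
        # Pair every child with its name so groups and taxa sort together.
        children = [(name, render(sub)) for name, sub in node["groups"].items()]
        children += [(taxon, taxon) for taxon in node["taxa"]]
        parts = [text for _, text in sorted(children)]
        if len(parts) == 1:
            # Assumption: a group with a single child adds no structure.
            return parts[0]
        return "(" + ",".join(parts) + ")"

    return render(root) + ";"


yele_solomons = [
    ("Bilua", "Yele-Solomons, Central Solomon"),
    ("Baniata", "Yele-Solomons, Central Solomon"),
    ("Lavukaleve", "Yele-Solomons, Central Solomon"),
    ("Savosavo", "Yele-Solomons, Central Solomon"),
    ("Kazukuru", "Yele-Solomons, Kazukuru"),
    ("Guliguli", "Yele-Solomons, Kazukuru"),
    ("Dororo", "Yele-Solomons, Kazukuru"),
    ("Yele", "Yele-Solomons"),
]
print(classifications_to_newick(yele_solomons))
```

Under these assumptions the sketch reproduces the Newick string shown above for the Yele-Solomons data. For real work, use treemaker itself, which also checks its input.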
treemaker has been used to enable the analyses in (Bromham et al. 2018), and a number of forthcoming articles.


Installation:
Installation is only a pip install away:
```shell
pip install treemaker
```
Or from git:
```shell
git clone https://github.com/SimonGreenhill/treemaker/ treemaker
cd treemaker
python setup.py install
```
Usage: Command line:
Basic usage:
```shell
$ treemaker
usage: treemaker [-h] [-o OUTPUT] [-m {nexus,newick}] [--labels] input
```
e.g. Given a text file:
```text
LangA Indo-European, Germanic
LangB Indo-European, Germanic
LangC Indo-European, Romance
LangD Indo-European, Anatolian
```
... then you can build a taxonomy/classification tree from that as follows:
```shell
$ treemaker classification.txt
(LangD,(LangA,LangB),LangC);

# with nodelabels:
$ treemaker --labels classification.txt
(LangD,(LangA,LangB)Germanic,LangC)Indo-European;

$ treemaker -m nexus classification.txt
#NEXUS

begin trees;
    tree root = (LangD,(LangA,LangB),LangC);
end;
```
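The Nexus output above is the same Newick string wrapped in a `trees` block. The following sketch shows that wrapping step in isolation. `newick_to_nexus` is a hypothetical helper, not part of treemaker's API, and the exact whitespace treemaker emits is an assumption.

```python
def newick_to_nexus(newick, tree_name="root"):
    """Wrap a Newick string in a minimal NEXUS trees block.

    Hypothetical helper for illustration; the layout mirrors the
    example output in this README, not treemaker's source code.
    """
    newick = newick.strip()
    # A tree statement in a NEXUS file must end with a semicolon.
    if not newick.endswith(";"):
        newick += ";"
    lines = [
        "#NEXUS",
        "",
        "begin trees;",
        f"    tree {tree_name} = {newick}",
        "end;",
        "",
    ]
    return "\n".join(lines)


print(newick_to_nexus("(LangD,(LangA,LangB),LangC);"))
```

The semicolon check guards against an unterminated tree statement, which many Nexus readers reject.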
To write to file:
```shell
$ treemaker classification.txt
(LangD,(LangA,LangB),LangC);

$ treemaker classification.txt -o classification.nex
```
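If you want to pre-process a classification file yourself before handing the records to the library, a reader along the following lines may help. It is a sketch under stated assumptions: `read_classifications` is a hypothetical helper, and it assumes that the taxon name is the first whitespace-delimited token on each line (so taxon names cannot contain spaces) and that blank lines can be skipped. Check treemaker's own documentation for the exact file format it accepts.

```python
import io


def read_classifications(handle):
    """Parse 'taxon classification' lines into (taxon, [groups]) pairs.

    Hypothetical helper for illustration; not part of treemaker's API.
    """
    records = []
    for line_number, line in enumerate(handle, start=1):
        line = line.strip()
        if not line:
            continue  # assumption: blank lines carry no information
        # Split once on whitespace: taxon first, classification after it.
        parts = line.split(None, 1)
        if len(parts) != 2:
            raise ValueError(
                f"line {line_number}: expected 'taxon classification', got {line!r}"
            )
        taxon, classification = parts
        groups = [g.strip() for g in classification.split(",") if g.strip()]
        records.append((taxon, groups))
    return records


example = io.StringIO(
    "LangA Indo-European, Germanic\n"
    "LangB Indo-European, Germanic\n"
    "LangC Indo-European, Romance\n"
    "LangD Indo-European, Anatolian\n"
)
for taxon, groups in read_classifications(example):
    print(taxon, groups)
```

Raising an error on a line without a classification, rather than silently skipping it, makes typos in hand-entered taxonomies easier to spot.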
Usage: Library:
```python
from treemaker import TreeMaker
```
Generate a tree manually:
```python
from treemaker import TreeMaker

t = TreeMaker()
# add(taxon, classification)
t.add('A1', 'family a, subgroup 1')
t.add('A2', 'family a, subgroup 2')
t.add('B1a', 'family b, subgroup 1')
t.add('B1b', 'family b, subgroup 1')
t.add('B2', 'family b, subgroup 2')

print(t.write())
```
Add from a list:
```python
from treemaker import TreeMaker

# each entry is a (taxon, classification) pair
taxa = [
    ('A1', 'family a, subgroup 1'),
    ('A2', 'family a, subgroup 2'),
    ('B1a', 'family b, subgroup 1'),
    ('B1b', 'family b, subgroup 1'),
    ('B2', 'family b, subgroup 2'),
]

t = TreeMaker()
t.add_from(taxa)

print(t.write())
```
API Documentation:
The API is documented at https://treemaker.readthedocs.io/.
Running treemaker's tests:
To run treemaker's tests simply run:
```shell
make test
# or
python setup.py test
# or
python treemaker/test_treemaker.py
```
Version History:
- v1.4: fix bug with no terminating semicolon in nexus file output.
- v1.3: add nodelabels support, add some rudimentary input checking.
Support:
For questions on how to use or update this, feel free to open an issue. I'll get to it as soon as I can.
Acknowledgements:
Thank you to Richard Littauer, Mitsuhiro Nakamura, and Dillon Niederhut.
References:
- Bromham, Lindell, Xia Hua, Marcel Cardillo, Hilde Schneemann, & Simon J. Greenhill. 2018. “Parasites and Politics: Why Cross-Cultural Studies Must Control for Relatedness, Proximity and Covariation.” Royal Society Open Science 5 (8). https://doi.org/10.1098/rsos.181100.
- Maddison, D. R., D. L. Swofford, & Wayne P. Maddison. 1997. “Nexus: An Extensible File Format for Systematic Information.” Systematic Biology 46 (4): 590–621. https://doi.org/10.1093/sysbio/46.4.590.
- Wurm, S. A. 1975. “The East Papuan Phylum in General.” In New Guinea Area Languages and Language Study: Papuan Languages and the New Guinea Linguistic Scene, edited by S. A. Wurm. Canberra: Pacific Linguistics. https://doi.org/10.15144/PL-C38.
Owner
- Name: Simon J Greenhill
- Login: SimonGreenhill
- Kind: user
- Location: Jena, Canberra, Auckland
- Company: @shh-dlce @eva-dlce
- Website: http://simon.net.nz
- Twitter: SimonGreenhill
- Repositories: 11
- Profile: https://github.com/SimonGreenhill
I study how languages and cultures evolve. Scientist at the University of Auckland, and the Max Planck Institute for Evolutionary Anthropology
JOSS Publication
treemaker: A Python tool for constructing a Newick formatted tree from a set of classifications.
Authors
Tags
phylogenetics newick tree
CodeMeta (codemeta.json)
{
"@context": "https://raw.githubusercontent.com/codemeta/codemeta/master/codemeta.jsonld",
"@type": "Code",
"author": [
{
"@id": "http://orcid.org/0000-0001-7832-6156",
"@type": "Person",
"email": "simon@simon.net.nz",
"name": "Simon J. Greenhill",
"affiliation": "Max Planck Institute for the Science of Human History & ARC Centre of Excellence for the Dynamics of Language, Australian National University"
}
],
"identifier": "",
"codeRepository": "https://github.com/SimonGreenhill/treemaker",
"datePublished": "2018-09-05",
"dateModified": "2018-11-08",
"dateCreated": "2018-09-05",
"description": "A Python library for creating a Newick formatted tree from a set of classifications.",
"keywords": "phylogenetics,newick",
"license": "BSD",
"title": "treemaker",
"version": "1.2"
}
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1
Committers
Last synced: 5 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| SimonGreenhill | s****n@s****z | 61 |
| pyup-bot | g****t@p****o | 10 |
Issues and Pull Requests
Last synced: 4 months ago
All Time
- Total issues: 10
- Total pull requests: 93
- Average time to close issues: about 3 hours
- Average time to close pull requests: 22 days
- Total issue authors: 3
- Total pull request authors: 2
- Average comments per issue: 2.3
- Average comments per pull request: 1.61
- Merged pull requests: 10
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 10
- Average time to close issues: N/A
- Average time to close pull requests: 10 days
- Issue authors: 0
- Pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.9
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- RichardLitt (7)
- deniederhut (2)
- pyup-bot (1)
Pull Request Authors
- pyup-bot (104)
- mnacamura (1)
Packages
- Total packages: 1
- Total downloads: pypi 16 last-month
- Total dependent packages: 0
- Total dependent repositories: 3
- Total versions: 6
- Total maintainers: 1
pypi.org: treemaker
A Python tool for generating a Newick formatted tree from a list of classifications
- Homepage: https://github.com/SimonGreenhill/treemaker
- Documentation: https://treemaker.readthedocs.io/
- License: BSD
- Latest release: 1.0.3 (published about 7 years ago)
Dependencies
- sphinx ==3.3.1
- sphinxcontrib-napoleon ==0.7
