hstrat
hstrat: a Python Package for phylogenetic inference on distributed digital evolution populations - Published in JOSS (2022)
Science Score: 93.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ✓ DOI references: found 4 DOI reference(s) in README and JOSS metadata
- ✓ Academic publication links: links to joss.theoj.org, zenodo.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ✓ JOSS paper metadata: published in Journal of Open Source Software
Repository
hstrat enables phylogenetic inference on distributed digital evolution populations
Basic Info
- Host: GitHub
- Owner: mmore500
- License: other
- Language: Python
- Default Branch: master
- Size: 131 MB
Statistics
- Stars: 4
- Watchers: 2
- Forks: 2
- Open Issues: 35
- Releases: 3
Metadata Files
README.md

hstrat enables phylogenetic inference on distributed digital evolution populations
- Free software: MIT license
- Documentation: https://hstrat.readthedocs.io
- Repository: https://github.com/mmore500/hstrat
Install
python3 -m pip install hstrat
A containerized release of hstrat is available via ghcr.io:
```bash
singularity exec docker://ghcr.io/mmore500/hstrat:v1.20.13 python3 -m hstrat --help
```
Features
hstrat enables robust, efficient extraction of evolutionary history from evolutionary simulations where centralized, direct phylogenetic tracking is not feasible. This is typically the case in large-scale, decentralized parallel/distributed evolutionary simulations, where agents' evolutionary lineages migrate among many cooperating processors over the course of simulation.
hstrat can
- accurately estimate time since MRCA among two or several digital agents, even for uneven branch lengths
- reconstruct phylogenetic trees for entire populations of evolving digital agents
- serialize genome annotations to/from text and binary formats
- provide low-footprint genome annotations (e.g., reasonably as low as 64 bits each)
- be directly configured to satisfy memory use limits and/or inference accuracy requirements
hstrat operates just as well in single-processor simulation, but direct phylogenetic tracking using a tool like phylotrackpy should usually be preferred in such cases due to its capability for perfect record-keeping given centralized global simulation observability.
Example Usage
This code briefly demonstrates
- initialization of a population of `HereditaryStratigraphicColumn` objects,
- generation-to-generation transmission of `HereditaryStratigraphicColumn` objects with simple synchronous turnover, and then
- reconstruction of phylogenetic history from the final population of `HereditaryStratigraphicColumn` objects.
```python3
from random import choice as rchoice

import alifedata_phyloinformatics_convert as apc
from hstrat import hstrat; print(f"{hstrat.__version__=}")  # when last ran?
from hstrat._auxiliary_lib import seed_random; seed_random(1)  # reproducibility

# initialize a small population of hstrat instrumentation
# (in full simulations, each column would be attached to an individual genome)
population = [hstrat.HereditaryStratigraphicColumn() for __ in range(5)]

# evolve population for 40 generations under drift
for generation in range(40):
    population = [rchoice(population).CloneDescendant() for _ in population]

# reconstruct estimate of phylogenetic history
alifestd_df = hstrat.build_tree(population, version_pin=hstrat.__version__)
tree_ascii = apc.RosettaTree(alifestd_df).as_dendropy.as_ascii_plot(width=20)
print(tree_ascii)
```
hstrat.__version__='1.8.8'
/--- 1
/---+
/--+ \--- 3
| |
/---+ \------- 2
| |
+--+ \---------- 0
|
\-------------- 4
In actual usage, each hstrat column would be bundled with underlying genetic material of interest in the simulation: entire genomes or, in systems with sexual recombination, individual genes. The hstrat columns are designed to operate as a neutral genetic annotation, enhancing observability of the simulation but not affecting its outcome.
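As a rough sketch of what such bundling might look like, the snippet below pairs a heritable payload with an annotation object and extends the annotation whenever offspring are created. `AnnotatedGenome`, `create_offspring`, `fitness`, and `StandInColumn` are hypothetical names introduced only for this illustration. The stand-in keeps the snippet runnable without hstrat installed; the assumption is that an `hstrat.HereditaryStratigraphicColumn()` could be passed as `annotation` instead, since the genome relies only on the `CloneDescendant()` method used in the example above.

```python
import random
from dataclasses import dataclass


class StandInColumn:
    """Hypothetical stand-in exposing the one method the genome relies on."""

    def __init__(self, depth=0):
        self.depth = depth  # number of generations recorded so far

    def CloneDescendant(self):
        return StandInColumn(self.depth + 1)


@dataclass
class AnnotatedGenome:
    """Hypothetical genome pairing heritable content with a neutral annotation."""

    content: float
    annotation: object  # assumption: e.g. an hstrat.HereditaryStratigraphicColumn

    def create_offspring(self, rng):
        return AnnotatedGenome(
            content=self.content + rng.gauss(0.0, 0.1),  # mutate the payload
            annotation=self.annotation.CloneDescendant(),  # extend the record
        )

    def fitness(self):
        # selection reads only the payload, so the annotation stays neutral
        return -abs(self.content)


rng = random.Random(1)
population = [
    AnnotatedGenome(rng.uniform(-1.0, 1.0), StandInColumn()) for _ in range(5)
]
for _ in range(40):
    # binary tournament selection followed by synchronous turnover
    parents = [
        max(rng.sample(population, 2), key=AnnotatedGenome.fitness)
        for _ in population
    ]
    population = [parent.create_offspring(rng) for parent in parents]
```

Because `fitness` never inspects `annotation`, removing the annotation would leave the evolutionary trajectory unchanged for a given random seed, which is the sense in which the annotation is neutral.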
How it Works
In order to enable phylogenetic inference over fully-distributed evolutionary simulation, hereditary stratigraphy adopts a paradigm akin to phylogenetic work in natural history/biology. In these fields, phylogenetic history is inferred through comparisons among genetic material of extant organisms, with phylogenetic relatedness established, in broad terms, through the extent of genetic similarity between organisms. Phylogenetic tracking through hstrat, similarly, is achieved through analysis of similarity/dissimilarity among genetic material sampled over populations of interest.
Rather than random mutation as with natural genetic material, however, genetic material used by hstrat is structured through hereditary stratigraphy. This methodology, described fully in our documentation, provides strong guarantees on phylogenetic inferential power, minimizes memory footprint, and allows efficient reconstruction procedures.
See the documentation at https://hstrat.readthedocs.io for more detail on the underlying hereditary stratigraphy methodology.
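To make the comparison idea concrete, the following is a deliberately simplified sketch and not hstrat's implementation. It assumes each generation appends one random 64-bit fingerprint to an inherited record and that nothing is ever discarded, so two records agree exactly up to the generation of their most recent common ancestor. `ToyColumn`, `clone_descendant`, and `last_common_rank` are hypothetical names used only here.

```python
import random


class ToyColumn:
    """Toy annotation: one random fingerprint appended per generation."""

    def __init__(self, strata=None, rng=None):
        self.rng = rng or random.Random()
        if strata is None:
            # the founding fingerprint is deposited at generation (rank) 0
            strata = [self.rng.getrandbits(64)]
        self.strata = list(strata)

    def clone_descendant(self):
        """Copy the parent's record, then append a fresh fingerprint."""
        return ToyColumn(self.strata + [self.rng.getrandbits(64)], self.rng)


def last_common_rank(a, b):
    """Return the last generation at which both records agree, or None."""
    rank = None
    for i, (x, y) in enumerate(zip(a.strata, b.strata)):
        if x != y:
            break  # records diverge here; later entries are independent
        rank = i
    return rank


rng = random.Random(1)
ancestor = ToyColumn(rng=rng)
for _ in range(10):  # shared history: generations 1 through 10
    ancestor = ancestor.clone_descendant()

left, right = ancestor.clone_descendant(), ancestor.clone_descendant()
for _ in range(5):  # independent history after the split
    left, right = left.clone_descendant(), right.clone_descendant()

print(last_common_rank(left, right))
```

Barring an astronomically unlikely 64-bit collision, this prints `10`, the generation of the shared ancestor. Keeping every fingerprint costs memory proportional to the number of generations elapsed, which is why a practical annotation has to prune its record.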
Getting Started
Refer to our documentation for a quickstart guide and an annotated end-to-end usage example.
The examples/ folder provides extensive usage examples, including
- incorporation of hstrat annotations into a custom genome class,
- automatic stratum retention policy parameterization,
- pairwise and population-level phylogenetic inference, and
- phylogenetic tree reconstruction.
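For intuition about what a stratum retention policy trades off, here is a toy sketch; it is not one of hstrat's policy implementations. It assumes a record that keeps only generations at a fixed spacing plus the newest one, and that both compared records have the same length. `retained_ranks`, `mrca_rank_bounds`, and `resolution` are hypothetical names for illustration.

```python
def retained_ranks(num_deposited, resolution):
    """Generations kept after `num_deposited` deposits at a fixed spacing."""
    newest = num_deposited - 1
    return [
        rank
        for rank in range(num_deposited)
        if rank % resolution == 0 or rank == newest
    ]


def mrca_rank_bounds(true_mrca_rank, num_deposited, resolution):
    """Bounds (inclusive lower, exclusive upper) recoverable for the MRCA.

    Assumes two equal-length records that share ancestry through
    `true_mrca_rank`, which must precede the newest generation.
    """
    kept = retained_ranks(num_deposited, resolution)
    # last retained generation where the two records still match
    lower = max(rank for rank in kept if rank <= true_mrca_rank)
    # first retained generation where the two records differ
    upper = min(rank for rank in kept if rank > true_mrca_rank)
    return lower, upper


for resolution in (1, 10, 100):
    kept = retained_ranks(1000, resolution)
    print(resolution, len(kept), mrca_rank_bounds(437, 1000, resolution))
```

In this toy setting, coarser spacing shrinks the record from 1000 entries to 101 and then 11, while the recoverable window around generation 437 widens from `(437, 438)` to `(430, 440)` and then `(400, 500)`.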
The documentation also explains how the hereditary stratigraphy methodology implemented by hstrat works "under the hood," covers project-specific hstrat configuration, and provides a full API listing for the hstrat package.
Citing
If hstrat software or hereditary stratigraphy methodology contributes to a scholarly work, please cite it according to the references provided here. We would love to list your project using hstrat in our documentation; see more here.
Credits
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

Owner
- Name: Matthew Andres Moreno
- Login: mmore500
- Kind: user
- Location: East Lansing, MI
- Company: @devosoft
- Website: mmore500.github.io
- Twitter: MorenoMathewA
- Repositories: 43
- Profile: https://github.com/mmore500
doctoral student, Computer Science and Engineering at Michigan State University
JOSS Publication
hstrat: a Python Package for phylogenetic inference on distributed digital evolution populations
Authors
Tags
artificial life, digital evolution, distributed computing
GitHub Events
Total
- Create event: 134
- Release event: 1
- Issues event: 43
- Delete event: 88
- Issue comment event: 111
- Push event: 1,039
- Pull request review event: 166
- Pull request review comment event: 190
- Pull request event: 148
- Fork event: 1
Last Year
- Create event: 134
- Release event: 1
- Issues event: 43
- Delete event: 88
- Issue comment event: 111
- Push event: 1,039
- Pull request review event: 166
- Pull request review comment event: 190
- Pull request event: 148
- Fork event: 1
Committers
Last synced: 7 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Matthew Andres Moreno | m****g@g****m | 2,729 |
| vivaansinghvi07 | s****n@g****m | 169 |
| vivaansinghvi07 | v****8@g****m | 142 |
| Santiago Rodriguez Papa | r****0 | 113 |
| Connor Yang | c****5@g****m | 9 |
| Juan Julián Merelo Guervós | j****o@g****m | 4 |
| joeymsu | j****4@g****m | 1 |
| Emily Dolson | e****n@g****m | 1 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 86
- Total pull requests: 147
- Average time to close issues: 2 months
- Average time to close pull requests: 6 days
- Total issue authors: 6
- Total pull request authors: 5
- Average comments per issue: 0.43
- Average comments per pull request: 1.13
- Merged pull requests: 133
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 17
- Pull requests: 112
- Average time to close issues: 26 days
- Average time to close pull requests: 3 days
- Issue authors: 2
- Pull request authors: 2
- Average comments per issue: 0.35
- Average comments per pull request: 1.1
- Merged pull requests: 99
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- mmore500 (69)
- osorensen (11)
- GeekLogan (4)
- JJ (3)
- kgd-al (2)
- vivaansinghvi07 (2)
- joeymsu (1)
Pull Request Authors
- mmore500 (161)
- vivaansinghvi07 (24)
- JJ (4)
- rodsan0 (2)
- emilydolson (1)
Packages
- Total packages: 1
- Total downloads:
  - pypi: 6,419 last month
- Total dependent packages: 1
- Total dependent repositories: 1
- Total versions: 84
- Total maintainers: 1
pypi.org: hstrat
hstrat enables phylogenetic inference on distributed digital evolution populations
- Documentation: https://hstrat.readthedocs.io/
- License: MIT license
- Latest release: 1.20.13 (published 9 months ago)
Rankings
Maintainers (1)
Dependencies
- Sphinx ==4.4.0 development
- anytree ==2.8.0 development
- bump2version ==0.5.11 development
- coverage ==4.5.4 development
- flake8 ==3.7.8 development
- gmpy ==1.17 development
- interval-search ==0.1.2 development
- iterify ==0.1.0 development
- iterpop ==0.3.4 development
- lru-dict ==1.1.7 development
- matplotlib ==3.1.2 development
- mmh3 ==3.0.0 development
- more-itertools ==8.13.0 development
- mpmath ==1.1.0 development
- nose ==1.3.7 development
- opytional ==0.1.0 development
- pip ==19.2.3 development
- pytest ==7.1.2 development
- pytest-xdist ==2.5.0 development
- safe-assert ==0.2.0 development
- scipy ==1.5.4 development
- tox ==3.24.0 development
- twine ==1.14.0 development
- watchdog ==0.9.0 development
- wheel ==0.33.6 development
- alabaster ==0.7.12 development
- anytree ==2.8.0 development
- argh ==0.26.2 development
- attrs ==22.1.0 development
- babel ==2.9.1 development
- bleach ==4.1.0 development
- bump2version ==0.5.11 development
- certifi ==2021.10.8 development
- charset-normalizer ==2.0.12 development
- coverage ==4.5.4 development
- cycler ==0.11.0 development
- distlib ==0.3.4 development
- docutils ==0.17.1 development
- entrypoints ==0.3 development
- execnet ==1.9.0 development
- filelock ==3.6.0 development
- flake8 ==3.7.8 development
- gmpy ==1.17 development
- idna ==3.3 development
- imagesize ==1.3.0 development
- importlib-metadata ==4.11.1 development
- iniconfig ==1.1.1 development
- interval-search ==0.1.2 development
- iterify ==0.1.0 development
- iterpop ==0.3.4 development
- jinja2 ==3.0.3 development
- kiwisolver ==1.3.2 development
- lru-dict ==1.1.7 development
- markupsafe ==2.1.0 development
- matplotlib ==3.1.2 development
- mccabe ==0.6.1 development
- mmh3 ==3.0.0 development
- more-itertools ==8.13.0 development
- mpmath ==1.1.0 development
- nose ==1.3.7 development
- numpy ==1.23.1 development
- opytional ==0.1.0 development
- packaging ==21.3 development
- pathtools ==0.1.2 development
- pkginfo ==1.8.2 development
- platformdirs ==2.5.1 development
- pluggy ==0.13.1 development
- py ==1.11.0 development
- pycodestyle ==2.5.0 development
- pyflakes ==2.1.1 development
- pygments ==2.11.2 development
- pyparsing ==3.0.7 development
- pytest ==7.1.2 development
- pytest-forked ==1.4.0 development
- pytest-xdist ==2.5.0 development
- python-dateutil ==2.8.2 development
- pytz ==2021.3 development
- pyyaml ==6.0 development
- readme-renderer ==32.0 development
- requests ==2.27.1 development
- requests-toolbelt ==0.9.1 development
- safe-assert ==0.2.0 development
- scipy ==1.5.4 development
- six ==1.16.0 development
- snowballstemmer ==2.2.0 development
- sphinx ==4.4.0 development
- sphinxcontrib-applehelp ==1.0.2 development
- sphinxcontrib-devhelp ==1.0.2 development
- sphinxcontrib-htmlhelp ==2.0.0 development
- sphinxcontrib-jsmath ==1.0.1 development
- sphinxcontrib-qthelp ==1.0.3 development
- sphinxcontrib-serializinghtml ==1.1.5 development
- toml ==0.10.2 development
- tomli ==2.0.1 development
- tox ==3.24.0 development
- tqdm ==4.62.3 development
- twine ==1.14.0 development
- urllib3 ==1.26.8 development
- virtualenv ==20.13.1 development
- watchdog ==0.9.0 development
- webencodings ==0.5.1 development
- wheel ==0.33.6 development
- zipp ==3.7.0 development
