genesis

A library for working with phylogenetic and population genetic data.

https://github.com/lczech/genesis

Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 12 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.4%) to scientific vocabulary

Keywords

c-plus-plus evolutionary-placement phylogenetic-data phylogenetic-placements phylogenetic-trees phylogenetics placement pool-sequencing population-genetics
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: lczech
  • License: gpl-3.0
  • Language: C++
  • Default Branch: master
  • Homepage: http://genesis-lib.org/
  • Size: 17.4 MB
Statistics
  • Stars: 65
  • Watchers: 8
  • Forks: 12
  • Open Issues: 2
  • Releases: 43
Created over 11 years ago · Last pushed 6 months ago
Metadata Files
Readme License Citation

README.md

genesis

A library for working with phylogenetic and population genetic data.

Features

Genesis is a C++ library for working with phylogenetic and population genetic data:

  • Trees
    • Read, annotate and write trees in various formats.
    • Versatile tree data structure that can store any data on the edges and nodes.
    • Easily iterate trees with different policies (e.g., postorder, preorder).
    • Directly draw trees with colored branches to SVG files.
  • Placements
    • Read, manipulate and write jplace files from phylogenetic placement analyses.
    • Manipulate placement data: extract, filter, merge, and much more.
    • Calculate distance measures (e.g., KR distance, EDPL).
    • Run analyses like k-means Clustering, Squash Clustering, Edge PCA.
    • Visualize aspects like read abundances or correlation with meta-data on the branches of the tree.
  • Populations
    • Read and work with genome mapping and variant formats such as sam/bam/cram, pileup, sync, and vcf, as well as auxiliary formats such as gff/gtf, bim/map, and bed.
    • Iterate positions in a genome, individually or in different types of windows.
    • Compute statistics such as Tajima's D and F_ST for pool sequencing data.
  • Sequences and Taxonomies
    • Read, filter, manipulate and write sequences in fasta, fastq, and phylip format.
    • Calculate consensus sequences with different methods.
    • Work with taxonomic paths and build a taxonomic hierarchy.
  • Utilities
    • Math tools (matrices, histograms, statistics functions, etc.).
    • Color support (color lists, gradients, etc., for making colored trees).
    • Various supporting file formats (bmp, csv, json, xml, and more).
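
To make the traversal policies mentioned under Trees concrete, here is a minimal, self-contained sketch of preorder and postorder iteration. It does not use the Genesis API: the `Node` struct and the functions `preorder`, `postorder`, `preorder_names`, `postorder_names`, and `example_tree` are hypothetical names introduced only for this illustration. See the API reference for the library's actual tree class and iterators.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical minimal tree node for illustration; NOT the Genesis tree class.
struct Node {
    std::string name;
    std::vector<Node> children;
};

// Preorder: visit a node before any of its children.
void preorder(const Node& node, std::vector<std::string>& out) {
    out.push_back(node.name);
    for (const Node& child : node.children) {
        preorder(child, out);
    }
}

// Postorder: visit all children before the node itself.
void postorder(const Node& node, std::vector<std::string>& out) {
    for (const Node& child : node.children) {
        postorder(child, out);
    }
    out.push_back(node.name);
}

// Convenience wrappers returning the visiting sequence as a list of names.
std::vector<std::string> preorder_names(const Node& root) {
    std::vector<std::string> out;
    preorder(root, out);
    assert(!out.empty()); // the root itself is always visited
    return out;
}

std::vector<std::string> postorder_names(const Node& root) {
    std::vector<std::string> out;
    postorder(root, out);
    assert(!out.empty()); // the root itself is always visited
    return out;
}

// Builds the example tree ((A,B)X,C)R.
Node example_tree() {
    return Node{"R", {
        Node{"X", { Node{"A", {}}, Node{"B", {}} }},
        Node{"C", {}}
    }};
}
```

For the example tree ((A,B)X,C)R, `preorder_names` yields R, X, A, B, C, while `postorder_names` yields A, B, X, C, R. Postorder is the natural choice whenever a value at an inner node depends on values already computed for its subtrees.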

This is just an overview of the more prominent features. See the API reference for more.

Genesis is a library that is intended for researchers and developers who want to build their own tools and methods, or run their own custom analyses. If you are simply interested in analyzing your data with our methods, have a look at our command line tool Gappa for many common phylogenetic placement analyses.

Setup and Getting Started

For download and build instructions, see Setup.

All the information for getting started with genesis is in the documentation. It contains a user manual with setup instructions and tutorials, as well as the full API reference.

For bug reports and feature requests for genesis, please open an issue on our GitHub page.

For user support of the phylogenetic placement parts of the library, please see our Phylogenetic Placement Google Group. It is intended for discussions about phylogenetic placement, and for user support for our software tools, such as EPA-ng and Gappa.

Showcases

A focus of the library is working with phylogenetic placements. The following figure summarizes the placement positions of 7.5 million short reads on a reference tree with 190 taxa. The color code indicates the number of reads placed on each branch.

[Figure: Phylogenetic tree with coloured branches.]

This and other methods are presented in our manuscripts

Methods for Inference of Automatic Reference Phylogenies and Multilevel Phylogenetic Placement.
Lucas Czech, Pierre Barbera, and Alexandros Stamatakis.
Bioinformatics, 2018. https://doi.org/10.1093/bioinformatics/bty767

and

Scalable Methods for Analyzing and Visualizing Phylogenetic Placement of Metagenomic Samples.
Lucas Czech and Alexandros Stamatakis.
PLOS One, 2019. https://doi.org/10.1371/journal.pone.0217050

See there for more on what Genesis can do.

Citation

When using Genesis, please cite

Genesis and Gappa: processing, analyzing and visualizing phylogenetic (placement) data.
Lucas Czech, Pierre Barbera, and Alexandros Stamatakis.
Bioinformatics, 2020. https://doi.org/10.1093/bioinformatics/btaa070

Also, see Gappa for our command line tool to run your own analyses.

Owner

  • Name: Lucas Czech
  • Login: lczech
  • Kind: user
  • Location: Stanford, USA
  • Company: Carnegie Institution for Science

Postdoc in bioinformatics :seedling: and computer scientist :octocat: working on inter-disciplinary ways to save the planet :earth_africa:

Citation (CITATION.cff)

cff-version: 1.2.0
authors:
- family-names: "Czech"
  given-names: "Lucas"
  orcid: "https://orcid.org/0000-0002-1340-9644"
- family-names: "Barbera"
  given-names: "Pierre"
  orcid: "https://orcid.org/0000-0002-3437-150X"
- family-names: "Stamatakis"
  given-names: "Alexandros"
  orcid: "https://orcid.org/0000-0003-0353-0691"
title: "Genesis and Gappa: processing, analyzing and visualizing phylogenetic (placement) data"
doi: 10.1093/bioinformatics/btaa070
url: "http://github.com/lczech/genesis"
preferred-citation:
  type: article
  authors:
  - family-names: "Czech"
    given-names: "Lucas"
    orcid: "https://orcid.org/0000-0002-1340-9644"
  - family-names: "Barbera"
    given-names: "Pierre"
    orcid: "https://orcid.org/0000-0002-3437-150X"
  - family-names: "Stamatakis"
    given-names: "Alexandros"
    orcid: "https://orcid.org/0000-0003-0353-0691"
  doi: "10.1093/bioinformatics/btaa070"
  journal: "Bioinformatics"
  start: 3263 # First page number
  end: 3265 # Last page number
  title: "Genesis and Gappa: processing, analyzing and visualizing phylogenetic (placement) data"
  volume: 36
  issue: 10
  year: 2020

GitHub Events

Total
  • Release event: 2
  • Watch event: 7
  • Push event: 80
  • Create event: 1
Last Year
  • Release event: 2
  • Watch event: 7
  • Push event: 80
  • Create event: 1

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 3,693
  • Total Committers: 5
  • Avg Commits per committer: 738.6
  • Development Distribution Score (DDS): 0.004
Past Year
  • Commits: 361
  • Committers: 1
  • Avg Commits per committer: 361.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name             Email            Commits
Lucas Czech      l****h@h****g    3,679
Pierre Barbera   p****s@g****m    8
computations     b****h@g****m    3
Frédéric Mahé    f****e@c****r    2
Bruno Bzeznik    B****k@i****r    1
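
The summary figures above can be reproduced from the per-committer counts. The sketch below assumes that the Development Distribution Score is defined as one minus the top committer's share of all commits. This page does not state that definition, but it is consistent with the reported values of 0.004 (all time) and 0.0 (past year). The function names `average_commits` and `development_distribution_score` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Mean number of commits per committer.
double average_commits(const std::vector<long>& commits) {
    assert(!commits.empty());
    const long total = std::accumulate(commits.begin(), commits.end(), 0L);
    return static_cast<double>(total) / static_cast<double>(commits.size());
}

// ASSUMED definition: DDS = 1 - (commits of the top committer / total commits).
// A value near 0 means that a single person authored almost all commits.
double development_distribution_score(const std::vector<long>& commits) {
    assert(!commits.empty());
    const long total = std::accumulate(commits.begin(), commits.end(), 0L);
    assert(total > 0);
    const long top = *std::max_element(commits.begin(), commits.end());
    return 1.0 - static_cast<double>(top) / static_cast<double>(total);
}
```

With the all-time counts {3679, 8, 3, 2, 1}, the total is 3,693 commits, the average is 3693 / 5 = 738.6, and the score is 1 - 3679 / 3693, which rounds to 0.004. With a single committer, as in the past year, the score is exactly 0.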

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 5
  • Total pull requests: 9
  • Average time to close issues: 18 days
  • Average time to close pull requests: 21 days
  • Total issue authors: 4
  • Total pull request authors: 5
  • Average comments per issue: 4.2
  • Average comments per pull request: 1.11
  • Merged pull requests: 9
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: about 2 hours
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • daviddao (2)
  • vinitamehlawat (1)
  • dougwyu (1)
  • pierrebarbera (1)
Pull Request Authors
  • frederic-mahe (3)
  • bzizou (2)
  • computations (2)
  • pierrebarbera (1)
  • lczech (1)
Top Labels
Issue Labels
enhancement (3)

Dependencies

.github/workflows/ci.yaml actions
  • actions/checkout v3.1.0 composite
  • aminya/setup-cpp v0.22.0 composite