https://github.com/biocore/greengenes2

Processing support for Greengenes2

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: nature.com
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (7.3%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: biocore
  • Language: Python
  • Default Branch: main
  • Size: 2.91 MB
Statistics
  • Stars: 12
  • Watchers: 3
  • Forks: 2
  • Open Issues: 6
  • Releases: 1
Created about 4 years ago · Last pushed almost 3 years ago
Metadata Files

README.md

Background

The Greengenes2 phylogeny is based on whole genome information from the Web of Life, and revised with high-quality full-length 16S from the Living Tree Project and full-length 16S extracted from bacterial operons using uDance. A seed taxonomy is derived using the mappings from the Web of Life to GTDB. This taxonomy is then augmented using information from the Living Tree Project when possible. The augmented taxonomy is decorated onto the backbone using tax2tree.

Using this decorated backbone, all public and private 16S V4 ASVs from Qiita (pulled from redbiom and representing hundreds of thousands of samples), as well as full-length mitochondrial and chloroplast 16S (sourced from SILVA), are then placed using DEPP. Fragments are resolved. The resulting tree contains more than 15,000,000 tips.

Fragment resolution can result in fragments being placed on the parent edge of a named node. This can occur if the node representing a clade, such as d__Archaea, does not represent sufficient diversity for the input fragments to place within it. As a result, prior to reading taxonomy off of the tree, each name from the backbone is evaluated for whether its edge to parent has a single placement or a multifurcation of placements. If it does, the name is “promoted”. The idea is that fragments placed on a named node’s edge to its parent are more like the named node than like a sibling.
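The name promotion described above can be sketched on a toy tree. This is a minimal illustration and not the repository's implementation: the `Node` class, the `is_placement` flag, and the `promote_names` function are hypothetical names, and the sketch assumes that a placement on a named node's parent edge shows up as an unnamed internal node whose other children contain only placed fragments.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """Toy tree node (hypothetical structure, for illustration only)."""
    taxon: Optional[str] = None      # decorated clade name, e.g. "d__Archaea"
    is_placement: bool = False       # True for a placed fragment tip
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = None

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


def only_placements(node: Node) -> bool:
    """True if every tip under `node` is a placed fragment."""
    if not node.children:
        return node.is_placement
    return all(only_placements(c) for c in node.children)


def promote_names(root: Node) -> None:
    """Move each clade name up over placements sitting on its parent edge."""
    named, stack = [], [root]
    while stack:
        n = stack.pop()
        if n.taxon is not None:
            named.append(n)
        stack.extend(n.children)

    for node in named:
        current = node
        while True:
            parent = current.parent
            # Stop at the root or at an ancestor that already carries a name.
            if parent is None or parent.taxon is not None:
                break
            siblings = [c for c in parent.children if c is not current]
            # Promote only if everything else on this edge is a placement.
            if not siblings or not all(only_placements(s) for s in siblings):
                break
            current = parent
        if current is not node:
            current.taxon, node.taxon = node.taxon, None


# Toy example: one fragment placed on the edge above d__Archaea.
root = Node()
edge = root.add(Node())                       # unnamed node created by placement
archaea = edge.add(Node(taxon="d__Archaea"))
archaea.add(Node())                           # backbone tips
archaea.add(Node())
fragment = edge.add(Node(is_placement=True))  # fragment on the parent edge
bacteria = root.add(Node(taxon="d__Bacteria"))
bacteria.add(Node())

promote_names(root)
```

After promotion in this toy example, the d__Archaea name sits on the unnamed edge node, so the placed fragment falls under it when lineages are read off the tree, while d__Bacteria is left in place because its sibling subtree contains backbone tips.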

Following this name promotion, the full taxonomy is then read off the tree, providing lineage information for each fragment and sequence represented in the tree. This taxonomy information can be utilized within QIIME 2 by cross-referencing your input feature set against what’s present in the tree. By doing so, we can obtain taxonomy for both WGS data (if processed by Woltka) and 16S V4 ASVs. There is an important caveat though: right now, we can only classify based on sequences already represented by the tree, so unrepresented V4 ASVs will be unassigned.
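The cross-referencing idea, including the caveat about unrepresented features, can be sketched in plain Python. This is not the QIIME 2 interface; the function name `classify_by_membership`, the feature identifiers, and the "Unassigned" label are assumptions made for illustration only.

```python
from typing import Dict, Iterable


def classify_by_membership(feature_ids: Iterable[str],
                           tree_taxonomy: Dict[str, str],
                           unassigned: str = "Unassigned") -> Dict[str, str]:
    """Look up each feature in the taxonomy read off the tree.

    Features that are not represented in the tree cannot be classified
    this way, so they receive the `unassigned` label.
    """
    return {fid: tree_taxonomy.get(fid, unassigned) for fid in feature_ids}


# Hypothetical taxonomy read off the tree: feature ID -> lineage.
tree_taxonomy = {
    "asv_in_tree": "d__Bacteria",
    "genome_in_tree": "d__Archaea",
}

# Hypothetical input feature set (e.g. ASVs or genome identifiers).
features = ["asv_in_tree", "genome_in_tree", "asv_not_in_tree"]

assignments = classify_by_membership(features, tree_taxonomy)
```

Here `asv_not_in_tree` comes back as "Unassigned" because it has no entry in the tree-derived taxonomy, which mirrors the limitation described above.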

What is this repository?

This repository contains the methods and details for performing taxonomy decoration against a backbone and, following decoration, for establishing the release files.

Owner

  • Name: biocore
  • Login: biocore
  • Kind: organization
  • Location: Cyberspace

Collaboratively developed bioinformatics software.

GitHub Events

Total
  • Issues event: 4
  • Watch event: 1
  • Issue comment event: 6
Last Year
  • Issues event: 4
  • Watch event: 1
  • Issue comment event: 6

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 12
  • Total pull requests: 1
  • Average time to close issues: 8 months
  • Average time to close pull requests: less than a minute
  • Total issue authors: 8
  • Total pull request authors: 1
  • Average comments per issue: 1.08
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 3
  • Pull requests: 0
  • Average time to close issues: 5 minutes
  • Average time to close pull requests: N/A
  • Issue authors: 3
  • Pull request authors: 0
  • Average comments per issue: 1.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • wasade (4)
  • mestaki (1)
  • ygouin (1)
  • JCSzamosi (1)
  • julianzaugg (1)
  • VoronDM (1)
  • BrinthaVP (1)
  • danpal96 (1)
  • Oceazh (1)
Pull Request Authors
  • wasade (1)
Top Labels
Issue Labels
  • documentation (1)
Pull Request Labels