proportal-asv-annotation

A repository containing information needed to reproduce ecotype-level classification of ASVs using the ProPortal genomes.

https://github.com/jcmcnch/proportal-asv-annotation

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.1%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: jcmcnch
  • Language: Python
  • Default Branch: main
  • Size: 222 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 1
Created about 4 years ago · Last pushed about 1 year ago
Metadata Files
Readme Citation

README.md

ProPortal-ASV-Annotation

Important update (2023-10-17):

One of the 443 SSU rRNA sequences derived from the ProPortal database is most likely a heterotrophic contaminant (SAR11). I had assumed that every SSU rRNA sequence found in ProPortal must be cyanobacterial and did not anticipate contamination. As a result, the previous version of the BLAST database could incorrectly classify ASVs from heterotrophic bacteria in environmental samples as Cyanobacteria when they matched this contaminant. This was rare, but it did happen. Thank you to Lexi Jones-Kellett for pointing this out! The problem has now been fixed by classifying the ProPortal SSU rRNA sequences with VSEARCH (using the udb database file found here: https://osf.io/sdw7m) and filtering out any sequence that did not match Cyanobacteria (i.e. the one contaminant sequence).

TL;DR - Use the new BLAST database and you will be fine. If you used the old BLAST db, re-run your analysis.


This repository contains a BLAST database from ProPortal that allows a user to assign ecotype-level taxonomy to Synechococcales ASVs.

Here's how it works:

  • Your ASV queries are BLASTed against a database of full-length Prochlorococcus and Synechococcus 16S rRNA sequences extracted from ProPortal genome assemblies with barrnap. A hit is accepted only at 100% nucleotide identity and 100% coverage.
  • A Python script then parses the results to generate ecotype-level classifications. It also reports when a classification is ambiguous, i.e. when an ASV has perfect matches to more than one clade or genus.
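The parsing step can be illustrated with a minimal sketch. This is not the repository's `08-classify-ASVs-with-ProPortal.py`; it only shows the logic described above under stated assumptions. The function name `classify_hits`, the tuple layout of the hits, the `clade_of` lookup table, the genome identifiers, and the output label format are all hypothetical, and the clade labels are placeholders for illustration.

```python
from collections import defaultdict


def classify_hits(hits, clade_of):
    """Assign one label per query from perfect BLAST hits.

    hits: iterable of (query_id, subject_id, percent_identity, query_coverage).
    clade_of: hypothetical lookup, subject_id -> (genus, clade).
    """
    perfect = defaultdict(set)
    for query_id, subject_id, pident, qcov in hits:
        # Keep only hits with 100% identity over 100% of the query.
        if pident == 100.0 and qcov == 100.0:
            perfect[query_id].add(clade_of[subject_id])

    results = {}
    for query_id, labels in perfect.items():
        if len(labels) == 1:
            genus, clade = next(iter(labels))
            results[query_id] = f"{genus};{clade}"
        else:
            # Perfect matches to more than one clade or genus: flag as ambiguous.
            joined = "|".join(f"{g};{c}" for g, c in sorted(labels))
            results[query_id] = "ambiguous:" + joined
    return results


# Illustrative data only; identifiers and clade assignments are made up.
example_clades = {
    "genomeA": ("Prochlorococcus", "HLI"),
    "genomeB": ("Prochlorococcus", "HLII"),
    "genomeC": ("Synechococcus", "II"),
}
example_hits = [
    ("ASV1", "genomeA", 100.0, 100.0),
    ("ASV2", "genomeA", 100.0, 100.0),
    ("ASV2", "genomeB", 100.0, 100.0),
    ("ASV3", "genomeC", 99.5, 100.0),  # imperfect match, so it is dropped
]
example_results = classify_hits(example_hits, example_clades)
for asv, label in sorted(example_results.items()):
    print(asv, label, sep="\t")
```

In this sketch an ASV with no perfect match simply does not appear in the output; whether the actual script reports such ASVs differently is not stated here.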

Some example results are included here from the GP13 and GA03 BioGEOTRACES cruises.

Important notes:

  • This workflow is only set up for ASVs. It may give spurious results for OTUs, since OTU centroids may not represent the true biological sequence.

Setup:

  • Conda environments are specified in env/
  • Unless you want to recreate or modify the database, you can just put an input FASTA file in ASVs-2-classify and run the following scripts:

```
# will BLAST everything in the ASVs-2-classify folder
./scripts/07-blast-all-datasets.sh

# script requires the BLAST results file as its first argument (sys.argv[1])
./scripts/08-classify-ASVs-with-ProPortal.py blast-results/220123.Synechococcales.GA03.blastout.tsv > ProPortalReclassification/220124.GA03.results.tsv
```
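The contents of `07-blast-all-datasets.sh` are not reproduced in this README. As a rough illustration only, the 100% identity and 100% coverage requirement could be expressed with standard `blastn` options as in the sketch below. The helper name `build_blastn_command`, the input file name, the database prefix, and the chosen output columns are assumptions, not the repository's actual settings.

```python
import shlex


def build_blastn_command(query_fasta, db_prefix, out_tsv):
    """Build (but do not run) a blastn command that keeps only perfect matches."""
    return [
        "blastn",
        "-query", query_fasta,
        "-db", db_prefix,
        "-perc_identity", "100",   # require 100% nucleotide identity
        "-qcov_hsp_perc", "100",   # require the HSP to cover 100% of the query
        "-outfmt", "6 qseqid sseqid pident length qlen qcovhsp",
        "-out", out_tsv,
    ]


# Hypothetical paths, used only to show the shape of the command.
example_command = build_blastn_command(
    "ASVs-2-classify/my_asvs.fasta",
    "db/proportal_16S",
    "blast-results/my_asvs.blastout.tsv",
)
print(shlex.join(example_command))
```

The command is returned as a list so that it can be passed to `subprocess.run` without shell quoting issues; printing it with `shlex.join` gives a copy-pasteable shell line.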

Owner

  • Name: Jesse McNichol
  • Login: jcmcnch
  • Kind: user
  • Company: University of Southern California

I am currently a postdoctoral scholar in Jed Fuhrman's lab at USC. Please see my website/CV for more information about my interests and experience.

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "McNichol"
  given-names: "Jesse"
  orcid: "https://orcid.org/0000-0002-8870-7726"
title: "ProPortal-ASV-Annotation"
version: 1.0.0
doi: 10.5281/zenodo.14564142
date-released: 2024-12-27
url: "https://github.com/jcmcnch/ProPortal-ASV-Annotation"

GitHub Events

Total
  • Release event: 1
  • Push event: 1
  • Create event: 1
Last Year
  • Release event: 1
  • Push event: 1
  • Create event: 1