https://github.com/asadprodhan/average-nucleotide-identity-ani-analysis

Average Nucleotide Identity (ANI) analysis

Science Score: 39.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file (found codemeta.json file)
✓ .zenodo.json file (found .zenodo.json file)
✓ DOI references (found 4 DOI reference(s) in README)
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity (low similarity, 10.5%, to scientific vocabulary)

Keywords

ani pyani

Last synced: 9 months ago

Repository

Average Nucleotide Identity (ANI) analysis

Basic Info

Host: GitHub
Owner: asadprodhan
Default Branch: main
Homepage:
Size: 101 KB

Statistics

Stars: 3
Watchers: 1
Forks: 0
Open Issues: 1
Releases: 0

Topics

ani pyani

Created about 3 years ago · Last pushed 12 months ago

Metadata Files

Readme

Average Nucleotide Identity Analysis for Diagnosis

M. Asaduzzaman Prodhan^*

DPIRD Diagnostics and Laboratory Services, Department of Primary Industries and Regional Development

3 Baron-Hay Court, South Perth, WA 6151, Australia

^*Correspondence: Asad.Prodhan@dpird.wa.gov.au

Average Nucleotide Identity (ANI) analysis calculates the percentage of nucleotide identity among the supplied nucleotide sequences. It produces a square matrix of the calculated values. This matrix allows for pairwise comparisons among the nucleotide sequences and helps determine their similarities.

ANI methods

There are several methods to calculate the ANI:

ANIb (based on BLAST algorithm)
ANIm (based on MUMmer algorithm)
TETRA (based on tetranucleotide signature occurrences)

ANI tools

There are several tools available for ANI analysis (Figueras et al., 2014). For example:

JSpecies (http://www.imedea.uib.es/jspecies) (multiple genome analysis)

Gegenees (http://www.gegenees.org/documentation.html) (Due to changes in NCBI BLAST, Gegenees may not function with BLAST versions 2.10 and later. BLAST version 2.9 should work.)

EzGenome (http://www.ezbiocloud.net/ezgenome/ani) (online)

ANI calculator (http://enve-omics.ce.gatech.edu/ani/index) (online)

pyani, a Python 3 package (https://github.com/widdowquinn/pyani, https://pyani.readthedocs.io/en/latest/run_anim.html), based on Richter and Rosselló-Móra (2009). By default, pyani uses the ANIm method, which is based on the MUMmer algorithm.

How to run pyani

If you are working on an HPC cluster, load the required version of Python

module load cray-python/3.10.10

Create a conda environment with compatible versions of Python, matplotlib and pyani

conda create -n pyani_env python=3.10 "matplotlib<=3.7" "pyani>=0.2.12" -c bioconda

Activate the pyani environment

conda activate pyani_env

Alternatively, you can use my conda environment file for pyani
Download it HERE
Then create the environment from it as follows

conda env create -f pyani_env.yml

Check that pyani has been installed by running the following command

average_nucleotide_identity.py --help

The above command will show the flags/options of the pyani program

Install dos2unix to convert Windows line endings to Unix format

conda install conda-forge::dos2unix

Check that dos2unix has been installed by printing its version

dos2unix --version
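
The two installation checks above can be combined into one small shell helper. This is a minimal sketch and not part of pyani: `require_tool` is a hypothetical function name, and the usage note assumes the program names used in this README.

```shell
# require_tool: report whether a program is available on the PATH.
# Hypothetical helper for illustration; returns 0 if found, 1 otherwise.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}
```

For example, `require_tool average_nucleotide_identity.py && require_tool dos2unix` should print two `found:` lines when run inside the activated environment.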

Make two metadata files and name them as ‘classes.txt’ (Fig. 1) and ‘labels.txt’ (Fig. 2)

<img src="https://github.com/asadprodhan/Average-Nucleotide-Identity-ANI-analysis/blob/main/classes.PNG">

Figure 1. Classes

<img src="https://github.com/asadprodhan/Average-Nucleotide-Identity-ANI-analysis/blob/main/labels.PNG">

Figure 2. Labels

Note that, in both files, the first column holds the names of the nucleotide sequences and the second column holds their labels
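
A starting version of these two-column files can be generated from the sequence file names instead of being typed by hand. The sketch below rests on assumptions that are not stated in this README: that the columns are tab-separated, that the first column is the file name without its extension, and that the sequence files end in `.fasta`, `.fa` or `.fna`. `make_metadata` is a hypothetical helper name. It writes the file stem into both columns, so the second column still needs to be edited to the real labels or classes.

```shell
# make_metadata: write a two-column, tab-separated file with one line per sequence file.
# Usage: make_metadata GENOME_DIR OUTPUT_FILE
# Assumption: the first column is the file name without its extension.
make_metadata() {
    genome_dir=$1
    out_file=$2
    : > "$out_file"                      # start with an empty file
    for fasta in "$genome_dir"/*.fasta "$genome_dir"/*.fa "$genome_dir"/*.fna; do
        [ -e "$fasta" ] || continue      # skip patterns that matched nothing
        stem=$(basename "$fasta")
        stem=${stem%.*}                  # genomeA.fasta -> genomeA
        # Placeholder: the stem is reused as the label; edit the second column afterwards
        printf '%s\t%s\n' "$stem" "$stem" >> "$out_file"
    done
}
```

For example, `make_metadata genomes genomes/labels.txt` followed by `make_metadata genomes genomes/classes.txt` would create both files, ready for editing.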

Make a directory and name it ‘ANI’, for example
Within the ‘ANI’ directory, make another directory and name it ‘genomes’, for example
Keep all the nucleotide sequences, ‘classes.txt’ and ‘labels.txt’ in the ‘genomes’ directory
Check the line endings of the ‘classes.txt’ and ‘labels.txt’ files from within the ‘genomes’ directory as follows

file *.txt

If ‘classes.txt’ and ‘labels.txt’ have CRLF (Windows) format, then convert them into Unix format as follows:

dos2unix *.txt
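
The check and the conversion can also be scripted so that only files containing carriage returns are touched. This is a hedged sketch: `has_crlf` and `to_unix` are hypothetical helper names, and the `tr` fallback is an assumption for systems where dos2unix is not installed.

```shell
# has_crlf: succeed (exit 0) if the file contains at least one carriage return.
has_crlf() {
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

# to_unix: convert a file to Unix line endings in place.
# Prefers dos2unix; falls back to deleting carriage returns with tr.
to_unix() {
    if command -v dos2unix >/dev/null 2>&1; then
        dos2unix -q "$1"
    else
        tr -d '\r' < "$1" > "$1.tmp" && mv "$1.tmp" "$1"
    fi
}
```

For example, `for f in *.txt; do has_crlf "$f" && to_unix "$f"; done` converts only the files that need it.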

Run the following command from the ‘ANI’ directory

average_nucleotide_identity.py -i genomes -o output_ANI --labels genomes/labels.txt --classes genomes/classes.txt -g --gmethod seaborn --gformat pdf,png -v -l ba_ANI.log

Do not create the output directory beforehand; otherwise, the command will exit with an ‘overwriting’ error
Command reference: https://github.com/widdowquinn/pyani/issues/56
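
Because a pre-existing output directory stops the run, it can help to test for it before starting a long analysis. The following is a minimal sketch; `check_output_dir` is a hypothetical helper and not a pyani feature.

```shell
# check_output_dir: fail with a message if the given path already exists.
# Usage: check_output_dir OUTPUT_DIR
check_output_dir() {
    if [ -e "$1" ]; then
        echo "Output directory '$1' already exists; remove or rename it first." >&2
        return 1
    fi
}
```

For example, running `check_output_dir output_ANI` from the ‘ANI’ directory and launching the pyani command only if it succeeds avoids the ‘overwriting’ error.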

Results

The final output of the ANI analysis looks like this (Fig. 3):

<img src="https://github.com/asadprodhan/Average-Nucleotide-Identity-ANI-analysis/blob/main/ANImpercentageidentity.png">

Figure 3. Results
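
Besides the heatmap, the pairwise values can be read from the tabular output. The sketch below assumes that the output directory contains a tab-separated matrix named `ANIm_percentage_identity.tab`, with sequence names in the header row and the first column, rows and columns in the same order, and identities stored as fractions between 0 and 1. Check your own output before relying on any of these assumptions. `pairs_above` is a hypothetical helper, and 0.95 in the usage note is only an illustrative threshold.

```shell
# pairs_above: list each unordered pair whose identity is at or above a threshold.
# Usage: pairs_above MATRIX_FILE THRESHOLD
# Output columns: row name, column name, identity (tab-separated).
pairs_above() {
    awk -F '\t' -v min="$2" '
        NR == 1 { for (i = 2; i <= NF; i++) name[i] = $i; next }   # header row: column names
        {
            # Fields after the diagonal only, so each pair is reported once
            for (i = NR + 1; i <= NF; i++)
                if ($i + 0 >= min + 0) printf "%s\t%s\t%s\n", $1, name[i], $i
        }
    ' "$1"
}
```

For example, `pairs_above output_ANI/ANIm_percentage_identity.tab 0.95` would print every pair of sequences with at least 95% identity, one pair per line.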

References

Figueras, M.J., Beaz-Hidalgo, R., Hossain, M.J., Liles, M.R., 2014. Taxonomic Affiliation of New Genomes Should Be Verified Using Average Nucleotide Identity and Multilocus Phylogenetic Analysis. Genome Announc 2, e00927-14. https://doi.org/10.1128/genomeA.00927-14

Richter, M., Rosselló-Móra, R., 2009. Shifting the genomic gold standard for the prokaryotic species definition. PNAS 106, 19126–19131. https://doi.org/10.1073/pnas.0906412106

Owner

Name: Asad Prodhan
Login: asadprodhan
Kind: user
Location: Perth, Australia
Company: Department of Primary Industries and Regional Development

Website: www.linkedin.com/in/asadprodhan
Twitter: Asad_Prodhan
Repositories: 2
Profile: https://github.com/asadprodhan

Laboratory Scientist at DPIRD. My work involves Oxford Nanopore Sequencing and Bioinformatics for pest and pathogen diagnosis.
