staramr
Scans genome contigs against the ResFinder, PlasmidFinder, and PointFinder databases.
Statistics
- Stars: 120
- Watchers: 7
- Forks: 26
- Open Issues: 18
- Releases: 19
Metadata Files
README.md
staramr
staramr (`*AMR`) scans bacterial genome contigs against the ResFinder, PointFinder, and PlasmidFinder databases (used by the ResFinder webservice and other webservices offered by the Center for Genomic Epidemiology) and compiles a summary report of detected antimicrobial resistance genes. The star (`*`) in staramr indicates that it can handle all of the ResFinder, PointFinder, and PlasmidFinder databases.
Note: The predicted phenotypes/drug resistances are for microbiological resistance and not clinical resistance. This is provided with support from the NARMS/CIPARS Molecular Working Group and is continually being improved. A small comparison between phenotype/drug resistance predictions produced by staramr and those available from NCBI can be found in the tutorial. We welcome any feedback or suggestions.
For example:
staramr search -o out --pointfinder-organism salmonella *.fasta
out/summary.tsv:
| Isolate ID | Quality Module | Genotype | Predicted Phenotype | CGE Predicted Phenotype | Plasmid | Scheme | Sequence Type | Genome Length | N50 value | Number of Contigs Greater Than Or Equal To 300 bp | Quality Module Feedback |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SRR1952908 | Passed | aadA1, aadA2, blaTEM-57, cmlA1, gyrA (S83Y), sul3, tet(A) | streptomycin, ampicillin, chloramphenicol, ciprofloxacin I/R, nalidixic acid, sulfisoxazole, tetracycline | Spectinomycin, Streptomycin, Amoxicillin, Ampicillin, Cephalothin, Piperacillin, Ticarcillin, Chloramphenicol, Nalidixic acid, Ciprofloxacin, Sulfamethoxazole, Doxycycline, Tetracycline | ColpVC, IncFIB(S), IncFII(S), IncI1-I(Alpha) | senterica_achtman_2 | 11 | 4785500 | 250423 | 41 | |
| SRR1952926 | Passed | blaTEM-57, gyrA (S83Y), tet(A) | ampicillin, ciprofloxacin I/R, nalidixic acid, tetracycline | Amoxicillin, Ampicillin, Cephalothin, Piperacillin, Ticarcillin, Nalidixic acid, Ciprofloxacin, Doxycycline, Tetracycline | ColpVC, IncFIB(S), IncFII(S), IncI1-I(Alpha) | senterica_achtman_2 | 11 | 4785451 | 228311 | 40 | |
out/detailed_summary.tsv:
| Isolate ID | Data | Data Type | Predicted Phenotype | CGE Predicted Phenotype | %Identity | %Overlap | HSP Length/Total Length | Contig | Start | End | Accession |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SRR1952908 | ST11 (senterica_achtman_2) | MLST | | | | | | | | | |
| SRR1952908 | ColpVC | Plasmid | | | 98.96 | 100.0 | 193/193 | contig00038 | 1618 | 1426 | JX133088 |
| SRR1952908 | aadA1 | Resistance | streptomycin | Spectinomycin, Streptomycin | 100.0 | 100.0 | 792/792 | contig00030 | 5355 | 4564 | JQ414041 |
out/resfinder.tsv:
| Isolate ID | Gene | Predicted Phenotype | CGE Predicted Phenotype | %Identity | %Overlap | HSP Length/Total Length | Contig | Start | End | Accession | Sequence | CGE Notes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SRR1952908 | sul3 | sulfisoxazole | Sulfamethoxazole | 100.00 | 100.00 | 792/792 | contig00030 | 2091 | 2882 | AJ459418 | ATGA[...] | |
| SRR1952908 | tet(A) | tetracycline | Doxycycline, Tetracycline | 99.92 | 97.80 | 1247/1275 | contig00032 | 1476 | 2722 | AF534183 | ATGT[...] | |
out/pointfinder.tsv:
| Isolate ID | Gene | Predicted Phenotype | CGE Predicted Phenotype | Type | Position | Mutation | %Identity | %Overlap | HSP Length/Total Length | Contig | Start | End | Pointfinder Position | CGE Notes | CGE Required Mutation | CGE Mechanism | CGE PMID |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SRR1952908 | gyrA (S83Y) | ciprofloxacin I/R, nalidixic acid | Nalidixic acid,Ciprofloxacin | codon | 83 | TCC -> TAC (S -> Y) | 99.96 | 100.00 | 2637/2637 | contig00008 | 22801 | 20165 | S83Y | | | Target modification | 7492118,10471553 |
| SRR1952926 | gyrA (S83Y) | ciprofloxacin I/R, nalidixic acid | Nalidixic acid,Ciprofloxacin | codon | 83 | TCC -> TAC (S -> Y) | 99.96 | 100.00 | 2637/2637 | contig00011 | 157768 | 160404 | S83Y | | | Target modification | 7492118,10471553 |
out/plasmidfinder.tsv:
| Isolate ID | Plasmid | %Identity | %Overlap | HSP Length/Total Length | Contig | Start | End | Accession |
|------------|-----------|-----------|----------|-------------------------|-------------|-------|-------|-----------|
| SRR1952908 | ColpVC | 98.96 | 100 | 193/193 | contig00038 | 1618 | 1426 | JX133088 |
| SRR1952908 | IncFIB(S) | 98.91 | 100 | 643/643 | contig00024 | 10302 | 9660 | FN432031 |
out/mlst.tsv:
| Isolate ID | Scheme | Sequence Type | Locus 1 | Locus 2 | Locus 3 | Locus 4 | Locus 5 | Locus 6 | Locus 7 |
|------------|-----------|---------------|---------|---------|---------|---------|---------|---------|----------|
| SRR1952908 | senterica_achtman_2 | 11 | aroC(5) | dnaN(2) | hemD(3) | hisD(7) | purE(6) | sucA(6) | thrA(11) |
| SRR1952926 | senterica_achtman_2 | 11 | aroC(5) | dnaN(2) | hemD(3) | hisD(7) | purE(6) | sucA(6) | thrA(11) |
Quick Usage
Search contigs
To search a list of contigs (in fasta format) for AMR genes using ResFinder please run:
```bash
staramr search -o out *.fasta
```
Output files will be located in the directory out/.
To include acquired point-mutation resistances using PointFinder, please run:
```bash
staramr search --pointfinder-organism salmonella -o out *.fasta
```
Where --pointfinder-organism is the specific organism you are interested in (for example salmonella, campylobacter, enterococcus_faecalis, or enterococcus_faecium; the Caveats section lists all currently supported organisms).
To specify which PlasmidFinder database to use, please run:
```bash
staramr search --plasmidfinder-database-type enterobacteriaceae -o out *.fasta
```
Where --plasmidfinder-database-type is the specific database type you are interested in (currently only gram_positive, enterobacteriaceae are supported). By default, both databases are used.
To specify which MLST scheme to use, please run:
```bash
staramr search -o out --mlst-scheme senterica *.fasta
```
Where --mlst-scheme is the specific MLST scheme you are interested in (please visit the scheme genus map to see which are available). By default, the scheme is detected automatically.
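When staramr is driven from a script, the search options above can be assembled into an argument list. The following is a minimal sketch, not part of staramr itself: the helper name `build_search_command` and its parameters are hypothetical, and it only uses the flags shown in this section.

```python
from typing import List, Optional, Sequence


def build_search_command(
    fasta_files: Sequence[str],
    out_dir: str = "out",
    pointfinder_organism: Optional[str] = None,
    plasmidfinder_database_type: Optional[str] = None,
    mlst_scheme: Optional[str] = None,
) -> List[str]:
    """Assemble an argument list for `staramr search` (hypothetical helper)."""
    command = ["staramr", "search", "-o", out_dir]
    # Each optional flag is added only when a value was supplied.
    if pointfinder_organism:
        command += ["--pointfinder-organism", pointfinder_organism]
    if plasmidfinder_database_type:
        command += ["--plasmidfinder-database-type", plasmidfinder_database_type]
    if mlst_scheme:
        command += ["--mlst-scheme", mlst_scheme]
    # Input FASTA files go last, as in the examples above.
    return command + list(fasta_files)


command = build_search_command(
    ["SRR1952908.fasta", "SRR1952926.fasta"],
    pointfinder_organism="salmonella",
)
```

The resulting list can be passed to `subprocess.run(command, check=True)`. Note that no shell is involved in that case, so a pattern such as `*.fasta` would need to be expanded first (for example with `glob.glob`).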
Database Info
To print information about the installed databases, please run:
staramr db info
Update Database
If you wish to update to the latest ResFinder, PointFinder, and PlasmidFinder databases, you may run:
```bash
staramr db update --update-default
```
If you wish to switch to specific git commits of either ResFinder, PointFinder, or PlasmidFinder databases you may also pass --resfinder-commit [COMMIT], --pointfinder-commit [COMMIT], and --plasmidfinder-commit [COMMIT]. However, please note that because of compatibility issues arising from changes in the source databases, this functionality is largely unsupported and is unlikely to work for versions of the databases that StarAMR wasn't released with.
Restore Database
If you have updated the ResFinder/PointFinder/PlasmidFinder databases and wish to restore to the default version, you may run:
staramr db restore-default
Installation
Bioconda
Separate conda environment
The easiest way to install staramr is through Bioconda (we recommend using mamba as an alternative to conda).
```bash
conda install mamba # Install mamba to make it easier to install later dependencies
mamba create -c conda-forge -c bioconda -c defaults --name staramr staramr
```
This will install the staramr software at the most recent version within the conda environment named staramr. Bioconda will install all necessary dependencies and databases. Once this is complete you can run:
```bash
conda activate staramr # Activate conda environment
staramr --help
```
Same conda environment
If, instead, you wish to install staramr to the current conda environment you can run:
```bash
mamba install -c conda-forge -c bioconda -c defaults staramr
```
You should now be able to run staramr --help and receive a usage statement.
PyPI/Pip
You can also install staramr from PyPI using pip:
pip install staramr
However, you will have to install the external dependencies (listed below) separately.
Latest Code
If you wish to make use of the latest in-development version of staramr, you may update directly from GitHub using pip:
```bash
pip install git+https://github.com/phac-nml/staramr
```
This installs only the Python code; you will still have to install the dependencies listed below (or run the pip command from the previously installed Bioconda environment).
Alternatively, if you wish to do development with staramr you can use a Python virtual environment (you must still install the non-Python dependencies separately).
```bash
# Clone code
git clone https://github.com/phac-nml/staramr.git
cd staramr

# Setup virtual environment
virtualenv -p /path/to/python-bin .venv
source .venv/bin/activate

# Install staramr. Use '-e' to update the install on code changes.
pip install -e .

# Now run staramr
staramr
```
Due to the way we packaged the ResFinder/PointFinder/PlasmidFinder databases, the development code will not come with a default database. You must first build the database before usage. E.g.
staramr db restore-default
Dependencies
- Python 3.7+
- BLAST+
- Git
- MLST
Input
List of genes to exclude
By default, the ResFinder/PointFinder/PlasmidFinder genes listed in genes_to_exclude.tsv will be excluded from the final results. To pass a custom list of genes, use the option --exclude-genes-file, where the specified file contains a list of the sequence IDs (one per line) from the ResFinder/PointFinder/PlasmidFinder databases. For example:
gene_id
gyrA_1_CP073768.1
pmrB_1_CP051284.1
Please make sure to include gene_id in the first line. The default exclusion list can also be disabled with --no-exclude-genes. Gene IDs must exactly match the FASTA record IDs provided in the source databases.
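As an illustration of the file layout described above, the following sketch produces the text of an exclusion file: a `gene_id` header line followed by one sequence ID per line. The function name `exclude_genes_text` is hypothetical and not part of staramr.

```python
from typing import Iterable


def exclude_genes_text(gene_ids: Iterable[str]) -> str:
    """Return exclusion-file contents: a 'gene_id' header, then one ID per line."""
    lines = ["gene_id"]
    for gene_id in gene_ids:
        gene_id = gene_id.strip()
        if gene_id:  # skip blank entries so the file has no empty lines
            lines.append(gene_id)
    return "\n".join(lines) + "\n"


text = exclude_genes_text(["gyrA_1_CP073768.1", "pmrB_1_CP051284.1"])
```

The returned text can be written to a file and passed with --exclude-genes-file. The IDs still have to match the FASTA record IDs in the source databases exactly; this sketch does not check that.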
Complex Mutations
Complex mutations describe multiple point mutations that must be simultaneously present in order to confer resistance. One such example is the multiple pbp5 mutations that must be present in Enterococcus faecium in order to confer ampicillin resistance. These complex mutations may be specified by the user using a TSV-formatted file with the following format:
| positions | mandatory | phenotype |
| --- | --- | --- |
| gene (mutation1), gene (mutation2) | gene (mutation1) | phenotype |
Where positions lists all the point mutations to group into the complex mutation (optional and mandatory), mandatory lists the point mutations that must be present for the complex mutation to be reported (mandatory is a subset of positions), and phenotype is the phenotype that is conferred when this set of mutations is present. To see a specific example of this, please look at the default complex_mutations.tsv file included with staramr. The mutation will be reported in the pointfinder.tsv file similar to the following:
| Isolate ID | Gene | Predicted Phenotype | CGE Predicted Phenotype | Type | Position | Mutation | %Identity | %Overlap | HSP Length/Total Length | Contig | Start | End | Pointfinder Position | CGE Notes | CGE Required Mutation | CGE Mechanism | CGE PMID |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| pbp5 | pbp5 (A216S), pbp5 (A499T), pbp5 (A68T), pbp5 (D204G), pbp5 (E100Q), pbp5 (E525D), pbp5 (E629V), pbp5 (E85D), pbp5 (G66E), pbp5 (K144Q), pbp5 (L177I), pbp5 (M485A), pbp5 (N496K), pbp5 (P667S), pbp5 (R34Q), pbp5 (S27G), pbp5 (T172A), pbp5 (T324A), pbp5 (V24A), pbp5 (V586L) | ampicillin | - | complex | 524, 527, 534, 566, 568, 585, 5100, 5144, 5172, 5177, 5204, 5216, 5324, 5485, 5496, 5499, 5525, 5586, 5629, 5667 | complex | 98.28 | 100.00 | 2037/2037 | pbp5_1_AAK43724.1 | 1 | 2037 | pbp5 (A216S), pbp5 (A499T), pbp5 (A68T), pbp5 (D204G), pbp5 (E100Q), pbp5 (E525D), pbp5 (E629V), pbp5 (E85D), pbp5 (G66E), pbp5 (K144Q), pbp5 (L177I), pbp5 (M485A), pbp5 (N496K), pbp5 (P667S), pbp5 (R34Q), pbp5 (S27G), pbp5 (T172A), pbp5 (T324A), pbp5 (V24A), pbp5 (V586L) | - | - | - | - |
The complex mutation TSV file may be specified on the command line when running PointFinder:
staramr search --pointfinder-organism enterococcus_faecium -o out pbp5.fa --complex-mutations-file complex_mutations.tsv
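The positions/mandatory relationship described above can be sketched as a set check: a complex mutation is reported when every mandatory mutation is among those observed. This is an illustrative reading of the file format, not staramr's actual implementation; the function names and the choice of mandatory mutations in the sample row are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple


def parse_mutation_list(cell: str) -> Set[str]:
    """Split a comma-separated cell such as 'pbp5 (V24A), pbp5 (S27G)' into a set."""
    return {item.strip() for item in cell.split(",") if item.strip()}


def complex_mutation_match(
    row: Dict[str, str], observed: Set[str]
) -> Optional[Tuple[str, List[str]]]:
    """Return (phenotype, grouped mutations) if all mandatory mutations are observed."""
    positions = parse_mutation_list(row["positions"])
    mandatory = parse_mutation_list(row["mandatory"])
    if not mandatory <= positions:
        raise ValueError("mandatory must be a subset of positions")
    if mandatory and mandatory <= observed:
        # Group every observed mutation that belongs to this complex mutation.
        return row["phenotype"], sorted(positions & observed)
    return None


# Hypothetical row: which mutations are mandatory here is made up for illustration.
row = {
    "positions": "pbp5 (V24A), pbp5 (S27G), pbp5 (R34Q)",
    "mandatory": "pbp5 (V24A), pbp5 (S27G)",
    "phenotype": "ampicillin",
}
match = complex_mutation_match(row, {"pbp5 (V24A)", "pbp5 (S27G)", "gyrA (S83Y)"})
```

Here `match` holds the phenotype and the two observed pbp5 mutations, while an isolate carrying only `pbp5 (V24A)` would yield `None`.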
Output
There are 8 different output files produced by staramr:
- summary.tsv: A summary of all detected AMR genes/mutations/plasmids/sequence type in each genome, one genome per line. A series of descriptive statistics is also provided for each genome, as well as feedback on whether or not the genome passes several quality metrics and, if not, why the genome fails.
- detailed_summary.tsv: A detailed summary of all detected AMR genes/mutations/plasmids/sequence type in each genome, one gene per line.
- resfinder.tsv: A tabular file of each AMR gene and additional BLAST information from the ResFinder database, one gene per line.
- pointfinder.tsv: A tabular file of each AMR point mutation and additional BLAST information from the PointFinder database, one gene per line.
- plasmidfinder.tsv: A tabular file of each AMR plasmid type and additional BLAST information from the PlasmidFinder database, one plasmid type per line.
- mlst.tsv: A tabular file of each multi-locus sequence type (MLST) and its corresponding locus/alleles, one genome per line.
- settings.txt: The command-line, database versions, and other settings used to run staramr.
- results.xlsx: An Excel spreadsheet containing the previous 6 files as separate worksheets.
In addition, the directory hits/ stores fasta files of the specific blast hits.
summary.tsv
The summary.tsv output file generated by staramr contains the following columns:
- Isolate ID: The id of the isolate/genome file(s) passed to staramr.
- Quality Module: The isolate/genome file(s) pass/fail result(s) for the quality metrics
- Genotype: The AMR genotype of the isolate.
- Predicted Phenotype: The predicted AMR phenotype (drug resistances) for the isolate.
- CGE Predicted Phenotype: The CGE-predicted AMR phenotype (drug resistances) for the isolate.
- Plasmid: Plasmid types that were found for the isolate.
- Scheme: The MLST scheme used
- Sequence Type: The sequence type that's assigned when combining all allele types
- Genome Length: The isolate/genome file(s) genome length(s)
- N50 value: The isolate/genome file(s) N50 value(s)
- Number of Contigs Greater Than Or Equal To 300 bp: The number of contigs greater than or equal to 300 base pairs in the isolate/genome file(s)
- Quality Module Feedback: The isolate/genome file(s) detailed feedback for the quality metrics
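To make the assembly statistics above concrete, the sketch below computes a genome length, an N50 value, and a count of contigs of at least 300 bp from a list of contig lengths. It uses the standard N50 definition (the length of the contig at which the running total, taken from longest to shortest, first reaches half the total length). Whether staramr applies exactly the same rules, for example which contigs count towards Genome Length, is an assumption here, and `assembly_stats` is a hypothetical name.

```python
from typing import Dict, Iterable


def assembly_stats(contig_lengths: Iterable[int], min_contig_length: int = 300) -> Dict[str, int]:
    """Compute genome length, N50, and the number of contigs >= min_contig_length."""
    lengths = sorted(contig_lengths, reverse=True)
    genome_length = sum(lengths)
    n50 = 0
    running_total = 0
    for length in lengths:
        running_total += length
        # N50: first contig where the running total reaches half the genome length.
        if running_total * 2 >= genome_length:
            n50 = length
            break
    return {
        "genome_length": genome_length,
        "n50": n50,
        "contigs_ge_min": sum(1 for length in lengths if length >= min_contig_length),
    }


stats = assembly_stats([100, 200, 300, 400, 500])
```

For these five made-up contigs the total is 1500 bp; the two longest contigs (500 + 400) already cover half of that, so the N50 is 400.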
Example
| Isolate ID | Quality Module | Genotype | Predicted Phenotype | CGE Predicted Phenotype | Plasmid | Scheme | Sequence Type | Genome Length | N50 value | Number of Contigs Greater Than Or Equal To 300 bp | Quality Module Feedback |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SRR1952908 | Passed | aadA1, aadA2, blaTEM-57, cmlA1, gyrA (S83Y), sul3, tet(A) | streptomycin, ampicillin, chloramphenicol, ciprofloxacin I/R, nalidixic acid, sulfisoxazole, tetracycline | Spectinomycin, Streptomycin, Amoxicillin, Ampicillin, Cephalothin, Piperacillin, Ticarcillin, Chloramphenicol, Nalidixic acid, Ciprofloxacin, Sulfamethoxazole, Doxycycline, Tetracycline | ColpVC, IncFIB(S), IncFII(S), IncI1-I(Alpha) | senterica_achtman_2 | 11 | 4785500 | 250423 | 41 | |
| SRR1952926 | Passed | blaTEM-57, gyrA (S83Y), tet(A) | ampicillin, ciprofloxacin I/R, nalidixic acid, tetracycline | Amoxicillin, Ampicillin, Cephalothin, Piperacillin, Ticarcillin, Nalidixic acid, Ciprofloxacin, Doxycycline, Tetracycline | ColpVC, IncFIB(S), IncFII(S), IncI1-I(Alpha) | senterica_achtman_2 | 11 | 4785451 | 228311 | 40 | |
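Because summary.tsv is tab-separated with one genome per line, it can be read with Python's standard `csv` module. The sketch below is one possible way to do this, assuming the column names listed above; the sample text is a trimmed-down stand-in for a real file, and splitting Genotype on commas is an assumption based on the example rows.

```python
import csv
import io

# Trimmed-down stand-in for out/summary.tsv (real files have more columns).
SAMPLE_SUMMARY = (
    "Isolate ID\tQuality Module\tGenotype\n"
    "SRR1952908\tPassed\taadA1, aadA2, blaTEM-57, cmlA1, gyrA (S83Y), sul3, tet(A)\n"
    "SRR1952926\tPassed\tblaTEM-57, gyrA (S83Y), tet(A)\n"
)


def read_summary(handle):
    """Return one dict per isolate, with the Genotype column split into a list."""
    records = []
    for row in csv.DictReader(handle, delimiter="\t"):
        genotype = row.get("Genotype") or ""
        row["Genotype"] = [gene.strip() for gene in genotype.split(",") if gene.strip()]
        records.append(row)
    return records


records = read_summary(io.StringIO(SAMPLE_SUMMARY))
passed = [record["Isolate ID"] for record in records if record["Quality Module"] == "Passed"]
```

For a real run, replace the `io.StringIO` object with `open("out/summary.tsv", newline="")`.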
detailed_summary.tsv
The detailed_summary.tsv output file generated by staramr contains the following columns:
- Isolate ID: The id of the isolate/genome file(s) passed to staramr.
- Data: The particular gene detected from ResFinder, PlasmidFinder, PointFinder, or the sequence type.
- Data Type: The type of gene (Resistance or Plasmid), or MLST.
- Predicted Phenotype: The predicted AMR phenotype (drug resistances) found in ResFinder/PointFinder. Plasmids will be left blank by default.
- CGE Predicted Phenotype: The CGE-predicted AMR phenotype (drug resistances) found in ResFinder/PointFinder. Plasmids will be left blank by default.
- %Identity: The % identity of the top BLAST HSP to the gene.
- %Overlap: The % overlap of the top BLAST HSP to the gene (calculated as hsp length/total length * 100).
- HSP Length/Total Length: The top BLAST HSP length over the gene total length (nucleotides).
- Contig: The contig id containing this gene.
- Start: The start of the gene (will be greater than End if on minus strand).
- End: The end of the gene.
- Accession: The accession of the gene from either ResFinder or PlasmidFinder database.
Example
| Isolate ID | Data | Data Type | Predicted Phenotype | CGE Predicted Phenotype | %Identity | %Overlap | HSP Length/Total Length | Contig | Start | End | Accession |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SRR1952908 | ST11 (senterica_achtman_2) | MLST | | | | | | | | | |
| SRR1952908 | ColpVC | Plasmid | | | 98.96 | 100.0 | 193/193 | contig00038 | 1618 | 1426 | JX133088 |
| SRR1952908 | aadA1 | Resistance | streptomycin | Spectinomycin, Streptomycin | 100.0 | 100.0 | 792/792 | contig00030 | 5355 | 4564 | JQ414041 |
resfinder.tsv
The resfinder.tsv output file generated by staramr contains the following columns:
- Isolate ID: The id of the isolate/genome file(s) passed to staramr.
- Gene: The particular AMR gene detected.
- Predicted Phenotype: The predicted AMR phenotype (drug resistances) for this gene.
- CGE Predicted Phenotype: The CGE-predicted AMR phenotype (drug resistances) for this gene.
- %Identity: The % identity of the top BLAST HSP to the AMR gene.
- %Overlap: The % overlap of the top BLAST HSP to the AMR gene (calculated as hsp length/total length * 100).
- HSP Length/Total Length: The top BLAST HSP length over the AMR gene total length (nucleotides).
- Contig: The contig id containing this AMR gene.
- Start: The start of the AMR gene (will be greater than End if on minus strand).
- End: The end of the AMR gene.
- Accession: The accession of the AMR gene in the ResFinder database.
- Sequence: The AMR Gene sequence
- CGE Notes: Any CGE notes associated with the prediction.
Example
| Isolate ID | Gene | Predicted Phenotype | CGE Predicted Phenotype | %Identity | %Overlap | HSP Length/Total Length | Contig | Start | End | Accession | Sequence | CGE Notes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SRR1952908 | sul3 | sulfisoxazole | Sulfamethoxazole | 100.00 | 100.00 | 792/792 | contig00030 | 2091 | 2882 | AJ459418 | ATGA[...] | |
| SRR1952908 | tet(A) | tetracycline | Doxycycline, Tetracycline | 99.92 | 97.80 | 1247/1275 | contig00032 | 1476 | 2722 | AF534183 | ATGT[...] | |
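The %Overlap column can be recomputed from the HSP Length/Total Length column using the formula given above (hsp length/total length * 100). The following sketch does that for a single cell value; `percent_overlap` is a hypothetical helper, and rounding to two decimals is an assumption based on the example rows.

```python
def percent_overlap(hsp_over_total: str) -> float:
    """Convert an 'HSP Length/Total Length' cell such as '1247/1275' to % overlap."""
    hsp_length, total_length = (int(part) for part in hsp_over_total.split("/"))
    return round(hsp_length / total_length * 100, 2)


overlap = percent_overlap("1247/1275")
```

For the tet(A) row in the example, 1247/1275 gives 97.8, which matches the reported %Overlap of 97.80.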
pointfinder.tsv
The pointfinder.tsv output file generated by staramr contains the following columns:
- Isolate ID: The id of the isolate/genome file(s) passed to staramr.
- Gene: The particular AMR gene detected, with the point mutation within.
- Predicted Phenotype: The predicted AMR phenotype (drug resistances) for this gene.
- CGE Predicted Phenotype: The CGE-predicted AMR phenotype (drug resistances) for this gene.
- Type: The type of this mutation from PointFinder (either codon or nucleotide).
- Position: The position of the mutation. For codon type, the position is the codon number in the gene, for nucleotide type it is the nucleotide number.
- Mutation: The particular mutation. For codon type lists the codon mutation, for nucleotide type lists the single nucleotide mutation.
- %Identity: The % identity of the top BLAST HSP to the AMR gene.
- %Overlap: The % overlap of the top BLAST HSP to the AMR gene (calculated as hsp length/total length * 100).
- HSP Length/Total Length: The top BLAST HSP length over the AMR gene total length (nucleotides).
- Contig: The contig id containing this AMR gene.
- Start: The start of the AMR gene (will be greater than End if on minus strand).
- End: The end of the AMR gene.
- Pointfinder Position: The Pointfinder-adjusted position, which may be off by one from the sequence position in the case of some indels.
- CGE Notes: Any CGE notes associated with the prediction.
- CGE Required Mutation: Any additional mutations that CGE predicts are required to confer the CGE predicted phenotype.
- CGE Mechanism: The CGE-reported mechanism.
- CGE PMID: The PMID ID associated with the CGE prediction.
Example
| Isolate ID | Gene | Predicted Phenotype | CGE Predicted Phenotype | Type | Position | Mutation | %Identity | %Overlap | HSP Length/Total Length | Contig | Start | End | Pointfinder Position | CGE Notes | CGE Required Mutation | CGE Mechanism | CGE PMID |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SRR1952908 | gyrA (S83Y) | ciprofloxacin I/R, nalidixic acid | Nalidixic acid,Ciprofloxacin | codon | 83 | TCC -> TAC (S -> Y) | 99.96 | 100.00 | 2637/2637 | contig00008 | 22801 | 20165 | S83Y | | | Target modification | 7492118,10471553 |
| SRR1952926 | gyrA (S83Y) | ciprofloxacin I/R, nalidixic acid | Nalidixic acid,Ciprofloxacin | codon | 83 | TCC -> TAC (S -> Y) | 99.96 | 100.00 | 2637/2637 | contig00011 | 157768 | 160404 | S83Y | | | Target modification | 7492118,10471553 |
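Since Start is greater than End for hits on the minus strand, the strand and a normalised interval can be derived from those two columns. The sketch below shows one way to do this; `hit_location` is a hypothetical helper, and coordinates are assumed to be 1-based and inclusive.

```python
from typing import Tuple


def hit_location(start: int, end: int) -> Tuple[int, int, str]:
    """Return (left, right, strand) for a hit, with left <= right."""
    strand = "-" if start > end else "+"
    return min(start, end), max(start, end), strand


left, right, strand = hit_location(22801, 20165)
hit_length = right - left + 1  # inclusive coordinates
```

For the SRR1952908 gyrA row this gives a minus-strand hit spanning 2637 bp, which agrees with the 2637/2637 value in the HSP Length/Total Length column.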
plasmidfinder.tsv
The plasmidfinder.tsv output file generated by staramr contains the following columns:
- Isolate ID: The id of the isolate/genome file(s) passed to staramr.
- Plasmid: The particular plasmid type detected.
- %Identity: The % identity of the top BLAST HSP to the plasmid type.
- %Overlap: The % overlap of the top BLAST HSP to the plasmid type (calculated as hsp length/total length * 100).
- HSP Length/Total Length: The top BLAST HSP length over the plasmid type total length (nucleotides).
- Contig: The contig id containing this plasmid type.
- Start: The start of the plasmid type (will be greater than End if on minus strand).
- End: The end of the plasmid type.
- Accession: The accession of the plasmid type in the PlasmidFinder database.
Example
| Isolate ID | Plasmid | %Identity | %Overlap | HSP Length/Total Length | Contig | Start | End | Accession |
|------------|-----------|-----------|----------|-------------------------|-------------|-------|-------|-----------|
| SRR1952908 | ColpVC | 98.96 | 100 | 193/193 | contig00038 | 1618 | 1426 | JX133088 |
| SRR1952908 | IncFIB(S) | 98.91 | 100 | 643/643 | contig00024 | 10302 | 9660 | FN432031 |
mlst.tsv
The mlst.tsv output file generated by staramr contains the following columns:
- Isolate ID: The id of the isolate/genome file(s) passed to staramr.
- Scheme: The scheme that MLST has identified.
- Sequence Type: The sequence type that's assigned when combining all allele types
- Locus #: A particular locus in the specified MLST scheme.
Example
| Isolate ID | Scheme | Sequence Type | Locus 1 | Locus 2 | Locus 3 | Locus 4 | Locus 5 | Locus 6 | Locus 7 |
|------------|-----------|---------------|---------|---------|---------|---------|---------|---------|----------|
| SRR1952908 | senterica_achtman_2 | 11 | aroC(5) | dnaN(2) | hemD(3) | hisD(7) | purE(6) | sucA(6) | thrA(11) |
| SRR1952926 | senterica_achtman_2 | 11 | aroC(5) | dnaN(2) | hemD(3) | hisD(7) | purE(6) | sucA(6) | thrA(11) |
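Each Locus cell combines a locus name and an allele in the form `name(allele)`. A small sketch for splitting such a cell is shown below; `parse_locus_cell` is a hypothetical helper. The allele is kept as a string, on the assumption that it may not always be a plain integer.

```python
import re
from typing import Tuple

# Matches cells such as 'aroC(5)': a locus name followed by an allele in parentheses.
LOCUS_PATTERN = re.compile(r"^(?P<locus>[^()]+)\((?P<allele>[^()]*)\)$")


def parse_locus_cell(cell: str) -> Tuple[str, str]:
    """Split a Locus cell like 'aroC(5)' into ('aroC', '5')."""
    match = LOCUS_PATTERN.match(cell.strip())
    if match is None:
        raise ValueError(f"unexpected locus cell: {cell!r}")
    return match.group("locus"), match.group("allele")


profile = dict(
    parse_locus_cell(cell)
    for cell in ["aroC(5)", "dnaN(2)", "hemD(3)", "hisD(7)", "purE(6)", "sucA(6)", "thrA(11)"]
)
```

Here `profile` maps each of the seven loci from the example row to its allele, for instance `profile["thrA"]` is `"11"`.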
settings.txt
The settings.txt file contains the particular settings used to run staramr.
- command_line: The command line used to run staramr.
- version: The version of staramr.
- start_time, end_time, total_minutes: The start, end, and duration for running staramr.
- resfinder_db_dir, pointfinder_db_dir, plasmidfinder_db_dir: The directory containing the ResFinder, PointFinder, and PlasmidFinder databases.
- resfinder_db_url, pointfinder_db_url, plasmidfinder_db_url: The URL to the git repository for the ResFinder, PointFinder, and PlasmidFinder databases.
- resfinder_db_commit, pointfinder_db_commit, plasmidfinder_db_commit: The git commit ids for the ResFinder, PointFinder, and PlasmidFinder databases.
- resfinder_db_date, pointfinder_db_date, plasmidfinder_db_date: The date of the git commits of the ResFinder, PointFinder, and PlasmidFinder databases.
- mlst_version: The version of MLST.
- pointfinder_gene_drug_version, resfinder_gene_drug_version: A version identifier for the gene/drug mapping table used by staramr.
Example

hits/
The hits/ directory contains the BLAST HSP nucleotides for the entries listed in the resfinder.tsv and pointfinder.tsv files. There are up to two files per input genome, one for ResFinder and one for PointFinder.
For example, with an input genome named SRR1952908.fasta there would be two files hits/resfinder_SRR1952908.fasta and hits/pointfinder_SRR1952908.fasta. These files contain mostly the same information as in the resfinder.tsv, pointfinder.tsv, and plasmidfinder.tsv files. Additional information is the database_gene_start and database_gene_end listing the start/end of the BLAST HSP on the AMR gene from the ResFinder/PointFinder/PlasmidFinder databases.
Example
```
>aadA1_3_JQ414041 isolate: SRR1952908, contig: contig00030, contig_start: 5355, contig_end: 4564, database_gene_start: 1, database_gene_end: 792, hsp/length: 792/792, pid: 100.00%, plength: 100.00%
ATGAGGGAAGCGGTGATCGCCGAAGTATCGACTCAACTATCAGAGGTAGTTGGCGTCATC
GAGCGCCATCTCGAACCGACGTTGCTGGCCGTACATTTGTACGGCTCCGCAGTGGATGGC
...
```
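The header line of each record in the hits/ files carries its metadata as comma-separated `key: value` pairs after the record id. The sketch below splits such a header into an id and a dictionary. It is an illustration based only on the example above: `parse_hit_header` is a hypothetical helper, the sample header is shortened, and values that themselves contain `, ` would not be handled.

```python
from typing import Dict, Tuple


def parse_hit_header(header: str) -> Tuple[str, Dict[str, str]]:
    """Split a hits/ FASTA header into (record id, {key: value} metadata)."""
    header = header.lstrip(">").strip()
    record_id, _, description = header.partition(" ")
    fields: Dict[str, str] = {}
    for item in description.split(", "):
        key, separator, value = item.partition(": ")
        if separator:  # ignore fragments that are not 'key: value' pairs
            fields[key] = value
    return record_id, fields


# Shortened, made-up header in the same style as the example above.
record_id, fields = parse_hit_header(
    ">geneX isolate: SRR1952908, contig: contig00030, hsp/length: 792/792, pid: 100.00%"
)
```

After this call, `fields["contig"]` is `"contig00030"` and `fields["hsp/length"]` is `"792/792"`.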
Tutorial
A tutorial guiding you through the usage of staramr, interpreting the results, and comparing with antimicrobial resistances available on NCBI can be found at staramr tutorial.
Usage
Main Command
Main staramr command. Can be used to set global options (primarily --verbose).

Search
Searches input FASTA files for AMR genes.

Database Build
Downloads and builds the ResFinder, PointFinder, and PlasmidFinder databases.

Database Update
Updates an existing download of the ResFinder, PointFinder, and PlasmidFinder databases.

Database Info
Prints information about an existing build of the ResFinder/PointFinder/PlasmidFinder databases.

Database Restore Default
Restores the default database for staramr.

Caveats
This software is still a work-in-progress. In particular, not all organisms stored in the PointFinder database are supported (only enterococcus_faecalis, helicobacter_pylori, enterococcus_faecium, campylobacter, escherichia_coli, salmonella are currently supported). Additionally, the predicted phenotypes are for microbiological resistance and not clinical resistance. Phenotype/drug resistance predictions are an experimental feature which is continually being improved.
staramr only works on assembled genomes and not directly on reads. A quick genome assembler you could use is Shovill. Or, you may also wish to try out the ResFinder webservice, or the command-line tools rgi or ariba which will work on sequence reads as well as genome assemblies. You may also wish to check out the CARD webservice.
Acknowledgements
Some ideas for the software were derived from the ResFinder, PointFinder, and PlasmidFinder command-line software, as well as from ABRicate and from the SISTR (Salmonella In Silico Typing Resource) command-line tool.
Phenotype/drug resistance predictions are provided with support from the NARMS/CIPARS Molecular Working Group.
The Multi-locus sequence typing program is from the MLST Github.
Citations
If you find staramr useful, please cite the following paper:
Bharat A, Petkau A, Avery BP, Chen JC, Folster JP, Carson CA, Kearney A, Nadon C, Mabon P, Thiessen J, Alexander DC, Allen V, El Bailey S, Bekal S, German GJ, Haldane D, Hoang L, Chui L, Minion J, Zahariadis G, Domselaar GV, Reid-Smith RJ, Mulvey MR. Correlation between Phenotypic and In Silico Detection of Antimicrobial Resistance in Salmonella enterica in Canada Using Staramr. Microorganisms. 2022; 10(2):292. https://doi.org/10.3390/microorganisms10020292
You may also consider citing the following (databases or other resources used by staramr):
Zankari E, Hasman H, Cosentino S, Vestergaard M, Rasmussen S, Lund O, Aarestrup FM, Larsen MV. 2012. Identification of acquired antimicrobial resistance genes. J. Antimicrob. Chemother. 67:2640–2644. doi: 10.1093/jac/dks261
Zankari E, Allesøe R, Joensen KG, Cavaco LM, Lund O, Aarestrup F. PointFinder: a novel web tool for WGS-based detection of antimicrobial resistance associated with chromosomal point mutations in bacterial pathogens. J Antimicrob Chemother. 2017; 72(10): 2764–8. doi: 10.1093/jac/dkx217
Carattoli A, Zankari E, Garcia-Fernandez A, Voldby Larsen M, Lund O, Villa L, Aarestrup FM, Hasman H. PlasmidFinder and pMLST: in silico detection and typing of plasmids. Antimicrob. Agents Chemother. 2014. April 28th. doi: 10.1128/AAC.02412-14
Seemann T, MLST Github https://github.com/tseemann/mlst
Jolley KA, Bray JE and Maiden MCJ. Open-access bacterial population genomics: BIGSdb software, the PubMLST.org website and their applications [version 1; peer review: 2 approved]. Wellcome Open Res 2018, 3:124. doi: 10.12688/wellcomeopenres.14826.1
Legal
Copyright 2018 Government of Canada
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this work except in compliance with the License. You may obtain a copy of the License at:
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Owner
- Name: National Microbiology Laboratory
- Login: phac-nml
- Kind: organization
- Website: https://www.nml-lnm.gc.ca/
- Repositories: 50
- Profile: https://github.com/phac-nml
Citation (CITATIONS.bib)
@Article{microorganisms10020292,
AUTHOR = {Bharat, Amrita and Petkau, Aaron and Avery, Brent P. and Chen, Jessica C. and Folster, Jason P. and Carson, Carolee A. and Kearney, Ashley and Nadon, Celine and Mabon, Philip and Thiessen, Jeffrey and Alexander, David C. and Allen, Vanessa and El Bailey, Sameh and Bekal, Sadjia and German, Greg J. and Haldane, David and Hoang, Linda and Chui, Linda and Minion, Jessica and Zahariadis, George and Domselaar, Gary Van and Reid-Smith, Richard J. and Mulvey, Michael R.},
TITLE = {Correlation between Phenotypic and In Silico Detection of Antimicrobial Resistance in Salmonella enterica in Canada Using Staramr},
JOURNAL = {Microorganisms},
VOLUME = {10},
YEAR = {2022},
NUMBER = {2},
ARTICLE-NUMBER = {292},
URL = {https://www.mdpi.com/2076-2607/10/2/292},
ISSN = {2076-2607},
ABSTRACT = {Whole genome sequencing (WGS) of Salmonella supports both molecular typing and detection of antimicrobial resistance (AMR). Here, we evaluated the correlation between phenotypic antimicrobial susceptibility testing (AST) and in silico prediction of AMR from WGS in Salmonella enterica (n = 1321) isolated from human infections in Canada. Phenotypic AMR results from broth microdilution testing were used as the gold standard. To facilitate high-throughput prediction of AMR from genome assemblies, we created a tool called Staramr, which incorporates the ResFinder and PointFinder databases and a custom gene-drug key for antibiogram prediction. Overall, there was 99% concordance between phenotypic and genotypic detection of categorical resistance for 14 antimicrobials in 1321 isolates (18,305 of 18,494 results in agreement). We observed an average sensitivity of 91.2% (range 80.5–100%), a specificity of 99.7% (98.6–100%), a positive predictive value of 95.4% (68.2–100%), and a negative predictive value of 99.1% (95.6–100%). The positive predictive value of gentamicin was 68%, due to seven isolates that carried aac(3)-IVa, which conferred MICs just below the breakpoint of resistance. Genetic mechanisms of resistance in these 1321 isolates included 64 unique acquired alleles and mutations in three chromosomal genes. In general, in silico prediction of AMR in Salmonella was reliable compared to the gold standard of broth microdilution. WGS can provide higher-resolution data on the epidemiology of resistance mechanisms and the emergence of new resistance alleles.},
DOI = {10.3390/microorganisms10020292}
}
GitHub Events
Total
- Create event: 5
- Release event: 1
- Issues event: 2
- Watch event: 43
- Delete event: 1
- Issue comment event: 1
- Push event: 19
- Pull request review event: 5
- Pull request event: 8
- Fork event: 4
Last Year
- Create event: 5
- Release event: 1
- Issues event: 2
- Watch event: 43
- Delete event: 1
- Issue comment event: 1
- Push event: 19
- Pull request review event: 5
- Pull request event: 8
- Fork event: 4
Committers
Last synced: about 2 years ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Aaron Petkau | a****u@c****a | 558 |
| Eric Marinier | e****r@g****m | 123 |
| Jennifer Tran | j****1@g****m | 116 |
| RichardDavidsonTheGreat | r****s@u****a | 60 |
| Mary Jo Ramos | m****s@g****m | 15 |
| Jeffrey Thiessen | J****n | 6 |
| Mary Jo Ramos | m****s@w****a | 3 |
| Mary Jo Ramos | m****s@p****a | 1 |
| Philip.Mabon | p****n@c****a | 1 |
| Aaron Petkau | a****u@g****m | 1 |
| Jennifer Tran | j****n@c****a | 1 |
Issues and Pull Requests
Last synced: over 1 year ago
All Time
- Total issues: 82
- Total pull requests: 58
- Average time to close issues: 5 months
- Average time to close pull requests: 24 days
- Total issue authors: 41
- Total pull request authors: 8
- Average comments per issue: 2.65
- Average comments per pull request: 0.64
- Merged pull requests: 50
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 16
- Pull requests: 8
- Average time to close issues: 21 days
- Average time to close pull requests: about 22 hours
- Issue authors: 10
- Pull request authors: 2
- Average comments per issue: 1.94
- Average comments per pull request: 0.25
- Merged pull requests: 7
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- apetkau (22)
- Takadonet (5)
- pimarin (3)
- emarinier (3)
- lerminin (3)
- vappiah (3)
- sgsutcliffe (2)
- PHemarajata (2)
- qianxin-kxy (2)
- eam12 (2)
- peflanag (2)
- ireneortega (2)
- splaisan (2)
- kapsakcj (1)
- bgruening (1)
Pull Request Authors
- emarinier (20)
- apetkau (18)
- RichardDavidsonTheGreat (9)
- jennifertran (6)
- Takadonet (2)
- sgsutcliffe (2)
- pimarin (2)
- JeffreyThiessen (1)
Packages
- Total packages: 1
- Total downloads:
  - pypi 73 last-month
- Total dependent packages: 0
- Total dependent repositories: 4
- Total versions: 19
- Total maintainers: 2
pypi.org: staramr
Scans genome contigs against ResFinder, PlasmidFinder, and PointFinder databases
- Homepage: https://github.com/phac-nml/staramr
- Documentation: https://staramr.readthedocs.io/
- License: Apache v2.0
- Latest release: 0.11.0 (published about 1 year ago)
Dependencies
- GitPython >=2.1.3
- biopython >=1.70
- coloredlogs >=10.0
- green >=2.13.0
- numpy >=1.12.1
- pandas >=0.23.0
- xlsxwriter >=1.0.2
- actions/checkout v2 composite
- conda-incubator/setup-miniconda v2 composite