https://github.com/dfornika/amrhike
Proof-of-concept for storing and querying harmonized AMR Genomic Analysis Results in datahike
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (8.7%) to scientific vocabulary
Repository
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
amrhike
A proof-of-concept for storage and querying of harmonized Antimicrobial Resistance Genomic Analysis Results
Installation
Install Leiningen by following the installation instructions for your system.
Usage
```bash
lein run
```
Currently, the program is designed to load a small set of harmonized AMR Genomic Analysis Result files into a datahike database.
It then runs the following query:
```edn
[:find ?sample ?tool ?gene ?contig ?start ?stop
 :where
 [?e :gene_symbol "catA1"]
 [?e :gene_symbol ?gene]
 [?e :sample_id ?sample]
 [?e :analysis_software_name ?tool]
 [?e :contig_id ?contig]
 [?e :start ?start]
 [?e :stop ?stop]]
```
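In plain terms, this query means "find all results where the catA1 gene was found, and display a subset of fields associated with those results." The first :where clause restricts matches to entities whose :gene_symbol is "catA1"; the remaining clauses bind the fields that are returned.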
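To make the query's semantics concrete, here is a minimal plain-Python sketch of the same filter-and-project logic. It does not use datahike; it only emulates what the `:where` and `:find` clauses do over a list of dictionaries. The function name `find_gene`, the `FIELDS` constant, and the last two records are hypothetical and exist only for illustration. The sketch assumes that an entity lacking any of the queried attributes does not match, and that the result is a set of tuples.

```python
# Attribute order mirrors the :find clause: ?sample ?tool ?gene ?contig ?start ?stop
FIELDS = ("sample_id", "analysis_software_name", "gene_symbol",
          "contig_id", "start", "stop")

records = [
    {"sample_id": "SAMEA6058467", "analysis_software_name": "AMRFinderPlus",
     "gene_symbol": "catA1", "contig_id": "DAAGAT010000085.1",
     "start": 43, "stop": 699},
    {"sample_id": "SAMEA6058467", "analysis_software_name": "ABRicate",
     "gene_symbol": "catA1", "contig_id": "DAAGAT010000085.1",
     "start": 43, "stop": 702},
    # Hypothetical record for a different gene: filtered out by the first clause.
    {"sample_id": "EXAMPLE_SAMPLE", "analysis_software_name": "ExampleTool",
     "gene_symbol": "exampleGene", "contig_id": "example_contig",
     "start": 1, "stop": 10},
    # Hypothetical catA1 record with no "stop" attribute: it cannot satisfy
    # every clause, so it is excluded as well.
    {"sample_id": "EXAMPLE_SAMPLE", "analysis_software_name": "ExampleTool",
     "gene_symbol": "catA1", "contig_id": "example_contig", "start": 5},
]


def find_gene(records, gene_symbol):
    """Return a set of tuples for records matching gene_symbol on all FIELDS."""
    rows = set()
    for rec in records:
        if rec.get("gene_symbol") != gene_symbol:
            continue  # fails the clause that pins the gene symbol
        if not all(field in rec for field in FIELDS):
            continue  # a missing attribute means one clause cannot bind
        rows.add(tuple(rec[field] for field in FIELDS))
    return rows


for row in sorted(find_gene(records, "catA1")):
    print(row)
```

Each clause in the query acts as a required condition on the same entity `?e`, which is why both the non-matching gene and the incomplete record drop out.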
The result is printed in JSON format to standard output, and should look like:
```json
[ {
  "sample_id" : "SAMEA6058467",
  "analysis_software_name" : "AMRFinderPlus",
  "gene_symbol" : "catA1",
  "contig_id" : "DAAGAT010000085.1",
  "start" : 43,
  "stop" : 699
}, {
  "sample_id" : "SAMEA6058467",
  "analysis_software_name" : "AMRFinderPlus",
  "gene_symbol" : "catA1",
  "contig_id" : "DAAGAT010000041.1",
  "start" : 222,
  "stop" : 878
}, {
  "sample_id" : "SAMEA6058467",
  "analysis_software_name" : "ABRicate",
  "gene_symbol" : "catA1",
  "contig_id" : "DAAGAT010000041.1",
  "start" : 222,
  "stop" : 881
}, {
  "sample_id" : "SAMEA6058467",
  "analysis_software_name" : "ABRicate",
  "gene_symbol" : "catA1",
  "contig_id" : "DAAGAT010000085.1",
  "start" : 43,
  "stop" : 702
} ]
```
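Because the output is ordinary JSON, it can be post-processed with any JSON-aware tool. The following is a minimal Python sketch, not part of amrhike, that groups the example records above by sample, contig, and start position so that the stop coordinate reported by each tool can be read side by side. The function name `stops_by_locus` and the choice of grouping key are assumptions made for illustration. The example output is embedded as a string; whether the standard output of `lein run` contains only JSON is not stated here, so reading it directly from a pipe is left out.

```python
import json
from collections import defaultdict

# The four records from the example output, embedded to keep the sketch self-contained.
example_output = """
[ {"sample_id": "SAMEA6058467", "analysis_software_name": "AMRFinderPlus",
   "gene_symbol": "catA1", "contig_id": "DAAGAT010000085.1", "start": 43, "stop": 699},
  {"sample_id": "SAMEA6058467", "analysis_software_name": "AMRFinderPlus",
   "gene_symbol": "catA1", "contig_id": "DAAGAT010000041.1", "start": 222, "stop": 878},
  {"sample_id": "SAMEA6058467", "analysis_software_name": "ABRicate",
   "gene_symbol": "catA1", "contig_id": "DAAGAT010000041.1", "start": 222, "stop": 881},
  {"sample_id": "SAMEA6058467", "analysis_software_name": "ABRicate",
   "gene_symbol": "catA1", "contig_id": "DAAGAT010000085.1", "start": 43, "stop": 702} ]
"""


def stops_by_locus(results):
    """Map (sample_id, contig_id, start) to a dict of tool name -> reported stop."""
    grouped = defaultdict(dict)
    for result in results:
        key = (result["sample_id"], result["contig_id"], result["start"])
        grouped[key][result["analysis_software_name"]] = result["stop"]
    return dict(grouped)


for locus, stops in stops_by_locus(json.loads(example_output)).items():
    print(locus, stops)
```

Grouping on the start coordinate works for this example because both tools report the same start for each hit; results whose starts differ between tools would need a looser matching rule.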
License
Copyright © 2020 Dan Fornika
This program and the accompanying materials are made available under the terms of the Eclipse Public License 2.0 which is available at http://www.eclipse.org/legal/epl-2.0.
This Source Code may also be made available under the following Secondary Licenses when the conditions for such availability set forth in the Eclipse Public License, v. 2.0 are satisfied: GNU General Public License as published by the Free Software Foundation, either version 2 of the License, or (at your option) any later version, with the GNU Classpath Exception which is available at https://www.gnu.org/software/classpath/license.html.
Owner
- Name: Dan Fornika
- Login: dfornika
- Kind: user
- Location: Vancouver, BC
- Company: Public Health Agency of Canada
- Website: https://orcid.org/0000-0002-6178-3585
- Twitter: dfornika
- Repositories: 254
- Profile: https://github.com/dfornika
Genomics Specialist