https://github.com/dfornika/amrhike

Proof-of-concept for storing and querying harmonized AMR Genomic Analysis Results in datahike

Keywords

antimicrobial-resistance clojure data-harmonization datahike triplestore

Repository

Basic Info
  • Host: GitHub
  • Owner: dfornika
  • License: epl-2.0
  • Language: Clojure
  • Default Branch: master
  • Homepage:
  • Size: 13.7 KB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created almost 6 years ago · Last pushed almost 6 years ago

README.md

amrhike

A proof-of-concept for storing and querying harmonized Antimicrobial Resistance (AMR) Genomic Analysis Results in datahike.

Installation

Follow the Leiningen installation instructions for your system.

Usage

```bash
lein run
```

Currently, the program is designed to load a small set of harmonized AMR Genomic Analysis Result files into a datahike database.

It then runs the following query:

```edn
[:find ?sample ?tool ?gene ?contig ?start ?stop
 :where
 [?e :gene_symbol "catA1"]
 [?e :gene_symbol ?gene]
 [?e :sample_id ?sample]
 [?e :analysis_software_name ?tool]
 [?e :contig_id ?contig]
 [?e :start ?start]
 [?e :stop ?stop]]
```

...which essentially means "find all results where the catA1 gene was found, and display a subset of the fields associated with those results".
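To make the meaning of the query concrete, here is a minimal sketch of the same filter-and-project logic in plain Clojure, with no datahike involved. It is only an illustration, not how amrhike is implemented: `results`, `fields`, and `find-gene` are hypothetical names, the two catA1 records are copied from the sample output in this README, and the third record is a made-up placeholder that exists only so the filter has something to exclude.

```clojure
;; Illustrative in-memory records (not loaded from the project's input files).
(def results
  [{:sample_id "SAMEA6058467" :analysis_software_name "AMRFinderPlus"
    :gene_symbol "catA1" :contig_id "DAAGAT010000085.1" :start 43 :stop 699}
   {:sample_id "SAMEA6058467" :analysis_software_name "ABRicate"
    :gene_symbol "catA1" :contig_id "DAAGAT010000041.1" :start 222 :stop 881}
   ;; Placeholder record (not real data), present only to be filtered out.
   {:sample_id "EXAMPLE" :analysis_software_name "example-tool"
    :gene_symbol "placeholder" :contig_id "contig_1" :start 1 :stop 2}])

;; The attributes named in the query's :where clauses.
(def fields
  [:sample_id :analysis_software_name :gene_symbol :contig_id :start :stop])

(defn find-gene
  "Return the selected fields of every record whose :gene_symbol equals gene."
  [records gene]
  (->> records
       (filter (fn [rec]
                 (and (= gene (:gene_symbol rec))
                      ;; Like the Datalog clauses, require every field to be present.
                      (every? #(contains? rec %) fields))))
       (mapv #(select-keys % fields))))

(find-gene results "catA1")
```

One difference worth keeping in mind: a Datalog `:find` returns a set of tuples in the order of the `:find` variables, whereas this sketch returns a vector of maps, which is closer to the shape of the JSON output.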

The result is printed in JSON format to standard output, and should look like:

```json
[ {
  "sample_id" : "SAMEA6058467",
  "analysis_software_name" : "AMRFinderPlus",
  "gene_symbol" : "catA1",
  "contig_id" : "DAAGAT010000085.1",
  "start" : 43,
  "stop" : 699
}, {
  "sample_id" : "SAMEA6058467",
  "analysis_software_name" : "AMRFinderPlus",
  "gene_symbol" : "catA1",
  "contig_id" : "DAAGAT010000041.1",
  "start" : 222,
  "stop" : 878
}, {
  "sample_id" : "SAMEA6058467",
  "analysis_software_name" : "ABRicate",
  "gene_symbol" : "catA1",
  "contig_id" : "DAAGAT010000041.1",
  "start" : 222,
  "stop" : 881
}, {
  "sample_id" : "SAMEA6058467",
  "analysis_software_name" : "ABRicate",
  "gene_symbol" : "catA1",
  "contig_id" : "DAAGAT010000085.1",
  "start" : 43,
  "stop" : 702
} ]
```

License

Copyright © 2020 Dan Fornika

This program and the accompanying materials are made available under the terms of the Eclipse Public License 2.0 which is available at http://www.eclipse.org/legal/epl-2.0.

This Source Code may also be made available under the following Secondary Licenses when the conditions for such availability set forth in the Eclipse Public License, v. 2.0 are satisfied: GNU General Public License as published by the Free Software Foundation, either version 2 of the License, or (at your option) any later version, with the GNU Classpath Exception which is available at https://www.gnu.org/software/classpath/license.html.

Owner

  • Name: Dan Fornika
  • Login: dfornika
  • Kind: user
  • Location: Vancouver, BC
  • Company: Public Health Agency of Canada

Genomics Specialist