qtm

QTLTableMiner++ tool for mining tables in scientific articles

https://github.com/pbr/qtm

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 5 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.7%) to scientific vocabulary

Keywords

candidate-genes europe-pmc ontologies qtl scientific-articles solr text-mining

Repository

Basic Info
Statistics
  • Stars: 4
  • Watchers: 3
  • Forks: 6
  • Open Issues: 5
  • Releases: 2
Created almost 9 years ago · Last pushed almost 4 years ago
Metadata Files
Readme License Citation Zenodo

README.md

QTL TableMiner++ (QTM)

Build Status DOI Published in BMC Bioinformatics

Description

A significant amount of experimental information about Quantitative Trait Locus (QTL) studies is described in (heterogeneous) tables of scientific articles. Briefly, a QTL is a genomic region that correlates with a trait of interest (phenotype). QTM is a command-line tool that retrieves and semantically annotates results obtained from QTL mapping experiments. It takes full-text articles from the Europe PMC repository as input and outputs QTLs in a relational database (SQLite, see the ER diagram) and a text file (CSV).

Requirements

Installation

    git clone https://github.com/PBR/QTM.git
    cd QTM
    mvn clean install
    solr/install.sh

Usage

```
./QTM -h
usage: QTM [-h] [-v] [-o OUTPUT] [-c CONFIG] [-V VERBOSE] FILE

Software to extract QTL data from full-text articles.

positional arguments:
  FILE                   input list of articles (PMCIDs, one per line)

named arguments:
  -h, --help             show this help message and exit
  -v, --version          show version and exit
  -o OUTPUT, --output OUTPUT
                         filename prefix for output in SQLite (.db) and
                         text (.csv) files (default: qtl)
  -c CONFIG, --config CONFIG
                         config file (default: config.properties)
  -V VERBOSE, --verbose VERBOSE
                         verbosity console output: 0-7 for OFF, FATAL,
                         ERROR, WARN, INFO, DEBUG, TRACE or ALL
                         (default: 4 [INFO])
```
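To illustrate the options listed in the help text, the sketch below runs QTM with a custom output prefix and DEBUG verbosity. The prefix `my_qtl` is a made-up example, and the call is guarded so the snippet does nothing harmful when the `QTM` launcher or `articles.txt` is absent from the current directory.

```bash
# Hypothetical output prefix; per the help text, -o is a filename prefix
# and QTM writes <prefix>.db (SQLite) and <prefix>.csv (text).
prefix="my_qtl"
db_file="${prefix}.db"
csv_file="${prefix}.csv"

# Level 5 is DEBUG on the 0-7 scale printed by ./QTM -h (default: 4, INFO).
verbosity=5

# Only invoke QTM when the launcher and the input list are both present.
if [ -x ./QTM ] && [ -f articles.txt ]; then
  ./QTM -o "$prefix" -c config.properties -V "$verbosity" articles.txt
fi

echo "Expected outputs: ${db_file} and ${csv_file}"
```

Keeping the prefix in a variable makes it easy to refer to both output files in later steps of a script.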

Example data

  • input: articles.txt and config.properties files
  • output: qtl.csv and qtl.db files
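The input list is a plain-text file with one PMCID per line, as stated in the usage text. A minimal sketch for creating and sanity-checking such a file follows. The two IDs are placeholders rather than real articles, the file name `my_articles.txt` is arbitrary, and the "`PMC` followed by digits" pattern is an assumption about how IDs are written in the list.

```bash
# Write a small input list, one PMCID per line (placeholder IDs).
cat > my_articles.txt <<'EOF'
PMC0000001
PMC0000002
EOF

# Collect any line that does not look like "PMC" followed by digits.
# grep exits non-zero when nothing matches, hence the "|| true".
bad_lines=$(grep -vE '^PMC[0-9]+$' my_articles.txt || true)

if [ -n "$bad_lines" ]; then
  echo "Malformed entries in my_articles.txt:" >&2
  echo "$bad_lines" >&2
else
  echo "my_articles.txt looks well-formed"
fi
```

The resulting file can then be passed to QTM as the `FILE` argument.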

Note: If you don't have access to the Internet or Europe PMC, you can still run QTM on the XML files stored in the data directory.

```bash
cp data/*.xml .
./QTM articles.txt
```
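Once a run finishes, the two output files can be inspected with standard tools. The sketch below assumes the default `qtl` prefix and that the `sqlite3` command-line shell is installed. It only lists the tables instead of querying specific ones, because the schema is described in the ER diagram and not reproduced here. The helper name `inspect_outputs` is illustrative and not part of QTM.

```bash
# Illustrative helper: preview the CSV and list the tables of the SQLite file.
inspect_outputs() {
  local prefix="${1:-qtl}"
  local db_file="${prefix}.db"
  local csv_file="${prefix}.csv"

  # Fail early when the outputs of a QTM run are not there.
  if [ ! -f "$db_file" ] || [ ! -f "$csv_file" ]; then
    echo "Missing ${db_file} or ${csv_file}; run QTM first" >&2
    return 1
  fi

  # Show the first few rows of the CSV export.
  head -n 5 "$csv_file"

  # List the tables in the database if the sqlite3 shell is available.
  if command -v sqlite3 >/dev/null 2>&1; then
    sqlite3 "$db_file" ".tables"
  fi
}

inspect_outputs qtl || true
```

Pass a different prefix (for example the value given to `-o`) to inspect the outputs of another run.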

Owner

  • Name: Wageningen UR Plant Breeding
  • Login: PBR
  • Kind: organization
  • Location: Wageningen

Dependencies

pom.xml maven
  • com.google.code.gson:gson 2.6.2
  • com.googlecode.json-simple:json-simple 1.1.1
  • commons-lang:commons-lang 2.6
  • net.sf.jwordnet:jwnl 1.4_rc3
  • net.sourceforge.argparse4j:argparse4j 0.8.1
  • net.sourceforge.jexcelapi:jxl 2.6.12
  • org.apache.directory.studio:org.apache.commons.io 2.4
  • org.apache.httpcomponents:httpclient 4.5.9
  • org.apache.jena:jena-arq 2.9.2
  • org.apache.jena:jena-core 3.1.0
  • org.apache.solr:solr-solrj 6.2.1
  • org.xerial:sqlite-jdbc 3.8.11.2