concept_miner_elastic

An Elastic-backed wrapper around Prodigy, Kibana and spaCy. Designed to allow a small group to quickly annotate data and iterate on a text classification machine learning model.

https://github.com/bobwatson/concept_miner_elastic

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (8.1%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: BobWatson
  • License: MIT
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 671 KB
Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 0
  • Open Issues: 2
  • Releases: 1
Created almost 5 years ago · Last pushed almost 3 years ago
Metadata Files
Readme License Citation

README.md

Concept Miner Elastic

A framework around Prodigy that lets end users start and stop the annotation interface, trigger training, and so on. It is intended for small-group collaboration.

Installation

  1. Make sure you have an active Prodigy license
  2. Create a .env file in the build directory (or edit docker-compose.yml in that location), containing:

    COMPOSE_PROJECT_NAME=concept_miner_elastic
    KIBANA_SERVER_PUBLICBASEURL=https://<your_base_url>/kibana
    TZ=<your_timezone>
    DEBUG=false
    PRODIGY_KEY=<your_prodigy_key>

  3. Run docker compose up

  4. The server should now be running on http://127.0.0.1:8000/ (you can change this port in ./conf/nginx.conf)
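
Before running docker compose up, it can help to confirm that the .env file from step 2 defines every variable and that no `<placeholder>` values were left in. The following is a minimal sketch, not part of the project: the function names and the default `.env` path are assumptions for illustration, and only the five keys shown in the README are checked.

```python
from pathlib import Path
import sys

# Keys listed in the README's example .env file.
REQUIRED_KEYS = (
    "COMPOSE_PROJECT_NAME",
    "KIBANA_SERVER_PUBLICBASEURL",
    "TZ",
    "DEBUG",
    "PRODIGY_KEY",
)


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(text: str) -> list[str]:
    """Return required keys that are absent, empty, or still a <placeholder>."""
    values = parse_env(text)
    problems = []
    for key in REQUIRED_KEYS:
        value = values.get(key, "")
        if not value or ("<" in value and ">" in value):
            problems.append(key)
    return problems


if __name__ == "__main__":
    # Hypothetical default path; pass the real location of your .env file.
    env_path = Path(sys.argv[1] if len(sys.argv) > 1 else ".env")
    if env_path.exists():
        print(missing_keys(env_path.read_text()) or "all keys set")
    else:
        print(f"{env_path} not found")
```

This parser handles only plain `KEY=VALUE` lines; quoting and variable interpolation, which Docker Compose also supports, are deliberately left out.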

Usage

  1. Visit http://<your_host>:8000/ (for a local install, http://127.0.0.1:8000/)
  2. From the Home tab, upload your PDF or TXT documents for training
  3. Navigate to the Annotating tab and perform your Prodigy annotations
  4. Once you have completed a number of annotations, return to 'Home' and click 'Go' next to 'Stop annotating (Start training)'
  5. Results will be on the 'Training Log' tab, and your model and annotations will be in the ./output folder (or as configured in app.conf)
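
If the page in step 1 does not load, a quick reachability check can tell whether the nginx front end is answering at all. This is a hedged sketch using only the Python standard library; the helper name `server_is_up` is hypothetical, and the default URL is the local address given in the installation steps.

```python
import urllib.error
import urllib.request


def server_is_up(url: str = "http://127.0.0.1:8000/", timeout: float = 5.0) -> bool:
    """Return True if the URL answers with a successful HTTP response."""
    try:
        # urlopen raises HTTPError for 4xx/5xx, so reaching the body of the
        # with-block means the server returned a success (or followed a redirect).
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, DNS failures and HTTP errors.
        return False


if __name__ == "__main__":
    print("server reachable" if server_is_up() else "server not reachable")
```

A False result only says the front end did not respond successfully; it does not distinguish between containers that are still starting and a misconfigured port in ./conf/nginx.conf.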

Owner

  • Login: BobWatson
  • Kind: user
  • Location: Canberra, Aus

Citation (CITATION.cff)

# YAML 1.2
---
abstract: "An Elastic-backed wrapper around Prodigy, Kibana and spaCy. Designed to allow a small group to quickly annotate data and iterate on a text classification machine learning model."
authors: 
  -
    family-names: Watson
    given-names: "Robert John"
cff-version: "1.1.0"
date-released: 2021-09-07
license: MIT
message: "If you use this software, please cite it using these metadata."
repository-code: "https://github.com/BobWatson/concept_miner_elastic"
title: "Concept Miner"
...

Dependencies

build/requirements-dev.txt pypi
  • black * development
  • pre-commit * development
build/requirements.txt pypi
  • Flask ==2.0.1
  • Flask-RESTful ==0.3.9
  • Jinja2 ==3.0.1
  • MarkupSafe ==2.0.1
  • PyYAML ==5.4.1
  • ansi2html ==1.6.0
  • blis ==0.7.4
  • catalogue ==2.0.6
  • certifi ==2021.5.30
  • chardet ==4.0.0
  • charset-normalizer ==2.0.4
  • click ==7.1.2
  • cymem ==2.0.5
  • elasticsearch ==7.14.1
  • idna ==3.2
  • murmurhash ==1.0.5
  • numpy ==1.21.2
  • packaging ==21.0
  • pathy ==0.6.0
  • pdftotext ==2.2.0
  • preshed ==3.0.5
  • prodigy >=1.11.0,<2.0.0
  • psutil ==5.8.0
  • psycopg2 ==2.9.1
  • pydantic ==1.8.2
  • pymitter ==0.3.1
  • pyparsing ==2.4.7
  • requests ==2.26.0
  • smart-open ==5.2.1
  • spacy ==3.1.2
  • spacy-legacy ==3.0.8
  • srsly ==2.4.1
  • thinc ==8.0.9
  • tqdm ==4.62.2
  • typer ==0.3.2
  • typing-extensions ==3.10.0.2
  • urllib3 ==1.26.6
  • wasabi ==0.8.2
.github/workflows/codeql-analysis.yml actions
  • actions/checkout v2 composite
  • github/codeql-action/analyze v1 composite
  • github/codeql-action/autobuild v1 composite
  • github/codeql-action/init v1 composite
build/concept_miner_elastic/app/Dockerfile docker
  • mcr.microsoft.com/vscode/devcontainers/python 0-${VARIANT} build
build/concept_miner_elastic/docker-compose.yml docker
  • docker.elastic.co/elasticsearch/elasticsearch 7.14.1
  • docker.elastic.co/kibana/kibana 7.14.1
  • nginx 1.21
  • postgres 13