dane-object-classification-worker

Given a set of keyframes, produces ImageNet classifications in a vector space

https://github.com/clariah/dane-object-classification-worker

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.9%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: CLARIAH
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Size: 45.9 KB
Statistics
  • Stars: 0
  • Watchers: 6
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created almost 3 years ago · Last pushed almost 3 years ago
Metadata Files
  • Readme
  • License
  • Codemeta

README.md

dane-object-classification-worker

This worker interacts with DANE to receive its work. Given a set of keyframes produced by the shot detection worker, it applies a ResNet50 model to each keyframe. If store_embeddings is on, it writes a 2048-dimensional embedding vector and a softmax vector over the 1000 ImageNet classes to an Elasticsearch database. In either case, it stores the top predictions and their scores as a DANE result, where "top" is determined by the threshold specified in the config.
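The thresholding step described above can be sketched in plain Python. This is only an illustration, not the worker's actual code: the function names `softmax` and `top_predictions`, the three example labels, and the choice of `>=` for the comparison are all assumptions, and a real run would use the 1000-way ResNet50 output rather than three hand-written logits.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_predictions(
    probs: Sequence[float], labels: Sequence[str], threshold: float
) -> List[Tuple[str, float]]:
    """Keep (label, score) pairs scoring at least `threshold`, best first.

    Hypothetical helper: whether the worker compares with >= or > is an assumption.
    """
    scored = [(label, p) for label, p in zip(labels, probs) if p >= threshold]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy stand-in for a model output over three classes (labels are made up).
labels = ["tabby cat", "golden retriever", "sports car"]
logits = [4.0, 1.0, 0.5]

probs = softmax(logits)
kept = top_predictions(probs, labels, threshold=0.6)
print(kept)
```

With a threshold of 0.6, at most one class can pass, since softmax scores sum to 1; a lower threshold would let several predictions through per keyframe.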

Installation

Run locally:

Prerequisites:

This worker uses PyTorch as its DNN framework. To install it, follow the instructions at https://pytorch.org/get-started/locally/

Other prerequisites:

    poetry install
    poetry shell
    python ./worker.py

Docker

    docker build -t dane-object-classification-worker .

Configuration

Make sure to create the following config.yml, filling it in to match your environment:

    RABBITMQ:
        HOST: your-rabbit-mq-host  # set to your rabbitMQ server
        PORT: 5672
        EXCHANGE: DANE-exchange
        RESPONSE_QUEUE: DANE-response-queue
        USER: guest  # change this for production mode
        PASSWORD: guest  # change this for production mode
    ELASTICSEARCH:
        HOST: ["elasticsearch-host"]  # set to your elasticsearch host
        PORT: 9200
        USER: ""  # change this for production mode
        PASSWORD: ""  # change this for production mode
        SCHEME: http  # OR https
        INDEX: your-dane-index  # change to your liking
    LOGGING:
        LEVEL: INFO
    PATHS:  # common settings for each DANE worker to define input/output dirs (with a common mount point)
        TEMP_FOLDER: "./mount"  # directory is automatically created (use ./mount for local testing)
        OUT_FOLDER: "./mount"  # directory is automatically created (use ./mount for local testing)
    CLASSIFICATION:  # settings for this worker specifically
        STORE_EMBEDDINGS: false  # store the vector back to Elasticsearch
        BATCH_SIZE: 10  # batch size for the torch data loader
        LOAD_WORKERS: 2  # number of workers
        THRESHOLD: 0.6  # threshold of score
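As a rough illustration of how the CLASSIFICATION settings might be checked before use, the sketch below validates a plain dictionary standing in for the parsed config.yml. The `ClassificationSettings` class, the `parse_classification` function, and the specific range checks are hypothetical; the README does not show how the worker itself loads or validates its configuration.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class ClassificationSettings:
    """Hypothetical typed view of the CLASSIFICATION section."""
    store_embeddings: bool
    batch_size: int
    load_workers: int
    threshold: float


def parse_classification(config: Mapping[str, Any]) -> ClassificationSettings:
    """Read and sanity-check the CLASSIFICATION section of a parsed config."""
    section = config["CLASSIFICATION"]
    settings = ClassificationSettings(
        store_embeddings=bool(section["STORE_EMBEDDINGS"]),
        batch_size=int(section["BATCH_SIZE"]),
        load_workers=int(section["LOAD_WORKERS"]),
        threshold=float(section["THRESHOLD"]),
    )
    # The range checks below are assumptions, not rules stated by the project.
    if settings.batch_size < 1:
        raise ValueError("BATCH_SIZE must be at least 1")
    if settings.load_workers < 0:
        raise ValueError("LOAD_WORKERS must not be negative")
    if not 0.0 <= settings.threshold <= 1.0:
        raise ValueError("THRESHOLD must lie between 0 and 1")
    return settings


# Dictionary mirroring the values from the config above, as a YAML parser
# would typically return them.
example = {
    "CLASSIFICATION": {
        "STORE_EMBEDDINGS": False,
        "BATCH_SIZE": 10,
        "LOAD_WORKERS": 2,
        "THRESHOLD": 0.6,
    }
}

settings = parse_classification(example)
print(settings)
```

Failing fast on an out-of-range threshold or batch size would surface a misconfiguration at startup rather than partway through a job.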

Run

The best way to run is within a Kubernetes environment, together with the following:

  • DANE-server
  • dane-shot-detection-worker
  • RabbitMQ server
  • Elasticsearch cluster

Documentation on how to set up this environment will be linked later.

Owner

  • Name: CLARIAH
  • Login: CLARIAH
  • Kind: organization

CLARIAH offers humanities scholars a Common Lab providing access to large collections of digital resources and innovative tools for research

CodeMeta (codemeta.json)

{
  "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
  "@type": "SoftwareSourceCode",
  "codeRepository": "https://github.com/CLARIAH/dane-object-classification-worker",
  "dateCreated": "2020-09-11",
  "issueTracker": "https://github.com/CLARIAH/dane-object-classification-worker/issues",
  "name": "dane-object-classification-worker",
  "version": "0.1.0",
  "description": "DANE worker that, given a set of keyframes, produces ImageNet classifications in a vector space; depends on DANE-server",
  "applicationCategory": "Multimedia processing",
  "developmentStatus": "wip",
  "isPartOf": "https://github.com/CLARIAH/DANE-server",
  "funder": {
    "@type": "Organization",
    "name": "CLARIAH",
    "url": "https://www.clariah.nl"
  },
  "programmingLanguage": [
    "Python 3"
  ],
  "softwareRequirements": [
    "Python 3.10"
  ],
  "author": [
    {
      "@type": "Person",
      "@id": "https://orcid.org/0000-0002-5145-3603",
      "givenName": "Nanne",
      "familyName": "van Noord"
    },
    {
      "@type": "Person",
      "@id": "https://github.com/jblom",
      "givenName": "Jaap",
      "familyName": "Blom",
      "affiliation": {
        "@type": "Organization",
        "name": "The Netherlands Institute for Sound and Vision"
      }
    }
  ]
}

Committers

Last synced: over 1 year ago

All Time
  • Total Commits: 3
  • Total Committers: 1
  • Avg Commits per committer: 3.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Jaap Blom j****m@b****l 3

Issues and Pull Requests

Last synced: about 1 year ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0