dane-object-classification-worker
Given a set of keyframes, produces ImageNet classifications in a vector space
https://github.com/clariah/dane-object-classification-worker
Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (10.9%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: CLARIAH
- License: apache-2.0
- Language: Python
- Default Branch: main
- Size: 45.9 KB
Statistics
- Stars: 0
- Watchers: 6
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
dane-object-classification-worker
This is a worker that interacts with DANE to receive its work.
Given a set of keyframes produced by the shot detection worker,
it applies a ResNet50 model to each keyframe. If STORE_EMBEDDINGS is enabled in the config, it writes a 2048-dimensional embedding vector and
a softmax vector over the 1000 ImageNet classes to an Elasticsearch database. In any case, it stores the top predictions and their scores
(where "top" is determined by the threshold specified in the config) as a DANE result.
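The post-processing described above (a softmax over the class scores, then keeping only the predictions that reach the configured threshold) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the worker's actual code: the function names `softmax` and `top_predictions`, the three-label example, and the choice of `>=` for the threshold comparison are all hypothetical, and the real worker operates on ResNet50 outputs through PyTorch.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw class scores into probabilities that sum to 1."""
    # Subtract the maximum for numerical stability before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_predictions(
    logits: Sequence[float],
    labels: Sequence[str],
    threshold: float,
) -> List[Tuple[str, float]]:
    """Return (label, score) pairs whose softmax score is >= threshold,
    sorted from highest to lowest score."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    scores = softmax(logits)
    kept = [(label, score) for label, score in zip(labels, scores) if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical three-class example; the worker itself uses the 1000 ImageNet classes.
EXAMPLE_LABELS = ["tabby cat", "golden retriever", "bicycle"]
EXAMPLE_LOGITS = [4.0, 1.0, 0.5]
EXAMPLE_RESULT = top_predictions(EXAMPLE_LOGITS, EXAMPLE_LABELS, threshold=0.6)
```

With a threshold of 0.6, at most one class can pass per keyframe, because softmax scores sum to 1; a lower threshold would let several classes through.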
Installation
Run locally:
Prerequisites:
This worker uses PyTorch as its DNN framework. To install it, follow the instructions at https://pytorch.org/get-started/locally/
Other prerequisites:
- Python 3.10.x
- Poetry
```sh
poetry install
poetry shell
python ./worker.py
```
Docker
```sh
docker build -t dane-object-classification-worker .
```
Configuration
Create the following config.yml and fill it in to match your environment:
```yaml
RABBITMQ:
    HOST: your-rabbit-mq-host # set to your rabbitMQ server
    PORT: 5672
    EXCHANGE: DANE-exchange
    RESPONSE_QUEUE: DANE-response-queue
    USER: guest # change this for production mode
    PASSWORD: guest # change this for production mode
ELASTICSEARCH:
    HOST: ["elasticsearch-host"] # set to your elasticsearch host
    PORT: 9200
    USER: "" # change this for production mode
    PASSWORD: "" # change this for production mode
    SCHEME: http # OR https
    INDEX: your-dane-index # change to your liking
LOGGING:
    LEVEL: INFO
PATHS: # common settings for each DANE worker to define input/output dirs (with a common mount point)
    TEMP_FOLDER: "./mount" # directory is automatically created (use ./mount for local testing)
    OUT_FOLDER: "./mount" # directory is automatically created (use ./mount for local testing)
CLASSIFICATION: # settings for this worker specifically
    STORE_EMBEDDINGS: false # store the vector back to Elasticsearch
    BATCH_SIZE: 10 # batch size for the torch data loader
    LOAD_WORKERS: 2 # number of workers
    THRESHOLD: 0.6 # threshold of score
```
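As an illustration of how the CLASSIFICATION settings above might be checked before the worker starts, here is a minimal sketch. The `ClassificationSettings` class and its `from_mapping` method are hypothetical and not part of the worker. The sketch assumes the YAML file has already been parsed into a dictionary, and that THRESHOLD is compared against softmax scores in the range 0 to 1.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class ClassificationSettings:
    """Hypothetical typed view of the CLASSIFICATION block in config.yml."""

    store_embeddings: bool
    batch_size: int
    load_workers: int
    threshold: float

    @classmethod
    def from_mapping(cls, raw: Mapping[str, Any]) -> "ClassificationSettings":
        """Build settings from a parsed CLASSIFICATION mapping, rejecting bad values."""
        settings = cls(
            store_embeddings=bool(raw["STORE_EMBEDDINGS"]),
            batch_size=int(raw["BATCH_SIZE"]),
            load_workers=int(raw["LOAD_WORKERS"]),
            threshold=float(raw["THRESHOLD"]),
        )
        if settings.batch_size < 1:
            raise ValueError("BATCH_SIZE must be at least 1")
        if settings.load_workers < 0:
            raise ValueError("LOAD_WORKERS must not be negative")
        # Assumption: the threshold is applied to softmax scores in [0, 1].
        if not 0.0 <= settings.threshold <= 1.0:
            raise ValueError("THRESHOLD must be between 0 and 1")
        return settings


# The values from the config.yml example, as a YAML parser would return them.
EXAMPLE_SETTINGS = ClassificationSettings.from_mapping(
    {"STORE_EMBEDDINGS": False, "BATCH_SIZE": 10, "LOAD_WORKERS": 2, "THRESHOLD": 0.6}
)
```

Failing early on an out-of-range value is a design choice of this sketch; it surfaces a misconfigured threshold at startup rather than as an empty result set later.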
Run
The best way to run is within a Kubernetes environment, together with the following:
- DANE-server
- dane-shot-detection-worker
- RabbitMQ server
- Elasticsearch cluster
Documentation on how to set up this environment will be linked later on.
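Until that documentation is available, a quick pre-flight check can confirm that the RabbitMQ and Elasticsearch endpoints from config.yml accept TCP connections before the worker is started. The sketch below is an assumption-laden illustration, not part of the worker: `service_endpoints` and `is_reachable` are hypothetical helpers, the config is passed in as an already parsed dictionary, and the check does not cover DANE-server or the shot detection worker.

```python
import socket
from typing import Any, Dict, List, Tuple


def service_endpoints(config: Dict[str, Any]) -> List[Tuple[str, str, int]]:
    """Collect (name, host, port) triples for RabbitMQ and every Elasticsearch host."""
    rabbit = config["RABBITMQ"]
    elastic = config["ELASTICSEARCH"]
    endpoints = [("RabbitMQ", rabbit["HOST"], int(rabbit["PORT"]))]
    # ELASTICSEARCH.HOST is a list in the config example, so check each entry.
    for host in elastic["HOST"]:
        endpoints.append(("Elasticsearch", host, int(elastic["PORT"])))
    return endpoints


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Placeholder hosts copied from the config.yml example; replace with real ones.
EXAMPLE_CONFIG = {
    "RABBITMQ": {"HOST": "your-rabbit-mq-host", "PORT": 5672},
    "ELASTICSEARCH": {"HOST": ["elasticsearch-host"], "PORT": 9200},
}
EXAMPLE_ENDPOINTS = service_endpoints(EXAMPLE_CONFIG)
```

Calling `is_reachable(host, port)` for each triple in `EXAMPLE_ENDPOINTS` then reports which services are up. A successful TCP connection only shows that something is listening on the port, not that credentials or the index are configured correctly.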
Owner
- Name: CLARIAH
- Login: CLARIAH
- Kind: organization
- Website: http://www.clariah.nl
- Repositories: 65
- Profile: https://github.com/CLARIAH
CLARIAH offers humanities scholars a Common Lab providing access to large collections of digital resources and innovative tools for research
CodeMeta (codemeta.json)
{
  "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
  "@type": "SoftwareSourceCode",
  "codeRepository": "https://github.com/CLARIAH/dane-object-classification-worker",
  "dateCreated": "2020-09-11",
  "issueTracker": "https://github.com/CLARIAH/dane-object-classification-worker/issues",
  "name": "dane-object-classification-worker",
  "version": "0.1.0",
  "description": "DANE worker that, given a set of keyframes, produces ImageNet classifications in a vector space; depends on DANE-server",
  "applicationCategory": "Multimedia processing",
  "developmentStatus": "wip",
  "isPartOf": "https://github.com/CLARIAH/DANE-server",
  "funder": {
    "@type": "Organization",
    "name": "CLARIAH",
    "url": "https://www.clariah.nl"
  },
  "programmingLanguage": [
    "Python 3"
  ],
  "softwareRequirements": [
    "Python 3.10"
  ],
  "author": [
    {
      "@type": "Person",
      "@id": "https://orcid.org/0000-0002-5145-3603",
      "givenName": "Nanne",
      "familyName": "van Noord"
    },
    {
      "@type": "Person",
      "@id": "https://github.com/jblom",
      "givenName": "Jaap",
      "familyName": "Blom",
      "affiliation": {
        "@type": "Organization",
        "name": "The Netherlands Institute for Sound and Vision"
      }
    }
  ]
}
Committers
Last synced: over 1 year ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Jaap Blom | j****m@b****l | 3 |
Issues and Pull Requests
Last synced: about 1 year ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0