https://github.com/dissco/dissco-core-annotation-processing-service

Processes annotations

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.5%) to scientific vocabulary
Last synced: 4 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: DiSSCo
  • License: apache-2.0
  • Language: Java
  • Default Branch: main
  • Size: 717 KB
Statistics
  • Stars: 0
  • Watchers: 2
  • Forks: 0
  • Open Issues: 0
  • Releases: 6
Created over 3 years ago · Last pushed 8 months ago
Metadata Files
Readme License

README.md

annotation-processing-service

The annotation processing service can receive an annotation from two different sources:
- Through the API, as a request to register or archive an annotation
- Through a RabbitMQ queue, to register an annotation

After the processing service receives an annotation event, it takes the following actions:
- Check whether the same annotation is already in the system, based on:
  - Annotation target
  - Annotation creator
  - Annotation motivation
- If there is no existing annotation in the system, it will:
  - Create and register a new Handle
  - Insert the annotation in the database
  - Insert the annotation in Elasticsearch
  - Publish a CreateUpdateDeleteEvent
  - Return the new annotation
- If there is an existing annotation but its body, creator or preferenceScore differs from the new annotation, it will:
  - Check whether the Handle needs to be updated (only when the motivation has changed)
  - Update the annotation in the database
  - Update the annotation in Elasticsearch
  - Publish a CreateUpdateDeleteEvent (with the JSON Patch)
  - Return the updated annotation
- If there is an existing annotation and its body, creator and preferenceScore are the same, it will:
  - Update the last-checked timestamp of the annotation and nothing else
  - Return null
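The branching described above can be sketched as a small decision function. This is only an illustration under stated assumptions, not the service's actual code: the `Annotation` record, the `Outcome` enum and the `sameIdentity` and `decide` methods are hypothetical names, the motivation value `commenting` is just an example, and the Handle, database, Elasticsearch and event-publishing steps are left out.

```java
import java.util.Objects;
import java.util.Optional;

final class AnnotationDecisionSketch {

  // Hypothetical, simplified annotation model; the real one carries more fields.
  record Annotation(String target, String creator, String motivation,
                    String body, int preferenceScore) {}

  // The three branches of the flow described above.
  enum Outcome { CREATE, UPDATE, UNCHANGED }

  // Lookup key: an existing annotation matches on target, creator and motivation.
  static boolean sameIdentity(Annotation a, Annotation b) {
    return Objects.equals(a.target(), b.target())
        && Objects.equals(a.creator(), b.creator())
        && Objects.equals(a.motivation(), b.motivation());
  }

  // Picks the branch for an incoming annotation, given the stored match (if any).
  static Outcome decide(Optional<Annotation> existing, Annotation incoming) {
    if (existing.isEmpty()) {
      return Outcome.CREATE;
    }
    Annotation current = existing.get();
    // The fields compared here are the ones the text names: body, creator, preferenceScore.
    boolean changed = !Objects.equals(current.body(), incoming.body())
        || !Objects.equals(current.creator(), incoming.creator())
        || current.preferenceScore() != incoming.preferenceScore();
    return changed ? Outcome.UPDATE : Outcome.UNCHANGED;
  }

  public static void main(String[] args) {
    Annotation stored = new Annotation("target-1", "creator-1", "commenting", "old body", 1);
    Annotation incoming = new Annotation("target-1", "creator-1", "commenting", "new body", 1);
    System.out.println(decide(Optional.empty(), incoming));
    System.out.println(decide(Optional.of(stored), incoming));
    System.out.println(decide(Optional.of(stored), stored));
  }
}
```

In this sketch the `UNCHANGED` outcome corresponds to the branch where only the last-checked timestamp is updated and null is returned.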

The annotation processing service has no batch functionality; it processes a single annotation at a time.

If the insert into Elasticsearch or the publishing of the CreateUpdateDeleteEvent fails, the service rolls back the previous steps.
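One common way to get this kind of rollback is to pair every step with a compensating action and, on failure, undo the completed steps in reverse order. The sketch below shows that pattern only; whether the service implements its rollback this way is an assumption, and `RollbackSketch`, `Step` and `runAll` are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

final class RollbackSketch {

  // A step pairs an action with the compensation that undoes it.
  record Step(String name, Runnable action, Runnable undo) {}

  // Runs the steps in order. If one throws, the steps that already completed
  // are undone in reverse order and the failure is rethrown.
  static void runAll(List<Step> steps) {
    Deque<Step> completed = new ArrayDeque<>();
    for (Step step : steps) {
      try {
        step.action().run();
        completed.push(step);
      } catch (RuntimeException e) {
        while (!completed.isEmpty()) {
          completed.pop().undo().run();
        }
        throw e;
      }
    }
  }

  public static void main(String[] args) {
    List<Step> steps = List.of(
        new Step("database",
            () -> System.out.println("insert into database"),
            () -> System.out.println("roll back database insert")),
        new Step("elasticsearch",
            () -> { throw new IllegalStateException("simulated indexing failure"); },
            () -> System.out.println("roll back elasticsearch insert")));
    try {
      runAll(steps);
    } catch (IllegalStateException e) {
      System.out.println("processing failed: " + e.getMessage());
    }
  }
}
```

The failing step itself is not undone, since it never completed; only the steps before it are compensated.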

Run locally

To run the system locally, run it from an IDE. Clone the code and fill in the application properties (see below). The application needs a database and an Elasticsearch instance to store data in. In RabbitMQ mode it also needs a RabbitMQ cluster to connect to and receive messages from.

Run as Container

The application can also be run as a container. It requires the environment variables described below. The container can be built with the Dockerfile in the root of the project.

Profiles

There are two profiles with which the application can be run:

Web

`spring.profiles.active=web`

This profile exposes an API with two endpoints:
- `POST /`: posts an annotation event to the processing service. The event then follows the process described above.
- `DELETE /{prefix}/{postfix}`: archives a specific annotation. Archiving sets the status of the Handle to Archived, fills the `deleted_on` field in the database and removes the annotation from Elasticsearch.
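As a rough client-side illustration, the two endpoints could be called with requests built like this, using the JDK's `java.net.http` API. The base URL `http://localhost:8080`, the `Content-Type` header, the `{}` payload and the example prefix and postfix are all assumptions; the README does not specify the host, port or the shape of the annotation event.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class WebProfileRequestsSketch {

  // Builds a POST / request carrying an annotation event as JSON.
  // The JSON content type is an assumption, not something the README states.
  static HttpRequest postAnnotationEvent(String baseUrl, String eventJson) {
    return HttpRequest.newBuilder(URI.create(baseUrl + "/"))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(eventJson))
        .build();
  }

  // Builds a DELETE /{prefix}/{postfix} request that archives one annotation.
  static HttpRequest archiveAnnotation(String baseUrl, String prefix, String postfix) {
    return HttpRequest.newBuilder(URI.create(baseUrl + "/" + prefix + "/" + postfix))
        .DELETE()
        .build();
  }

  public static void main(String[] args) {
    String baseUrl = "http://localhost:8080"; // hypothetical local instance
    HttpRequest post = postAnnotationEvent(baseUrl, "{}"); // placeholder payload
    HttpRequest delete = archiveAnnotation(baseUrl, "example-prefix", "example-postfix");
    System.out.println(post.method() + " " + post.uri());
    System.out.println(delete.method() + " " + delete.uri());
  }
}
```

The requests are only built here, not sent; passing them to an `HttpClient` would require a running instance of the service.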

RabbitMQ

`spring.profiles.active=rabbit-mq-mas`

This profile makes the application listen to a specified queue and process the annotation events from that queue. After receiving an event, it follows the process described above. If exceptions are thrown, it retries the message X times, after which it pushes the message to a Dead Letter Queue.
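The retry-then-dead-letter behaviour can be pictured with a plain Java sketch. It does not use RabbitMQ or Spring and is not how the service is wired; `RetryThenDeadLetterSketch`, `process` and the `maxAttempts` parameter are hypothetical, and whether X counts total attempts or retries after the first attempt is not stated in the README (this sketch counts total attempts).

```java
import java.util.function.Consumer;

final class RetryThenDeadLetterSketch {

  // Tries to handle a message up to maxAttempts times. If every attempt throws,
  // the message is handed to the dead-letter consumer instead.
  // Returns true when the message was handled, false when it was dead-lettered.
  static <T> boolean process(T message, Consumer<T> handler, int maxAttempts,
                             Consumer<T> deadLetter) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        handler.accept(message);
        return true;
      } catch (RuntimeException e) {
        // Handling failed; fall through to the next attempt, if any remain.
      }
    }
    deadLetter.accept(message);
    return false;
  }

  public static void main(String[] args) {
    boolean handled = process(
        "annotation-event",
        message -> { throw new IllegalStateException("simulated processing failure"); },
        3,
        message -> System.out.println("sent to dead letter queue: " + message));
    System.out.println("handled: " + handled);
  }
}
```

In a real deployment this behaviour would normally come from the broker and listener configuration rather than from a hand-written loop.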

Environmental variables

The following backend specific properties can be configured:

```
# Database properties
spring.datasource.url=# The JDBC url to the PostgreSQL database to connect with
spring.datasource.username=# The login username to use for connecting with the database
spring.datasource.password=# The login password to use for connecting with the database

# Elasticsearch properties
elasticsearch.hostname=# The hostname of the Elasticsearch cluster
elasticsearch.port=# The port of the Elasticsearch cluster
elasticsearch.index-name=# The name of the index for Elasticsearch

# RabbitMQ properties (only necessary when the RabbitMQ profile is active). RabbitMQ also has a series of default properties
spring.rabbitmq.username=# Username to connect to RabbitMQ
spring.rabbitmq.password=# Password to connect to RabbitMQ
spring.rabbitmq.host=# Hostname of RabbitMQ
```

Owner

  • Name: DiSSCo
  • Login: DiSSCo
  • Kind: organization
  • Email: info@dissco.eu
  • Location: Europe

Distributed System of Scientific Collections - pan-European Research Infrastructure. Updates on DiSSCo and natural science collections

GitHub Events

Total
  • Release event: 3
  • Delete event: 13
  • Issue comment event: 38
  • Push event: 47
  • Pull request review comment event: 7
  • Pull request event: 32
  • Pull request review event: 19
  • Create event: 37
Last Year
  • Release event: 3
  • Delete event: 13
  • Issue comment event: 38
  • Push event: 47
  • Pull request review comment event: 7
  • Pull request event: 32
  • Pull request review event: 19
  • Create event: 37

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 0
  • Total pull requests: 8
  • Average time to close issues: N/A
  • Average time to close pull requests: 18 days
  • Total issue authors: 0
  • Total pull request authors: 2
  • Average comments per issue: 0
  • Average comments per pull request: 0.75
  • Merged pull requests: 5
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 8
  • Average time to close issues: N/A
  • Average time to close pull requests: 18 days
  • Issue authors: 0
  • Pull request authors: 2
  • Average comments per issue: 0
  • Average comments per pull request: 0.75
  • Merged pull requests: 5
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • southeo (37)
  • samleeflang (14)
Top Labels
Issue Labels
Pull Request Labels

Dependencies

.github/workflows/build.yaml actions
  • actions/cache v1 composite
  • actions/checkout v2 composite
  • actions/setup-java v1 composite
  • anothrNick/github-tag-action 1.36.0 composite
  • docker/build-push-action v3 composite
  • docker/login-action v1 composite
  • docker/metadata-action v4 composite
Dockerfile docker
  • eclipse-temurin 17-alpine build
pom.xml maven
  • org.testcontainers:testcontainers-bom 1.17.6 import
  • co.elastic.clients:elasticsearch-java 8.4.1
  • com.fasterxml.jackson.core:jackson-databind
  • com.fasterxml.jackson.dataformat:jackson-dataformat-xml
  • com.fasterxml.jackson.datatype:jackson-datatype-jsr310
  • com.github.java-json-tools:json-patch 1.13
  • jakarta.json:jakarta.json-api 2.1.1
  • org.postgresql:postgresql
  • org.projectlombok:lombok
  • org.springframework.boot:spring-boot-configuration-processor
  • org.springframework.boot:spring-boot-starter-jooq
  • org.springframework.boot:spring-boot-starter-validation
  • org.springframework.boot:spring-boot-starter-web
  • org.springframework.kafka:spring-kafka
  • org.flywaydb:flyway-core test
  • org.springframework.boot:spring-boot-starter-test test
  • org.springframework.kafka:spring-kafka-test test
  • org.testcontainers:junit-jupiter test
  • org.testcontainers:postgresql test
  • org.testcontainers:testcontainers test