https://github.com/dissco/dissco-core-annotation-processing-service
Processes annotations
Science Score: 36.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: links to zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (10.5%) to scientific vocabulary
Repository
Processes annotations
Basic Info
- Host: GitHub
- Owner: DiSSCo
- License: apache-2.0
- Language: Java
- Default Branch: main
- Size: 717 KB
Statistics
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 6
Metadata Files
README.md
annotation-processing-service
The annotation processing service can receive an annotation from two different sources:
- Through the API, as a request to register or archive an annotation
- Through a RabbitMQ queue, to register an annotation

After the processing service has received the annotation event, it takes the following actions:
- Check whether the same annotation is already in the system, based on:
  - Annotation target
  - Annotation creator
  - Annotation motivation
- If there is no existing annotation in the system, it will:
  - Create and register a new Handle
  - Insert the annotation in the database
  - Insert the annotation in Elasticsearch
  - Publish a CreateUpdateDeleteEvent
  - Return the new annotation
- If there is an existing annotation but the body, creator or preferenceScore differs from the new annotation, the system will:
  - Check whether the Handle needs to be updated (only when the motivation has changed)
  - Update the annotation in the database
  - Update the annotation in Elasticsearch
  - Publish a CreateUpdateDeleteEvent (with the jsonpatch)
  - Return the updated annotation
- If there is an existing annotation and the body, creator and preferenceScore are the same, the system will:
  - Update the last-checked timestamp of the annotation, without updating anything else
  - Return null
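The three branches above can be sketched as a small decision function. This is a minimal illustration, not the service's actual code: the `Annotation` record, the `Action` enum and the `decide` method are hypothetical names, and the lookup of the existing annotation (by target, creator and motivation) is assumed to have happened already.

```java
import java.util.Objects;
import java.util.Optional;

public class AnnotationDecisionSketch {

  // Hypothetical, simplified annotation model; the real service uses richer types.
  record Annotation(String target, String creator, String motivation,
                    String body, int preferenceScore) {}

  // Hypothetical names for the three outcomes listed in the README.
  enum Action { CREATE, UPDATE, TOUCH_LAST_CHECKED }

  // Compares the fields the README names for change detection:
  // body, creator and preferenceScore.
  static boolean sameContent(Annotation a, Annotation b) {
    return Objects.equals(a.body(), b.body())
        && Objects.equals(a.creator(), b.creator())
        && a.preferenceScore() == b.preferenceScore();
  }

  // 'existing' is the result of looking up an annotation with the same
  // target, creator and motivation (the lookup itself is not shown).
  static Action decide(Optional<Annotation> existing, Annotation incoming) {
    if (existing.isEmpty()) {
      return Action.CREATE;
    }
    return sameContent(existing.get(), incoming)
        ? Action.TOUCH_LAST_CHECKED
        : Action.UPDATE;
  }

  public static void main(String[] args) {
    Annotation stored = new Annotation("target-1", "creator-1", "commenting", "old body", 1);
    Annotation changed = new Annotation("target-1", "creator-1", "commenting", "new body", 1);
    System.out.println(decide(Optional.empty(), stored));
    System.out.println(decide(Optional.of(stored), changed));
    System.out.println(decide(Optional.of(stored), stored));
  }
}
```

In this sketch, `TOUCH_LAST_CHECKED` corresponds to the branch where only the timestamp is updated and null is returned.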
The annotation processing service does not offer batch functionality; it processes a single annotation at a time.
If the insert into Elasticsearch or the publishing of the CreateUpdateDeleteEvent fails, the service rolls back the previous steps.
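One common way to get this kind of rollback is to record a compensating action after each successful step and run them in reverse order when a later step fails. The sketch below only simulates the steps with log entries; the step names and the `process` method are illustrative assumptions, and it does not claim to mirror how the service itself implements rollback.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class RollbackSketch {

  // Simulates the create flow. Each completed step registers an undo action;
  // on failure the undo actions run in reverse (last completed step first).
  static List<String> process(boolean indexFails, boolean publishFails) {
    List<String> log = new ArrayList<>();
    Deque<Runnable> undo = new ArrayDeque<>();
    try {
      log.add("create handle");
      undo.push(() -> log.add("rollback handle"));

      log.add("insert database");
      undo.push(() -> log.add("rollback database"));

      if (indexFails) {
        throw new IllegalStateException("indexing failed");
      }
      log.add("index elasticsearch");
      undo.push(() -> log.add("rollback elasticsearch"));

      if (publishFails) {
        throw new IllegalStateException("publishing failed");
      }
      log.add("publish event");
    } catch (IllegalStateException e) {
      // Undo everything that succeeded before the failure.
      while (!undo.isEmpty()) {
        undo.pop().run();
      }
    }
    return log;
  }

  public static void main(String[] args) {
    System.out.println(process(false, false));
    System.out.println(process(true, false));
    System.out.println(process(false, true));
  }
}
```

Using a stack keeps the undo order correct without the failure handler needing to know which step failed.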
Run locally
To run the system locally, it can be started from an IDE. Clone the code and fill in the application properties (see below). The application needs a database and an Elasticsearch instance to store data in. In RabbitMQ mode it also needs a RabbitMQ cluster to connect to and receive messages from.
Run as Container
The application can also be run as a container. It requires the environment variables described below. The container can be built with the Dockerfile in the root of the project.
Profiles
There are two profiles with which the application can be run:
Web
spring.profiles.active=web
This profile exposes an API with two endpoints:
- POST /
  This endpoint can be used to post annotation events to the processing service. The service then follows the process described above.
- DELETE /{prefix}/{postfix}
  This endpoint can be used to archive a specific annotation.
  Archiving sets the status of the Handle to Archived, fills the deleted_on field in the database and removes the annotation from Elasticsearch.
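As a rough illustration of how a client might address these two endpoints, the sketch below builds the requests with the JDK's `java.net.http` API without sending them. The base URL `http://localhost:8080`, the JSON content type, the empty `{}` payload and the example prefix and postfix values are all assumptions for illustration; the actual event schema is not described here.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class AnnotationApiSketch {

  // Builds a POST / request carrying an annotation event as JSON.
  // The payload structure is not specified here; the caller supplies it.
  static HttpRequest postEvent(String baseUrl, String eventJson) {
    return HttpRequest.newBuilder(URI.create(baseUrl + "/"))
        .header("Content-Type", "application/json") // assumed content type
        .POST(HttpRequest.BodyPublishers.ofString(eventJson))
        .build();
  }

  // Builds a DELETE /{prefix}/{postfix} request to archive an annotation.
  static HttpRequest archive(String baseUrl, String prefix, String postfix) {
    return HttpRequest.newBuilder(URI.create(baseUrl + "/" + prefix + "/" + postfix))
        .DELETE()
        .build();
  }

  public static void main(String[] args) {
    String baseUrl = "http://localhost:8080"; // assumed local address
    HttpRequest post = postEvent(baseUrl, "{}"); // placeholder payload
    HttpRequest delete = archive(baseUrl, "example-prefix", "example-postfix");
    System.out.println(post.method() + " " + post.uri());
    System.out.println(delete.method() + " " + delete.uri());
  }
}
```

The requests can be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` once a running instance is available.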
RabbitMQ
spring.profiles.active=rabbit-mq-mas
This will make the application listen to a specified queue and process the annotation events from the queue.
After receiving an event, it follows the process described above.
If an exception is thrown, the message is retried X times, after which it is pushed to a Dead Letter Queue.
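The retry-then-dead-letter behaviour can be illustrated with a small in-memory simulation. This is a conceptual sketch only: `consume`, `maxAttempts` and the `deadLetters` list are hypothetical stand-ins, and in a real deployment this behaviour would normally come from the messaging framework and broker configuration rather than hand-written code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class RetrySketch {

  // Tries the handler up to maxAttempts times. If every attempt throws,
  // the message is added to the dead-letter list instead of being lost.
  static <T> boolean consume(T message, Consumer<T> handler,
                             int maxAttempts, List<T> deadLetters) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        handler.accept(message);
        return true; // processed successfully, stop retrying
      } catch (RuntimeException e) {
        // Swallow the failure and fall through to the next attempt.
      }
    }
    deadLetters.add(message);
    return false;
  }

  public static void main(String[] args) {
    List<String> deadLetters = new ArrayList<>();
    int[] calls = {0};

    // Handler that fails twice and then succeeds.
    boolean ok = consume("event-1", m -> {
      if (++calls[0] < 3) {
        throw new IllegalStateException("temporary failure");
      }
    }, 3, deadLetters);
    System.out.println(ok + " " + deadLetters);

    // Handler that always fails ends up in the dead-letter list.
    boolean failed = consume("event-2", m -> {
      throw new IllegalStateException("permanent failure");
    }, 3, deadLetters);
    System.out.println(failed + " " + deadLetters);
  }
}
```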
Environment variables
The following backend-specific properties can be configured:
```
# Database properties
spring.datasource.url=# The JDBC url to the PostgreSQL database to connect with
spring.datasource.username=# The login username to use for connecting with the database
spring.datasource.password=# The login password to use for connecting with the database

# Elasticsearch properties
elasticsearch.hostname=# The hostname of the Elasticsearch cluster
elasticsearch.port=# The port of the Elasticsearch cluster
elasticsearch.index-name=# The name of the index for Elasticsearch

# RabbitMQ properties (only necessary when the RabbitMQ profile is active).
# RabbitMQ also has a series of default properties
spring.rabbitmq.username=# Username to connect to RabbitMQ
spring.rabbitmq.password=# Password to connect to RabbitMQ
spring.rabbitmq.host=# Hostname of RabbitMQ
```
Owner
- Name: DiSSCo
- Login: DiSSCo
- Kind: organization
- Email: info@dissco.eu
- Location: Europe
- Website: dissco.eu
- Twitter: disscoeu
- Repositories: 35
- Profile: https://github.com/DiSSCo
Distributed System of Scientific Collections - pan-European Research Infrastructure. Updates on DiSSCo and natural science collections
GitHub Events
Total
- Release event: 3
- Delete event: 13
- Issue comment event: 38
- Push event: 47
- Pull request review comment event: 7
- Pull request event: 32
- Pull request review event: 19
- Create event: 37
Last Year
- Release event: 3
- Delete event: 13
- Issue comment event: 38
- Push event: 47
- Pull request review comment event: 7
- Pull request event: 32
- Pull request review event: 19
- Create event: 37
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 0
- Total pull requests: 8
- Average time to close issues: N/A
- Average time to close pull requests: 18 days
- Total issue authors: 0
- Total pull request authors: 2
- Average comments per issue: 0
- Average comments per pull request: 0.75
- Merged pull requests: 5
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 8
- Average time to close issues: N/A
- Average time to close pull requests: 18 days
- Issue authors: 0
- Pull request authors: 2
- Average comments per issue: 0
- Average comments per pull request: 0.75
- Merged pull requests: 5
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
- southeo (37)
- samleeflang (14)
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- actions/cache v1 composite
- actions/checkout v2 composite
- actions/setup-java v1 composite
- anothrNick/github-tag-action 1.36.0 composite
- docker/build-push-action v3 composite
- docker/login-action v1 composite
- docker/metadata-action v4 composite
- eclipse-temurin 17-alpine build
- org.testcontainers:testcontainers-bom 1.17.6 import
- co.elastic.clients:elasticsearch-java 8.4.1
- com.fasterxml.jackson.core:jackson-databind
- com.fasterxml.jackson.dataformat:jackson-dataformat-xml
- com.fasterxml.jackson.datatype:jackson-datatype-jsr310
- com.github.java-json-tools:json-patch 1.13
- jakarta.json:jakarta.json-api 2.1.1
- org.postgresql:postgresql
- org.projectlombok:lombok
- org.springframework.boot:spring-boot-configuration-processor
- org.springframework.boot:spring-boot-starter-jooq
- org.springframework.boot:spring-boot-starter-validation
- org.springframework.boot:spring-boot-starter-web
- org.springframework.kafka:spring-kafka
- org.flywaydb:flyway-core test
- org.springframework.boot:spring-boot-starter-test test
- org.springframework.kafka:spring-kafka-test test
- org.testcontainers:junit-jupiter test
- org.testcontainers:postgresql test
- org.testcontainers:testcontainers test