kustvakt

:speedboat: User and policy management component for KorAP, capable of rewriting queries for policy based document restrictions.

https://github.com/korap/kustvakt

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.5%) to scientific vocabulary

Keywords

access-control authorization ldap oauth2 query-rewriting
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: KorAP
  • License: bsd-2-clause
  • Language: Java
  • Default Branch: master
  • Homepage:
  • Size: 174 MB
Statistics
  • Stars: 4
  • Watchers: 9
  • Forks: 3
  • Open Issues: 31
  • Releases: 26
Topics
access-control authorization ldap oauth2 query-rewriting
Created over 8 years ago · Last pushed 7 months ago
Metadata Files
Readme Changelog License Citation

README.md

Kustvakt

Kustvakt is a user rights management component for KorAP. It manages access to linguistic resources (corpus data) that are typically bound by licensing agreements (Diewald et al., 2016). KorAP provides user access to the Mannheim German Reference Corpus (DeReKo) at the Leibniz Institut für Deutsche Sprache (IDS Mannheim), which has complex licensing schemes with heterogeneous restrictions on access methods and purposes (Kupietz & Lüngen, 2014). To manage access to resources, Kustvakt implements query rewriting (Bański et al., 2014) and authorization using OAuth2 (Kupietz et al., 2022). User access also includes automated access through applications on behalf of users.

Kustvakt acts as middleware in KorAP, binding together other components such as Koral (a query serializer) and Krill (a search component). As KorAP's API provider, it offers web services, e.g. searching and retrieving annotation data of matches, that can be used by KorAP clients such as Kalamar (a KorAP web user interface), KorapSRU (the CLARIN FCS endpoint for KorAP) and RKorAPClient (a package to access KorAP from R).

Versions

  • Kustvakt lite version: provides basic services including search, match info, statistics and annotation services, without user and policy management.

  • Kustvakt full version: provides user and policy management and extended services, in addition to the basic services. This version requires a database (SQLite is provided) and an LDAP system (UnboundID InMemoryDirectoryServer is provided) for user authentication.

Recent changes on the project are described in the change logs (Changes files).

Setup

Prerequisites: JDK 17, Git, Maven 3

Clone the latest version of Kustvakt:

git clone git@github.com:KorAP/Kustvakt.git

Since Kustvakt requires Krill and Koral, please install both in your local Maven repository in the versions specified in Kustvakt/full/pom.xml. To package Kustvakt, change into the Kustvakt folder.

Packaging Kustvakt full version:

mvn clean package

Packaging Kustvakt lite version:

mvn package -P lite

The jar file is located in the target folder.

Running Kustvakt Server

java -jar target/Kustvakt-full-[version].jar

This runs the Kustvakt full version with the included example configuration file kustvakt.conf. See Customizing Kustvakt configuration.

The Kustvakt full version requires a Krill index and an LDAP configuration. By default, Kustvakt uses the sample-index located in the same directory as the jar file, and the embedded LDAP server example.

Running Kustvakt with a custom Spring XML configuration

Kustvakt can be run using an external Spring XML configuration file, e.g. using test-config-icc.xml located in data folder:

java -jar target/Kustvakt-full-[version].jar --spring-config data/test-config-icc.xml

Running Kustvakt with Docker

Kustvakt is available on Docker Hub. Please see the instructions for running the Kustvakt container on the Docker Hub page.

Generating an OAuth2 super client

An OAuth2 super client is required to be able to use web services that require user authentication. Kustvakt can generate a super client automatically. See Setting Initial Super Client for User Authentication.

Web-services

All web-services including their usage examples are described in the wiki.

Some request examples:

  • search

curl 'http://localhost:8089/api/v1.0/search?q=Wasser&ql=poliqarp'

  • search public metadata

curl 'http://localhost:8089/api/v1.0/search?q=Wasser&ql=poliqarp&fields=textSigle,title,availability&access-rewrite-disabled=true'

  • match info

curl 'http://localhost:8089/api/v1.0/corpus/GOE/AGA/01784/p4145-4146?foundry=opennlp'
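
Queries that contain spaces or special characters need to be URL-encoded. The following is a minimal sketch, assuming a local server on port 8089 as in the examples above. The names KUSTVAKT_BASE, search_endpoint and kustvakt_search are hypothetical helpers introduced for illustration and are not part of Kustvakt. It relies on curl's --get and --data-urlencode options, which append the encoded parameters to the URL as a query string.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of Kustvakt.
KUSTVAKT_BASE="${KUSTVAKT_BASE:-http://localhost:8089/api/v1.0}"

# Print the search endpoint derived from the base URL.
search_endpoint() {
  printf '%s/search\n' "$KUSTVAKT_BASE"
}

# Send a search request with URL-encoded parameters.
# $1 = query, $2 = query language (defaults to poliqarp)
kustvakt_search() {
  curl --silent --get "$(search_endpoint)" \
    --data-urlencode "q=$1" \
    --data-urlencode "ql=${2:-poliqarp}"
}

# Usage (requires a running Kustvakt server):
#   kustvakt_search 'Wasser'
```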

Shutting down Kustvakt Server

Kustvakt server can be shut down by sending a POST request with a shutdown token. When Kustvakt server is started, a shutdown token is automatically generated and written to a shutdownToken file with the following format:

token=[shutdown-token]

A shutdown request can be sent as follows.

curl -H "Content-Type: application/x-www-form-urlencoded" "http://localhost:8089/shutdown" -d @shutdownToken
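
It can be useful to check the token file before sending the request. The following is a minimal sketch, assuming the token=[shutdown-token] format described above and the default host and port. The function names read_shutdown_token and kustvakt_shutdown are hypothetical and not part of Kustvakt.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the first "token=" line in a file.
# $1 = path to the token file
read_shutdown_token() {
  sed -n 's/^token=//p' "$1" | head -n 1
}

# Hypothetical helper: send the shutdown request only if a token was found.
# $1 = path to the token file (defaults to ./shutdownToken)
kustvakt_shutdown() {
  file="${1:-shutdownToken}"
  token="$(read_shutdown_token "$file")"
  if [ -z "$token" ]; then
    echo "no token found in $file" >&2
    return 1
  fi
  curl -H "Content-Type: application/x-www-form-urlencoded" \
    "http://localhost:8089/shutdown" -d "@$file"
}

# Usage (requires a running Kustvakt server):
#   kustvakt_shutdown shutdownToken
```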

Customizing Kustvakt configuration

Copy the default Kustvakt configuration file (kustvakt.conf or kustvakt-lite.conf) to the data folder in the project directory. Please do not change the name of the configuration file.

Setting Index Directory

Set krill.indexDir in the configuration file to the location of your Krill index (relative path to the jar). In Kustvakt's root directory, there is a sample index, e.g.

krill.indexDir = sample-index

Changing Kustvakt Server Port and Host

server.port = 8089
server.host = localhost
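
Scripts that talk to the server can read these values from the configuration file instead of hard-coding them. The following is a minimal sketch, assuming the file consists of simple "key = value" lines like the excerpts in this section, with "#" starting a comment line. The function name conf_get is hypothetical and not part of Kustvakt.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a property from a "key = value" file.
# $1 = property name, $2 = configuration file
conf_get() {
  awk -v key="$1" '
    /^[[:space:]]*#/ { next }          # skip comment lines
    index($0, "=") == 0 { next }       # skip lines without "="
    {
      k = substr($0, 1, index($0, "=") - 1)
      gsub(/^[[:space:]]+|[[:space:]]+$/, "", k)
      if (k == key) {
        v = substr($0, index($0, "=") + 1)
        gsub(/^[[:space:]]+|[[:space:]]+$/, "", v)
        print v
        exit
      }
    }' "$2"
}

# Usage, assuming the configuration was copied to data/kustvakt.conf:
#   host="$(conf_get server.host data/kustvakt.conf)"
#   port="$(conf_get server.port data/kustvakt.conf)"
#   curl "http://$host:$port/api/v1.0/search?q=Wasser&ql=poliqarp"
```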

Setting Default Foundries

The following properties define the default foundries used for specific layers. For instance, a rewrite may add the default foundry to a Koral query that does not specify one.

default.foundry.partOfSpeech = tt
default.foundry.lemma = tt
default.foundry.orthography = opennlp
default.foundry.dependency = malt
default.foundry.constituent = corenlp
default.foundry.morphology = marmot
default.foundry.surface = base

Advanced Setup

Advanced setup topics, such as LDAP configuration, setting up a test environment, database properties and mail configuration for email notifications, are described in the wiki.

License

Kustvakt is published under the BSD-2 License. It is developed as part of KorAP, the Corpus Analysis Platform at the Leibniz Institute for the German Language (IDS), member of the Leibniz Association.

Contributions

Contributions to Kustvakt are very welcome!

Ideally, contributions should be committed via the KorAP Gerrit server to facilitate code review (see Gerrit Code Review - A Quick Introduction). However, we are also happy to accept comments and pull requests via GitHub.

Please note that, unless you explicitly state otherwise, any contribution intentionally submitted for inclusion into Kustvakt shall, like Kustvakt itself, be under the BSD-2 License.

Publication

Margaretha Illig, Eliza / Diewald, Nils / Kamocki, Paweł / Kupietz, Marc (2024): Managing Access to Language Resources in a Corpus Analysis Platform. In: Vandeghinste, Vincent / Kontino, Thalassia (Eds.): CLARIN Annual Conference Proceedings. Barcelona: CLARIN. pp. 163-167.

Kupietz, Marc / Diewald, Nils / Margaretha, Eliza (2022): Building Paths to Corpus Data. A Multi-Level Least Effort and Maximum Return Approach. In: Fišer, Darja / Witt, Andreas (Eds.): CLARIN: The Infrastructure for Language Resources. Berlin, Boston: De Gruyter, 2022. pp. 163-190. https://doi.org/10.1515/9783110767377-007

Diewald, Nils / Hanl, Michael / Margaretha, Eliza / Bingel, Joachim / Kupietz, Marc / Bański, Piotr / Witt, Andreas (2016): KorAP Architecture – Diving in the Deep Sea of Corpus Data. In: Calzolari, Nicoletta / Choukri, Khalid / Declerck, Thierry / Goggi, Sara / Grobelnik, Marko / Maegaard, Bente / Mariani, Joseph / Mazo, Helene / Moreno, Asuncion / Odijk, Jan / Piperidis, Stelios (Eds.): Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Portorož, Slovenia. Paris: European Language Resources Association (ELRA), 2016. pp. 3586-3591.

Bański, Piotr / Diewald, Nils / Hanl, Michael / Kupietz, Marc / Witt, Andreas (2014): Access Control by Query Rewriting. The Case of KorAP. In: Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14). European Language Resources Association (ELRA), 2014. pp. 3817-3822.

References

Kupietz, Marc / Lüngen, Harald (2014): Recent Developments in DeReKo. In: Calzolari, Nicoletta et al. (Eds.): Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14). Reykjavik: ELRA. pp. 2378-2385.

Owner

  • Name: KorAP
  • Login: KorAP
  • Kind: organization
  • Location: Mannheim, Germany

Corpus Analysis Platform by the Leibniz Institute for the German Language

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: Kustvakt
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Eliza Margaretha
    family-names: Illig
    orcid: 'https://orcid.org/0009-0000-0466-7783'
    affiliation: Leibniz Institute for the German Language
  - given-names: Nils
    family-names: Diewald
    affiliation: Leibniz Institute for the German Language
    orcid: 'https://orcid.org/0000-0002-2993-9180'
  - given-names: Marc
    family-names: Kupietz
    orcid: 'https://orcid.org/0000-0001-8997-8256'
    affiliation: Leibniz Institute for the German Language  
repository-code: 'https://github.com/KorAP/Kustvakt'
url: 'https://korap.ids-mannheim.de'
abstract: >-
  Kustvakt is a user and policy management component for
  KorAP, capable of rewriting queries for policy based
  document restrictions. 
keywords:
  - access control
  - query rewriting
  - OAuth 2.0
license: BSD-2-Clause

GitHub Events

Total
  • Create event: 7
  • Issues event: 52
  • Release event: 4
  • Delete event: 5
  • Issue comment event: 82
  • Push event: 61
  • Gollum event: 53
  • Pull request event: 5
Last Year
  • Create event: 7
  • Issues event: 52
  • Release event: 4
  • Delete event: 5
  • Issue comment event: 82
  • Push event: 61
  • Gollum event: 53
  • Pull request event: 5

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 29
  • Total pull requests: 1
  • Average time to close issues: about 1 year
  • Average time to close pull requests: N/A
  • Total issue authors: 3
  • Total pull request authors: 1
  • Average comments per issue: 0.66
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 1
Past Year
  • Issues: 25
  • Pull requests: 1
  • Average time to close issues: about 1 month
  • Average time to close pull requests: N/A
  • Issue authors: 3
  • Pull request authors: 1
  • Average comments per issue: 0.48
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 1
Top Authors
Issue Authors
  • margaretha (51)
  • notesjor (4)
  • Akron (3)
Pull Request Authors
  • dependabot[bot] (24)
  • margaretha (2)
Top Labels
Issue Labels
enhancement (6) API v1.1. (3) bug (2) APv1.1. (2) discussion (2) wontfix (1)
Pull Request Labels
dependencies (24) discussion (1)

Dependencies

full/pom.xml maven
  • org.hibernate:hibernate-jpamodelgen 5.6.10.Final provided
  • org.projectlombok:lombok 1.18.24 provided
  • com.nimbusds:nimbus-jose-jwt 9.23
  • com.nimbusds:oauth2-oidc-sdk 8.28.4
  • com.novell.ldap:jldap 4.3
  • com.sun.mail:javax.mail 1.6.2
  • com.unboundid:unboundid-ldapsdk 6.0.5
  • de.ids_mannheim.korap:Kustvakt-core [0.68,)
  • javax.activation:activation 1.1.1
  • mysql:mysql-connector-java 8.0.29
  • org.apache.commons:commons-text 1.9
  • org.apache.oltu.oauth2:org.apache.oltu.oauth2.authzserver 1.0.2
  • org.apache.oltu.oauth2:org.apache.oltu.oauth2.client 1.0.2
  • org.apache.velocity.tools:velocity-tools-generic 3.1
  • org.apache.velocity:velocity-engine-core 2.3
  • org.hibernate:hibernate-c3p0 5.6.10.Final
  • org.hibernate:hibernate-entitymanager 5.6.10.Final
  • org.hibernate:hibernate-java8 5.6.10.Final
  • com.sun.jersey.jersey-test-framework:jersey-test-framework-core 1.19.4 test
  • com.sun.jersey.jersey-test-framework:jersey-test-framework-grizzly 1.19.4 test
  • de.ids_mannheim.korap:Kustvakt-core [0.68,) test
  • org.mock-server:mockserver-netty 5.13.2 test
.github/workflows/ci_test.yml actions
  • actions/checkout v2 composite
  • actions/setup-java v1 composite
Dockerfile docker
  • openjdk 19-alpine build