Science Score: 75.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 2 DOI reference(s) in README
- ✓ Academic publication links: Links to arxiv.org
- ○ Academic email domains
- ✓ Institutional organization owner: Organization ulbmuenster has institutional domain (www.ulb.uni-muenster.de)
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (10.9%) to scientific vocabulary
Keywords
data-catalog
data-engineering
data-lake
data-lakehouse
datacite
library-catalogue
marc21
metadata
metadata-catalog
metadata-lake
metadata-management
metadata-mapping
metalake
oai-pmh
Repository
DatAasee - A Metadata-Lake for Libraries
Basic Info
- Host: GitHub
- Owner: ulbmuenster
- License: mit
- Language: Makefile
- Default Branch: main
- Homepage: https://ulbmuenster.github.io/dataasee/
- Size: 3.06 MB
Statistics
- Stars: 14
- Watchers: 3
- Forks: 2
- Open Issues: 0
- Releases: 1
Created over 1 year ago · Last pushed 10 months ago
Metadata Files
Readme
Changelog
License
Citation
README.md
DatAasee (0.3)

Repository: github.com/ulbmuenster/dataasee (nb sources backup)
Maintainer: Christian Himpe (at University and State Library of Münster)
Licenses: MIT (additionally CC-BY for openapi.yaml)
Function: Metadata-Lake, Metadata Catalog, Metadata Aggregator, Union Catalog
Audience: University Libraries, Research Libraries, Academic Libraries, Scientific Libraries
Tech Stack Canvas
- Setting: Many distributed data and metadata sources
- Goals:
  - Centralize metadata
  - Interlinked metadata catalog
  - Super-index for bibliographic and research data
- Features:
  - Interact through HTTP-API (JSON)
  - Search by filter or full-text
  - Custom query via: SQL, Gremlin, Cypher, MQL, GraphQL
- Frontend: Lowdefy
- Backend: Connect (Benthos)
- Data Storage: ArcadeDB
- Infrastructure: Compose (via Docker or Podman)
- Deployment: via Harbor (at Uni Münster)
- Monitoring: Prometheus
- Integrations:
  - Protocols: OAI-PMH (HTTP), S3 (HTTP), GET (HTTP), DatAasee (HTTP)
  - Encodings: XML (Plain-Text)
  - Formats: DataCite (XML), DC (XML), LIDO (XML), MARC (XML), MODS (XML)
- Security: Privileged endpoints (CQRS)
- Testing: check-jsonschema
- Development: GitHub
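The HTTP-API and full-text search features listed above can be illustrated with a small sketch. The host `localhost`, the path `/api/v1/metadata`, the `search` parameter, and the helper name `search_url` are assumptions made for illustration only; the OpenAPI Schema is the authoritative source for the real endpoint names. Only the port 8343 is taken from the default port list.

```shell
#!/bin/sh
# ASSUMPTION: host, endpoint path and the "search" parameter are placeholders
# for illustration; consult the OpenAPI Schema for the actual names.
API_BASE="${API_BASE:-http://localhost:8343}"   # 8343 is the default API port
API_PATH="${API_PATH:-/api/v1/metadata}"        # hypothetical path

# Build a full-text search URL for a single search term.
# This sketch does not URL-encode, so the term must not contain spaces.
search_url() {
  echo "${API_BASE}${API_PATH}?search=$1"
}

# Usage (requires a running instance): print the JSON response to stdout.
# wget -q -O - "$(search_url history)"
search_url history
```

Keeping the base URL and path in variables makes it easy to point the same sketch at another host or at a corrected endpoint name.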
Documentation
- Dependencies Overview
- Software Documentation
- Architecture Documentation
- Database Schema
- OpenAPI Schema
DatAasee: A Metadata-Lake as Metadata Catalog for a Virtual Data-Lake (Companion Paper, Open Access)
Getting Started (Deployment)
- Depends on `docker-compose` (compatible with `docker` and `podman`)
- To deploy, there is no need to clone the repository; just use the `compose.yaml` file.
- See the Deploy Documentation for details.
Quick Start:
```shell
$ wget https://raw.githubusercontent.com/ulbmuenster/dataasee/0.3/compose.yaml
$ mkdir -p backup
$ DB_PASS=password1 DL_PASS=password2 docker compose up -d
```
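The quick start passes both passwords inline. A minimal pre-flight sketch follows, assuming that both `DB_PASS` and `DL_PASS` have to be non-empty; the helper name `check_secrets` is hypothetical and not part of DatAasee.

```shell
#!/bin/sh
# Hypothetical helper: verify that the two secrets used by the quick start
# are set and non-empty before starting the stack.
check_secrets() {
  for name in DB_PASS DL_PASS; do
    # Indirect lookup of the variable named in $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      return 1
    fi
  done
  return 0
}

# Usage: only start the containers when both secrets are present.
# check_secrets && docker compose up -d
```

Exporting the variables in the calling shell, instead of writing them inline, also keeps the passwords out of the command that ends up in the shell history.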
Default Ports
- 8343: DatAasee API
- 2480: Database API (Development Only)
- 9999: Database JMX (Development Only)
- 8000: Web Frontend (Development Only)
- 80: Web Frontend (Deployment Only)
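The port list can be turned into a small lookup, for example to prepare firewall rules or reachability checks. This is a sketch only: the helper name `ports_for` is hypothetical, and it assumes that the API port 8343, which is listed without a qualifier, is exposed in both modes.

```shell
#!/bin/sh
# Hypothetical helper: print the default ports expected in a given mode,
# based on the port list above. Assumes 8343 (API) is exposed in both modes.
ports_for() {
  case "$1" in
    development) echo "8343 2480 9999 8000" ;;
    deployment)  echo "8343 80" ;;
    *) echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

# Example: print one line per port expected in a deployment setup.
for port in $(ports_for deployment); do
  echo "expect port $port to be reachable"
done
```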
Repository Contents
- `api/`: API definition and message schemas
- `assets/`: Logos and style definition
- `backend/`: Processor pipeline and component definitions
- `container/`: Dockerfiles
- `database/`: Database initialization, schemas and enumerated data
- `docs/`: Documentation of software, data and architecture
- `frontend/`: Prototype frontend definition
- `tests/`: Test definitions and data
Getting Started (Development)
- Available `make` targets:
  - `make setup`: Build server images
  - `make start`: Start servers
  - `make stop`: Stop servers
  - `make reset`: Stop and start servers
  - `make empty`: Delete database backups (requires privileges)
  - `make logs`: Show logs (requires `grep`)
  - `make peak`: Report peak database memory usage (requires `grep`)
  - `make test`: Run tests (requires `check-jsonschema`, `busybox`, `wget`)
  - `make tidy`: List violations of StrictYAML (requires `yamllint`)
  - `make todo`: List inline TODOs in repo (requires `grep`)
- Custom `make` variable: `COMPOSE`
Contributors
Owner
- Name: Universitäts- und Landesbibliothek Münster
- Login: ulbmuenster
- Kind: organization
- Location: Muenster
- Website: https://www.ulb.uni-muenster.de/
- Repositories: 5
- Profile: https://github.com/ulbmuenster
Citation (CITATION.cff)
cff-version: 1.2.0
title: DatAasee
message: In Development
type: software
authors:
  - given-names: Christian
    family-names: Himpe
    orcid: 'https://orcid.org/0000-0003-2194-6754'
    affiliation: University of Münster
  - given-names: Philipp
    family-names: Kuschat
    affiliation: University of Münster
  - given-names: Holger
    family-names: Przibytzin
    affiliation: University of Münster
  - given-names: Marc
    family-names: Schutzeichel
    affiliation: University of Münster
  - given-names: Jan-Erik
    family-names: Stange
    affiliation: University of Münster
abstract: A metadata-lake for libraries
keywords:
  - Metadata Lake
  - Metadata Catalog
  - Metadata Management
  - Data Catalog
  - Data Engineering
license: MIT
version: '0.3'
GitHub Events
Total
- Release event: 1
- Watch event: 9
- Push event: 1
- Create event: 2
Last Year
- Release event: 1
- Watch event: 9
- Push event: 1
- Create event: 2