datahub

Self-hostable, open-source engine for reproducible data harmonization, dataset building & exploration

https://github.com/datasnack/datahub

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.7%) to scientific vocabulary

Keywords

data-harmonization reproducibility self-hosted
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 8
  • Watchers: 3
  • Forks: 2
  • Open Issues: 0
  • Releases: 0
Topics
data-harmonization reproducibility self-hosted
Created about 2 years ago · Last pushed 8 months ago
Metadata Files
Readme Changelog License Citation

README.md

Data Hub

The Data Hub is a geographic information system (GIS) featuring a data fusion engine designed for data harmonization, alongside an interactive dashboard for effective data exploration and collaboration. Its key objective is to merge data of multiple formats and sources across temporal and spatial axes, allowing users to combine, analyze, and interpret the data.

In this repository you can explore an example setup of the Data Hub software tailored to data concerning Ghana.

Installation

The recommended way to use the Data Hub is via Docker, referencing the container built from this repository. See the Ghana Data Hub example for installation steps. This setup lets you update the core system independently of your customizations.

To install the Data Hub from source, follow these steps:

  • Install uv
  • Clone this repository and open it in your terminal
  • Create a .env file based on the .env.example file.
  • Install a PostGIS v16.x database (you can use the provided Docker image from the docker-compose.yml, $ docker compose up -d postgis).
  • Create a Python 3.12.x virtual environment with uv venv --python 3.12 and activate it with source .venv/bin/activate.
  • Install the Python dependencies via uv sync (this step can be difficult because of the native GDAL/PROJ dependencies).
  • Run database migrations with python manage.py migrate
  • Run Django with python manage.py runserver
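
The steps above can be sketched as a single shell script. This is a minimal sketch, not an official installer: it assumes git, uv and Docker are already installed, and that copying .env.example to .env is an acceptable starting point before you edit it. The run helper, the install_datahub function and the DRY_RUN variable are illustrative names that are not part of the Data Hub. By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/usr/bin/env bash
# Sketch of the from-source installation steps (assumes git, uv and Docker exist).
# Hypothetical helper: prints every command, and runs it only when DRY_RUN=0.
set -e

DRY_RUN="${DRY_RUN:-1}"

run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

install_datahub() {
  run git clone https://github.com/datasnack/datahub
  run cd datahub
  run cp .env.example .env             # adjust the values in .env afterwards
  run docker compose up -d postgis     # PostGIS from the provided docker-compose.yml
  run uv venv --python 3.12            # Python 3.12 virtual environment
  run source .venv/bin/activate
  run uv sync                          # may fail if GDAL/PROJ cannot be built or found
  run python manage.py migrate         # database migrations
  run python manage.py runserver       # serves http://localhost:8000/
}

install_datahub
```

Running the script as-is lists the commands in order, which is useful for checking them against your environment before running with DRY_RUN=0.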

The system is now running and available at http://localhost:8000/. To start using it:

  • Create a new superuser with python manage.py createsuperuser
  • Import your Shapes with python manage.py loadshapes <file>
  • Place your Data Layer source files in src/datalayers/
  • Downloaded data will be placed in data/datalayers/
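
As a rough sketch, the first two steps can be run from the repository root with the virtual environment active. The file name shapes.gpkg is a hypothetical placeholder, because the README does not state which file formats loadshapes accepts. The run helper, the first_steps function and the DRY_RUN variable are illustrative names as well. The script prints the commands by default; set DRY_RUN=0 to execute them.

```shell
#!/usr/bin/env bash
# Sketch of the first-use steps; prints commands unless DRY_RUN=0.
set -e

DRY_RUN="${DRY_RUN:-1}"

run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

first_steps() {
  # $1: path to your shapes file (placeholder default, format is an assumption)
  shapes_file="${1:-shapes.gpkg}"
  run python manage.py createsuperuser           # prompts interactively for credentials
  run python manage.py loadshapes "$shapes_file"
  # Data Layer source files go in src/datalayers/;
  # downloaded data ends up in data/datalayers/.
}

first_steps "$@"
```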

Attributions

The Data Hub is open-source software (OSS) developed through the DiDEX project (Digital Data and Exploratory Spaces for Strengthening Infectious Disease Research within the One Health nexus) at the Bernhard Nocht Institute for Tropical Medicine. The project is supported and funded by the Joachim Herz Foundation ("Innovate! Academy" program).

The source code of the Data Hub is informed by the results of the ESIDA project.

License

MIT

Owner

  • Name: Data Snack
  • Login: datasnack
  • Kind: organization

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: Data Hub
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Jonathan
    family-names: Ströbele
    orcid: 'https://orcid.org/0000-0002-9757-8030'
  - given-names: Juliane
    family-names: Boenecke
    orcid: 'https://orcid.org/0000-0002-6327-8152'
repository-code: 'https://github.com/datasnack/datahub'
license: MIT
version: 0.10.1
date-released: '2025-08-26'

GitHub Events

Total
  • Issues event: 2
  • Watch event: 2
  • Issue comment event: 2
  • Push event: 31
  • Pull request event: 2
  • Fork event: 1
  • Create event: 29
Last Year
  • Issues event: 2
  • Watch event: 2
  • Issue comment event: 2
  • Push event: 31
  • Pull request event: 2
  • Fork event: 1
  • Create event: 29

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 0
  • Total pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: 1 minute
  • Total issue authors: 0
  • Total pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: 1 minute
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • benedikt-weyer (1)
Pull Request Authors
  • z1zzle (1)
Top Labels
Issue Labels
Pull Request Labels

Dependencies

Dockerfile docker
  • ghcr.io/osgeo/gdal ubuntu-small-3.8.4 build
docker-compose.yml docker
  • postgis/postgis 16-3.4
app/static/vendor/highlight/es/package.json npm
app/static/vendor/highlight/package.json npm
  • @colors/colors ^1.6.0 development
  • @rollup/plugin-commonjs ^25.0.5 development
  • @rollup/plugin-json ^6.0.1 development
  • @rollup/plugin-node-resolve ^15.2.3 development
  • @types/mocha ^10.0.2 development
  • @typescript-eslint/eslint-plugin ^6.7.4 development
  • @typescript-eslint/parser ^6.7.4 development
  • clean-css ^5.3.2 development
  • cli-table ^0.3.1 development
  • commander ^11.0.0 development
  • css ^3.0.0 development
  • css-color-names ^1.0.1 development
  • deep-freeze-es6 ^3.0.2 development
  • del ^7.1.0 development
  • dependency-resolver ^2.0.1 development
  • eslint ^8.51.0 development
  • eslint-config-standard ^17.1.0 development
  • eslint-plugin-import ^2.28.1 development
  • eslint-plugin-node ^11.1.0 development
  • eslint-plugin-promise ^6.1.1 development
  • glob ^8.1.0 development
  • glob-promise ^6.0.5 development
  • handlebars ^4.7.8 development
  • http-server ^14.1.1 development
  • jsdom ^22.1.0 development
  • lodash ^4.17.20 development
  • mocha ^10.2.0 development
  • refa ^0.4.1 development
  • rollup ^4.0.2 development
  • should ^13.2.3 development
  • terser ^5.21.0 development
  • tiny-worker ^2.3.0 development
  • typescript ^5.2.2 development
  • wcag-contrast ^3.0.0 development
requirements.txt pypi
  • Django ==5.0.6
  • GeoAlchemy2 ==0.14.3
  • SQLAlchemy ==2.0.25
  • affine ==2.4.0
  • asgiref ==3.7.2
  • attrs ==23.2.0
  • certifi ==2023.11.17
  • click ==8.1.7
  • click-plugins ==1.1.1
  • cligj ==0.7.2
  • datacite ==1.1.3
  • django-debug-toolbar ==4.3.0
  • django-environ ==0.11.2
  • django-taggit ==5.0.1
  • fiona ==1.9.5
  • geojson ==3.1.0
  • geopandas ==0.14.4
  • meteostat ==1.6.7
  • numpy ==1.26.4
  • openpyxl ==3.1.2
  • osmnx ==1.9.3
  • packaging ==23.2
  • pandas ==2.1.4
  • psycopg ==3.1.17
  • psycopg-binary ==3.1.17
  • psycopg2-binary ==2.9.9
  • pyparsing ==3.1.1
  • pyproj ==3.6.1
  • python-dateutil ==2.8.2
  • pytz ==2023.3.post1
  • rasterio ==1.3.10
  • shapely ==2.0.4
  • six ==1.16.0
  • snuggs ==1.4.7
  • sqlparse ==0.4.4
  • typing_extensions ==4.9.0
  • tzdata ==2023.4
pyproject.toml pypi