datahub
Self-hostable, open-source engine for reproducible data harmonization, dataset building & exploration
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (13.7%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: datasnack
- License: mit
- Language: Python
- Default Branch: main
- Homepage: https://demo.datasnack.org/
- Size: 12.2 MB
Statistics
- Stars: 8
- Watchers: 3
- Forks: 2
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Data Hub
The Data Hub is a geographic information system (GIS) featuring a data fusion engine designed for data harmonization, alongside an interactive dashboard for effective data exploration and collaboration. Its key objective is to merge data of multiple formats and sources across temporal and spatial axes, allowing users to combine, analyze, and interpret the data.
In this repository you can explore an example setup of the Data Hub software tailored to data concerning Ghana.
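To make the idea of merging sources across temporal and spatial axes concrete, here is a minimal sketch in plain Python. It is not the Data Hub's fusion engine or API: the function `harmonize`, the layer names, the shape identifiers, and all values are hypothetical and made up for illustration only.

```python
from collections import defaultdict

# Hypothetical observations from two sources, each already expressed
# per shape (spatial axis) and per year (temporal axis). Values are invented.
rainfall_mm = [
    {"shape": "shape_a", "year": 2020, "value": 1200.0},
    {"shape": "shape_a", "year": 2021, "value": 1100.0},
    {"shape": "shape_b", "year": 2020, "value": 800.0},
]
population = [
    {"shape": "shape_a", "year": 2020, "value": 1000},
    {"shape": "shape_b", "year": 2020, "value": 2000},
]


def harmonize(layers):
    """Merge named layers into one record per (shape, year) key.

    A key that a layer has no observation for simply lacks that layer's
    entry, so gaps stay visible instead of being filled silently.
    """
    merged = defaultdict(dict)
    for layer_name, rows in layers.items():
        for row in rows:
            merged[(row["shape"], row["year"])][layer_name] = row["value"]
    return dict(merged)


table = harmonize({"rainfall_mm": rainfall_mm, "population": population})
for key in sorted(table):
    print(key, table[key])
```

Keying every record by shape and time is what allows layers from different formats and sources to be compared side by side; a real setup would also need to reconcile differing spatial and temporal resolutions, which this sketch leaves out.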
Installation
The recommended way to use the Data Hub is via Docker, referencing the container built from this repository. See the Ghana Data Hub example for installation steps. This approach lets you update the core system independently of your customizations.
To install the Data Hub from source, follow these steps:
- Install uv
- Clone this repository and open it in your terminal
- Create a `.env` file based on the `.env.example` file.
- Install a PostGIS v16.x database (you can use the provided Docker image from the `docker-compose.yml`: `docker compose up -d postgis`).
- Create a Python v3.12.x virtual environment with `uv venv --python 3.12` and activate it with `source .venv/bin/activate`.
- Install Python dependencies via `uv sync` (might be complicated due to GDAL/PROJ dependencies).
- Run database migrations with `python manage.py migrate`
- Run Django with `python manage.py runserver`
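The from-source steps above could be scripted roughly as follows. This is a sketch under assumptions, not a script shipped with the repository: the helper names `ensure_env_file` and `run_setup` are hypothetical, POSIX paths are assumed, and `.venv/bin/python` stands in for `source .venv/bin/activate`, because activating a virtual environment does not carry over between subprocesses. By default it only lists the commands (dry run).

```python
import shutil
import subprocess
from pathlib import Path

# Interpreter inside the virtual environment created by `uv venv` (POSIX layout).
VENV_PYTHON = ".venv/bin/python"

# Commands taken from the installation steps, in order.
SETUP_COMMANDS = [
    ["docker", "compose", "up", "-d", "postgis"],
    ["uv", "venv", "--python", "3.12"],
    ["uv", "sync"],
    [VENV_PYTHON, "manage.py", "migrate"],
]


def ensure_env_file(project_dir: Path) -> bool:
    """Copy .env.example to .env if .env is missing; return True if copied."""
    env_file = project_dir / ".env"
    example = project_dir / ".env.example"
    if env_file.exists() or not example.exists():
        return False
    shutil.copy(example, env_file)
    return True


def run_setup(project_dir: Path, dry_run: bool = True) -> list[str]:
    """Run the setup commands in order, or only list them when dry_run is True."""
    planned = []
    for command in SETUP_COMMANDS:
        planned.append(" ".join(command))
        if not dry_run:
            # check=True stops at the first failing step.
            subprocess.run(command, cwd=project_dir, check=True)
    return planned


if __name__ == "__main__":
    for line in run_setup(Path.cwd(), dry_run=True):
        print(line)
```

Call `ensure_env_file(Path.cwd())` first and review the values in the resulting `.env` before running with `dry_run=False`. The sketch stops before `python manage.py runserver`, which blocks the terminal and is easier to start by hand.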
The system is now running and available at http://localhost:8000/. To use it:
- Create a new superuser with `python manage.py createsuperuser`
- Import your Shapes with `python manage.py loadshapes <file>`
- Place your Data Layer source files in `src/datalayers/`
- Downloaded data will be placed in `data/datalayers/`
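If you have several shape files, the `loadshapes` step can be repeated once per file. The helper below is a hypothetical convenience wrapper, not part of the Data Hub: it only builds the `python manage.py loadshapes <file>` command for each path and assumes nothing about which file formats the command accepts.

```python
import sys
from pathlib import Path


def loadshapes_commands(shape_files, python=sys.executable):
    """Build one `manage.py loadshapes <file>` command per existing file."""
    commands = []
    for shape_file in map(Path, shape_files):
        # Fail early on a wrong path instead of partway through the imports.
        if not shape_file.is_file():
            raise FileNotFoundError(f"Shape file not found: {shape_file}")
        commands.append([python, "manage.py", "loadshapes", str(shape_file)])
    return commands
```

Each returned command can be passed to `subprocess.run(command, check=True)` from the repository root, using the interpreter of the activated virtual environment.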
Attributions
The Data Hub is open-source software (OSS) developed through the DiDEX project (Digital Data and Exploratory Spaces for Strengthening Infectious Disease Research within the One Health nexus) at the Bernhard Nocht Institute for Tropical Medicine. The project is supported and funded by the Joachim Herz Foundation ("Innovate! Academy" program).
The source code of the Data Hub is informed by the results of the ESIDA project.
License
MIT
Owner
- Name: Data Snack
- Login: datasnack
- Kind: organization
- Website: https://datasnack.github.io/
- Repositories: 1
- Profile: https://github.com/datasnack
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: Data Hub
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Jonathan
    family-names: Ströbele
    orcid: 'https://orcid.org/0000-0002-9757-8030'
  - given-names: Juliane
    family-names: Boenecke
    orcid: 'https://orcid.org/0000-0002-6327-8152'
repository-code: 'https://github.com/datasnack/datahub'
license: MIT
version: 0.10.1
date-released: '2025-08-26'
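As an illustration of how this metadata can be turned into a citation, the sketch below formats the fields into an APA-like string. The dictionary is copied by hand from the file above rather than parsed, to stay within the standard library. The function `format_citation` and the exact output style are assumptions, not a format prescribed by the project.

```python
# Fields copied from the CITATION.cff above (ORCID entries omitted).
CITATION = {
    "title": "Data Hub",
    "authors": [
        {"given-names": "Jonathan", "family-names": "Ströbele"},
        {"given-names": "Juliane", "family-names": "Boenecke"},
    ],
    "repository-code": "https://github.com/datasnack/datahub",
    "version": "0.10.1",
    "date-released": "2025-08-26",
}


def format_citation(meta):
    """Render the metadata as a single APA-like software citation."""
    # "Family, G." for each author, using the first letter of the given name.
    names = [f"{a['family-names']}, {a['given-names'][0]}." for a in meta["authors"]]
    if len(names) > 1:
        author_text = ", ".join(names[:-1]) + ", & " + names[-1]
    else:
        author_text = names[0]
    year = meta["date-released"][:4]
    return (
        f"{author_text} ({year}). {meta['title']} "
        f"(Version {meta['version']}) [Computer software]. "
        f"{meta['repository-code']}"
    )


print(format_citation(CITATION))
```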
GitHub Events
Total
- Issues event: 2
- Watch event: 2
- Issue comment event: 2
- Push event: 31
- Pull request event: 2
- Fork event: 1
- Create event: 29
Last Year
- Issues event: 2
- Watch event: 2
- Issue comment event: 2
- Push event: 31
- Pull request event: 2
- Fork event: 1
- Create event: 29
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 0
- Total pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: 1 minute
- Total issue authors: 0
- Total pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: 1 minute
- Issue authors: 0
- Pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- benedikt-weyer (1)
Pull Request Authors
- z1zzle (1)
Dependencies
- ghcr.io/osgeo/gdal ubuntu-small-3.8.4 build
- postgis/postgis 16-3.4
- @colors/colors ^1.6.0 development
- @rollup/plugin-commonjs ^25.0.5 development
- @rollup/plugin-json ^6.0.1 development
- @rollup/plugin-node-resolve ^15.2.3 development
- @types/mocha ^10.0.2 development
- @typescript-eslint/eslint-plugin ^6.7.4 development
- @typescript-eslint/parser ^6.7.4 development
- clean-css ^5.3.2 development
- cli-table ^0.3.1 development
- commander ^11.0.0 development
- css ^3.0.0 development
- css-color-names ^1.0.1 development
- deep-freeze-es6 ^3.0.2 development
- del ^7.1.0 development
- dependency-resolver ^2.0.1 development
- eslint ^8.51.0 development
- eslint-config-standard ^17.1.0 development
- eslint-plugin-import ^2.28.1 development
- eslint-plugin-node ^11.1.0 development
- eslint-plugin-promise ^6.1.1 development
- glob ^8.1.0 development
- glob-promise ^6.0.5 development
- handlebars ^4.7.8 development
- http-server ^14.1.1 development
- jsdom ^22.1.0 development
- lodash ^4.17.20 development
- mocha ^10.2.0 development
- refa ^0.4.1 development
- rollup ^4.0.2 development
- should ^13.2.3 development
- terser ^5.21.0 development
- tiny-worker ^2.3.0 development
- typescript ^5.2.2 development
- wcag-contrast ^3.0.0 development
- Django ==5.0.6
- GeoAlchemy2 ==0.14.3
- SQLAlchemy ==2.0.25
- affine ==2.4.0
- asgiref ==3.7.2
- attrs ==23.2.0
- certifi ==2023.11.17
- click ==8.1.7
- click-plugins ==1.1.1
- cligj ==0.7.2
- datacite ==1.1.3
- django-debug-toolbar ==4.3.0
- django-environ ==0.11.2
- django-taggit ==5.0.1
- fiona ==1.9.5
- geojson ==3.1.0
- geopandas ==0.14.4
- meteostat ==1.6.7
- numpy ==1.26.4
- openpyxl ==3.1.2
- osmnx ==1.9.3
- packaging ==23.2
- pandas ==2.1.4
- psycopg ==3.1.17
- psycopg-binary ==3.1.17
- psycopg2-binary ==2.9.9
- pyparsing ==3.1.1
- pyproj ==3.6.1
- python-dateutil ==2.8.2
- pytz ==2023.3.post1
- rasterio ==1.3.10
- shapely ==2.0.4
- six ==1.16.0
- snuggs ==1.4.7
- sqlparse ==0.4.4
- typing_extensions ==4.9.0
- tzdata ==2023.4