dbscan

DBSCAN for multi-hazard spatio-temporal footprint analysis

https://github.com/dmferrario2/dbscan

Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 11 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.2%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: dmferrario2
  • License: gpl-3.0
  • Language: Jupyter Notebook
  • Default Branch: main
  • Size: 2.27 MB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 1
Created 9 months ago · Last pushed 8 months ago
Metadata Files
Readme License Citation

README.md

DBSCAN for multi-hazard spatio-temporal footprint analysis

A computational workflow to detect and analyse multi-hazard events (heatwaves, drought, wind, precipitation) using spatial-temporal clustering.

📜 Overview

This repository provides a methodology to identify multi-hazard footprints by combining climate thresholds, DBSCAN clustering, and spatiotemporal overlap analysis. The workflow consists of three steps:

  • Threshold Identification: Preprocessing climate data to define hazard-specific thresholds.

  • Single-Hazard Clustering: Using DBSCAN to detect spatial-temporal clusters for individual hazards.

  • Multi-Hazard Footprints: Detecting overlaps between single-hazard clusters to identify compound and consecutive events.

The tool is demonstrated for the Veneto Region (Italy) using 5 years of data (2018–2022) but can be adapted to other regions/timeframes.

🛠️ Workflow Steps

1. Threshold Identification (Preprocessing)

Input: Gridded climate data (NetCDF format).

Tools:

  • cdo (Climate Data Operators) for threshold calculations (e.g., percentiles for precipitation, wind, temperature).
  • Python scripts for drought indices (e.g., SPI-12) and duration-based filtering of events.

Output: Binary mask files (NetCDF) indicating hazard exceedance.
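
The masking idea in this step can be sketched in Python. This is only an illustration under stated assumptions: the repository performs the threshold calculation with cdo, whereas the sketch below uses NumPy as a stand-in, and the function names `exceedance_mask` and `filter_min_duration`, the 95th percentile, and the 3-day minimum duration are all hypothetical choices, not values taken from the workflow.

```python
import numpy as np


def exceedance_mask(data, percentile=95.0):
    """Boolean mask of cells exceeding their own percentile over time.

    data has shape (time, lat, lon); the threshold is computed per grid
    cell along the time axis. The percentile value is an assumption.
    """
    threshold = np.percentile(data, percentile, axis=0)  # shape (lat, lon)
    return data > threshold


def filter_min_duration(mask, min_days=3):
    """Keep only runs of consecutive True values lasting >= min_days.

    Runs are evaluated independently for each grid cell along time.
    """
    out = np.zeros_like(mask, dtype=bool)
    n_time = mask.shape[0]
    for idx in np.ndindex(mask.shape[1:]):
        series = mask[(slice(None),) + idx]
        start = None
        # Iterate one step past the end so a run touching the last day closes.
        for t in range(n_time + 1):
            active = t < n_time and series[t]
            if active and start is None:
                start = t
            elif not active and start is not None:
                if t - start >= min_days:
                    out[(slice(start, t),) + idx] = True
                start = None
    return out


# Synthetic example: 60 days on a 4 x 4 grid of random values.
rng = np.random.default_rng(0)
field = rng.normal(size=(60, 4, 4))
binary_mask = filter_min_duration(exceedance_mask(field, 80.0), min_days=2)
```

The resulting array plays the role of the binary mask described above; writing it to NetCDF is left out of the sketch.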

2. Single-Hazard Clustering (Jupyter Notebook)

Input: Daily gridded climate data, binary mask files for each hazard

Tools: DBSCAN clustering with custom spatial-temporal weights.

Hazards Supported:

  • Heatwaves (T_2M > 0°C)
  • Drought (SPI_12 < -2)
  • Extreme wind (WIND_SPEED > 13.9 m/s)
  • Extreme precipitation (TOT_PREC > 20 mm/day)

Output: Cluster labels, duration, intensity, and spatial extent per hazard.
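
A minimal sketch of spatio-temporal DBSCAN on a binary mask follows. It assumes that the "custom spatial-temporal weights" can be approximated by rescaling the time axis before clustering; the notebook may do this differently. The function name `cluster_hazard_mask` and the values of `time_weight`, `eps`, and `min_samples` are hypothetical. Only `sklearn.cluster.DBSCAN` is an actual library call.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def cluster_hazard_mask(mask, time_weight=1.0, eps=1.5, min_samples=3):
    """Label exceedance cells of a (time, lat, lon) boolean mask.

    Returns an int array of the same shape. Cells outside any cluster
    (background or DBSCAN noise) are set to -1.
    """
    labels_grid = np.full(mask.shape, -1, dtype=int)
    coords = np.argwhere(mask)  # columns: time, lat index, lon index
    if coords.shape[0] == 0:
        return labels_grid

    # Assumption: weighting is done by stretching or shrinking the time
    # axis relative to the two spatial axes (all in index units).
    scaled = coords.astype(float)
    scaled[:, 0] *= time_weight

    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(scaled)
    t, y, x = coords.T
    labels_grid[t, y, x] = labels
    return labels_grid


# Two 2-day events on the same 2 x 2 patch, separated by one quiet day.
demo = np.zeros((6, 4, 4), dtype=bool)
demo[0:2, 0:2, 0:2] = True
demo[3:5, 0:2, 0:2] = True

separate = cluster_hazard_mask(demo, time_weight=1.0)
merged = cluster_hazard_mask(demo, time_weight=0.5)
n_separate = len(set(separate[demo].tolist()))
n_merged = len(set(merged[demo].tolist()))
```

With these toy settings, a smaller `time_weight` makes events that are close in time more likely to be joined into one cluster, which is the kind of trade-off the weighting is meant to control.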

3. Multi-Hazard Footprints (Jupyter Notebook)

Input: Single hazard clusters, boundaries and landscape files for Veneto Region

Tools: Overlapping single-hazard clusters in space/time (e.g., heatwaves + drought).

Output: Compound event statistics (mean/max intensity, duration), visualizations (3D plots, maps).
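
The overlap detection can be illustrated with a small NumPy sketch. It rests on assumptions that are not confirmed by the repository: a "compound" overlap is taken to mean two hazards labelled at the same grid cell on the same day, and a "consecutive" overlap means the second hazard occurs at the same cell up to `max_lag` days after the first. The function name `find_overlaps` and the `max_lag` parameter are hypothetical.

```python
import numpy as np


def find_overlaps(labels_a, labels_b, max_lag=0):
    """Count shared cells between clusters of two hazards.

    labels_a, labels_b: int arrays of shape (time, lat, lon), -1 = no cluster.
    max_lag=0 matches same-day overlaps only; max_lag>0 also matches
    hazard B occurring 1..max_lag days after hazard A at the same cell.
    Returns a dict mapping (id_a, id_b) to the number of matched cells.
    """
    n_time = labels_a.shape[0]
    overlaps = {}
    for lag in range(min(max_lag, n_time - 1) + 1):
        a = labels_a[: n_time - lag]
        b = labels_b[lag:]
        both = (a >= 0) & (b >= 0)
        if not both.any():
            continue
        pairs, counts = np.unique(
            np.stack([a[both], b[both]], axis=1), axis=0, return_counts=True
        )
        for (id_a, id_b), count in zip(pairs, counts):
            key = (int(id_a), int(id_b))
            overlaps[key] = overlaps.get(key, 0) + int(count)
    return overlaps


# Toy labels: hazard A on days 0-1, hazard B on days 1-2 at the same cell.
heat = np.full((5, 2, 2), -1)
heat[0:2, 0, 0] = 0
drought = np.full((5, 2, 2), -1)
drought[1:3, 0, 0] = 0
drought[4, 1, 1] = 1

same_day = find_overlaps(heat, drought)
with_lag = find_overlaps(heat, drought, max_lag=1)
```

From the matched cluster pairs, statistics such as duration or mean and maximum intensity could then be computed over the shared cells.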

🚀 Quick Start

  1. Install Dependencies:

       pip install numpy pandas xarray geopandas matplotlib scikit-learn cartopy rasterio rioxarray climate-indices

  2. Download Input Data:

       • Preprocessed data (daily climate NetCDF files and the corresponding binary mask files for 2018–2022) are available on Zenodo: 10.5281/zenodo.15805129
       • Regional boundaries and landscape types are available on GitHub.
       • Original climate data can be freely downloaded:
           • CMCC VHR REA over Italy, Raffa et al., 2021, Adinolfi et al., 2023
           • CMCC VHR PRO over Italy (RCP4.5, RCP 8.5), Raffa et al., 2023

  3. Run the Notebook:

       jupyter notebook multihazardfootprints.ipynb

Notes:

To run the Jupyter notebook, first download the preprocessed data (daily climate data and mask NetCDF files) for each hazard from Zenodo. The data are provided for testing purposes only: producing consistent results requires at least 30 years of climate data. The publication describing the analyses carried out in the Veneto Region for the historical period (1991-2022) and for future scenarios (RCP 4.5 and RCP 8.5, 2023-2070) is in preparation.

Acknowledgments:

This study was carried out within the framework of the Myriad_EU project (https://www.myriadproject.eu/), which has received funding from the European Union’s Horizon 2020 research and innovation programme, call H2020-LC-CLA-2018-2019-2020, under grant agreement number 101003276.

Owner

  • Name: Davide Mauro Ferrario
  • Login: dmferrario2
  • Kind: user
  • Location: Venice, Italy
  • Company: CMCC@CaFoscari

PhD fellow in Sustainable Development and Climate Change - focusing on ML and AI for climate and multi-risk assessment - based at CMCC@CaFoscari

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: "Multi-Hazard Spatio Temporal Footprints"
version: "v1.0"
doi: "https://doi.org/10.5281/zenodo.15805330"
url: "https://github.com/dmferrario2/DBSCAN"
date-released: 2025-07-03
authors:
  - family-names: Ferrario
    given-names: Davide Mauro
  - family-names: Tiggeloven
    given-names: Timothy
  - family-names: Maraschini
    given-names: Margherita
  - family-names: Sanò
    given-names: Marcello
  - family-names: Claassen
    given-names: Judith
  - family-names: de Ruiter
    given-names: Marleen
  - family-names: Torresan
    given-names: Silvia
  - family-names: Critto
    given-names: Andrea

GitHub Events

Total
  • Create event: 1
  • Commit comment event: 1
  • Release event: 1
  • Public event: 1
  • Push event: 3
Last Year
  • Create event: 1
  • Commit comment event: 1
  • Release event: 1
  • Public event: 1
  • Push event: 3