Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (8.0%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: khalilT
  • License: mit
  • Language: Python
  • Default Branch: main
  • Size: 94.7 KB
Statistics
  • Stars: 3
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created about 1 year ago · Last pushed 9 months ago
Metadata Files
Readme License Citation

README.md

Geo-Disasters: Geocoding EM-DAT climate related disasters

This repository contains a series of Python scripts for processing and geolocating climate-related disaster events from EM-DAT for the period 1990-2023. It uses geographic data from GAUL administrative boundaries and the GeoNames API.

Overview

1cleanEmDAT_nogeocode.py

This script identifies the EM-DAT events that come without a GAUL id and therefore need to be geocoded with the GeoNames API.

Inputs: EM-DAT data file (publicemdat1990_2023.xlsx)
Outputs: Dataframe with events that need to be geocoded
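The selection step described above can be sketched as follows. This is a minimal illustration, not the repository's code: plain dictionaries stand in for DataFrame rows, and the field names (`event_id`, `iso`, `location`, `gaul_id`) and sample records are hypothetical, so the real EM-DAT export may be labelled differently.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical field names; the real EM-DAT export may use other labels.
Event = Dict[str, Optional[str]]


def has_value(event: Event, field: str) -> bool:
    """True when the field is present and not blank."""
    return bool((event.get(field) or "").strip())


def split_events(events: List[Event]) -> Tuple[List[Event], List[Event]]:
    """Separate events into (to_geocode, already_identified).

    An event needs geocoding when it names a location but has no GAUL id.
    """
    to_geocode = [
        e for e in events
        if has_value(e, "location") and not has_value(e, "gaul_id")
    ]
    identified = [e for e in events if has_value(e, "gaul_id")]
    return to_geocode, identified


# Toy records for illustration only.
events = [
    {"event_id": "E1", "iso": "KEN", "location": "Nairobi", "gaul_id": None},
    {"event_id": "E2", "iso": "FRA", "location": "Bretagne", "gaul_id": "1234"},
    {"event_id": "E3", "iso": "PER", "location": "", "gaul_id": ""},
]
to_geocode, identified = split_events(events)
```

Events with neither a location name nor a GAUL id (the third toy record) fall into neither list and would need separate handling.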

2geolocationgeonames_script.py

This script geocodes event locations using the GeoNames API. A full run can take a long time (3 to 4 days): there are around 10k names to geolocate, the API limits usage per hour and per day, and runs may be interrupted when too many requests hit the API. The code is written to resume from where it was interrupted. The script was run in two iterations: the first on the list of events identified by the first script, the second on the locations that the first iteration failed to identify (around 900), after manually correcting their location names / ISO codes.

Inputs: Dataframe with locations that need to be geocoded
Outputs: Dataframe with geocoded locations: lat/lon and province names identified.
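The resume-after-interruption behaviour can be illustrated with a small checkpointing loop. This is a sketch under stated assumptions rather than the script's actual implementation: the function name `geocode_with_checkpoint`, the JSON checkpoint file, and the injected `geocode` callable (which would wrap the real GeoNames request) are all hypothetical.

```python
import json
import time
from pathlib import Path
from typing import Callable, Dict, List, Optional


def geocode_with_checkpoint(
    names: List[str],
    geocode: Callable[[str], Optional[dict]],
    checkpoint: Path,
    pause_seconds: float = 1.0,
) -> Dict[str, Optional[dict]]:
    """Geocode names one by one, saving progress so a rerun can resume.

    `geocode` is any callable returning a result dict, or None when the
    name is not found. It is assumed to raise on rate-limit or network errors.
    """
    # Load results from an earlier, interrupted run if there are any.
    results = json.loads(checkpoint.read_text()) if checkpoint.exists() else {}
    for name in names:
        if name in results:  # already handled in a previous run
            continue
        try:
            results[name] = geocode(name)
        except Exception:
            # Stop here; everything geocoded so far is already on disk.
            break
        checkpoint.write_text(json.dumps(results))
        time.sleep(pause_seconds)  # stay below the hourly/daily limits
    return results
```

Calling the function again with the same checkpoint path skips every name that is already stored, including names recorded as not found, which can then be corrected manually for a second iteration.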

3cleangeonames_geolocation.py

This script processes the locations geocoded with GeoNames and matches the latitude and longitude identified in each case to GAUL administrative regions at both levels (ADM1 and ADM2). It then assigns geocoding quality flags to the geocoded regions.

Inputs: Dataframe with geocoded locations
Outputs: Dataframe with geocoded locations and the corresponding GAUL admin level and code, which enables matching with geographic data.
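Matching a geocoded point to an administrative unit is essentially a point-in-polygon test. The sketch below uses a dependency-free ray-casting test on toy squares as a stand-in; it is not the repository's method, which more plausibly relies on a spatial join such as `geopandas.sjoin` against the GAUL polygons. The unit codes and coordinates are made up, and the geometry is treated as planar.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]  # (lon, lat)


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray casting: count how often a ray going east crosses the boundary."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge spans the ray's latitude
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def match_admin_unit(point: Point, units: Dict[str, List[Point]]) -> Optional[str]:
    """Return the code of the first unit containing the point, else None."""
    for code, polygon in units.items():
        if point_in_polygon(point, polygon):
            return code
    return None


# Two adjacent toy "administrative units" (hypothetical codes).
units = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "B": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
```

A point that falls in no unit returns `None`, which is the kind of case a quality flag or a manual correction would have to cover.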

4geolocationidentifiedgaulID.py

In this script, we identify the geometries corresponding to the locations for which EM-DAT provides a GAUL id. We then concatenate all identified locations (including those identified with GeoNames) and apply quality flags for consistency.

Inputs: Dataframe with geocoded locations and their corresponding GAUL codes; EM-DAT data file (publicemdat1990_2023.xlsx)
Outputs: GeoDataFrame with all identified locations and geographic data

5nationaloverlay.py

In this script, we overlay the regions corresponding to each EM-DAT event within each country to obtain the total reported area of the event.

Inputs: GeoDataFrame with all identified locations and geographic data
Outputs: GeoDataFrame with overlaid extent per event
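The grouping behind this overlay can be sketched without any geometry library. The example below only collects the unique administrative units per event and country; merging the corresponding polygons into one extent would be a separate geometric union (for instance with `GeoDataFrame.dissolve`), which is not shown. The field names `event_id`, `iso`, and `gaul_code` and the sample rows are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_units(rows: Iterable[dict]) -> Dict[Tuple[str, str], List[str]]:
    """Collect the unique admin-unit codes for each (event, country) pair."""
    grouped = defaultdict(set)
    for row in rows:
        grouped[(row["event_id"], row["iso"])].add(row["gaul_code"])
    # Sort for a stable, readable result.
    return {key: sorted(codes) for key, codes in grouped.items()}


# Toy rows: one event touching two countries, with a duplicated unit.
rows = [
    {"event_id": "E1", "iso": "KEN", "gaul_code": "101"},
    {"event_id": "E1", "iso": "KEN", "gaul_code": "102"},
    {"event_id": "E1", "iso": "KEN", "gaul_code": "101"},
    {"event_id": "E1", "iso": "TZA", "gaul_code": "201"},
]
extent_units = group_units(rows)
```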

6filterwrite_data.py

In this script, we perform additional filtering of EM-DAT events. We remove all events that do not have any impact information, and events with inaccurate geocoding.

Inputs: GeoDataFrame with overlaid extent per event, EM-DAT database
Outputs: GeoDataFrame with event locations, EM-DAT event and impact information, and the time information needed for climate aggregation

7comparegdis.py

In this script, we compare the geographic mismatch between EM-DAT events geocoded by GDIS and by Geo-Disasters.

Inputs: Geo-Disasters, GDIS
Outputs: comparison_df, a dataframe with the comparison results.

db_descriptions.R

In this R script, we generate the figures in the publication.

Inputs: Geo-Disasters (subnational, national overlay), GDIS, EM-DAT, comparison_df
Outputs: publication figures

How to Run the Scripts

Ensure that the paths to the required input files (EM-DAT data, GAUL maps, ...) are correctly stated in src/utils/paths.py and src/utils/paths.R. Then run the scripts in the order given by the numeric prefix of their names.
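As an illustration of what such a path configuration might look like, here is a minimal sketch. The variable names, the directory layout, and the `missing_inputs` helper are assumptions made for this example and are not taken from the actual src/utils/paths.py; only the EM-DAT file name comes from this README.

```python
from pathlib import Path
from typing import Iterable, List

# Hypothetical layout: adjust to wherever the inputs live on your machine.
DATA_DIR = Path("data")
EMDAT_FILE = DATA_DIR / "publicemdat1990_2023.xlsx"
GAUL_DIR = DATA_DIR / "gaul"


def missing_inputs(paths: Iterable[Path]) -> List[Path]:
    """Return the configured paths that do not exist yet."""
    return [p for p in paths if not Path(p).exists()]
```

Checking `missing_inputs([EMDAT_FILE, GAUL_DIR])` before starting the first script gives an early warning instead of a failure partway through a long run.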

Notes

  • The data output from these scripts is intended for geographic analysis and visualization of climate-related disasters.
  • The scripts include error handling for mismatched or missing locations. Manual corrections are still necessary in many cases.
  • The GeoNames API script requires a valid username and should be run twice: once for locations without GAUL IDs and once for manually corrected locations.
  • We do not provide the GAUL maps, but we recommend downloading them from the Google Earth Engine Data Catalog.
  • We do not provide EM-DAT. It can be freely accessed for academic purposes at https://www.emdat.be/

Installation

  1. Clone the repository:
       git clone https://github.com/khalilT/geocode_disasters.git
       cd geocode_disasters
  2. Install dependencies:
       pip install -r requirements.txt

Session Info

  • Python Version: 3.8.19 | packaged by conda-forge | (default, Mar 20 2024, 12:47:35) [GCC 12.3.0]
  • Platform: Linux-6.8.0-57-generic-x86_64-with-glibc2.10
  • OS: Linux
  • Architecture: 64bit
  • Processor: x86_64
  • Generated On: 2025-05-15 10:06:19
  • R Version: R version 4.4.2 (2024-10-31) -- "Pile of Leaves"

Owner

  • Name: Khalil Teber
  • Login: khalilT
  • Kind: user
  • Location: Leipzig, Germany
  • Company: RSC4Earth

PhD student at the University of Leipzig. I work on climate extreme events and their social and economic impacts.

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this code, please cite it using the metadata below."
title: "Geo-Disasters: Reproducible geocoding framework for EM-DAT climate-disaster events"
version: "1.0.0"
doi: 10.6084/m9.figshare.29125907
date-released: 2025-05-22
repository-code: https://github.com/khalilT/geocode_disasters
license: MIT
authors:
  - family-names: Teber
    given-names: Khalil
    orcid: https://orcid.org/0009-0006-0075-2052
  - family-names: Weynants
    given-names: Melanie
    orcid: https://orcid.org/0000-0002-1447-0105
  - family-names: Gans
    given-names: Fabian
    orcid: https://orcid.org/0000-0001-9614-0435
  - family-names: Mahecha
    given-names: Miguel D.
    orcid: https://orcid.org/0000-0003-3031-613X
keywords:
  - EM-DAT
  - geocoding
  - disaster risk
  - GAUL
  - climate extremes
abstract: "Geo-Disasters is an open Python framework that geocodes climate-related EM-DAT disaster records to GAUL administrative units, assigns quality flags, and exports two GeoPackage layers covering the period from 1990 to 2023 at the subnational and national levels. This snapshot (v1.0.0) reproduces the results reported in the manuscript: DOI 10.6084/m9.figshare.29125907."

GitHub Events

Total
  • Watch event: 4
  • Push event: 3
  • Public event: 1
Last Year
  • Watch event: 4
  • Push event: 3
  • Public event: 1

Dependencies

requirements.txt pypi
  • Babel ==2.11.0
  • Bottleneck ==1.4.0
  • Brotli ==1.0.9
  • Cartopy ==0.21.1
  • Cython ==3.0.11
  • Fiona ==1.9.1
  • GDAL ==3.6.2
  • ImageHash ==4.3.1
  • Levenshtein ==0.25.1
  • Markdown ==3.6
  • MarkupSafe ==2.0.1
  • Pillow ==9.4.0
  • PyCRS ==1.0.2
  • PyJWT ==2.9.0
  • PyQt5 ==5.15.9
  • PyQt5-sip ==12.12.2
  • PySocks ==1.7.1
  • PyWavelets ==1.4.1
  • PyYAML ==6.0.2
  • Pygments ==2.13.0
  • QtPy ==2.4.1
  • Rtree ==1.3.0
  • Send2Trash ==1.8.0
  • ace-tools ==0.0
  • affine ==2.3.1
  • aiobotocore ==2.13.3
  • aiohappyeyeballs ==2.3.5
  • aiohttp ==3.10.3
  • aioitertools ==0.11.0
  • aiosignal ==1.3.1
  • altair ==5.4.0
  • annotated-types ==0.7.0
  • anyio ==3.6.2
  • appdirs ==1.4.4
  • argon2-cffi ==21.3.0
  • argon2-cffi-bindings ==21.2.0
  • args ==0.1.0
  • arrow ==1.3.0
  • asciitree ==0.3.3
  • asttokens ==2.1.0
  • async-lru ==2.0.4
  • async-timeout ==4.0.3
  • attrs ==24.2.0
  • backcall ==0.2.0
  • backports.functools-lru-cache ==2.0.0
  • basemap ==1.3.6
  • basemap-data ==1.3.2
  • beautifulsoup4 ==4.12.3
  • bleach ==5.0.1
  • blessed ==1.20.0
  • blinker ==1.8.2
  • blis ==0.7.11
  • bokeh ==3.1.1
  • boltons ==24.0.0
  • botocore ==1.34.162
  • bqplot ==0.12.36
  • branca ==0.6.0
  • brotlipy ==0.7.0
  • cached-property ==1.5.2
  • cachetools ==5.2.0
  • cairocffi ==1.6.1
  • catalogue ==2.0.10
  • certifi ==2024.8.30
  • cf-xarray ==0.8.4
  • cffi ==1.15.1
  • cftime ==1.6.4
  • chardet ==3.0.4
  • charset-normalizer ==3.3.2
  • click ==8.1.3
  • click-plugins ==1.1.1
  • cligj ==0.7.2
  • clint ==0.5.1
  • cloudpathlib ==0.18.1
  • cloudpickle ==3.0.0
  • colorama ==0.4.6
  • colorcet ==3.1.0
  • colour ==0.1.5
  • comm ==0.2.2
  • conda ==23.7.4
  • conda-package-handling ==2.3.0
  • conda-package-streaming ==0.10.0
  • confection ==0.1.5
  • confuse ==2.0.1
  • contextily ==1.6.0
  • contourpy ==1.1.1
  • coverage ==7.1.0
  • cryptography ==43.0.0
  • cycler ==0.12.1
  • cykhash ==2.0.1
  • cymem ==2.0.8
  • cytoolz ==0.12.3
  • dask ==2023.5.0
  • dask-geopandas ==0.3.1
  • datashader ==0.15.2
  • datashape ==0.5.4
  • dbfread ==2.0.7
  • debugpy ==1.6.3
  • decorator ==5.1.1
  • deep-translator ==1.11.4
  • defusedxml ==0.7.1
  • dill ==0.3.8
  • distributed ==2023.5.0
  • earthengine-api ==0.1.332
  • ee-extra ==0.0.14
  • eemont ==0.3.5
  • eerepr ==0.0.2
  • en-core-web-md ==3.7.1
  • entrypoints ==0.4
  • esmpy ==8.4.1
  • et-xmlfile ==1.1.0
  • exceptiongroup ==1.2.2
  • executing ==1.2.0
  • fasteners ==0.17.3
  • fastjsonschema ==2.16.2
  • ffmpeg-python ==0.2.0
  • flit-core ==3.9.0
  • folium ==0.13.0
  • fonttools ==4.53.1
  • forestplot ==0.4.1
  • fqdn ==1.5.1
  • fr-core-news-sm ==3.7.0
  • frozenlist ==1.4.1
  • fsspec ==2024.6.1
  • future ==0.18.2
  • fuzzywuzzy ==0.18.0
  • gcsfs ==2024.6.1
  • gdown ==4.5.3
  • geeadd ==0.5.6
  • geemap ==0.17.2
  • geocoder ==1.38.1
  • geodatasets ==2024.7.0
  • geographiclib ==2.0
  • geojson ==2.5.0
  • geopandas ==0.13.2
  • geoplot ==0.5.1
  • geopy ==2.3.0
  • geoviews ==1.10.1
  • google-api-core ==2.10.2
  • google-api-python-client ==2.66.0
  • google-auth ==2.14.1
  • google-auth-httplib2 ==0.1.0
  • google-auth-oauthlib ==1.2.1
  • google-cloud-core ==2.3.2
  • google-cloud-storage ==2.6.0
  • google-crc32c ==1.5.0
  • google-resumable-media ==2.4.0
  • googleapis-common-protos ==1.57.0
  • googletrans ==4.0.0rc1
  • gpustat ==1.1.1
  • grpcio ==1.54.2
  • h11 ==0.14.0
  • h2 ==4.1.0
  • h5netcdf ==1.1.0
  • h5py ==3.8.0
  • holoviews ==1.17.1
  • hpack ==4.0.0
  • hstspreload ==2024.7.1
  • htmlmin ==0.1.12
  • httpcore ==1.0.5
  • httplib2 ==0.22.0
  • httpx ==0.27.0
  • hvplot ==0.10.0
  • hyperframe ==6.0.1
  • idna ==3.7
  • igraph ==0.10.4
  • imagecodecs ==2023.1.23
  • imageio ==2.34.2
  • importlib-metadata ==5.0.0
  • importlib-resources ==6.4.0
  • ipyevents ==2.0.1
  • ipyfilechooser ==0.6.0
  • ipykernel ==6.17.1
  • ipyleaflet ==0.17.2
  • ipython ==8.6.0
  • ipython-genutils ==0.2.0
  • ipytree ==0.2.2
  • ipywidgets ==8.0.2
  • isoduration ==20.11.0
  • jedi ==0.18.1
  • jinja2 ==3.1.4
  • jmespath ==1.0.1
  • joblib ==1.4.2
  • json5 ==0.9.10
  • jsonpatch ==1.33
  • jsonpointer ==3.0.0
  • jsonschema ==4.23.0
  • jsonschema-specifications ==2023.12.1
  • jupyter ==1.0.0
  • jupyter-bokeh ==2.0.4
  • jupyter-client ==7.4.7
  • jupyter-console ==6.6.3
  • jupyter-core ==5.0.0
  • jupyter-events ==0.10.0
  • jupyter-lsp ==2.2.5
  • jupyter-server ==1.23.2
  • jupyter-server-terminals ==0.5.3
  • jupyterlab ==3.5.0
  • jupyterlab-lsp ==5.1.0
  • jupyterlab-pygments ==0.2.2
  • jupyterlab-server ==2.16.3
  • jupyterlab-widgets ==3.0.3
  • kiwisolver ==1.4.5
  • langcodes ==3.4.0
  • langdetect ==1.0.9
  • langid ==1.1.6
  • language-data ==1.2.0
  • lazy-loader ==0.4
  • libmambapy ==1.4.9
  • linkify-it-py ==2.0.3
  • llvmlite ==0.41.1
  • locket ==1.0.0
  • logzero ==1.7.0
  • lz4 ==4.3.3
  • mamba ==1.4.9
  • mapclassify ==2.5.0
  • marisa-trie ==1.2.0
  • markdown-it-py ==3.0.0
  • matplotlib ==3.6.3
  • matplotlib-inline ==0.1.6
  • mdit-py-plugins ==0.4.1
  • mdurl ==0.1.2
  • mercantile ==1.2.1
  • missingno ==0.5.2
  • mistune ==2.0.4
  • mpi4py ==3.1.4
  • msgpack ==1.0.8
  • multidict ==6.0.5
  • multimethod ==1.4
  • multipledispatch ==0.6.0
  • multiprocess ==0.70.16
  • munch ==4.0.0
  • munkres ==1.1.4
  • murmurhash ==1.0.10
  • narwhals ==1.3.0
  • nbclassic ==0.4.8
  • nbclient ==0.7.0
  • nbconvert ==7.2.5
  • nbformat ==5.7.0
  • nest-asyncio ==1.5.6
  • netCDF4 ==1.6.3
  • networkx ==3.1
  • noise ==1.2.2
  • notebook ==6.5.2
  • notebook-shim ==0.2.2
  • numba ==0.58.1
  • numcodecs ==0.12.1
  • numexpr ==2.8.4
  • numpy ==1.24.4
  • nvidia-ml-py ==12.535.133
  • oauthlib ==3.2.2
  • odc-geo ==0.3.2
  • odc-stac ==0.3.3
  • openpyxl ==3.1.5
  • overrides ==7.7.0
  • packaging ==24.1
  • pandas ==2.0.3
  • pandas-profiling ==3.0.0
  • pandocfilters ==1.5.0
  • panel ==1.2.3
  • param ==1.13.0
  • parso ==0.8.3
  • partd ==1.4.1
  • pathos ==0.3.2
  • patsy ==0.5.6
  • pexpect ==4.9.0
  • phik ==0.12.3
  • pickleshare ==0.7.5
  • pip ==24.2
  • pkgutil-resolve-name ==1.3.10
  • planetary-computer ==0.4.9
  • platformdirs ==4.3.6
  • plotly ==5.11.0
  • pluggy ==1.5.0
  • ply ==3.11
  • pooch ==1.8.2
  • pox ==0.3.4
  • ppft ==1.7.6.8
  • preshed ==3.0.9
  • prometheus-client ==0.15.0
  • prompt-toolkit ==3.0.32
  • proto-plus ==1.23.0
  • protobuf ==4.21.9
  • psutil ==5.9.8
  • ptyprocess ==0.7.0
  • pure-eval ==0.2.2
  • pyOpenSSL ==24.2.1
  • pyarrow ==11.0.0
  • pyasn1 ==0.6.0
  • pyasn1-modules ==0.4.0
  • pycosat ==0.6.6
  • pycountry ==24.6.1
  • pycparser ==2.21
  • pyct ==0.4.6
  • pydantic ==1.10.2
  • pydantic-core ==2.20.1
  • pygeos ==0.14
  • pymannkendall ==1.4.3
  • pynndescent ==0.5.13
  • pyparsing ==3.1.2
  • pyproj ==3.4.0
  • pyrobuf ==0.9.3
  • pyrosm ==0.6.1
  • pyrsistent ==0.20.0
  • pyshp ==2.3.1
  • pystac ==1.6.1
  • pystac-client ==0.5.1
  • python-Levenshtein ==0.25.1
  • python-box ==6.1.0
  • python-dateutil ==2.9.0
  • python-dotenv ==0.21.0
  • python-json-logger ==2.0.7
  • python-rapidjson ==1.20
  • pytz ==2022.6
  • pyu2f ==0.1.5
  • pyviz-comms ==3.0.3
  • pyzmq ==26.1.0
  • qtconsole ==5.5.2
  • rapidfuzz ==3.9.6
  • rasterio ==1.3.4
  • ratelim ==0.1.6
  • referencing ==0.35.1
  • requests ==2.32.3
  • requests-oauthlib ==2.0.0
  • retrying ==1.3.3
  • rfc3339-validator ==0.1.4
  • rfc3986 ==1.5.0
  • rfc3986-validator ==0.1.1
  • rich ==13.7.1
  • rioxarray ==0.13.4
  • rpds-py ==0.20.0
  • rsa ==4.9
  • ruamel.yaml ==0.17.40
  • ruamel.yaml.clib ==0.2.8
  • s3fs ==2024.6.1
  • sankee ==0.2.0
  • scikit-image ==0.21.0
  • scikit-learn ==1.3.2
  • scipy ==1.10.1
  • scooby ==0.7.0
  • seaborn ==0.13.2
  • setuptools ==54.0.0
  • shap ==0.44.1
  • shapely ==2.0.1
  • shellingham ==1.5.4
  • sip ==6.7.12
  • six ==1.16.0
  • slicer ==0.0.7
  • smart-open ==7.0.4
  • sniffio ==1.3.0
  • snuggs ==1.4.7
  • sortedcontainers ==2.4.0
  • soupsieve ==2.5
  • spacepy ==0.4.1
  • spacy ==3.7.5
  • spacy-legacy ==3.0.12
  • spacy-loggers ==1.0.5
  • sparse ==0.15.4
  • spatialpandas ==0.4.10
  • spyndex ==0.4.0
  • srsly ==2.4.8
  • stack-data ==0.6.1
  • statsmodels ==0.14.1
  • tables ==3.7.0
  • tangled-up-in-unicode ==0.2.0
  • tblib ==3.0.0
  • tenacity ==8.1.0
  • terminado ==0.17.0
  • texttable ==1.7.0
  • thefuzz ==0.22.1
  • thinc ==8.2.5
  • threadpoolctl ==3.5.0
  • tifffile ==2023.7.10
  • tinycss2 ==1.2.1
  • toml ==0.10.2
  • tomli ==2.0.1
  • toolz ==0.12.1
  • tornado ==6.2
  • tqdm ==4.64.1
  • traitlets ==5.5.0
  • traittypes ==0.2.1
  • typeguard ==4.3.0
  • typer ==0.12.3
  • typer-slim ==0.12.3
  • types-python-dateutil ==2.9.0.20240316
  • typing-extensions ==4.4.0
  • typing-utils ==0.1.0
  • tzdata ==2024.1
  • uc-micro-py ==1.0.3
  • umap-learn ==0.5.6
  • unicodedata2 ==15.1.0
  • uri-template ==1.3.0
  • uritemplate ==4.1.1
  • urllib3 ==1.26.19
  • visions ==0.7.1
  • wasabi ==1.1.3
  • wcwidth ==0.2.13
  • weasel ==0.4.1
  • webcolors ==24.8.0
  • webencodings ==0.5.1
  • websocket-client ==1.4.2
  • wheel ==0.44.0
  • whitebox ==2.2.0
  • whiteboxgui ==2.2.0
  • widgetsnbextension ==4.0.3
  • wrapt ==1.16.0
  • xagg ==0.3.2.2
  • xarray ==2023.1.0
  • xarray-spatial ==0.4.0
  • xesmf ==0.8.7
  • xgboost ==2.1.1
  • xlrd ==2.0.1
  • xyzservices ==2022.9.0
  • yapf ==0.43.0
  • yarl ==1.9.4
  • zarr ==2.17.1
  • zict ==3.0.0
  • zipp ==3.19.2
  • zstandard ==0.23.0