Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (8.6%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: rivm-syso
- License: other
- Language: R
- Default Branch: main
- Size: 1.08 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
FOTO-NL Chemical Pollution Monitoring Dataset Analysis
This repository contains R Markdown scripts for processing, analyzing, and visualizing long-term chemical pollution monitoring data of surface waters in the Netherlands, as described in the paper:
Long-term chemical pollution monitoring data of surface waters in the Netherlands
Authors: Thomas Hofman, Matthias Hof, Jaap Postma, Rineke Keijzers, Jaap Slootweg, Bas van der Wal, Leo Posthuma
The scripts support the curation, spatial analysis, and statistical summarization of data from the FOTO-NL database, which includes 42.5 million measured water quality parameters collected between 1952 and 2020. These tools aid in understanding the spatial and temporal patterns of chemical pollution in Dutch surface waters.
Repository Contents
1. script_1_read_and_clean.Rmd
Purpose: Prepares and cleans raw monitoring data for inclusion in the FOTO-NL database.
Key Steps:
- Harmonizes measurement units (e.g., micrograms per liter) across datasets.
- Resolves inconsistencies in parameter names (AquoCodes) and adds CAS numbers where missing.
- Corrects known errors, such as outlier values (e.g., surface water temperatures exceeding realistic ranges).
- Generates a standardized dataset, ToPAF.rds, suitable for downstream analyses.
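The cleaning steps above can be illustrated with a minimal base R sketch. It is not the code from script_1_read_and_clean.Rmd: the column names (`parameter`, `value`, `unit`), the parameter codes, the conversion table, and the plausible temperature range are all assumptions made for illustration.

```r
# Toy input; column names, parameter codes, and values are hypothetical.
raw <- data.frame(
  parameter = c("Cu", "Cu", "T", "T"),
  value     = c(0.05, 120, 14.2, 250),
  unit      = c("mg/l", "ug/l", "oC", "oC"),
  stringsAsFactors = FALSE
)

# Conversion factors to micrograms per liter (assumed set of input units).
to_ug_per_l <- c("ng/l" = 0.001, "ug/l" = 1, "mg/l" = 1000)

# Convert every concentration row to ug/l, leaving other units untouched.
is_conc <- raw$unit %in% names(to_ug_per_l)
raw$value[is_conc] <- raw$value[is_conc] * unname(to_ug_per_l[raw$unit[is_conc]])
raw$unit[is_conc] <- "ug/l"

# Drop water temperatures outside an assumed plausible range (-5 to 40 degrees C).
implausible <- raw$parameter == "T" & (raw$value < -5 | raw$value > 40)
clean <- raw[!implausible, ]

# Save the standardized table as an .rds file (here in a temporary directory).
out_file <- file.path(tempdir(), "ToPAF.rds")
saveRDS(clean, out_file)
```

Looking up the factor by unit name keeps the conversion in one table, so supporting another unit would only require one more entry.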
2. script_2_spatial_analysis.Rmd
Purpose: Performs spatial analysis to ensure geographic accuracy and prepares data for mapping.
Key Steps:
- Extracts geographic coordinates and ensures they are within a 2 km buffer around the Netherlands.
- Identifies and removes erroneous locations outside the defined boundary.
- Exports spatial data for integration with GIS tools (e.g., QGIS).
- Outputs the cleaned spatial dataset, AllDataNL.rds.
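As a rough illustration of the buffer check, the sketch below uses the sf package. The square polygon is only a stand-in for the real national boundary, and the site table, its column names, and the choice of the RD New coordinate system (EPSG:28992, units in metres) are assumptions rather than details taken from script_2_spatial_analysis.Rmd.

```r
library(sf)

# Toy stand-in for the national boundary: a 100 km square in RD New (EPSG:28992).
boundary <- st_sfc(
  st_polygon(list(rbind(
    c(100000, 400000), c(200000, 400000), c(200000, 500000),
    c(100000, 500000), c(100000, 400000)
  ))),
  crs = 28992
)

# Hypothetical sampling sites with x/y coordinates in the same CRS.
sites <- data.frame(
  site_id = c("A", "B", "C"),
  x = c(150000, 201000, 250000),
  y = c(450000, 450000, 450000)
)
sites_sf <- st_as_sf(sites, coords = c("x", "y"), crs = 28992)

# Buffer the boundary by 2 km and test which sites fall inside the buffered area.
buffered <- st_buffer(boundary, dist = 2000)
inside <- st_intersects(sites_sf, buffered, sparse = FALSE)[, 1]

sites_nl      <- sites_sf[inside, ]   # kept: inside the boundary or within 2 km of it
sites_removed <- sites_sf[!inside, ]  # dropped: erroneous locations
```

Site B lies 1 km outside the toy boundary and is kept by the buffer, while site C lies 50 km outside and is removed.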
3. script_3_summary_statistics.Rmd
Purpose: Produces summary statistics, visualizations, and insights into temporal and spatial trends.
Key Steps:
- Defines functions for creating ECDF (Empirical Cumulative Distribution Function) plots and calculating interquartile ranges (IQR).
- Summarizes data across waterboards, years, and substances.
- Generates visualizations, including:
  - ECDF plots for parameter distributions.
  - Heatmaps of measurement density across waterboards and time.
  - Bar charts of year-wise monitoring intensity.
- Saves outputs as summary tables and figures.
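The ECDF and IQR helpers mentioned above might look roughly like the following base R sketch. The function names `iqr_summary` and `plot_ecdf` and the example pH values are hypothetical and are not taken from script_3_summary_statistics.Rmd.

```r
# Hypothetical helper: quartiles and interquartile range of a numeric vector.
iqr_summary <- function(x) {
  x <- x[!is.na(x)]
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), names = FALSE)
  c(q25 = q[1], median = q[2], q75 = q[3], iqr = q[3] - q[1])
}

# Hypothetical helper: draw the empirical cumulative distribution of x.
plot_ecdf <- function(x, label) {
  plot(ecdf(x), main = paste("ECDF of", label),
       xlab = label, ylab = "Cumulative proportion")
}

# Made-up pH measurements for illustration.
ph_values <- c(6.8, 7.1, 7.4, 7.9, 8.2)

F_hat <- ecdf(ph_values)        # step function: share of values <= a threshold
share_below <- F_hat(7.4)       # proportion of measurements at or below pH 7.4
ph_stats <- iqr_summary(ph_values)
```

Calling `plot_ecdf(ph_values, "pH")` would draw the corresponding plot on the active graphics device.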
Workflow Overview
Input
Raw monitoring data collected from 21 Dutch waterboards and Rijkswaterstaat. This includes measurements of physical and chemical water quality parameters such as pH, chloride, dissolved organic carbon (DOC), and heavy metals.
Workflow
- Data Cleaning (script_1_read_and_clean.Rmd):
  - Standardize and harmonize data.
  - Address inconsistencies in measurement units and parameter naming.
- Spatial Analysis (script_2_spatial_analysis.Rmd):
  - Validate geographic locations of sampling points.
  - Filter out points outside the defined study area.
- Statistical Analysis (script_3_summary_statistics.Rmd):
  - Generate summary statistics and visualizations to highlight patterns and trends in the data.
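To give a sense of the summaries produced in the statistical analysis step, here is a minimal base R sketch that counts measurements per waterboard and year, the kind of table a measurement-density heatmap could be drawn from. The column names and records are invented for illustration and do not reflect the actual FOTO-NL schema.

```r
# Made-up observations; column names are assumptions.
obs <- data.frame(
  waterboard = c("WB1", "WB1", "WB2", "WB2", "WB2"),
  date = as.Date(c("1999-05-01", "1999-06-15", "1999-07-01",
                   "2000-01-10", "2000-03-02")),
  stringsAsFactors = FALSE
)

# Derive the sampling year from the date.
obs$year <- as.integer(format(obs$date, "%Y"))

# Cross-tabulate and reshape to one row per waterboard-year combination.
counts <- as.data.frame(
  table(waterboard = obs$waterboard, year = obs$year),
  responseName = "n_measurements",
  stringsAsFactors = FALSE
)
```

Combinations without any measurements appear with a count of zero, which makes monitoring gaps visible.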
Output
- Cleaned Data: ToPAF.rds (intermediate) and AllDataNL.rds (final).
- Spatial Data: coordsNL.csv for GIS use.
- Summary Tables and Visualizations: Stored in the summary_stats directory.
Key Features of the FOTO-NL Database
- Temporal Scope: Spanning from July 7, 1952, to August 1, 2020.
- Spatial Coverage: 28,525 unique sites across the Netherlands.
- Parameter Diversity: 42.5 million records for parameters such as pH, DOC, metals, and synthetic chemicals.
- Applications: Supports analyses of spatial patterns, temporal trends, and regulatory compliance under frameworks like the EU Water Framework Directive (WFD).
How to Use
Prerequisites
Ensure the following R packages are installed:
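    # Install any missing packages, then load them all.
    list.of.packages <- c("tidyverse", "dplyr", "ggplot2", "sf", "mapview", "readr", "scales", "lubridate", "gridExtra")
    for (pkg in list.of.packages) {
      # Install only when the package is not yet available.
      if (!requireNamespace(pkg, quietly = TRUE)) install.packages(pkg)
      # Attach the package, including ones that were just installed.
      library(pkg, character.only = TRUE)
    }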
Steps
1. Clone the repository:
       git clone https://github.com/rivm-syso/FOTO-NL.git
       cd FOTO-NL
2. Open each .Rmd file in RStudio.
3. Run the scripts in the following order:
   - script_1_read_and_clean.Rmd
   - script_2_spatial_analysis.Rmd
   - script_3_summary_statistics.Rmd
4. Review the outputs in the summary_stats directory and the generated visualizations.
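As an alternative to opening each file in RStudio, the scripts could in principle be rendered in order from the R console. The sketch below assumes the rmarkdown package is installed (it is not in the prerequisites list above) and that the three .Rmd files sit in the repository root; the helper `run_pipeline` is hypothetical.

```r
# Scripts in the order given by the README.
scripts <- c(
  "script_1_read_and_clean.Rmd",
  "script_2_spatial_analysis.Rmd",
  "script_3_summary_statistics.Rmd"
)

# Hypothetical helper: render every script in sequence, stopping early if any
# file is missing, because later scripts depend on outputs of earlier ones.
run_pipeline <- function(repo_dir = ".") {
  paths <- file.path(repo_dir, scripts)
  not_found <- paths[!file.exists(paths)]
  if (length(not_found) > 0) {
    stop("Missing script(s): ", paste(not_found, collapse = ", "))
  }
  for (path in paths) {
    message("Rendering ", path)
    rmarkdown::render(path)
  }
  invisible(paths)
}
```

From the cloned repository directory, `run_pipeline()` would then process the three scripts one after another.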
Acknowledgements
This project was funded by RIVM and STOWA under the SPR-BIOTICHS project. The authors thank the Dutch Waterboards and Rijkswaterstaat for data contributions.
Corresponding Authors:
Thomas Hofman: thomas.hofman@rivm.nl
Leo Posthuma: leo.posthuma@rivm.nl
Citation
If you use this repository or the FOTO-NL database, please cite: Hofman et al. (2025). Long-term chemical pollution monitoring data of surface waters in the Netherlands.
Owner
- Name: Rijksinstituut voor Volksgezondheid en Milieu
- Login: rivm-syso
- Kind: organization
- Email: info@rivm.nl
- Location: Bilthoven, The Netherlands
- Website: www.rivm.nl
- Repositories: 13
- Profile: https://github.com/rivm-syso
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: >-
  FOTO-NL Long-term chemical pollution monitoring data of
  surface waters in the Netherlands
message: >-
  If you wish to use these scripts or the output, please use
  the following metadata.
type: software
authors:
  - given-names: Thomas
    family-names: Hofman
    affiliation: RIVM
    orcid: 'https://orcid.org/0009-0002-9101-0353'
    email: thomas.hofman@rivm.nl
repository-code: 'https://github.com/rivm-syso/FOTO-NL'
url: 'https://github.com/rivm-syso/FOTO-NL'
abstract: >
  Data on chemical pollutants in surface waters were
  collated from the 21 current Dutch regional Waterboards
  and the national Rijkswaterstaat authority, covering the
  whole country and the period between July 7, 1952, and
  August 1, 2020. Data were curated and harmonized to
  provide a resource, the FOTO-NL chemical pollution
  database, for scientific research and practical purposes
  in characterization, prevention, and mitigation of
  chemical pollution of surface waters. The collated dataset
  contains a total of 42.5 million place-time specified
  measured water quality parameters, representing the most
  dense water quality set of Europe, readily available for
  the evaluation of the water quality policy goals of the
  Water Framework Directive in 2027 and beyond.
keywords:
  - Chemical
  - Pollution
  - Monitoring
  - Spatial
  - Temporal
  - Data
license: CC-BY-SA-4.0
version: 1.0.0
date-released: 2025-01-23
doi: 10.21945/13ec755b-abdc-4ef8-9692-12a3360b9871
GitHub Events
Total
- Release event: 1
- Watch event: 1
- Public event: 1
- Push event: 2
Last Year
- Release event: 1
- Watch event: 1
- Public event: 1
- Push event: 2