environmental-insights

Code repository for Environmental Insights, a Python package for accessing and analysing ambient air pollution concentration data.

https://github.com/liamjberrisford/environmental-insights

Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 15 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.2%) to scientific vocabulary

Keywords

airpollution ambient data-science machinelearning
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 8
  • Watchers: 2
  • Forks: 2
  • Open Issues: 3
  • Releases: 0
Created almost 2 years ago · Last pushed 6 months ago
Metadata Files
Readme License Citation

README.md

Environmental Insights

A Python package for democratizing access to ambient air pollution data and predictive analytics.


📖 Description

Environmental Insights provides easy-to-use functions to download, process, and analyze ambient air pollution and meteorological data over England.
- Implements supervised machine-learning pipelines to predict hourly pollutant concentrations on a 1 km² grid.
- Supplies both “typical day” aggregates (percentiles) and full hourly model outputs.
- Includes geospatial utilities for mapping, interpolation, and uncertainty analysis.
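
The "typical day" aggregates mentioned above can be illustrated with a minimal sketch. It uses only the Python standard library and synthetic values, and the helper name `typical_day` is hypothetical: it is not part of the Environmental Insights API, and the package's own aggregation may differ. The idea is to group hourly values by hour of day and report the 5th, 50th and 95th percentiles for each hour.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import quantiles


def typical_day(records):
    """Summarise (timestamp, value) pairs by hour of day.

    Hypothetical helper, not the package's API.
    Returns {hour: (p5, p50, p95)}; each hour needs at least two values.
    """
    by_hour = defaultdict(list)
    for timestamp, value in records:
        by_hour[timestamp.hour].append(value)

    profile = {}
    for hour, values in sorted(by_hour.items()):
        # quantiles(n=100) returns the 99 cut points for percentiles 1..99,
        # so index 4 is the 5th percentile, 49 the median, 94 the 95th.
        cuts = quantiles(values, n=100, method="inclusive")
        profile[hour] = (cuts[4], cuts[49], cuts[94])
    return profile


# Synthetic data only: 30 days of made-up hourly values, not real measurements.
start = datetime(2018, 1, 1)
records = [
    (start + timedelta(hours=h), 20.0 + (h % 24) + (h // 24) * 0.1)
    for h in range(30 * 24)
]

profile = typical_day(records)
for hour in (0, 12, 23):
    p5, p50, p95 = profile[hour]
    print(f"hour {hour:02d}: p5={p5:.2f} p50={p50:.2f} p95={p95:.2f}")
```

The result has one entry per hour of day, which is the shape a "typical day" profile takes regardless of how many days feed into it.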


⚙️ Installation

Install from PyPI:

```bash
pip install environmental-insights
```

Or from source:

```bash
git clone https://github.com/liamjberrisford/Environmental-Insights.git
cd Environmental-Insights
python -m build
pip install dist/environmental_insights-0.2.1b0-py3-none-any.whl
```
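
After either install route, it can be useful to confirm which version ended up in the environment. The sketch below uses only the standard-library `importlib.metadata` module. It assumes the installed distribution is registered under the PyPI name `environmental-insights`, and the helper `installed_version` is a hypothetical name introduced for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Assumption: the distribution name matches the name used with pip above.
found = installed_version("environmental-insights")
if found is None:
    print("environmental-insights is not installed in this environment")
else:
    print(f"environmental-insights {found} is installed")
```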


📂 Data Sources

This package downloads and processes three primary CEDA datasets:

  1. Machine Learning for Hourly Air Pollution Prediction in England (ML-HAPPE)
    Berrisford, L. (2025). Machine Learning for Hourly Air Pollution Prediction in England (ML-HAPPE). NERC EDS Centre for Environmental Data Analysis.
    DOI: 10.5285/fc735f9878ed43e293b85f85e40df24d

    Full-year (2018) hourly modelled concentrations of NO₂, NO, NOₓ, O₃, PM₁₀, PM₂.₅ and SO₂ on a 1 km² grid, including 5th, 50th & 95th percentiles and underlying training data.

  2. Machine Learning for Hourly Air Pollution Prediction - Global (ML-HAPPG)
    Berrisford, L. (2025). Machine Learning for Hourly Air Pollution Prediction - Global (ML-HAPPG). NERC EDS Centre for Environmental Data Analysis.
    DOI: 10.5285/7f91b1326a324caa9e436b8fdef4a0d8

    Global hourly modelled concentrations for 2022 of NO₂, O₃, PM₁₀, PM₂.₅ and SO₂ on a 0.25° × 0.25° grid, with mean, 5th, 50th and 95th percentile estimates.

  3. Synthetic Hourly Air Pollution Prediction Averages for England (SynthHAPPE)
    Berrisford, L. (2025). Synthetic Hourly Air Pollution Prediction Averages for England (SynthHAPPE). NERC EDS Centre for Environmental Data Analysis.
    DOI: 10.5285/4cbd9c53ab07497ba42de5043d1f414b

    Representative “typical day” profiles of NO₂, NO, NOₓ, O₃, PM₁₀, PM₂.₅ and SO₂ on a 1 km² grid, with 5th, 50th & 95th percentiles.
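
The three DOIs listed above resolve through the standard `https://doi.org/` resolver. As a small sketch, the mapping below turns each dataset's short name into its landing-page URL. The dictionary and the helper `doi_url` are illustrative names, not part of the package, and the format check is only a loose sanity test of the DOI shape.

```python
import re

# DOIs copied from the dataset list above.
DATASET_DOIS = {
    "ML-HAPPE": "10.5285/fc735f9878ed43e293b85f85e40df24d",
    "ML-HAPPG": "10.5285/7f91b1326a324caa9e436b8fdef4a0d8",
    "SynthHAPPE": "10.5285/4cbd9c53ab07497ba42de5043d1f414b",
}

# Loose pattern: "10.", a registrant code, a slash, then a non-empty suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_url(doi):
    """Build the resolver URL for a DOI, rejecting obviously malformed input."""
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a DOI: {doi!r}")
    return f"https://doi.org/{doi}"


for name, doi in DATASET_DOIS.items():
    print(f"{name}: {doi_url(doi)}")
```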


For full examples, see the Jupyter-Book tutorial in book/tutorial_environmental_insights.ipynb.

📚 Documentation

Build and view locally:

```bash
jupyter-book build book/
```

Then open book/_build/html/index.html in your browser.
Highlights:

  • API Reference: book/docs/api/environmental_insights/
  • Tutorial Notebook: book/tutorial_environmental_insights.ipynb

The documentation is also available via the GitHub Pages site.
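
For a local build, the generated entry page can also be located and opened from Python rather than by hand. This is a minimal sketch using the standard library only. It assumes the `book/_build/html/index.html` output path given in the build instructions, and the helpers `built_index` and `open_docs` are hypothetical names, not part of the package.

```python
import webbrowser
from pathlib import Path


def built_index(book_dir="book"):
    """Return the absolute path of the built index page, or None if absent."""
    index = Path(book_dir) / "_build" / "html" / "index.html"
    return index.resolve() if index.is_file() else None


def open_docs(book_dir="book"):
    """Open the built docs in the default browser; return False if not built."""
    index = built_index(book_dir)
    if index is None:
        return False
    return webbrowser.open(index.as_uri())


# Report whether a build exists; call open_docs() to actually launch a browser.
location = built_index("book")
print(location if location is not None else "docs not built yet")
```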


✅ Testing

Run the full test suite:

```bash
pytest
```

Integration and unit tests are under tests/.


📑 Citation

If you use Environmental Insights in your work, please cite:

Berrisford, L. J. (2025). Environmental Insights: Democratizing access to ambient air pollution data and predictive analytics (Version 0.2.1b0) [Software]. GitHub. https://github.com/liamjberrisford/Environmental-Insights

Also cite the underlying datasets:

  • Berrisford, L. (2025). ML-HAPPE: Machine Learning for Hourly Air Pollution Prediction in England. NERC EDS CEDA. DOI: 10.5285/fc735f9878ed43e293b85f85e40df24d
  • Berrisford, L. (2025). ML-HAPPG: Machine Learning for Hourly Air Pollution Prediction - Global. NERC EDS CEDA. DOI: 10.5285/7f91b1326a324caa9e436b8fdef4a0d8
  • Berrisford, L. (2025). SynthHAPPE: Synthetic Hourly Air Pollution Prediction Averages for England. NERC EDS CEDA. DOI: 10.5285/4cbd9c53ab07497ba42de5043d1f414b

📜 License

This project is released under the GPL-3.0-or-later license.

Owner

  • Name: Liam Berrisford
  • Login: liamjberrisford
  • Kind: user
  • Location: Exeter
  • Company: Research Software Engineer @ University of Exeter

Computer Scientist | Research Software Engineer

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: Environmental Insights
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Liam
    family-names: Berrisford
    email: liberrisford@gmail.com
    affiliation: University of Exeter
    orcid: 'https://orcid.org/0000-0001-6578-3497'
identifiers:
  - type: doi
    value: 10.1016/j.envsoft.2024.106131
repository-code: 'https://github.com/liamjberrisford/Environmental-Insights'
abstract: >-
  Ambient air pollution is a pervasive issue with
  wide-ranging effects on human health, ecosystem vitality,
  and economic structures. Utilizing data on ambient air
  pollution concentrations, researchers can perform
  comprehensive analyses to uncover the multifaceted impacts
  of air pollution across society. To this end, we introduce
  Environmental Insights, an open-source Python package
  designed to democratize access to air pollution
  concentration data. This tool enables users to easily
  retrieve historical air pollution data and employ a
  Machine Learning model for forecasting potential future
  conditions. Moreover, Environmental Insights includes a
  suite of tools aimed at facilitating the dissemination of
  analytical findings and enhancing user engagement through
  dynamic visualizations. This comprehensive approach
  ensures that the package caters to the diverse needs of
  individuals looking to explore and understand air
  pollution trends and their implications.
keywords:
  - Air Pollution
  - Machine Learning
  - Predictive Analytics

GitHub Events

Total
  • Issues event: 1
  • Watch event: 1
  • Push event: 27
  • Create event: 2
Last Year
  • Issues event: 1
  • Watch event: 1
  • Push event: 27
  • Create event: 2

Dependencies

.github/workflows/testing.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v4 composite
pyproject.toml pypi
  • ipykernel ^6.29.5 develop
  • jupyter ^1.1.1 develop
  • pytest ^8.3.5 develop
  • geopandas ^1.0.1
  • jupyterlab ^4.4.2
  • lightgbm *
  • matplotlib *
  • netcdf4 ^1.7.2
  • overpy *
  • pandas *
  • pyarrow *
  • pyogrio *
  • python ^3.10
  • requests *
  • scikit-learn ^1.6.1
  • scipy *
  • shapely *
  • xarray ^2025.4.0
.github/workflows/deploy_book.yml actions
  • actions/checkout v3 composite
  • actions/deploy-pages v4 composite
  • actions/setup-python v4 composite
  • actions/upload-pages-artifact v3 composite
.github/workflows/release.yml actions
  • actions/checkout v3 composite