maize-loss-climate-experiment
Science Score: 75.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 4 DOI reference(s) in README
- ✓ Academic publication links: Links to arxiv.org, sciencedirect.com, zenodo.org
- ○ Academic email domains
- ✓ Institutional organization owner: Organization schmidtdse has institutional domain (dse.berkeley.edu)
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (13.7%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: SchmidtDSE
- License: other
- Language: Python
- Default Branch: main
- Size: 105 MB
Statistics
- Stars: 1
- Watchers: 5
- Forks: 0
- Open Issues: 2
- Releases: 0
Metadata Files
README.md
Maize Loss Climate Experiment
Study looking at how climate change may impact loss rates for insurance by simulating maize outcomes via a neural network and Monte Carlo. This includes interactive tools to understand results and a pipeline to build a paper discussing findings.
Purpose
This repository contains three components for a study looking at how crop insurance claims rates may change in the future within the US Corn Belt using SCYM and CHC-CMIP6.
- Pipeline: Contained within the root of this repository, this Luigi-based pipeline trains neural networks and runs Monte Carlo simulations to project future insurance claims under various parameters, outputting data to a workspace directory.
- Tools: Within the paper/viz subdirectory, the source code for an explorable explanation built using Sketchingpy both creates the static visualizations for the paper and offers web-based interactive tools released to ag-adaptation-study.pub which allow users to iteratively engage with these results.
- Paper: Within the paper subdirectory, a manuscript is built from the output data from the pipeline. This describes these experiments in detail with visualizations.
These are described in detail below.
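To make the Monte Carlo idea concrete, here is a toy sketch that is not taken from this repository. It assumes a model has already produced a predicted mean and standard deviation of yield change, draws outcomes from a normal distribution, and counts how often they fall below a loss threshold. The function name, the normal distribution, and the -0.25 threshold are all illustrative assumptions.

```python
import random


def simulate_claims_rate(mean_change, std_change, threshold=-0.25,
                         trials=10_000, seed=0):
    """Estimate how often a simulated yield change falls below a threshold.

    All parameters are hypothetical: mean_change and std_change stand in for
    a model's predicted yield change distribution (as a fraction of expected
    yield) and threshold stands in for a loss definition.
    """
    rng = random.Random(seed)
    losses = 0
    for _ in range(trials):
        # Draw one possible outcome from the predicted distribution.
        sampled_change = rng.gauss(mean_change, std_change)
        if sampled_change < threshold:
            losses += 1
    return losses / trials


if __name__ == "__main__":
    # Made-up inputs used only to compare two scenarios.
    baseline = simulate_claims_rate(mean_change=0.0, std_change=0.15)
    shifted = simulate_claims_rate(mean_change=-0.05, std_change=0.20)
    print(f"baseline loss probability: {baseline:.3f}")
    print(f"shifted loss probability: {shifted:.3f}")
```

Fixing the seed keeps repeated runs comparable, which helps when checking how a change in the predicted distribution moves the estimated loss probability.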
Usage
The easiest way to engage with these results is through the web-based interactive explorable explanation which is housed for the public at ag-adaptation-study.pub. The paper preprint can also be found at https://arxiv.org/abs/2408.02217. We also publish our raw pipeline output. Otherwise, see local setup.
Local setup
For those wishing to extend this work, you can execute this pipeline locally by checking out this repository (git clone git@github.com:SchmidtDSE/maize-loss-climate-experiment.git).
Dev Container (Recommended)
The easiest way to get started with development is using the provided dev container, which automatically sets up all dependencies for pipeline, paper, and visualization development:
- GitHub Codespaces: Click the "Code" button and select "Open with Codespaces" for instant cloud-based development.
- VS Code with Dev Containers: Open the repository in VS Code and click "Reopen in Container" when prompted (requires Docker and the Dev Containers extension).
- Local Docker: Clone the repository and run docker build -t maize-experiment .devcontainer, then docker run -it -v $(pwd):/workspaces/maize-loss-climate-experiment maize-experiment.
The dev container includes:
- Python 3.11 with all project dependencies pre-installed
- LaTeX and Pandoc for paper building
- Sample data for visualization development
- Development tools (linting, testing)
- All system dependencies configured
After the container starts, you'll have a fully configured environment ready for development on any component.
Local pipeline
First, get access to the SCYM and CHC-CMIP6 datasets and download all of the geotiffs to an AWS S3 Bucket or another location which can be accessed via the file system. This will allow you to choose from two execution options:
- Setup for AWS: This will execute if the USE_AWS environment variable is set to 1. This assumes data are hosted remotely in an AWS bucket defined by the SOURCE_DATA_LOC environment variable and we use Coiled to execute the computation. After setting the environment variables for access credentials (AWS_ACCESS_KEY and AWS_ACCESS_SECRET) and setting up Coiled, simply execute the Luigi pipeline as described below.
- Setup for local: If the USE_AWS environment variable is set to 0, this will run using a local Dask cluster. This assumes that SOURCE_DATA_LOC is a path to the directory housing the input geotiffs. After setting up Coiled, simply execute the Luigi pipeline as described below.
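Before starting a long run, it may help to confirm that the environment variables named above are set consistently. The following is a minimal pre-flight sketch, not code from this repository: ExecutionConfig and read_execution_config are hypothetical names, and the pipeline itself may read these variables differently.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutionConfig:
    """Hypothetical container for the two execution settings."""
    use_aws: bool
    source_data_loc: str


def read_execution_config(environ=None):
    """Read USE_AWS and SOURCE_DATA_LOC and validate them before a run."""
    env = os.environ if environ is None else environ

    use_aws_raw = env.get("USE_AWS", "")
    if use_aws_raw not in ("0", "1"):
        raise ValueError("USE_AWS must be set to 0 or 1")

    source = env.get("SOURCE_DATA_LOC", "")
    if not source:
        raise ValueError("SOURCE_DATA_LOC must be set")

    use_aws = use_aws_raw == "1"
    if use_aws:
        # Credential variables named in the AWS setup option.
        required = ("AWS_ACCESS_KEY", "AWS_ACCESS_SECRET")
        missing = [name for name in required if not env.get(name)]
        if missing:
            raise ValueError("Missing: " + ", ".join(missing))

    return ExecutionConfig(use_aws=use_aws, source_data_loc=source)


if __name__ == "__main__":
    # Example values only; path/to/data is a placeholder.
    example = {"USE_AWS": "0", "SOURCE_DATA_LOC": "path/to/data"}
    print(read_execution_config(example))
```

Passing a dictionary instead of relying on os.environ keeps the check easy to exercise without changing the shell environment.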
You can then execute either by:
- Run directly: First, install the Python requirements (pip install -r requirements.txt) optionally within a virtual environment. Then, simply execute bash run.sh to execute the pipeline from start to finish. See also breakpoint_tasks.py for Luigi targets for running subsets of the pipeline.
- Run through Docker: Simply execute bash run_docker.sh to execute the pipeline from start to finish. See also breakpoint_tasks.py for Luigi targets for running subsets of the pipeline and update run.sh which is executed within the container. Note that this will operate on the workspace directory.
A summary of the pipeline is created in stats.json. See local package below for use in other repository components such as the interactive tools or paper rendering. Users may optionally skip some expensive steps by placing the files from https://zenodo.org/records/14533227 into the workspace directory.
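The layout of stats.json is not documented here, so a cautious first step is to list what it contains. The sketch below assumes only that the file holds a JSON object at the top level; summarize_stats is a hypothetical helper, and the file location is passed in because the exact output path is not stated.

```python
import json
import pathlib


def summarize_stats(path):
    """Load a JSON summary and return its top-level keys with value types."""
    with pathlib.Path(path).open(encoding="utf-8") as source:
        stats = json.load(source)

    # Assumption: the summary is a JSON object rather than a list or scalar.
    if not isinstance(stats, dict):
        raise ValueError("Expected a JSON object at the top level")

    return {key: type(value).__name__ for key, value in sorted(stats.items())}


if __name__ == "__main__":
    import sys

    # Usage: python summarize_stats.py path/to/stats.json
    if len(sys.argv) == 2:
        for key, kind in summarize_stats(sys.argv[1]).items():
            print(f"{key}: {kind}")
```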
Interactive tools
Written in Sketchingpy, the tools can be executed locally on your computer, in a static context for building the paper, or through a web browser. First, one needs to get data from the pipeline or download prior results:
- Download prior results: Retrieve the latest results and move them into the viz directory (paper/viz/data). Simply use wget to gather model outputs when in the paper/viz directory as so: wget https://ag-adaptation-study.pub/archive/data.zip; unzip data.zip. If using prior sweep results, download full sweep information like so: cd data; wget http://ag-adaptation-study.pub/data/sweep_ag_all.csv; cd ..
- Use your own results: Update the output data per instructions regarding local package below.
There are two options for executing the tools:
- Docker: You can run the web-based visualizations through a simple Docker file in the paper/viz directory (bash run_docker.sh).
- Local apps: You can execute the visualizations manually by running them directly as Python scripts. The entry points are hist_viz.py, history_viz.py, results_viz_entry.py, and sweep_viz.py. Simply run them without any command line arguments for defaults. Note you may need to install Python dependencies (pip install -r requirements.txt).
Note that the visualizations are also invoked through paper/viz/render_images.sh for the paper.
Paper
Due to the complexities of the software install, the only officially supported way to build the paper is through the Docker image. First update the data:
- Download prior results: Retrieve the latest results and move them into the paper directory (paper/outputs).
- Use your own results: Update the output data per instructions regarding local package below.
Then, execute render_docker.sh to drop the results into the paper_rendered directory.
Local package
Instead of retrieving data from https://ag-adaptation-study.pub, you can use your own pipeline data outputs by running bash package.sh. This will produce the data and outputs sub-directories inside of a new package directory which can be used for the interactive tools and paper rendering respectively.
Alternative: Manual setup
If you prefer not to use the dev container, you can manually set up each component following the individual setup instructions below, though this requires more configuration steps.
Examples
The following section provides a "cookbook" of examples showing how to use these tools in the most common scenarios.
Review existing results
To review the current outputs, simply navigate to https://ag-adaptation-study.pub. No additional software is required.
Run interactive tools
To use the existing outputs and run the interactive tools locally after cloning this repository, gather the data and run the visualization scripts.
$ cd paper/viz
$ wget https://ag-adaptation-study.pub/archive/data.zip
$ unzip data.zip
$ cd data
$ wget http://ag-adaptation-study.pub/data/sweep_ag_all.csv
$ cd ..
$ pip install -r requirements.txt
$ python hist_viz.py
This runs the histogram visualization, but hist_viz.py, history_viz.py, rates_viz.py, results_viz_entry.py, and sweep_viz.py are all available.
Execute the pipeline locally via Docker
The following will execute the entire pipeline locally after having placed SCYM and CHC-CMIP6 in a local directory (assumed to be path/to/data below).
$ export USE_AWS=0
$ export SOURCE_DATA_LOC=path/to/data
$ bash run_docker.sh
Testing
As part of CI / CD and for local development, the following are required to pass for both the pipeline in the root of this repository and the interactives written in Python at paper/viz:
- pyflakes: Run pyflakes *.py to check for likely non-style code issues.
- pycodestyle: Run pycodestyle *.py to enforce project coding style guidelines.
The pipeline also offers unit tests (run nose2 in the repository root). For the visualizations, tests happen by running the interactives headless (bash render_images.sh; bash script/check_images.sh).
Deployment
To deploy changes to production, CI / CD will automatically release to ag-adaptation-study.pub once merged on main.
Development standards
Where possible, please follow the Python Google Style Guide unless an override is provided in setup.cfg. Docstrings and type hints are required for all top-level or public members but not currently enforced for private members. JSDoc is required for top-level members. Docstrings / JSDoc are not required for "utility" code.
Data
We make publicly available both inputs and outputs to our modeling. Due to size, some of these are archived at ag-adaptation-study.pub while others are deposited into Zenodo.
CHC-CMIP6
Our derivative dataset from CHC-CMIP6 is available at climate.csv and our Zenodo. These are aggregated and preprocessed as used within our modeling.
USDA RMA SOB
Note that an archive of USDA Risk Management Agency (RMA) Summary of Business (SOB) data is also provided at our usdarmasob.zip. As a relatively large supplemental dataset, this is not currently in Zenodo. In addition to the original format, all SOB datasets are given in Avro format where possible with standardized formatting / encoding. A subset of these data are considered within our paper as supporting evidence. See the README within the data archive for further details.
Yield estimations (SCYM)
Our derivative SCYM yield estimations at the neighborhood level are available at Zenodo.
Model outputs
The following model outputs are made available through our website:
- export_claims.csv: Information about the claims rate under different conditions.
- sim_hist.csv: Information about simulation-wide yield distributions under different conditions.
- sweep_ag_all.csv: Information about sweep outcomes and model performance.
- tool.csv: Geographically specific information about simulation outcomes at the 4 character geohash level.
As smaller payloads, these are also included in our Zenodo.
Open source
The pipeline, interactives, and paper can be executed independently and have segregated dependencies. We thank all of our open source dependencies for their contribution.
Pipeline dependencies
The pipeline uses the following open source dependencies:
- bokeh under the BSD 3-Clause License.
- boto3 under the Apache v2 License.
- dask under the BSD 3-Clause License.
- fiona under the BSD License.
- geolib under the MIT License.
- geotiff under the LGPL License.
- imagecodecs under the BSD 3-Clause License.
- keras under the Apache v2 License.
- libgeohash under the MIT License.
- Luigi under the Apache v2 License.
- NumPy under the BSD License.
- Pandas under the BSD License.
- Pathos under the BSD License.
- requests under the Apache v2 License.
- scipy under the BSD License.
- shapely by Sean Gillies, Casper van der Wel, and Shapely Contributors under the BSD License.
- tensorflow under the Apache v2 License.
- toolz under the BSD License.
Use of Coiled is optional.
Tools and visualizations
Both the interactives and static visualization generation use the following:
- Jinja under the BSD 3-Clause License.
- Matplotlib under the PSF License.
- NumPy under the BSD License.
- Pandas under the BSD License.
- Pillow under the HPND License.
- pygame-ce under the LGPL License.
- Sketchingpy under the BSD 3-Clause License.
- toolz under the BSD License.
The web version also uses:
- es.js under the ISC License (Andrea Giammarchi).
- micropip under the MPL 2.0 License.
- packaging under the BSD License.
- Popper under the MIT License.
- Pyodide under the MPL 2.0 License.
- Pyscript under the Apache v2 License.
- Sketchingpy under the BSD 3-Clause License.
- Tabby under the MIT License.
- Tippy.js under the MIT License.
- toml (Jak Wings) under the MIT License.
- ua-parser 1.0.36 under the MIT License.
Paper
The paper uses the following open source dependencies to build the manuscript:
- Jinja under the BSD 3-Clause License.
- Matplotlib under the PSF License.
- NumPy under the BSD License.
- Pandas under the BSD License.
- Pillow under the HPND License.
- Sketchingpy under the BSD License including the packages included in its stand alone hosting archive.
- toolz under the BSD License.
Users may optionally leverage Pandoc as an executable (not linked) under the GPL but any tool converting markdown to other formats is acceptable or the paper can be built as Markdown only without Pandoc. That said, for those using Pandoc, scripts may also use pandoc-fignos under the GPL License and pandoc-tablenos under the GPL License.
Other runtime dependencies
Some executions may also use:
- Docker under the Apache v2 License.
- Docker Compose under the Apache v2 License.
- Nginx under a BSD-like License.
- OpenRefine under the BSD License.
Other sources
We also use:
- Color Brewer under the Apache v2 License.
- Public Sans under the CC0 License.
GitHub Copilot was used for some post-publication steps, largely to prepare materials for presentations.
License
Code is released under BSD 3-Clause and data under CC-BY-NC 4.0 International. Please see LICENSE.md.
Owner
- Name: DSE
- Login: SchmidtDSE
- Kind: organization
- Email: dse@berkeley.edu
- Location: United States of America
- Website: https://dse.berkeley.edu/
- Repositories: 7
- Profile: https://github.com/SchmidtDSE
The Eric and Wendy Schmidt Center for Data Science & Environment at Berkeley
Citation (CITATION.cff)
cff-version: '1.1.0'
message: 'Please cite the following works to reference this software.'
authors:
  - family-names: "Pottinger"
    given-names: "A. Samuel"
    orcid: "https://orcid.org/0000-0002-0458-4985"
  - family-names: "Connor"
    given-names: "Lawson"
    orcid: "https://orcid.org/0000-0001-5951-5752"
  - family-names: "Guzder-Williams"
    given-names: "Brookie"
    orcid: "https://orcid.org/0000-0001-6855-8260"
  - family-names: "Weltman-Fahs"
    given-names: "May"
  - family-names: "Gondek"
    given-names: "Nick"
    orcid: "https://orcid.org/0009-0007-7431-4669"
  - family-names: "Bowles"
    given-names: "Timothy"
    orcid: "https://orcid.org/0000-0002-4840-3787"
doi: '10.48550/ARXIV:2408.02217'
identifiers:
  - type: 'doi'
    value: '10.52933/jdssv.v5i3.134'
  - type: 'url'
    value: 'https://jdssv.org/index.php/jdssv/article/view/134'
title: 'Climate-driven doubling of U.S. maize loss probability: Interactive simulation with neural network Monte Carlo'
url: 'https://jdssv.org/index.php/jdssv/article/view/134'
GitHub Events
Total
- Issues event: 7
- Watch event: 1
- Delete event: 80
- Issue comment event: 8
- Push event: 617
- Pull request review event: 4
- Pull request review comment event: 2
- Pull request event: 132
- Create event: 68
Last Year
- Issues event: 7
- Watch event: 1
- Delete event: 80
- Issue comment event: 8
- Push event: 617
- Pull request review event: 4
- Pull request review comment event: 2
- Pull request event: 132
- Create event: 68
Dependencies
- Creepios/sftp-action v1.0.3 composite
- actions/checkout v4 composite
- actions/checkout v3 composite
- actions/download-artifact v3 composite
- actions/setup-python v4 composite
- actions/upload-artifact v3 composite
- ubuntu noble-20240605 build
- ubuntu noble-20240605 build
- ubuntu noble-20240605 build
- maizeviz latest
- Pillow *
- jinja2 *
- matplotlib *
- numpy *
- pandas *
- pandoc-fignos *
- pandoc-tablenos *
- sketchingpy *
- toolz *
- Pillow *
- jinja2 *
- matplotlib *
- numpy *
- pandas *
- pygame-ce *
- sketchingpy *
- toolz *
- bokeh *
- boto3 *
- coiled *
- dask *
- fiona *
- geolib *
- geotiff *
- imagecodecs *
- keras ==3.1.1
- libgeohash *
- luigi *
- numpy *
- pandas *
- pathos *
- requests *
- scipy ==1.12.0
- shapely *
- tensorflow ==2.16.1
- toolz *