epigeopop

Snakemake workflow to generate country-specific population density data, for use in epidemiological modeling

https://github.com/sabs-r3-epidemiology/epigeopop

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.5%) to scientific vocabulary

Keywords

agent-based-modeling epidemiology population-density-maps
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: SABS-R3-Epidemiology
  • License: bsd-3-clause
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 24 MB
Statistics
  • Stars: 1
  • Watchers: 2
  • Forks: 0
  • Open Issues: 1
  • Releases: 1
Created about 3 years ago · Last pushed 10 months ago
Metadata Files
Readme License Citation

README.md

EpiGeoPop

This repository is a Snakemake workflow for generating population density data for arbitrary countries. It uses population data from the JRC Big Data Analytics Platform and border data from Natural Earth, and is partially based on Adam Symington's excellent blog post. The workflow was motivated by the goal of extending epiabm to other countries.

The workflow generates population density files that look like:

Luxembourg heatmap

and can generate figures of simulations like:

Luxembourg time grid

or animations like:

Luxembourg time animation

Running

There are two parts to EpiGeoPop:

  1. A Snakemake pipeline which creates population density files to be fed into an epidemiological simulation
  2. A script to process simulation outputs and generate visualizations

The generated input files and expected output files for EpiGeoPop are based on epiabm. Briefly, the generated input files are CSVs which contain the following fields:

  • cell: The cell index for a particular tile
  • microcell: The microcell index within a cell
  • location_x: The longitude of the cell
  • location_y: The latitude of the cell
  • household_number: The number of households in the microcell
  • place_number: The number of non-residential places in the microcell
  • Susceptible: The number of (initially susceptible) people in the microcell

For more information about these files, including a description of microcells, please refer to the epiabm repository at https://github.com/SABS-R3-Epidemiology/epiabm.
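As an illustration of the file layout described above, the following minimal sketch reads a microcell CSV with Python's standard csv module and totals the Susceptible column per cell. The sample rows are made up for the example, and the helper name susceptible_per_cell is hypothetical; the exact header spelling should be checked against a file generated by the pipeline.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows for illustration only; a real file would be produced
# by the pipeline (for example under data/processed/countries/).
SAMPLE_CSV = """cell,microcell,location_x,location_y,household_number,place_number,Susceptible
0,0,6.13,49.61,12,2,30
0,1,6.14,49.61,8,1,21
1,0,6.20,49.70,5,0,11
"""


def susceptible_per_cell(csv_file):
    """Sum the Susceptible column over all microcells of each cell."""
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        # int(float(...)) tolerates values written as "30" or "30.0".
        cell = int(float(row["cell"]))
        totals[cell] += int(float(row["Susceptible"]))
    return dict(totals)


totals = susceptible_per_cell(io.StringIO(SAMPLE_CSV))
print(totals)  # {0: 51, 1: 11}
```

To run the same check on a generated file, the StringIO object can be replaced by an open file handle, for example open("data/processed/countries/Luxembourg_microcells.csv", newline="").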

Creating simulation inputs

The following shows how to set up and run the Snakemake pipeline. By default, it creates the files for running a Luxembourg simulation, but the Snakefile can be modified to generate files for many countries, provinces/states, or cities.

Clone the repository

git clone git@github.com:SABS-R3-Epidemiology/EpiGeoPop.git
cd EpiGeoPop

Create virtual environment (recommended)

python -m venv venv
source venv/bin/activate

Install dependencies

pip install -r requirements.txt

Download the raw data (see data/README.md for more information)

bash prep.sh

Run the snakemake pipeline

snakemake --cores 1

Exploring the data

Check the outputs directory for example population density maps. The image outputs/dag.svg shows the entire workflow. The file data/processed/countries/Luxembourg_microcells.csv contains the generated microcells, used for input to simulations such as epiabm. The file data/processed/countries/Luxembourg_pop_dist.json contains the age distribution of populations.
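A quick way to confirm that the pipeline finished is to check that the files named above exist. The sketch below does this with pathlib; the helper name missing_outputs is hypothetical, and the assumption that other regions follow the same `<region>_microcells.csv` naming in the same directory is not confirmed by the text, which only describes the Luxembourg outputs.

```python
from pathlib import Path


def missing_outputs(repo_root, region="Luxembourg"):
    """Return the expected output files that are not present under repo_root."""
    root = Path(repo_root)
    countries = root / "data" / "processed" / "countries"
    expected = [
        root / "outputs" / "dag.svg",
        countries / f"{region}_microcells.csv",
        countries / f"{region}_pop_dist.json",
    ]
    return [path for path in expected if not path.is_file()]


if __name__ == "__main__":
    # Run from the repository root after the pipeline has completed.
    for path in missing_outputs("."):
        print(f"missing: {path}")
```

An empty list means all three files were found.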

Running on other regions

The Snakefile contains commented-out examples showing how to generate files for other countries, provinces, and cities. Each region also requires a configuration file that sets factors such as the age distribution of the population. Boilerplate configurations can be copied from similar files in the configs directory.

Generating animations

The file make_gif.py in data/sim_outputs is used for making GIFs and grids from simulation output data. To use it, add the simulation output file to data/sim_outputs, and run python make_gif.py -f [filename].csv. The resulting animation and grid of time snapshots will be stored in data/sim_outputs/animation. An example for Winnipeg (Canada) is provided in this repository and can be run with python make_gif.py -f output_winnipeg.csv from within the data/sim_outputs directory.

Two optional arguments can be provided when running the script: --duration controls the time per frame in milliseconds (default 100 ms), and --dpi sets the image resolution (default 300 dpi). For example: python make_gif.py -f output_winnipeg.csv --duration 100 --dpi 300.
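To make the option defaults concrete, here is a minimal argparse sketch of a command line with the same flags. It is not the source of make_gif.py: the destination name filename, the integer types, and treating -f as required are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser mirroring the flags described for make_gif.py (a sketch)."""
    parser = argparse.ArgumentParser(
        description="Sketch of a make_gif.py-style command line"
    )
    # Assumption: -f is required and names the simulation output CSV.
    parser.add_argument("-f", dest="filename", required=True,
                        help="simulation output CSV")
    parser.add_argument("--duration", type=int, default=100,
                        help="time per frame in milliseconds")
    parser.add_argument("--dpi", type=int, default=300,
                        help="image resolution in dots per inch")
    return parser


# Omitting the optional flags falls back to the documented defaults.
args = build_parser().parse_args(["-f", "output_winnipeg.csv"])
print(args.filename, args.duration, args.dpi)  # output_winnipeg.csv 100 300
```

With this layout, passing --duration 100 --dpi 300 explicitly is equivalent to omitting both flags.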

Owner

  • Name: SABS-R3-Epidemiology
  • Login: SABS-R3-Epidemiology
  • Kind: organization

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: EpiGeoPop
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Isaac
    family-names: Ellmen
    affiliation: 'Doctoral Training Centre, University of Oxford'
  - given-names: Ioana
    family-names: Bouros
    affiliation: 'Doctoral Training Centre, University of Oxford'
  - given-names: Kit
    family-names: Gallagher
    affiliation: 'Doctoral Training Centre, University of Oxford'
    email: gallagher@maths.ox.ac.uk
    orcid: 'https://orcid.org/0000-0003-1401-115X'
identifiers:
  - type: doi
    value: 10.5281/zenodo.14112520
    description: The Zenodo DOI of the latest version
repository-code: 'https://github.com/SABS-R3-Epidemiology/EpiGeoPop'
abstract: >-
  Snakemake workflow to generate country-specific population
  density data, for use in epidemiological modeling
license: BSD-3-Clause

GitHub Events

Total
  • Create event: 4
  • Issues event: 1
  • Release event: 2
  • Watch event: 1
  • Delete event: 1
  • Issue comment event: 3
  • Push event: 9
  • Pull request event: 2
Last Year
  • Create event: 4
  • Issues event: 1
  • Release event: 2
  • Watch event: 1
  • Delete event: 1
  • Issue comment event: 3
  • Push event: 9
  • Pull request event: 2

Issues and Pull Requests

All Time
  • Total issues: 1
  • Total pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: 2 days
  • Total issue authors: 1
  • Total pull request authors: 1
  • Average comments per issue: 0.0
  • Average comments per pull request: 1.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: 2 days
  • Issue authors: 1
  • Pull request authors: 1
  • Average comments per issue: 0.0
  • Average comments per pull request: 1.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • anitaapplegarth (1)
Pull Request Authors
  • Ellmen (1)