epimargin

epimargin: A Toolkit for Epidemiological Estimation, Prediction, and Policy Evaluation - Published in JOSS (2021)

https://github.com/covid-iwg/epimargin

Science Score: 95.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in JOSS metadata
  • Academic publication links
  • Committers with academic emails
    1 of 6 committers (16.7%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

bayesian-methods covid19 india sird-model stochastic-modeling

Keywords from Contributors

mesh

Scientific Fields

Mathematics Computer Science - 88% confidence
Artificial Intelligence and Machine Learning Computer Science - 69% confidence
Earth and Environmental Sciences Physical Sciences - 64% confidence
Last synced: 4 months ago

Repository

networked, stochastic SIRD epidemiological model with Bayesian parameter estimation and policy scenario comparison tools

Basic Info
Statistics
  • Stars: 10
  • Watchers: 1
  • Forks: 5
  • Open Issues: 42
  • Releases: 1
Created over 5 years ago · Last pushed over 2 years ago
Metadata Files
Readme License

README.md

epimargin


a public health policy analysis toolkit consisting of:

1. Bayesian simulated annealing estimator for the reproductive rate (Rt)
2. a stochastic compartmental model class supporting multiple compartment schemes (SIR, SEIR, SIRV, subpopulations, etc.)
3. policy impact evaluation that calculates longevity benefits and economic activity disruption/resumption under scenarios including:
    - targeted lockdowns
    - urban-rural migration and reverse-migration flows
    - multiple vaccine allocation prioritizations

For examples of how the package is used, see the docs folder (featuring a toy example and a full tutorial), or the epimargin-studies repository.

installation

The epimargin package is available on PyPI and can be installed via pip:

pip install epimargin

support and issues

Please file an issue on GitHub if you run into any problems with the software.

contributions and development

Contributions are always welcome! Please fork the repository and open a new pull request for any new features.

For development, we recommend installing the development dependencies and then installing the package in editable mode:

git clone https://github.com/COVID-IWG/epimargin
cd epimargin

pip install -r requirements.txt
pip install -e . 

We also recommend using a virtual environment for development.

tutorial

In this tutorial, we will download a timeseries of daily confirmed COVID-19 cases in Mumbai from COVID19India.org, estimate the reproductive rate for the city over time, plug these estimates into a compartmental model, and compare two policy scenarios by running the compartmental model forward. The entire tutorial can be found in the docs/tutorial directory.

1. setup

After installing the package, import some commonly-used tools and set up convenience functions/variables:

from itertools import cycle

import epimargin.plots as plt
import numpy as np
import pandas as pd
from epimargin.utils import setup

(data, figs) = setup() # creates convenient directories
plt.set_theme("minimal")

2. download and clean data

Next, download data on daily COVID-19 cases in the city of Mumbai from COVID19India.org. The data are noisy, so we apply a notch filter to remove weekly reporting artifacts and then smooth the result with a convolution:

from epimargin.etl import download_data
from epimargin.smoothing import notched_smoothing

download_data(data, "districts.csv", "https://api.covid19india.org/csv/latest/") 

# load raw data
daily_reports = pd.read_csv(data / "districts.csv", parse_dates = ["Date"])\
    .rename(str.lower, axis = 1)\
    .set_index(["state", "district", "date"])\
    .sort_index()\
    .loc["Maharashtra", "Mumbai"]
daily_cases = daily_reports["confirmed"]\
    .diff()\
    .clip(lower = 0)\
    .dropna()

# smooth/notch-filter timeseries
smoother = notched_smoothing(window = 5)
smoothed_cases = pd.Series(
    data  = smoother(daily_cases),
    index = daily_cases.index
)

# plot raw and cleaned data 
beg = "December 15, 2020"
end = "March 1, 2021"
training_cases = smoothed_cases[beg:end]

plt.scatter(daily_cases[beg:end].index, daily_cases[beg:end].values, color = "black", s = 5, alpha = 0.5, label = "raw case count data")
plt.plot(training_cases.index, training_cases.values, color = "black", linewidth = 2, label = "notch-filtered, smoothed case count data")
plt.PlotDevice()\
    .l_title("case timeseries for Mumbai")\
    .axis_labels(x = "date", y = "daily cases")\
    .legend()\
    .adjust(bottom = 0.15, left = 0.15)\
    .format_xaxis()\
    .size(9.5, 6)\
    .save(figs / "fig_1.svg")\
    .show()

raw and cleaned case count timeseries data

NOTE: a copy of the reference timeseries for all districts available through the API is checked into the docs/data folder in case you run into download issues or if the upstream API changes.
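
To illustrate the idea behind the cleaning step, here is a minimal sketch of notch filtering followed by a moving-average convolution, written directly against SciPy. It is not the implementation behind `notched_smoothing`; the function name `notch_and_smooth`, the quality factor, and the synthetic series are assumptions made for illustration.

```python
import numpy as np
from scipy.signal import filtfilt, iirnotch


def notch_and_smooth(daily_counts, window=5, quality=1.0):
    """Suppress a 7-day periodic component, then apply a moving average.

    Hypothetical helper for illustration; not part of epimargin.
    """
    counts = np.asarray(daily_counts, dtype=float)
    # One sample per day, so a weekly reporting cycle sits at 1/7 cycles per day.
    b, a = iirnotch(w0=1 / 7, Q=quality, fs=1.0)
    # filtfilt runs the filter forward and backward, so no phase lag is introduced.
    notched = filtfilt(b, a, counts)
    # Moving-average kernel; mode="same" keeps the output length equal to the input.
    kernel = np.ones(window) / window
    smoothed = np.convolve(notched, kernel, mode="same")
    # Case counts cannot be negative.
    return np.clip(smoothed, 0, None)


# Synthetic example: a flat baseline with a pure weekly oscillation.
days = np.arange(70)
raw = 100 + 30 * np.sin(2 * np.pi * days / 7)
cleaned = notch_and_smooth(raw)
print(raw.std(), cleaned[10:-10].std())
```

The first and last few values are affected by zero-padding in the convolution, which is why the comparison above looks only at the interior of the series.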

3. estimate the reproductive rate, Rt

From these data, we can estimate the reproductive rate, or the number of secondary infections caused by a single active infection. A pandemic is under control if the reproductive rate stays below 1. A number of estimation procedures are provided; we show the Bettencourt/Soman estimator as an example:

from epimargin.estimators import analytical_MPVS

(dates, Rt, Rt_CI_upper, Rt_CI_lower, *_) = analytical_MPVS(training_cases, smoother, infectious_period = 10, totals = False)
plt.Rt(dates[1:], Rt[1:], Rt_CI_upper[1:], Rt_CI_lower[1:], 0.95, legend_loc = "upper left")\
    .l_title("$R_t$ over time for Mumbai")\
    .axis_labels(x = "date", y = "reproductive rate")\
    .adjust(bottom = 0.15, left = 0.15)\
    .size(9.5, 6)\
    .save(figs / "fig_2.svg")\
    .show()

estimated reproductive rate over time
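
For intuition about what the estimator is measuring, the sketch below computes a naive point estimate of Rt from the day-over-day growth of smoothed cases. It assumes the exponential-growth relation in which expected new cases satisfy cases[t+1] = cases[t] * exp(gamma * (Rt - 1)), with gamma = 1 / infectious_period. This is a simplification for illustration: `naive_rt` is a hypothetical helper, not part of epimargin, and unlike `analytical_MPVS` it does no Bayesian updating and gives no credible intervals.

```python
import numpy as np


def naive_rt(smoothed_cases, infectious_period=10):
    """Point estimate of Rt from consecutive daily case counts.

    Hypothetical helper; requires strictly positive case counts.
    """
    cases = np.asarray(smoothed_cases, dtype=float)
    gamma = 1.0 / infectious_period
    # Invert cases[t+1] = cases[t] * exp(gamma * (Rt - 1)) for Rt.
    log_growth = np.log(cases[1:] / cases[:-1])
    return 1.0 + log_growth / gamma


# Cases growing 5% per day (in log terms) with a 10-day infectious period.
growing = 200 * np.exp(0.05 * np.arange(14))
print(naive_rt(growing))
```

Flat case counts give an estimate of exactly 1, which matches the threshold described above: growth corresponds to Rt above 1 and decline to Rt below 1.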

4. set up a model and run it forward to compare policy scenarios

Finally, we can use the case count data and estimated reproductive rate to project cases forward. We also show how the input data can be modified to test hypotheses about specific policies. For example, you might expect a lockdown policy to reduce the reproductive rate by 25% given historical mobility data or lockdown stringency indices. Assuming a successful reduction in Rt, what does the trajectory of daily cases look like?

from epimargin.models import SIR

num_sims = 100
N0 = 12.48e6
R0, D0 = daily_reports.loc[end][["recovered", "deceased"]]
I0  = smoothed_cases[:end].sum()
dT0 = smoothed_cases[end]
S0  = N0 - I0 - R0 - D0
Rt0 = Rt[-1] * N0 / S0
no_lockdown = SIR(
    name = "no lockdown", 
    population = N0, 
    dT0 = np.ones(num_sims) * dT0,
    Rt0 = np.ones(num_sims) * Rt0,
    I0  = np.ones(num_sims) * I0,
    R0  = np.ones(num_sims) * R0,
    D0  = np.ones(num_sims) * D0,
    S0  = np.ones(num_sims) * S0,
    infectious_period = 10
)
lockdown = SIR(
    name = "partial lockdown", 
    population = N0, 
    dT0 = np.ones(num_sims) * dT0,
    Rt0 = np.ones(num_sims) * 0.75 * Rt0,
    I0  = np.ones(num_sims) * I0,
    R0  = np.ones(num_sims) * R0,
    D0  = np.ones(num_sims) * D0,
    S0  = np.ones(num_sims) * S0,
    infectious_period = 10
)

# run models forward 
simulation_range = 7
for _ in range(simulation_range):
    lockdown   .parallel_forward_epi_step(num_sims = num_sims)
    no_lockdown.parallel_forward_epi_step(num_sims = num_sims)

# compare policies 
test_cases = smoothed_cases["February 15, 2021":pd.Timestamp(end) + pd.Timedelta(days = simulation_range)]
date_range = pd.date_range(start = end, periods = simulation_range + 1, freq = "D")
legend_entries = [plt.predictions(date_range, model, color) for (model, color) in zip([lockdown, no_lockdown], cycle(plt.SIM_PALETTE))]
train_marker, = plt.plot(test_cases[:end].index, test_cases[:end].values, color = "black")
test_marker,  = plt.plot(test_cases[end:].index, test_cases[end:].values, color = "black", linestyle = "dotted")
markers, _ = zip(*legend_entries)
plt.PlotDevice()\
    .l_title("projected case counts")\
    .axis_labels(x = "date", y = "daily cases")\
    .legend(
        [train_marker, test_marker] + list(markers),
        ["case counts (training)", "case counts (actual)", "case counts (partial lockdown; 95% simulation range)", "case counts (no lockdown; 95% simulation range)"],
        loc = "upper left"
    )\
    .adjust(bottom = 0.15, left = 0.15)\
    .size(9.5, 6)\
    .format_xaxis()\
    .save(figs / "fig_3.svg")\
    .show()

policy projections over time

The median projections from the no-lockdown model (crimson) mirror the observed timeseries (dotted black) fairly well, and the model predicts that even an imperfect lockdown would have changed the trajectory of the pandemic over the period we examined. Because the model is stochastic, we show a range of outcomes (shaded) and note that model accuracy decreases as the projection horizon lengthens. In real-time settings, we encourage updating projections daily to incorporate new data.
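
To make the forward-projection step concrete, here is a minimal sketch of a stochastic SIR update run across many simulations at once. It is not the code behind `SIR.parallel_forward_epi_step`; the helpers `sir_step` and `project`, the Poisson and binomial draws, the mortality rate, and all starting values are assumptions chosen for illustration. It also shows why the tutorial scales the last estimate by N0 / S0: the effective rate at each step is taken to be Rt0 * S / N, so the scaled Rt0 reproduces the estimated Rt on the first day.

```python
import numpy as np


def sir_step(state, rt0, population, infectious_period=10, mortality=0.02, rng=None):
    """Advance (S, I, R, D) arrays by one day; one array entry per simulation."""
    rng = np.random.default_rng() if rng is None else rng
    S, I, R, D = state
    gamma = 1.0 / infectious_period
    # Effective reproductive rate falls as the susceptible pool shrinks.
    rt = rt0 * S / population
    # New infections cannot exceed the remaining susceptibles.
    new_cases = np.minimum(rng.poisson(rt * gamma * I), S)
    # Infections resolve at rate gamma and cannot exceed current infections.
    resolved = np.minimum(rng.poisson(gamma * I), I).astype(np.int64)
    deaths = rng.binomial(resolved, mortality)
    recoveries = resolved - deaths
    next_state = (S - new_cases, I + new_cases - resolved, R + recoveries, D + deaths)
    return next_state, new_cases


def project(rt0, days=7, num_sims=100, seed=0):
    """Run num_sims independent trajectories from hypothetical starting values."""
    rng = np.random.default_rng(seed)
    population = 1_000_000.0
    I = np.full(num_sims, 5_000.0)
    R = np.full(num_sims, 50_000.0)
    D = np.full(num_sims, 1_000.0)
    state = (population - I - R - D, I, R, D)
    daily = []
    for _ in range(days):
        state, new_cases = sir_step(state, rt0, population, rng=rng)
        daily.append(new_cases)
    return state, np.array(daily)  # daily has shape (days, num_sims)


_, baseline = project(rt0=1.2)
_, reduced = project(rt0=0.75 * 1.2)  # a 25% reduction in Rt
print(np.median(baseline[-1]), np.median(reduced[-1]))
print(np.percentile(baseline[-1], [2.5, 97.5]))
```

Each simulation draws its own random counts, so the spread across the `num_sims` columns gives a simulation range comparable in spirit to the shaded bands in the figure.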

Owner

  • Name: COVID International Working Group
  • Login: COVID-IWG
  • Kind: organization

A coalition of epidemiologists, economists, policymakers, and data scientists focused on tackling the challenges of COVID, globally.

JOSS Publication

epimargin: A Toolkit for Epidemiological Estimation, Prediction, and Policy Evaluation
Published
September 09, 2021
Volume 6, Issue 65, Page 3464
Authors
Satej Soman ORCID
Mansueto Institute for Urban Innovation, University of Chicago
Caitlin Loftus
Mansueto Institute for Urban Innovation, University of Chicago
Steven Buschbach
Mansueto Institute for Urban Innovation, University of Chicago
Manasi Phadnis
Mansueto Institute for Urban Innovation, University of Chicago
Luís M. A. Bettencourt ORCID
Mansueto Institute for Urban Innovation, University of Chicago; Department of Ecology & Evolution, University of Chicago; Department of Sociology, University of Chicago; Santa Fe Institute
Editor
Nikoleta Glynatsi ORCID
Tags
epidemiology stochastic processes economics COVID-19 Bayesian inference

GitHub Events

Total
  • Watch event: 1
Last Year
  • Watch event: 1

Committers

Last synced: 5 months ago

All Time
  • Total Commits: 346
  • Total Committers: 6
  • Avg Commits per committer: 57.667
  • Development Distribution Score (DDS): 0.124
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
satej soman s****n@g****m 303
dependabot[bot] 4****] 16
Caitlin Loftus c****s@g****m 14
Nicholas Marchio n****o@g****m 10
Steven Buschbach s****h@u****u 2
manasip5 6****5 1
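
The summary figures above can be reproduced from the commit counts in the table. The sketch below assumes the Development Distribution Score is defined as one minus the top committer's share of all commits. That definition is an assumption, not something stated on this page, though it is consistent with the reported value.

```python
# Commit counts copied from the Top Committers table.
commits = {
    "satej soman": 303,
    "dependabot[bot]": 16,
    "Caitlin Loftus": 14,
    "Nicholas Marchio": 10,
    "Steven Buschbach": 2,
    "manasip5": 1,
}

total_commits = sum(commits.values())
avg_per_committer = total_commits / len(commits)
# Assumed definition: 1 - (commits by the top committer / total commits).
dds = 1 - max(commits.values()) / total_commits

print(total_commits, round(avg_per_committer, 3), round(dds, 3))
```

A score near 0 means that a single committer accounts for almost all commits.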

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 18
  • Total pull requests: 86
  • Average time to close issues: 29 days
  • Average time to close pull requests: 9 days
  • Total issue authors: 2
  • Total pull request authors: 5
  • Average comments per issue: 0.67
  • Average comments per pull request: 0.07
  • Merged pull requests: 68
  • Bot issues: 0
  • Bot pull requests: 30
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • dilawar (9)
  • satejsoman (9)
Pull Request Authors
  • satejsoman (51)
  • dependabot[bot] (30)
  • ceilingloft (2)
  • dilawar (2)
  • stevenbuschbach (1)
Top Labels
Issue Labels
Pull Request Labels
dependencies (30)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 20 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 1
  • Total versions: 8
  • Total maintainers: 1
pypi.org: epimargin

Toolkit for estimating epidemiological metrics and evaluating public health and economic policies.

  • Versions: 8
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 20 Last month
Rankings
Dependent packages count: 10.1%
Forks count: 14.2%
Stargazers count: 17.7%
Average: 19.6%
Dependent repos count: 21.6%
Downloads: 34.6%
Maintainers (1)
Last synced: 4 months ago

Dependencies

requirements.txt pypi
  • Cython ==0.29.21
  • Fiona ==1.8.17
  • Flask ==1.1.2
  • GDAL ==3.1.3
  • Jinja2 ==2.11.3
  • MarkupSafe ==1.1.1
  • Pillow ==8.3.2
  • PyYAML ==5.4.1
  • Pygments ==2.7.4
  • Shapely ==1.7.0
  • Theano ==1.0.5
  • Werkzeug ==1.0.1
  • appnope ==0.1.0
  • arviz ==0.9.0
  • astroid ==2.4.1
  • attrs ==19.3.0
  • backcall ==0.1.0
  • beautifulsoup4 ==4.9.3
  • bleach ==3.3.0
  • bump2version ==1.0.1
  • bumpversion ==0.6.0
  • certifi ==2020.6.20
  • cftime ==1.1.3
  • chardet ==3.0.4
  • click ==7.1.2
  • click-plugins ==1.1.1
  • cligj ==0.5.0
  • cloudpickle ==1.6.0
  • colorama ==0.4.3
  • commonmark ==0.9.1
  • cycler ==0.10.0
  • dask ==2021.4.1
  • decorator ==4.4.2
  • descartes ==1.1.0
  • docutils ==0.17.1
  • fastprogress ==0.2.3
  • flake8 ==3.8.3
  • flat-table ==1.1.1
  • fsspec ==2021.4.0
  • geopandas ==0.8.1
  • geos ==0.2.2
  • h5py ==2.10.0
  • idna ==2.9
  • importlib-metadata ==4.5.0
  • ipython ==7.16.3
  • ipython-genutils ==0.2.0
  • isort ==4.3.21
  • itsdangerous ==1.1.0
  • jedi ==0.17.0
  • joblib ==0.14.1
  • keyring ==23.0.1
  • kiwisolver ==1.2.0
  • lazy-object-proxy ==1.4.3
  • linearmodels ==4.19
  • locket ==0.2.1
  • lxml ==4.6.5
  • mapclassify ==2.4.2
  • matplotlib ==3.2.1
  • mccabe ==0.6.1
  • munch ==2.5.0
  • mypy ==0.910
  • mypy-extensions ==0.4.3
  • netCDF4 ==1.5.3
  • networkx ==2.5.1
  • numpy ==1.21.0
  • packaging ==20.4
  • pandas ==1.0.3
  • parso ==0.7.0
  • partd ==1.2.0
  • patsy ==0.5.1
  • pexpect ==4.8.0
  • pickleshare ==0.7.5
  • pkginfo ==1.7.0
  • pprintpp ==0.4.0
  • prompt-toolkit ==3.0.5
  • property-cached ==1.6.4
  • ptyprocess ==0.6.0
  • pycodestyle ==2.6.0
  • pyflakes ==2.2.0
  • pyhdfe ==0.1.0
  • pylint ==2.5.2
  • pymc3 ==3.9.3
  • pyparsing ==2.4.7
  • pyproj ==2.6.0
  • pyreadr ==0.4.0
  • python-dateutil ==2.8.1
  • pytz ==2019.3
  • readme-renderer ==29.0
  • requests ==2.23.0
  • requests-toolbelt ==0.9.1
  • rfc3986 ==1.5.0
  • rich ==2.3.0
  • rope ==0.17.0
  • scikit-learn ==0.22.2.post1
  • scipy ==1.4.1
  • seaborn ==0.10.0
  • semver ==2.9.0
  • six ==1.14.0
  • sklearn ==0.0
  • soupsieve ==2.2.1
  • statsmodels ==0.11.1
  • tikzplotlib ==0.9.4
  • toml ==0.10.1
  • toolz ==0.11.1
  • tqdm ==4.45.0
  • traitlets ==4.3.3
  • twine ==3.4.1
  • typed-ast ==1.4.1
  • typing-extensions ==3.7.4.2
  • urllib3 ==1.26.5
  • urlpath ==1.1.7
  • wcwidth ==0.1.9
  • webencodings ==0.5.1
  • wrapt ==1.12.1
  • xarray ==0.15.1
  • xlrd ==1.2.0
  • zipp ==3.1.0
setup.py pypi
  • arviz *
  • flat-table *
  • geopandas *
  • matplotlib *
  • numpy *
  • pandas *
  • pymc3 ==3.11.2
  • requests *
  • scikit-learn *
  • seaborn *
  • semver ==2.11.0
  • statsmodels *
  • tikzplotlib *
.github/workflows/tutorial-ci.yaml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite