GPCERF - An R package for implementing Gaussian processes for estimating causal exposure response curves

GPCERF - An R package for implementing Gaussian processes for estimating causal exposure response curves - Published in JOSS (2024)

https://github.com/nsaph-software/gpcerf

Science Score: 100.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org
  • Committers with academic emails
    1 of 4 committers (25.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software
Last synced: 4 months ago

Repository

Basic Info
Statistics
  • Stars: 9
  • Watchers: 2
  • Forks: 3
  • Open Issues: 0
  • Releases: 8
Fork of boyuren158/GP-CERF
Created about 4 years ago · Last pushed over 1 year ago
Metadata Files
Readme Changelog Contributing License Code of conduct Citation

README.md



Gaussian processes for the estimation of causal exposure-response curves (GP-CERF)

Summary

GPCERF provides Gaussian process (GP) and nearest-neighbor Gaussian process (nnGP) approaches for nonparametric modeling of causal exposure-response curves.

Installation

    library("devtools")
    install_github("NSAPH-Software/GPCERF", ref="develop")
    library("GPCERF")

Usage

Note: The following examples also require the ranger R package to be installed.
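As a minimal sketch of the note above, the check below lists which packages called in the examples are not yet installed. The helper name `missing_packages` is hypothetical and not part of GPCERF; the package list is an assumption based on the calls that appear in the examples (`SuperLearner::SL.xgboost` and `SuperLearner::SL.ranger`).

```r
# Hypothetical helper (not part of GPCERF): return the names of the
# packages in `pkgs` that cannot be loaded from the current R library.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Packages assumed to be needed by the examples in this README.
needed <- c("GPCERF", "SuperLearner", "xgboost", "ranger")
to_install <- missing_packages(needed)

if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install from CRAN
} else {
  message("All packages needed for the examples are installed.")
}
```

Running this before the examples avoids a failure partway through the GPS estimation step, where the ranger-based learner is first called.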

GP

```r
library(GPCERF)
set.seed(781)
sim_data <- generate_synthetic_data(sample_size = 500, gps_spec = 1)

n_core <- 1

# SuperLearner wrappers that fix the number of threads
m_xgboost <- function(nthread = n_core, ...) {
  SuperLearner::SL.xgboost(nthread = nthread, ...)
}

m_ranger <- function(num.threads = n_core, ...) {
  SuperLearner::SL.ranger(num.threads = num.threads, ...)
}

# Estimate GPS function
gps_m <- estimate_gps(cov_mt = sim_data[, -(1:2)],
                      w_all = sim_data$treat,
                      sl_lib = c("m_xgboost", "m_ranger"),
                      dnorm_log = TRUE)

# Exposure values
q1 <- stats::quantile(sim_data$treat, 0.05)
q2 <- stats::quantile(sim_data$treat, 0.95)

w_all <- seq(q1, q2, 1)

params_lst <- list(alpha = 10 ^ seq(-2, 2, length.out = 10),
                   beta = 10 ^ seq(-2, 2, length.out = 10),
                   g_sigma = c(0.1, 1, 10),
                   tune_app = "all")

cerf_gp_obj <- estimate_cerf_gp(sim_data,
                                w_all,
                                gps_m,
                                params = params_lst,
                                outcome_col = "Y",
                                treatment_col = "treat",
                                covariates_col = paste0("cf", seq(1, 6)),
                                nthread = n_core)
summary(cerf_gp_obj)
plot(cerf_gp_obj)
```

```
GPCERF standard Gaussian grocess exposure response function object

Optimal hyper parameters(#trial: 300):
  alpha = 12.9154966501488   beta = 12.9154966501488   g_sigma = 0.1

Optimal covariate balance:
  cf1 = 0.069
  cf2 = 0.082
  cf3 = 0.063
  cf4 = 0.066
  cf5 = 0.056
  cf6 = 0.081

Original covariate balance:
  cf1 = 0.222
  cf2 = 0.112
  cf3 = 0.175
  cf4 = 0.318
  cf5 = 0.198
  cf6 = 0.257
----***----
```
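The summary above reports 300 trials, which is consistent with trying every combination of the candidate values in the hyperparameter list (10 values of alpha, 10 of beta, 3 of g_sigma). The sketch below rebuilds that grid in base R and checks that the reported optimum lies on it. It is only a reading of the printed output, assuming `tune_app = "all"` means an exhaustive search; it does not call GPCERF or describe its internals.

```r
# Candidate values, copied from the hyperparameter list in the GP example.
alpha   <- 10 ^ seq(-2, 2, length.out = 10)
beta    <- 10 ^ seq(-2, 2, length.out = 10)
g_sigma <- c(0.1, 1, 10)

# Every combination of the three hyperparameters (assumed search space).
grid <- expand.grid(alpha = alpha, beta = beta, g_sigma = g_sigma)
nrow(grid)  # 10 * 10 * 3 combinations

# Optimum as printed in the summary output above.
reported <- c(alpha = 12.9154966501488,
              beta = 12.9154966501488,
              g_sigma = 0.1)

# Check that each reported value matches one of the candidates,
# up to the rounding of the printed output.
on_grid <- c(
  alpha   = any(abs(alpha - reported[["alpha"]]) < 1e-8),
  beta    = any(abs(beta - reported[["beta"]]) < 1e-8),
  g_sigma = any(abs(g_sigma - reported[["g_sigma"]]) < 1e-8)
)
on_grid
```

Because the candidates are log-spaced, a finer grid around the selected values can be built the same way by narrowing the range passed to `seq()`.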

nnGP

```r
set.seed(781)
sim_data <- generate_synthetic_data(sample_size = 5000, gps_spec = 1)

# SuperLearner wrappers that fix the number of threads
m_xgboost <- function(nthread = 12, ...) {
  SuperLearner::SL.xgboost(nthread = nthread, ...)
}

m_ranger <- function(num.threads = 12, ...) {
  SuperLearner::SL.ranger(num.threads = num.threads, ...)
}

# Estimate GPS function
gps_m <- estimate_gps(cov_mt = sim_data[, -(1:2)],
                      w_all = sim_data$treat,
                      sl_lib = c("m_xgboost", "m_ranger"),
                      dnorm_log = TRUE)

# Exposure values
q1 <- stats::quantile(sim_data$treat, 0.05)
q2 <- stats::quantile(sim_data$treat, 0.95)

w_all <- seq(q1, q2, 1)

params_lst <- list(alpha = 10 ^ seq(-2, 2, length.out = 10),
                   beta = 10 ^ seq(-2, 2, length.out = 10),
                   g_sigma = c(0.1, 1, 10),
                   tune_app = "all",
                   n_neighbor = 50,
                   block_size = 1e3)

cerf_nngp_obj <- estimate_cerf_nngp(sim_data,
                                    w_all,
                                    gps_m,
                                    params = params_lst,
                                    outcome_col = "Y",
                                    treatment_col = "treat",
                                    covariates_col = paste0("cf", seq(1, 6)),
                                    nthread = 12)
summary(cerf_nngp_obj)
plot(cerf_nngp_obj)
```

```
GPCERF nearest neighbore Gaussian process exposure response function object summary

Optimal hyper parameters(#trial: 300):
  alpha = 0.0278255940220712   beta = 0.215443469003188   g_sigma = 0.1

Optimal covariate balance:
  cf1 = 0.062
  cf2 = 0.070
  cf3 = 0.091
  cf4 = 0.062
  cf5 = 0.076
  cf6 = 0.088

Original covariate balance:
  cf1 = 0.115
  cf2 = 0.137
  cf3 = 0.145
  cf4 = 0.296
  cf5 = 0.208
  cf6 = 0.225
----***----
```
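To make the covariate balance figures in the nnGP summary easier to compare, the sketch below tabulates the printed values side by side. The numbers are copied from the output above, not recomputed. It assumes, as the "Optimal" and "Original" labels suggest, that smaller values indicate better balance.

```r
# Values copied from the nnGP summary output above.
balance <- data.frame(
  covariate = paste0("cf", 1:6),
  original  = c(0.115, 0.137, 0.145, 0.296, 0.208, 0.225),
  optimal   = c(0.062, 0.070, 0.091, 0.062, 0.076, 0.088)
)

# Per-covariate change after selecting the optimal hyperparameters.
balance$reduction <- balance$original - balance$optimal
print(balance)

# Average balance before and after (smaller is assumed to be better).
colMeans(balance[, c("original", "optimal")])
```

The same table can be built for the GP example by substituting the values from its summary output.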

Code of Conduct

Please note that the GPCERF project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

Contributing

Contributions to the package are encouraged. For detailed information on how to contribute, please refer to the CONTRIBUTING guidelines.

Reporting Issues & Seeking Support

If you encounter any issues with GPCERF, please report them on our GitHub by opening a new issue. To expedite resolution, please include a reproducible example. If you need assistance or further details about a particular topic, feel free to start a Discussion on GitHub or open an issue. For more direct inquiries, the package maintainer can be reached via the email address provided in the DESCRIPTION file.

References

Ren, B., Wu, X., Braun, D., Pillai, N. and Dominici, F., 2021. Bayesian modeling for exposure response curve via Gaussian processes: Causal effects of exposure to air pollution on health outcomes. arXiv preprint, doi:10.48550/arXiv.2105.03454.

Owner

  • Name: National Studies on Air Pollution and Health Software
  • Login: NSAPH-Software
  • Kind: organization
  • Location: United States of America

NSAPH Software is a collection of open-source packages to carry out National Studies on Air Pollution and Health.

JOSS Publication

GPCERF - An R package for implementing Gaussian processes for estimating causal exposure response curves
Published
March 13, 2024
Volume 9, Issue 95, Page 5465
Authors
Naeem Khoshnevis ORCID
University Research Computing and Data Services, Harvard University, Cambridge, Massachusetts, United States of America
Boyu Ren ORCID
McLean Hospital, Belmont, Massachusetts, United States of America
Danielle Braun ORCID
Department of Biostatistics, Harvard School of Public Health, Cambridge, Massachusetts, United States of America
Editor
Susan Holmes ORCID
Tags
causal inference Gaussian Processes causal exposure response function

Citation (CITATION.cff)

cff-version: "1.2.0"
authors:
- family-names: Khoshnevis
  given-names: Naeem
  orcid: "https://orcid.org/0000-0003-4315-1426"
- family-names: Ren
  given-names: Boyu
  orcid: "https://orcid.org/0000-0002-5300-1184"
- family-names: Braun
  given-names: Danielle
  orcid: "https://orcid.org/0000-0002-5177-8598"
contact:
- family-names: Khoshnevis
  given-names: Naeem
  orcid: "https://orcid.org/0000-0003-4315-1426"
doi: 10.5281/zenodo.10757333
message: If you use this software, please cite our article in the
  Journal of Open Source Software.
preferred-citation:
  authors:
  - family-names: Khoshnevis
    given-names: Naeem
    orcid: "https://orcid.org/0000-0003-4315-1426"
  - family-names: Ren
    given-names: Boyu
    orcid: "https://orcid.org/0000-0002-5300-1184"
  - family-names: Braun
    given-names: Danielle
    orcid: "https://orcid.org/0000-0002-5177-8598"
  date-published: 2024-03-13
  doi: 10.21105/joss.05465
  issn: 2475-9066
  issue: 95
  journal: Journal of Open Source Software
  publisher:
    name: Open Journals
  start: 5465
  title: GPCERF - An R package for implementing Gaussian processes for
    estimating causal exposure response curves
  type: article
  url: "https://joss.theoj.org/papers/10.21105/joss.05465"
  volume: 9
title: GPCERF - An R package for implementing Gaussian processes for
  estimating causal exposure response curves


Committers

Last synced: 5 months ago

All Time
  • Total Commits: 378
  • Total Committers: 4
  • Avg Commits per committer: 94.5
  • Development Distribution Score (DDS): 0.238
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
naeemkh k****m@g****m 288
boyuren158 b****8@g****m 88
Tanujit Dey 4****y 1
Boyu Ren b****8@r****u 1

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 25
  • Total pull requests: 74
  • Average time to close issues: 5 months
  • Average time to close pull requests: 5 days
  • Total issue authors: 5
  • Total pull request authors: 3
  • Average comments per issue: 1.28
  • Average comments per pull request: 0.62
  • Merged pull requests: 71
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • martinmodrak (7)
  • Naeemkh (6)
  • andrewherren (6)
  • boyuren158 (5)
  • juandavidgutier (1)
Pull Request Authors
  • Naeemkh (62)
  • boyuren158 (18)
  • tanujitdey (2)
Top Labels
Issue Labels
bug (3) enhancement (1)

Packages

  • Total packages: 1
  • Total downloads:
    • cran 281 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 6
  • Total maintainers: 1
cran.r-project.org: GPCERF

Gaussian Processes for Estimating Causal Exposure Response Curves

  • Versions: 6
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 281 Last month
Rankings
Forks count: 17.8%
Stargazers count: 18.7%
Average: 29.2%
Dependent packages count: 29.8%
Dependent repos count: 35.5%
Downloads: 44.5%
Maintainers (1)
Last synced: 4 months ago

Dependencies

.github/workflows/R-CMD-check.yaml actions
  • actions/cache v2 composite
  • actions/checkout v2 composite
  • actions/upload-artifact main composite
  • r-lib/actions/setup-pandoc v1 composite
  • r-lib/actions/setup-r v2 composite
.github/workflows/test-coverage.yaml actions
  • actions/checkout v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
DESCRIPTION cran
  • R >= 3.5.0 depends
  • MASS * imports
  • Rcpp * imports
  • RcppArmadillo * imports
  • Rfast * imports
  • SuperLearner * imports
  • ggplot2 * imports
  • logger * imports
  • parallel * imports
  • rlang * imports
  • spatstat.geom * imports
  • stats * imports
  • xgboost * imports
  • knitr * suggests
  • rmarkdown * suggests
  • testthat >= 3.0.0 suggests
docker_singularity/Dockerfile docker
  • rocker/verse 4.1.0 build