GPCERF - An R package for implementing Gaussian processes for estimating causal exposure response curves
GPCERF - An R package for implementing Gaussian processes for estimating causal exposure response curves - Published in JOSS (2024)
Science Score: 100.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 4 DOI reference(s) in README and JOSS metadata
- ✓ Academic publication links: Links to joss.theoj.org
- ✓ Committers with academic emails: 1 of 4 committers (25.0%) from academic institutions
- ○ Institutional organization owner
- ✓ JOSS paper metadata: Published in Journal of Open Source Software
Repository
Basic Info
- Host: GitHub
- Owner: NSAPH-Software
- License: gpl-3.0
- Language: R
- Default Branch: main
- Homepage: https://NSAPH-Software.github.io/GPCERF/
- Size: 12.8 MB
Statistics
- Stars: 9
- Watchers: 2
- Forks: 3
- Open Issues: 0
- Releases: 8
Metadata Files
README.md
Gaussian processes for the estimation of causal exposure-response curves (GP-CERF)
Summary
GPCERF provides Gaussian process (GP) and nearest neighbor Gaussian process (nnGP) approaches for nonparametric estimation of causal exposure-response curves.
Installation
```r
library("devtools")
install_github("NSAPH-Software/GPCERF", ref="develop")
library("GPCERF")
```
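Before installing, it can help to check which supporting packages are already present. The following is a minimal sketch in base R; the helper name `missing_pkgs` is hypothetical and not part of GPCERF, and the commented-out install lines are one possible way to proceed (a CRAN release of GPCERF is listed on this page).

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed.
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Packages used by the installation command and by the usage examples.
needed <- c("devtools", "ranger")
to_install <- missing_pkgs(needed)
print(to_install)

# Uncomment to install whatever is missing, then GPCERF itself:
# if (length(to_install) > 0) install.packages(to_install)
# install.packages("GPCERF")                            # CRAN release
# devtools::install_github("NSAPH-Software/GPCERF", ref = "develop")
```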
Usage
Note: The following examples also require the ranger R package.
GP
```r
library(GPCERF)
set.seed(781)
sim_data <- generate_synthetic_data(sample_size = 500, gps_spec = 1)

n_core <- 1

# SuperLearner wrappers that pin the number of threads
m_xgboost <- function(nthread = n_core, ...) {
  SuperLearner::SL.xgboost(nthread = nthread, ...)
}

m_ranger <- function(num.threads = n_core, ...) {
  SuperLearner::SL.ranger(num.threads = num.threads, ...)
}

# Estimate GPS function
gps_m <- estimate_gps(cov_mt = sim_data[, -(1:2)],
                      w_all = sim_data$treat,
                      sl_lib = c("m_xgboost", "m_ranger"),
                      dnorm_log = TRUE)

# exposure values
q1 <- stats::quantile(sim_data$treat, 0.05)
q2 <- stats::quantile(sim_data$treat, 0.95)

w_all <- seq(q1, q2, 1)

params_lst <- list(alpha = 10 ^ seq(-2, 2, length.out = 10),
                   beta = 10 ^ seq(-2, 2, length.out = 10),
                   g_sigma = c(0.1, 1, 10),
                   tune_app = "all")

cerf_gp_obj <- estimate_cerf_gp(sim_data,
                                w_all,
                                gps_m,
                                params = params_lst,
                                outcome_col = "Y",
                                treatment_col = "treat",
                                covariates_col = paste0("cf", seq(1, 6)),
                                nthread = n_core)
summary(cerf_gp_obj)
plot(cerf_gp_obj)
```

```
GPCERF standard Gaussian grocess exposure response function object

Optimal hyper parameters(#trial: 300):
  alpha = 12.9154966501488   beta = 12.9154966501488   g_sigma = 0.1

Optimal covariate balance:
  cf1 = 0.069
  cf2 = 0.082
  cf3 = 0.063
  cf4 = 0.066
  cf5 = 0.056
  cf6 = 0.081

Original covariate balance:
  cf1 = 0.222
  cf2 = 0.112
  cf3 = 0.175
  cf4 = 0.318
  cf5 = 0.198
  cf6 = 0.257
----***----
```
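The summary above reports 300 trials, which is consistent with trying every combination of the candidate values in the parameter list of the GP example (10 values of alpha, 10 of beta, 3 of g_sigma). The sketch below reproduces that count in base R only; it does not call GPCERF, and the variable name `hyper_grid` is illustrative. How the package iterates over the grid internally is not shown here.

```r
# Candidate values copied from the parameter list in the GP example.
alpha   <- 10 ^ seq(-2, 2, length.out = 10)
beta    <- 10 ^ seq(-2, 2, length.out = 10)
g_sigma <- c(0.1, 1, 10)

# Enumerate every (alpha, beta, g_sigma) combination.
hyper_grid <- expand.grid(alpha = alpha, beta = beta, g_sigma = g_sigma)

nrow(hyper_grid)  # 300 combinations

# The reported optimum alpha = beta = 12.915... is the 8th grid value.
alpha[8]
```

Because the grid is logarithmic, the candidates are evenly spaced in the exponent between 10^-2 and 10^2, so widening the range or raising `length.out` increases the number of trials multiplicatively.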
nnGP
```r
set.seed(781)
sim_data <- generate_synthetic_data(sample_size = 5000, gps_spec = 1)

# SuperLearner wrappers that pin the number of threads
m_xgboost <- function(nthread = 12, ...) {
  SuperLearner::SL.xgboost(nthread = nthread, ...)
}

m_ranger <- function(num.threads = 12, ...) {
  SuperLearner::SL.ranger(num.threads = num.threads, ...)
}

# Estimate GPS function
gps_m <- estimate_gps(cov_mt = sim_data[, -(1:2)],
                      w_all = sim_data$treat,
                      sl_lib = c("m_xgboost", "m_ranger"),
                      dnorm_log = TRUE)

# exposure values
q1 <- stats::quantile(sim_data$treat, 0.05)
q2 <- stats::quantile(sim_data$treat, 0.95)

w_all <- seq(q1, q2, 1)

params_lst <- list(alpha = 10 ^ seq(-2, 2, length.out = 10),
                   beta = 10 ^ seq(-2, 2, length.out = 10),
                   g_sigma = c(0.1, 1, 10),
                   tune_app = "all",
                   n_neighbor = 50,
                   block_size = 1e3)

cerf_nngp_obj <- estimate_cerf_nngp(sim_data,
                                    w_all,
                                    gps_m,
                                    params = params_lst,
                                    outcome_col = "Y",
                                    treatment_col = "treat",
                                    covariates_col = paste0("cf", seq(1, 6)),
                                    nthread = 12)
summary(cerf_nngp_obj)
plot(cerf_nngp_obj)
```
```
GPCERF nearest neighbore Gaussian process exposure response function object summary

Optimal hyper parameters(#trial: 300):
  alpha = 0.0278255940220712   beta = 0.215443469003188   g_sigma = 0.1

Optimal covariate balance:
  cf1 = 0.062
  cf2 = 0.070
  cf3 = 0.091
  cf4 = 0.062
  cf5 = 0.076
  cf6 = 0.088

Original covariate balance:
  cf1 = 0.115
  cf2 = 0.137
  cf3 = 0.145
  cf4 = 0.296
  cf5 = 0.208
  cf6 = 0.225
----***----
```
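To read the nnGP summary above at a glance, the reported balance values can be tabulated side by side. This is a small base R sketch that only re-uses the numbers printed in the output; it does not call GPCERF. The 0.1 cut-off is an assumed, illustrative threshold and is not stated in the source.

```r
# Values copied from the nnGP summary output above.
balance <- data.frame(
  covariate = paste0("cf", 1:6),
  original  = c(0.115, 0.137, 0.145, 0.296, 0.208, 0.225),
  optimal   = c(0.062, 0.070, 0.091, 0.062, 0.076, 0.088)
)

# Improvement per covariate (positive means better balance after tuning).
balance$reduction <- balance$original - balance$optimal
print(balance)

# Assumed illustrative threshold; not taken from the package documentation.
threshold <- 0.1
all(balance$optimal < threshold)

# Average balance before and after tuning.
colMeans(balance[, c("original", "optimal")])
```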
Code of Conduct
Please note that the GPCERF project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
Contributing
Contributions to the package are encouraged. For detailed information on how to contribute, please refer to the CONTRIBUTING guidelines.
Reporting Issues & Seeking Support
If you encounter any issues with GPCERF, please report them on GitHub by opening a new issue. Including a reproducible example helps us resolve the problem faster. For assistance or further details about a particular topic, start a Discussion on GitHub or open an issue. For more direct inquiries, the package maintainer can be reached via the email address provided in the DESCRIPTION file.
References
Ren, B., Wu, X., Braun, D., Pillai, N. and Dominici, F., 2021. Bayesian modeling for exposure response curve via gaussian processes: Causal effects of exposure to air pollution on health outcomes. arXiv preprint doi:10.48550/arXiv.2105.03454.
Owner
- Name: National Studies on Air Pollution and Health Software
- Login: NSAPH-Software
- Kind: organization
- Location: United States of America
- Website: https://nsaph-software.github.io/intro.html
- Repositories: 11
- Profile: https://github.com/NSAPH-Software
NSAPH Software is a collection of open-source packages to carry out National Studies on Air Pollution and Health.
JOSS Publication
GPCERF - An R package for implementing Gaussian processes for estimating causal exposure response curves
Tags
causal inference, Gaussian Processes, causal exposure response function
Citation (CITATION.cff)
cff-version: "1.2.0"
authors:
- family-names: Khoshnevis
  given-names: Naeem
  orcid: "https://orcid.org/0000-0003-4315-1426"
- family-names: Ren
  given-names: Boyu
  orcid: "https://orcid.org/0000-0002-5300-1184"
- family-names: Braun
  given-names: Danielle
  orcid: "https://orcid.org/0000-0002-5177-8598"
contact:
- family-names: Khoshnevis
  given-names: Naeem
  orcid: "https://orcid.org/0000-0003-4315-1426"
doi: 10.5281/zenodo.10757333
message: If you use this software, please cite our article in the
  Journal of Open Source Software.
preferred-citation:
  authors:
  - family-names: Khoshnevis
    given-names: Naeem
    orcid: "https://orcid.org/0000-0003-4315-1426"
  - family-names: Ren
    given-names: Boyu
    orcid: "https://orcid.org/0000-0002-5300-1184"
  - family-names: Braun
    given-names: Danielle
    orcid: "https://orcid.org/0000-0002-5177-8598"
  date-published: 2024-03-13
  doi: 10.21105/joss.05465
  issn: 2475-9066
  issue: 95
  journal: Journal of Open Source Software
  publisher:
    name: Open Journals
  start: 5465
  title: GPCERF - An R package for implementing Gaussian processes for
    estimating causal exposure response curves
  type: article
  url: "https://joss.theoj.org/papers/10.21105/joss.05465"
  volume: 9
title: GPCERF - An R package for implementing Gaussian processes for
  estimating causal exposure response curves
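For use in a reference manager, the preferred citation above can be turned into a BibTeX entry. The sketch below builds it by hand with base R's `utils::bibentry`, copying the field values from the CITATION.cff entries; mapping the CFF `issue` field to `number` and `start` to `pages` is an assumption made for this illustration.

```r
# Citation fields copied from the preferred-citation block above.
joss_ref <- utils::bibentry(
  bibtype = "Article",
  title = paste("GPCERF - An R package for implementing Gaussian processes",
                "for estimating causal exposure response curves"),
  author = c(
    utils::person("Naeem", "Khoshnevis"),
    utils::person("Boyu", "Ren"),
    utils::person("Danielle", "Braun")
  ),
  journal = "Journal of Open Source Software",
  year = "2024",
  volume = "9",
  number = "95",    # assumed mapping from the CFF `issue` field
  pages = "5465",   # assumed mapping from the CFF `start` field
  doi = "10.21105/joss.05465",
  url = "https://joss.theoj.org/papers/10.21105/joss.05465"
)

# Render the entry as BibTeX.
print(utils::toBibtex(joss_ref))
```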
Committers
Last synced: 5 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| naeemkh | k****m@g****m | 288 |
| boyuren158 | b****8@g****m | 88 |
| Tanujit Dey | 4****y | 1 |
| Boyu Ren | b****8@r****u | 1 |
Issues and Pull Requests
Last synced: 4 months ago
All Time
- Total issues: 25
- Total pull requests: 74
- Average time to close issues: 5 months
- Average time to close pull requests: 5 days
- Total issue authors: 5
- Total pull request authors: 3
- Average comments per issue: 1.28
- Average comments per pull request: 0.62
- Merged pull requests: 71
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- martinmodrak (7)
- Naeemkh (6)
- andrewherren (6)
- boyuren158 (5)
- juandavidgutier (1)
Pull Request Authors
- Naeemkh (62)
- boyuren158 (18)
- tanujitdey (2)
Packages
- Total packages: 1
- Total downloads: 281 last month (CRAN)
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 6
- Total maintainers: 1
cran.r-project.org: GPCERF
Gaussian Processes for Estimating Causal Exposure Response Curves
- Homepage: https://github.com/NSAPH-Software/GPCERF
- Documentation: http://cran.r-project.org/web/packages/GPCERF/GPCERF.pdf
- License: GPL (≥ 3)
- Latest release: 0.2.4 (published over 1 year ago)
Dependencies
- actions/cache v2 composite
- actions/checkout v2 composite
- actions/upload-artifact main composite
- r-lib/actions/setup-pandoc v1 composite
- r-lib/actions/setup-r v2 composite
- actions/checkout v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
- R >= 3.5.0 depends
- MASS * imports
- Rcpp * imports
- RcppArmadillo * imports
- Rfast * imports
- SuperLearner * imports
- ggplot2 * imports
- logger * imports
- parallel * imports
- rlang * imports
- spatstat.geom * imports
- stats * imports
- xgboost * imports
- knitr * suggests
- rmarkdown * suggests
- testthat >= 3.0.0 suggests
- rocker/verse 4.1.0 build
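The two versioned requirements in the list above (R >= 3.5.0 and testthat >= 3.0.0) can be checked from an R session before building or testing the package. This is a minimal base R sketch; the variable names are illustrative, and testthat is only a suggested dependency, so it may legitimately be absent.

```r
# Minimum versions taken from the dependency list above.
r_ok <- getRversion() >= "3.5.0"

# testthat is a suggested dependency, so first check that it is installed.
testthat_ok <- requireNamespace("testthat", quietly = TRUE) &&
  utils::packageVersion("testthat") >= "3.0.0"

cat("R >= 3.5.0:", r_ok, "\n")
cat("testthat >= 3.0.0 available:", testthat_ok, "\n")
```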
