comet
COMET: Computational Open-source Model for Evaluating Transplantation
Science Score: 57.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
✓CITATION.cff file
Found CITATION.cff file -
✓codemeta.json file
Found codemeta.json file -
✓.zenodo.json file
Found .zenodo.json file -
✓DOI references
Found 2 DOI reference(s) in README -
○Academic publication links
-
○Academic email domains
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (12.9%) to scientific vocabulary
Last synced: 10 months ago
·
JSON representation
·
Repository
COMET: Computational Open-source Model for Evaluating Transplantation
Basic Info
Statistics
- Stars: 3
- Watchers: 1
- Forks: 0
- Open Issues: 1
- Releases: 2
Created almost 3 years ago
· Last pushed over 1 year ago
Metadata Files
Readme
Changelog
License
Citation
README.Rmd
---
output: github_document
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# COMET
The goal of COMET^[Rose, Johnie, Paul R. Gunsalus, Carli J. Lehr, Mark F. Swiler, Jarrod E. Dalton, and Maryam Valapour. "A modular simulation framework for organ allocation." The Journal of Heart and Lung Transplantation (2024). https://doi.org/10.1016/j.healun.2024.04.063] is to simulate provide an open source framework to simulate the United States Lung Allocation System
## Installation
You can install the development version of COMET like so:
```{r, eval=FALSE}
# install.packages("devtools")
devtools::install_github("ClevelandClinicQHS/COMET")
```
You will be prompted to install this package if it is not installed already. It is needed for `run_simulation` and `gen_and_spawn_candidates/gen_and_spawn_donors`
```{r, eval=FALSE}
devtools::install_github("ClevelandClinicQHS/cometdata")
```
## Basic example
This will simulate the 2015 Lung Allocation Score (LAS) policy for 400 days (`r1`) and then the same scenario under the Composite Allocation Score (CAS) policy (`r2`).
```{r example}
library(COMET)
r1 <- run_simulation(days = 400, can_start = 1000,
match_alg = match_las, wl_model = "LAS15", post_tx_model = "LAS15",
wl_weight = 2, post_tx_weight = 1, wl_cap = 365, post_tx_cap = 365, seed = 26638)
r2 <- run_simulation(days = 400, can_start = 1000,
match_alg = match_cas, wl_model = "CAS23", post_tx_model = "CAS23",
wl_weight = 0.25, post_tx_weight = 0.25, wl_cap = 365, post_tx_cap = 1826,
bio_weight = .15, pld_weight = 0.05, peds_weight = 0.2, efficiency_weight = 0.1, seed = 26638)
```
A simple way to evaluate the COMET runs, one can also look at the statistics by diagnosis group or another variable. For details on what is reported run `help('eval_simulation')`.
```{r comparison of simualtions}
# see how many were transplanted, donor count, waitlist deaths
eval_simulation(r1)
eval_simulation(r2)
## See results by diagnosis group
eval_simulation(r1, group = dx_grp)
eval_simulation(r2, group = dx_grp)
```
One can also mix and match the models and weights under the LAS/CAS policies. `r3` simulates the but the LAS policy but with the CAS waitlist and post-transplant mortality models (post-transplant survival truncated at 1 year). `r4` is the same scenario as `r3`, but waitlist mortality receives the same weight as post-transplant (1 to 1).
```{r example2}
r3 <- run_simulation(days = 400, can_start = 1000,
match_alg = match_las, wl_model = "CAS23", post_tx_model = "CAS23",
wl_weight = 2, post_tx_weight = 1, wl_cap = 365, post_tx_cap = 365, seed = 26638)
r4 <- run_simulation(days = 400, can_start = 1000,
match_alg = match_las, wl_model = "CAS23", post_tx_model = "CAS23",
wl_weight = 1, post_tx_weight = 1, wl_cap = 365, post_tx_cap = 365, seed = 26638)
```
## A more indepth demonstration to show what is going on behind the scenes
### Generates candidates and donors
One major advantage of the simulation is the use of synthetic candidates and donors.
```{r generation}
set.seed(26638)
syn_cands <- gen_and_spawn_candidates(days = 10)
syn_dons <- gen_and_spawn_donors(days = 10)
head(syn_cands)
head(syn_dons)
```
For this scenario I'm going to add `r max(r2$all_candidates$listing_day)` days to the listing and recovery day to `syn_cands` and `syn_donors` and also make sure the candidate and donor ids do not conflict. I'm going to continue the results from CAS simulation (`r2`). The notation for the LAS simulation (`r1`) is very similar.
```{r iterate}
syn_cands <- dplyr::mutate(syn_cands,
listing_day = listing_day + max(r2$all_candidates$listing_day),
c_id = c_id + max(r2$all_candidates$c_id))
syn_dons <- dplyr::mutate(syn_dons,
recovery_day = recovery_day + max(r2$all_candidates$listing_day),
d_id = d_id + max(r2$all_donors$d_id))
r2cc <- dplyr::mutate(r2$current_candidates,
days_on_waitlist = 401 - listing_day)
r2rd <- dplyr::mutate(r2$recipient_database,
days_after_tx = 401 - transplant_day)
l <- list("current_candidates" = r2cc,
"recipient_database" = r2$recipient_database,
"waitlist_death_database" = r2$waitlist_death_database,
"post_tx_death_database" = r2$post_tx_death_database,
"non_used_donors" = r2$non_used_donors)
l_1 <- iteration(401, syn_cands, syn_dons, include_matches = FALSE, updated_list = l, match_alg = match_cas,
wl_model = "CAS23", post_tx_model = "CAS23", wl_weight = 0.25, post_tx_weight = 0.25,
wl_cap = 365, post_tx_cap = 1826, bio_weight = .15, pld_weight = 0.05,
peds_weight = 0.2, efficiency_weight = 0.1)
l_2 <- iteration(402, syn_cands, syn_dons, include_matches = FALSE, updated_list = l_1, match_alg = match_cas,
wl_model = "CAS23", post_tx_model = "CAS23", wl_weight = 0.25, post_tx_weight = 0.25,
wl_cap = 365, post_tx_cap = 1826, bio_weight = .15, pld_weight = 0.05,
peds_weight = 0.2, efficiency_weight = 0.1)
nrow(l_2$waitlist_death_database) - nrow(l_1$waitlist_death_database)
nrow(l_2$post_tx_death_database)-nrow(l_1$post_tx_death_database)
```
Looking in depth, `r nrow(l_2$waitlist_death_database)-nrow(l_1$waitlist_death_database)` died on the waiting list, and `r nrow(l_2$post_tx_death_database)-nrow(l_1$post_tx_death_database)` died post transplant on Day 402 in this scenario. The following would run ten days sequentially
```{r full_iterate}
l_i <- l
for(i in 401:410){
l_i <- iteration(i,
syn_cands, syn_dons, include_matches = FALSE, updated_list = l_i,
match_alg = match_cas, wl_model = "CAS23", post_tx_model = "CAS23",
wl_weight = 0.25, post_tx_weight = 0.25, wl_cap = 365, post_tx_cap = 1826,
bio_weight = .15, pld_weight = 0.05, peds_weight = 0.2, efficiency_weight = 0.1)
}
```
### Within each day
#### Updating candidates/recipients
`update_patients` is one of the steps that occur each day, it will output a list of those that died, were removed for other reasons if on the waiting list, those that are still alive and updated characteristics. Currently, the only characteristic that is updated is the candidate's/recipient's age, but it is made to be flexible to adjust any clinical characteristic (i.e. $\textrm{pCO}_2$,$\textrm{FEV}_1$, bilirubin, etc.)
Within `update_patients` are
* `identify_deaths` returns a dataset of those who died either on the waiting list or post-transplant^[If one wonders what model is "CAS23r", it is the baseline hazard for the "CAS23" models 'recalibrated' after research and initial simulations runs showed that the "CAS23" model did not accurately simulate waitlist and post-transplant deaths.]
* `identify_removals` returns a dataset of those who were removed on the waiting list^[Models for waitlist removal for other reasons are not part of the LAS/CAS and were modeled separately. The maximum day for this 2365 days after being placed on the waiting list]
```{r updating}
update_patients(patients = r2cc, model = "CAS23r",
elapsed_time = days_on_waitlist, pre_tx = TRUE, cap = 365, date = 401)
identify_deaths(patients = r2cc, model = "CAS23r",
elapsed_time = days_on_waitlist, pre_tx = TRUE, cap = 365, date = 401)
identify_removals(patients = r2cc, elapsed_time = days_on_waitlist, cap = 2365)
## post-transplant deaths
update_patients(r2rd, model = "CAS23r",
elapsed_time = days_after_tx, pre_tx = FALSE, cap = 1826, date = 401)
```
#### Matching
`match_cas` and `match_las` are the two main functions that handle screening, matching and acceptance.
```{r matching}
##
dons_401 <- dplyr::filter(syn_dons, recovery_day == 401)
mtc <- match_cas(cands = r2cc, dons = dons_401, wl_model = "CAS23", post_tx_model = "CAS23", wl_weight = 0.25,
post_tx_weight = 0.25, wl_cap = 365, post_tx_cap = 1826, bio_weight = .15,
pld_weight = 0.05, peds_weight = 0.2, efficiency_weight = 0.1)
mtc
```
#### Screening
These functions are designed to input candidates and donors respectively. Here are displayed screening for height (`hm1`), blood type (`am1`), count (`cm1` double vs single transplant), and a distance calculation (`lm1`). One can also create addition functions for screening
```{r screening}
## within matching
## there is screening
## Screening is done for height, blood type, organ count,
hm1 <- height_screen(cands = r2cc, dons = dons_401)
hm1
## returns if compabitible for single, double transplant, if a double transplant they will match for double lung
am1 <- abo_screen(cands = r2cc, dons = dons_401)
am1
## returns abo_exact abo compatible organs since that is a tie breaker in the LAS
## Ensure double with double, singles may take a double if only is assigned or either can
cm1 <- count_screen(cands = r2cc, dons = dons_401)
cm1
## Distance calculation
lm1 <- dist_calc(cands = r2cc, dons = dons_401)
lm1
```
In our simulation you will see to rankings `ov_rank` which stands for overall ranking and `offer_rank`. `ov_rank` is the candidates rank with respective to all candidates on that day based on their LAS/CAS score, regardless of donor/offer. `offer_rank` is the rank of the candidate with respect to all compatible candidates to the given donor. The part of the CAS score that does not change called the "sub-CAS" calculates the `ov_rank`
```{r over rank}
## ranking
lr1 <- calculate_sub_cas(r2cc, wl_model = "CAS23", post_tx_model = "CAS23", wl_weight = 0.25,
post_tx_weight = 0.25, wl_cap = 365, post_tx_cap = 1826, bio_weight = .15,
pld_weight = 0.05, peds_weight = 0.2)
lr1
```
#### Offering and Acceptance
Once all of the screening criteria, the candidates are matched on all screening criteria and made the offers in successive order based on either the CAS or LAS policy. There is an `acceptance_prop` function which will calculate the acceptance probability and run binomial acceptance for whether or not the candidate will accept the organ. And lastly `transplant_candidates` finds the first candidate that accepts the donor lung and transplants them so the donor identifier `d_id` is tracked with the candidate.
```{r offers}
## matching and offering rank
## Pra screening is not currently implemented,
## you'll notice that if you run the function it returns a combination of every candidate and donor
offs <- cas_offer_rank(hm1, am1, cm1, lm1, overall_ranking = lr1, efficiency_weight = 0.1)
offs
set.seed(26638)
acceptance_prob(offs, dons = dons_401, cands = r2cc)
transplant_candidates(matches = mtc)
```
## Other tips and tricks
You will notice a `pra_screen` function. This is a place-holder for now that returns a grid of every candidate and donor pairing. Future implementation will incorporate PRA screening criteria.
```{r pra}
pra_screen(cands = r2cc, dons = dons_401)
```
If you want to return the matches along with the simulated output. All one has to do in the `run_simulation` or `iteration` function is set `include_matches = TRUE`. The output can be seen below is a nested tibble with a vector of donor ids (`d_id`) that are ranked in order for the subsequent offer, called `all_matches`
```{r in_match}
l_1m <- iteration(401, syn_cands, syn_dons, include_matches = TRUE, updated_list = l, match_alg = match_cas,
wl_model = "CAS23", post_tx_model = "CAS23", wl_weight = 0.25, post_tx_weight = 0.25,
wl_cap = 365, post_tx_cap = 1826, bio_weight = .15, pld_weight = 0.05,
peds_weight = 0.2, efficiency_weight = 0.1)
l_1m$all_matches
```
## A few helpful tips
`eval_simulation` behind the seens runs a function callled `concat2`. It concatenates the several candidates/recipients datasests to one and calculates survival data as well. By default it will use the minimum and maximum number of days simulated from `all_candidates` in a COMET object, but those can also be specified if one wanted to exclude candidates after day 30 `min_enroll_date` and before day 300 days `max_enroll_date` or censor post-transplant early at day 330 `post_tx_censor_date`. Waitlist censoring `wl_censor_date` is also possible to change, but we would recommend keeping it the same as the `max_enroll_date`.
```{r concat}
concat2(r1)
concat2(r1, min_enroll_date = 31, max_enroll_date = 300, wl_censor_date = 300, post_tx_censor_date = 330)
```
The function `spec_day` will allow one to grab the waiting list on a specific day.
```{r spec}
spec_day(r1, day = 300)
```
`rmst` will output the restricted mean survival time (RMST) or the Area Under the Curve (AUC) for the LAS and CAS waitlist mortality models as well as the linear predictor. The linear predictor is calculated using the `calc_post_tx_cas23` and `calc_wl_cas23` and other respective functions for the waitlist and post-transplant models for the CAS/LAS policies. Currently implemented are the 2015 LAS (`"LAS15"`), 2021 LAS (`"LAS21"`) and 2023 CAS (`"CAS23"`). This also runs `expected_survival` which actually calculates the AUC for the candidate based on their linear predictor.
```{r rmst}
rmst("LAS21", cand_data = r2cc, cap = 365, wl = TRUE)
rmst("CAS23", cand_data = r2cc, cap = 1826, wl = FALSE)
```
One can calculate RMST if one supplies a vector of survival times, or one can calculate the survival given the raw survival value on a given day. For example below we will use Day 364 of the CAS23 waitlist survival model which has a baseline survival value of 0.994390.
```{r exp_surv}
expected_survival(lp = c(0, 1, -0.5), 0.994390)
```
Owner
- Name: Cleveland Clinic Quantitative Health Sciences
- Login: ClevelandClinicQHS
- Kind: organization
- Email: QHSGitHubAdmin@ccf.org
- Location: United States of America
- Website: https://www.lerner.ccf.org/quantitative-health/
- Repositories: 12
- Profile: https://github.com/ClevelandClinicQHS
Open-source projects from the Department of Quantitative Health Sciences at Cleveland Clinic
Citation (CITATION.cff)
# -----------------------------------------------------------
# CITATION file created with {cffr} R package, v0.5.0
# See also: https://docs.ropensci.org/cffr/
# -----------------------------------------------------------
cff-version: 1.2.0
message: 'To cite package "COMET" in publications use:'
type: software
license: MIT
title: 'COMET: Computational Open-Source Model for Evaluating Transplantation'
version: 0.1.0
abstract: COMET provides an open source simulation to model the United States lung
allocation process within synthetic candidate and donor populations. For additional
functionality, this package interacts with data that is available in the 'cometdata'
package. To access this data run "remotes::install_github('https://github.com/ClevelandClinicQHS/cometdata',
type = 'source')".
authors:
- family-names: Gunsalus
given-names: Paul
email: gunsalp@ccf.org
orcid: https://orcid.org/0000-0001-6976-9094
- family-names: Rose
given-names: Johnie
email: rosej15@ccf.org
orcid: https://orcid.org/0000-0001-7458-9303
- family-names: Dalton
given-names: Jarrod
email: daltonj@ccf.org
orcid: https://orcid.org/0000-0001-8336-1212
contact:
- family-names: Gunsalus
given-names: Paul
email: gunsalp@ccf.org
orcid: https://orcid.org/0000-0001-6976-9094
references:
- type: software
title: Rcpp
abstract: 'Rcpp: Seamless R and C++ Integration'
notes: LinkingTo
url: https://www.rcpp.org
repository: https://CRAN.R-project.org/package=Rcpp
authors:
- family-names: Eddelbuettel
given-names: Dirk
- family-names: Francois
given-names: Romain
- family-names: Allaire
given-names: JJ
- family-names: Ushey
given-names: Kevin
- family-names: Kou
given-names: Qiang
- family-names: Russell
given-names: Nathan
- family-names: Ucar
given-names: Inaki
- family-names: Bates
given-names: Douglas
- family-names: Chambers
given-names: John
year: '2024'
- type: software
title: rlang
abstract: 'rlang: Functions for Base Types and Core R and ''Tidyverse'' Features'
notes: Imports
url: https://rlang.r-lib.org
repository: https://CRAN.R-project.org/package=rlang
authors:
- family-names: Henry
given-names: Lionel
email: lionel@posit.co
- family-names: Wickham
given-names: Hadley
email: hadley@posit.co
year: '2024'
version: '>= 1.1.1'
- type: software
title: methods
abstract: 'R: A Language and Environment for Statistical Computing'
notes: Imports
authors:
- name: R Core Team
location:
name: Vienna, Austria
year: '2024'
institution:
name: R Foundation for Statistical Computing
- type: software
title: stats
abstract: 'R: A Language and Environment for Statistical Computing'
notes: Imports
authors:
- name: R Core Team
location:
name: Vienna, Austria
year: '2024'
institution:
name: R Foundation for Statistical Computing
- type: software
title: dplyr
abstract: 'dplyr: A Grammar of Data Manipulation'
notes: Imports
url: https://dplyr.tidyverse.org
repository: https://CRAN.R-project.org/package=dplyr
authors:
- family-names: Wickham
given-names: Hadley
email: hadley@posit.co
orcid: https://orcid.org/0000-0003-4757-117X
- family-names: François
given-names: Romain
orcid: https://orcid.org/0000-0002-2444-4226
- family-names: Henry
given-names: Lionel
- family-names: Müller
given-names: Kirill
orcid: https://orcid.org/0000-0002-1416-3412
- family-names: Vaughan
given-names: Davis
email: davis@posit.co
orcid: https://orcid.org/0000-0003-4777-038X
year: '2024'
version: '>= 1.1.0'
- type: software
title: Rcpp
abstract: 'Rcpp: Seamless R and C++ Integration'
notes: LinkingTo
url: https://www.rcpp.org
repository: https://CRAN.R-project.org/package=Rcpp
authors:
- family-names: Eddelbuettel
given-names: Dirk
- family-names: Francois
given-names: Romain
- family-names: Allaire
given-names: JJ
- family-names: Ushey
given-names: Kevin
- family-names: Kou
given-names: Qiang
- family-names: Russell
given-names: Nathan
- family-names: Ucar
given-names: Inaki
- family-names: Bates
given-names: Douglas
- family-names: Chambers
given-names: John
year: '2024'
version: '>= 1.0.10'
- type: software
title: splines
abstract: 'R: A Language and Environment for Statistical Computing'
notes: Imports
authors:
- name: R Core Team
location:
name: Vienna, Austria
year: '2024'
institution:
name: R Foundation for Statistical Computing
- type: software
title: sn
abstract: 'sn: The Skew-Normal and Related Distributions Such as the Skew-t and
the SUN'
notes: Imports
url: http://azzalini.stat.unipd.it/SN/
repository: https://CRAN.R-project.org/package=sn
authors:
- family-names: Azzalini
given-names: Adelchi
email: adelchi.azzalini@unipd.it
orcid: https://orcid.org/0000-0002-7583-1269
year: '2024'
version: '>= 2.1.1'
- type: software
title: MASS
abstract: 'MASS: Support Functions and Datasets for Venables and Ripley''s MASS'
notes: Imports
url: http://www.stats.ox.ac.uk/pub/MASS4/
repository: https://CRAN.R-project.org/package=MASS
authors:
- family-names: Ripley
given-names: Brian
email: ripley@stats.ox.ac.uk
year: '2024'
- type: software
title: tidyr
abstract: 'tidyr: Tidy Messy Data'
notes: Imports
url: https://tidyr.tidyverse.org
repository: https://CRAN.R-project.org/package=tidyr
authors:
- family-names: Wickham
given-names: Hadley
email: hadley@posit.co
- family-names: Vaughan
given-names: Davis
email: davis@posit.co
- family-names: Girlich
given-names: Maximilian
year: '2024'
version: '>= 1.3.0'
- type: software
title: survival
abstract: 'survival: Survival Analysis'
notes: Suggests
url: https://github.com/therneau/survival
repository: https://CRAN.R-project.org/package=survival
authors:
- family-names: Therneau
given-names: Terry M
email: therneau.terry@mayo.edu
year: '2024'
- type: software
title: cometdata
abstract: 'cometdata: Computational Open-source Model for Transplantation data'
notes: Suggests
authors:
- family-names: Gunsalus
given-names: Paul
email: gunsalp@ccf.org
orcid: https://orcid.org/0000-0001-6976-9094
- family-names: Rose
given-names: Johnie
email: rosej15@ccf.org
orcid: https://orcid.org/0000-0001-7458-9303
- family-names: Dalton
given-names: Jarrod
email: daltonj@ccf.org
orcid: https://orcid.org/0000-0001-8336-1212
year: '2024'
version: '>= 0.0.1'
- type: software
title: testthat
abstract: 'testthat: Unit Testing for R'
notes: Suggests
url: https://testthat.r-lib.org
repository: https://CRAN.R-project.org/package=testthat
authors:
- family-names: Wickham
given-names: Hadley
email: hadley@posit.co
year: '2024'
version: '>= 3.0.0'
- type: software
title: 'R: A Language and Environment for Statistical Computing'
notes: Depends
url: https://www.R-project.org/
authors:
- name: R Core Team
location:
name: Vienna, Austria
year: '2024'
institution:
name: R Foundation for Statistical Computing
version: '>= 4.1.0'
GitHub Events
Total
- Push event: 2
Last Year
- Push event: 2
Dependencies
DESCRIPTION
cran
- R >= 4.2.0 depends
- MASS >= 7.3 imports
- Rcpp >= 1.0.10 imports
- dplyr >= 1.1.0 imports
- methods * imports
- plyr >= 1.8.8 imports
- purrr >= 1.0.1 imports
- rlang >= 1.1.1 imports
- sn >= 2.1.1 imports
- splines * imports
- stats >= 4.2.3 imports
- stringr >= 1.5.0 imports
- tidyr >= 1.3.0 imports
- survival * suggests