comet

COMET: Computational Open-source Model for Evaluating Transplantation

https://github.com/clevelandclinicqhs/comet


Repository

Basic Info

Host: GitHub
Owner: ClevelandClinicQHS
License: other
Language: R
Default Branch: main
Homepage:
Size: 666 MB

Statistics

Stars: 3
Watchers: 1
Forks: 0
Open Issues: 1
Releases: 2

Created almost 3 years ago · Last pushed over 1 year ago

README.Rmd

---
output: github_document
---



```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# COMET




The goal of COMET^[Rose, Johnie, Paul R. Gunsalus, Carli J. Lehr, Mark F. Swiler, Jarrod E. Dalton, and Maryam Valapour. "A modular simulation framework for organ allocation." The Journal of Heart and Lung Transplantation (2024). https://doi.org/10.1016/j.healun.2024.04.063] is to provide an open-source framework for simulating the United States lung allocation system.

## Installation

You can install the development version of COMET like so:
```{r, eval=FALSE}
# install.packages("devtools")
devtools::install_github("ClevelandClinicQHS/COMET")
```

COMET also relies on the `cometdata` package. You will be prompted to install it if it is not installed already. It is needed for `run_simulation` and for `gen_and_spawn_candidates`/`gen_and_spawn_donors`.
```{r, eval=FALSE}
devtools::install_github("ClevelandClinicQHS/cometdata")
```

## Basic example
This will simulate the 2015 Lung Allocation Score (LAS) policy for 400 days (`r1`) and then the same scenario under the Composite Allocation Score (CAS) policy (`r2`).
```{r example}
library(COMET)

r1 <- run_simulation(days = 400, can_start = 1000,
                     match_alg = match_las, wl_model = "LAS15", post_tx_model = "LAS15",
                     wl_weight = 2, post_tx_weight = 1, wl_cap = 365, post_tx_cap = 365, seed = 26638)

r2 <- run_simulation(days = 400, can_start = 1000,
                     match_alg = match_cas, wl_model = "CAS23", post_tx_model = "CAS23",
                     wl_weight = 0.25, post_tx_weight = 0.25, wl_cap = 365, post_tx_cap = 1826,
                     bio_weight = .15, pld_weight = 0.05, peds_weight = 0.2, efficiency_weight = 0.1, seed = 26638)

```

`eval_simulation` is a simple way to evaluate COMET runs. One can also look at the statistics by diagnosis group or another variable. For details on what is reported, run `help('eval_simulation')`.
```{r comparison of simulations}
# see how many were transplanted, donor count, waitlist deaths
eval_simulation(r1)

eval_simulation(r2)

## See results by diagnosis group
eval_simulation(r1, group = dx_grp)
eval_simulation(r2, group = dx_grp)
```

One can also mix and match the models and weights under the LAS/CAS policies. `r3` simulates the LAS policy, but with the CAS waitlist and post-transplant mortality models (post-transplant survival truncated at 1 year). `r4` is the same scenario as `r3`, but waitlist mortality receives the same weight as post-transplant mortality (1 to 1).
```{r example2}
r3 <- run_simulation(days = 400, can_start = 1000,
                     match_alg = match_las, wl_model = "CAS23", post_tx_model = "CAS23",
                     wl_weight = 2, post_tx_weight = 1, wl_cap = 365, post_tx_cap = 365, seed = 26638)

r4 <- run_simulation(days = 400, can_start = 1000,
                     match_alg = match_las, wl_model = "CAS23", post_tx_model = "CAS23",
                     wl_weight = 1, post_tx_weight = 1, wl_cap = 365, post_tx_cap = 365, seed = 26638)
```
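To make the effect of `wl_weight` and `post_tx_weight` concrete, here is a minimal sketch of an LAS-style score in base R. It assumes the published LAS form (the post-transplant survival measure minus the weighted waitlist survival measure, rescaled to 0-100). The helper `las_style_score`, the rescaling for weights other than 2 to 1, and the two candidates are hypothetical and are not taken from COMET.

```r
# Hypothetical helper, NOT a COMET function: a weighted LAS-style score.
# wl_auc:      expected days lived on the waiting list (0 to wl_cap)
# post_tx_auc: expected days lived after transplant (0 to post_tx_cap)
las_style_score <- function(wl_auc, post_tx_auc,
                            wl_weight = 2, post_tx_weight = 1,
                            wl_cap = 365, post_tx_cap = 365) {
  raw <- post_tx_weight * post_tx_auc - wl_weight * wl_auc
  lo  <- -wl_weight * wl_cap            # lowest possible raw score
  hi  <- post_tx_weight * post_tx_cap   # highest possible raw score
  100 * (raw - lo) / (hi - lo)          # rescale to 0-100
}

# Two made-up candidates: one urgent (few expected waitlist days), one stable
wl_auc      <- c(urgent = 60,  stable = 340)
post_tx_auc <- c(urgent = 300, stable = 330)

score_2_to_1 <- las_style_score(wl_auc, post_tx_auc, wl_weight = 2)  # weights as in r1/r3
score_1_to_1 <- las_style_score(wl_auc, post_tx_auc, wl_weight = 1)  # weights as in r4

round(rbind(score_2_to_1, score_1_to_1), 1)
```

With these made-up inputs the urgent candidate scores higher under both weightings, but the gap to the stable candidate narrows when waitlist mortality is weighted 1 to 1.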

## A more in-depth demonstration of what is going on behind the scenes
### Generating candidates and donors
One major advantage of the simulation is the use of synthetic candidates and donors.
```{r generation}
set.seed(26638)
syn_cands <- gen_and_spawn_candidates(days = 10)
syn_dons <- gen_and_spawn_donors(days = 10)
head(syn_cands)
head(syn_dons)
```

For this scenario, I'm going to add `r max(r2$all_candidates$listing_day)` days to the listing day in `syn_cands` and to the recovery day in `syn_dons`, and also make sure the candidate and donor ids do not conflict with those already simulated. I'm going to continue from the results of the CAS simulation (`r2`). The notation for the LAS simulation (`r1`) is very similar.
```{r iterate}
syn_cands <- dplyr::mutate(syn_cands,
                           listing_day = listing_day + max(r2$all_candidates$listing_day),
                           c_id = c_id + max(r2$all_candidates$c_id))

syn_dons <- dplyr::mutate(syn_dons,
                          recovery_day = recovery_day + max(r2$all_candidates$listing_day),
                          d_id = d_id + max(r2$all_donors$d_id))

r2cc <- dplyr::mutate(r2$current_candidates,
                      days_on_waitlist = 401 - listing_day)

r2rd <- dplyr::mutate(r2$recipient_database,
                      days_after_tx = 401 - transplant_day)

l <- list("current_candidates" = r2cc,
          "recipient_database" = r2$recipient_database,
          "waitlist_death_database" = r2$waitlist_death_database,
          "post_tx_death_database" = r2$post_tx_death_database,
          "non_used_donors" = r2$non_used_donors)

l_1 <- iteration(401, syn_cands, syn_dons, include_matches = FALSE, updated_list = l, match_alg = match_cas,
                 wl_model = "CAS23", post_tx_model = "CAS23", wl_weight = 0.25, post_tx_weight = 0.25,
                 wl_cap = 365, post_tx_cap = 1826, bio_weight = .15, pld_weight = 0.05,
                 peds_weight = 0.2, efficiency_weight = 0.1)

l_2 <- iteration(402, syn_cands, syn_dons, include_matches = FALSE, updated_list = l_1, match_alg = match_cas,
                 wl_model = "CAS23", post_tx_model = "CAS23", wl_weight = 0.25, post_tx_weight = 0.25,
                 wl_cap = 365, post_tx_cap = 1826, bio_weight = .15, pld_weight = 0.05,
                 peds_weight = 0.2, efficiency_weight = 0.1)

nrow(l_2$waitlist_death_database) - nrow(l_1$waitlist_death_database)
nrow(l_2$post_tx_death_database)-nrow(l_1$post_tx_death_database)
```

Looking in depth, `r nrow(l_2$waitlist_death_database)-nrow(l_1$waitlist_death_database)` candidates died on the waiting list and `r nrow(l_2$post_tx_death_database)-nrow(l_1$post_tx_death_database)` recipients died post-transplant on Day 402 in this scenario. The following would run ten days sequentially:
```{r full_iterate}
l_i <- l
for(i in 401:410){
  l_i <- iteration(i,
                   syn_cands, syn_dons, include_matches = FALSE, updated_list = l_i,
                   match_alg = match_cas,  wl_model = "CAS23", post_tx_model = "CAS23",
                   wl_weight = 0.25, post_tx_weight = 0.25, wl_cap = 365, post_tx_cap = 1826,
                   bio_weight = .15, pld_weight = 0.05, peds_weight = 0.2, efficiency_weight = 0.1)
  
}
```

### Within each day
#### Updating candidates/recipients
`update_patients` is one of the steps that occur each day. It outputs a list of those who died, those who were removed for other reasons (if on the waiting list), and those who are still alive, along with their updated characteristics. Currently, the only characteristic that is updated is the candidate's/recipient's age, but the function is designed to be flexible enough to update any clinical characteristic (e.g. $\textrm{pCO}_2$, $\textrm{FEV}_1$, bilirubin).

Within `update_patients` are:

* `identify_deaths` returns a dataset of those who died either on the waiting list or post-transplant^[If one wonders what model "CAS23r" is, it is the baseline hazard for the "CAS23" models 'recalibrated' after research and initial simulation runs showed that the "CAS23" model did not accurately simulate waitlist and post-transplant deaths.]

* `identify_removals` returns a dataset of those who were removed from the waiting list^[Models for waitlist removal for other reasons are not part of the LAS/CAS and were modeled separately. The maximum day for this model is 2365 days after being placed on the waiting list.]

```{r updating}
update_patients(patients = r2cc, model = "CAS23r", 
                elapsed_time = days_on_waitlist, pre_tx = TRUE, cap = 365, date = 401)

identify_deaths(patients = r2cc, model = "CAS23r",
                elapsed_time = days_on_waitlist, pre_tx = TRUE, cap = 365, date = 401)

identify_removals(patients = r2cc, elapsed_time = days_on_waitlist, cap = 2365)

## post-transplant deaths
update_patients(r2rd, model = "CAS23r", 
                elapsed_time = days_after_tx, pre_tx = FALSE, cap = 1826, date = 401)

```
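As an illustration of the idea behind a daily death step, the sketch below uses a generic Cox-type survival model in base R. It is not COMET's internal code: the baseline survival curve, the patient table and the helper `surv_at` are all hypothetical, and it assumes survival of the form $S(t \mid lp) = S_0(t)^{\exp(lp)}$ with one Bernoulli draw per patient per day.

```r
# Generic Cox-model sketch; NOT COMET's internal code.
# All names and numbers below are hypothetical.
set.seed(26638)

# Made-up baseline survival curve S0(t) for days 0..365
days <- 0:365
s0   <- exp(-0.0005 * days)

# Three made-up candidates: linear predictor and days already on the waiting list
patients <- data.frame(c_id = 1:3,
                       lp = c(0, 1, -0.5),
                       days_on_waitlist = c(10, 200, 364))

# Survival for a patient on a given day: S(t | lp) = S0(t)^exp(lp)
surv_at <- function(day, lp) s0[day + 1]^exp(lp)

# Probability of dying on the current day, given alive at the end of the previous day
p_death <- 1 - surv_at(patients$days_on_waitlist, patients$lp) /
               surv_at(patients$days_on_waitlist - 1, patients$lp)

# One Bernoulli draw per patient decides who dies today
patients$died <- rbinom(nrow(patients), size = 1, prob = p_death) == 1
patients
```

Patients with a larger linear predictor have a higher daily death probability, which is how a mortality model can drive who leaves the waiting list on each simulated day.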

#### Matching
`match_cas` and `match_las` are the two main functions that handle screening, matching and acceptance.
```{r matching}
## Donors recovered on day 401
dons_401 <- dplyr::filter(syn_dons, recovery_day == 401)

mtc <- match_cas(cands = r2cc, dons = dons_401, wl_model = "CAS23", post_tx_model = "CAS23", wl_weight = 0.25,
                 post_tx_weight = 0.25, wl_cap = 365, post_tx_cap = 1826, bio_weight = .15,
                 pld_weight = 0.05, peds_weight = 0.2, efficiency_weight = 0.1)

mtc

```

#### Screening
Each screening function takes candidates and donors as input. Shown here are the screens for height (`hm1`), blood type (`am1`) and organ count (`cm1`, double vs. single transplant), plus a distance calculation (`lm1`). One can also create additional screening functions.
```{r screening}
## Screening happens within matching.
## Screening is done for height, blood type, and organ count.
hm1 <- height_screen(cands = r2cc, dons = dons_401)
hm1
## returns whether each pair is compatible for a single or double transplant; double-transplant candidates are matched for a double lung
am1 <- abo_screen(cands = r2cc, dons = dons_401)
am1
## returns ABO-compatible pairs and flags exact matches (abo_exact), since that is a tie breaker in the LAS

## Ensure double with double, singles may take a double if only is assigned or either can
cm1 <- count_screen(cands = r2cc, dons = dons_401)
cm1

## Distance calculation 
lm1 <- dist_calc(cands = r2cc, dons = dons_401)
lm1


```
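Since additional screening functions can be written, here is a minimal base R sketch of what a blood-type screen might look like. The toy data, the column names `cand_abo` and `don_abo`, and the output layout are assumptions for illustration and do not reproduce the input or output format of `abo_screen`. Only the standard ABO donor-to-recipient compatibility rules are taken as given.

```r
# Hypothetical toy data; column names are illustrative, not COMET's.
cands <- data.frame(c_id = 1:4, cand_abo = c("A", "B", "AB", "O"),
                    stringsAsFactors = FALSE)
dons  <- data.frame(d_id = 101:102, don_abo = c("O", "A"),
                    stringsAsFactors = FALSE)

# Standard ABO rules: recipient blood types that can receive from each donor type
abo_ok <- list(O  = c("O", "A", "B", "AB"),
               A  = c("A", "AB"),
               B  = c("B", "AB"),
               AB = "AB")

# Every candidate-donor pairing (Cartesian product)
pairs <- merge(cands, dons, by = NULL)

# Flag compatible pairs and exact blood-type matches
pairs$abo_compatible <- mapply(function(d, r) r %in% abo_ok[[d]],
                               pairs$don_abo, pairs$cand_abo,
                               USE.NAMES = FALSE)
pairs$abo_exact <- pairs$don_abo == pairs$cand_abo

# Keep only the pairs that pass the screen
screened <- pairs[pairs$abo_compatible, c("c_id", "d_id", "abo_exact")]
screened
```

A screen written this way returns one row per compatible candidate-donor pair, which is a convenient shape for joining several screens together.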

In our simulation you will see two rankings: `ov_rank`, which stands for overall ranking, and `offer_rank`. `ov_rank` is the candidate's rank with respect to all candidates on that day based on their LAS/CAS score, regardless of donor/offer. `offer_rank` is the rank of the candidate with respect to all candidates compatible with the given donor. The part of the CAS score that does not change from donor to donor, called the "sub-CAS", is used to calculate `ov_rank`.

```{r over rank}
## ranking
lr1 <- calculate_sub_cas(r2cc, wl_model = "CAS23", post_tx_model = "CAS23", wl_weight = 0.25,
                         post_tx_weight = 0.25, wl_cap = 365, post_tx_cap = 1826, bio_weight = .15,
                         pld_weight = 0.05, peds_weight = 0.2)

lr1
```
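The difference between the two rankings can be sketched with toy numbers in base R. Everything below is hypothetical: the scores, the `efficiency` column and the assumption that the donor-specific component is simply added to the sub-CAS are for illustration only and are not COMET output.

```r
# Hypothetical toy data; not COMET output.
# sub_cas: donor-independent part of each candidate's score
cands <- data.frame(c_id = 1:4, sub_cas = c(31.2, 45.0, 28.7, 39.5))

# ov_rank: rank among all candidates on the day (1 = highest score)
cands$ov_rank <- rank(-cands$sub_cas, ties.method = "first")

# Compatible candidate-donor pairs left after screening, each with a made-up
# donor-specific efficiency component that is added on top of the sub-CAS
offers <- data.frame(d_id = c(101, 101, 101, 102, 102),
                     c_id = c(1, 2, 4, 1, 3),
                     efficiency = c(8.0, 1.5, 9.0, 2.0, 6.5))
offers <- merge(offers, cands, by = "c_id")
offers$score <- offers$sub_cas + offers$efficiency

# offer_rank: rank among the compatible candidates for each donor
offers$offer_rank <- ave(-offers$score, offers$d_id,
                         FUN = function(x) rank(x, ties.method = "first"))

offers[order(offers$d_id, offers$offer_rank), ]
```

In this toy example candidate 2 has the best overall rank, yet candidate 4 receives the first offer from donor 101 because of the larger donor-specific component.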

#### Offering and Acceptance
Once all of the screening criteria have been applied, the candidates who pass every screen are matched to donors, and offers are made in successive order based on either the CAS or LAS policy. The `acceptance_prob` function calculates the acceptance probability and runs a binomial draw to determine whether or not each candidate accepts the organ. Lastly, `transplant_candidates` finds the first candidate who accepts the donor lung and transplants them, so that the donor identifier `d_id` is tracked with the candidate.
```{r offers}

## matching and offering rank
## PRA screening is not currently implemented;
## if you run `pra_screen`, it returns a combination of every candidate and donor
offs <- cas_offer_rank(hm1, am1, cm1, lm1, overall_ranking = lr1, efficiency_weight = 0.1)
offs

set.seed(26638)
acceptance_prob(offs, dons = dons_401, cands = r2cc)

transplant_candidates(matches = mtc)
```
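The offer-and-accept logic can be sketched in a few lines of base R. This is a simplified illustration under stated assumptions, not the code behind `acceptance_prob` or `transplant_candidates`: the acceptance probabilities and the `offers` table are made up.

```r
# Hypothetical sketch of sequential offers for a single donor.
set.seed(26638)

# Offers ordered by offer_rank, with made-up acceptance probabilities
offers <- data.frame(d_id = 101,
                     c_id = c(4, 2, 1),
                     offer_rank = 1:3,
                     p_accept = c(0.30, 0.55, 0.80))

# One Bernoulli draw per offer: 1 = would accept, 0 = would decline
offers$accepted <- rbinom(nrow(offers), size = 1, prob = offers$p_accept)

# The organ goes to the best-ranked candidate who accepts;
# if nobody accepts, `recipient` has zero rows and the donor goes unused
accepted  <- offers[offers$accepted == 1, ]
recipient <- accepted[which.min(accepted$offer_rank), c("c_id", "d_id")]
recipient
```

Drawing all acceptances up front and then taking the lowest `offer_rank` gives the same recipient as offering one candidate at a time, because each draw is independent.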

## Other tips and tricks

You will notice a `pra_screen` function. This is a place-holder for now that returns a grid of every candidate and donor pairing. Future implementation will incorporate PRA screening criteria.
```{r pra}
pra_screen(cands = r2cc, dons = dons_401)
```

To return the matches along with the simulated output, set `include_matches = TRUE` in `run_simulation` or `iteration`. The output, called `all_matches` and shown below, is a nested tibble with a vector of donor ids (`d_id`) that are ranked in order for the subsequent offer.
```{r in_match}
l_1m <- iteration(401, syn_cands, syn_dons, include_matches = TRUE, updated_list = l, match_alg = match_cas,
                 wl_model = "CAS23", post_tx_model = "CAS23", wl_weight = 0.25, post_tx_weight = 0.25,
                 wl_cap = 365, post_tx_cap = 1826, bio_weight = .15, pld_weight = 0.05,
                 peds_weight = 0.2, efficiency_weight = 0.1)

l_1m$all_matches
```

## A few helpful tips
Behind the scenes, `eval_simulation` runs a function called `concat2`. It concatenates the several candidate/recipient datasets into one and calculates survival data as well. By default it uses the minimum and maximum days simulated from `all_candidates` in a COMET object, but those can also be specified, for example to keep only candidates listed from day 31 (`min_enroll_date`) through day 300 (`max_enroll_date`), or to censor post-transplant follow-up early at day 330 (`post_tx_censor_date`). The waitlist censoring date (`wl_censor_date`) can also be changed, but we recommend keeping it the same as `max_enroll_date`.
```{r concat}
concat2(r1)
concat2(r1, min_enroll_date = 31, max_enroll_date = 300, wl_censor_date = 300, post_tx_censor_date = 330)

```

The function `spec_day` will allow one to grab the waiting list on a specific day.
```{r spec}
spec_day(r1, day = 300)
```

`rmst` outputs the restricted mean survival time (RMST), or area under the curve (AUC), for the LAS and CAS waitlist and post-transplant mortality models, as well as the linear predictor. The linear predictor is calculated using `calc_post_tx_cas23`, `calc_wl_cas23` and the other respective functions for the waitlist and post-transplant models of the CAS/LAS policies. Currently implemented are the 2015 LAS (`"LAS15"`), 2021 LAS (`"LAS21"`) and 2023 CAS (`"CAS23"`). `rmst` also runs `expected_survival`, which calculates the AUC for the candidate based on their linear predictor.

```{r rmst}
rmst("LAS21", cand_data = r2cc, cap = 365, wl = TRUE)
rmst("CAS23", cand_data = r2cc, cap = 1826, wl = FALSE)
```
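To show what "area under the curve" means here, the following base R sketch computes an RMST from a survival curve with the trapezoid rule. The baseline survival curve is made up, and the Cox-type relation $S(t \mid lp) = S_0(t)^{\exp(lp)}$ is an assumption. `rmst` may use a different baseline and a different numerical rule.

```r
# Hypothetical sketch: RMST as the area under a survival curve, capped at `cap` days.
cap  <- 365
days <- 0:cap
s0   <- exp(-0.0005 * days)   # made-up baseline survival S0(t)
lp   <- c(0, 1, -0.5)         # made-up linear predictors for three candidates

# One survival curve per candidate (columns), assuming S(t | lp) = S0(t)^exp(lp)
surv <- sapply(lp, function(x) s0^exp(x))

# Trapezoid rule with one-day steps: average of adjacent points, summed over days
rmst_days <- apply(surv, 2, function(s) sum((head(s, -1) + tail(s, -1)) / 2))
rmst_days
```

Each value is the expected number of days lived within the first `cap` days, so it can never exceed `cap`, and a higher linear predictor gives a smaller RMST.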

One can calculate the RMST by supplying a vector of survival times, or one can calculate a candidate's survival given the raw baseline survival value on a given day. For example, below we use Day 364 of the CAS23 waitlist survival model, which has a baseline survival value of 0.994390.
```{r exp_surv}
expected_survival(lp = c(0, 1, -0.5), 0.994390)
```
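For intuition, the single-day case can be reproduced by hand if one assumes the usual Cox proportional hazards relation $S(t \mid lp) = S_0(t)^{\exp(lp)}$. Whether `expected_survival` applies exactly this formula is an assumption here, so treat the sketch as a cross-check rather than a description of the function.

```r
# Hand calculation under the assumed Cox relation S(t | lp) = S0(t)^exp(lp)
s0_day364 <- 0.994390       # baseline survival on Day 364, as quoted above
lp        <- c(0, 1, -0.5)  # same linear predictors as in the call above

surv_day364 <- s0_day364^exp(lp)
surv_day364
```

A linear predictor of 0 returns the baseline value itself, a positive one lowers survival and a negative one raises it.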

Owner

Name: Cleveland Clinic Quantitative Health Sciences
Login: ClevelandClinicQHS
Kind: organization
Email: QHSGitHubAdmin@ccf.org
Location: United States of America

Website: https://www.lerner.ccf.org/quantitative-health/
Repositories: 12
Profile: https://github.com/ClevelandClinicQHS

Open-source projects from the Department of Quantitative Health Sciences at Cleveland Clinic

Citation (CITATION.cff)

# -----------------------------------------------------------
# CITATION file created with {cffr} R package, v0.5.0
# See also: https://docs.ropensci.org/cffr/
# -----------------------------------------------------------
 
cff-version: 1.2.0
message: 'To cite package "COMET" in publications use:'
type: software
license: MIT
title: 'COMET: Computational Open-Source Model for Evaluating Transplantation'
version: 0.1.0
abstract: COMET provides an open source simulation to model the United States lung
  allocation process within synthetic candidate and donor populations. For additional
  functionality, this package interacts with data that is available in the 'cometdata'
  package. To access this data run "remotes::install_github('https://github.com/ClevelandClinicQHS/cometdata',
  type = 'source')".
authors:
- family-names: Gunsalus
  given-names: Paul
  email: gunsalp@ccf.org
  orcid: https://orcid.org/0000-0001-6976-9094
- family-names: Rose
  given-names: Johnie
  email: rosej15@ccf.org
  orcid: https://orcid.org/0000-0001-7458-9303
- family-names: Dalton
  given-names: Jarrod
  email: daltonj@ccf.org
  orcid: https://orcid.org/0000-0001-8336-1212
contact:
- family-names: Gunsalus
  given-names: Paul
  email: gunsalp@ccf.org
  orcid: https://orcid.org/0000-0001-6976-9094
references:
- type: software
  title: Rcpp
  abstract: 'Rcpp: Seamless R and C++ Integration'
  notes: LinkingTo
  url: https://www.rcpp.org
  repository: https://CRAN.R-project.org/package=Rcpp
  authors:
  - family-names: Eddelbuettel
    given-names: Dirk
  - family-names: Francois
    given-names: Romain
  - family-names: Allaire
    given-names: JJ
  - family-names: Ushey
    given-names: Kevin
  - family-names: Kou
    given-names: Qiang
  - family-names: Russell
    given-names: Nathan
  - family-names: Ucar
    given-names: Inaki
  - family-names: Bates
    given-names: Douglas
  - family-names: Chambers
    given-names: John
  year: '2024'
- type: software
  title: rlang
  abstract: 'rlang: Functions for Base Types and Core R and ''Tidyverse'' Features'
  notes: Imports
  url: https://rlang.r-lib.org
  repository: https://CRAN.R-project.org/package=rlang
  authors:
  - family-names: Henry
    given-names: Lionel
    email: lionel@posit.co
  - family-names: Wickham
    given-names: Hadley
    email: hadley@posit.co
  year: '2024'
  version: '>= 1.1.1'
- type: software
  title: methods
  abstract: 'R: A Language and Environment for Statistical Computing'
  notes: Imports
  authors:
  - name: R Core Team
  location:
    name: Vienna, Austria
  year: '2024'
  institution:
    name: R Foundation for Statistical Computing
- type: software
  title: stats
  abstract: 'R: A Language and Environment for Statistical Computing'
  notes: Imports
  authors:
  - name: R Core Team
  location:
    name: Vienna, Austria
  year: '2024'
  institution:
    name: R Foundation for Statistical Computing
- type: software
  title: dplyr
  abstract: 'dplyr: A Grammar of Data Manipulation'
  notes: Imports
  url: https://dplyr.tidyverse.org
  repository: https://CRAN.R-project.org/package=dplyr
  authors:
  - family-names: Wickham
    given-names: Hadley
    email: hadley@posit.co
    orcid: https://orcid.org/0000-0003-4757-117X
  - family-names: François
    given-names: Romain
    orcid: https://orcid.org/0000-0002-2444-4226
  - family-names: Henry
    given-names: Lionel
  - family-names: Müller
    given-names: Kirill
    orcid: https://orcid.org/0000-0002-1416-3412
  - family-names: Vaughan
    given-names: Davis
    email: davis@posit.co
    orcid: https://orcid.org/0000-0003-4777-038X
  year: '2024'
  version: '>= 1.1.0'
- type: software
  title: Rcpp
  abstract: 'Rcpp: Seamless R and C++ Integration'
  notes: LinkingTo
  url: https://www.rcpp.org
  repository: https://CRAN.R-project.org/package=Rcpp
  authors:
  - family-names: Eddelbuettel
    given-names: Dirk
  - family-names: Francois
    given-names: Romain
  - family-names: Allaire
    given-names: JJ
  - family-names: Ushey
    given-names: Kevin
  - family-names: Kou
    given-names: Qiang
  - family-names: Russell
    given-names: Nathan
  - family-names: Ucar
    given-names: Inaki
  - family-names: Bates
    given-names: Douglas
  - family-names: Chambers
    given-names: John
  year: '2024'
  version: '>= 1.0.10'
- type: software
  title: splines
  abstract: 'R: A Language and Environment for Statistical Computing'
  notes: Imports
  authors:
  - name: R Core Team
  location:
    name: Vienna, Austria
  year: '2024'
  institution:
    name: R Foundation for Statistical Computing
- type: software
  title: sn
  abstract: 'sn: The Skew-Normal and Related Distributions Such as the Skew-t and
    the SUN'
  notes: Imports
  url: http://azzalini.stat.unipd.it/SN/
  repository: https://CRAN.R-project.org/package=sn
  authors:
  - family-names: Azzalini
    given-names: Adelchi
    email: adelchi.azzalini@unipd.it
    orcid: https://orcid.org/0000-0002-7583-1269
  year: '2024'
  version: '>= 2.1.1'
- type: software
  title: MASS
  abstract: 'MASS: Support Functions and Datasets for Venables and Ripley''s MASS'
  notes: Imports
  url: http://www.stats.ox.ac.uk/pub/MASS4/
  repository: https://CRAN.R-project.org/package=MASS
  authors:
  - family-names: Ripley
    given-names: Brian
    email: ripley@stats.ox.ac.uk
  year: '2024'
- type: software
  title: tidyr
  abstract: 'tidyr: Tidy Messy Data'
  notes: Imports
  url: https://tidyr.tidyverse.org
  repository: https://CRAN.R-project.org/package=tidyr
  authors:
  - family-names: Wickham
    given-names: Hadley
    email: hadley@posit.co
  - family-names: Vaughan
    given-names: Davis
    email: davis@posit.co
  - family-names: Girlich
    given-names: Maximilian
  year: '2024'
  version: '>= 1.3.0'
- type: software
  title: survival
  abstract: 'survival: Survival Analysis'
  notes: Suggests
  url: https://github.com/therneau/survival
  repository: https://CRAN.R-project.org/package=survival
  authors:
  - family-names: Therneau
    given-names: Terry M
    email: therneau.terry@mayo.edu
  year: '2024'
- type: software
  title: cometdata
  abstract: 'cometdata: Computational Open-source Model for Transplantation data'
  notes: Suggests
  authors:
  - family-names: Gunsalus
    given-names: Paul
    email: gunsalp@ccf.org
    orcid: https://orcid.org/0000-0001-6976-9094
  - family-names: Rose
    given-names: Johnie
    email: rosej15@ccf.org
    orcid: https://orcid.org/0000-0001-7458-9303
  - family-names: Dalton
    given-names: Jarrod
    email: daltonj@ccf.org
    orcid: https://orcid.org/0000-0001-8336-1212
  year: '2024'
  version: '>= 0.0.1'
- type: software
  title: testthat
  abstract: 'testthat: Unit Testing for R'
  notes: Suggests
  url: https://testthat.r-lib.org
  repository: https://CRAN.R-project.org/package=testthat
  authors:
  - family-names: Wickham
    given-names: Hadley
    email: hadley@posit.co
  year: '2024'
  version: '>= 3.0.0'
- type: software
  title: 'R: A Language and Environment for Statistical Computing'
  notes: Depends
  url: https://www.R-project.org/
  authors:
  - name: R Core Team
  location:
    name: Vienna, Austria
  year: '2024'
  institution:
    name: R Foundation for Statistical Computing
  version: '>= 4.1.0'

GitHub Events

Total

Push event: 2

Last Year

Push event: 2

Dependencies

DESCRIPTION cran

R >= 4.2.0 depends
MASS >= 7.3 imports
Rcpp >= 1.0.10 imports
dplyr >= 1.1.0 imports
methods * imports
plyr >= 1.8.8 imports
purrr >= 1.0.1 imports
rlang >= 1.1.1 imports
sn >= 2.1.1 imports
splines * imports
stats >= 4.2.3 imports
stringr >= 1.5.0 imports
tidyr >= 1.3.0 imports
survival * suggests
