cureauxsp

An R package for fitting cure models with auxiliary survival information.

https://github.com/biostat-jieding/cureauxsp

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.4%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: biostat-jieding
  • Language: R
  • Default Branch: main
  • Homepage:
  • Size: 161 KB
Statistics
  • Stars: 4
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created about 4 years ago · Last pushed about 2 years ago
Metadata Files
Readme

README.md

CureAuxSP

This is an R package for fitting Cure models with Auxiliary information in the form of Subgroup survival Probabilities.

- The underlying methods are based on the paper "Efficient auxiliary information synthesis for cure rate model", which has been accepted by the Journal of the Royal Statistical Society Series C (Applied Statistics), DOI: 10.1093/jrsssc/qlad106.
- We have also written a paper, "CureAuxSP: An R package for estimating mixture cure models with auxiliary survival probabilities", which describes the package in detail and has been accepted by Computer Methods and Programs in Biomedicine, DOI: 10.1016/j.cmpb.2024.108212.
- The package can also be downloaded as a standard R package from https://cran.r-project.org/web/packages/CureAuxSP/index.html.

Package description and included main functions

The package focuses on semiparametric mixture cure models, namely the PH mixture cure model and the AFT mixture cure model, each fitted with or without auxiliary subgroup survival probabilities.
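
A mixture cure model treats the population as a blend of cured subjects, who never experience the event, and uncured subjects, whose event times follow a latency model. The sketch below evaluates the resulting population survival function for a PH latency part combined with a logistic incidence part. The function `pop_survival()`, the coefficient values, and the unit-rate baseline are assumptions made for illustration; none of them come from CureAuxSP, and the package's own parameterization may differ.

```R
# Population survival under a PH mixture cure model (illustrative sketch).
#   S_pop(t) = (1 - pi) + pi * S_u(t)
# where pi is the probability of being uncured and S_u is the survival
# function of uncured subjects.
pop_survival <- function(t, x, z, beta, gamma, base_cumhaz) {
  # incidence part: logistic model with an intercept (assumed convention)
  pi_uncured <- plogis(sum(c(1, z) * gamma))
  # latency part: proportional hazards on the baseline cumulative hazard
  s_uncured <- exp(-base_cumhaz(t) * exp(sum(x * beta)))
  (1 - pi_uncured) + pi_uncured * s_uncured
}

# hypothetical inputs: unit-rate exponential baseline, made-up coefficients
base_cumhaz <- function(t) t
s <- pop_survival(t = c(0, 1, 5, 50), x = c(0.5, 1), z = 0.5,
                  beta = c(0.3, -0.2), gamma = c(0.4, 0.8),
                  base_cumhaz = base_cumhaz)

# the curve levels off at the cure probability instead of dropping to zero
cure_prob <- 1 - plogis(0.4 + 0.8 * 0.5)
print(rbind(survival = s, plateau = cure_prob))
```

The plateau at large `t` is the feature that the Kaplan-Meier plot in the TCGA example is meant to reveal.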

The main functions are listed below (see their help pages for more details):

- `SMC.AuxSP`: fit the models in various ways, with syntax `SMC.AuxSP(formula, cureform, sdata, aux = NULL, hetero = FALSE, N = Inf, latency = "PH", nboot = 400)`
- `print.SMC.AuxSP`: print the results returned by `SMC.AuxSP()`, with syntax `print.SMC.AuxSP(object)`

Two numerical illustrations

An example using a simulated dataset is shown below:

```R
## library
library(survival)
library(CureAuxSP)

## generate both the internal dataset of interest and the external dataset

# - the internal dataset
set.seed(1)
sdata.internal <- sdata.SMC(n = 300)
head(sdata.internal)

# - the external dataset
set.seed(1)
sdata.external <- sdata.SMC(n = 10000)

## prepare the auxiliary information based on the external dataset

# - define two functions for subgroup splitting
gfunc.t1 <- function(X, Z = NULL) {
  rbind((X[,1] <  0 & X[,2] == 0), (X[,1] >= 0 & X[,2] == 0),
        (X[,1] <  0 & X[,2] == 1), (X[,1] >= 0 & X[,2] == 1))
}
gfunc.t2 <- function(X, Z = NULL) {
  rbind((X[,2] == 0), (X[,2] == 1))
}

# - calculate subgroup survival rates
sprob.t1 <- Probs.Sub(tstar = 1, sdata = sdata.external,
                      G = gfunc.t1(X = sdata.external[,-c(1,2)]))
sprob.t2 <- Probs.Sub(tstar = 2, sdata = sdata.external,
                      G = gfunc.t2(X = sdata.external[,-c(1,2)]))
cat("Information at t* = 1:", sprob.t1,
    "\nInformation at t* = 2:", sprob.t2)

# - prepare the set that collects information about auxiliary data
aux <- list(
  time1 = list(tstar = 1, gfunc = gfunc.t1,
               sprob = c(0.73,0.70,0.88,0.83)),
  time2 = list(tstar = 2, gfunc = gfunc.t2,
               sprob = c(0.62,0.76)-0.20)
)

## fit the model without auxiliary information
set.seed(1)
sol.PHMC <- SMC.AuxSP(
  formula = Surv(yobs,delta) ~ X1 + X2, cureform = ~ X1,
  sdata = sdata.internal, aux = NULL, latency = "PH"
)
print.SMC.AuxSP(object = sol.PHMC)

## fit the model with auxiliary information

# - ignore heterogeneity
set.seed(1)
sol.PHMC.Homo <- SMC.AuxSP(
  formula = Surv(yobs,delta) ~ X1 + X2, cureform = ~ X1,
  sdata = sdata.internal, aux = aux, hetero = FALSE, latency = "PH"
)
print.SMC.AuxSP(object = sol.PHMC.Homo)

# - consider heterogeneity
set.seed(1)
sol.PHMC.Hetero <- SMC.AuxSP(
  formula = Surv(yobs,delta) ~ X1 + X2, cureform = ~ X1,
  sdata = sdata.internal, aux = aux, hetero = TRUE, latency = "PH"
)
print.SMC.AuxSP(object = sol.PHMC.Hetero)
```
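
The auxiliary information in the simulated example consists of survival probabilities at a time point `tstar` within covariate-defined subgroups. As a rough illustration of what such a quantity is, the sketch below computes a Kaplan-Meier estimate at `tstar` within each row of a subgroup indicator matrix, using base R only. The helpers `km_at()` and `subgroup_surv()` and the toy data are hypothetical and written for this illustration; they are not part of CureAuxSP, and `Probs.Sub()` may compute its result differently.

```R
# Kaplan-Meier estimate of S(tstar) from right-censored data (base R only)
km_at <- function(time, status, tstar) {
  event_times <- sort(unique(time[status == 1 & time <= tstar]))
  surv <- 1
  for (tj in event_times) {
    at_risk <- sum(time >= tj)                 # still under observation at tj
    events  <- sum(time == tj & status == 1)   # events occurring at tj
    surv <- surv * (1 - events / at_risk)
  }
  surv
}

# apply km_at() within each subgroup; G is a K x n logical matrix
subgroup_surv <- function(time, status, G, tstar) {
  apply(G, 1, function(in_group) km_at(time[in_group], status[in_group], tstar))
}

# toy data (made up), laid out like the simulated example: yobs, delta, X1, X2
toy <- data.frame(
  yobs  = c(0.5, 1.2, 2.0, 0.8, 3.0, 1.5, 0.3, 2.5),
  delta = c(1,   0,   1,   1,   0,   1,   1,   0),
  X1    = c(-1,  0.5, -0.2, 1,  0.3, -0.7, 0.9, -0.4),
  X2    = c(0,   0,   1,   1,   0,   1,   0,   1)
)

# same splitting rule as gfunc.t2: one row per subgroup, one column per subject
gfunc <- function(X, Z = NULL) rbind((X[, 2] == 0), (X[, 2] == 1))
G <- gfunc(X = toy[, c("X1", "X2")])

sprob <- subgroup_surv(toy$yobs, toy$delta, G, tstar = 1)
print(sprob)
```

Each column of `G` contains exactly one `TRUE`, so the subgroups partition the sample; the length of `sprob` equals the number of rows of `G`, which is the layout the `aux` list entries follow in the example.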

An example using a real dataset from the TCGA program is shown below:

```R
## library
library(survival)
library(CureAuxSP)

## prepare the breast cancer dataset

# - download clinical data from the TCGA website
library(TCGAbiolinks)
query <- GDCquery(project = "TCGA-BRCA",
                  data.category = "Clinical",
                  file.type = "xml")
GDCdownload(query)
clinical <- GDCprepare_clinic(query, clinical.info = "patient")

# - a preparation
sdata.pre <- data.frame(
  yobs   = ifelse(!is.na(clinical[,'days_to_death']),clinical[,'days_to_death'],clinical[,'days_to_last_followup'])/365,
  delta  = ifelse(!is.na(clinical[,'days_to_death']),1,0),
  ER     = ifelse(clinical[,'breast_carcinoma_estrogen_receptor_status']=='Positive',1,
                  ifelse(clinical[,'breast_carcinoma_estrogen_receptor_status']=='Negative',0,NA)),
  Age    = clinical[,'age_at_initial_pathologic_diagnosis'],
  Race   = ifelse(clinical[,'race_list']=='BLACK OR AFRICAN AMERICAN','black',
                  ifelse(clinical[,'race_list']=='WHITE','white','other')),
  Gender = ifelse(clinical[,'gender']=='FEMALE','Female',
                  ifelse(clinical[,'gender']=='MALE','Male',NA)),
  Stage  = sapply(clinical[,'stage_event_pathologic_stage'],
                  function(x,pattern='Stage X|Stage IV|Stage [I]*'){
                    ifelse(grepl(pattern,x),regmatches(x,regexpr(pattern,x)),NA)},
                  USE.NAMES = FALSE)
)

# - extract covariates and remove undesirable subjects and NA
sdata.TCGA <- na.omit(
  sdata.pre[
    sdata.pre[,'yobs'] > 0
    & sdata.pre[,'Age'] <= 75
    & sdata.pre[,'Gender'] == "Female"
    & sdata.pre[,'Race'] %in% c('white')
    & sdata.pre[,'Stage'] %in% c('Stage I','Stage II','Stage III'),
    c('yobs','delta','Age','ER')
  ]
)
rownames(sdata.TCGA) <- NULL

# - summary statistics of the internal dataset
summary(sdata.TCGA)

## plot a figure to show the existence of a cure fraction
pdf("FigureKMTCGA_BRCA.pdf",width=8.88,height=6.66); {
  plot(
    survival::survfit(survival::Surv(yobs, delta) ~ 1, data = sdata.TCGA),
    conf.int = TRUE, mark.time = TRUE, lwd = 2,
    ylab = "Survival Probability", xlab = "Survival Time (in Years)",
    xlim = c(0,25), ylim = c(0,1)
  )
}; dev.off()

## fit the model without auxiliary information

# - rescale the Age variable
Age.Min <- min(sdata.TCGA$Age); Age.Max <- max(sdata.TCGA$Age)
sdata.TCGA$Age <- (sdata.TCGA$Age-Age.Min)/(Age.Max-Age.Min)

# - fit the model
set.seed(1)
sol.PHMC <- SMC.AuxSP(
  formula = Surv(yobs,delta) ~ Age + ER, cureform = ~ Age + ER,
  sdata = sdata.TCGA, aux = NULL, latency = "PH"
)
print.SMC.AuxSP(object = sol.PHMC)

## fit the model with auxiliary information

# - prepare the auxiliary information
Age.Cut <- c(0,(c(40,50,60)-Age.Min)/(Age.Max-Age.Min),1)
gfunc.t1 <- function(X,Z){
  rbind((X[,1] >= Age.Cut[1] & X[,1] <  Age.Cut[2] & X[,2] == 1),
        (X[,1] >= Age.Cut[2] & X[,1] <  Age.Cut[3] & X[,2] == 1),
        (X[,1] >= Age.Cut[3] & X[,1] <  Age.Cut[4] & X[,2] == 1),
        (X[,1] >= Age.Cut[4] & X[,1] <= Age.Cut[5] & X[,2] == 1),
        (X[,1] >= Age.Cut[1] & X[,1] <  Age.Cut[2] & X[,2] == 0),
        (X[,1] >= Age.Cut[2] & X[,1] <  Age.Cut[3] & X[,2] == 0),
        (X[,1] >= Age.Cut[3] & X[,1] <  Age.Cut[4] & X[,2] == 0),
        (X[,1] >= Age.Cut[4] & X[,1] <= Age.Cut[5] & X[,2] == 0))
}
gfunc.t2 <- function(X,Z){rbind((X[,2] == 1), (X[,2] == 0))}
aux <- list(
  time1 = list(tstar = 5, gfunc = gfunc.t1,
               sprob = c(0.810,0.935,0.925,0.950,0.695,0.780,0.830,0.850)),
  time2 = list(tstar = 10, gfunc = gfunc.t2,
               sprob = c(0.825,0.705))
)

# - ignore heterogeneity
set.seed(1)
sol.PHMC.Homo <- SMC.AuxSP(
  formula = Surv(yobs,delta) ~ Age + ER, cureform = ~ Age + ER,
  sdata = sdata.TCGA, aux = aux, hetero = FALSE, N = 1910, latency = "PH"
)
print.SMC.AuxSP(object = sol.PHMC.Homo)

# - consider heterogeneity
set.seed(1)
sol.PHMC.Hetero <- SMC.AuxSP(
  formula = Surv(yobs,delta) ~ Age + ER, cureform = ~ Age + ER,
  sdata = sdata.TCGA, aux = aux, hetero = TRUE, N = 1910, latency = "PH"
)
print.SMC.AuxSP(object = sol.PHMC.Hetero)
```
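
In the TCGA example, `Age` is rescaled to [0, 1] before fitting, so the age cut points 40, 50 and 60 have to be mapped onto the same scale before the subgroups are defined. The sketch below uses made-up ages rather than TCGA data to check that grouping on the rescaled scale matches grouping on raw ages, and that the eight age-by-ER subgroups place every subject in exactly one row. The toy values and all object names here are assumptions for illustration only.

```R
# made-up ages and ER status (not TCGA data)
age_raw <- c(26, 39, 40, 47, 50, 59, 60, 75)
er      <- c(1,  0,  1,  0,  1,  0,  1,  0)

# min-max rescaling, as applied to Age in the example
age_min <- min(age_raw); age_max <- max(age_raw)
age_scaled <- (age_raw - age_min) / (age_max - age_min)

# cut points 40, 50, 60 expressed on the rescaled [0, 1] scale
age_cut <- c(0, (c(40, 50, 60) - age_min) / (age_max - age_min), 1)

# age band of each subject: [cut1, cut2), [cut2, cut3), [cut3, cut4), [cut4, cut5]
band_scaled <- findInterval(age_scaled, age_cut, rightmost.closed = TRUE)
band_raw    <- findInterval(age_raw, c(-Inf, 40, 50, 60, Inf))

# subgroup indicator matrix: four age bands for ER = 1, then four for ER = 0
G <- rbind(
  t(sapply(1:4, function(k) band_scaled == k & er == 1)),
  t(sapply(1:4, function(k) band_scaled == k & er == 0))
)

# one auxiliary survival probability per subgroup row (values from the example)
sprob <- c(0.810, 0.935, 0.925, 0.950, 0.695, 0.780, 0.830, 0.850)

print(band_scaled)
print(colSums(G))
```

If the cut points were left on the raw scale while `Age` was rescaled, every subject would fall below the first cut point, so checks of this kind are a cheap safeguard before passing `aux` to the fitting function.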

Owner

  • Name: Jie Ding
  • Login: biostat-jieding
  • Kind: user

I focus on the application of survival analysis in biology. My current research interests are transfer learning and high-dimensional inference.

Packages

  • Total packages: 1
  • Total downloads:
    • cran 290 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 1
  • Total maintainers: 1
cran.r-project.org: CureAuxSP

Mixture Cure Models with Auxiliary Subgroup Survival Probabilities

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 290 Last month
Rankings
Dependent packages count: 28.1%
Dependent repos count: 36.1%
Average: 49.7%
Downloads: 85.0%
Maintainers (1)