Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
○codemeta.json file
-
○.zenodo.json file
-
○DOI references
-
○Academic publication links
-
✓Committers with academic emails
1 of 3 committers (33.3%) from academic institutions -
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (16.8%) to scientific vocabulary
Keywords
art
hiv
modeling-tools
package
r
Last synced: 6 months ago
·
JSON representation
Repository
Understanding suppression of HIV in R
Basic Info
- Host: GitHub
- Owner: SineadMorris
- License: other
- Language: R
- Default Branch: master
- Size: 5.12 MB
Statistics
- Stars: 2
- Watchers: 1
- Forks: 1
- Open Issues: 0
- Releases: 0
Topics
art
hiv
modeling-tools
package
r
Created over 6 years ago
· Last pushed almost 6 years ago
Metadata Files
Readme
License
README.Rmd
---
title: "ushr: understanding suppression of HIV in R"
output: github_document
bibliography: HIV.bib
link-citations: yes
csl: AIDSbibstyle.csl
---
[](https://travis-ci.com/SineadMorris/ushr)
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, fig.height = 6, fig.width = 8, message = FALSE, warning = FALSE)
```
## Introduction
In 2017, HIV/AIDS was responsible for the deaths of one million people globally, including 50,000 children less than one year old [@GBD2017paper; @GBD2017web]. Although mathematical modeling has provided important insights into the dynamics of HIV infection during anti-retroviral treatment (ART), there is still a lack of accessible tools for researchers unfamiliar with modeling techniques to apply them to their own datasets.
`ushr` is an open-source R package that models the decline of HIV during ART using a popular mathematical framework. The package can be applied to longitudinal data of viral load measurements, and automates all stages of the model fitting process. By mathematically fitting the data, important biological parameters can be estimated, including the lifespans of short and long-lived HIV-infected cells, and the time to suppress viral load below a defined detection threshold. The package also provides visualization and summary tools for fast assessment of model results.
Overall, we hope `ushr` will increase accessibility to mathematical modeling techniques so that greater insights on HIV infection and treatment dynamics may be gained.
##### Author and Contributors
Sinead E Morris (author and maintainer), Luise Dziobek-Garrett (contributor) and Andrew J Yates (contributor).
##### Citing this package
Citation information can be found using `citation("ushr")`; the package paper is open access and available at [BMC Bioinformatics](https://rdcu.be/b1yZU).
##### Recent updates
Versions 0.2.2 and 0.2.3
* Allows more flexible treatment of values below the detection threshold.
* Fixes errors thrown with dplyr 1.0.0 and tibble 3.0.0 releases.
Version 0.2.1
* Notation for the triphasic exponential model has been modified to more clearly relate to that of the biphasic model.
Version 0.2.0:
* Data from ART containing integrase inhibitors can be fit using a triphasic exponential model (see `?ushr_triphasic()`)
* Users can specify the range of initial observations from which the beginning of each individual trajectory is chosen
* Users can create pairwise correlation plots for all estimated parameters
##### Getting further information
If you encounter any bugs related to this package please contact the package author directly. Additional descriptions of the model and analyses performed by the package can be found in the vignette; details are also be available in [Morris et al. (2020) BMC Bioinformatics](https://rdcu.be/b1yZU). Further details on the mathematical theory can also be found in the references cited below.
## Background
Please read the package vignette for full details on the mathematical model and its implementation in `ushr`, including data processing, model fitting, and parameter estimation.
### Brief guide to the mathematical model
HIV dynamics in an infected individual can be mathematically described as the production and spread of virus by two groups of infected target cells: so called 'short-lived' infected cells that die at a fast rate (such as CD4 T cells), and other 'long-lived' infected cells that die more slowly (**Fig A**).
{width=650px}
Once ART has begun, the decline of HIV viral load, $V$, can be modeled using the following expression
$V(t)~ =$ A $\exp(-\delta ~t) ~+$ B $\exp(- \gamma~ t),$
where $\delta$ and $\gamma$ are the death rates of short and long-lived infected cells, respectively [@perelson1996hiv; @perelson1997a; @nowak2000book; @Shet2016]. This equation is referred to as the biphasic model: viral decay is fast initially, reflecting the loss of short-lived infected cells (at rate $\delta$), but then enters a slower decline phase reflecting the loss of long-lived infected cells (at rate $\gamma$) (**Fig B**). Eventually, viral load is suppressed below the detection threshold of the experiment (dashed line, Fig B). Note that for patient data exhibiting only one decline phase (for example, due to sparse or delayed viral load measurements), one can use a single phase version of the biphasic model given by
$V(t) =$ B̂ $\exp(-$γ̂ $t)$,
where decay could reflect the fast or the slow phase of virus suppression.
By fitting the model, we can estimate the death rate parameters and use these to calculate the lifespans of infected cells: $1/\delta$ and $1/\gamma$ for short and long-lived infected cells from the biphasic model, and 1/γ̂ for the single phase model. We can also estimate the time taken to suppress viral load ('time to suppression' (TTS)) by calculating the first time at which $V(t) = x$, where $x$ is a user-defined suppression threshold, and $V(t)$ is given by either the biphasic or single phase equation.
## Quick Start Example
To install the package from CRAN use
```{r install, eval = FALSE}
install.packages("ushr")
```
The vignette can be viewed [here](https://cran.r-project.org/web/packages/ushr/vignettes/Vignette.html). To install the package from Github, first install and load `devtools`, then install `ushr` as follows
```{r install2, eval = FALSE}
install.packages("devtools")
library("devtools")
install_github("SineadMorris/ushr")
```
Note that to install the package from Github with its vignette, you must first install `knitr`, then install `ushr` with `build_vignettes = TRUE`. The package vignette can then be viewed using `browseVignettes()`.
```{r vignette, eval = FALSE}
install.packages("knitr")
install_github("SineadMorris/ushr", build_vignettes = TRUE)
browseVignettes(package = "ushr")
```
To illustrate basic usage of the package, we include a publicly available data set from the ACTG315 clinical trial. The raw data (`actg315raw`) consist of longitudinal HIV viral load measurements from 46 chronically-infected adults up to 28 weeks following ART initiation. The detection threshold was 100 copies/ml and observations are recorded as $\log_{10}$ RNA copies/ml. [These data are available online](https://sph.uth.edu/divisions/biostatistics/wu/datasets/ACTG315LongitudinalDataViralLoad.htm) (originally accessed 2019-09-15) and have been described previously [@Lederman1998; @wu1999biometrics; @Connick2000].
### Data exploration
To begin, we load the package and print the first six rows of the raw data to identify our columns of interest; these are the viral load observations ('log.10.RNA.'), the timing of these observations ('Day'), and the identifier for each subject ('Patid').
```{r load}
library(ushr)
print(head(actg315raw))
```
Since `ushr` requires absolute viral load (VL) measurements, and specific column names ('vl', 'time', 'id'), we first back-transform the $\log_{10}$ viral load measurements into absolute values, and rename the column headings.
```{r edit}
actg315 <- actg315raw %>%
mutate(vl = 10^log10.RNA.) %>%
select(id = Patid, time = Day, vl)
print(head(actg315))
```
We can then visualize these data using the `plot_data()` function. The `detection_threshold` argument defines the detection threshold of the measurement assay.
```{r plotdata, fig.height = 8, fig.width = 8}
plot_data(actg315, detection_threshold = 100)
```
Each panel represents a different individual and the dashed horizontal line is the assay detection threshold. We can see that the data is noisy, individuals have different numbers of observations, and only a subset suppress viral load below the detection threshold.
### Model fitting and output visualization
To fit the model to these data in just one line of code we use the `ushr()` function. This processes the data to filter out any individuals who do not suppression viral load, or who violate other inclusion criteria (described in the vignette), and then fits the model to each remaining trajectory. Note that only subjects with a minimum number of measurements above the detection threshold can reliably be fit. The number can be specified by the user, but we recommend at least six observations for the biphasic model and three for the single phase model. Note the `censor_value` argument specifies how measurements below the detection threshold should be treated (here we set them to half of the detection theshold in line with previous work [@wu1999characterization]).
```{r fits}
model_output <- ushr(data = actg315, detection_threshold = 100, censor_value = 50)
```
With the fitted model output, we can then plot both the biphasic and single phase fits as follows
```{r bpfits, fig.width = 6, fig.height = 4}
plot_model(model_output, type = "biphasic", detection_threshold = 100)
```
```{r spfits, fig.width = 3.5, fig.height = 2.5}
plot_model(model_output, type = "single", detection_threshold = 100)
```
The solid lines are the best-fit model for each subject. Twelve were successfully fit with the biphasic model, and four with the single phase model. Although some single phase subjects had sufficient data to fit the biphasic model (i.e. at least six observations), the resulting 95\% parameter confidence intervals were either unattainable or sufficiently wide to indicate an unreliable fit. As a result, the subjects were automatically re-fit with the single phase model.
We can also summarize the fitting procedure and parameter estimates using `summarize_model()`. This creates a list with the following elements: (i) a summary of which subjects were successfully fit, with the corresponding infected cell lifespan estimates (`summary`); (ii) summary statistics for the biphasic model parameter estimates (`biphasicstats`); and (iii) summary statistics for the single phase parameter estimates (`singlestats`).
```{r summariz}
actg315_summary <- summarize_model(model_output, data = actg315, stats = TRUE)
head(actg315_summary$summary)
actg315_summary$biphasicstats
actg315_summary$singlestats
```
For a better understanding of parameter identifiability, one can also print the parameter estimates for each individual and model, along with their corresponding 95\% confidence intervals.
```{r CIs}
head(model_output$biphasicCI)
head(model_output$singleCI)
```
## Time to suppression
To calculate the time to viral suppression (TTS) we use the fitted model output and the `get_TTS()` function (see the vignette for more details). Here we set the suppression threshold to be the same as the detection threshold (i.e. we want to know when viral load drops below the detection threshold of the assay). We can subsequently obtain median and SD statistics, and the total number of subjects included in the analysis, using the `summarize()` function from `dplyr`.
```{r TTSparametric}
TTS <- get_TTS(model_output = model_output, suppression_threshold = 100)
head(TTS)
TTS %>% summarize(median = median(TTS), SD = sd(TTS), N = n())
```
We can also plot the distribution of estimates using `plot_TTS()`.
```{r TTSplot, fig.width = 2, fig.height = 2}
plot_TTS(TTS, bins = 6, textsize = 7)
```
## Additional functionality
`ushr` provides additional functionality to the examples documented here. For example, noisy clinical data can be simulated from an underlying biphasic model using the `simulate_data()` function. We also provide an alternative, non-parametric method for estimating TTS that does not require prior model fitting. Further details of all functions and user-specific customizations can be found in the documentation.
`ushr` provides additional functionality to the examples documented here. Notable examples are:
* For ART that includes an integrase inhibitor, a triphasic exponential model can be fit using `ushr_triphasic()` (see `?ushr_triphasic()`); this may be more appropriate than the biphasic model [@Cardozo2017]. Results can be visualized using the same plotting/summary functions as above.
* Noisy clinical data can be simulated from an underlying biphasic model using the `simulate_data()` function.
* We provide an alternative, non-parametric method for estimating TTS that does not require prior model fitting.
Further details of all functions and user-specific customizations can be found in the documentation.
## References
Owner
- Name: Sinead Morris
- Login: SineadMorris
- Kind: user
- Location: Atlanta, GA
- Company: Influenza Division, CDC
- Website: https://sineadmorris.github.io
- Twitter: SineadEMorris
- Repositories: 3
- Profile: https://github.com/SineadMorris
CDC + personal projects
GitHub Events
Total
Last Year
Committers
Last synced: over 2 years ago
Top Committers
| Name | Commits | |
|---|---|---|
| Sinead Morris | S****s | 84 |
| Sinead Morris | s****s@s****u | 24 |
| Romain Francois | r****n@r****m | 1 |
Committer Domains (Top 20 + Academic)
Packages
- Total packages: 1
-
Total downloads:
- cran 211 last-month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 5
- Total maintainers: 1
cran.r-project.org: ushr
Understanding Suppression of HIV
- Homepage: https://github.com/SineadMorris/ushr
- Documentation: http://cran.r-project.org/web/packages/ushr/ushr.pdf
- License: MIT + file LICENSE
-
Latest release: 0.2.3
published almost 6 years ago
Rankings
Forks count: 21.9%
Stargazers count: 28.5%
Dependent packages count: 29.8%
Average: 35.1%
Dependent repos count: 35.5%
Downloads: 60.0%
Maintainers (1)
Last synced:
6 months ago
Dependencies
DESCRIPTION
cran
- dplyr >= 0.8.0.1 depends
- ggplot2 >= 3.1.1 depends
- stats * depends
- tidyr >= 0.8.3 depends
- GGally >= 1.4.0 suggests
- knitr >= 1.22 suggests
- rmarkdown >= 1.12 suggests
- testthat >= 2.2.0 suggests