ebTobit
Empirical Bayesian Estimation of Possibly Censored Gaussian Matrices
Repository
Basic Info
- Host: GitHub
- Owner: barbehenna
- Language: R
- Default Branch: main
- Size: 110 KB
Statistics
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Empirical Bayesian Estimation of Censored Gaussian (Tobit) Matrices
What is it?
An R package for denoising censored, Gaussian means with empirical Bayes $g$-modeling. The general model is as follows:
$$ \theta_i \sim_{iid} g \quad (\subseteq \mathbb{R}^p) $$
$$ X_{ij} \mid \theta_{ij} \sim_{indep.} N(\theta_{ij}, \sigma^2) $$
$$ L_{ij} \leq X_{ij} \leq R_{ij} $$
The data is represented with matrices:
$$ \theta = \begin{bmatrix} \theta_{11} & \dots & \theta_{1p} \\ \theta_{21} & \dots & \theta_{2p} \\ \vdots & \ddots & \vdots \\ \theta_{n1} & \dots & \theta_{np} \end{bmatrix} \qquad X = \begin{bmatrix} X_{11} & \dots & X_{1p} \\ X_{21} & \dots & X_{2p} \\ \vdots & \ddots & \vdots \\ X_{n1} & \dots & X_{np} \end{bmatrix} $$
$$ L = \begin{bmatrix} L_{11} & \dots & L_{1p} \\ L_{21} & \dots & L_{2p} \\ \vdots & \ddots & \vdots \\ L_{n1} & \dots & L_{np} \end{bmatrix} \qquad R = \begin{bmatrix} R_{11} & \dots & R_{1p} \\ R_{21} & \dots & R_{2p} \\ \vdots & \ddots & \vdots \\ R_{n1} & \dots & R_{np} \end{bmatrix} $$
The bounds $L_{ij}$ and $R_{ij}$ are assumed to be known. When $L_{ij} = R_{ij}$ there is a direct (noisy) measurement of $\theta_{ij}$; when $L_{ij} < R_{ij}$ there is a censored measurement of $\theta_{ij}$. This structure is commonly referred to as partially interval-censored data, and it allows for any combination of observed measurements and left-, right-, and interval-censored measurements.
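As a minimal sketch of how the bound matrices described above could be assembled, the following base R snippet starts from raw readings and widens the bounds of censored cells. The readings, `lower_limit`, and `upper_limit` are hypothetical values chosen for illustration; they are not part of the package. An interval-censored cell would simply get two finite bounds with $L_{ij} < R_{ij}$.

```r
# Hypothetical 4 x 2 matrix of raw readings (filled column by column).
x <- matrix(c(0.4, 2.5, 12.0, 6.1,
              3.2, 0.9,  7.7, 15.3), nrow = 4, ncol = 2)
lower_limit <- 1   # assumption: readings below this are not reliable
upper_limit <- 10  # assumption: readings above this saturate

# Start from exact observations (L = R = x), then widen censored cells.
L <- x
R <- x

# Left-censored: we only know the value is at most lower_limit.
left <- x < lower_limit
L[left] <- -Inf
R[left] <- lower_limit

# Right-censored: we only know the value is at least upper_limit.
right <- x > upper_limit
L[right] <- upper_limit
R[right] <- Inf

# Cells with L == R are direct measurements; cells with L < R are censored.
censored <- L < R
print(censored)
```

Whether infinite bounds are appropriate depends on what is known about the measurement; when a finite bound is known (for example, a quantity that cannot be negative), using it gives a narrower interval.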
We use a Tobit likelihood for each measurement:
$$ P(L, R \mid \theta) = \begin{cases} \phi_{\sigma} ( L - \theta ) & L = R \\ \Phi_{\sigma} ( R - \theta ) - \Phi_{\sigma} ( L - \theta ) & L < R \end{cases} $$
where the standard Gaussian likelihood is used when there is a direct Gaussian measurement (i.e., $L = X = R$) and a Gaussian probability is used when there is a censored Gaussian measurement (i.e., $L < R$).
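The two cases of the likelihood above can be written out in a few lines of base R. This is an illustrative sketch for a single measurement, not the package's internal code; the function name `tobit_lik` is hypothetical, and $\sigma$ is assumed known.

```r
# Sketch of the Tobit likelihood for one measurement with bounds (L, R),
# mean theta, and known noise standard deviation sigma.
tobit_lik <- function(L, R, theta, sigma = 1) {
  stopifnot(L <= R, sigma > 0)
  if (L == R) {
    # direct measurement: Gaussian density at the observed value
    stats::dnorm(L, mean = theta, sd = sigma)
  } else {
    # censored measurement: Gaussian probability of the interval [L, R]
    stats::pnorm(R, mean = theta, sd = sigma) -
      stats::pnorm(L, mean = theta, sd = sigma)
  }
}

tobit_lik(L = 1.3,  R = 1.3, theta = 1)  # density of an exact observation
tobit_lik(L = 0,    R = 1,   theta = 1)  # interval-censored probability
tobit_lik(L = -Inf, R = 1,   theta = 1)  # left-censored: P(X <= 1) = 0.5
```

Note that the first case returns a density and the second a probability, so the two are not on the same scale; this is the usual convention for likelihoods that mix exact and censored observations.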
What does it do?
This package provides an object `ebTobit` (Empirical Bayes model with Tobit likelihood) that estimates the prior $g$ over a user-specified grid `gr` and then computes the posterior mean or $\ell_1$ medoid as estimates for $\theta$.
In one dimension, the $\ell_1$ medoid is the median.
By default, `gr` is set using the exemplar method, so the grid consists of the maximum likelihood estimates for each $\theta_{ij}$.
When the censoring interval is finite, the maximum likelihood estimate for each $\theta_{ij}$ is $0.5 ( L_{ij} + R_{ij} )$.
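To make the steps above concrete, here is a rough one-dimensional sketch in base R: build a midpoint grid, evaluate the Tobit likelihood on it, estimate the grid weights of $g$, and compute the posterior mean and median. It assumes $\sigma = 1$, uses a plain EM iteration for the weights, and invents a unit-width binning scheme for small values; none of this is claimed to match the package's actual implementation or defaults.

```r
set.seed(1)

# Simulated one-dimensional data (assumption: sigma = 1 is known).
n <- 200
theta <- sample(c(0, 5), size = n, replace = TRUE, prob = c(0.8, 0.2))
x <- theta + stats::rnorm(n)

# Hypothetical censoring: values below 1 are only reported as the
# unit-width interval that contains them; other values are exact.
low <- x < 1
L <- ifelse(low, floor(x), x)
R <- ifelse(low, floor(x) + 1, x)

# Exemplar-style grid: one point per observation at the midpoint of its
# interval (equal to x itself when L == R).
gr <- 0.5 * (L + R)

# Tobit likelihood of observations with bounds (L, R) at mean theta.
lik <- function(L, R, theta) {
  ifelse(L == R,
         stats::dnorm(L, mean = theta),
         stats::pnorm(R, mean = theta) - stats::pnorm(L, mean = theta))
}

# P[i, k] = likelihood of observation i evaluated at grid point k.
P <- outer(seq_len(n), seq_len(n), function(i, k) lik(L[i], R[i], gr[k]))

# EM updates for the mixing weights of g on the grid.
w <- rep(1 / n, n)
for (iter in 1:200) {
  post <- sweep(P, 2, w, `*`)   # weight each column by its prior mass
  post <- post / rowSums(post)  # normalise each row to a posterior
  w <- colMeans(post)           # new weights: average posterior mass
}

# Posterior distribution over grid points for each observation.
post <- sweep(P, 2, w, `*`)
post <- post / rowSums(post)

# Posterior mean: weighted average of the grid points.
post_mean <- as.vector(post %*% gr)

# Posterior median: smallest grid point whose cumulative mass reaches 0.5.
ord <- order(gr)
post_median <- apply(post, 1, function(p) {
  gr[ord][which(cumsum(p[ord]) >= 0.5)[1]]
})

mean((theta - post_mean)^2)  # squared-error loss of the posterior mean
```

The grid here has one point per observation, so the likelihood matrix is $n \times n$; for large $n$ a smaller user-chosen grid keeps this matrix manageable.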
Suppose $p = 1$ and there is no censoring. Then the basic usage is:
```r
library(ebTobit)

# create noisy measurements
n <- 100
t <- sample(c(0, 5), size = n, replace = TRUE, prob = c(0.8, 0.2))
x <- t + stats::rnorm(n)

# fit g-model with default prior grid
res1 <- ebTobit(x)

# measure performance of estimated posterior mean
mean((t - fitted(res1))^2)
```
Next we can look at a more complicated example with $p = 10$:
```r
library(ebTobit)

# create noisy measurements (low rank structure)
n <- 1000; p <- 10
t <- matrix(stats::rgamma(n*p, shape = 5, rate = 1), n, p)
x <- t + matrix(stats::rnorm(n*p), n, p)

# assume we can't accurately measure x < 1 but we know theta > 0
L <- ifelse(x < 1, 0, x)
R <- ifelse(x < 1, 1, x)

# fit g-model with default prior grid
res2 <- ebTobit(x)
res3 <- ebTobit(L, R)

# observe that the censoring affects the fitted range
range(fitted(res2))
range(fitted(res3))

# fit censored data with a different grid (large and random, not MLE)
res4 <- ebTobit(
    L = L,
    R = R,
    gr = sapply(1:p, function(j) stats::runif(1e+4, min = min(L[,j]), max = max(R[,j]))),
    algorithm = "EM"
)

# compute posterior mean and L1mediod given new data
# we can also predict based on partially interval-censored observations
y <- matrix(stats::rexp(5*p, rate = 0.5), 5, p)
predict(res4, y) # posterior mean
predict(res4, y, method = "L1mediod") # posterior L1 medoid
```
How do I install it?
This package is available on CRAN. It can also be installed directly from GitHub:
```r
remotes::install_github("barbehenna/ebTobit")
```
Data
This R package also includes a real bile acid data.frame taken directly from Lei et al. (2018) (https://doi.org/10.1096/fj.201700055R) via https://github.com/WandeRum/GSimp (https://doi.org/10.1371/journal.pcbi.1005973). The bile acid data contains measurements of 34 bile acids for 198 patients; no missing values are present in the data. In our modeling, we assume the bile acid values are independent log-normal measurements.
```r
data(BileAcid, package = "ebTobit") # attach the bile acid data
```
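Because log-normal values are Gaussian on the log scale, one way to put positive concentration data into the model above is to log-transform both the values and their censoring bounds. The sketch below illustrates this with simulated data rather than `BileAcid`, and the `detection_limit` is a hypothetical value; it is meant only to show how bounds carry over under the transform.

```r
set.seed(2)

# Simulated stand-in for positive, log-normal concentrations.
# (This is not the BileAcid data; the detection limit is hypothetical.)
conc <- matrix(stats::rlnorm(20, meanlog = 1, sdlog = 1), nrow = 5, ncol = 4)
detection_limit <- 1.5

# Raw-scale bounds: values below the limit are only known to lie between
# 0 and the limit; all other values are treated as exact.
below <- conc < detection_limit
L_raw <- ifelse(below, 0, conc)
R_raw <- ifelse(below, detection_limit, conc)

# The log transform is monotone, so it maps bounds to bounds.
# log(0) is -Inf, which turns the raw interval into a left-censored one.
L_log <- log(L_raw)
R_log <- log(R_raw)

# Exact cells still satisfy L == R after the transform.
all(L_log[!below] == R_log[!below])
```

Whether a particular fitting routine accepts infinite bounds should be checked against its documentation before passing `L_log` and `R_log` to it.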
Who wrote it?
Alton Barbehenn and Sihai Dave Zhao
What license?
GPL (>= 3)
Owner
- Name: Alton Barbehenn
- Login: barbehenna
- Kind: user
- Company: Department of Statistics, University of Illinois Urbana-Champaign
- Repositories: 1
- Profile: https://github.com/barbehenna
Citation (CITATION.cff)
cff-version: 1.2.0
message: >-
  If you use this software, please cite it using the
  metadata from this file.
authors:
  - given-names: Alton
    family-names: Barbehenn
    orcid: 'https://orcid.org/0009-0000-3364-7204'
  - given-names: Sihai Dave
    family-names: Zhao
title: 'ebTobit: Empirical Bayesian Tobit Matrix Estimation'
version: 1.0.1
doi: 10.48550/arXiv.2306.07239
date-released: 2023-06-15
repository-code: 'https://github.com/barbehenna/ebTobit'
license: GPL-3.0
Packages
- Total packages: 1
- Total downloads: cran 201 last-month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 2
- Total maintainers: 1
cran.r-project.org: ebTobit
Empirical Bayesian Tobit Matrix Estimation
- Homepage: https://github.com/barbehenna/ebTobit
- Documentation: http://cran.r-project.org/web/packages/ebTobit/ebTobit.pdf
- License: GPL-3
- Latest release: 1.0.2 (published about 2 years ago)