invgamma
A light weight package for the dpqr functions for the inverse gamma, inverse chi-squared, and inverse exponential distributions in R
Science Score: 49.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
✓codemeta.json file
Found codemeta.json file -
✓.zenodo.json file
Found .zenodo.json file -
✓DOI references
Found 2 DOI reference(s) in README -
✓Academic publication links
Links to: zenodo.org -
○Committers with academic emails
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (14.6%) to scientific vocabulary
Last synced: 7 months ago
·
JSON representation
Repository
A light weight package for the dpqr functions for the inverse gamma, inverse chi-squared, and inverse exponential distributions in R
Basic Info
- Host: GitHub
- Owner: dkahle
- License: other
- Language: R
- Default Branch: master
- Homepage: https://dkahle.github.io/invgamma/
- Size: 39.5 MB
Statistics
- Stars: 4
- Watchers: 1
- Forks: 2
- Open Issues: 2
- Releases: 1
Created almost 10 years ago
· Last pushed 7 months ago
Metadata Files
Readme
Changelog
License
README.Rmd
---
output: github_document
---
```{r, echo = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "# ",
fig.path = "tools/README-",
dpi = 250
)
```
# __invgamma__
[](https://doi.org/10.5281/zenodo.16944076)
__invgamma__ implements the `[dpqr]` statistics functions for the [inverse gamma distribution](https://en.wikipedia.org/wiki/Inverse-gamma_distribution) in [R](https://cran.r-project.org). It is ideal for using in other packages since it is lightweight and leverages the `[dpqr]gamma()` line of functions maintained by CRAN.
_Please see the section on parameterizations below to avoid any unintended mistakes!_
### Getting __invgamma__
There are two ways to get __invgamma__. For the [CRAN version](https://cran.r-project.org/package=invgamma), use
```{r, eval=FALSE}
install.packages("invgamma")
```
For the development version, use
```{r, eval=FALSE}
# install.packages("devtools")
devtools::install_github("dkahle/invgamma")
```
### The `[dpqr]invgamma()` functions
The functions in __invgamma__ match those for the gamma distribution provided by the __stats__ package. Namely, it uses as its density
_f(x) = (b^a / Gamma(a)) x^-(a+1) e^(-b/x),_
where a = `shape` and b = `rate`.
The [PDF](https://en.wikipedia.org/wiki/Probability_density_function) (the _f(x)_ above) can be evaluated with the `dinvgamma()` function:
```{r}
#| fig.height: 3
library("invgamma")
x <- seq(0, 5, .01)
shape <- 7; rate <- 10
plot(x, dinvgamma(x, shape, rate), type = "l")
```
The [CDF](https://en.wikipedia.org/wiki/Cumulative_distribution_function) can be evaluated with the `pinvgamma()` function:
```{r}
f <- function(x) dinvgamma(x, shape, rate)
q <- 2
integrate(f, 0, q)
(p <- pinvgamma(q, shape, rate))
```
The [quantile function](https://en.wikipedia.org/wiki/Quantile_function) can be evaluated with `qinvgamma()`:
```{r}
qinvgamma(p, shape, rate) # = q
```
And random number generation can be performed with `rinvgamma()`:
```{r}
set.seed(1234)
rinvgamma(5, shape, rate)
```
`rinvgamma()` can be used to obtain a [Monte Carlo](https://en.wikipedia.org/wiki/Monte_Carlo_method) estimate of the probability given by `pinvgamma()` above:
```{r}
n <- 1e5
draws <- rinvgamma(n, shape, rate)
mean(draws <= q)
```
Moreover, we can check the consistency and correctness of the implementation with first a [kernel density estimate](https://en.wikipedia.org/wiki/Kernel_density_estimation)...
```{r}
#| fig.height: 3
plot(density(draws), xlim = c(0,5))
curve(f(x), col = "red", add = TRUE)
```
...and also a [quantile plot](https://en.wikipedia.org/wiki/Q–Q_plot)...
```{r}
#| fig.height: 3
qqplot( "x" = ppoints(n) |> qinvgamma(shape, rate), "y" = draws,
xlab = "Theoretical quantiles", ylab = "Sample quantiles",
main = "QQ plot for inverse gamma draws"
)
abline(0, 1, col = "red")
```
Both of these indicate that the samplers are consistent. As an inferential alternative, we can use a [KS test](https://en.wikipedia.org/wiki/Kolmogorov–Smirnov_test):
```{r}
ks.test(
draws,
function(p) pinvgamma(p, shape, rate)
)
```
### The `[dpqr]invchisq()` and `[dpqr]invexp()` functions
The [gamma distribution](https://en.wikipedia.org/wiki/Gamma_distribution) subsumes the [chi-squared](https://en.wikipedia.org/wiki/Chi-squared_distribution) and [exponential](https://en.wikipedia.org/wiki/Exponential_distribution) [distributions](https://en.wikipedia.org/wiki/Probability_distribution#Continuous_probability_distribution), so it makes sense to include the `*invchisq()` and `*invexp()` functions in __invgamma__. Their implementations, however, wrap `*chisq()` and `*exp()`, not `*invgamma()`.
### A note on parameterizations
As detailed [here](https://github.com/dkahle/invgamma/issues/1), the parameterizations of the functions in this package cue off of their corresponding non-inverse distributions from **stats**. This commonly causes the confusion that, for example, the parameter `rate` in `dinvgamma()` is the rate parameter of the inverse gamma distribution. It is not! It is the rate parameter of the corresponding gamma distribution. Please take care with this distinction.
### A note on numerics
**invgamma** was intended to be a lightweight and simple, largely self-maintaining package implementing the inverse gamma, inverse chi-square, and inverse exponential distributions. It uses [the transformation theorem](https://en.wikipedia.org/wiki/Random_variable#Functions_of_random_variables) in all cases.
One of the challenges to using naive implementations of distributions is that their [numerics](https://en.wikipedia.org/wiki/Floating-point_arithmetic) may not work well. Arithmetic on a computer is not the same as arithmetic in theory, the kind that you meet in math classes, and as a consequence the best computer implementations of mathematical facts/algorithms need to be tailored to the specific cases at hand.
In January 2023 I did a little poking around into this for `rinvgamma()` and found that it performs poorly when the shape parameter is less than .001 or so. The resulting distributions are very heavy-tailed, and the draws from these distributions returned by `rinvgamma()` are so large that they get rounded to either very large numbers (where the floating point representation of numbers does not provide many numbers) or infinity. Here's an example:
```{r}
#| warn: true
rinvgamma(10, shape = .001, rate = 7)
```
Notice that `rinvgamma()` issues a warning in this circumstance; this is the general behavior of **invgamma**: if a particular parameter configuration is known to not produce valid results, a warning is issued.
#### KS tests for sampling accuracy
##### `rinvgamma()`
Here is a more detailed Monte Carlo investigation that checks sampler quality using [the Kolmogorov-Smirnov test](https://en.wikipedia.org/wiki/Kolmogorov–Smirnov_test).
First, we write a basic Monte Carlo test for the sampler that works by generating a large (`n = 1e6`) sample of draws from the inverse gamma distribution for a given shape and rate:
```{r}
test_invgamma <- function(shape, rate, n = 1e5) {
draws <- rinvgamma(n, shape, rate)
ks.test(draws, function(p) pinvgamma(p, shape, rate))$p.value
}
test_invgamma(3, 7)
```
The function returns the $p$-value associated with the KS test, so "small" values suggest a departure from the null hypothesis that the distribution is from the corresponding inverse gamma distribution: the sampler is performing poorly. Under the null hypothesis, [the $p$-value has an approximate uniform distribution](https://en.wikipedia.org/wiki/P-value#p-value_as_the_statistic_for_performing_significance_tests), a fact that can be found in most advanced mathematical statistics books, so we would expect some proportion to be small regardless.
We want to see the behavior of the sampler `rinvgamma()` across a wide array of parameter values. To do this, we use a range of parameter values running from small ($10^-4$) to large ($10^4$):
```{r}
#| message: false
# load tidyverse and related
library("tidyverse"); library("patchwork"); library("scales", warn.conflicts = FALSE)
theme_set(theme_minimal()); theme_update(panel.grid.minor = element_blank())
# load furrr for parallel computing
library("furrr"); furrr_options(seed = TRUE)
plan(multisession(workers = parallelly::availableCores()))
# set parameter values to test
n_grid <- 51
param_vals <- 10^seq(-4, 4, length.out = n_grid)
(param_grid <- expand_grid("shape" = param_vals, "rate" = param_vals))
```
Here's what the experiment's design space looks like:
```{r}
# make axes labeller
fmt <- scales::math_format(10^.x)
# make plot
ggplot(param_grid, aes(shape, rate)) +
geom_point() +
scale_x_log10(expression(alpha), n.breaks = 10, labels = fmt(-5:5)) +
scale_y_log10(expression(lambda), n.breaks = 10, labels = fmt(-5:5)) +
labs("title" = "Parameter Values at Which to Test `rinvgamma()`") +
coord_equal()
```
Now, we run our test for each point in the design space in parallel. (Note: we've suppressed warnings here that are relevant.)
```{r}
#| warning: false
param_grid <- param_grid |>
mutate(p_val = future_map2_dbl(shape, rate, test_invgamma))
```
And we visualize the distribution of the $p$-values over that space, binning the colors to at .05 to highlight the rejections of the tests at the 5% level:
```{r}
ggplot(param_grid, aes(shape, rate, color = p_val)) +
geom_point() +
scale_x_log10(expression(alpha), n.breaks = 10, labels = fmt(-5:5)) +
scale_y_log10(expression(lambda), n.breaks = 10, labels = fmt(-5:5)) +
scale_color_binned(breaks = c(0, .05, 1)) +
labs(color = "p value") +
labs("title" = "KS GoF Test of Draws for Different Parameter Values") +
coord_equal()
```
If the sampler were working correctly, the $p$-values would be approximately IID uniform(0,1), so we would expect about 5% of the points to be purple, and those 5% would be uniformly distributed over the whole space with no patterns. Obviously, that's not the case: when the shape parameter is small, the test is always rejecting. Clearly, when `shape` is small, the sampler does not work well. Further investigations reveal that, as an easy rule, the sampler can be considered unreliable for `shape` values less than `0.01`. As a consequence, `rinvgamma()` issues a warning in those circumstances. (This warning has been suppressed in the above computations.)
##### `rinvchisq()` and `rinvexp()`
Similar investigations using the inverse chi-squared and inverse exponential reveal that `rinvchisq()` should not be trusted when `df <= .01` and `ncp <= 10` and `rinvexp()` is trustworthy for all values. Here is the illustration for the inverse chi-squared:
```{r}
#| warning: false
test_rinvchisq <- function(df, ncp, n = 1e5) {
draws <- rinvchisq(n, df, ncp)
ks.test(draws, function(p) pinvchisq(p, df, ncp))$p.value
}
expand_grid("df" = param_vals, "ncp" = param_vals) |>
mutate("p_val" = future_map2_dbl(df, ncp, test_rinvchisq)) |>
ggplot(aes(df, ncp, color = p_val)) +
geom_point() +
scale_x_log10(expression(nu), n.breaks = 10, labels = fmt(-5:5)) +
scale_y_log10("ncp", n.breaks = 10, labels = fmt(-5:5)) +
scale_color_binned(breaks = c(0, .05, 1)) +
labs(color = "p value") +
coord_equal()
```
And here is the illustration for the inverse exponential:
```{r}
#| warning: false
test_rinvexp <- function(rate, n = 1e5) {
draws <- rinvexp(n, rate = rate)
ks.test(draws, function(p) pinvexp(p, rate))$p.value
}
tibble("rate" = 10^seq(-4, 4, length.out = 2*n_grid)) |>
mutate("p_val" = future_map_dbl(rate, test_rinvexp)) |>
ggplot(aes(rate, 0, color = p_val)) +
geom_point() +
scale_x_log10(expression(lambda), n.breaks = 10, labels = fmt(-5:5)) +
scale_color_binned(breaks = c(0, .05, 1), guide = FALSE) +
theme(axis.text.y = element_blank(), axis.title.y = element_blank(),
panel.grid.major.y = element_blank()) +
coord_equal()
```
Owner
- Name: David Kahle
- Login: dkahle
- Kind: user
- Location: Waco, Texas
- Company: Baylor University
- Website: www.kahle.io
- Repositories: 11
- Profile: https://github.com/dkahle
GitHub Events
Total
- Create event: 3
- Release event: 1
- Issues event: 1
- Push event: 22
- Pull request event: 6
Last Year
- Create event: 3
- Release event: 1
- Issues event: 1
- Push event: 22
- Pull request event: 6
Committers
Last synced: over 2 years ago
Top Committers
| Name | Commits | |
|---|---|---|
| David Kahle | d****e@g****m | 29 |
| David Kahle | d****d@k****o | 5 |
Committer Domains (Top 20 + Academic)
kahle.io: 1
Issues and Pull Requests
Last synced: 7 months ago
All Time
- Total issues: 3
- Total pull requests: 3
- Average time to close issues: 2 days
- Average time to close pull requests: 1 minute
- Total issue authors: 3
- Total pull request authors: 1
- Average comments per issue: 2.33
- Average comments per pull request: 0.0
- Merged pull requests: 3
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 3
- Average time to close issues: N/A
- Average time to close pull requests: 1 minute
- Issue authors: 1
- Pull request authors: 1
- Average comments per issue: 0.0
- Average comments per pull request: 0.0
- Merged pull requests: 3
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- Beliavsky (1)
- ghost (1)
- dklinenberg2020 (1)
Pull Request Authors
- dkahle (6)
Top Labels
Issue Labels
wontfix (1)
Pull Request Labels
codex (6)
Packages
- Total packages: 2
-
Total downloads:
- cran 7,401 last-month
- Total docker downloads: 235,763
-
Total dependent packages: 19
(may contain duplicates) -
Total dependent repositories: 24
(may contain duplicates) - Total versions: 5
- Total maintainers: 1
cran.r-project.org: invgamma
The Inverse Gamma Distribution
- Homepage: https://github.com/dkahle/invgamma
- Documentation: http://cran.r-project.org/web/packages/invgamma/invgamma.pdf
- License: MIT + file LICENSE
-
Latest release: 1.2
published 9 months ago
Rankings
Dependent packages count: 3.8%
Dependent repos count: 5.7%
Downloads: 6.4%
Average: 12.4%
Forks count: 17.1%
Stargazers count: 19.3%
Docker downloads count: 22.1%
Maintainers (1)
Last synced:
7 months ago
conda-forge.org: r-invgamma
- Homepage: https://github.com/dkahle/invgamma
- License: GPL-2
-
Latest release: 1.1
published about 6 years ago
Rankings
Dependent packages count: 19.6%
Dependent repos count: 24.3%
Average: 40.4%
Stargazers count: 58.4%
Forks count: 59.2%
Last synced:
7 months ago
Dependencies
.github/workflows/pkgdown.yaml
actions
- JamesIves/github-pages-deploy-action v4.5.0 composite
- actions/checkout v4 composite
- r-lib/actions/setup-pandoc v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/test-coverage.yaml
actions
- actions/checkout v4 composite
- actions/upload-artifact v4 composite
- codecov/codecov-action v5 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
DESCRIPTION
cran
- testthat >= 3.0.0 suggests
revdep/library.noindex/invgamma/new/invgamma/DESCRIPTION
cran
- testthat >= 3.0.0 suggests
revdep/library.noindex/invgamma/old/invgamma/DESCRIPTION
cran