invgamma

A light weight package for the dpqr functions for the inverse gamma, inverse chi-squared, and inverse exponential distributions in R

https://github.com/dkahle/invgamma

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

○
CITATION.cff file
✓
codemeta.json file
Found codemeta.json file
✓
.zenodo.json file
Found .zenodo.json file
✓
DOI references
Found 2 DOI reference(s) in README
✓
Academic publication links
Links to: zenodo.org
○
Committers with academic emails
○
Institutional organization owner
○
JOSS paper metadata
○
Scientific vocabulary similarity
Low similarity (14.6%) to scientific vocabulary

Last synced: 10 months ago · JSON representation

Repository

A light weight package for the dpqr functions for the inverse gamma, inverse chi-squared, and inverse exponential distributions in R

Basic Info

Host: GitHub
Owner: dkahle
License: other
Language: R
Default Branch: master
Homepage: https://dkahle.github.io/invgamma/
Size: 39.5 MB

Statistics

Stars: 4
Watchers: 1
Forks: 2
Open Issues: 2
Releases: 1

Created almost 10 years ago · Last pushed 10 months ago

Metadata Files

Readme Changelog License

README.Rmd

---
output: github_document
---



```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "# ",
  fig.path = "tools/README-",
  dpi = 250
)
```


# __invgamma__



[![DOI](https://zenodo.org/badge/61824677.svg)](https://doi.org/10.5281/zenodo.16944076)



__invgamma__ implements the `[dpqr]` statistics functions for the [inverse gamma distribution](https://en.wikipedia.org/wiki/Inverse-gamma_distribution) in [R](https://cran.r-project.org).  It is ideal for using in other packages since it is lightweight and leverages the `[dpqr]gamma()` line of functions maintained by CRAN.

_Please see the section on parameterizations below to avoid any unintended mistakes!_



### Getting __invgamma__

There are two ways to get __invgamma__.  For the [CRAN version](https://cran.r-project.org/package=invgamma), use
```{r, eval=FALSE}
install.packages("invgamma")
```
For the development version, use
```{r, eval=FALSE}
# install.packages("devtools")
devtools::install_github("dkahle/invgamma")
```



### The `[dpqr]invgamma()` functions

The functions in __invgamma__ match those for the gamma distribution provided by the __stats__ package.  Namely, it uses as its density
_f(x) = (b^a / Gamma(a)) x^-(a+1) e^(-b/x),_
where a = `shape` and b = `rate`.


The [PDF](https://en.wikipedia.org/wiki/Probability_density_function) (the _f(x)_ above) can be evaluated with the `dinvgamma()` function:
```{r}
#| fig.height: 3
library("invgamma")
x <- seq(0, 5, .01)
shape <- 7; rate <- 10
plot(x, dinvgamma(x, shape, rate), type = "l")
```


The [CDF](https://en.wikipedia.org/wiki/Cumulative_distribution_function) can be evaluated with the `pinvgamma()` function:
```{r}
f <- function(x) dinvgamma(x, shape, rate)
q <- 2

integrate(f, 0, q)

(p <- pinvgamma(q, shape, rate))
```

The [quantile function](https://en.wikipedia.org/wiki/Quantile_function) can be evaluated with `qinvgamma()`:
```{r}
qinvgamma(p, shape, rate) # = q
```

And random number generation can be performed with `rinvgamma()`:
```{r}
set.seed(1234)
rinvgamma(5, shape, rate)
```
`rinvgamma()` can be used to obtain a [Monte Carlo](https://en.wikipedia.org/wiki/Monte_Carlo_method) estimate of the probability given by `pinvgamma()` above:
```{r}
n <- 1e5
draws <- rinvgamma(n, shape, rate)
mean(draws <= q)
```
Moreover, we can check the consistency and correctness of the implementation with first a [kernel density estimate](https://en.wikipedia.org/wiki/Kernel_density_estimation)... 
```{r}
#| fig.height: 3
plot(density(draws), xlim = c(0,5))
curve(f(x), col = "red", add = TRUE)
```

...and also a [quantile plot](https://en.wikipedia.org/wiki/Q–Q_plot)...

```{r}
#| fig.height: 3
qqplot( "x" = ppoints(n) |> qinvgamma(shape, rate), "y" = draws,
  xlab = "Theoretical quantiles", ylab = "Sample quantiles",
  main = "QQ plot for inverse gamma draws"
)
abline(0, 1, col = "red")
```

Both of these indicate that the samplers are consistent. As an inferential alternative, we can use a [KS test](https://en.wikipedia.org/wiki/Kolmogorov–Smirnov_test):

```{r}
ks.test(
  draws, 
  function(p) pinvgamma(p, shape, rate)
)
```





### The `[dpqr]invchisq()` and `[dpqr]invexp()` functions

The [gamma distribution](https://en.wikipedia.org/wiki/Gamma_distribution) subsumes the [chi-squared](https://en.wikipedia.org/wiki/Chi-squared_distribution) and [exponential](https://en.wikipedia.org/wiki/Exponential_distribution) [distributions](https://en.wikipedia.org/wiki/Probability_distribution#Continuous_probability_distribution), so it makes sense to include the `*invchisq()` and `*invexp()` functions in __invgamma__.  Their implementations, however, wrap `*chisq()` and `*exp()`, not `*invgamma()`.  









### A note on parameterizations

As detailed [here](https://github.com/dkahle/invgamma/issues/1), the parameterizations of the functions in this package cue off of their corresponding non-inverse distributions from **stats**. This commonly causes the confusion that, for example, the parameter `rate` in `dinvgamma()` is the rate parameter of the inverse gamma distribution. It is not! It is the rate parameter of the corresponding gamma distribution. Please take care with this distinction.


### A note on numerics

**invgamma** was intended to be a lightweight and simple, largely self-maintaining package implementing the inverse gamma, inverse chi-square, and inverse exponential distributions. It uses [the transformation theorem](https://en.wikipedia.org/wiki/Random_variable#Functions_of_random_variables) in all cases. 

One of the challenges to using naive implementations of distributions is that their [numerics](https://en.wikipedia.org/wiki/Floating-point_arithmetic) may not work well. Arithmetic on a computer is not the same as arithmetic in theory, the kind that you meet in math classes, and as a consequence the best computer implementations of mathematical facts/algorithms need to be tailored to the specific cases at hand.

In January 2023 I did a little poking around into this for `rinvgamma()` and found that it performs poorly when the shape parameter is less than .001 or so. The resulting distributions are very heavy-tailed, and the draws from these distributions returned by `rinvgamma()` are so large that they get rounded to either very large numbers (where the floating point representation of numbers does not provide many numbers) or infinity. Here's an example:

```{r}
#| warn: true
rinvgamma(10, shape = .001, rate = 7)
```

Notice that `rinvgamma()` issues a warning in this circumstance; this is the general behavior of **invgamma**: if a particular parameter configuration is known to not produce valid results, a warning is issued.


#### KS tests for sampling accuracy


##### `rinvgamma()`

Here is a more detailed Monte Carlo investigation that checks sampler quality using [the Kolmogorov-Smirnov test](https://en.wikipedia.org/wiki/Kolmogorov–Smirnov_test).

First, we write a basic Monte Carlo test for the sampler that works by generating a large (`n = 1e6`) sample of draws from the inverse gamma distribution for a given shape and rate:

```{r}
test_invgamma <- function(shape, rate, n = 1e5) {
  draws <- rinvgamma(n, shape, rate)
  ks.test(draws, function(p) pinvgamma(p, shape, rate))$p.value
}

test_invgamma(3, 7)
```

The function returns the $p$-value associated with the KS test, so "small" values suggest a departure from the null hypothesis that the distribution is from the corresponding inverse gamma distribution: the sampler is performing poorly. Under the null hypothesis, [the $p$-value has an approximate uniform distribution](https://en.wikipedia.org/wiki/P-value#p-value_as_the_statistic_for_performing_significance_tests), a fact that can be found in most advanced mathematical statistics books, so we would expect some proportion to be small regardless. 

We want to see the behavior of the sampler `rinvgamma()` across a wide array of parameter values. To do this, we use a range of parameter values running from small ($10^-4$) to large ($10^4$):

```{r}
#| message: false
# load tidyverse and related
library("tidyverse"); library("patchwork"); library("scales", warn.conflicts = FALSE)
theme_set(theme_minimal()); theme_update(panel.grid.minor = element_blank())

# load furrr for parallel computing
library("furrr"); furrr_options(seed = TRUE)
plan(multisession(workers = parallelly::availableCores()))

# set parameter values to test
n_grid <- 51
param_vals <- 10^seq(-4, 4, length.out = n_grid)
(param_grid <- expand_grid("shape" = param_vals, "rate" = param_vals))
```

Here's what the experiment's design space looks like:

```{r}
# make axes labeller
fmt <- scales::math_format(10^.x)

# make plot
ggplot(param_grid, aes(shape, rate)) +
  geom_point() +
  scale_x_log10(expression(alpha), n.breaks = 10, labels = fmt(-5:5)) +
  scale_y_log10(expression(lambda), n.breaks = 10, labels = fmt(-5:5)) +
  labs("title" = "Parameter Values at Which to Test `rinvgamma()`") +
  coord_equal()
```

Now, we run our test for each point in the design space in parallel. (Note: we've suppressed warnings here that are relevant.)

```{r}
#| warning: false
param_grid <- param_grid |> 
  mutate(p_val = future_map2_dbl(shape, rate, test_invgamma))
```

And we visualize the distribution of the $p$-values over that space, binning the colors to at .05 to highlight the rejections of the tests at the 5% level:

```{r}
ggplot(param_grid, aes(shape, rate, color = p_val)) +
  geom_point() +
  scale_x_log10(expression(alpha), n.breaks = 10, labels = fmt(-5:5)) +
  scale_y_log10(expression(lambda), n.breaks = 10, labels = fmt(-5:5)) +
  scale_color_binned(breaks = c(0, .05, 1)) +
  labs(color = "p value") +
  labs("title" = "KS GoF Test of Draws for Different Parameter Values") +
  coord_equal()
```

If the sampler were working correctly, the $p$-values would be approximately IID uniform(0,1), so we would expect about 5% of the points to be purple, and those 5% would be uniformly distributed over the whole space with no patterns. Obviously, that's not the case: when the shape parameter is small, the test is always rejecting. Clearly, when `shape` is small, the sampler does not work well. Further investigations reveal that, as an easy rule, the sampler can be considered unreliable for `shape` values less than `0.01`. As a consequence, `rinvgamma()` issues a warning in those circumstances. (This warning has been suppressed in the above computations.)

##### `rinvchisq()` and `rinvexp()`

Similar investigations using the inverse chi-squared and inverse exponential reveal that `rinvchisq()` should not be trusted when `df <= .01` and `ncp <= 10` and `rinvexp()` is trustworthy for all values. Here is the illustration for the inverse chi-squared:

```{r}
#| warning: false
test_rinvchisq <- function(df, ncp, n = 1e5) {
  draws <- rinvchisq(n, df, ncp)
  ks.test(draws, function(p) pinvchisq(p, df, ncp))$p.value
}

expand_grid("df" = param_vals, "ncp" = param_vals) |> 
  mutate("p_val" = future_map2_dbl(df, ncp, test_rinvchisq)) |> 
  ggplot(aes(df, ncp, color = p_val)) +
    geom_point() +
    scale_x_log10(expression(nu), n.breaks = 10, labels = fmt(-5:5)) +
    scale_y_log10("ncp", n.breaks = 10, labels = fmt(-5:5)) +
    scale_color_binned(breaks = c(0, .05, 1)) +
    labs(color = "p value") +
    coord_equal()
```

And here is the illustration for the inverse exponential:

```{r}
#| warning: false
test_rinvexp <- function(rate, n = 1e5) {
  draws <- rinvexp(n, rate = rate)
  ks.test(draws, function(p) pinvexp(p, rate))$p.value
}

tibble("rate" = 10^seq(-4, 4, length.out = 2*n_grid)) |> 
  mutate("p_val" = future_map_dbl(rate, test_rinvexp)) |> 
  ggplot(aes(rate, 0, color = p_val)) +
    geom_point() +
    scale_x_log10(expression(lambda), n.breaks = 10, labels = fmt(-5:5)) +
    scale_color_binned(breaks = c(0, .05, 1), guide = FALSE) +
    theme(axis.text.y = element_blank(), axis.title.y = element_blank(),
          panel.grid.major.y = element_blank()) +
    coord_equal()
```

Owner

Name: David Kahle
Login: dkahle
Kind: user
Location: Waco, Texas
Company: Baylor University

Website: www.kahle.io
Repositories: 11
Profile: https://github.com/dkahle

GitHub Events

Total

Create event: 3
Release event: 1
Issues event: 1
Push event: 22
Pull request event: 6

Last Year

Create event: 3
Release event: 1
Issues event: 1
Push event: 22
Pull request event: 6

Committers

Last synced: over 2 years ago

All Time

Total Commits: 34
Total Committers: 2
Avg Commits per committer: 17.0
Development Distribution Score (DDS): 0.147

Past Year

Commits: 5
Committers: 1
Avg Commits per committer: 5.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
David Kahle	d**e@g**m	29
David Kahle	d**d@k**o	5

Committer Domains (Top 20 + Academic)

kahle.io: 1

Issues and Pull Requests

Last synced: 10 months ago

All Time

Total issues: 3
Total pull requests: 3
Average time to close issues: 2 days
Average time to close pull requests: 1 minute
Total issue authors: 3
Total pull request authors: 1
Average comments per issue: 2.33
Average comments per pull request: 0.0
Merged pull requests: 3
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 1
Pull requests: 3
Average time to close issues: N/A
Average time to close pull requests: 1 minute
Issue authors: 1
Pull request authors: 1
Average comments per issue: 0.0
Average comments per pull request: 0.0
Merged pull requests: 3
Bot issues: 0
Bot pull requests: 0

View more stats

Top Authors

Issue Authors

Beliavsky (1)
ghost (1)
dklinenberg2020 (1)

Pull Request Authors

dkahle (6)

Top Labels

Issue Labels

wontfix (1)

Pull Request Labels

codex (6)

Packages

Total packages: 2
Total downloads:
- cran 7,401 last-month
Total docker downloads: 235,763

Total dependent packages: 19
(may contain duplicates)
Total dependent repositories: 24
(may contain duplicates)
Total versions: 5
Total maintainers: 1

cran.r-project.org: invgamma

The Inverse Gamma Distribution

Homepage: https://github.com/dkahle/invgamma
Documentation: http://cran.r-project.org/web/packages/invgamma/invgamma.pdf
License: MIT + file LICENSE
Latest release: 1.2
published 12 months ago

Versions: 4
Dependent Packages: 17
Dependent Repositories: 23
Downloads: 7,401 Last month
Docker Downloads: 235,763

Rankings

Dependent packages count: 3.8%

Dependent repos count: 5.7%

Downloads: 6.4%

Average: 12.4%

Forks count: 17.1%

Stargazers count: 19.3%

Docker downloads count: 22.1%

Maintainers (1)

david.kahle@gmail.com

Last synced: 10 months ago

conda-forge.org: r-invgamma

Homepage: https://github.com/dkahle/invgamma
License: GPL-2
Latest release: 1.1
published over 6 years ago

Versions: 1
Dependent Packages: 2
Dependent Repositories: 1

Rankings

Dependent packages count: 19.6%

Dependent repos count: 24.3%

Average: 40.4%

Stargazers count: 58.4%

Forks count: 59.2%

Last synced: 10 months ago

Dependencies

.github/workflows/pkgdown.yaml actions

JamesIves/github-pages-deploy-action v4.5.0 composite
actions/checkout v4 composite
r-lib/actions/setup-pandoc v2 composite
r-lib/actions/setup-r v2 composite
r-lib/actions/setup-r-dependencies v2 composite

.github/workflows/test-coverage.yaml actions

actions/checkout v4 composite
actions/upload-artifact v4 composite
codecov/codecov-action v5 composite
r-lib/actions/setup-r v2 composite
r-lib/actions/setup-r-dependencies v2 composite

DESCRIPTION cran

testthat >= 3.0.0 suggests

revdep/library.noindex/invgamma/new/invgamma/DESCRIPTION cran

testthat >= 3.0.0 suggests

revdep/library.noindex/invgamma/old/invgamma/DESCRIPTION cran

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Open Source Science

invgamma

Science Score: 49.0%

Repository

Basic Info

Statistics

Metadata Files

README.Rmd

Owner

GitHub Events

Total

Last Year

Committers

All Time

Past Year

Top Committers

Committer Domains (Top 20 + Academic)

Issues and Pull Requests

All Time

Past Year

Top Authors

Issue Authors

Pull Request Authors

Top Labels

Issue Labels

Pull Request Labels

Packages

cran.r-project.org: invgamma

Rankings

Maintainers (1)

conda-forge.org: r-invgamma

Rankings

Dependencies