Repository

:package: exactt

Basic Info
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created almost 2 years ago · Last pushed 11 months ago
Metadata Files
Readme License Citation

README.Rmd

---
output: github_document
editor_options: 
  chunk_output_type: console
---




```{r, include = FALSE}
knitr::opts_chunk$set(collapse  = TRUE,
                      comment   = "#>",
                      fig.path  = "man/figures/",
                      out.width = "100%")
```



exactt 
=========================================================


[![CRAN status](https://www.r-pkg.org/badges/version/exactt)](https://CRAN.R-project.org/package=exactt)




## Introduction

The `exactt` package tests whether a slope coefficient equals a given null value, using the method described in Pouliot (2023). Inverting this test over a range of null values yields a marginally valid confidence interval for the coefficient.
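
To make the idea of test inversion concrete, here is a minimal sketch using the classical one-sample t-test from base R. It is not the permutation-based test from Pouliot (2023) and does not call `exactt` at all; it only shows how the set of null values that a test fails to reject forms a confidence interval. All object names are illustrative.

```r
# Illustration of test inversion with the classical one-sample t-test,
# NOT the test implemented in exactt.
x <- datasets::ToothGrowth$len
alpha <- 0.1

# Candidate null values on a fine grid.
grid <- seq(min(x), max(x), by = 0.01)

# p-value of H0: mean = mu0 for each candidate mu0.
p_values <- vapply(grid, function(mu0) t.test(x, mu = mu0)$p.value, numeric(1))

# The confidence set collects every null value that is not rejected.
ci_inverted <- range(grid[p_values > alpha])

# Closed-form interval for comparison; the two agree up to the grid spacing.
ci_direct <- t.test(x, conf.level = 1 - alpha)$conf.int
```

The same logic applies to any valid test: a finer grid only tightens the approximation to the endpoints.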

## Installation

The `exactt` package is hosted on GitHub at https://github.com/ian-xu-economics/exactt/. It can be installed using the `remotes::install_github()` function:
``` r
# install.packages("remotes")
remotes::install_github("ian-xu-economics/exactt")
```

## Attribution

To cite the `exactt` package in publications, use the `citation()` function, which provides both the text version and the BibTeX entry for referencing:
``` r
citation("exactt")
```

## Using `exactt`

After installing `exactt`, we can attach the package to our session using the base `library()` function:
```{r}
library("exactt")
```

## Example Usage: Regular Case

To compute the $(1-\alpha)$ confidence interval, use the `exactt()` function. Here's an example looking at the effect of vitamin C on tooth growth in guinea pigs using data from `datasets::ToothGrowth`. We'll investigate the effect of `supp` (supplement type: orange juice (OJ) or ascorbic acid (VC)) and `dose` (dose in milligrams/day) on `len` (tooth length).
```{r}
summary(datasets::ToothGrowth)
```

Suppose our model is $len_i = \beta_0 + \beta_{dose} \times dose_i + \beta_{supp} \times supp_i + \varepsilon_i$. We can construct a 90% confidence interval by passing a standard R formula to `exactt()` and setting `alpha = 0.1`. Any argument we do not specify keeps its default value:

* The number of blocks used equals 5 (`nBlocks = 5`).
* The confidence interval is constructed for all variables (`variables = NULL`).
* The number of permutations equals the factorial of the number of blocks (`nPerms = factorial(nBlocks)`).
* The level of significance equals 0.05 (`alpha = 0.05`). 
* The test statistics are studentized (`studentize = TRUE`).
* The ordering of the data is not permuted (`permutation = NULL`).
* The ordering of the data is not optimized (`optimize = FALSE`).
```{r}
exactt.1 <- exactt(model = len ~ dose + supp,
                   data = datasets::ToothGrowth,
                   alpha = 0.1)

print(exactt.1, digits = 5)
```
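
As a quick check of the defaults listed above, the permutation count implied by `nPerms = factorial(nBlocks)` can be computed in base R. The variable names simply mirror the argument names and are used here for illustration only.

```r
# Default number of blocks, as stated in the list of defaults.
nBlocks <- 5

# Number of distinct orderings of the blocks.
nPerms <- factorial(nBlocks)
nPerms
#> [1] 120
```

Because the factorial grows quickly (for example, `factorial(8)` is 40320), increasing `nBlocks` raises the default permutation count sharply.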

## Focusing on Specific Variables

To focus on specific coefficients, set the `variables` parameter. Each number is the position of a regressor in the model; the intercept is never counted. For example, set `variables = 1` for `dose`, and set `variables = 2` for `supp`.
```{r}
exactt.2 <- exactt(model = len ~ dose + supp,
                   data = datasets::ToothGrowth,
                   alpha = 0.1,
                   variables = 1)

print(exactt.2, digits = 5)
```
This creates a 90% confidence interval for `dose` only. The interval is the same one reported for `dose` when `variables = NULL` (all variables are of interest), because each interval is marginally valid, that is, valid for its own coefficient rather than jointly across coefficients.
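
If you are unsure which position a regressor occupies, base R can list the terms of a formula in order. The sketch below assumes that the positions expected by `variables` follow this order, which matches the `dose`/`supp` example above; terms that expand into several columns (such as factors) may be indexed differently, so check the printed output of `exactt()`.

```r
# Base R's view of the regressors in the formula, in order.
# Assumption: the positions used by `variables` follow this same order.
model_terms <- attr(terms(len ~ dose + supp), "term.labels")
model_terms
#> [1] "dose" "supp"

# Position of a given regressor, e.g. to pass as `variables`.
dose_index <- match("dose", model_terms)
dose_index
#> [1] 1
```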

## Model Flexibility
The `exactt()` function is designed to allow for easy modification of your model. For instance, you can treat a variable as categorical, include polynomial terms, or apply other transformations directly within the model formula. This flexibility helps tailor the analysis to specific research questions without needing pre-transformed data. To illustrate, consider treating `dose` as a categorical variable to explore its discrete impact on tooth length:
```{r}
exactt.3 <- exactt(model = len ~ as.factor(dose) + supp,
                   data = datasets::ToothGrowth,
                   alpha = 0.1)

exactt.3
```
The 90% confidence intervals for the coefficients on `dose` equal to "2" and `supp` equal to "VC" are not informative because of suboptimal data ordering, which can diminish the statistical power of the test. This issue can be addressed by optimizing the data ordering.
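
To see which coefficients a factor term generates, it can help to inspect the design matrix that base R builds for the same formula. This is a sketch using `stats::model.matrix()`; that `exactt()` labels its coefficients the same way is an assumption, although the `suppVC` name used later in this README is consistent with it.

```r
# Design-matrix columns that base R builds for the formula above.
# With the default treatment contrasts, the baseline levels (dose 0.5 and
# supp OJ) are absorbed into the intercept.
design <- model.matrix(len ~ as.factor(dose) + supp,
                       data = datasets::ToothGrowth)
colnames(design)
#> [1] "(Intercept)"      "as.factor(dose)1" "as.factor(dose)2" "suppVC"
```

Each non-intercept column corresponds to one coefficient, so treating `dose` as categorical replaces the single slope with one coefficient per non-baseline dose level.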

## Optimizing Data Ordering

The confidence intervals produced by the `exactt()` function can change with the ordering of the data. Certain data orderings can enhance statistical power, particularly when the sample size is small and the number of blocks is large. The impact of optimization is even more pronounced when dealing with categorical variables, where appropriate ordering can substantially increase the test's power.

The `exactt()` function utilizes a genetic algorithm (provided by the `GA::ga()` function) to optimize data ordering. This approach systematically explores various data arrangements to find the one that maximizes statistical power on average.
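
For readers unfamiliar with `GA::ga()`, the following toy sketch shows a permutation-type search over orderings of six values. The fitness function and data are hypothetical and chosen only for illustration; they are not the objective that `exactt()` optimizes internally.

```r
# Toy illustration only: `x` and `toy_fitness` are hypothetical and are NOT
# what exactt() uses internally.
library(GA)

x <- c(3.1, 0.4, 2.7, 1.9, 4.2, 0.8)

# Score an ordering by how much consecutive values differ.
toy_fitness <- function(ordering) sum(abs(diff(x[ordering])))

# Search over permutations of 1..length(x); ga() maximizes the fitness.
toy_ga <- ga(type = "permutation",
             fitness = toy_fitness,
             lower = 1, upper = length(x),
             popSize = 20, maxiter = 5,
             monitor = FALSE, seed = 2024)

# Best ordering found (first row if several tie) and its fitness value.
best_order <- toy_ga@solution[1, ]
toy_ga@fitnessValue
```

The `maxiter` and `seed` arguments shown here are the same `GA::ga()` settings that the next section passes through `exactt()`.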

## Enabling Optimization

To activate the optimization feature, set `optimize = TRUE`. Additionally, `exactt()` allows for the specification of various parameters of the `GA::ga()` function to tailor the optimization process. For instance, you can limit the number of iterations with `maxiter` or specify the seed with `seed` for reproducibility:

```{r}
exactt.4 <- exactt(model = len ~ as.factor(dose) + supp,
                   data = datasets::ToothGrowth,
                   alpha = 0.1,
                   optimize = TRUE,
                   parallel = FALSE,
                   maxiter = 5,
                   seed = 2024)

print(exactt.4, digits = 5)
```

Note that after optimizing the data ordering, `exactt()` is able to construct informative 90% confidence intervals for the coefficients on `dose` equal to "2" and `supp` equal to "VC". Furthermore, the detailed results of the optimization process, including the genetic algorithm's configurations and outcomes for each variable, are stored in `exactt.4$gaResults`. For instance, to review a summary of the genetic algorithm's performance for the `suppVC` variable, use:
```{r}
exactt.4$gaResults$suppVC@summary
```

### Note on Optimization Effects

While optimization generally improves statistical power, it does so on average: it is not guaranteed to narrow the confidence interval in every instance.

## Example Usage: IV Case

The `exactt()` function can handle models with instrumental variables (IV). In Example 15.5 of Wooldridge (2020), Wooldridge reanalyzes Mroz (1987). This example explores the impact of education (`educ`) on `log(wage)`, using two measures of parental education, mother's education (`motheduc`) and father's education (`fatheduc`), as instruments. The model controls for experience (`exper`) and its square (`expersq`). Education is the primary variable of interest, so we set `variables = 1`. Optionally, as before, we can optimize the data ordering to enhance statistical power.

```{r}
exactt.iv <- exactt(model = lwage ~ educ + exper + expersq | exper + expersq + motheduc + fatheduc,
                    data = wooldridge::mroz,
                    variables = 1,
                    optimize = TRUE,
                    parallel = FALSE,
                    maxiter = 10,
                    monitor = TRUE,
                    seed = 31740)

exactt.iv
```
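
The two-part formula above lists the regressors to the left of `|` and the instruments to the right, with exogenous controls appearing on both sides. As a small base-R sketch (independent of `exactt`; all object names are illustrative), the formula can be taken apart to confirm which variable is treated as endogenous and which instruments are excluded from the outcome equation:

```r
# The same two-part formula as in the IV example.
iv_formula <- lwage ~ educ + exper + expersq | exper + expersq + motheduc + fatheduc

# The right-hand side is a call to `|`; its two arguments are the parts.
rhs <- iv_formula[[3]]
regressors  <- all.vars(rhs[[2]])
instruments <- all.vars(rhs[[3]])

# Regressors that are not their own instruments are treated as endogenous.
setdiff(regressors, instruments)
#> [1] "educ"

# Instruments that do not enter the outcome equation directly.
setdiff(instruments, regressors)
#> [1] "motheduc" "fatheduc"
```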

Owner

  • Name: Ian
  • Login: ian-xu-economics
  • Kind: user

Citation (CITATION.cff)

# --------------------------------------------
# CITATION file created with {cffr} R package
# See also: https://docs.ropensci.org/cffr/
# --------------------------------------------
 
cff-version: 1.2.0
message: 'To cite package "exactt" in publications use:'
type: software
license: GPL-3.0-only
title: 'exactt: Implement the Exact t-Test'
version: 4.1.0
abstract: A paragraph providing a full description of the project (on several lines...)
authors:
- family-names: Pouliot
  given-names: Guillaume
- family-names: Xu
  given-names: Ian
  email: ianxu@uchicago.edu
preferred-citation:
  type: manual
  title: 'exactt: Implement the Exact t-Test'
  authors:
    - family-names: "Pouliot"
      given-names: "Guillaume"
    - family-names: "Xu"
      given-names: "Ian"
  year: '2024'
  notes: R package version 1.2.2
  url: https://github.com/ian-xu-economics/exactt
repository-code: https://github.com/ian-xu-economics/exactt
url: https://github.com/ian-xu-economics/exactt
contact:
- family-names: Xu
  given-names: Ian
  email: ianxu@uchicago.edu
references:
- type: software
  title: ggplot2
  abstract: 'ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics'
  notes: Suggests
  url: https://ggplot2.tidyverse.org
  repository: https://CRAN.R-project.org/package=ggplot2
  authors:
  - family-names: Wickham
    given-names: Hadley
    email: hadley@posit.co
    orcid: https://orcid.org/0000-0003-4757-117X
  - family-names: Chang
    given-names: Winston
    orcid: https://orcid.org/0000-0002-1576-2126
  - family-names: Henry
    given-names: Lionel
  - family-names: Pedersen
    given-names: Thomas Lin
    email: thomas.pedersen@posit.co
    orcid: https://orcid.org/0000-0002-5147-4711
  - family-names: Takahashi
    given-names: Kohske
  - family-names: Wilke
    given-names: Claus
    orcid: https://orcid.org/0000-0002-7470-9261
  - family-names: Woo
    given-names: Kara
    orcid: https://orcid.org/0000-0002-5125-4188
  - family-names: Yutani
    given-names: Hiroaki
    orcid: https://orcid.org/0000-0002-3385-7233
  - family-names: Dunnington
    given-names: Dewey
    orcid: https://orcid.org/0000-0002-9415-4582
  - family-names: Brand
    given-names: Teun
    name-particle: van den
    orcid: https://orcid.org/0000-0002-9335-7468
  year: '2024'
- type: software
  title: knitr
  abstract: 'knitr: A General-Purpose Package for Dynamic Report Generation in R'
  notes: Suggests
  url: https://yihui.org/knitr/
  repository: https://CRAN.R-project.org/package=knitr
  authors:
  - family-names: Xie
    given-names: Yihui
    email: xie@yihui.name
    orcid: https://orcid.org/0000-0003-0645-5666
  year: '2024'
- type: software
  title: latex2exp
  abstract: 'latex2exp: Use LaTeX Expressions in Plots'
  notes: Suggests
  url: https://www.stefanom.io/latex2exp/
  repository: https://CRAN.R-project.org/package=latex2exp
  authors:
  - family-names: Meschiari
    given-names: Stefano
    email: stefano.meschiari@gmail.com
  year: '2024'
- type: software
  title: rmarkdown
  abstract: 'rmarkdown: Dynamic Documents for R'
  notes: Suggests
  url: https://pkgs.rstudio.com/rmarkdown/
  repository: https://CRAN.R-project.org/package=rmarkdown
  authors:
  - family-names: Allaire
    given-names: JJ
    email: jj@posit.co
  - family-names: Xie
    given-names: Yihui
    email: xie@yihui.name
    orcid: https://orcid.org/0000-0003-0645-5666
  - family-names: Dervieux
    given-names: Christophe
    email: cderv@posit.co
    orcid: https://orcid.org/0000-0003-4474-2498
  - family-names: McPherson
    given-names: Jonathan
    email: jonathan@posit.co
  - family-names: Luraschi
    given-names: Javier
  - family-names: Ushey
    given-names: Kevin
    email: kevin@posit.co
  - family-names: Atkins
    given-names: Aron
    email: aron@posit.co
  - family-names: Wickham
    given-names: Hadley
    email: hadley@posit.co
  - family-names: Cheng
    given-names: Joe
    email: joe@posit.co
  - family-names: Chang
    given-names: Winston
    email: winston@posit.co
  - family-names: Iannone
    given-names: Richard
    email: rich@posit.co
    orcid: https://orcid.org/0000-0003-3925-190X
  year: '2024'
- type: software
  title: testthat
  abstract: 'testthat: Unit Testing for R'
  notes: Suggests
  url: https://testthat.r-lib.org
  repository: https://CRAN.R-project.org/package=testthat
  authors:
  - family-names: Wickham
    given-names: Hadley
    email: hadley@posit.co
  year: '2024'
  version: '>= 3.0.0'
- type: software
  title: cli
  abstract: 'cli: Helpers for Developing Command Line Interfaces'
  notes: Imports
  url: https://cli.r-lib.org
  repository: https://CRAN.R-project.org/package=cli
  authors:
  - family-names: Csárdi
    given-names: Gábor
    email: csardi.gabor@gmail.com
  year: '2024'
- type: software
  title: combinat
  abstract: 'combinat: combinatorics utilities'
  notes: Imports
  repository: https://CRAN.R-project.org/package=combinat
  authors:
  - family-names: Chasalow
    given-names: Scott
  year: '2024'
- type: software
  title: doParallel
  abstract: 'doParallel: Foreach Parallel Adaptor for the ''parallel'' Package'
  notes: Imports
  url: https://github.com/RevolutionAnalytics/doparallel
  repository: https://CRAN.R-project.org/package=doParallel
  authors:
  - family-names: Corporation
    given-names: Microsoft
  - family-names: Weston
    given-names: Steve
  year: '2024'
- type: software
  title: dplyr
  abstract: 'dplyr: A Grammar of Data Manipulation'
  notes: Imports
  url: https://dplyr.tidyverse.org
  repository: https://CRAN.R-project.org/package=dplyr
  authors:
  - family-names: Wickham
    given-names: Hadley
    email: hadley@posit.co
    orcid: https://orcid.org/0000-0003-4757-117X
  - family-names: François
    given-names: Romain
    orcid: https://orcid.org/0000-0002-2444-4226
  - family-names: Henry
    given-names: Lionel
  - family-names: Müller
    given-names: Kirill
    orcid: https://orcid.org/0000-0002-1416-3412
  - family-names: Vaughan
    given-names: Davis
    email: davis@posit.co
    orcid: https://orcid.org/0000-0003-4777-038X
  year: '2024'
- type: software
  title: Formula
  abstract: 'Formula: Extended Model Formulas'
  notes: Imports
  repository: https://CRAN.R-project.org/package=Formula
  authors:
  - family-names: Zeileis
    given-names: Achim
    email: Achim.Zeileis@R-project.org
    orcid: https://orcid.org/0000-0003-0918-3766
  - family-names: Croissant
    given-names: Yves
    email: Yves.Croissant@univ-reunion.fr
  year: '2024'
- type: software
  title: GA
  abstract: 'GA: Genetic Algorithms'
  notes: Imports
  url: https://luca-scr.github.io/GA/
  repository: https://CRAN.R-project.org/package=GA
  authors:
  - family-names: Scrucca
    given-names: Luca
    email: luca.scrucca@unipg.it
    orcid: https://orcid.org/0000-0003-3826-0484
  year: '2024'
- type: software
  title: ivreg
  abstract: 'ivreg: Instrumental-Variables Regression by ''2SLS'', ''2SM'', or ''2SMM'',
    with Diagnostics'
  notes: Imports
  url: https://zeileis.github.io/ivreg/
  repository: https://CRAN.R-project.org/package=ivreg
  authors:
  - family-names: Fox
    given-names: John
    email: jfox@mcmaster.ca
    orcid: https://orcid.org/0000-0002-1196-8012
  - family-names: Kleiber
    given-names: Christian
    email: Christian.Kleiber@unibas.ch
    orcid: https://orcid.org/0000-0002-6781-4733
  - family-names: Zeileis
    given-names: Achim
    email: Achim.Zeileis@R-project.org
    orcid: https://orcid.org/0000-0003-0918-3766
  year: '2024'
- type: software
  title: Matrix
  abstract: 'Matrix: Sparse and Dense Matrix Classes and Methods'
  notes: Imports
  url: https://R-forge.R-project.org/tracker/?atid=294&group_id=61
  authors:
  - family-names: Bates
    given-names: Douglas
    orcid: https://orcid.org/0000-0001-8316-9503
  - family-names: Maechler
    given-names: Martin
    email: mmaechler+Matrix@gmail.com
    orcid: https://orcid.org/0000-0002-8685-9910
  - family-names: Jagan
    given-names: Mikael
    orcid: https://orcid.org/0000-0002-3542-2938
  year: '2024'
- type: software
  title: MASS
  abstract: 'MASS: Support Functions and Datasets for Venables and Ripley''s MASS'
  notes: Imports
  url: http://www.stats.ox.ac.uk/pub/MASS4/
  authors:
  - family-names: Ripley
    given-names: Brian
    email: ripley@stats.ox.ac.uk
  year: '2024'
- type: software
  title: methods
  abstract: 'R: A Language and Environment for Statistical Computing'
  notes: Imports
  authors:
  - name: R Core Team
  institution:
    name: R Foundation for Statistical Computing
    address: Vienna, Austria
  year: '2024'
- type: software
  title: parallel
  abstract: 'R: A Language and Environment for Statistical Computing'
  notes: Imports
  authors:
  - name: R Core Team
  institution:
    name: R Foundation for Statistical Computing
    address: Vienna, Austria
  year: '2024'
- type: software
  title: rlang
  abstract: 'rlang: Functions for Base Types and Core R and ''Tidyverse'' Features'
  notes: Imports
  url: https://rlang.r-lib.org
  repository: https://CRAN.R-project.org/package=rlang
  authors:
  - family-names: Henry
    given-names: Lionel
    email: lionel@posit.co
  - family-names: Wickham
    given-names: Hadley
    email: hadley@posit.co
  year: '2024'
- type: software
  title: stats
  abstract: 'R: A Language and Environment for Statistical Computing'
  notes: Imports
  authors:
  - name: R Core Team
  institution:
    name: R Foundation for Statistical Computing
    address: Vienna, Austria
  year: '2024'
- type: software
  title: tibble
  abstract: 'tibble: Simple Data Frames'
  notes: Imports
  url: https://tibble.tidyverse.org/
  repository: https://CRAN.R-project.org/package=tibble
  authors:
  - family-names: Müller
    given-names: Kirill
    email: kirill@cynkra.com
    orcid: https://orcid.org/0000-0002-1416-3412
  - family-names: Wickham
    given-names: Hadley
    email: hadley@rstudio.com
  year: '2024'
- type: software
  title: utils
  abstract: 'R: A Language and Environment for Statistical Computing'
  notes: Imports
  authors:
  - name: R Core Team
  institution:
    name: R Foundation for Statistical Computing
    address: Vienna, Austria
  year: '2024'

GitHub Events

Total
  • Push event: 21
Last Year
  • Push event: 21

Dependencies

.github/workflows/R-CMD-check.yaml actions
  • actions/checkout v3 composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/pkgdown.yaml actions
  • JamesIves/github-pages-deploy-action v4.4.1 composite
  • actions/checkout v3 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/render-README.yaml actions
  • actions/checkout v3 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
.github/workflows/test-coverage.yaml actions
  • actions/checkout v3 composite
  • actions/upload-artifact v3 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/update-citation-cff.yaml actions
  • actions/checkout v3 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
DESCRIPTION cran
  • GA * imports
  • MASS * imports
  • Matrix * imports
  • combinat * imports
  • doParallel * imports
  • dplyr * imports
  • methods * imports
  • parallel * imports
  • rootSolve * imports
  • stats * imports
  • tibble * imports
  • knitr * suggests
  • rmarkdown * suggests
  • testthat >= 3.0.0 suggests