estimatr

estimatr: Fast Estimators for Design-Based Inference

https://github.com/declaredesign/estimatr

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    2 of 18 committers (11.1%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (20.3%) to scientific vocabulary

Keywords from Contributors

research simulations tidy-data prediction regression-models closember transformation latex annotation beta
Last synced: 10 months ago

Repository

estimatr: Fast Estimators for Design-Based Inference

Basic Info
Statistics
  • Stars: 133
  • Watchers: 13
  • Forks: 21
  • Open Issues: 73
  • Releases: 17
Created over 9 years ago · Last pushed over 1 year ago
Metadata Files
Readme Changelog License

README.Rmd

---
output: github_document
title: "estimatr: Fast Estimators for Design-Based Inference"
---



```{r, echo = FALSE}
set.seed(42)
knitr::opts_chunk$set(
  collapse = TRUE,
  message = FALSE,
  comment = "#>",
  fig.path = "README-"  
)
options(digits = 2)
```

[![CRAN status](https://www.r-pkg.org/badges/version/estimatr)](https://cran.r-project.org/package=estimatr)
[![CRAN RStudio mirror downloads](https://cranlogs.r-pkg.org/badges/grand-total/estimatr?color=green)](https://r-pkg.org/pkg/estimatr)
[![Build status](https://github.com/DeclareDesign/estimatr/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/DeclareDesign/estimatr/actions/workflows/R-CMD-check.yaml)
[![Codecov test coverage](https://codecov.io/gh/DeclareDesign/estimatr/graph/badge.svg)](https://app.codecov.io/gh/DeclareDesign/estimatr)
[![Replications](https://softwarecite.com/badge/estimatr)](https://softwarecite.com/package/estimatr)

**estimatr** is an `R` package providing a range of commonly used linear estimators, designed for speed and ease of use. Users can easily recover robust, cluster-robust, and other design-appropriate estimates. The package includes two means estimators, `difference_in_means()` and `horvitz_thompson()`, and three linear regression estimators, `lm_robust()`, `lm_lin()`, and `iv_robust()`. In each case, users can choose an estimator that reflects a cluster-randomized, block-randomized, or block-and-cluster-randomized design. The [Getting Started Guide](https://declaredesign.org/r/estimatr/articles/getting-started.html) describes each estimator provided by **estimatr** and how it can be used in your analysis.

There are also several ways to [get regression tables out of estimatr](https://declaredesign.org/r/estimatr/articles/regression-tables.html) using commonly used `R` packages such as `texreg` and `stargazer`. Fast estimators also enable fast simulation of research designs to learn about their properties (see [DeclareDesign](https://declaredesign.org)).

## Installing estimatr

To install the latest stable release of **estimatr**, make sure you are running R version 3.5 or later, then run the following code:

```{r, eval = FALSE}
install.packages("estimatr")
```

## Easy to use

Once the package is installed, getting appropriate estimates and standard errors is both fast and easy.

```{r, eval = TRUE, echo=-1}
set.seed(42)
library(estimatr)

# sample data from cluster-randomized experiment
library(fabricatr)
library(randomizr)
dat <- fabricate(
  N = 100,
  y = rnorm(N),
  clusterID = sample(letters[1:10], size = N, replace = TRUE),
  z = cluster_ra(clusterID)
)

# robust standard errors
res_rob <- lm_robust(y ~ z, data = dat)
# tidy dataframes on command!
tidy(res_rob)

# cluster robust standard errors
res_cl <- lm_robust(y ~ z, data = dat, clusters = clusterID)
# standard summary view also available
summary(res_cl)

# matched-pair design learned from blocks argument
data(sleep)
res_dim <- difference_in_means(extra ~ group, data = sleep, blocks = ID)
```

The [Getting Started Guide](https://declaredesign.org/r/estimatr/articles/getting-started.html) and the reference pages have more examples and uses. The [Mathematical Notes](https://declaredesign.org/r/estimatr/articles/mathematical-notes.html) provide more information about what each estimator is doing under the hood.
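
The two regression estimators not shown above, `lm_lin()` and `iv_robust()`, follow the same formula interface. The chunk below is a minimal sketch on simulated data; the data frame `sim` and its columns `x`, `z`, `d`, and `y` are made up for illustration and are not part of the package.

```r
library(estimatr)

set.seed(42)
n <- 200

# Hypothetical data: x is a pre-treatment covariate, z is a randomized
# binary assignment, d is the treatment actually taken, y is the outcome
sim <- data.frame(
  x = rnorm(n),
  z = rbinom(n, 1, 0.5)
)
sim$d <- rbinom(n, 1, plogis(-1 + 2 * sim$z))
sim$y <- 0.5 * sim$d + sim$x + rnorm(n)

# Covariate adjustment with lm_lin(): the covariate is centered and
# interacted with the treatment indicator
fit_lin <- lm_lin(y ~ z, covariates = ~ x, data = sim)
tidy(fit_lin)

# Two-stage least squares with robust standard errors:
# d is instrumented by z (instruments go after the `|`)
fit_iv <- iv_robust(y ~ d | z, data = sim)
tidy(fit_iv)
```

Both calls return objects that work with `tidy()` and `summary()` in the same way as the `lm_robust()` results above.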

## Fast to use

Getting estimates and robust standard errors is also faster than it used to be. The benchmark below compares **estimatr** with the combination of `lm()` and the `sandwich` package for obtaining HC2 standard errors. More speed comparisons are available in the [benchmarking article](https://declaredesign.org/r/estimatr/articles/benchmarking-estimatr.html). Furthermore, with many blocks (or fixed effects), users can pass the `fixed_effects` argument to `lm_robust()` with HC1 standard errors to greatly improve estimation speed. See the article on [absorbing fixed effects](https://declaredesign.org/r/estimatr/articles/absorbing-fixed-effects.html) for details.

```{r, echo=-1}
set.seed(1)
dat <- data.frame(X = matrix(rnorm(2000*50), 2000), y = rnorm(2000))

library(microbenchmark)
library(lmtest)
library(sandwich)
mb <- microbenchmark(
  `estimatr` = lm_robust(y ~ ., data = dat),
  `lm + sandwich` = {
    lo <- lm(y ~ ., data = dat)
    coeftest(lo, vcov = vcovHC(lo, type = 'HC2'))
  }
)
```
```{r, echo = FALSE}
# request milliseconds explicitly so the values match the column label
d <- summary(mb, unit = "ms")[, c("expr", "median")]
names(d) <- c("method", "median run-time (ms)")
knitr::kable(d)
```
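
The `fixed_effects` argument mentioned above can be sketched as follows. This is a minimal illustration, not a benchmark: the data frame `dat_fe` and its columns `block`, `x`, and `y` are hypothetical. It fits the same model twice, once with block dummies in the formula and once with the blocks absorbed, and checks that the coefficient on `x` agrees.

```r
library(estimatr)

set.seed(1)
n <- 2000

# Hypothetical data with 200 blocks (fixed effects)
dat_fe <- data.frame(
  block = factor(sample(1:200, n, replace = TRUE)),
  x = rnorm(n)
)
dat_fe$y <- 0.5 * dat_fe$x + rnorm(n)

# Block dummies estimated explicitly as part of the formula
fit_dummies <- lm_robust(y ~ x + block, data = dat_fe, se_type = "HC1")

# Blocks projected out via fixed_effects (a one-sided formula)
fit_absorbed <- lm_robust(
  y ~ x,
  data = dat_fe,
  fixed_effects = ~ block,
  se_type = "HC1"
)

# The coefficient on x should agree up to numerical tolerance
all.equal(
  fit_dummies$coefficients[["x"]],
  fit_absorbed$coefficients[["x"]]
)
```

The absorbed fit reports only the coefficient on `x`, which keeps the output compact when the block effects themselves are not of interest.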

---

This project is generously supported by a grant from the [Laura and John Arnold Foundation](http://www.arnoldfoundation.org) and seed funding from [Evidence in Governance and Politics (EGAP)](http://egap.org).

Owner

  • Name: DeclareDesign
  • Login: DeclareDesign
  • Kind: organization

Tools for declaring and diagnosing the properties of research designs

GitHub Events

Total
  • Create event: 5
  • Release event: 1
  • Issues event: 6
  • Watch event: 4
  • Delete event: 3
  • Issue comment event: 8
  • Push event: 7
  • Pull request review event: 9
  • Pull request review comment event: 9
  • Pull request event: 9
  • Fork event: 1
Last Year
  • Create event: 5
  • Release event: 1
  • Issues event: 6
  • Watch event: 4
  • Delete event: 3
  • Issue comment event: 8
  • Push event: 7
  • Pull request review event: 9
  • Pull request review comment event: 9
  • Pull request event: 9
  • Fork event: 1

Committers

Last synced: over 2 years ago

All Time
  • Total Commits: 985
  • Total Committers: 18
  • Avg Commits per committer: 54.722
  • Development Distribution Score (DDS): 0.392
Past Year
  • Commits: 4
  • Committers: 1
  • Avg Commits per committer: 4.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Luke Sonnet l****t@g****m 599
Graeme Blair g****r 177
Neal Fultz n****z@g****m 62
Alexander Coppock a****k@g****m 43
Luke Sonnet l****t 34
Aaron Rudkin j****s@r****a 16
Lily Medina l****u@g****m 14
Nick-Rivera 3****a 10
Vincent Arel-Bundock v****k@u****a 7
Russell V. Lenth r****h@u****u 5
Jasper Cooper j****r@g****m 4
Lily Medina 3
Graeme Blair g****r@g****m 3
Katagiri, Satoshi k****h@g****m 3
Benjamin Elbers e****b@g****m 2
Katrin Leinweber 9****r 1
Vincent Arel-Bundock v****l@u****u 1
Shiro Kuriwaki s****i@g****m 1

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 70
  • Total pull requests: 39
  • Average time to close issues: 3 months
  • Average time to close pull requests: about 1 month
  • Total issue authors: 39
  • Total pull request authors: 11
  • Average comments per issue: 2.57
  • Average comments per pull request: 1.44
  • Merged pull requests: 33
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 3
  • Pull requests: 2
  • Average time to close issues: about 2 months
  • Average time to close pull requests: 26 days
  • Issue authors: 2
  • Pull request authors: 2
  • Average comments per issue: 0.33
  • Average comments per pull request: 0.0
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • graemeblair (13)
  • lukesonnet (7)
  • acoppock (5)
  • NickCH-K (4)
  • grantmcdermott (3)
  • macartan (3)
  • tillea (2)
  • nfultz (2)
  • bfifield (2)
  • pekofsky (2)
  • victorrssx (1)
  • MaelAstruc (1)
  • nicolaiberk (1)
  • vsdornelas (1)
  • ylelkes (1)
Pull Request Authors
  • graemeblair (13)
  • lukesonnet (11)
  • vincentarelbundock (4)
  • nfultz (3)
  • elbersb (2)
  • acoppock (2)
  • RoyalTS (2)
  • jaspercooper (1)
  • MichaelChirico (1)
  • rossellhayes (1)
  • Gedevan-Aleksizde (1)
  • mollyow (1)
Top Labels
Issue Labels
  • bug (15)
  • Priority: Medium (11)
  • Priority: Low (9)
  • Priority: High (9)
  • feature-usability (8)
  • feature-methods (7)
  • lm_robust (5)
  • documentation (3)
  • iv_robust (3)
  • difference_in_means (2)
  • statistical (2)
  • question (2)
  • horvitz_thompson (1)
Pull Request Labels
  • feature-usability (1)

Packages

  • Total packages: 1
  • Total downloads:
    • cran 10,435 last-month
  • Total docker downloads: 20,480
  • Total dependent packages: 23
  • Total dependent repositories: 33
  • Total versions: 22
  • Total maintainers: 1
cran.r-project.org: estimatr

Fast Estimators for Design-Based Inference

  • Versions: 22
  • Dependent Packages: 23
  • Dependent Repositories: 33
  • Downloads: 10,435 Last month
  • Docker Downloads: 20,480
Rankings
Stargazers count: 3.1%
Dependent packages count: 3.3%
Downloads: 4.2%
Forks count: 4.6%
Dependent repos count: 4.6%
Average: 5.4%
Docker downloads count: 12.5%
Maintainers (1)
Last synced: 10 months ago

Dependencies

DESCRIPTION cran
  • R >= 3.5.0 depends
  • texreg * enhances
  • Formula * imports
  • Rcpp >= 0.12.16 imports
  • generics * imports
  • methods * imports
  • rlang >= 0.2.0 imports
  • AER * suggests
  • RcppEigen * suggests
  • car * suggests
  • clubSandwich * suggests
  • emmeans >= 1.4 suggests
  • estimability * suggests
  • fabricatr >= 0.10.0 suggests
  • ivpack * suggests
  • margins * suggests
  • modelsummary * suggests
  • prediction * suggests
  • randomizr >= 0.20.0 suggests
  • sandwich * suggests
  • stargazer * suggests
  • testthat * suggests