cobalt

Covariate Balance Tables and Plots - An R package for assessing covariate balance

https://github.com/ngreifer/cobalt

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    2 of 8 committers (25.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (15.3%) to scientific vocabulary

Keywords

causal-inference propensity-scores r
Last synced: 10 months ago

Repository


Basic Info
Statistics
  • Stars: 77
  • Watchers: 3
  • Forks: 14
  • Open Issues: 6
  • Releases: 0
Created almost 10 years ago · Last pushed 11 months ago
Metadata Files
Readme Changelog

README.Rmd

---
output: github_document
---



```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = FALSE,
  warning = FALSE,
  message = FALSE,
  tidy = FALSE,
  fig.align='center',
  comment = "#>",
  fig.path = "man/figures/README-"
)
#if (!requireNamespace("MatchIt")) knitr::opts_chunk$set(eval = FALSE)
```
# cobalt: Covariate Balance Tables and Plots 

[![CRAN_Status_Badge](https://img.shields.io/cran/v/cobalt?color=%230047ab)](https://cran.r-project.org/package=cobalt) [![CRAN_Downloads_Badge](https://cranlogs.r-pkg.org/badges/cobalt?color=%230047ab)](https://cran.r-project.org/package=cobalt)

------
### Overview

Welcome to `cobalt`, which stands for **Co**variate **Bal**ance **T**ables (and Plots). `cobalt` allows users to assess balance on covariate distributions in preprocessed groups generated through weighting, matching, or subclassification, such as by using the propensity score. `cobalt`'s primary function is `bal.tab()`, which stands for "balance table", and is meant to replace (or supplement) the balance assessment tools found in other R packages.

To examine how `bal.tab()` integrates with these packages and others, see the help file for `bal.tab()` with `?bal.tab`, which links to the method used for each package. Each of those help pages has examples of how `bal.tab()` is used with the package. There are also several vignettes detailing the use of `cobalt`, which can be accessed at `vignette("cobalt")`: one for basic uses of `cobalt`, one for the use of `cobalt` with additional packages, one for the use of `cobalt` with multiply imputed and/or clustered data, one for the use of `cobalt` with longitudinal treatments, and one for the use of `cobalt` to generate publication-ready plots.

Currently, `cobalt` is compatible with output from `MatchIt`, `twang`, `Matching`, `optmatch`, `CBPS`, `ebal`, `WeightIt`, `designmatch`, `sbw`, `MatchThem`, and `cem`, as well as with data not processed through these packages.

For more information, check out the `cobalt` [website](https://ngreifer.github.io/cobalt/)!

### Why cobalt?

Most of the major conditioning packages contain functions to assess balance; so why use `cobalt` at all? `cobalt` arose out of several desiderata when using these packages: to have standardized measures that were consistent across all conditioning packages, to allow for flexibility in the calculation and display of balance measures, and to incorporate recent methodological recommendations in the assessment of balance. In addition, `cobalt` has unique plotting capabilities that make use of `ggplot2` in R for balance assessment and reporting.

Because conditioning methods are spread across several packages which each have their idiosyncrasies in how they report balance (if at all), comparing the resulting balance from various conditioning methods can be a challenge. `cobalt` unites these packages by providing a single, flexible tool that intelligently processes output from any of the conditioning packages and provides the user with both useful defaults and customizable options for display and calculation. `cobalt` also allows for balance assessment on data not generated through any of the conditioning packages. In addition, `cobalt` has tools for assessing and reporting balance for clustered data sets, data sets generated through multiple imputation, and data sets with a continuous treatment variable, all features that exist in very limited capacities or not at all in other packages. 
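As a rough illustration of the continuous-treatment support mentioned above, the sketch below calls `bal.tab()` through its formula interface on data that has not been processed by any conditioning package. Using `re75` as the "treatment" is an assumption made purely for illustration (it is not a substantively meaningful treatment in the `lalonde` data), and the object name `b.cont` is hypothetical.

```r
library("cobalt")
data("lalonde", package = "cobalt")

# Illustration only: treat 1975 earnings (re75) as a continuous "treatment".
# With a continuous treatment, balance is summarized by treatment-covariate
# correlations rather than by mean differences.
b.cont <- bal.tab(re75 ~ age + educ + race + married + nodegree + re74,
                  data = lalonde)

# Print the balance table for the unadjusted sample
b.cont
```

Because no weights or matching output are supplied, the table describes the unadjusted sample only.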

A large focus in developing `cobalt` was to streamline output so that only the most useful, non-redundant, and complete information is displayed, all at the user's choice. Balance statistics are intuitive, methodologically informed, and simple to interpret. Visual displays of balance reflect the goals of balance assessment rather than being steps removed. While other packages have focused their efforts on processing data, `cobalt` only assesses balance, and does so particularly well.

New features are being added all the time, following the cutting edge of methodological work on balance assessment. As new packages and methods are developed, `cobalt` will be ready to integrate them to further our goal of simple, unified balance assessment.

### Examples

Below are examples of `cobalt`'s primary functions:

```{r}
library("cobalt")
data("lalonde", package = "cobalt")

#Nearest neighbor matching with MatchIt
m.out <- MatchIt::matchit(treat ~ age + educ + race + married +
                              nodegree + re74 + re75,
                          data = lalonde)

#Checking balance before and after matching:
bal.tab(m.out, thresholds = c(m = .1), un = TRUE)
```

```{r, fig.show='hide', fig.width=4, fig.height=4, collapse=TRUE}
#Examining distributional balance with plots:
bal.plot(m.out, var.name = "educ")
bal.plot(m.out, var.name = "distance",
         mirror = TRUE, type = "histogram")
```
![](man/figures/README-unnamed-chunk-3-1.png){ display=inline } ![](man/figures/README-unnamed-chunk-3-2.png){ display=inline }


```{r}
#Generating a Love plot to report balance:
love.plot(m.out, stats = c("mean.diffs", "variance.ratios"),
          thresholds = c(m = .1, v = 2), abs = TRUE, 
          binary = "std",
          var.order = "unadjusted")
```
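`bal.tab()` can also be used without a conditioning package, as noted in the overview. The following is a minimal sketch, assuming propensity scores estimated with a plain logistic regression and ATT weights computed by hand; the column names `p.score` and `att.weights` and the object names `ps.fit` and `b.out` are chosen here for illustration and are not part of `cobalt`.

```r
library("cobalt")
data("lalonde", package = "cobalt")

# Estimate propensity scores with a logistic regression (base R only)
ps.fit <- glm(treat ~ age + educ + race + married + nodegree + re74 + re75,
              data = lalonde, family = binomial)
lalonde$p.score <- fitted(ps.fit)

# ATT weights: 1 for treated units, odds of treatment for control units
lalonde$att.weights <- with(lalonde,
                            treat + (1 - treat) * p.score / (1 - p.score))

# Assess balance before and after weighting; the weights are supplied as the
# name of a column in `data`, and mean differences are standardized using the
# treated group's standard deviation
b.out <- bal.tab(treat ~ age + educ + race + married + nodegree + re74 + re75,
                 data = lalonde, weights = "att.weights",
                 s.d.denom = "treated", un = TRUE)
b.out
```

Standardizing by the treated group's standard deviation is one common choice when targeting the ATT; other estimands would typically call for a different `s.d.denom`.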

Please remember to cite this package when using it to analyze data. For example, in a manuscript, you could write: "Matching was performed using the *Matching* package (Sekhon, 2011), and covariate balance was assessed using *cobalt* (Greifer, `r format(Sys.Date(), "%Y")`), both in R (R Core Team, `r R.version$year`)." Use `citation("cobalt")` to generate a bibliographic reference for the `cobalt` package.
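For instance, the citation can be printed and converted to BibTeX with base R utilities. This is a small sketch; the object name `cit` is arbitrary.

```r
# Retrieve the recommended citation for the installed version of cobalt
cit <- citation("cobalt")
print(cit)

# Convert the same citation to a BibTeX entry for a reference manager
toBibtex(cit)
```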

Bugs occasionally appear in `cobalt` and are often found by users. Please report any bugs at https://github.com/ngreifer/cobalt/issues. To install the latest development version of `cobalt`, which may have fixed a bug you're experiencing, use the following code:

```{r, eval=FALSE}
#Requires the remotes package; if needed, run install.packages("remotes") first
remotes::install_github("ngreifer/cobalt")
```

Owner

  • Name: Noah Greifer
  • Login: ngreifer
  • Kind: user
  • Company: Harvard University

Data Science Specialist at Harvard University Institute for Quantitative Social Science (IQSS)

GitHub Events

Total
  • Issues event: 15
  • Watch event: 3
  • Issue comment event: 12
  • Push event: 34
  • Fork event: 3
Last Year
  • Issues event: 15
  • Watch event: 3
  • Issue comment event: 12
  • Push event: 34
  • Fork event: 3

Committers

Last synced: about 1 year ago

All Time
  • Total Commits: 1,577
  • Total Committers: 8
  • Avg Commits per committer: 197.125
  • Development Distribution Score (DDS): 0.022
Past Year
  • Commits: 25
  • Committers: 1
  • Avg Commits per committer: 25.0
  • Development Distribution Score (DDS): 0.0
Top Committers
| Name | Email | Commits |
|------|-------|---------|
| Noah Greifer | n****r@g****m | 1,542 |
| Noah Greifer | n****r@N****l | 24 |
| Greifer | n****r@a****u | 5 |
| Gina Reynolds | e****e@g****m | 2 |
| Teun van den Brand | t****d@g****m | 1 |
| Jesse Cambon | 3****n | 1 |
| CharpignonML | m****c@g****m | 1 |
| MM | m****h@g****h | 1 |

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 80
  • Total pull requests: 13
  • Average time to close issues: 4 months
  • Average time to close pull requests: about 2 months
  • Total issue authors: 58
  • Total pull request authors: 8
  • Average comments per issue: 2.39
  • Average comments per pull request: 0.54
  • Merged pull requests: 11
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 8
  • Pull requests: 0
  • Average time to close issues: about 1 month
  • Average time to close pull requests: N/A
  • Issue authors: 8
  • Pull request authors: 0
  • Average comments per issue: 1.88
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • ngreifer (8)
  • victorn1 (7)
  • BarkleyBG (3)
  • KZARCA (2)
  • sky0502 (2)
  • eipi10 (2)
  • maellecoursonnais (2)
  • markdanese (2)
  • BorgeJorge (2)
  • sbaross (1)
  • mantramantra12 (1)
  • norafino (1)
  • jbp7 (1)
  • yohei-h (1)
  • ks8997 (1)
Pull Request Authors
  • ngreifer (6)
  • teunbrand (2)
  • etiennebacher (2)
  • EvaMaeRey (1)
  • sumtxt (1)
  • jessecambon (1)
  • ganesh-krishnan (1)
  • CharpignonML (1)
Top Labels
Issue Labels
  • bug (2)
  • to do (2)
  • enhancement (1)

Packages

  • Total packages: 2
  • Total downloads:
    • cran 10,509 last-month
  • Total docker downloads: 42,767
  • Total dependent packages: 10
    (may contain duplicates)
  • Total dependent repositories: 13
    (may contain duplicates)
  • Total versions: 51
  • Total maintainers: 1
cran.r-project.org: cobalt

Covariate Balance Tables and Plots

  • Versions: 43
  • Dependent Packages: 9
  • Dependent Repositories: 12
  • Downloads: 10,509 Last month
  • Docker Downloads: 42,767
Rankings
Docker downloads count: 0.6%
Downloads: 5.2%
Average: 5.2%
Stargazers count: 5.4%
Forks count: 5.8%
Dependent packages count: 6.1%
Dependent repos count: 8.4%
Maintainers (1)
Last synced: 11 months ago
conda-forge.org: r-cobalt
  • Versions: 8
  • Dependent Packages: 1
  • Dependent Repositories: 1
Rankings
Dependent repos count: 24.3%
Dependent packages count: 29.0%
Average: 33.8%
Stargazers count: 37.9%
Forks count: 43.8%
Last synced: 11 months ago

Dependencies

DESCRIPTION cran
  • R >= 3.3.0 depends
  • backports >= 1.1.9 imports
  • crayon * imports
  • ggplot2 >= 3.3.0 imports
  • grid * imports
  • gridExtra >= 2.3 imports
  • gtable >= 0.3.0 imports
  • rlang >= 0.4.0 imports
  • CBPS >= 0.17 suggests
  • MatchIt >= 4.0.0 suggests
  • MatchThem >= 0.9.3 suggests
  • Matching * suggests
  • WeightIt >= 0.12.0 suggests
  • cem >= 1.1.27 suggests
  • designmatch * suggests
  • ebal * suggests
  • knitr * suggests
  • mice >= 3.8.0 suggests
  • optmatch * suggests
  • optweight * suggests
  • rmarkdown * suggests
  • sbw >= 1.1.5 suggests
  • twang >= 1.6 suggests
  • twangContinuous * suggests
.github/workflows/pkgdown.yaml actions
  • JamesIves/github-pages-deploy-action 4.1.4 composite
  • actions/checkout v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite