codebook

Cook rmarkdown codebooks from metadata on R data frames

https://github.com/rubenarslan/codebook

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (15.1%) to scientific vocabulary

Keywords

codebook documentation formr json-ld metadata r spss webapp
Last synced: 6 months ago

Repository


Statistics
  • Stars: 144
  • Watchers: 4
  • Forks: 18
  • Open Issues: 21
  • Releases: 10
Created over 8 years ago · Last pushed about 1 year ago
Metadata Files
Readme Changelog License Zenodo

README.Rmd

---
output: github_document
---



```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)
```

# codebook
[![Travis-CI Build Status](https://travis-ci.org/rubenarslan/codebook.svg?branch=master)](https://app.travis-ci.com/rubenarslan/codebook) 
[![CRAN status](http://www.r-pkg.org/badges/version-ago/codebook)](https://cran.r-project.org/package=codebook) 
![Downloads](https://cranlogs.r-pkg.org/badges/grand-total/codebook) 
[![codecov](https://codecov.io/gh/rubenarslan/codebook/branch/master/graph/badge.svg)](https://app.codecov.io/gh/rubenarslan/codebook) 
[![DOI](https://zenodo.org/badge/109252375.svg)](https://zenodo.org/badge/latestdoi/109252375)

_Automatic Codebooks from Metadata Encoded in Dataset Attributes_

## Description

Easily automate the following tasks to describe data frames: 
- summarise the distributions and labelled missing values of variables, graphically and with descriptive statistics,
- for surveys, compute and summarise reliabilities (internal consistencies, retest, multilevel) for psychological scales,
- combine this information with metadata (such as item labels and labelled values) that is derived from R attributes.

To do so, the package relies on 'rmarkdown' partials, so you can generate HTML, PDF, and Word documents. Codebooks are also available as tables (CSV, Excel, etc.) and in JSON-LD, so that search engines can find your data and index the metadata.


## Generate markdown codebooks from the attributes of the variables in your data frame

RStudio and a few of the tidyverse packages already display the information contained in the attributes of the variables in your data frame in a useful way. The [haven](https://github.com/tidyverse/haven) package also manages to grab variable documentation from SPSS or Stata files.
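
As a rough illustration of what "metadata in attributes" means, the sketch below attaches a variable label and value labels to the columns of a toy data frame using base R only. The data frame, its column names, and the label texts are made up for this example; the attribute names `label` and `labels` are the ones used by haven-style labelled data, which is assumed here to be the kind of markup the package reads.

```r
# Toy survey data; all column names and labels are made up for illustration.
survey <- data.frame(
  age = c(23, 35, 41),
  satisfied = c(1, 2, 1)
)

# A variable label is stored as a "label" attribute on the column.
attr(survey$age, "label") <- "Age in years"
attr(survey$satisfied, "label") <- "Are you satisfied with your job?"

# Value labels are stored as a named vector in the "labels" attribute.
attr(survey$satisfied, "labels") <- c(Yes = 1, No = 2)

# Collect the variable labels into a named character vector.
# exact = TRUE avoids partial matching between "label" and "labels".
var_labels <- vapply(
  survey,
  function(x) attr(x, "label", exact = TRUE),
  character(1)
)
print(var_labels)
```

Data imported from SPSS or Stata files with haven should already carry attributes of this kind, so setting them by hand is mainly useful for data that originate in R.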

## RStudio Addin
If the RStudio data viewer scrolls too slowly for your taste, or you'd like to keep the variable labels in view while working, use our RStudio Addins (ideally assigned to a keyboard shortcut) to see and search variable and value labels in the viewer pane. 

![Gif of Addin](https://rubenarslan.github.io/codebook/reference/figures/codebook_addin.gif)

## Codebook generation

The codebook package takes those attributes and the data and tries to produce a good-looking codebook, i.e. a place to get an overview of the variables in a dataset. The codebook processes single items, but also "scales", i.e. psychological questionnaires that are aggregated to extract a construct. For scales, the appropriate reliability coefficients (internal consistencies for single measurements, retest reliabilities for repeated measurements, multilevel reliability for multilevel data) are computed.
For items and scales, the distributions are summarised graphically and numerically.
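
To make the idea of a "scale" concrete, here is a minimal base R sketch of three items aggregated into one score, together with a hand-computed Cronbach's alpha as one example of an internal consistency coefficient. The item names and values are hypothetical, and the alpha formula is only an illustration; it is not necessarily the coefficient or the implementation the package uses.

```r
# Hypothetical three-item scale. The names follow the pattern
# scale_1, scale_2R, scale_3R; a trailing "R" marks an item that
# was already reverse-scored by the analyst.
items <- data.frame(
  extra_1  = c(4, 2, 5),
  extra_2R = c(3, 1, 5),
  extra_3R = c(5, 2, 4)
)
item_names <- c("extra_1", "extra_2R", "extra_3R")

# The aggregate shares the stem of its items' names.
items$extra <- rowMeans(items[, item_names])

# Cronbach's alpha by hand:
# alpha = k / (k - 1) * (1 - sum of item variances / variance of sum score)
k <- length(item_names)
item_vars <- vapply(items[, item_names], var, numeric(1))
total_var <- var(rowSums(items[, item_names]))
alpha <- k / (k - 1) * (1 - sum(item_vars) / total_var)
print(round(alpha, 2))

# With the package installed, the aggregate could then be registered via
# items <- codebook::detect_scales(items)
```

Keeping the aggregate's name identical to the common stem of its items is what lets a name-based detection step link the two.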

This package integrates tightly with formr ([formr.org](https://formr.org)), an online survey framework, and especially with the data frames produced and marked up by the [formr R package](https://github.com/rubenarslan/formr). However, codebook works completely independently of formr.

## Documentation
See the package help or the online documentation at https://rubenarslan.github.io/codebook/.
See the [vignette](https://rubenarslan.github.io/codebook/articles/codebook.html) for a quick example of an HTML document generated using `codebook`, or below for a copy-pastable rmarkdown document to get you started.

## Use as a webapp

If you don't want to install the codebook package, you can just upload an annotated dataset in a variety of formats (R, SPSS, Stata, ...) here: https://codebook.formr.org

## Use locally
### Install

Run the following in R.
```r
install.packages("codebook")
```

Or to get the latest development version:

```r
install.packages("remotes")
remotes::install_github("rubenarslan/codebook")
```

Then run the following to get started:

```r
library(codebook)
new_codebook_rmd()
```

## Citation
To cite the package, you can cite the open access paper, but to make your codebook
traceable to the version of the package you used, you might also want to cite
the archived package DOI.

### Paper
> Arslan, R. C. (2019). How to automatically document data with the codebook package to facilitate data re-use. Advances in Methods and Practices in Psychological Science. [doi:10.1177/2515245919838783](https://doi.org/10.1177/2515245919838783)

### Zenodo
> Arslan, R. C. (2024). Automatic codebooks from survey metadata (2018). URL https://github.com/rubenarslan/codebook. [![DOI](https://zenodo.org/badge/109252375.svg)](https://zenodo.org/badge/latestdoi/109252375)


### How to use
Here's a simple rmarkdown template that you can use to get started.
The resulting codebook will be an HTML file, but you can also choose to generate PDFs or Word files by fiddling with the `output` settings.

````markdown
---
title: "Codebook"
output:
  html_document:
    toc: true
    toc_depth: 4
    toc_float: true
    code_folding: 'hide'
    self_contained: true
  pdf_document:
    toc: yes
    toc_depth: 4
    latex_engine: xelatex
---

```{r setup}`r ''`
knitr::opts_chunk$set(
  warning = TRUE, # show warnings during codebook generation
  message = TRUE, # show messages during codebook generation
  error = TRUE, # do not interrupt codebook generation in case of errors,
                # usually makes debugging easier, and sometimes half a codebook
                # is better than none
  echo = FALSE  # don't show the R code
)
ggplot2::theme_set(ggplot2::theme_bw())

```

Here, we import data from formr.

```{r}`r ''`
library(formr)
source(".passwords.R")
formr_connect(email = credentials$email, password = credentials$password)
codebook_data <- formr_results("s3_daily")
```

But we can also import data from e.g. an SPSS file.
```{r}`r ''`
codebook_data <- rio::import("s3_daily.sav")
```


Sometimes, the metadata is not set up in such a way that codebook
can leverage it fully. These functions help fix this.

```{r codebook}`r ''`
library(codebook) # load the package
# omit the following lines, if your missing values are already properly labelled
codebook_data <- detect_missing(codebook_data,
    only_labelled = TRUE, # only labelled values are autodetected as missing
    negative_values_are_missing = FALSE, # negative values are NOT missing values
    ninety_nine_problems = TRUE # 99/999 are missing values, if they
                                # are more than 5 MAD from the median
)

# If you are not using formr, the codebook package needs to guess which items
# form a scale. The following line finds item aggregates with names like this:
# scale = scale_1 + scale_2R + scale_3R
# identifying these aggregates allows the codebook function to
# automatically compute reliabilities.
# However, it will not reverse items automatically.
codebook_data <- detect_scales(codebook_data)
```

Now, generating a codebook is as simple as calling codebook from a chunk in an
rmarkdown document.

```{r}`r ''`
codebook(codebook_data)
```
````
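
The comment on `ninety_nine_problems` in the template mentions values that are "more than 5 MAD from the median". The base R sketch below shows what such a rule amounts to on toy data. It is an illustration of the heuristic as described in that comment, with made-up responses, and is not the package's actual implementation.

```r
# Toy responses on a 1-5 scale, with 99 used as an undeclared missing code.
responses <- c(1, 2, 3, 4, 5, 3, 2, 4, 99)

centre <- median(responses)
# Median absolute deviation; stats::mad() scales it by 1.4826 by default.
spread <- mad(responses)

# Distance of the candidate code from the median, in MAD units.
distance_in_mads <- abs(99 - centre) / spread

# Treat 99 as a missing-value code only if it lies implausibly far
# from the bulk of the data.
if (distance_in_mads > 5) {
  responses[responses == 99] <- NA
}
print(responses)
```

The point of the distance check is that 99 stays untouched in variables where it is a plausible value, such as an age or a percentage.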

## [Code of conduct for contributing](https://github.com/rubenarslan/codebook/blob/master/CONDUCT.md)

Owner

  • Name: Ruben C. Arslan
  • Login: rubenarslan
  • Kind: user
  • Location: Berlin
  • Company: @rforms @Meta-Rep

Psychologist. Working on an open source survey and study software (formr.org), two R packages and reproducible documentation of my statistical analyses.

GitHub Events

Total
  • Create event: 1
  • Issues event: 2
  • Release event: 1
  • Watch event: 2
  • Issue comment event: 2
  • Push event: 3
  • Fork event: 2
Last Year
  • Create event: 1
  • Issues event: 2
  • Release event: 1
  • Watch event: 2
  • Issue comment event: 2
  • Push event: 3
  • Fork event: 2

Committers

Last synced: almost 3 years ago

All Time
  • Total Commits: 240
  • Total Committers: 3
  • Avg Commits per committer: 80.0
  • Development Distribution Score (DDS): 0.037
Top Committers
Name (Email): Commits
  • Ruben C. Arslan (r****n@g****m): 231
  • Ruben Arslan (r****n@o****e): 7
  • Hadley Wickham (h****m@g****m): 2

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 76
  • Total pull requests: 2
  • Average time to close issues: 9 months
  • Average time to close pull requests: about 1 month
  • Total issue authors: 40
  • Total pull request authors: 2
  • Average comments per issue: 1.5
  • Average comments per pull request: 0.5
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 0.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • rubenarslan (29)
  • JanaJarecki (5)
  • nsunami (2)
  • cloversleaves (2)
  • hadley (2)
  • elinalutz (2)
  • ericpgreen (1)
  • krlmlr (1)
  • stefvanbuuren (1)
  • sdaza (1)
  • cboettig (1)
  • ohlandja (1)
  • ferreira-santos (1)
  • cliocym (1)
  • marcschmid00 (1)
Pull Request Authors
  • hadley (1)
  • chengchou (1)
Top Labels
Issue Labels
bug (3) help wanted (1)

Packages

  • Total packages: 2
  • Total downloads:
    • cran 687 last-month
  • Total docker downloads: 70
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 1
    (may contain duplicates)
  • Total versions: 23
  • Total maintainers: 1
proxy.golang.org: github.com/rubenarslan/codebook
  • Versions: 10
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 5.4%
Average: 5.6%
Dependent repos count: 5.8%
Last synced: 6 months ago
cran.r-project.org: codebook

Automatic Codebooks from Metadata Encoded in Dataset Attributes

  • Versions: 13
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 687 Last month
  • Docker Downloads: 70
Rankings
Stargazers count: 3.0%
Forks count: 4.6%
Average: 15.7%
Downloads: 18.4%
Dependent repos count: 24.0%
Dependent packages count: 28.8%
Maintainers (1)
Last synced: 6 months ago

Dependencies

DESCRIPTION cran
  • R >= 3.2.0 depends
  • dplyr >= 1.0.0 imports
  • forcats >= 0.4.0 imports
  • ggplot2 >= 2.0.0 imports
  • glue * imports
  • graphics * imports
  • haven >= 2.3.0 imports
  • htmltools * imports
  • jsonlite * imports
  • knitr * imports
  • labeling * imports
  • labelled * imports
  • likert * imports
  • methods * imports
  • purrr * imports
  • rlang * imports
  • rmdpartials * imports
  • skimr >= 2.1.0 imports
  • stats * imports
  • stringr * imports
  • tibble * imports
  • tidyr * imports
  • tidyselect * imports
  • utils * imports
  • vctrs >= 0.3.0 imports
  • DT * suggests
  • future * suggests
  • lme4 * suggests
  • miniUI >= 0.1.1 suggests
  • psych * suggests
  • rio * suggests
  • rmarkdown * suggests
  • roxygen2 * suggests
  • rstudioapi >= 0.5 suggests
  • shiny >= 0.13 suggests
  • shinytest * suggests
  • testthat * suggests
  • ufs * suggests
  • userfriendlyscience * suggests
  • webshot * suggests
.github/workflows/rhub.yaml actions
  • r-hub/actions/checkout v1 composite
  • r-hub/actions/platform-info v1 composite
  • r-hub/actions/run-check v1 composite
  • r-hub/actions/setup v1 composite
  • r-hub/actions/setup-deps v1 composite
  • r-hub/actions/setup-r v1 composite