tidyr

Tidy Messy Data

https://github.com/tidyverse/tidyr

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    3 of 136 committers (2.2%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (19.0%) to scientific vocabulary

Keywords

r tidy-data

Keywords from Contributors

grammar data-manipulation visualisation documentation-tool tidyverse package-creation rmarkdown curl latex ropensci
Last synced: 6 months ago

Repository

Statistics
  • Stars: 1,409
  • Watchers: 70
  • Forks: 417
  • Open Issues: 64
  • Releases: 31
Topics
r tidy-data
Created over 11 years ago · Last pushed 7 months ago
Metadata Files
Readme, Changelog, Contributing, License, Code of conduct, Codeowners, Support

README.Rmd

---
output: github_document
---



```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)
```

# tidyr


[![CRAN status](https://www.r-pkg.org/badges/version/tidyr)](https://cran.r-project.org/package=tidyr)
[![R-CMD-check](https://github.com/tidyverse/tidyr/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/tidyverse/tidyr/actions/workflows/R-CMD-check.yaml)
[![Codecov test coverage](https://codecov.io/gh/tidyverse/tidyr/branch/main/graph/badge.svg)](https://app.codecov.io/gh/tidyverse/tidyr?branch=main)

  
## Overview

The goal of tidyr is to help you create __tidy data__. Tidy data is data where:

1. Each variable is a column; each column is a variable.
1. Each observation is a row; each row is an observation.
1. Each value is a cell; each cell is a single value.

Tidy data describes a standard way of storing data that is used wherever possible throughout the [tidyverse](https://www.tidyverse.org/). If you ensure that your data is tidy, you'll spend less time fighting with the tools and more time working on your analysis. Learn more about tidy data in `vignette("tidy-data")`.
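
As a small illustration of the three rules, here is a minimal sketch. The `messy` table and its column names are invented for this example and are not taken from the tidyr documentation. The year values start out as column headers, and `pivot_longer()` moves them into a column of their own so that each row is one observation.

```r
library(tidyr)

# Hypothetical example data: 2019 and 2020 are values of a "year"
# variable, but here they are stored as column names.
messy <- tibble::tibble(
  country = c("A", "B"),
  `2019`  = c(10, 30),
  `2020`  = c(20, 40)
)

# Move the year columns into a "year"/"cases" pair of columns so that
# every row is one observation (one country in one year).
tidy <- pivot_longer(
  messy,
  cols      = c(`2019`, `2020`),
  names_to  = "year",
  values_to = "cases"
)

tidy
```

The result has one column per variable (`country`, `year`, `cases`) and four rows, one for each country-year pair.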

## Installation

```{r, eval = FALSE}
# The easiest way to get tidyr is to install the whole tidyverse:
install.packages("tidyverse")

# Alternatively, install just tidyr:
install.packages("tidyr")

# Or the development version from GitHub:
# install.packages("pak")
pak::pak("tidyverse/tidyr")
```
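
After installing, you may want to confirm which version you ended up with. The snippet below is a minimal sketch using base R only; the `has_pivot` variable is a name chosen for this example, and the 1.0.0 threshold comes from the pivoting note later in this README.

```r
# Check that tidyr is installed and report its version.
if (requireNamespace("tidyr", quietly = TRUE)) {
  installed <- utils::packageVersion("tidyr")
  message("tidyr ", as.character(installed), " is installed")

  # pivot_longer() and pivot_wider() arrived in tidyr 1.0.0.
  has_pivot <- installed >= "1.0.0"
} else {
  has_pivot <- FALSE
  message("tidyr is not installed; run install.packages(\"tidyr\")")
}
```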

## Getting started

```{r}
library(tidyr)
```

tidyr functions fall into five main categories:

* "Pivoting", which converts between long and wide forms. tidyr 1.0.0
  introduced `pivot_longer()` and `pivot_wider()`, replacing the older
  `gather()` and `spread()` functions. See `vignette("pivot")` for more
  details.
  
* "Rectangling", which turns deeply nested lists (as from JSON) into tidy
  tibbles. See `unnest_longer()`, `unnest_wider()`, `hoist()`, and 
  `vignette("rectangle")` for more details.
  
* "Nesting", which converts grouped data to a form where each group becomes a
  single row containing a nested data frame; "unnesting" does the opposite.
  See `nest()`, `unnest()`, and `vignette("nest")` for more details.

* Splitting and combining character columns. Use `separate_wider_delim()`,
  `separate_wider_position()`, and `separate_wider_regex()` to pull a single
  character column into multiple columns; use `unite()` to combine multiple
  columns into a single character column.

* Handling missing values. Make implicit missing values explicit with
  `complete()`; make explicit missing values implicit with `drop_na()`;
  replace missing values with the next or previous value using `fill()`, or
  with a known value using `replace_na()`.
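
The five categories above can be tried out on one small example. This is a minimal sketch, not taken from the tidyr documentation: the `wide` and `users` objects and their column names are invented for illustration, and the `separate_wider_delim()` call assumes a tidyr version that ships that function (1.3.0 or later, as far as I know).

```r
library(tidyr)

# Hypothetical data used only for illustration.
wide <- tibble::tibble(
  id = c(1, 2),
  x  = c("a-b", "c-d"),
  q1 = c(5, NA),
  q2 = c(7, 9)
)

# Pivoting: q1/q2 become rows, then go back to columns.
long <- pivot_longer(wide, cols = c(q1, q2),
                     names_to = "quarter", values_to = "score")
back <- pivot_wider(long, names_from = quarter, values_from = score)

# Nesting: one row per id/x pair, with the remaining columns stored in
# a list-column called "data"; unnest() reverses it.
nested   <- nest(long, data = c(quarter, score))
unnested <- unnest(nested, data)

# Splitting and combining character columns.
split_cols <- separate_wider_delim(wide, x, delim = "-",
                                   names = c("left", "right"))
joined     <- unite(split_cols, "x", left, right, sep = "-")

# Missing values.
no_na  <- drop_na(long, score)                # explicit NA row is dropped
zeroed <- replace_na(long, list(score = 0))   # NA replaced by a known value
filled <- fill(long, score, .direction = "down")  # previous value carried down
                                                  # (ungrouped in this sketch)
full   <- complete(no_na, nesting(id, x), quarter)  # dropped row comes back as NA

# Rectangling: a small nested list, as might come from parsed JSON.
users <- tibble::tibble(
  user = list(
    list(name = "ann", langs = c("R", "SQL")),
    list(name = "bob", langs = "Python")
  )
)
rect <- unnest_wider(users, user)    # one column per list component
rect <- unnest_longer(rect, langs)   # one row per language
```

In this sketch `back` and `joined` reproduce `wide`, and `full` has the same four id/quarter rows as `long`, which shows that each pair of functions is roughly the inverse of the other.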

## Related work

tidyr [supersedes](https://lifecycle.r-lib.org/articles/stages.html#superseded) reshape2 (2010-2014) and reshape (2005-2010). Somewhat counterintuitively, each iteration of the package has done less. tidyr is designed specifically for tidying data, not for general reshaping (reshape2) or general aggregation (reshape).

[data.table](https://rdatatable.gitlab.io/data.table) provides high-performance implementations of `melt()` and `dcast()`.

If you'd like to read more about data reshaping from a CS perspective, I'd recommend the following three papers:

* [Wrangler: Interactive visual specification of data transformation scripts](http://vis.stanford.edu/papers/wrangler)

* [An interactive framework for data cleaning](https://www2.eecs.berkeley.edu/Pubs/TechRpts/2000/CSD-00-1110.pdf) (Potter's wheel)

* [On efficiently implementing SchemaSQL on a SQL database system](https://www.vldb.org/conf/1999/P45.pdf)

To guide your reading, here's a translation between the terminology used in different places:

| tidyr 1.0.0       | pivot longer | pivot wider |
|-------------------|--------------|-------------|
| tidyr < 1.0.0     | gather       | spread      |
| reshape(2)        | melt         | cast        |
| spreadsheets      | unpivot      | pivot       | 
| databases         | fold         | unfold      |
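
To make the first two rows of the table concrete, here is a minimal sketch that runs the older and the newer interface on the same data. The `scores` table and the `sort_rows()` helper are invented for this example.

```r
library(tidyr)

# Hypothetical data for illustration only.
scores <- tibble::tibble(id = c(1, 2), a = c(10, 20), b = c(30, 40))

# Older interface (tidyr < 1.0.0 naming): gather() then spread().
old_long <- gather(scores, key = "name", value = "value", a, b)
old_wide <- spread(old_long, key = name, value = value)

# Current interface: pivot_longer() then pivot_wider().
new_long <- pivot_longer(scores, cols = c(a, b),
                         names_to = "name", values_to = "value")
new_wide <- pivot_wider(new_long, names_from = name, values_from = value)

# The two long forms hold the same rows, but in a different order:
# gather() stacks column by column, pivot_longer() goes row by row.
sort_rows <- function(df) df[order(df$id, df$name), ]
same_long <- isTRUE(all.equal(
  as.data.frame(sort_rows(old_long)),
  as.data.frame(sort_rows(new_long))
))
```

Both round trips return the original three columns, so code written with `gather()`/`spread()` can usually be translated by swapping in the `pivot_*()` calls and renaming the arguments.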

## Getting help

If you encounter a clear bug, please file a minimal reproducible example on [GitHub](https://github.com/tidyverse/tidyr/issues). For questions and other discussion, please use [forum.posit.co](https://forum.posit.co/).

---

Please note that the tidyr project is released with a [Contributor Code of Conduct](https://tidyr.tidyverse.org/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms.

Owner

  • Name: tidyverse
  • Login: tidyverse
  • Kind: organization

The tidyverse is a collection of R packages that share common principles and are designed to work together seamlessly.

GitHub Events

Total
  • Issues event: 34
  • Watch event: 33
  • Issue comment event: 38
  • Push event: 12
  • Pull request review event: 8
  • Pull request review comment event: 11
  • Pull request event: 14
  • Fork event: 11
  • Create event: 1
Last Year
  • Issues event: 34
  • Watch event: 33
  • Issue comment event: 38
  • Push event: 12
  • Pull request review event: 8
  • Pull request review comment event: 11
  • Pull request event: 14
  • Fork event: 11
  • Create event: 1

Committers

Last synced: 8 months ago

All Time
  • Total Commits: 1,341
  • Total Committers: 136
  • Avg Commits per committer: 9.86
  • Development Distribution Score (DDS): 0.422
Past Year
  • Commits: 27
  • Committers: 11
  • Avg Commits per committer: 2.455
  • Development Distribution Score (DDS): 0.556
Top Committers
| Name | Email | Commits |
|------|-------|---------|
| Hadley Wickham | h****m@g****m | 775 |
| Davis Vaughan | d****s@r****m | 186 |
| Lionel Henry | l****y@g****m | 91 |
| Jim Hester | j****r@g****m | 23 |
| Maximilian Girlich | m****h@m****m | 19 |
| Kirill Müller | k****r@i****h | 17 |
| Jenny Bryan | j****n@g****m | 13 |
| jennybc | j****y@s****a | 12 |
| Mara Averick | m****k@g****m | 11 |
| Kirill Müller | k****r | 11 |
| Raymond Patterson | r****5@g****m | 9 |
| Mark Dulhunty | m****y | 7 |
| Adam Davis | d****w@g****m | 5 |
| Hiroaki Yutani | y****i@g****m | 4 |
| Josh Katz | j****z@n****m | 4 |
| Mine Cetinkaya-Rundel | c****e@g****m | 4 |
| William Lai | 4****1 | 4 |
| aaronwolen | a****n@w****m | 4 |
| Matt Nield | 6****d | 3 |
| QuantScripter | 9****m | 3 |
| Will Beasley | w****y@h****m | 3 |
| atusy | 3****y | 3 |
| catalamarti | m****e@g****m | 3 |
| olivroy | 5****y | 3 |
| Romain François | r****n@p****t | 3 |
| Julia Silge | j****e@g****m | 2 |
| Lorenz Walthert | l****t@i****m | 2 |
| Martin John Hadley | m****y@g****m | 2 |
| R. Mark Sharp | r****p@m****m | 2 |
| william3031 | w****5@g****m | 2 |
and 106 more...
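
The all-time committer figures above can be cross-checked with a little arithmetic. The sketch below assumes that the Development Distribution Score is defined as one minus the top committer's share of all commits; this page does not state the definition, so treat it as an assumption. The variable names are chosen for this example.

```r
# Figures copied from the "All Time" and "Top Committers" listings above.
total_commits    <- 1341
total_committers <- 136
top_commits      <- 775  # commits by the top committer, Hadley Wickham
academic         <- 3    # committers with academic emails

avg_per_committer <- total_commits / total_committers

# Assumed definition: DDS = 1 - (top committer's commits / total commits).
dds <- 1 - top_commits / total_commits

# Share of committers from academic institutions, in percent.
academic_share <- 100 * academic / total_committers

round(avg_per_committer, 2)  # 9.86
round(dds, 3)                # 0.422
round(academic_share, 1)     # 2.2
```

Under that assumption, the computed values match the reported 9.86 commits per committer, the DDS of 0.422, and the 2.2% academic share.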

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 138
  • Total pull requests: 86
  • Average time to close issues: 4 months
  • Average time to close pull requests: about 1 month
  • Total issue authors: 97
  • Total pull request authors: 26
  • Average comments per issue: 2.01
  • Average comments per pull request: 0.91
  • Merged pull requests: 60
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 23
  • Pull requests: 18
  • Average time to close issues: 2 days
  • Average time to close pull requests: 8 days
  • Issue authors: 23
  • Pull request authors: 8
  • Average comments per issue: 0.52
  • Average comments per pull request: 0.11
  • Merged pull requests: 6
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • DavisVaughan (22)
  • hadley (14)
  • mgirlich (4)
  • DanChaltiel (2)
  • matthewjnield (2)
  • jennybc (2)
  • tmcd82070 (1)
  • d-morrison (1)
  • asadow (1)
  • bholtdwyer (1)
  • vorpalvorpal (1)
  • MarcoBruttini (1)
  • llrs (1)
  • felixschweigkofler (1)
  • martinmodrak (1)
Pull Request Authors
  • DavisVaughan (27)
  • olivroy (13)
  • mgirlich (9)
  • hadley (8)
  • devpowerplatform (6)
  • matthewjnield (6)
  • catalamarti (6)
  • hrryt (5)
  • raffaem (3)
  • mine-cetinkaya-rundel (2)
  • dpprdan (2)
  • gaborcsardi (2)
  • angelicambg (2)
  • billdenney (2)
  • JamesHWade (2)
Top Labels
Issue Labels
feature (32), bug (17), pivoting :recycle: (11), nesting :bird: (8), documentation (7), rectangling :file_cabinet: (6), strings :violin: (6), grids #️⃣ (4), tidy-dev-day :nerd_face: (4), breaking change :skull_and_crossbones: (2), reprex (2), group :family_man_man_boy_boy: (1), upkeep (1), ask :bowtie: (1), df-col :handbag: (1)
Pull Request Labels

Packages

  • Total packages: 2
  • Total downloads:
    • cran 1,195,774 last-month
  • Total docker downloads: 121,497,399
  • Total dependent packages: 2,304
    (may contain duplicates)
  • Total dependent repositories: 9,231
    (may contain duplicates)
  • Total versions: 61
  • Total maintainers: 1
cran.r-project.org: tidyr

Tidy Messy Data

  • Versions: 31
  • Dependent Packages: 2,304
  • Dependent Repositories: 9,231
  • Downloads: 1,195,774 Last month
  • Docker Downloads: 121,497,399
Rankings
Dependent repos count: 0.0%
Dependent packages count: 0.0%
Downloads: 0.1%
Forks count: 0.1%
Stargazers count: 0.2%
Average: 3.0%
Docker downloads count: 17.3%
Maintainers (1)
Last synced: 6 months ago
proxy.golang.org: github.com/tidyverse/tidyr
  • Versions: 30
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 5.4%
Average: 5.6%
Dependent repos count: 5.8%
Last synced: 6 months ago

Dependencies

.github/workflows/R-CMD-check.yaml actions
  • actions/checkout v2 composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/check-devel.yaml actions
  • actions/checkout v2 composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/pkgdown.yaml actions
  • JamesIves/github-pages-deploy-action 4.1.4 composite
  • actions/checkout v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/pr-commands.yaml actions
  • actions/checkout v2 composite
  • r-lib/actions/pr-fetch v2 composite
  • r-lib/actions/pr-push v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/test-coverage.yaml actions
  • actions/checkout v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
DESCRIPTION cran
  • R >= 3.4.0 depends
  • cli >= 3.4.1 imports
  • dplyr >= 1.0.10 imports
  • glue * imports
  • lifecycle >= 1.0.3 imports
  • magrittr * imports
  • purrr >= 1.0.1 imports
  • rlang >= 1.0.4 imports
  • stringr >= 1.5.0 imports
  • tibble >= 2.1.1 imports
  • tidyselect >= 1.2.0 imports
  • utils * imports
  • vctrs >= 0.5.2 imports
  • covr * suggests
  • data.table * suggests
  • knitr * suggests
  • readr * suggests
  • repurrrsive >= 1.1.0 suggests
  • rmarkdown * suggests
  • testthat >= 3.0.0 suggests