Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
✓codemeta.json file
Found codemeta.json file -
✓.zenodo.json file
Found .zenodo.json file -
○DOI references
-
○Academic publication links
-
○Committers with academic emails
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (14.5%) to scientific vocabulary
Keywords
excel
pivot-tables
r
spreadsheet
Keywords from Contributors
parsing
tidyverse
unit-testing
fwf
csv
visualisation
factor
strings
string-interpolation
s3-vectors
Last synced: 6 months ago
·
JSON representation
Repository
Unpivot complex and irregular data layouts in R
Basic Info
- Host: GitHub
- Owner: nacnudus
- License: other
- Language: R
- Default Branch: main
- Homepage: https://nacnudus.github.io/unpivotr/
- Size: 6.93 MB
Statistics
- Stars: 187
- Watchers: 8
- Forks: 19
- Open Issues: 4
- Releases: 5
Topics
excel
pivot-tables
r
spreadsheet
Created over 9 years ago
· Last pushed about 1 year ago
Metadata Files
Readme
Changelog
License
README.Rmd
---
output: github_document
---
```{r, echo = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/"
)
```
# unpivotr
[](https://CRAN.R-project.org/package=unpivotr)

[](https://app.codecov.io/gh/nacnudus/unpivotr)
[](https://github.com/nacnudus/unpivotr/actions/workflows/R-CMD-check.yaml)
[unpivotr](https://github.com/nacnudus/unpivotr) deals with non-tabular data,
especially from spreadsheets. Use unpivotr when your source data has any of
these 'features':
* Multi-headered hydra
* Meaningful formatting
* Headers anywhere but at the top of each column
* Non-text headers e.g. dates
* Other stuff around the table
* Several similar tables in one sheet
* Sentinel values
* Superscript symbols
* Meaningful comments
* Nested HTML tables
If that list makes your blood boil, you'll enjoy the function names.
* `behead()` deals with multi-headered hydra tables one layer of headers at a
time, working from the edge of the table inwards. It's a bit like using
`header = TRUE` in `read.csv()`, but because it's a function, you can apply it
to as many layers of headers as you need. You end up with all the headers in
columns.
* `spatter()` is like `tidyr::spread()` but preserves mixed data types. You get
into a mixed-data-type situation by delaying type coercion until *after* the
table is tidy (rather than before, like `read.csv()` et al). And yes, it
usually follows `behead()`.
More positive, corrective functions:
* `justify()` aligns column headers before `behead()`ing, and has deliberate
moral overtones.
* `enhead()` attaches a header to the body of the data, *a la* Frankenstein.
The effect is the same as `behead()`, but is more powerful because you can
choose exactly which header cells you want, paying attention to formatting
(which `behead()` doesn't understand).
* `isolate_sentinels()` separates meaningful symbols like `"N/A"` or
`"confidential"` from the rest of the data, giving them some time alone think
about what they've done.
* `partition()` takes a sheet with several tables on it, and slashes into pieces
that each contain one table. You can then unpivot each table in turn with
`purrr::map()` or similar.
## Make cells tidy
Unpivotr uses data where each cells is represented by one row in a dataframe.
Like this.

What can you do with tidy cells? The best places to start are:
* [Spreadsheet Munging
Strategies](https://nacnudus.github.io/spreadsheet-munging-strategies/), a
free, online cookbook using [tidyxl](https://github.com/nacnudus/tidyxl/) and
[unpivotr](https://github.com/nacnudus/unpivotr)
* [Screencasts](https://www.youtube.com/watch?v=1sinC7wsS5U) on YouTube.
* [Worked examples](https://github.com/nacnudus/ukfarm) on GitHub.
Otherwise the basic idea is:
1. Read the data with a specialist tool.
* For spreadsheets, use [tidyxl](https://nacnudus.github.io/tidyxl/).
* For plain text files, you might soon be able to use
[readr](https://readr.tidyverse.org), but for now you'll have to install a
pull-request on that package with
`devtools::install_github("tidyverse/readr#760")`.
* For tables in html pages, use `unpivotr::tidy_html()`
* For data frames, use `unpivotr::as_cells()` -- this should be a last
resort, because by the time the data is in a conventional data frame, it
is often too late -- formatting has been lost, and most data types have
been coerced to strings.
1. Either `behead()` straight away, else `dplyr::filter()` separately for the
header cells and the data cells, and then recombine with `enhead()`.
1. `spatter()` so that each column has one data type.
```{r}
library(unpivotr)
library(tidyverse)
x <- purpose$`up-left left-up`
x # A pivot table in a conventional data frame. Four levels of headers, in two
# rows and two columns.
y <- as_cells(x) # 'Tokenize' or 'melt' the data frame into one row per cell
y
rectify(y) # useful for reviewing the melted form as though in a spreadsheet
y %>%
behead("up-left", "sex") %>% # Strip headers
behead("up", "life-satisfication") %>% # one
behead("left-up", "qualification") %>% # by
behead("left", "age-band") %>% # one.
select(-row, -col, -data_type, count = chr) %>% # cleanup
mutate(count = as.integer(count))
```
Note the compass directions in the code above, which hint to `behead()` where to
find the header cell for each data cell.
* `"up-left"` means the header (`Female`, `Male`) is positioned up and to the
left of the columns of data cells it describes.
* `"up"` means the header (`0 - 6`, `7 - 10`) is positioned directly above the
columns of data cells it describes.
* `"left-up"` means the header (`Bachelor's degree`, `Certificate`, etc.) is
positioned to the left and upwards of the rows of data cells it describes.
* `"left"` means the header (`15 - 24`, `25 - 44`, etc.) is positioned directly to
the left of the rows of data cells it describes.
## Installation
```{r, echo = TRUE, eval = FALSE}
# install.packages("devtools") # If you don't already have devtools
devtools::install_github("nacnudus/unpivotr", build_vignettes = TRUE)
```
The version 0.4.0 release had somee breaking changes. See `NEWS.md` for
details. The previous version can be installed as follow:
```r
devtools::install_version("unpivotr", version = "0.3.1", repos = "http://cran.us.r-project.org")
```
## Similar projects
[unpivotr](https://github.com/nacnudus/unpivotr) is inspired by
[Databaker](https://github.com/sensiblecodeio/databaker), a collaboration
between the [United Kingdom Office of National Statistics](https://www.ons.gov.uk/)
and [The Sensible Code Company](https://sensiblecode.io/).
[unpivotr](https://github.com/nacnudus/unpivotr).
[jailbreaker](https://github.com/rsheets/jailbreakr) attempts to extract
non-tabular data from spreadsheets into tabular structures automatically via
some clever algorithms. [unpivotr](https://github.com/nacnudus/unpivotr)
differs by being less magic, and equipping you to express what you want to do.
Owner
- Name: Duncan Garmonsway
- Login: nacnudus
- Kind: user
- Location: London
- Company: Government Digital Service
- Website: https://www.linkedin.com/in/duncangarmonsway/
- Repositories: 119
- Profile: https://github.com/nacnudus
GitHub Events
Total
- Issues event: 1
- Watch event: 3
- Delete event: 1
- Issue comment event: 1
- Push event: 9
- Pull request event: 2
- Create event: 3
Last Year
- Issues event: 1
- Watch event: 3
- Delete event: 1
- Issue comment event: 1
- Push event: 9
- Pull request event: 2
- Create event: 3
Committers
Last synced: 9 months ago
Top Committers
| Name | Commits | |
|---|---|---|
| Duncan Garmonsway | n****s@g****m | 616 |
| Bill Denney | w****y@h****m | 4 |
| Lionel Henry | l****y@g****m | 2 |
| Ian Moran | i****1@g****m | 2 |
| DavisVaughan | d****s@r****m | 2 |
| wdkrnls | k****s@g****m | 1 |
| Michael Kerber | 4****R | 1 |
| Kirill Müller | k****r | 1 |
| Greg | g****k | 1 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 33
- Total pull requests: 29
- Average time to close issues: about 2 months
- Average time to close pull requests: 2 months
- Total issue authors: 21
- Total pull request authors: 9
- Average comments per issue: 2.39
- Average comments per pull request: 0.76
- Merged pull requests: 24
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: 9 days
- Issue authors: 1
- Pull request authors: 1
- Average comments per issue: 0.0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- nacnudus (10)
- billdenney (3)
- bedantaguru (2)
- jnmaloof (1)
- seanfryan (1)
- martinzuba (1)
- danstrobridge-Weston (1)
- rtaph (1)
- lepennec (1)
- callumwebb (1)
- JPennn (1)
- erm-eanway (1)
- romainfrancois (1)
- f00baroo (1)
- krlmlr (1)
Pull Request Authors
- nacnudus (18)
- billdenney (3)
- krlmlr (3)
- romainfrancois (1)
- wdkrnls (1)
- mtkerbeR (1)
- DavisVaughan (1)
- gregrs-uk (1)
- lionel- (1)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 2
-
Total downloads:
- cran 471 last-month
-
Total dependent packages: 0
(may contain duplicates) -
Total dependent repositories: 2
(may contain duplicates) - Total versions: 19
- Total maintainers: 1
proxy.golang.org: github.com/nacnudus/unpivotr
- Documentation: https://pkg.go.dev/github.com/nacnudus/unpivotr#section-documentation
- License: other
-
Latest release: v0.6.0
published almost 6 years ago
Rankings
Dependent packages count: 5.5%
Average: 5.6%
Dependent repos count: 5.8%
Last synced:
6 months ago
cran.r-project.org: unpivotr
Unpivot Complex and Irregular Data Layouts
- Homepage: https://github.com/nacnudus/unpivotr
- Documentation: http://cran.r-project.org/web/packages/unpivotr/unpivotr.pdf
- License: MIT + file LICENSE
-
Latest release: 0.6.4
published about 1 year ago
Rankings
Stargazers count: 2.4%
Forks count: 3.9%
Average: 14.8%
Dependent repos count: 19.3%
Downloads: 19.7%
Dependent packages count: 28.8%
Maintainers (1)
Last synced:
6 months ago
Dependencies
DESCRIPTION
cran
- R >= 3.2.0 depends
- cellranger * imports
- dplyr * imports
- forcats * imports
- magrittr * imports
- methods * imports
- pillar * imports
- purrr * imports
- rlang * imports
- tibble >= 2.1.1 imports
- tidyr * imports
- tidyselect * imports
- xml2 * imports
- DT * suggests
- covr * suggests
- here * suggests
- htmltools * suggests
- knitr * suggests
- readr * suggests
- readxl * suggests
- rmarkdown * suggests
- rvest * suggests
- selectr * suggests
- stringr * suggests
- testthat * suggests
- tidyxl * suggests
.github/workflows/R-CMD-check.yaml
actions
- actions/cache v2 composite
- actions/checkout v2 composite
- actions/upload-artifact main composite
- r-lib/actions/setup-pandoc v1 composite
- r-lib/actions/setup-r v1 composite