unsum
Complete Listing of Original Samples of Underlying Raw Evidence
Repository
Basic Info
- Host: GitHub
- Owner: lhdjung
- License: other
- Language: R
- Default Branch: master
- Homepage: https://lhdjung.github.io/unsum/
- Size: 23.5 MB
Statistics
- Stars: 5
- Watchers: 1
- Forks: 0
- Open Issues: 1
- Releases: 0
Created over 1 year ago
· Last pushed 10 months ago
README.Rmd
---
output: github_document
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```
# unsum: reconstruct raw data from summary statistics
The goal of unsum is to **un**do **sum**marization: reconstruct all possible samples that may underlie a given set of summary statistics. It currently supports sets of mean, SD, sample size, and scale bounds. This can be useful in forensic metascience to identify impossible or implausible reported numbers.
The package features *CLOSURE: Complete Listing of Original Samples of Underlying Raw Evidence*, a fast algorithm implemented in Rust. Go to [*Get started*](https://lhdjung.github.io/unsum/articles/unsum.html) to learn how to use it.
CLOSURE is exhaustive, which makes it computationally intensive. If your code takes too long to run, consider using [SPRITE](https://lukaswallrich.github.io/rsprite2/) instead (see *Previous work* below).
## Installation
You can install unsum with either of these:
``` r
install.packages("unsum")
# or
pak::pkg_install("unsum")
```
Your R version should be 4.2.0 or more recent.
## Demo
Start with `closure_generate()`, the package's main function. It creates all possible samples:
```{r example}
library(unsum)
data <- closure_generate(
  mean = "2.7",
  sd = "1.9",
  n = 130,
  scale_min = 1,
  scale_max = 5
)
data
```
Visualize the overall distribution of values found in the samples:
```{r}
#| fig.alt: >
#| Barplot of `data`, the CLOSURE output.
#| It specifically visualizes the `f_average` column of
#| the `frequency` tibble, but also gives percentage figures,
#| similar to the `f_relative` column. The overall shape is
#| a somewhat polarized distribution.
closure_plot_bar(data)
```
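To make "all possible samples" concrete, here is a minimal brute-force sketch in base R for very small inputs. It is not the CLOSURE algorithm and not part of unsum: `brute_force_samples()` and its arguments are hypothetical names used for illustration only. The sketch assumes integer-valued responses, the sample SD with an `n - 1` denominator (as in `sd()`), and that a sample matches when its mean and SD fall within half a unit of the last reported digit.

```r
# Hypothetical helper for illustration; not part of unsum.
# Only feasible for tiny n and short scales.
brute_force_samples <- function(mean_reported, sd_reported, n,
                                scale_min, scale_max, digits = 1) {
  values <- scale_min:scale_max
  # Every way to assign a count from 0 to n to each scale value...
  counts <- as.matrix(expand.grid(rep(list(0:n), length(values))))
  # ...restricted to assignments that add up to exactly n observations
  counts <- counts[rowSums(counts) == n, , drop = FALSE]
  # Half a unit in the last reported digit, plus floating-point slack
  tolerance <- 0.5 * 10^(-digits) + 1e-9
  samples <- lapply(seq_len(nrow(counts)), function(i) {
    rep(values, times = counts[i, ])
  })
  keep <- vapply(samples, function(x) {
    abs(mean(x) - mean_reported) <= tolerance &&
      abs(sd(x) - sd_reported) <= tolerance
  }, logical(1))
  samples[keep]
}

# Tiny example: 4 responses on a 1-3 scale, reported mean 2.0 and SD 0.8
found <- brute_force_samples(2.0, 0.8, n = 4, scale_min = 1, scale_max = 3)
found
```

In this tiny example, the only matching sample is 1, 2, 2, 3. The number of candidates grows explosively with `n` and the scale length, which is why exhaustive reconstruction needs a purpose-built algorithm and a fast implementation.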
## Previous work
[SPRITE](https://lukaswallrich.github.io/rsprite2/) generates random datasets that could have led to the reported statistics. CLOSURE is exhaustive, so it always finds all possible datasets, not just a random sample of them. The trade-off is speed: because SPRITE only samples, it stays fast in cases where CLOSURE may take too long.
[GRIM and GRIMMER](https://lhdjung.github.io/scrutiny/) test reported summary statistics for consistency, but CLOSURE is the ultimate consistency test: if it finds at least one sample, the statistics are mutually consistent; if it finds none, they cannot all be correct.
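For comparison, the idea behind a GRIM-style check on the mean alone can be sketched in a few lines of base R. This is a simplified illustration, not the implementation in scrutiny: `grim_sketch()` and its arguments are hypothetical names, and the sketch assumes integer-valued responses and accepts either rounding direction at the boundary.

```r
# Hypothetical helper for illustration; not part of unsum or scrutiny.
grim_sketch <- function(mean_reported, n, digits = 2) {
  target <- as.numeric(mean_reported)
  # With integer responses, the sum of all values must be an integer,
  # so only the two integers closest to target * n need checking.
  candidate_sums <- c(floor(target * n), ceiling(target * n))
  # Half a unit in the last reported digit, plus floating-point slack
  tolerance <- 0.5 * 10^(-digits) + 1e-9
  any(abs(candidate_sums / n - target) <= tolerance)
}

grim_sketch("5.19", n = 28)  # FALSE: no integer sum yields this mean
grim_sketch("5.18", n = 28)  # TRUE: 145 / 28 rounds to 5.18
```

A check like this looks at one statistic at a time, whereas an exhaustive search considers mean, SD, sample size, and scale bounds jointly.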
[CORVIDS](https://github.com/katherinemwood/corvids) deserves credit as the first technique to reconstruct all possible underlying datasets. However, it takes a very long time to run, often prohibitively so. This is partly because the code is written in Python, but the algorithm is also inherently much more complex than CLOSURE's.
## About
The CLOSURE algorithm was originally written [in Python](https://github.com/larigaldie-n/CLOSURE-Python) by Nathanael Larigaldie. The R package unsum provides easy access to an optimized implementation in Rust, [closure-core](https://github.com/lhdjung/closure-core), via the amazing [extendr](https://extendr.github.io/) framework. Rust code tends to run much faster than R or Python code, and many applications of CLOSURE need that speed.
Owner
- Name: Lukas Jung
- Login: lhdjung
- Kind: user
- Location: Heidelberg, Germany
- Twitter: lukasjung_hd
- Repositories: 1
- Profile: https://github.com/lhdjung
R developer and master's student at Heidelberg University.
GitHub Events
Total
- Issues event: 2
- Watch event: 2
- Delete event: 7
- Push event: 63
- Pull request event: 12
- Create event: 9
Last Year
- Issues event: 2
- Watch event: 2
- Delete event: 7
- Push event: 63
- Pull request event: 12
- Create event: 9
Issues and Pull Requests
Last synced: 10 months ago
Packages
- Total packages: 1
- Total downloads:
  - cran: 165 last month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 1
- Total maintainers: 1
cran.r-project.org: unsum
Reconstruct Raw Data from Summary Statistics
- Homepage: https://github.com/lhdjung/unsum
- Documentation: http://cran.r-project.org/web/packages/unsum/unsum.pdf
- License: MIT + file LICENSE
- Latest release: 0.2.0 (published about 1 year ago)
Rankings
Dependent packages count: 26.2%
Dependent repos count: 32.3%
Average: 48.3%
Downloads: 86.4%
Maintainers (1)
Last synced: 10 months ago