collapse

Advanced and Fast Data Transformation in R

https://github.com/sebkrantz/collapse

Science Score: 64.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org, zenodo.org
  • Committers with academic emails
    1 of 13 committers (7.7%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.2%) to scientific vocabulary

Keywords

cran data-aggregation data-analysis data-manipulation data-processing data-science data-transformation econometrics high-performance panel-data r rstats scientific-computing statistics time-series weighted weights

Keywords from Contributors

standardization

Scientific Fields

Biology Life Sciences - 40% confidence
Agricultural and Biological Sciences Life Sciences - 40% confidence
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 685
  • Watchers: 8
  • Forks: 34
  • Open Issues: 18
  • Releases: 59
Created almost 7 years ago · Last pushed 6 months ago
Metadata Files
Readme Changelog Contributing License Citation

README.md

collapse

collapse is a large C/C++-based package for data transformation and statistical computing in R. It aims to:

  • Facilitate complex data transformation, exploration and computing tasks in R.
  • Help make R code fast, flexible, parsimonious and programmer friendly.

Its novel class-agnostic architecture supports all basic R objects and their popular extensions, including units, integer64, xts/zoo, tibble, grouped_df, data.table, sf, pseries and pdata.frame.

Key Features:

  • Advanced statistical programming: A full set of fast statistical functions supporting grouped and weighted computations on vectors, matrices and data frames. Fast and programmable grouping, ordering, matching, deduplication, factor generation and interactions.

  • Fast data manipulation: Fast and flexible functions for data manipulation, data object conversions and memory efficient R programming.

  • Advanced aggregation: Fast and easy multi-type, weighted and parallelized data aggregation.

  • Advanced transformations: Fast row/column arithmetic, (grouped) sweeping out of statistics (by reference), (grouped, weighted) scaling and (higher-dimensional) centering and averaging.

  • Advanced time-computations: Fast and flexible indexed time series and panel data classes, lags/leads, differences and (compound) growth rates on (irregular) time series and panels, panel-autocorrelation functions and panel data to array conversions.

  • List processing: Recursive list search, filtering, splitting, apply and unlisting to data frame.

  • Advanced data exploration: Fast (grouped, weighted, multi-level) descriptive statistical tools.

collapse is written in C and C++, with algorithms much faster than base R's, has extremely low evaluation overheads, scales well (benchmarks: linux | windows), and excels on complex statistical tasks.
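
As a rough illustration of the grouped and weighted statistics described above, the sketch below computes a grouped, weighted mean once with `fmean()` and once with plain base R, and checks that the two agree. The weight vector `w` and the names `fast` and `base_r` are made up for this example; it is not a benchmark and makes no claim about timings.

```r
library(collapse)

set.seed(1)
w <- abs(rnorm(nrow(iris)))   # hypothetical weights, one per row

# collapse: grouped and weighted mean in a single call
fast <- fmean(iris$Sepal.Length, g = iris$Species, w = w)

# base R: split the row indices by group, then apply weighted.mean()
base_r <- sapply(split(seq_len(nrow(iris)), iris$Species),
                 function(i) weighted.mean(iris$Sepal.Length[i], w[i]))

# Both approaches should give the same three group means
all.equal(as.numeric(fast), as.numeric(base_r))
```

The base R version needs an explicit split-apply step, whereas the collapse call takes the grouping and weights as arguments.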

Installation

``` r
# Install the current version on CRAN
install.packages("collapse")

# Install a stable development version (Windows/Mac binaries) from R-universe
install.packages("collapse", repos = "https://fastverse.r-universe.dev")

# Install a stable development version from GitHub (requires compilation)
remotes::install_github("SebKrantz/collapse")

# Install previous versions from the CRAN Archive (requires compilation)
install.packages("https://cran.r-project.org/src/contrib/Archive/collapse/collapse_2.0.19.tar.gz",
                 repos = NULL, type = "source")
# Older stable versions: 1.9.6, 1.8.9, 1.7.6, 1.6.5, 1.5.3, 1.4.2, 1.3.2, 1.2.1
```

Documentation

collapse installs with built-in structured documentation, implemented via a set of separate help pages. Calling help('collapse-documentation') brings up the top-level documentation page, which provides an overview of the entire package and links to all other documentation pages.
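
A minimal sketch of how this documentation could be reached from an R session, assuming collapse is installed. The variable names `h` and `vigs` are only illustrative; the vignette names are read from whatever is installed rather than assumed here.

```r
library(collapse)

# Look up the top-level documentation page; printing `h` displays it
h <- help("collapse-documentation", package = "collapse")
h

# List the names of the vignettes shipped with the installed version
vigs <- vignette(package = "collapse")$results[, "Item"]
vigs
```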

In addition there are several vignettes, among them one on Documentation and Resources.

Cheatsheet

Article on arXiv

An article on collapse is forthcoming in the Journal of Statistical Software.

Presentation at useR 2022

Video Recording | Slides

Example Usage

This section provides a simple set of examples introducing some important features of collapse. It should be easy to follow for readers familiar with R.

``` r
library(collapse)
data("iris")            # iris dataset in base R
v <- iris$Sepal.Length  # Vector
d <- num_vars(iris)     # Saving numeric variables (could also be a matrix, statistical functions are S3 generic)
g <- iris$Species       # Grouping variable (could also be a list of variables)

# Advanced Statistical Programming -----------------------------------------------------------------------------

# Simple (column-wise) statistics...
fmedian(v)                       # Vector
fsd(qM(d))                       # Matrix (qM is a faster as.matrix)
fmode(d)                         # data.frame
fmean(qM(d), drop = FALSE)       # Still a matrix
fmax(d, drop = FALSE)            # Still a data.frame

# Fast grouped and/or weighted statistics
w <- abs(rnorm(fnrow(iris)))
fmedian(d, w = w)                 # Simple weighted statistics
fnth(d, 0.75, g)                  # Grouped statistics (grouped third quartile)
fmedian(d, g, w)                  # Groupwise-weighted statistics
fsd(v, g, w)                      # Similarly for vectors
fmode(qM(d), g, w, ties = "max")  # Or matrices (grouped and weighted maximum mode) ...

# A fast set of data manipulation functions allows complex piped programming at high speeds
library(magrittr)                            # Pipe operators
iris %>% fgroup_by(Species) %>% fndistinct   # Grouped distinct value counts
iris %>% fgroup_by(Species) %>% fmedian(w)   # Weighted group medians
iris %>% add_vars(w) %>%                     # Adding weight vector to dataset
  fsubset(Sepal.Length < fmean(Sepal.Length), Species, Sepal.Width:w) %>% # Fast selecting and subsetting
  fgroup_by(Species) %>%                     # Grouping (efficiently creates a grouped tibble)
  fvar(w) %>%                                # Frequency-weighted group-variance, default (keep.w = TRUE)
  roworder(sum.w)                            # also saves group weights in a column called 'sum.w'

# Can also use dplyr (but dplyr manipulation verbs are a lot slower)
library(dplyr)
iris %>% add_vars(w) %>%
  filter(Sepal.Length < fmean(Sepal.Length)) %>%
  select(Species, Sepal.Width:w) %>%
  group_by(Species) %>%
  fvar(w) %>% arrange(sum.w)

# Fast Data Manipulation ---------------------------------------------------------------------------------------

head(GGDC10S)

# Pivot Wider: Only SUM (total)
SUM <- GGDC10S |> pivot(c("Country", "Year"), "SUM", "Variable", how = "wider")
head(SUM)

# Joining with data from wlddev
wlddev |>
  join(SUM, on = c("iso3c" = "Country", "year" = "Year"), how = "inner")

# Recast pivoting + supplying new labels for generated columns
pivot(GGDC10S, values = 6:16, names = list("Variable", "Sectorcode"),
      labels = list(to = "Sector",
                    new = c(Sectorcode = "GGDC10S Sector Code",
                            Sector = "Long Sector Description",
                            VA = "Value Added",
                            EMP = "Employment")),
      how = "recast", na.rm = TRUE)

# Advanced Aggregation -----------------------------------------------------------------------------------------

collap(iris, Sepal.Length + Sepal.Width ~ Species, fmean)  # Simple aggregation using the mean..
collap(iris, ~ Species, list(fmean, fmedian, fmode))       # Multiple functions applied to each column
add_vars(iris) <- w                                        # Adding weights, return in long format..
collap(iris, ~ Species, list(fmean, fmedian, fmode), w = ~ w, return = "long")

# Generate some additional logical data
settransform(iris, AWMSL = Sepal.Length > fmedian(Sepal.Length, w = w),
                   AWMSW = Sepal.Width > fmedian(Sepal.Width, w = w))

# Multi-type data aggregation: catFUN applies to all categorical columns (here AWMSW)
collap(iris, ~ Species + AWMSL, list(fmean, fmedian, fmode),
       catFUN = fmode, w = ~ w, return = "long")

# Custom aggregation gives the greatest possible flexibility: directly mapping functions to columns
collap(iris, ~ Species + AWMSL,
       custom = list(fmean = 2:3, fsd = 3:4, fmode = "AWMSL"), w = ~ w,
       wFUN = list(fsum, fmin, fmax), # Here also aggregating the weight vector with 3 different functions
       keep.col.order = FALSE)        # Column order not maintained -> grouping and weight variables first

# Can also use grouped tibble: weighted median for numeric, weighted mode for categorical columns
iris %>% fgroup_by(Species, AWMSL) %>% collapg(fmedian, fmode, w = w)

# Advanced Transformations -------------------------------------------------------------------------------------

# All Fast Statistical Functions have a TRA argument, supporting 10 different replacing and sweeping operations
fmode(d, TRA = "replace")     # Replacing values with the mode
fsd(v, TRA = "/")             # Dividing by the overall standard deviation (scaling)
fsum(d, TRA = "%")            # Computing percentages
fsd(d, g, TRA = "/")          # Grouped scaling
fmin(d, g, TRA = "-")         # Setting the minimum value in each species to 0
ffirst(d, g, TRA = "%%")      # Taking modulus of first value in each species
fmedian(d, g, w, "-")         # Groupwise centering by the weighted median
fnth(d, 0.95, g, w, "%")      # Expressing data in percentages of the weighted species-wise 95th percentile
fmode(d, g, w, "replace",     # Replacing data by the species-wise weighted minimum-mode
      ties = "min")

# TRA() can also be called directly to replace or sweep with a matching set of computed statistics
TRA(v, sd(v), "/")                       # Same as fsd(v, TRA = "/")
TRA(d, fmedian(d, g, w), "-", g)         # Same as fmedian(d, g, w, "-")
TRA(d, BY(d, g, quantile, 0.95), "%", g) # Same as fnth(d, 0.95, g, TRA = "%") (apart from quantile algorithm)

# For common uses, there are some faster and more advanced functions
fbetween(d, g)                           # Grouped averaging [same as fmean(d, g, TRA = "replace") but faster]
fwithin(d, g)                            # Grouped centering [same as fmean(d, g, TRA = "-") but faster]
fwithin(d, g, w)                         # Grouped and weighted centering [same as fmean(d, g, w, "-")]
fwithin(d, g, w, theta = 0.76)           # Quasi-centering i.e. d - theta*fbetween(d, g, w)
fwithin(d, g, w, mean = "overall.mean")  # Preserving the overall weighted mean of the data

fscale(d)                                # Scaling and centering (default mean = 0, sd = 1)
fscale(d, mean = 5, sd = 3)              # Custom scaling and centering
fscale(d, mean = FALSE, sd = 3)          # Mean preserving scaling
fscale(d, g, w)                          # Grouped and weighted scaling and centering
fscale(d, g, w, mean = "overall.mean",   # Setting group means to overall weighted mean,
       sd = "within.sd")                 # and group sd's to fsd(fwithin(d, g, w), w = w)

get_vars(iris, 1:2)                      # Use get_vars for fast selecting data.frame columns, gv is shortcut
fhdbetween(gv(iris, 1:2), gv(iris, 3:5)) # Linear prediction with factors and continuous covariates
fhdwithin(gv(iris, 1:2), gv(iris, 3:5))  # Linear partialling out factors and continuous covariates

# This again opens up new possibilities for data manipulation...
iris %>%
  ftransform(ASWMSL = Sepal.Length > fmedian(Sepal.Length, Species, w, "replace")) %>%
  fgroup_by(ASWMSL) %>% collapg(w = w, keep.col.order = FALSE)

iris %>% fgroup_by(Species) %>% num_vars %>% fwithin(w)  # Weighted demeaning

# Time Series and Panel Series ---------------------------------------------------------------------------------

flag(AirPassengers, -1:3)                      # A sequence of lags and leads
EuStockMarkets %>%                             # A sequence of first and second seasonal differences
  fdiff(0:1 * frequency(.), 1:2)
fdiff(EuStockMarkets, rho = 0.95)              # Quasi-difference [x - rho*flag(x)]
fdiff(EuStockMarkets, log = TRUE)              # Log-difference [log(x/flag(x))]
EuStockMarkets %>% fgrowth(c(1, frequency(.))) # Ordinary and seasonal growth rate
EuStockMarkets %>% fgrowth(logdiff = TRUE)     # Log-difference growth rate [log(x/flag(x))*100]

# Creating panel data
pdata <- EuStockMarkets %>% list(A = ., B = .) %>%
  unlist2d(idcols = "Id", row.names = "Time")

L(pdata, -1:3, ~Id, ~Time)                   # Sequence of fully identified panel-lags (L is operator for flag)
pdata %>% fgroup_by(Id) %>% flag(-1:3, Time) # Same thing..

# collapse also supports indexed series and data frames (and plm panel data classes)
pdata <- findex_by(pdata, Id, Time)
L(pdata, -1:3)          # Same as above, ...
psacf(pdata)            # Multivariate panel-ACF
psmat(pdata) %>% plot   # 3D-array of time series from panel data + plotting

HDW(pdata)              # This projects out id and time fixed effects.. (HDW is operator for fhdwithin)
W(pdata, effect = "Id") # Only Id effects.. (W is operator for fwithin)

# List Processing ----------------------------------------------------------------------------------------------

# Some nested list of heterogeneous data objects..
l <- list(a = qM(mtcars[1:8]),                  # Matrix
          b = list(c = mtcars[4:11],            # data.frame
                   d = list(e = mtcars[2:10],
                            f = fsd(mtcars))))  # Vector

ldepth(l)                       # List has 4 levels of nesting (considering that mtcars is a data.frame)
is_unlistable(l)                # Can be unlisted
has_elem(l, "f")                # Contains an element by the name of "f"
has_elem(l, is.matrix)          # Contains a matrix

get_elem(l, "f")                # Recursive extraction of elements..
get_elem(l, c("c","f"))
get_elem(l, c("c","f"), keep.tree = TRUE)
unlist2d(l, row.names = TRUE)   # Intelligent recursive row-binding to data.frame
rapply2d(l, fmean) %>% unlist2d # Taking the mean of all elements and repeating

# Application: extracting and tidying results from (potentially nested) lists of model objects
list(mod1 = lm(mpg ~ carb, mtcars),
     mod2 = lm(mpg ~ carb + hp, mtcars)) %>%
  lapply(summary) %>%
  get_elem("coef", regex = TRUE) %>%   # Regular expression search and extraction
  unlist2d(idcols = "Model", row.names = "Predictor")

# Summary Statistics -------------------------------------------------------------------------------------------

irisNA <- na_insert(iris, prop = 0.15)  # Randomly set 15% missing
fnobs(irisNA)                           # Observation count
pwnobs(irisNA)                          # Pairwise observation count
fnobs(irisNA, g)                        # Grouped observation count
fndistinct(irisNA)                      # Same with distinct values... (default na.rm = TRUE skips NA's)
fndistinct(irisNA, g)

descr(iris)                                   # Detailed statistical description of data

varying(iris, ~ Species)                      # Show which variables vary within Species
varying(pdata)                                # Which are time-varying ?
qsu(iris, w = ~ w)                            # Fast (one-pass) summary (with weights)
qsu(iris, ~ Species, w = ~ w, higher = TRUE)  # Grouped summary + higher moments
qsu(pdata, higher = TRUE)                     # Panel-data summary (between and within entities)
pwcor(num_vars(irisNA), N = TRUE, P = TRUE)   # Pairwise correlations with p-value and observations
pwcor(W(pdata, keep.ids = FALSE), P = TRUE)   # Within-correlations

```
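
The comments in the transformation examples state that `fbetween()` and `fwithin()` match `fmean()` with `TRA = "replace"` and `TRA = "-"`. The sketch below checks those identities on a single vector and compares them with a base R formulation using `ave()`. The base R comparison and the variable names are additions for illustration, not part of the package's own examples.

```r
library(collapse)

v <- iris$Sepal.Length   # numeric vector
g <- iris$Species        # grouping factor

# Grouped averaging: every value replaced by its group mean
between_fast <- fbetween(v, g)
between_tra  <- fmean(v, g, TRA = "replace")
between_base <- ave(v, g)              # base R group means

# Grouped centering: every value minus its group mean
within_fast <- fwithin(v, g)
within_tra  <- fmean(v, g, TRA = "-")
within_base <- v - ave(v, g)

# All three spellings should agree for each operation
all.equal(as.numeric(between_fast), between_base)
all.equal(as.numeric(between_tra),  between_base)
all.equal(as.numeric(within_fast),  within_base)
all.equal(as.numeric(within_tra),   within_base)
```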

Evaluated and more extensive sets of examples are provided on the package page (also accessible from R by calling example('collapse-package')), and further in the vignettes and documentation.
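
To make the bracketed formulas in the time series comments concrete, here is a small sketch on a plain numeric vector, so that no time series attributes are involved. The vector `x` is made-up data, and each result is compared against the formula written out by hand.

```r
library(collapse)

x <- c(100, 110, 121, 133.1)   # made-up series growing by 10% per step

lag1 <- flag(x)                # first lag: c(NA, 100, 110, 121)
d1   <- fdiff(x)               # first difference: x - flag(x)
gr   <- fgrowth(x)             # growth rate in percent: (x - flag(x)) / flag(x) * 100
ld   <- fdiff(x, log = TRUE)   # log-difference: log(x / flag(x))
qd   <- fdiff(x, rho = 0.95)   # quasi-difference: x - 0.95 * flag(x)

# Compare each result with the hand-written formula
all.equal(as.numeric(d1), x - lag1)
all.equal(as.numeric(gr), (x - lag1) / lag1 * 100)
all.equal(as.numeric(ld), log(x / lag1))
all.equal(as.numeric(qd), x - 0.95 * lag1)
```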

Citation

If collapse was instrumental for your research project, please consider citing it using citation("collapse").
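
A short sketch of how the citation could be retrieved and exported, assuming collapse is installed. `toBibtex()` comes from base R's utils package, and the variable name `cit` is illustrative.

```r
# Retrieve the citation information shipped with the installed package
cit <- citation("collapse")
print(cit)

# Convert the same entry to BibTeX for use in a reference manager
toBibtex(cit)
```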

Owner

  • Name: Sebastian Krantz
  • Login: SebKrantz
  • Kind: user
  • Company: Kiel Institute for the World Economy

Economist/data scientist/programmer interested in econometrics, time series, geospatial analysis, machine learning, and high-performance computing.

Citation (CITATION.cff)

# --------------------------------------------
# CITATION file created with {cffr} R package
# See also: https://docs.ropensci.org/cffr/
# --------------------------------------------
 
cff-version: 1.2.0
message: 'To cite package "collapse" in publications use:'
type: software
license: GPL-2.0-or-later
title: 'collapse: Advanced and Fast Data Transformation'
version: 2.1.3.9000
identifiers:
- type: doi
  value: 10.32614/CRAN.package.collapse
abstract: A large C/C++-based package for advanced data transformation and statistical
  computing in R that is extremely fast, class-agnostic, robust, and programmer friendly.
  Core functionality includes a rich set of S3 generic grouped and weighted statistical
  functions for vectors, matrices and data frames, which provide efficient low-level
  vectorizations, OpenMP multithreading, and skip missing values by default. These
  are integrated with fast grouping and ordering algorithms (also callable from C),
  and efficient data manipulation functions. The package also provides a flexible
  and rigorous approach to time series and panel data in R, fast functions for data
  transformation and common statistical procedures, detailed (grouped, weighted) summary
  statistics, powerful tools to work with nested data, fast data object conversions,
  functions for memory efficient R programming, and helpers to effectively deal with
  variable labels, attributes, and missing data. It seamlessly supports base R objects/classes
  as well as 'units', 'integer64', 'xts'/ 'zoo', 'tibble', 'grouped_df', 'data.table',
  'sf', and 'pseries'/'pdata.frame'.
authors:
- family-names: Krantz
  given-names: Sebastian
  email: sebastian.krantz@graduateinstitute.ch
  orcid: https://orcid.org/0000-0001-6212-5229
preferred-citation:
  type: generic
  title: 'collapse: Advanced and Fast Statistical Computing and Data Transformation
    in R'
  authors:
  - family-names: Krantz
    given-names: Sebastian
    email: sebastian.krantz@graduateinstitute.ch
    orcid: https://orcid.org/0000-0001-6212-5229
  year: '2024'
  url: https://arxiv.org/abs/2403.05038
repository: https://CRAN.R-project.org/package=collapse
repository-code: https://github.com/SebKrantz/collapse
url: https://sebkrantz.github.io/collapse/
date-released: '2025-08-19'
contact:
- family-names: Krantz
  given-names: Sebastian
  email: sebastian.krantz@graduateinstitute.ch
  orcid: https://orcid.org/0000-0001-6212-5229
keywords:
- cran
- data-aggregation
- data-analysis
- data-manipulation
- data-processing
- data-science
- data-transformation
- econometrics
- high-performance
- panel-data
- r
- rstats
- scientific-computing
- statistics
- time-series
- weighted
- weights
references:
- type: manual
  title: 'collapse: Advanced and Fast Data Transformation in R'
  authors:
  - family-names: Krantz
    given-names: Sebastian
  year: '2025'
  notes: R package version 2.1.3.9000
  doi: 10.5281/zenodo.8433090
  url: https://sebkrantz.github.io/collapse/
- type: software
  title: 'R: A Language and Environment for Statistical Computing'
  notes: Depends
  url: https://www.R-project.org/
  authors:
  - name: R Core Team
  institution:
    name: R Foundation for Statistical Computing
    address: Vienna, Austria
  year: '2025'
  version: '>= 3.5.0'
- type: software
  title: Rcpp
  abstract: 'Rcpp: Seamless R and C++ Integration'
  notes: Imports
  url: https://www.rcpp.org
  repository: https://CRAN.R-project.org/package=Rcpp
  authors:
  - family-names: Eddelbuettel
    given-names: Dirk
    email: edd@debian.org
    orcid: https://orcid.org/0000-0001-6419-907X
  - family-names: Francois
    given-names: Romain
    orcid: https://orcid.org/0000-0002-2444-4226
  - family-names: Allaire
    given-names: JJ
    orcid: https://orcid.org/0000-0003-0174-9868
  - family-names: Ushey
    given-names: Kevin
    orcid: https://orcid.org/0000-0003-2880-7407
  - family-names: Kou
    given-names: Qiang
    orcid: https://orcid.org/0000-0001-6786-5453
  - family-names: Russell
    given-names: Nathan
  - family-names: Ucar
    given-names: Iñaki
    orcid: https://orcid.org/0000-0001-6403-5550
  - family-names: Bates
    given-names: Doug
    orcid: https://orcid.org/0000-0001-8316-9503
  - family-names: Chambers
    given-names: John
  year: '2025'
  doi: 10.32614/CRAN.package.Rcpp
  version: '>= 1.0.1'
- type: software
  title: Rcpp
  abstract: 'Rcpp: Seamless R and C++ Integration'
  notes: Imports
  url: https://www.rcpp.org
  repository: https://CRAN.R-project.org/package=Rcpp
  authors:
  - family-names: Eddelbuettel
    given-names: Dirk
    email: edd@debian.org
    orcid: https://orcid.org/0000-0001-6419-907X
  - family-names: Francois
    given-names: Romain
    orcid: https://orcid.org/0000-0002-2444-4226
  - family-names: Allaire
    given-names: JJ
    orcid: https://orcid.org/0000-0003-0174-9868
  - family-names: Ushey
    given-names: Kevin
    orcid: https://orcid.org/0000-0003-2880-7407
  - family-names: Kou
    given-names: Qiang
    orcid: https://orcid.org/0000-0001-6786-5453
  - family-names: Russell
    given-names: Nathan
  - family-names: Ucar
    given-names: Iñaki
    orcid: https://orcid.org/0000-0001-6403-5550
  - family-names: Bates
    given-names: Doug
    orcid: https://orcid.org/0000-0001-8316-9503
  - family-names: Chambers
    given-names: John
  year: '2025'
  doi: 10.32614/CRAN.package.Rcpp
- type: software
  title: fastverse
  abstract: 'fastverse: A Suite of High-Performance Packages for Statistics and Data
    Manipulation'
  notes: Suggests
  url: https://fastverse.github.io/fastverse/
  repository: https://CRAN.R-project.org/package=fastverse
  authors:
  - family-names: Krantz
    given-names: Sebastian
    email: sebastian.krantz@graduateinstitute.ch
  year: '2025'
  doi: 10.32614/CRAN.package.fastverse
- type: software
  title: data.table
  abstract: 'data.table: Extension of `data.frame`'
  notes: Suggests
  url: https://r-datatable.com
  repository: https://CRAN.R-project.org/package=data.table
  authors:
  - family-names: Barrett
    given-names: Tyson
    email: t.barrett88@gmail.com
    orcid: https://orcid.org/0000-0002-2137-1391
  - family-names: Dowle
    given-names: Matt
    email: mattjdowle@gmail.com
  - family-names: Srinivasan
    given-names: Arun
    email: asrini@pm.me
  - family-names: Gorecki
    given-names: Jan
  - family-names: Chirico
    given-names: Michael
    orcid: https://orcid.org/0000-0003-0787-087X
  - family-names: Hocking
    given-names: Toby
    orcid: https://orcid.org/0000-0002-3146-0865
  - family-names: Schwendinger
    given-names: Benjamin
    orcid: https://orcid.org/0000-0003-3315-8114
  - family-names: Krylov
    given-names: Ivan
    email: ikrylov@disroot.org
    orcid: https://orcid.org/0000-0002-0172-3812
  year: '2025'
  doi: 10.32614/CRAN.package.data.table
- type: software
  title: magrittr
  abstract: 'magrittr: A Forward-Pipe Operator for R'
  notes: Suggests
  url: https://magrittr.tidyverse.org
  repository: https://CRAN.R-project.org/package=magrittr
  authors:
  - family-names: Bache
    given-names: Stefan Milton
    email: stefan@stefanbache.dk
  - family-names: Wickham
    given-names: Hadley
    email: hadley@rstudio.com
  year: '2025'
  doi: 10.32614/CRAN.package.magrittr
- type: software
  title: kit
  abstract: 'kit: Data Manipulation Functions Implemented in C'
  notes: Suggests
  url: https://github.com/2005m/kit
  repository: https://CRAN.R-project.org/package=kit
  authors:
  - family-names: Jacob
    given-names: Morgan
    email: morgan.emailbox@gmail.com
  year: '2025'
  doi: 10.32614/CRAN.package.kit
- type: software
  title: xts
  abstract: 'xts: eXtensible Time Series'
  notes: Suggests
  url: https://joshuaulrich.github.io/xts/
  repository: https://CRAN.R-project.org/package=xts
  authors:
  - family-names: Ryan
    given-names: Jeffrey A.
  - family-names: Ulrich
    given-names: Joshua M.
    email: josh.m.ulrich@gmail.com
  year: '2025'
  doi: 10.32614/CRAN.package.xts
- type: software
  title: zoo
  abstract: 'zoo: S3 Infrastructure for Regular and Irregular Time Series (Z''s Ordered
    Observations)'
  notes: Suggests
  url: https://zoo.R-Forge.R-project.org/
  repository: https://CRAN.R-project.org/package=zoo
  authors:
  - family-names: Zeileis
    given-names: Achim
    email: Achim.Zeileis@R-project.org
    orcid: https://orcid.org/0000-0003-0918-3766
  - family-names: Grothendieck
    given-names: Gabor
    email: ggrothendieck@gmail.com
  - family-names: Ryan
    given-names: Jeffrey A.
    email: jeff.a.ryan@gmail.com
  year: '2025'
  doi: 10.32614/CRAN.package.zoo
- type: software
  title: plm
  abstract: 'plm: Linear Models for Panel Data'
  notes: Suggests
  url: https://cran.r-project.org/package=plm
  repository: https://CRAN.R-project.org/package=plm
  authors:
  - family-names: Croissant
    given-names: Yves
    email: yves.croissant@univ-reunion.fr
  - family-names: Millo
    given-names: Giovanni
    email: giovanni.millo@deams.units.it
  - family-names: Tappe
    given-names: Kevin
    email: kevin.tappe@bwi.uni-stuttgart.de
  year: '2025'
  doi: 10.32614/CRAN.package.plm
- type: software
  title: fixest
  abstract: 'fixest: Fast Fixed-Effects Estimations'
  notes: Suggests
  url: https://lrberge.github.io/fixest/
  repository: https://CRAN.R-project.org/package=fixest
  authors:
  - family-names: Berge
    given-names: Laurent
    email: laurent.berge@u-bordeaux.fr
  year: '2025'
  doi: 10.32614/CRAN.package.fixest
- type: software
  title: vars
  abstract: 'vars: VAR Modelling'
  notes: Suggests
  url: https://www.pfaffikus.de
  repository: https://CRAN.R-project.org/package=vars
  authors:
  - family-names: Pfaff
    given-names: Bernhard
    email: bernhard@pfaffikus.de
  year: '2025'
  doi: 10.32614/CRAN.package.vars
- type: software
  title: RcppArmadillo
  abstract: 'RcppArmadillo: ''Rcpp'' Integration for the ''Armadillo'' Templated Linear
    Algebra Library'
  notes: Suggests
  url: https://dirk.eddelbuettel.com/code/rcpp.armadillo.html
  repository: https://CRAN.R-project.org/package=RcppArmadillo
  authors:
  - family-names: Eddelbuettel
    given-names: Dirk
    email: edd@debian.org
    orcid: https://orcid.org/0000-0001-6419-907X
  - family-names: Francois
    given-names: Romain
    orcid: https://orcid.org/0000-0002-2444-4226
  - family-names: Bates
    given-names: Doug
    orcid: https://orcid.org/0000-0001-8316-9503
  - family-names: Ni
    given-names: Binxiang
  - family-names: Sanderson
    given-names: Conrad
    orcid: https://orcid.org/0000-0002-0049-4501
  year: '2025'
  doi: 10.32614/CRAN.package.RcppArmadillo
- type: software
  title: RcppEigen
  abstract: 'RcppEigen: ''Rcpp'' Integration for the ''Eigen'' Templated Linear Algebra
    Library'
  notes: Suggests
  url: https://dirk.eddelbuettel.com/code/rcpp.eigen.html
  repository: https://CRAN.R-project.org/package=RcppEigen
  authors:
  - family-names: Bates
    given-names: Doug
    orcid: https://orcid.org/0000-0001-8316-9503
  - family-names: Eddelbuettel
    given-names: Dirk
    email: edd@debian.org
    orcid: https://orcid.org/0000-0001-6419-907X
  - family-names: Francois
    given-names: Romain
    orcid: https://orcid.org/0000-0002-2444-4226
  - family-names: Qiu
    given-names: Yixuan
    orcid: https://orcid.org/0000-0003-0109-6692
  year: '2025'
  doi: 10.32614/CRAN.package.RcppEigen
- type: software
  title: tibble
  abstract: 'tibble: Simple Data Frames'
  notes: Suggests
  url: https://tibble.tidyverse.org/
  repository: https://CRAN.R-project.org/package=tibble
  authors:
  - family-names: Müller
    given-names: Kirill
    email: kirill@cynkra.com
    orcid: https://orcid.org/0000-0002-1416-3412
  - family-names: Wickham
    given-names: Hadley
    email: hadley@rstudio.com
  year: '2025'
  doi: 10.32614/CRAN.package.tibble
- type: software
  title: dplyr
  abstract: 'dplyr: A Grammar of Data Manipulation'
  notes: Suggests
  url: https://dplyr.tidyverse.org
  repository: https://CRAN.R-project.org/package=dplyr
  authors:
  - family-names: Wickham
    given-names: Hadley
    email: hadley@posit.co
    orcid: https://orcid.org/0000-0003-4757-117X
  - family-names: François
    given-names: Romain
    orcid: https://orcid.org/0000-0002-2444-4226
  - family-names: Henry
    given-names: Lionel
  - family-names: Müller
    given-names: Kirill
    orcid: https://orcid.org/0000-0002-1416-3412
  - family-names: Vaughan
    given-names: Davis
    email: davis@posit.co
    orcid: https://orcid.org/0000-0003-4777-038X
  year: '2025'
  doi: 10.32614/CRAN.package.dplyr
- type: software
  title: ggplot2
  abstract: 'ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics'
  notes: Suggests
  url: https://ggplot2.tidyverse.org
  repository: https://CRAN.R-project.org/package=ggplot2
  authors:
  - family-names: Wickham
    given-names: Hadley
    email: hadley@posit.co
    orcid: https://orcid.org/0000-0003-4757-117X
  - family-names: Chang
    given-names: Winston
    orcid: https://orcid.org/0000-0002-1576-2126
  - family-names: Henry
    given-names: Lionel
  - family-names: Pedersen
    given-names: Thomas Lin
    email: thomas.pedersen@posit.co
    orcid: https://orcid.org/0000-0002-5147-4711
  - family-names: Takahashi
    given-names: Kohske
  - family-names: Wilke
    given-names: Claus
    orcid: https://orcid.org/0000-0002-7470-9261
  - family-names: Woo
    given-names: Kara
    orcid: https://orcid.org/0000-0002-5125-4188
  - family-names: Yutani
    given-names: Hiroaki
    orcid: https://orcid.org/0000-0002-3385-7233
  - family-names: Dunnington
    given-names: Dewey
    orcid: https://orcid.org/0000-0002-9415-4582
  - family-names: Brand
    given-names: Teun
    name-particle: van den
    orcid: https://orcid.org/0000-0002-9335-7468
  year: '2025'
  doi: 10.32614/CRAN.package.ggplot2
- type: software
  title: scales
  abstract: 'scales: Scale Functions for Visualization'
  notes: Suggests
  url: https://scales.r-lib.org
  repository: https://CRAN.R-project.org/package=scales
  authors:
  - family-names: Wickham
    given-names: Hadley
    email: hadley@posit.co
  - family-names: Pedersen
    given-names: Thomas Lin
    email: thomas.pedersen@posit.co
    orcid: https://orcid.org/0000-0002-5147-4711
  - family-names: Seidel
    given-names: Dana
  year: '2025'
  doi: 10.32614/CRAN.package.scales
- type: software
  title: microbenchmark
  abstract: 'microbenchmark: Accurate Timing Functions'
  notes: Suggests
  url: https://github.com/joshuaulrich/microbenchmark/
  repository: https://CRAN.R-project.org/package=microbenchmark
  authors:
  - family-names: Mersmann
    given-names: Olaf
  year: '2025'
  doi: 10.32614/CRAN.package.microbenchmark
- type: software
  title: testthat
  abstract: 'testthat: Unit Testing for R'
  notes: Suggests
  url: https://testthat.r-lib.org
  repository: https://CRAN.R-project.org/package=testthat
  authors:
  - family-names: Wickham
    given-names: Hadley
    email: hadley@posit.co
  year: '2025'
  doi: 10.32614/CRAN.package.testthat
- type: software
  title: covr
  abstract: 'covr: Test Coverage for Packages'
  notes: Suggests
  url: https://covr.r-lib.org
  repository: https://CRAN.R-project.org/package=covr
  authors:
  - family-names: Hester
    given-names: Jim
    email: james.f.hester@gmail.com
  year: '2025'
  doi: 10.32614/CRAN.package.covr
- type: software
  title: knitr
  abstract: 'knitr: A General-Purpose Package for Dynamic Report Generation in R'
  notes: Suggests
  url: https://yihui.org/knitr/
  repository: https://CRAN.R-project.org/package=knitr
  authors:
  - family-names: Xie
    given-names: Yihui
    email: xie@yihui.name
    orcid: https://orcid.org/0000-0003-0645-5666
  year: '2025'
  doi: 10.32614/CRAN.package.knitr
- type: software
  title: rmarkdown
  abstract: 'rmarkdown: Dynamic Documents for R'
  notes: Suggests
  url: https://pkgs.rstudio.com/rmarkdown/
  repository: https://CRAN.R-project.org/package=rmarkdown
  authors:
  - family-names: Allaire
    given-names: JJ
    email: jj@posit.co
  - family-names: Xie
    given-names: Yihui
    email: xie@yihui.name
    orcid: https://orcid.org/0000-0003-0645-5666
  - family-names: Dervieux
    given-names: Christophe
    email: cderv@posit.co
    orcid: https://orcid.org/0000-0003-4474-2498
  - family-names: McPherson
    given-names: Jonathan
    email: jonathan@posit.co
  - family-names: Luraschi
    given-names: Javier
  - family-names: Ushey
    given-names: Kevin
    email: kevin@posit.co
  - family-names: Atkins
    given-names: Aron
    email: aron@posit.co
  - family-names: Wickham
    given-names: Hadley
    email: hadley@posit.co
  - family-names: Cheng
    given-names: Joe
    email: joe@posit.co
  - family-names: Chang
    given-names: Winston
    email: winston@posit.co
  - family-names: Iannone
    given-names: Richard
    email: rich@posit.co
    orcid: https://orcid.org/0000-0003-3925-190X
  year: '2025'
  doi: 10.32614/CRAN.package.rmarkdown
- type: software
  title: withr
  abstract: 'withr: Run Code ''With'' Temporarily Modified Global State'
  notes: Suggests
  url: https://withr.r-lib.org
  repository: https://CRAN.R-project.org/package=withr
  authors:
  - family-names: Hester
    given-names: Jim
  - family-names: Henry
    given-names: Lionel
    email: lionel@posit.co
  - family-names: Müller
    given-names: Kirill
    email: krlmlr+r@mailbox.org
  - family-names: Ushey
    given-names: Kevin
    email: kevinushey@gmail.com
  - family-names: Wickham
    given-names: Hadley
    email: hadley@posit.co
  - family-names: Chang
    given-names: Winston
  year: '2025'
  doi: 10.32614/CRAN.package.withr
- type: software
  title: bit64
  abstract: 'bit64: A S3 Class for Vectors of 64bit Integers'
  notes: Suggests
  url: https://github.com/r-lib/bit64
  repository: https://CRAN.R-project.org/package=bit64
  authors:
  - family-names: Chirico
    given-names: Michael
    email: michaelchirico4@gmail.com
  - family-names: Oehlschlägel
    given-names: Jens
  year: '2025'
  doi: 10.32614/CRAN.package.bit64

GitHub Events

Total
  • Create event: 12
  • Release event: 11
  • Issues event: 74
  • Watch event: 33
  • Delete event: 5
  • Issue comment event: 147
  • Push event: 379
  • Pull request review comment event: 1
  • Pull request review event: 1
  • Pull request event: 171
  • Fork event: 2
Last Year
  • Create event: 12
  • Release event: 11
  • Issues event: 74
  • Watch event: 33
  • Delete event: 5
  • Issue comment event: 147
  • Push event: 379
  • Pull request review comment event: 1
  • Pull request review event: 1
  • Pull request event: 171
  • Fork event: 2

Committers

Last synced: 11 months ago

All Time
  • Total Commits: 2,790
  • Total Committers: 13
  • Avg Commits per committer: 214.615
  • Development Distribution Score (DDS): 0.029
Past Year
  • Commits: 366
  • Committers: 5
  • Avg Commits per committer: 73.2
  • Development Distribution Score (DDS): 0.12
Top Committers
| Name | Email | Commits |
|---|---|---|
| Sebastian Krantz | s****z@g****h | 2,708 |
| github-actions[bot] | 4****] | 38 |
| Michael Chirico | c****m@g****m | 11 |
| helix123 | k****e@g****e | 9 |
| Alina Cherkas | 5****s | 7 |
| Joris Meys | J****s@U****e | 5 |
| Ivan K | k****t@g****m | 3 |
| Arthur Gailes | a****1@g****m | 3 |
| Florian Kohrt | f****t@a****o | 2 |
| Tomas Kalibera | t****a@g****m | 1 |
| Romain Francois | r****n@r****m | 1 |
| Ralf Herold | r****d@g****t | 1 |
| Dirk Eddelbuettel | e****d@d****g | 1 |
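
The All Time summary figures can be reproduced from the per-committer commit counts listed under Top Committers. The sketch below assumes that the Development Distribution Score (DDS) is one minus the top committer's share of all commits; this page does not state the definition, but the assumption matches the All Time numbers shown. Variable names are illustrative only.

```r
# Commit counts per committer, copied from the Top Committers listing.
commits <- c(2708, 38, 11, 9, 7, 5, 3, 3, 2, 1, 1, 1, 1)

# Total commits and average commits per committer.
total_commits <- sum(commits)
avg_commits   <- total_commits / length(commits)

# Assumed definition (not stated on this page):
# DDS = 1 - (commits by the top committer / total commits).
dds <- 1 - max(commits) / total_commits

print(total_commits)
print(round(avg_commits, 3))
print(round(dds, 3))
```

Under that assumption the three values come out as 2790, 214.615 and 0.029, which agree with the All Time block. The Past Year figures cannot be checked the same way because the per-committer breakdown for that period is not listed.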

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 171
  • Total pull requests: 332
  • Average time to close issues: 25 days
  • Average time to close pull requests: about 2 hours
  • Total issue authors: 81
  • Total pull request authors: 7
  • Average comments per issue: 3.13
  • Average comments per pull request: 0.13
  • Merged pull requests: 318
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 47
  • Pull requests: 153
  • Average time to close issues: 12 days
  • Average time to close pull requests: about 1 hour
  • Issue authors: 28
  • Pull request authors: 4
  • Average comments per issue: 2.55
  • Average comments per pull request: 0.17
  • Merged pull requests: 143
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • NicChr (14)
  • Steviey (8)
  • D3SL (7)
  • SebKrantz (7)
  • mayer79 (7)
  • tony-aw (6)
  • ummel (6)
  • orgadish (4)
  • alinacherkas (4)
  • grantmcdermott (4)
  • arthurgailes (4)
  • statzhero (3)
  • Henrik-P (3)
  • zander-prinsloo (3)
  • kylebutts (3)
Pull Request Authors
  • SebKrantz (393)
  • MichaelChirico (14)
  • alinacherkas (10)
  • arthurgailes (6)
  • fkohrt (2)
  • aitap (2)
  • kalibera (2)
Top Labels
Issue Labels
  • bug (9)
  • enhancement (5)
Pull Request Labels

Packages

  • Total packages: 4
  • Total downloads:
    • CRAN: 28,709 last month
  • Total docker downloads: 42,548
  • Total dependent packages: 27
    (may contain duplicates)
  • Total dependent repositories: 38
    (may contain duplicates)
  • Total versions: 169
  • Total maintainers: 1
proxy.golang.org: github.com/sebkrantz/collapse
  • Versions: 56
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 5.4%
Average: 5.6%
Dependent repos count: 5.8%
Last synced: 6 months ago
proxy.golang.org: github.com/SebKrantz/collapse
  • Versions: 56
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 5.4%
Average: 5.6%
Dependent repos count: 5.8%
Last synced: 6 months ago
cran.r-project.org: collapse

Advanced and Fast Data Transformation

  • Versions: 52
  • Dependent Packages: 25
  • Dependent Repositories: 38
  • Downloads: 28,709 last month
  • Docker Downloads: 42,548
Rankings
Stargazers count: 0.7%
Downloads: 1.7%
Forks count: 2.9%
Dependent packages count: 3.6%
Dependent repos count: 4.2%
Average: 5.7%
Docker downloads count: 21.0%
Last synced: 6 months ago
conda-forge.org: r-collapse
  • Versions: 5
  • Dependent Packages: 2
  • Dependent Repositories: 0
Rankings
Stargazers count: 17.5%
Dependent packages count: 19.5%
Average: 26.1%
Forks count: 33.2%
Dependent repos count: 34.0%
Last synced: 6 months ago

Dependencies

DESCRIPTION cran
  • R >= 3.3.0 depends
  • Rcpp >= 1.0.1 imports
  • RcppArmadillo * suggests
  • RcppEigen * suggests
  • covr * suggests
  • data.table * suggests
  • dplyr * suggests
  • fastverse * suggests
  • fixest * suggests
  • ggplot2 * suggests
  • kit * suggests
  • knitr * suggests
  • magrittr * suggests
  • microbenchmark * suggests
  • plm * suggests
  • rmarkdown * suggests
  • scales * suggests
  • testthat * suggests
  • tibble * suggests
  • vars * suggests
  • xts * suggests
  • zoo * suggests
.github/workflows/R-CMD-check.yaml actions
  • actions/cache v2 composite
  • actions/checkout v2 composite
  • actions/upload-artifact main composite
  • r-lib/actions/setup-pandoc v1 composite
  • r-lib/actions/setup-r v1 composite
.github/workflows/test-coverage.yaml actions
  • actions/checkout v3 composite
  • actions/upload-artifact v3 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite