Recent Releases of fwildclusterboot
fwildclusterboot - fwildclusterboot 0.14.3
- Fixes a bug with CI inversion when `r` was set to be close to the estimated parameter, in which case the confidence interval inversion failed. See #138. Thanks to Achim Zeileis & team for raising this issue!
- This might lead to insubstantial differences in computed bootstrap confidence intervals compared to prior versions. The reason is that starting values for the root-finding procedure used in inverting the CIs might have changed.
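To make the role of the root-finding step concrete, here is a minimal base-R sketch of confidence interval inversion. It is only an illustration under stated assumptions: it uses an ordinary t-test p-value instead of a bootstrap p-value, and the names `p_val_fun` and `invert_ci` as well as the bracket width are hypothetical, not the package's internals.

```r
# Simulated data: y depends linearly on x
set.seed(1)
n <- 200
x <- rnorm(n)
y <- 0.5 * x + rnorm(n)
fit <- lm(y ~ x)
est <- coef(summary(fit))["x", "Estimate"]
se  <- coef(summary(fit))["x", "Std. Error"]
dof <- fit$df.residual

# p-value for H0: beta = r (assumption: a bootstrap p-value would be used in practice)
p_val_fun <- function(r) 2 * pt(-abs((est - r) / se), df = dof)

# Invert the test: on each side of the estimate, find the r where p(r) = alpha
invert_ci <- function(alpha = 0.05, width = 10 * se) {
  f <- function(r) p_val_fun(r) - alpha
  lower <- uniroot(f, lower = est - width, upper = est, tol = 1e-10)$root
  upper <- uniroot(f, lower = est, upper = est + width, tol = 1e-10)$root
  c(lower = lower, upper = upper)
}

ci <- invert_ci()
ci
confint(fit)["x", ]  # analytical interval for comparison
```

The search brackets start at the point estimate, where the p-value equals one, so the sign change needed by `uniroot()` is guaranteed on both sides. Different starting brackets find the same roots only up to the solver tolerance, which is one way small numerical differences between versions can arise.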
- R
Published by s3alfisc over 2 years ago
fwildclusterboot - v0.14
fwildclusterboot 0.14
Performance
Version 0.14 ...
- sparsifies the "fast and reliable" bootstraps - bootstrap types 31, 33, 13 (which leads to good speed gains for problems with high dimensional fixed effects)
- allows projecting out cluster fixed effects when running the "fast and reliable" algorithms "11" and "31"
- computes the generalized inverse `pinv` via rcpp eigen instead of `MASS::ginv()` whenever `Matrix::solve()` fails (thanks to @kylebutts)
- unlocks parallelization (nthreads was internally set to 1 for some reason)
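The "solve first, fall back to a generalized inverse" pattern from the list above can be sketched in base R as follows. This is not the package's Rcpp Eigen implementation: the helper names `pinv_svd` and `safe_inverse` are hypothetical, and the pseudoinverse is computed via `svd()` purely for illustration.

```r
# Moore-Penrose pseudoinverse via the singular value decomposition
pinv_svd <- function(A, tol = sqrt(.Machine$double.eps)) {
  s <- svd(A)
  keep <- s$d > tol * max(s$d)
  # invert only the non-negligible singular values
  s$v[, keep, drop = FALSE] %*% (t(s$u[, keep, drop = FALSE]) / s$d[keep])
}

# Try a regular solve first; fall back to the pseudoinverse if it fails
safe_inverse <- function(A) {
  tryCatch(solve(A), error = function(e) pinv_svd(A))
}

A <- matrix(c(1, 2, 2, 4), nrow = 2)  # rank 1, so solve(A) throws an error
P <- safe_inverse(A)
all.equal(A %*% P %*% A, A)           # defining property of a generalized inverse
```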
Here is a performance benchmark for the "31" bootstrap between version 0.13 and 0.14: the runtime of the bootstrap decreases from 150s to around 10 seconds!
```r
library(fwildclusterboot)
library(fixest)

df <- fwildclusterboot:::create_data(
  N = 1000000, N_G1 = 50, icc1 = 0.2, N_G2 = 50, icc2 = 0.5,
  numb_fe1 = 100, numb_fe2 = 100, seed = 123
)

nrow(df)
# [1] 1000000

sapply(df[, c("group_id1", "Q1_immigration", "Q2_defense")], function(x) length(unique(x)))
#      group_id1 Q1_immigration     Q2_defense
#             50            100            100

fit <- feols(
  proposition_vote ~ treatment | group_id1 + Q1_immigration + Q2_defense,
  data = df
)

# fwildclusterboot 0.13:
pracma::tic()
boottest(fit, param = ~ treatment, B = 9999, clustid = ~ group_id1, bootstrap_type = "31")
pracma::toc()
# elapsed time is 158.550000 seconds

# fwildclusterboot 0.14:
pracma::tic()
boottest(fit, param = ~ treatment, B = 9999, clustid = ~ group_id1, bootstrap_type = "31")
pracma::toc()
# elapsed time is 11.460000 seconds
```
Breaking Changes
- the `print.boottest()` and `print.mboottest()` methods have been deprecated, as both did not really do anything that the `summary()` method would not do. Note: this was a bad decision, and I will revert it in version `0.14.1`.
- Bugfix: `boottest()` should never have run with `fixest::feols()` and varying slopes syntax via `var1[var2]`. Unfortunately it did for the heteroskedastic bootstrap - it's a bug. I am very sorry if you are affected by this! This version adds an error message for this case.
rOpenSci Review feedback link
- update docs:
- add a vignette on wild bootstrap concepts (wild bootstrap 101)
- better explanation of plot method in docs and vignette
- some guidelines on how to turn messages and warnings off
- reorganization of ropensci ssr tags into code
- it is now possible to interrupt rcpp loops
Misc
- throws a clear error message when the subcluster bootstrap is tried for the fast and reliable algos (currently not supported)
- bumps the required `WildBootTests.jl` version to `0.9.7` (version `0.9.6` contained a small bug when fixed effects were used within the bootstrap via the `fe` argument of `boottest()`).
- R
Published by s3alfisc over 2 years ago
fwildclusterboot - v0.13
Potentially Breaking Changes:
- `boottest()`, `mboottest()` and `boot_aggregate()` no longer have a dedicated `seed` argument. From version 0.13, reproducibility of results can only be controlled by setting a global seed via `dqrng::dqset.seed()` and `set.seed()`. For more context, see the discussion below.
- As one consequence, results produced via old versions of `fwildclusterboot` are no longer exactly reproducible. For more background on this change, see the changelog.
- When the bootstrap is run via `engine = "WildBootTests.jl"`, the bootstrapped t-statistics and the original t-statistic are now returned as vectors (to align with the results from other engines). Previously, they were returned as matrices.
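As a minimal sketch of what "setting a global seed" looks like in practice, the snippet below seeds both random number generators before drawing bootstrap-style weights. The helper `set_global_seeds()` is a hypothetical convenience wrapper, not part of the package, and it only calls `dqrng::dqset.seed()` if `dqrng` happens to be installed.

```r
# Set both global seeds before running a bootstrap
set_global_seeds <- function(seed) {
  set.seed(seed)                              # base R random number generator
  if (requireNamespace("dqrng", quietly = TRUE)) {
    dqrng::dqset.seed(seed)                   # dqrng random number generator
  }
  invisible(seed)
}

set_global_seeds(123)
draw1 <- sample(c(-1, 1), size = 10, replace = TRUE)

set_global_seeds(123)
draw2 <- sample(c(-1, 1), size = 10, replace = TRUE)

identical(draw1, draw2)  # same seed, same weights
```

In an analysis script, the same two seed calls would be placed directly before the call to `boottest()`.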
Other Changes:
- `boottest()` receives a new argument, `sampling`, which controls if random numbers are drawn via functions from `base` or the `dqrng` package.
- Some code refactoring. All bootstrap algorithms and their associated files have been renamed (e.g. `boot_algo2.R` is now called `boot_algo_fastnwild.R`, etc.).
- Much nicer error and message formatting, via `rlang::abort()`, `warn()` and `inform()`. `rlang` is added as a dependency.
- R
Published by s3alfisc about 3 years ago
fwildclusterboot - fwildclusterboot v0.12.3
- Bump required version of WildBootTests.jl to 0.8.3
- R
Published by s3alfisc about 3 years ago
fwildclusterboot - fwildclusterboot v0.12 (CRAN release)
fwildclusterboot 0.12
This is the first CRAN release since version 0.9. It comes with a set of new features, but also potentially breaking changes. This section summarizes all developments since version 0.9.
Potentially breaking changes:
- `boottest()`'s function argument `boot_algo` has been renamed to `engine`
- the `setBoottest_boot_algo()` function was renamed to `setBoottest_engine()`
Bug fixes and internal changes
- When a multi-parameter hypothesis of the form R beta = r was tested, the heteroskedastic wild bootstrap would nevertheless always test "beta_k = 0" vs "beta_k != 0", with "beta_k = param". I am sorry for that bug!
- The `Matrix.utils` package has been removed from CRAN - it has been replaced by custom functions for internal use.
New features and Improvements
- A new function argument has been added - `bootstrap_type`. In combination with the `impose_null` function argument, it allows choosing between different cluster bootstrap types - WCx11, WCx13, WCx31, WCx33. For more details on these methods, see the working paper by MacKinnon, Nielsen & Webb (2022). Currently, these new bootstrap types only compute p-values. Adding support for confidence intervals is work in progress.
- A `boot_aggregate()` method now supports the aggregation of coefficients in staggered difference-in-differences following the methods by Sun & Abraham (2021, Journal of Econometrics) in combination with the `sunab()` function from `fixest`. Essentially, `boot_aggregate()` is a copy of `aggregate.fixest`: the only difference is that inference is powered by a wild bootstrap.
- The heteroskedastic bootstrap is now significantly faster, and WCR21 and WCR31 versions are now supported (i.e. HC2 and HC3 'imposed' on the bootstrap dgp).
- R
Published by s3alfisc over 3 years ago
fwildclusterboot - v0.11.2
- fixes a bug for new bootstrap variants and `engine = "R"` when a fixed effect is specified via `fixest::feols()`
- drops the dependency on `Matrix.utils` as the package might be taken off CRAN
- adds a dependency on `summclust`, which is now available on CRAN
- R
Published by s3alfisc over 3 years ago
fwildclusterboot - Release of versions 0.10 and 0.11
fwildclusterboot 0.11
- This release introduces new wild cluster bootstrap variants as described in MacKinnon, Nielsen & Webb (2022). The implementation is still quite bare-bones: it only allows testing hypotheses of the form $\beta_k = 0$ vs $\beta_k \neq 0$, does not allow for regression weights or fixed effects, and further does not compute confidence intervals.
You can run one of the 'new' variants - e.g. the "WCR13", by specifying the boot_algo function argument accordingly:
boottest(
lm_fit,
param = ~treatment,
clustid = ~group_id1,
B = 9999,
impose_null = TRUE,
boot_algo = "WCR13"
)
fwildclusterboot 0.10
- introduces a range of new methods: `nobs()`, `pval()`, `teststat()`, `confint()` and `print()`
- multiple (internal) changes for ropensci standards alignment
- drop the `t_boot` (`teststat_boot`) function arguments -> they are now TRUE by default
- fix a bug in the lean algorithms - it always tested hypotheses of the form beta = 0 instead of R'beta = r, even when R != 1 and r != 0
- enable full enumeration for R-lean tests
- enable deterministic 'full enumeration tests' - these are exact
- R
Published by s3alfisc over 3 years ago
fwildclusterboot - fwildclusterboot v0.9
fwildclusterboot 0.9
v0.9 moves data pre-processing from `model.frame` methods to `model_matrix` methods. I had wanted to do so for a while, but issue #42, as raised by Michael Topper, has finally convinced me to start this project.

Moving to `model_matrix` methods unlocks new functionality for how `boottest()` plays with `fixest` objects - it is now possible to run `boottest()` after `feols()` models that use syntactic sugar:
```
library(fwildclusterboot)
library(fixest)

# interaction terms via i()
data(voters)
feols_fit <- feols(proposition_vote ~ i(treatment, ideology1),
  data = voters
)
boot1 <- boottest(feols_fit,
  B = 9999,
  param = "treatment::0:ideology1",
  clustid = "group_id1"
)

# stepwise fixed effects via sw()
feols_fits <- fixest::feols(proposition_vote ~ treatment | sw(Q1_immigration, Q2_defense), data = voters)
res <- lapply(feols_fits, function(x) boottest(x, B = 999, param = "treatment", clustid = "group_id1"))

# split-sample estimation via split
voters$split <- sample(1:2, nrow(voters), TRUE)
feols_fits <- fixest::feols(proposition_vote ~ treatment, split = ~split, data = voters)

res <- lapply(feols_fits, function(x) boottest(x, B = 999, param = "treatment", clustid = "group_id1"))
```
Interacting fixed effects via ^ still leads to errors - this remains work in progress:
```
feols_fit2 <- feols(proposition_vote ~ treatment | Q1_immigration^Q2_defense,
  data = voters
)

boot1 <- boottest(feols_fit2,
  B = 9999,
  param = "treatment",
  clustid = "group_id1"
)
```
The release further fixes a multicollinearity bug that occurred when `lm()` or `fixest()` silently deleted multicollinear variable(s). Thanks to Kurt Schmidheiny for reporting! (see issue #43)

The `na_omit` function argument has been dropped. If the cluster variable is not included in the regression model, it is now not allowed to contain NA values.

Several function arguments can now be fed to `boottest()` as formulas (`param`, `clustid`, `bootcluster`, `fe`).
data(voters)
feols_fit <- feols(proposition_vote ~ treatment ,
data = voters
)
boot <- boottest(feols_fit,
B = 9999,
param = ~ treatment,
clustid = ~ group_id1
)
- R
Published by s3alfisc over 3 years ago
fwildclusterboot - fwildclusterboot v0.8
Two new bootstrap algorithms: 'WildBootTests.jl' and 'R-lean'
boot_algo = 'WildBootTests.jl'
`fwildclusterboot` now supports calling WildBootTests.jl, which is a very fast Julia implementation of the wild cluster bootstrap algorithm. To do so, a new function argument is introduced, `boot_algo`, through which it is possible to control the executed bootstrap algorithm.
```{r}
# load data set voters included in fwildclusterboot
data(voters)
# estimate the regression model via lm
lm_fit <- lm(proposition_vote ~ treatment + ideology1 + log_income + Q1_immigration, data = voters)
boot_lm <- boottest(
  lm_fit,
  clustid = "group_id1",
  param = "treatment",
  B = 9999,
  boot_algo = "WildBootTests.jl"
)
```
- `WildBootTests.jl` is (after compilation) orders of magnitude faster than `fwildclusterboot`'s native R implementation, and speed gains are particularly pronounced for large problems with a large number of clusters and many bootstrap iterations.
- Furthermore, `WildBootTests.jl` supports a range of models and tests that were previously not supported by `fwildclusterboot`: most importantly a) wild cluster bootstrap tests of multiple joint hypotheses and b) the WRE bootstrap by Davidson & MacKinnon for instrumental variables estimation. As the icing on the cake, the WRE is really fast.
```{r}
library(ivreg)
data("SchoolingReturns", package = "ivreg")

# drop all NA values from SchoolingReturns
SchoolingReturns <- SchoolingReturns[rowMeans(sapply(SchoolingReturns, is.na)) == 0, ]
ivreg_fit <- ivreg(
  log(wage) ~ education + age + ethnicity + smsa + south + parents14 |
    nearcollege + age + ethnicity + smsa + south + parents14,
  data = SchoolingReturns
)

boot_ivreg <- boottest(
  object = ivreg_fit,
  B = 999,
  param = "education",
  clustid = "kww",
  type = "mammen",
  impose_null = TRUE
)
generics::tidy(boot_ivreg)
#              term  estimate statistic   p.value    conf.low conf.high
# 1 1*education = 0 0.0638822  1.043969 0.2482482 -0.03152655 0.2128746
```
For guidance on how to install and run `WildBootTests.jl`, have a look at the associated article.

Also, note that running the wild cluster bootstrap through `WildBootTests.jl` is often very memory-efficient.
boot_algo = 'R-lean'
A key limitation of the vectorized 'fast' cluster bootstrap algorithm as implemented in fwildclusterboot is that it is very memory-demanding. For 'larger' problems, running boottest() might lead to out-of-memory errors. To offer an alternative, boottest() now ships a 'new' rcpp- and loop-based implementation of the wild cluster bootstrap (the 'wild2' algorithm in Roodman et al).
boot_lm <- boottest(
lm_fit,
clustid = "group_id1",
param = "treatment",
B = 9999,
boot_algo = "R-lean"
)
Heteroskedastic Wild Bootstrap
It is now possible to run boottest() without specifying a clustid function argument. In this case, boottest() runs a heteroskedasticity-robust wild bootstrap (HC1), which is implemented in C++.
boot_hc1 <- boottest(lm_fit, param = "treatment", B = 9999)
summary(boot_hc1)
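For readers who want to see the idea behind a heteroskedasticity-robust wild bootstrap, here is a minimal base-R sketch for testing a zero slope in a simple regression. It is an illustration under stated assumptions (Rademacher weights, null imposed, HC1 variance, simulated data), not the package's C++ implementation, and the helper `t_hc1()` is a hypothetical name.

```r
set.seed(42)
n <- 300
x <- rnorm(n)
y <- 1 + 0.2 * x + rnorm(n) * (1 + abs(x))   # heteroskedastic errors
X <- cbind(1, x)

# HC1 t-statistic for H0: slope = 0
t_hc1 <- function(y, X) {
  n <- nrow(X)
  k <- ncol(X)
  XtX_inv <- solve(crossprod(X))
  beta <- XtX_inv %*% crossprod(X, y)
  u <- as.vector(y - X %*% beta)
  meat <- crossprod(X * u)                   # sum of u_i^2 * x_i x_i'
  vcov_hc1 <- XtX_inv %*% meat %*% XtX_inv * n / (n - k)
  beta[2] / sqrt(vcov_hc1[2, 2])
}

t_obs <- t_hc1(y, X)

# Impose the null: the restricted model contains only the intercept
fitted_r <- rep(mean(y), n)
resid_r  <- y - fitted_r

B <- 999
t_boot <- replicate(B, {
  v <- sample(c(-1, 1), n, replace = TRUE)   # one Rademacher weight per observation
  y_star <- fitted_r + resid_r * v           # bootstrap sample under the null
  t_hc1(y_star, X)
})

# Two-sided bootstrap p-value
p_val <- mean(abs(t_boot) >= abs(t_obs))
p_val
```

Because each observation gets its own weight, no cluster structure is needed, which is why `clustid` can be omitted in this case.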
boottest() function argument beta0 deprecated
For consistency with WildBootTests.jl, the boottest() function argument beta0 is now replaced by a new function argument, r.
Spring cleaning
I have spent some time to clean up fwildclusterboot's internals, which should now hopefully be more readable and easier to maintain.
Testing
fwildclusterboot is now predominantly tested against WildBootTests.jl. Tests that depend on Julia are by default not run on CRAN, but are regularly run on Mac, Windows and Linux via GitHub Actions.
- R
Published by s3alfisc almost 4 years ago
fwildclusterboot - fwildclusterboot v0.7
- A bug fix release, see issues #26 and #27 regarding preprocessing for fixest when weights are passed to feols() as a formula or when cluster is specified in fixest as a column vector.
- R
Published by s3alfisc about 4 years ago
fwildclusterboot - fwildclusterboot v0.6
fwildclusterboot 0.6
- Bug fix: for one-sided hypotheses for the WRU bootstrap (if `impose_null = FALSE`), the returned p-values were incorrect - they were reported as 'p', but should have been '1-p'. E.g. if the p-value was reported as 0.4, it should have been reported as 0.6.
- A new function argument `ssc` gives more control over the small sample adjustments made within `boottest()`. It closely mirrors the `ssc` argument in `fixest`. The only difference is that `fwildclusterboot::boot_ssc()`'s `fixef.K` argument currently has only one option, `'none'`, which means that the fixed effect parameters are discarded when calculating the number of estimated parameters k. The default arguments of `boot_ssc()` are `adj = TRUE, fixef.K = "none", cluster.adj = TRUE` and `cluster.df = "conventional"`. In fixest, the `cluster.df` argument is `"min"` by default. Prior to v0.6, by default, no small sample adjustments regarding the sample size N and the number of estimated parameters k were applied. The changes in v0.6 may slightly affect the output of `boottest()`. For exact reproducibility of previous results, set `adj = FALSE`. Setting `adj = TRUE` will not affect p-values and confidence intervals for oneway clustering, but it does affect the internally calculated t-stat, which is divided by $\sqrt{(N-k)/(N-1)}$. For twoway clustering, it might affect the number and order of invalid bootstrapped t-statistics (due to non-positive definite covariance matrices) and, through this channel, affect bootstrapped inferential parameters.
- Testing: unit tests are now run on github actions against wildboottestjlr, which is a JuliaConnectoR based wrapper around WildBootTests.jl, a Julia implementation of the fast wild cluster bootstrap algorithm.
Additionally, minor speed tweaks.
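The statement that `adj = TRUE` leaves one-way clustered p-values unchanged follows from the fact that the observed and the bootstrapped t-statistics are rescaled by the same positive constant. A small base-R sketch of that argument, using placeholder values (the simulated `t_boot`, `t_obs`, `N` and `k` are assumptions chosen for illustration only):

```r
set.seed(3)
t_obs  <- 2.1            # placeholder observed t-statistic
t_boot <- rnorm(999)     # placeholder bootstrapped t-statistics
N <- 500                 # sample size
k <- 5                   # number of estimated parameters

# small sample adjustment factor from the text
adj <- sqrt((N - k) / (N - 1))

# p-value without and with the adjustment applied to all t-statistics
p_raw <- mean(abs(t_boot) >= abs(t_obs))
p_adj <- mean(abs(t_boot / adj) >= abs(t_obs / adj))

c(p_raw = p_raw, p_adj = p_adj)
```

Dividing both sides of the comparison by the same positive factor does not change which bootstrap draws exceed the observed statistic, so the two p-values coincide while the reported t-statistic itself changes.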
- R
Published by s3alfisc about 4 years ago
fwildclusterboot - v 0.5.1
- Fix a bug with Mammen weights introduced in version 0.5 -> switch back to `sample()` function. To guarantee reproducibility with Mammen weights, either a seed needs to be specified in `boottest()` or a global seed needs to be set via `set.seed()`.
- Delete some unnecessary computations from boot_algo2() -> speed improvements
- For B = 2^(#number of clusters), Rademacher weights should have been enumerated - instead, they were drawn randomly and enumeration only occurred for B > 2^(#number of clusters). Now, enumeration occurs if B >= 2^(#number of clusters).
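The enumeration rule in the last item can be sketched in base R as follows. The function names `enumerate_rademacher()` and `get_weights()` are hypothetical and only illustrate the `B >= 2^G` switch for `G` clusters; they are not the package's internal functions.

```r
# Enumerate all 2^G distinct Rademacher weight vectors for G clusters
enumerate_rademacher <- function(G) {
  grid <- expand.grid(rep(list(c(-1, 1)), G))
  t(as.matrix(grid))                 # G x 2^G matrix, one column per draw
}

# Enumerate if B >= 2^G, otherwise draw the weights at random
get_weights <- function(G, B) {
  if (B >= 2^G) {
    enumerate_rademacher(G)
  } else {
    matrix(sample(c(-1, 1), G * B, replace = TRUE), nrow = G, ncol = B)
  }
}

W_small <- get_weights(G = 4, B = 999)    # 2^4 = 16 <= 999: full enumeration
W_large <- get_weights(G = 20, B = 999)   # 2^20 > 999: random draws
dim(W_small)
dim(W_large)
```

With full enumeration every distinct weight vector appears exactly once, so the resulting test is deterministic.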
- R
Published by s3alfisc over 4 years ago
fwildclusterboot - fwildclusterboot 0.5
- Version 0.5 fixes an error for the bootstrap with weighted least squares introduced with version 0.4. All unit tests that compare fwildclusterboot with weighted least squares results from boottest.stata pass. In particular, enumerated cases pass with exact equality (in such cases, the bootstrap weights matrices are exactly identical in both R and Stata).
- `boottest()` now stops if `fixest::feols()` deletes non-NA values (e.g. singleton fixed effects deletion) and asks the user to delete such rows prior to estimation via `feols()` & `boottest()`. Currently, `boottest()`'s pre-processing cannot handle such deletions - this remains future work.
- To align `fwildclusterboot` with Stata's boottest command (Roodman et al, 2019), Mammen weights are no longer enumerated in `fwildclusterboot::boottest()`.
- `boottest()` no longer sets an internal seed (previously set.seed(1)) if no seed is provided as a function argument.
- Sampling of the bootstrap weights is now powered by the `dqrng` package, which speeds up the creation of the bootstrap weights matrix. To set a "global" seed, one now has to use the `dqset.seed()` function from the `dqrng` package, which is added as a dependency.
- R
Published by s3alfisc over 4 years ago
fwildclusterboot - fwildclusterboot 0.4
fwildclusterboot 0.4
- New feature I: `boottest()` now allows for univariate tests that involve multiple variables. E.g. one can now test hypotheses such as `var1 + var2 = c` where c is a scalar. More details on the syntax can be found in the vignette. All methods for objects of class `boottest` have been updated.
- New feature II: `boottest()` now also supports "equal-tailed" p-values and one-sided hypotheses. For one-sided tests, confidence intervals are currently not supported.
- Internal changes: To allow for multivariable tests, the `boot_algo2()` function has slightly been modified. `invert_p_val2()` is superseded by `invert_p_val()`.
- Further, a CRAN error is fixed - two tests for exact equality failed with relative difference e-03 on openBLAS. In consequence, all exact tests are set to reltol = 1e-02.
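To clarify how the p-value types mentioned above differ, here is a small base-R sketch that computes them from a vector of bootstrapped t-statistics. The function `boot_p_values()` and its element names are hypothetical, and the formulas follow a common textbook convention rather than the package's exact code.

```r
# Two-tailed, equal-tailed and one-sided p-values from bootstrap t-statistics
boot_p_values <- function(t_obs, t_boot) {
  c(
    two_tailed   = mean(abs(t_boot) > abs(t_obs)),
    equal_tailed = 2 * min(mean(t_boot < t_obs), mean(t_boot > t_obs)),
    upper_tail   = mean(t_boot > t_obs),   # share of draws above the observed statistic
    lower_tail   = mean(t_boot < t_obs)    # share of draws below the observed statistic
  )
}

t_boot <- c(-2, -1, 0, 1, 2, 3)
p <- boot_p_values(t_obs = 1.5, t_boot = t_boot)
p
```

Without ties, the two one-sided tail probabilities sum to one, which is why reporting the wrong tail turns a p-value of `p` into `1 - p`.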
- R
Published by s3alfisc over 4 years ago
fwildclusterboot - fwildclusterboot 0.3.7
- Bug fix: the output of `boottest()` varied depending on the class of the input fixed effects for regressions both via `lfe::felm()` and `fixest::feols()`. This bug occurred because `boottest()` does not work with a pre-processed model.frame object from either `felm()` or `feols()` but works with the original input data. While both `felm()` and `feols()` change non-factor fixed effects variables to factors internally, `boottest()` did not check but implicitly assumed that all fixed effects used in the regression models are indeed factors in the original data set. As a consequence, if one or more fixed effects were e.g. numeric, `boottest()` would produce incorrect results without throwing an error. With version 0.3.7, `boottest()` checks internally if all variables in the original data set which are used as fixed effects are factor variables and if not, changes them to factors. Thanks to timotheedotc for raising the issue on github, which can be found here: https://github.com/s3alfisc/fwildclusterboot/issues/14.
- Some tests have been added that compare output from `boottest()` with the wild cluster bootstrap implemented via `clusterSEs`.
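The consequence of treating a numeric fixed effect as a plain numeric column can be seen in a small base-R sketch. The data and the variable names (`dat`, `fe`, `fe_vars`) are made up for illustration, and the conversion loop only mirrors the idea of the fix, not the package's actual code.

```r
set.seed(7)
dat <- data.frame(
  y  = rnorm(60),
  x  = rnorm(60),
  fe = rep(1:3, each = 20)          # fixed effect stored as a numeric variable
)

# A numeric 'fe' enters as one linear term; a factor 'fe' yields one dummy per level
k_numeric <- ncol(model.matrix(y ~ x + fe, data = dat))          # intercept, x, fe
k_factor  <- ncol(model.matrix(y ~ x + factor(fe), data = dat))  # intercept, x, two dummies
c(k_numeric = k_numeric, k_factor = k_factor)

# Convert every fixed effect column that is not yet a factor
fe_vars <- "fe"
for (v in fe_vars) {
  if (!is.factor(dat[[v]])) dat[[v]] <- factor(dat[[v]])
}
is.factor(dat$fe)
```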
- R
Published by s3alfisc over 4 years ago
fwildclusterboot - fwildclusterboot 0.3.6
- small fixes of CRAN bugs related to packages in "suggests"
- R
Published by s3alfisc over 4 years ago
fwildclusterboot - fwildclusterboot 0.3.5
- Bug fix: For Rademacher and Mammen weights and cases where (2^ number of clusters) < # bootstrap iterations, (deterministic) full enumeration should have been employed for sampling the bootstrap weights. If (2^ number of clusters) < # bootstrap iterations, `boottest()` would internally overwrite the user-provided number of bootstrap iterations to B := (2^ number of clusters). The bug occurred because the bootstrap weights were drawn randomly and with replacement instead of using full enumeration and drawing all distinct draws exactly once. Thanks to fschoner for finding the bug! See github issue #11.
- Bug fix: A small bug has been fixed related to missing values in the cluster variables.
- By default, `boottest()` now sets an internal seed if no seed is provided by the user via the `seed` function argument.
- Several improvements to the documentation.
- R
Published by s3alfisc over 4 years ago
fwildclusterboot - fwildclusterboot 0.3.4
- fixes a small bug in the vignette that caused errors on CRAN
- R
Published by s3alfisc almost 5 years ago
fwildclusterboot - fwildclusterboot 0.3.3
- implements full enumeration for Rademacher & Mammen Weights if 2^k < B, where k is the number of clusters and B the number of bootstrap iterations
- R
Published by s3alfisc almost 5 years ago
fwildclusterboot - fwildclusterboot 0.3.2
- minor bug fixes in unit tests
- R
Published by s3alfisc almost 5 years ago
fwildclusterboot - fwildclusterboot v0.2.0
boottest() now supports the following functionality:
- tests of univariate hypotheses for regression models of type felm, fixest, lm
- multidimensional clustering, regression weights, fixed effects, subcluster bootstrap,
- all either with the null imposed on the bootstrap distribution (WCR) and with no null imposed (WCU)
- R
Published by s3alfisc about 5 years ago