Thanks for your question!
fwildclusterboot::boottest() only supports OLS models, so running it on a Poisson regression should in fact throw an error. I will have to add an error condition for that :)
The error that you observe
Error in if (!is.numeric(lower) || !is.numeric(upper) || lower >= upper) stop("lower < upper is not fulfilled") :
missing value where TRUE/FALSE needed
stems from the numerical root-finding procedure used to compute confidence intervals, and I believe it is a direct result of fwildclusterboot not supporting Poisson regression.
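The message itself is what R raises whenever an if () condition evaluates to NA. A minimal base-R sketch that reproduces it with the same check, assuming the root finder was handed missing bounds (how boottest() actually ends up there is an assumption, not something verified against the package source):

```r
# Hypothetical bounds: assume the search interval came out as missing values
lower <- NA_real_
upper <- NA_real_

msg <- tryCatch(
  {
    # Same condition as in the reported error: NA >= NA is NA, so if () fails
    if (!is.numeric(lower) || !is.numeric(upper) || lower >= upper) {
      stop("lower < upper is not fulfilled")
    }
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

Note that the stop() call inside the check is never reached: R fails while evaluating the condition, which is why the reported message is not the "lower < upper" one.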
The memory problems in both boottest and fwildclusterboot arise either because
- the model you are fitting is very big. fwildclusterboot only accepts one fixed effect; all other factor variables specified in fixest are translated to dummies, so the design matrix passed to boottest() might be very large. In fact, if you do not use the fe argument of fwildclusterboot::boottest(), all fixed effects specified in feols() are translated to dummies and no fixed effect is projected out within the bootstrap. You can check whether this is the root of your error by running your regression via lm() or glm() (or a similar command in Stata) and seeing whether these estimations fail due to memory as well.
- boottest and fwildclusterboot are fully vectorized, so both compute a weights matrix v of dimension G x B, where G is the number of clusters and B the number of bootstrap iterations. If both G and B are large, this consumes quite a bit of memory! Stata's boottest has an option, matsize, that aims to help in such a situation. I quote from the documentation:
"matsize(#) limits the memory demand of the G × B matrix v∗ to prevent caching of virtual
memory to disk. The limit is specified in gigabytes; e.g., matsize(8) would limit the memory
demand to 8GB. Note that this option does not limit the actual size of v∗. Instead, it forces
boottest to break the matrix into chunks no larger than the limit, and then create and destroy
each chunk in turn"
So I would suggest that you try the matsize option in boottest and see whether your error is due to a large weights matrix.
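A rough way to judge whether the weights matrix could be the problem is to compute its size before running anything. The sketch below uses made-up values for G and B and assumes each weight is stored as an 8-byte double; the actual storage in either package may differ:

```r
# Hypothetical problem size; plug in your own values
G <- 5000     # number of clusters
B <- 99999    # number of bootstrap iterations

# Assumption: each weight is stored as an 8-byte double
weights_gib <- G * B * 8 / 1024^3
weights_gib   # about 3.73

# Chunks needed to keep each piece of v under a cap, in the spirit of matsize()
# (illustration only, not boottest's actual code)
cap_gib  <- 1
n_chunks <- ceiling(weights_gib / cap_gib)
n_chunks      # 4
```

If the number you get is close to or above your available RAM, the weights matrix is a plausible culprit, and lowering B or chunking via matsize should help.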
Memory is a known issue for fwildclusterboot, and improving memory performance is work in progress.
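To gauge whether dummy expansion is the culprit (the first explanation above), it can help to look at how many columns the factor variables add to a dense design matrix. A minimal base-R sketch with made-up data; the data frame, the variable names, and the "full-size" numbers are all hypothetical:

```r
# Small toy data set with two factor variables
n <- 1000
set.seed(1)
dat <- data.frame(
  y   = rpois(n, lambda = 2),
  x   = rnorm(n),
  fe1 = factor(rep_len(1:50, n)),   # 50 levels
  fe2 = factor(rep_len(1:20, n))    # 20 levels
)

# Dense design matrix as lm() / glm() would build it:
# intercept + x + 49 dummies for fe1 + 19 dummies for fe2
X <- model.matrix(y ~ x + fe1 + fe2, data = dat)
dim(X)                                 # 1000 x 70
format(object.size(X), units = "MB")

# Back-of-the-envelope size for a hypothetical full-size problem:
# rows x columns x 8 bytes per double, in GiB
n_full <- 5e6
k_full <- 2000
full_gib <- n_full * k_full * 8 / 1024^3
full_gib                               # about 74.5
```

The point of the arithmetic is that the dense matrix grows with the total number of factor levels, so a model that fixest handles comfortably can still be far too large once every fixed effect becomes a set of dummy columns.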
Last, there is also a new Julia implementation of the fast wild cluster bootstrap algorithm, WildBootTests.jl, that supports ML-based models and is, in my experience, less memory-demanding than fwildclusterboot.
Update 1
See also this discussion on using boottest and ppmlhdfe.
Update 2
If you want to run the wild cluster bootstrap because you are worried that your number of clusters is low and your standard errors might be biased, an alternative is to try a degrees-of-freedom correction as implemented for glm() in the clubSandwich package. I have to admit, though, that I am not sure how well the implemented corrections work for Poisson regression.
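As a rough illustration of what that could look like, here is a sketch on simulated data. It assumes the clubSandwich package is installed and uses its coef_test() function with the CR2 adjustment and Satterthwaite degrees of freedom; the data, variable names, and cluster count are made up, and the caveat about Poisson models above still applies:

```r
library(clubSandwich)

# Simulated data; all names and sizes here are hypothetical
set.seed(42)
G <- 12                                    # deliberately few clusters
dat <- data.frame(g = rep(seq_len(G), each = 25))
dat$x <- rnorm(nrow(dat))
u <- rnorm(G, sd = 0.3)                    # cluster-level heterogeneity
dat$y <- rpois(nrow(dat), lambda = exp(0.2 + 0.5 * dat$x + u[dat$g]))

# Ordinary Poisson regression via glm()
fit <- glm(y ~ x, family = poisson(), data = dat)

# Cluster-robust tests with the CR2 small-sample adjustment
res <- coef_test(fit, vcov = "CR2", cluster = dat$g, test = "Satterthwaite")
print(res)
```

The reported degrees of freedom give a sense of how much information the clusters actually carry; values far below the number of clusters suggest that conventional cluster-robust inference would be too optimistic.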