I need to fit GLMs on data that doesn't fit into my computer's memory. My usual workaround is to sample the data, fit the model on that sample, and then test it on a different sample from the data that was left out of memory. This has been R's major limitation for me, and it is why I have preferred SAS for fitting GLMs: it doesn't stumble on data that doesn't fit into memory.
I've been trying to find ways to solve this with R on my local machine, and I'd like to know whether sparklyr can get around the memory issue. I realise Spark is meant to be used in a cluster environment, but put plainly: can sparklyr be used to work with data on my local machine that would otherwise not fit into its memory?