We have some large datasets to process, so we are renting a virtual machine with plenty of RAM (128-256GB) in the hope that the whole job could be done in RAM. However, we have just found that RStudio is writing 32GB temp files to the VM's very slow hard drive.
Is there any way we can stop RStudio writing anything to disk at all?
We are using dplyr verbs to run the query through bigrquery:
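library(DBI)
library(dplyr)

# project_id, Qdate1 and Qdate00 are assumed to be defined earlier in the script
dataset_name <- "MOT"
con <- dbConnect(
  bigrquery::bigquery(),
  project = project_id,
  dataset = dataset_name,
  billing = project_id
)

# Lazy reference to the remote table; nothing is downloaded at this point
tests.con <- tbl(con, "tests")

# The select/filter steps run in BigQuery; collect() pulls the result into R
tests <- tests.con %>%
  select(vehicleId, make, model, firstUsedDate, fuelType,
         registrationDate, manufactureDate, completedDate,
         testResult, odometerValue, odometerUnit) %>%
  filter(completedDate < as.POSIXct(Qdate1)) %>%
  filter(completedDate >= as.POSIXct(Qdate00)) %>%
  filter(model != "") %>%
  collect()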
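To narrow down where the files are coming from, it may help to check which directory the R session uses for temporary files and what is sitting in it while collect() runs. The snippet below is a minimal sketch using only base R. It assumes the 32GB files land in the session's temp directory, which has not been confirmed here.

```r
# Directory this R session uses for temporary files
tmp <- tempdir()
print(tmp)

# Everything currently under the session temp directory
files <- list.files(tmp, full.names = TRUE, recursive = TRUE)

# Size and modification time of each file, largest first
info <- file.info(files)
info <- info[order(-info$size), c("size", "mtime")]
print(head(info, 10))

# Environment variables R consults at startup to choose the temp location
print(Sys.getenv(c("TMPDIR", "TMP", "TEMP")))
```

tempdir() is fixed when R starts and is taken from the TMPDIR, TMP or TEMP environment variables, so moving it means setting one of those (for example in .Renviron) before the session is launched. Pointing it at a RAM-backed location such as /dev/shm on a Linux VM is an assumption to test, not a confirmed fix.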
Thanks, Tim