I am running a 64-bit R/RStudio on a 64-bit Windows 10. Pc has 16GB of RAM and runs on 8-cores.
So RStudio crashes at around 1.6/7 GB of memory utilization while reading a larger dataset.
So I'm trying to use the parallel package to execute the operation with multiple cores. But somewhere I'm making a mistake.
library("data.table")
library("lubridate")
library("parallel")
library("foreach")
library("doParallel")
cl <- makeCluster(detectCores() - 1)
registerDoParallel(cl, cores = detectCores() - 2)
files = list.files(pattern="public")
myfiles = do.call(rbind, lapply(files, function(x) fread(x, colClasses=c(ID="character"))))
I don't have much experience with parallel processing.
Can you please let me know where am I getting it wrong?
Update:
R has no problem creating a 8gb object in memory.
bigint <- integer(2^32 / 2)
Still not sure what is limiting the reading of the data.
Update 2:
I made a diagnostic report. These are the errors that I'm getting.
24 Jan 2019 23:10:33 [rdesktop] ERROR system error 231 (All pipe instances are busy); OCCURRED AT: virtual void rstudio::core::http::NamedPipeAsyncClient::connectAndWriteRequest() C:/Users/Administrator/rstudio/src/cpp/core/include/core/http/NamedPipeAsyncClient.hpp:84; LOGGED FROM: void rstudio::desktop::NetworkReply::onError(const rstudio::core::Error&) C:\Users\Administrator\rstudio\src\cpp\desktop\DesktopNetworkReply.cpp:288
24 Jan 2019 23:11:47 [rdesktop] ERROR system error 231 (All pipe instances are busy); OCCURRED AT: virtual void rstudio::core::http::NamedPipeAsyncClient::connectAndWriteRequest() C:/Users/Administrator/rstudio/src/cpp/core/include/core/http/NamedPipeAsyncClient.hpp:84; LOGGED FROM: void rstudio::desktop::NetworkReply::onError(const rstudio::core::Error&) C:\Users\Administrator\rstudio\src\cpp\desktop\DesktopNetworkReply.cpp:288
24 Jan 2019 23:11:47 [rdesktop] ERROR system error 231 (All pipe instances are busy); OCCURRED AT: virtual void rstudio::core::http::NamedPipeAsyncClient::connectAndWriteRequest() C:/Users/Administrator/rstudio/src/cpp/core/include/core/http/NamedPipeAsyncClient.hpp:84; LOGGED FROM: void rstudio::desktop::NetworkReply::onError(const rstudio::core::Error&) C:\Users\Administrator\rstudio\src\cpp\desktop\DesktopNetworkReply.cpp:288
24 Jan 2019 23:13:39 [rdesktop] ERROR system error 231 (All pipe instances are busy); OCCURRED AT: virtual void rstudio::core::http::NamedPipeAsyncClient::connectAndWriteRequest() C:/Users/Administrator/rstudio/src/cpp/core/include/core/http/NamedPipeAsyncClient.hpp:84; LOGGED FROM: void rstudio::desktop::NetworkReply::onError(const rstudio::core::Error&) C:\Users\Administrator\rstudio\src\cpp\desktop\DesktopNetworkReply.cpp:288
24 Jan 2019 23:13:40 [rdesktop] ERROR system error 231 (All pipe instances are busy); OCCURRED AT: virtual void rstudio::core::http::NamedPipeAsyncClient::connectAndWriteRequest() C:/Users/Administrator/rstudio/src/cpp/core/include/core/http/NamedPipeAsyncClient.hpp:84; LOGGED FROM: void rstudio::desktop::NetworkReply::onError(const rstudio::core::Error&) C:\Users\Administrator\rstudio\src\cpp\desktop\DesktopNetworkReply.cpp:288
24 Jan 2019 23:13:41 [rdesktop] ERROR system error 232 (The pipe is being closed); OCCURRED AT: void rstudio::core::http::AsyncClient<SocketService>::handleWrite(const rstudio_boost::system::error_code&) [with SocketService = rstudio_boost::asio::windows::basic_stream_handle<>] C:/Users/Administrator/rstudio/src/cpp/core/include/core/http/AsyncClient.hpp:342; LOGGED FROM: void rstudio::desktop::NetworkReply::onError(const rstudio::core::Error&) C:\Users\Administrator\rstudio\src\cpp\desktop\DesktopNetworkReply.cpp:288
24 Jan 2019 23:13:42 [rdesktop] ERROR system error 2 (The system cannot find the file specified); OCCURRED AT: virtual void rstudio::core::http::NamedPipeAsyncClient::connectAndWriteRequest() C:/Users/Administrator/rstudio/src/cpp/core/include/core/http/NamedPipeAsyncClient.hpp:84; LOGGED FROM: void rstudio::desktop::NetworkReply::onError(const rstudio::core::Error&) C:\Users\Administrator\rstudio\src\cpp\desktop\DesktopNetworkReply.cpp:288
24 Jan 2019 23:13:42 [rdesktop] ERROR system error 2 (The system cannot find the file specified); OCCURRED AT: virtual void rstudio::core::http::NamedPipeAsyncClient::connectAndWriteRequest() C:/Users/Administrator/rstudio/src/cpp/core/include/core/http/NamedPipeAsyncClient.hpp:84; LOGGED FROM: void rstudio::desktop::NetworkReply::onError(const rstudio::core::Error&) C:\Users\Administrator\rstudio\src\cpp\desktop\DesktopNetworkReply.cpp:288