I mainly source my data by reading CSV files in a Jupyter Notebook. Everything works fine, except that I use Python over a company VPN connection to remote clusters. When the VPN disconnects, I have to re-read the CSV files even if the computer has stayed on. Obviously, if the connection is lost, there is no way for Jupyter Notebook to maintain its connection to the kernel. Is there, then, some way to read a CSV into a DF once and save it permanently, like a SAS data set?
-
How big is the dataframe? – gmds Jun 12 '19 at 02:36
-
The CSV is ~28 GB, not too big. Reading it into a DF takes ~15 minutes. No big deal, but I have to wait 15 minutes each time. Storage is not a problem. – opt135 Jun 12 '19 at 02:48
-
Well, yes. If I repeat the read more than 5 times, the kernel tends to get killed, even if I ask it to restart and wipe the output. I can always ask an admin to clean it up, but... – opt135 Jun 12 '19 at 02:50
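One common approach to what the question asks is to read the CSV once and cache the DataFrame to local disk in a fast binary format, then reload it after a VPN drop or kernel restart. A minimal sketch, assuming pandas is available (the file names here are hypothetical stand-ins; the tiny DataFrame is a placeholder for the real 28 GB read):

```python
import pandas as pd

# In the real workflow this is the slow step over the VPN
# (hypothetical path):
#   df = pd.read_csv("remote_data.csv")
df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})  # stand-in

# Cache the DataFrame once to local disk. to_pickle preserves dtypes
# exactly and needs no extra dependencies; to_parquet (requires
# pyarrow or fastparquet) is more portable and usually smaller.
df.to_pickle("cached_df.pkl")

# After a VPN drop or kernel restart, reload in seconds instead of
# re-parsing the large CSV:
df2 = pd.read_pickle("cached_df.pkl")
assert df.equals(df2)
```

Loading a pickle or Parquet file is I/O-bound rather than parse-bound, so the reload is typically far faster than the original `read_csv`; one caveat is that pickle files should only be loaded from trusted sources.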