I want to run some Python functions on data stored in an '.RData' file. I am using the 'pyreadr' Python package to load it.
Here is an example of the R code:
library(data.table)
# Creating demo data frame
data <- data.table(x_time = c(Sys.time(),Sys.time()+1,Sys.time()+2))
data_missing <- data.table(x_time = c(Sys.time(),NA,NA))
# checking the classes
sapply(data,class)
sapply(data_missing,class)
# Storing the data in RData file
save(data, file = "test_data.RData")
save(data_missing, file = "test_missing_data.RData")
I store the two data frames in separate files because 'test_data.RData' loads successfully in Python, whereas 'test_missing_data.RData' gives an error.
Here is the Python code:
## Working demo
# import pyreadr
# result=pyreadr.read_r('test_data.RData')
# data=result['data']
# data.dtypes
# print(data)
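### Error in the code below
import pyreadr
result=pyreadr.read_r('test_missing_data.RData') # Error
# the object was saved under the name 'data_missing', so that is its key in the result
data=result['data_missing']
print(data.dtypes)
print(data)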
The error message is as follows:
C:\Users\Pawan\AppData\Local\R-MINI~1\envs\R-RETI~1\lib\site-packages\pandas\core\tools\datetimes.py:530: RuntimeWarning: invalid value encountered in multiply
  arr, tz_parsed = tslib.array_with_unit_to_datetime(arg, unit, errors=errors)
The error occurs when there are NA values in the data frame. Is there another way to load RData files in Python?
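To illustrate what I think is going on: R stores POSIXct values as double-precision seconds since the Unix epoch, and my assumption is that the NA entries reach Python as NaN, which then get multiplied during the unit conversion. Below is a minimal sketch of that conversion using only the standard library, with the missing values guarded explicitly. The helper name `posixct_to_datetimes` and the sample values are hypothetical and are not part of pyreadr or pandas.

```python
import math
from datetime import datetime, timezone


def posixct_to_datetimes(seconds):
    """Convert epoch seconds (floats) to UTC datetimes; NaN or None becomes None."""
    converted = []
    for value in seconds:
        if value is None or math.isnan(value):
            # Missing timestamp (R's NA, assumed to arrive as NaN): keep it missing
            converted.append(None)
        else:
            converted.append(datetime.fromtimestamp(value, tz=timezone.utc))
    return converted


# Hypothetical column resembling data_missing$x_time: one timestamp and two NAs
raw = [1600000000.0, float("nan"), float("nan")]
converted = posixct_to_datetimes(raw)
print(converted)
```

If this reading is right, the message is only a warning about arithmetic on NaN, and the missing entries would simply need to end up as missing timestamps on the Python side.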
Thank you for your time and help.