
When I read an RDS file in Python the following way:

import pyreadr
df = prreadr.read_r('/home/test.rds')

To turn df into a DataFrame, I tried:

pd.DataFrame(df) - I get the message "If using all scalar values, you must pass an index"

pd.DataFrame(df, columns = ['a','b','c']) - I get columns filled with NaN

pd.DataFrame.from_dict(df, orient = 'index', columns = ['a','b','c']) - I get the message "Shape of passed values is (1, 1), indices imply (1, 11)"

I want to read an RDS file in Python and turn it into a DataFrame.

M--
thomas
  • Does this answer your question? [Loading a .rds file in Pandas](https://stackoverflow.com/questions/40996175/loading-a-rds-file-in-pandas) – Meisam Mar 30 '23 at 05:10
  • As you can see from the link above, the `pyreadr` module currently does not support nested objects. – Meisam Mar 30 '23 at 05:12
  • You are already reading the rds file in Python as a pandas dataframe with `pyreadr.read_r('/home/test.rds')` - I'm assuming that `prreadr.read_r` was a typo introduced when you typed your question. [Here's](https://ofajardo.github.io/pyreadr/_build/html/index.html#pyreadr.pyreadr.read_r) the documentation on `pyreadr.read_r` – Marcelo Paco Mar 30 '23 at 05:13
  • `import pyreadr` `df = pyreadr.read_r('/home/test.rds')` The type of `df` is OrderedDict. Maybe I need to find a way to manipulate the OrderedDict format instead of turning it into a DataFrame? – thomas Mar 31 '23 at 01:40

1 Answer


When you use read_r, it returns a dictionary. You should first look at which keys are in that dictionary; each of its elements is a DataFrame. An RData file can hold multiple data frames, which is why the dictionary can have multiple elements. An RDS file holds only one object, so the dictionary has a single element with the key None. So do this (note the None in square brackets):

import pyreadr
# read_r returns an OrderedDict; for an .rds file the only key is None
df = pyreadr.read_r('/home/test.rds')[None]
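
If you would rather not hard-code the key, here is a minimal sketch of the "look at the keys first" step. The helper name `single_object` is hypothetical (it is not part of pyreadr), and the `OrderedDict` below is only a stand-in for what `read_r` returns for an .rds file, so the snippet runs without an actual file:

```python
from collections import OrderedDict
from typing import Any, Mapping


def single_object(result: Mapping[Any, Any]) -> Any:
    """Return the only object in a read_r-style result (hypothetical helper)."""
    keys = list(result.keys())
    if len(keys) != 1:
        # An RData file may hold several objects; pick one by name instead.
        raise ValueError(f"expected exactly one object, found keys: {keys}")
    return result[keys[0]]


# Stand-in for the result of reading an .rds file: one entry under the key None.
fake_result = OrderedDict({None: "stand-in for a DataFrame"})
print(list(fake_result.keys()))  # [None]
obj = single_object(fake_result)

# Assumed usage with the real file from the question:
# import pyreadr
# result = pyreadr.read_r('/home/test.rds')
# df = single_object(result)  # equivalent to result[None] for an .rds file
```

For an RData file with several objects the helper raises an error on purpose, so you notice that you have to choose one by its name.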
Otto Fajardo