-1

I'm new to R and correctly working a project refactoring code reading from csv files to from a database.

The work includes dumping the csv files to a Postgres database, and modify existing R scripts to ingest input data from the db tables instead of csv files for subsequent transformation

Right now I ran into an issue that the dataframe columns returned from dbGetQuery() have different modes and classes than the original dataframe from read_csv()

Since the data I'm reading in has hundreds of columns, it is not that convenient to explicitly specify the mode and class for each column.

Is there an easy way to make the dataframe with same schema as the old one, so I can apply existing code for data transformation on the dataframe

i.e when I run a comparison between the old dataframe and the new one from db, this is what I see

  ==================================
   VARIABLE  CLASS.(from csv)  CLASS.(from db)
  ----------------------------------
   col1     numeric           integer64
   col2     numeric           integer
   col3     numeric           integer
  ----------------------------------

1 Answers1

0

This won't be possible in general, because some SQL datatypes (e.g. DATE, TIMESTAMP, INTERVAL) have no equivalent in R, and the R data type factor has no equivalent in SQL. Depending on your R version, strings are automatically converted to factors, so it will at least be useful to import the data with stringsAsFactors=FALSE.

cdalitz
  • 1,019
  • 1
  • 6
  • 7
  • thanks. Guess there is no easy way to do this. Have to specify the data types one by one when dumping data to database. – For Work Nov 02 '20 at 00:18