R readr: get columns specification of existing data, not imported one?

Question

I have a dataset created in an R session that I want to 1) export as CSV and 2) save together with its readr-type column specification, stored separately. This will allow me to read the data back later with read_csv(), passing the specification saved in 2) as col_types.

Problem: a column specification (the spec attribute) is only available for data read with a read_* function. It does not seem possible to obtain one directly from a dataset created within R.

My workflow so far is:

1. Export the data: write_csv()
2. Read the specification from the exported file: spec_csv()
3. Save the column specification: write_rds()
4. Finally, read_csv(step_1, col_types = step_3)

But this is error-prone, as spec_csv() can get it wrong: it is only guessing, so when all values in a column are NA it has to assign an arbitrary class (character). Ideally, I would like to extract the column specification directly from the original dataset instead of writing and re-loading it. How can I do that? That is, how can I convert the classes of a data frame to a spec object?
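
To illustrate the guessing problem, here is a minimal sketch, assuming a toy data frame (the names df, y and path are made up for this example). A numeric column that contains only NA is written as bare NA values, so spec_csv() has nothing from which to recover the original type:

```r
library(readr)

# Toy data: y is numeric in R, but every value is missing
df <- data.frame(x = 1:3, y = rep(NA_real_, 3))

path <- tempfile(fileext = ".csv")
write_csv(df, path)

# Guess the specification from the file alone
guessed <- spec_csv(path)

class(df$y)      # the class in the R session: "numeric"
guessed$cols$y   # the guessed collector; which one depends on the readr
                 # version, but it is not col_double()
```

The guess for y cannot reflect the class the column had in R, which is why the round trip through the file loses information.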

Thanks!

Actual (inefficient) workflow:

library(tidyverse)

write_csv(iris, "iris.csv")

spec_csv("iris.csv") %>%
  write_rds("col_specs_path.rda")  

read_csv("iris.csv", col_types = read_rds("col_specs_path.rda"))
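
For comparison, here is a rough sketch of what I am after: building the specification from the column classes directly, without re-reading the file. The helpers collector_for() and spec_from_df() are hypothetical (they are not part of readr), and the class-to-collector mapping only covers a few common classes; it relies only on cols() and the col_*() constructors that readr exports.

```r
library(readr)

# Hypothetical helper (not part of readr): pick a readr collector
# for one column based on its R class. Only common classes are handled.
collector_for <- function(x) {
  if (is.factor(x)) {
    col_factor(levels = levels(x), ordered = is.ordered(x))
  } else if (inherits(x, "Date")) {
    col_date()
  } else if (inherits(x, "POSIXct")) {
    col_datetime()
  } else if (is.logical(x)) {
    col_logical()
  } else if (is.integer(x)) {
    col_integer()
  } else if (is.numeric(x)) {
    col_double()
  } else {
    col_character()  # fallback for anything not covered above
  }
}

# Hypothetical helper: assemble a col_spec from a data frame,
# passing one named collector per column to cols()
spec_from_df <- function(df) {
  do.call(cols, lapply(df, collector_for))
}

iris_spec <- spec_from_df(iris)

path <- tempfile(fileext = ".csv")
write_csv(iris, path)
iris_back <- read_csv(path, col_types = iris_spec)
```

With this, iris_spec could be saved with write_rds() in place of the output of spec_csv(), so step 2 of the workflow no longer depends on guessing.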

Can you A) say what situations allow `spec_csv` to "get it wrong", and B) post an example where this actually happens? — IRTFM, Mar 23 '17 at 22:33
Sure, I added a discussion of this, although it is not really the main point of the post. Having to run spec_csv() on a file can also be slow. — Matifou, Mar 23 '17 at 22:57
