Is it possible with the readr package to read in data and specify a single data type for all columns? Something similar to `base::read.table` with `colClasses = "character"`, or to using the `as.is` argument.

Unless the task, data headers, file encoding, etc. are well defined ahead of the analysis, I prefer to write my loaders without changing the data types and then process the schema later downstream. I'm always open to suggestions on how other folks think about this.

joran
gbartusk
  • Something that important is probably in the documentation. – Rich Scriven Jul 30 '15 at 21:43
  • To set all the columns to the same type, read 1 line with `n_max = 1` to get the number of columns, then read the whole thing with `col_types = paste(rep("c", ncol), collapse = "")` (for "character"). – Gregor Thomas Jul 30 '15 at 21:45
  • I think asking if **readr** has the ability to recycle column specifications in the same way as `colClasses` in `read.table` is pretty reasonable. I don't see a compact, direct way to do it, but I think it would make a good feature request on github. – joran Jul 30 '15 at 21:52
  • Thanks, yes, I checked the docs as well as the col_types.R file on github, but nothing jumped out at me. I agree there are certainly workarounds, but I would probably just default back to base functions rather than doing any sort of "trickery" (for reasons of maintainability). – gbartusk Jul 30 '15 at 21:55
  • FWIW, I sort of doubt that Hadley would like the idea of a fully recycled column specification, but he might be open to adding a compact, direct way to tell **readr** to just read everything as character (or another single type). – joran Jul 30 '15 at 22:09
  • In principle, you don't need to worry about `col_types` because they will be imputed from the first 30 rows of the input. For details, check `?read_table`. In case you really want to specify a single data type across all columns, Gregor's answer is pretty good. – rafa.pereira Sep 28 '15 at 15:44

2 Answers

As of readr 0.2.2, we can do something like this to read a CSV with all columns as character:

    read_csv("path/to/file", col_types = cols(.default = col_character()))
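
Since the question mentions handling the schema downstream, here is a minimal sketch of how that workflow might look, assuming a readr version that supports `.default`. The sample file, its column names (`id`, `value`, `flag`), and the variable names are hypothetical and only there to make the example self-contained. `type_convert()` is readr's helper for re-parsing character columns of an existing data frame.

```r
library(readr)

# Hypothetical sample file, written to a temp path so the sketch is self-contained.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,value,flag", "a1,3.5,TRUE", "b2,4.25,FALSE"), tmp)

# Loader step: every column comes back as character, with no type guessing.
raw <- read_csv(tmp, col_types = cols(.default = col_character()))

# Downstream step, option 1: let readr re-guess types from the character columns.
typed <- type_convert(raw)

# Downstream step, option 2: declare the schema explicitly,
# keeping character as the fallback for anything not listed.
typed_explicit <- type_convert(
  raw,
  col_types = cols(
    .default = col_character(),
    value = col_double(),
    flag = col_logical()
  )
)
```

Keeping the loader type-agnostic this way means the schema decision lives in one place (the `type_convert()` call) and can be changed without touching the read step.
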
joran

Converting my comment to an answer. No, this isn't built in (at this point). The documentation for `col_types` is quite clear about its capabilities, and this isn't one of them. Given the way `col_types` works, implementing this would probably require a brand-new argument, since it is a feature that a "short" `col_types` will be used to restrict the number of columns read.

However, you could write a wrapper:

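    read_table_asis = function(...) {
        # Read a single row first, just to count the columns.
        n_cols = ncol(read_table(..., n_max = 1))
        # Re-read the whole file with one "c" (character) per column.
        read_table(..., col_types = paste(rep("c", n_cols), collapse = ""))
    }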
Gregor Thomas
  • Ended up going with this solution, despite my reluctance, since it was fewer keystrokes than writing out all the read.table options... thanks! – gbartusk Aug 27 '15 at 15:35