I have a tibble (data.frame) that I need to apply a number of type updates to. I have a readr::col_spec object that describes the desired types, but since the data does not originate as a CSV file, I cannot use read_csv(..., col_types=cspec) to apply the changes to the specified columns.
Since col_spec is a data structure designed exactly to specify desired data types, I would nevertheless like to use it directly as an input to a function that applies the changes for me, rather than writing a long custom script to convert each column individually. See the following example:
library(tidyverse)
# Subset starwars to get sw (comparable to my input data)
sw <- starwars %>%
  select(name, height, ends_with("_color")) %>%
  slice(c(1, 4, 5, 19))
sw
#> # A tibble: 4 × 5
#>   name           height hair_color skin_color eye_color
#>   <chr>           <int> <chr>      <chr>      <chr>
#> 1 Luke Skywalker    172 blond      fair       blue
#> 2 Darth Vader       202 none       white      yellow
#> 3 Leia Organa       150 brown      light      brown
#> 4 Yoda               66 white      green      brown
# The col_spec that I have
cspec <- cols(
  hair_color = col_factor(c("brown", "blond", "white", "none")),
  skin_color = col_factor(c("green", "light", "fair", "white")),
  eye_color = col_factor(c("blue", "brown", "yellow"))
)
# I would like to apply the col_spec directly to sw
# A not so great workaround is to use a tempfile
tf <- tempfile()
sw %>% write_csv(tf)
sw_fct <- read_csv(tf, col_types=cspec)
# This is more or less the result I am after.
# But note how type info on other columns is lost in the roundtrip:
# height was <int> in sw and comes back as <dbl>
sw_fct
#> # A tibble: 4 × 5
#>   name           height hair_color skin_color eye_color
#>   <chr>           <dbl> <fct>      <fct>      <fct>
#> 1 Luke Skywalker    172 blond      fair       blue
#> 2 Darth Vader       202 none       white      yellow
#> 3 Leia Organa       150 brown      light      brown
#> 4 Yoda               66 white      green      brown
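One possibility, offered as a sketch rather than a definitive answer: readr's type_convert() accepts a cols() specification through its col_types argument and re-parses only the character columns of an existing data frame. Assuming the columns named in cspec are still character (as they are in sw), this might avoid the tempfile roundtrip and leave height untouched as an integer. The sw tibble is rebuilt by hand here (an assumption for self-containment) so the snippet does not depend on the row order of starwars.

```r
library(readr)
library(tibble)

# Hand-built copy of the `sw` tibble shown above (assumption: same values,
# typed as in the first printout, with height stored as integer)
sw <- tibble(
  name       = c("Luke Skywalker", "Darth Vader", "Leia Organa", "Yoda"),
  height     = c(172L, 202L, 150L, 66L),
  hair_color = c("blond", "none", "brown", "white"),
  skin_color = c("fair", "white", "light", "green"),
  eye_color  = c("blue", "yellow", "brown", "brown")
)

# The same col_spec as in the question
cspec <- cols(
  hair_color = col_factor(c("brown", "blond", "white", "none")),
  skin_color = col_factor(c("green", "light", "fair", "white")),
  eye_color  = col_factor(c("blue", "brown", "yellow"))
)

# Re-parse the character columns in place according to the spec.
# Non-character columns (here: height) are not touched by type_convert().
sw_fct <- type_convert(sw, col_types = cspec)

# Inspect the resulting column classes
sapply(sw_fct, class)
```

Because type_convert() only looks at character columns, any entry in the spec that targets a non-character column would be ignored, so such columns would still need a separate conversion step.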