I'm trying to read in a fairly simple csv file, but readr is throwing an error when I try to specify the column types. Here's a small snippet of my data:

text <- "Item,Date,Time,SeizureTime,ET,OriginatingNumber,TerminatingNumber,IMEI,IMSI,CT,Feature,DIALED,FORWARDED,TRANSLATED,ORIG_ORIG,MAKE,MODEL,TargetNumber
3,10/31/2012,7:53:00,0:15,1:43,(123)555-1216,(123)555-5662,,,MO,[],,11235552511,,,,,(123)555-5662
4,10/31/2011,9:04:00,0:25,0:00,(123)555-0214,(123)555-5662,,,MO,[],,11235552511,,,,,(123)555-5662
9,10/31/2014,9:08:00,0:11,2:13,(123)555-8555,(132)555-5662,,,MO,[],,11235552511,,,,,(123)555-5662
12,10/31/2011,9:27:00,0:07,0:10,(123)555-0214,(123)555-5662,,,MO,[],,11235552511,,,,,(123)555-5662
13,10/31/2015,9:35:00,0:27,0:00,(123)555-0214,(123)555-5662,,,MO,[],,11235552511,,,,,(123)555-5662
16,10/31/2011,10:09:00,0:10,14:13,(123)555-1216,(123)555-5662,,,MO,[],,11235552511,,,,,(123)555-5662"

dat <- read.table(text = text, sep = ",", header = TRUE)

When I read the actual CSV with read_csv("file.csv", col_types = rep("c", times = 18)), I get the error mentioned in the title. I've seen a couple of SO questions about this error, and they all seem to point to the C++ code that runs in the background, but I don't know how to fix it. If I drop the col_types argument, the error goes away, but the data isn't parsed correctly.

tblznbits
@PierreLafortune When I drop `col_types` from the argument when reading in the actual file, I get this:

Warning: 200221 parsing failures.
row  col   expected  actual
  1 Time valid date 7:53:00
  2 Time valid date 9:04:00
  3 Time valid date 9:08:00
  4 Time valid date 9:27:00
  5 Time valid date 9:35:00
... .... .......... .......
See problems(...) for more details.

– tblznbits Jan 05 '16 at 15:50

That is not an error, just a warning. You can retrieve the missing column with `problems(read_csv(text))[,4]` – Pierre L Jan 05 '16 at 15:53

1 Answer

When you specified the column types you used rep("c", times=18), but that creates a character vector with 18 separate elements. According to the help page ?read_csv, the col_types argument takes a single string of column shortcuts, one character per column, like "ccccdc". So we paste the c's together to form one string:

read_csv(text, col_types=paste(rep("c", 18), collapse=""))
Source: local data frame [6 x 18]

   Item       Date     Time SeizureTime    ET
  (chr)      (chr)    (chr)       (chr) (chr)
1     3 10/31/2012  7:53:00        0:15  1:43
2     4 10/31/2011  9:04:00        0:25  0:00
3     9 10/31/2014  9:08:00        0:11  2:13
4    12 10/31/2011  9:27:00        0:07  0:10
5    13 10/31/2015  9:35:00        0:27  0:00
6    16 10/31/2011 10:09:00        0:10 14:13
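
For comparison, here is a minimal sketch of two other ways to get the same result, assuming a reasonably recent readr (the `.default` argument of `cols()` and the `I()` wrapper for literal data are not available in older releases) and base R 3.3.0 or later for `strrep()`. The `csv_text` sample is a hypothetical three-column stand-in for the real file, not the full data from the question.

```r
library(readr)

# Hypothetical stand-in for the real file: first three columns only.
csv_text <- "Item,Date,Time\n3,10/31/2012,7:53:00\n4,10/31/2011,9:04:00\n"

# rep() returns one element per repetition, not a single string.
length(rep("c", 18))   # 18

# strrep() builds the compact one-string form directly.
spec <- strrep("c", 3)
identical(spec, paste(rep("c", 3), collapse = ""))   # TRUE

# Option 1: pass the compact string, one letter per column.
dat_compact <- read_csv(I(csv_text), col_types = spec)

# Option 2: set a default type, so the column count does not need to be known.
dat <- read_csv(I(csv_text), col_types = cols(.default = col_character()))
```

The second option may be more convenient when the number of columns changes between files, since nothing has to be counted by hand.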
Pierre L