
Say I have a file test.csv whose only line looks like this:

,a,b,c,d,e

If I try to read it using read.csv, it works fine.

read.csv("test.csv",header=FALSE)
#  V1 V2 V3 V4 V5 V6
#1 NA  a  b  c  d  e
#Warning message:
#In read.table(file = file, header = header, sep = sep, quote = quote,  :
#  incomplete final line found by readTableHeader on 'test.csv'

However, if I attempt to read this file using fread, I get an error instead.

require(data.table)
fread("test.csv",header=FALSE)
#Error in fread("test.csv", header = FALSE) : 
#  Not positioned correctly after testing format of header row. ch=','

Why does this happen and what can I do to correct this?

user2100721
Wet Feet
  • I think this is a bug -- it was reported by @gsee here: https://r-forge.r-project.org/tracker/index.php?func=detail&aid=5413&group_id=240&atid=975 – Kevin Ushey Mar 12 '14 at 07:04
  • Thanks, so reverting to 1.8 would solve the problems for now, I suppose. – Wet Feet Mar 12 '14 at 07:08
  • @KevinUshey Seems like I cannot install v1.8.10 on R v3.03. Do you have any other suggestions? – Wet Feet Mar 12 '14 at 07:34
  • Wait 6 hours, I'm sure package authors will have a solution for you. – Roman Luštrik Mar 12 '14 at 08:02
  • Just want to add that I hope this is fixed soon. – Chris Mar 30 '14 at 16:21
  • @WetFeet, in [1.9.3](https://github.com/Rdatatable/data.table), it seems to work as `read.csv()`. If you'd like to not have that NA column, use the `select` argument as: `fread("test.csv", select=2:6, header=FALSE)`. – Arun Sep 06 '14 at 21:10

3 Answers

As for me, my problem was only that the first ? rows of my file had a missing ID value.

So I was able to solve the problem by specifying autostart to be sufficiently far into the file that a nonmissing value popped up:

fread("test.csv", autostart = 100L, skip = "A")

This guarantees that when fread attempts to automatically identify sep and sep2, it does so at a well-formatted place in the file.

Specifying skip also makes sure fread finds the correct row on which to base the column names.

If indeed there are no nonmissing values for the first field, you're better off just deleting that field from the .csv with Richard Scriven's approach or a find-and-replace in your favorite text editor.
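If you'd rather do that cleanup in R than in a text editor, here is a minimal base-R sketch. It assumes every line of the file starts with an empty first field (a leading comma), and the temporary file and its second data row are made up for illustration; they are not from the original test.csv.

```r
# Hypothetical stand-in for test.csv: every line begins with an empty field.
tmp <- tempfile(fileext = ".csv")
writeLines(c(",a,b,c,d,e", ",1,2,3,4,5"), tmp)

# Read the raw lines and strip the single leading comma from each one.
lines <- readLines(tmp)
cleaned <- sub("^,", "", lines)

# Overwrite the file with the cleaned lines and read it back.
writeLines(cleaned, tmp)
df <- read.csv(tmp, header = TRUE)
print(df)
```

This only makes sense when the first field is empty on every row; if some rows do have an ID value, the regex would leave those rows untouched and the columns would no longer line up.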

MichaelChirico

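I think you could use the skip/select/drop arguments of the fread function for this purpose. Note that skip works on rows, not columns, and that with header=FALSE the columns are named V1, V2, ..., so the first column is easiest to drop by position.

# skip="A" starts reading at the first line that contains the string "A"; it does not skip a column
fread("myfile.csv",sep=",",header=FALSE,skip="A")
# read only columns 2 to 5, i.e. everything except column 1 in a 5-column file
fread("myfile.csv",sep=",",header=FALSE,select=c(2,3,4,5))
# drop the first column by position
fread("myfile.csv",sep=",",header=FALSE,drop=1)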
Aayush Agrawal

I've tried making that csv file and running the code. It seems to work now. Is it the same for other people? I thought the problem might be the missing newline at the end of the file (hence the warning from read.csv), but fread copes fine whether or not there's a newline at the end.
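For anyone who wants to check this on their own installation, here is a small sketch of what I ran. It assumes a reasonably recent data.table; the temporary file names are just for illustration, and the select call is the one suggested in the comments on the question.

```r
library(data.table)

# Two copies of the one-line file: one without and one with a trailing newline.
no_newline <- tempfile(fileext = ".csv")
with_newline <- tempfile(fileext = ".csv")
cat(",a,b,c,d,e", file = no_newline)
cat(",a,b,c,d,e\n", file = with_newline)

# Read both; on a fixed version neither call should raise the error from the question.
dt1 <- fread(no_newline, header = FALSE)
dt2 <- fread(with_newline, header = FALSE)
print(dt1)
print(dt2)

# Skip the empty first column by selecting columns 2 to 6 only.
dt3 <- fread(no_newline, header = FALSE, select = 2:6)
print(dt3)
```

If either fread call still fails with the "Not positioned correctly" error, the installed data.table is probably one of the affected versions, and updating is the first thing to try.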

CJB