
The program I'm exporting my data from (PowerBI) saves the data as a .csv file, but the first line of the file is `sep=,` and the header (column names) is on the second line.

Sample fake .csv file:

sep=,
Initiative,Actual to Estimate (revised),Hours Logged,Revised Estimate,InitiativeType,Client
FakeInitiative1 ,35 %,320.08,911,Platform,FakeClient1
FakeInitiative2,40 %,161.50,400,Platform,FakeClient2

I'm using this command to read the file:

initData <- read.csv("initData.csv",
                     row.names = NULL,
                     header = TRUE,
                     stringsAsFactors = FALSE)

but I keep getting an error that there are the wrong number of columns (because it thinks the first line tells it the number of columns).

If I use `header=F` instead it loads, but when I then set `names(initData) <- initData[2,]` the names contain spaces and illegal characters, which breaks the rest of my program. Obnoxious.

Does anyone know how to tell R to ignore that first line? I can open the .csv file in a text editor and delete the first line manually before I load it (if I do that, everything works fine), but I have to export a bunch of files and doing this each time is tedious.

Any help would be much appreciated.


2 Answers


There are many ways to do that. Here's one:

all_content <- readLines("initData.csv")
skip_first_line <- all_content[-1]
initData <- read.csv(textConnection(skip_first_line),
                     row.names = NULL,
                     header = TRUE,
                     stringsAsFactors = FALSE)
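For what it's worth, `read.csv()` also has a `skip` argument that drops leading lines in one step. A minimal sketch, recreating the sample file from the question first so it runs on its own:

```r
# Recreate the sample export from the question, then read it with skip = 1
# so read.csv() ignores the "sep=," line entirely.
writeLines(c("sep=,",
             "Initiative,Actual to Estimate (revised),Hours Logged,Revised Estimate,InitiativeType,Client",
             "FakeInitiative1 ,35 %,320.08,911,Platform,FakeClient1",
             "FakeInitiative2,40 %,161.50,400,Platform,FakeClient2"),
           "initData.csv")

initData <- read.csv("initData.csv",
                     skip = 1,          # drop the "sep=," line before parsing
                     row.names = NULL,
                     header = TRUE,
                     stringsAsFactors = FALSE)
```

Note that with the default `check.names = TRUE`, headers like "Hours Logged" come back as syntactically valid names such as `Hours.Logged`, which avoids the illegal-character problem from the question.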
  • Excellent. It strikes me that I could also build in an `if` command to check that the first line is indeed `sep=,` before I skip it. Thank you. – seth127 Jun 24 '16 at 18:16
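A sketch of that `if` check, wrapped in a helper (`read_powerbi_csv` is an invented name) so it can be applied safely to each exported file whether or not it carries the `sep=,` line:

```r
read_powerbi_csv <- function(path) {  # hypothetical helper name
  all_content <- readLines(path)
  # Only drop the first line if it really is the PowerBI separator hint
  if (grepl("^sep=", all_content[1])) {
    all_content <- all_content[-1]
  }
  read.csv(textConnection(all_content),
           row.names = NULL, header = TRUE, stringsAsFactors = FALSE)
}

# Works the same with or without the "sep=," line
writeLines(c("sep=,", "A,B", "1,2"), "with_sep.csv")
writeLines(c("A,B", "1,2"), "without_sep.csv")
```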

Your file could be in a UTF-16 encoding. See hrbrmstr's answer on how to read a UTF-16 file.
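A minimal sketch, assuming the export really is UTF-16LE: the `fileEncoding` argument tells `read.csv()` how the bytes are encoded. The example writes its own small UTF-16LE file first so it is self-contained.

```r
# Write a tiny UTF-16LE file just to make the example reproducible
con <- file("initData16.csv", open = "w", encoding = "UTF-16LE")
writeLines(c("sep=,", "Initiative,Client", "FakeInitiative1,FakeClient1"), con)
close(con)

initData <- read.csv("initData16.csv",
                     fileEncoding = "UTF-16LE",  # re-encode on read
                     skip = 1,                   # still drop the "sep=," line
                     header = TRUE,
                     stringsAsFactors = FALSE)
```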
