What does the "More Columns than Column Names" error mean?

Question

I'm trying to read in a .csv file from the IRS and it doesn't appear to be formatted in any weird way.

I'm using the read.table() function, which I have used several times in the past but it isn't working this time; instead, I get this error:

data_0910<-read.table("/Users/blahblahblah/countyinflow0910.csv",header=T,stringsAsFactors=FALSE,colClasses="character")

Error in read.table("/Users/blahblahblah/countyinflow0910.csv",  : 
  more columns than column names

Why is it doing this?

For reference, the .csv files can be found at:

http://www.irs.gov/uac/SOI-Tax-Stats-County-to-County-Migration-Data-Files

(The ones I need are under the county to county migration .csv section - either inflow or outflow.)

Matthew Lundberg · Answer 1 · 2014-06-04T03:33:55.763

The file uses commas as separators, whereas read.table splits on whitespace by default. So you can either set sep="," or just use read.csv:

x <- read.csv(file="http://www.irs.gov/file_source/pub/irs-soi/countyinflow1011.csv")
dim(x)
## [1] 113593      9

The error is caused by spaces in some of the values, and unmatched quotes. There are no spaces in the header, so read.table thinks that there is one column. Then it thinks it sees multiple columns in some of the rows. For example, the first two lines (header and first row):

State_Code_Dest,County_Code_Dest,State_Code_Origin,County_Code_Origin,State_Abbrv,County_Name,Return_Num,Exmpt_Num,Aggr_AGI
00,000,96,000,US,Total Mig - US & For,6973489,12948316,303495582

There are also unmatched quotes, for example the apostrophe on line 1336 (row 1335), which will confuse read.table with its default quote argument (but not read.csv):

01,089,24,033,MD,Prince George's County,13,30,1040
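
A minimal sketch of the behaviour described above, using only the three sample lines quoted in this answer rather than the full IRS file. The variable names (sample_lines, err, fixed) are illustrative, and colClasses = "character" is carried over from the question to preserve the leading zeros.

```r
# Three lines copied from the answer: header, first row, and the row with an apostrophe.
sample_lines <- c(
  "State_Code_Dest,County_Code_Dest,State_Code_Origin,County_Code_Origin,State_Abbrv,County_Name,Return_Num,Exmpt_Num,Aggr_AGI",
  "00,000,96,000,US,Total Mig - US & For,6973489,12948316,303495582",
  "01,089,24,033,MD,Prince George's County,13,30,1040"
)

# Default read.table splits on whitespace: the header is a single field,
# but the first data row splits into several, which triggers the error.
err <- tryCatch(
  {
    read.table(text = sample_lines[1:2], header = TRUE)
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(err)

# read.csv splits on commas and treats only the double quote as a quote character,
# so neither the spaces nor the apostrophe cause trouble.
fixed <- read.csv(text = sample_lines, colClasses = "character")
print(dim(fixed))
```

Passing sep = "," together with quote = "\"" to read.table should give the same result, since those are the settings read.csv applies.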

@user3084629: There are also unmatched quotes. Read more carefully in ?read.table — IRTFM, Jun 04 '14 at 04:24

score 5 · Answer 2 · answered Jul 12 '18 at 17:38

You may have strange characters in your header row, such as #, %, -- or ,.
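
As an illustration of one such case, here is a small sketch with a "#" in the header. The two-line input (hash_text) is made up for the example and is not from the IRS file; it only covers "#", which read.table treats as a comment character by default.

```r
# Hypothetical four-column input whose second header name contains '#'.
hash_text <- c("id,pct#change,a,b", "1,2,3,4")

# With the default comment.char = "#", the header is cut off after "id,pct",
# so the data row has more fields than there are column names.
err <- tryCatch(
  {
    read.table(text = hash_text, header = TRUE, sep = ",")
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(err)

# Turning comment handling off lets the full header through.
ok <- read.table(text = hash_text, header = TRUE, sep = ",", comment.char = "")
print(names(ok))
```

read.csv already sets comment.char = "", which is one reason it tends to be more forgiving here.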

em_likefrom007

score 3 · Answer 3 · answered Sep 21 '16 at 10:42

For the Germans:

You have to change the decimal commas in your CSV file into full stops (in Excel: File -> Options -> Advanced -> "Decimal separator"); then the error is solved.
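
If editing the file is not convenient, a possible alternative is to tell R about the format instead. The sketch below assumes the file is semicolon-separated with decimal commas, which is what German-locale Excel typically writes but which this answer does not state; the sample data (de_text) is made up.

```r
# Hypothetical export: semicolon-separated fields with decimal commas.
de_text <- c("name;x;y", "a;1,5;2,5")

# read.csv splits on commas: the header becomes one field and the row three.
err <- tryCatch(
  {
    read.csv(text = de_text)
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(err)

# read.csv2 uses sep = ";" and dec = ",", so the numbers parse as 1.5 and 2.5.
ok <- read.csv2(text = de_text)
str(ok)
```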

Juschu

score 0 · Answer 4 · answered May 21 '20 at 17:18

Depending on the data (e.g. a .tsv extension), the file may use tabs as separators, so you may try sep = '\t' with read.csv.
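
A short sketch of that case, with made-up tab-separated input (tsv_text) in which one value happens to contain commas:

```r
# Hypothetical tab-separated content; "\t" is a tab character.
tsv_text <- c("id\tplace\tn", "1\tParis, Texas, USA\t3")

# With the default comma separator the header is one field,
# while the commas inside the value split the row into three.
err <- tryCatch(
  {
    read.csv(text = tsv_text)
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(err)

# Splitting on tabs recovers the three intended columns.
ok <- read.csv(text = tsv_text, sep = "\t")
print(ok)
```

read.delim is the base R wrapper that uses sep = "\t" by default, so it could be used in place of the second call.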

user773797

score 0 · Answer 5 · answered Nov 23 '20 at 16:49

This error can get thrown if your data frame has sf geometry columns.

jsta
