I got an error trying to import a CSV into R that has many duplicate columns. Is there a way to ignore those columns? That's easy to do for small files with a small number of columns, but mine is a big one: ~3k columns and 10M rows.
- What code were you running exactly and what was the exact error you are getting? I wouldn't think there's a problem reading a file even if it does have duplicate columns. – MrFlick Mar 20 '17 at 20:20
- readr::read_csv and data.table::fread are both big improvements over read.csv and read.table in base. Perhaps try them if the base functions are giving you sorrow. – russellpierce Mar 20 '17 at 21:41
2 Answers
Read in the first row, i.e. the column headers, with readLines. Use strsplit to parse it into a vector. Rename the duplicated elements. Then you can call read.csv with a col.names arg.
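A minimal sketch of those steps in base R, assuming a comma-separated file whose header contains no quoted commas. The small temporary file stands in for the real CSV, and the variable names are only illustrative. The last step goes slightly beyond renaming: it uses colClasses = "NULL" to skip the repeated columns, which is closer to what the question asks for.

```r
# Stand-in for the real file: a tiny CSV with a duplicated column name "x".
path <- tempfile(fileext = ".csv")
writeLines(c("id,x,x,y", "1,2,3,4", "5,6,7,8"), path)

# Read only the header line and split it into a character vector.
# Assumption: no column name contains a comma inside quotes.
header <- readLines(path, n = 1)
col_names <- strsplit(header, ",", fixed = TRUE)[[1]]
col_names <- gsub('^"|"$', "", col_names)  # strip surrounding quotes, if any

# TRUE for every repeat of a name that has already appeared.
dup <- duplicated(col_names)

# Rename duplicates (x, x.1, ...) so every column gets a unique name.
unique_names <- make.unique(col_names)

# Keep all columns, skipping the original header line.
df <- read.csv(path, header = FALSE, skip = 1, col.names = unique_names)

# Or drop the repeated columns while reading: "NULL" skips a column,
# NA lets read.csv pick the type as usual.
col_classes <- ifelse(dup, "NULL", NA)
df_nodup <- read.csv(path, header = FALSE, skip = 1,
                     col.names = unique_names, colClasses = col_classes)
```

Dropping by name keeps the first occurrence of each column and assumes that repeated names really do hold redundant data; if they might differ, the renamed version (df) is the safer starting point.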

russellpierce