I can't import a semicolon-delimited CSV file into R. The problem is that some columns contain text with special characters (such as semicolons), which results in an unequal number of columns in some rows.

Special characters are surrounded by quotes, like ";". The file is 2.3 GB. I can open the file correctly in Excel (at least part of it).

I tried readr, data.table, and base R, and all of them failed.

read_csv2("C:/PE_Omnibus_plik_płaski/omnibus_clean.csv")

I could do this in Notepad++, but would prefer R.

  • Please paste some example data; that will explain your problem better. – abhiieor Aug 18 '16 at 08:12
  • Hi, it's difficult to give sample data, because when I open the file in Excel, it opens correctly :). In this post: (http://stackoverflow.com/questions/9364739/how-to-format-data-in-a-csv-files-so-that-it-can-easily-be-imported-in-r/9365281#9365281) @John said that "You can violate the unique separator requirement if your entries are wrapped in quotes". I have a file like that, but can't open it properly in R. – Seweryn Grodny Aug 18 '16 at 09:22
  • @SewerynGrodny Did you try with different encoding options? – amrrs Aug 18 '16 at 09:32
  • @amrrs No, I didn't. I thought it was irrelevant in my case. Do you know how I can do this? – Seweryn Grodny Aug 18 '16 at 09:43
  • I had a similar issue recently, with unknown symbols in the text that R interpreted as column ends. Just playing with different encodings solved it: `read.csv("data.csv", encoding="UTF-8", stringsAsFactors=FALSE)`, and you can also try `fileEncoding="latin1"` – amrrs Aug 18 '16 at 10:14
  • @amrrs Thanks for that. I'm testing it now. But read.csv is quite slow (especially with 2.3 GB). Do you know how I could do this in `readr`? – Seweryn Grodny Aug 18 '16 at 12:06
  • @SewerynGrodny `fread` from `data.table` should help you improve the speed – amrrs Aug 18 '16 at 12:13
  • I tried a few options and the results are as follows: `read.csv("data.csv", encoding="UTF-8", stringsAsFactors=FALSE)` works fine, as does `read.csv("data.csv")`. But neither `read_csv2` nor `fread` does. Strange situation :) – Seweryn Grodny Aug 18 '16 at 20:21
  • If you can open it in Excel, just re-save it as a .csv and make sure to surround any text fields with quotation marks. – MichaelChirico Aug 22 '16 at 17:38
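
The suggestions in the comments above can be sketched in base R as follows. This is only a minimal illustration, not a tested fix for the original 2.3 GB file: the sample data, the temporary file, and the variable names (`path`, `n_fields`, `bad_lines`, `df`) are all hypothetical, and UTF-8 is an assumed encoding. It writes a tiny semicolon-delimited file in which one quoted field contains a semicolon, checks the per-line field counts with `count.fields`, and then reads the file with `read.csv2`.

```r
# Hypothetical sample: semicolon-delimited, with one quoted field that
# itself contains a semicolon.
path <- tempfile(fileext = ".csv")
writeLines(c(
  "id;comment;value",
  "1;\"plain text\";10",
  "2;\"text; with a semicolon\";20"
), path)

# Count the fields on every line, treating only the double quote as a
# quote character and disabling comment handling (read.csv2 does the same).
n_fields <- count.fields(path, sep = ";", quote = "\"", comment.char = "")

# Lines whose field count differs from the header are the suspicious ones.
bad_lines <- which(n_fields != n_fields[1])

# read.csv2 already defaults to sep = ";" and quote = "\""; the encoding
# is an assumption and may need to be changed (for example to "latin1").
df <- read.csv2(path, quote = "\"", stringsAsFactors = FALSE,
                fileEncoding = "UTF-8")
```

If `bad_lines` is not empty on the real file, those line numbers show where quoting breaks down, for example an unbalanced quote inside a text field. For speed, `data.table::fread(path, sep = ";", quote = "\"")` and `readr::read_delim(path, delim = ";", quote = "\"")` accept equivalent separator and quote arguments, though whether they parse this particular file correctly has not been checked here.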
