
I'm having trouble importing a csv file which looks like this:

"password","score"
"p@sswd123456",0
"amdk62",0
"august89",0
"19760124",0

The scores are between 0 and 100, and the passwords can contain anything imaginable. I even found something like

""12345,./"" 

which I assume is confusing R. The command I am trying is:

mydata = read.csv(file="passwordlist.csv", header=TRUE, quote="", sep=",")

and I do not get an error message. However, when I read the file, the second column is missing completely.

How can I import both columns?
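One way to check whether embedded commas are involved is to count the comma-separated fields on every line with quoting switched off, which is how `quote=""` makes R see the file. This is only a diagnostic sketch: `find_irregular_lines` is a hypothetical helper name, and it assumes every well-formed line has exactly two fields.

```r
# Hypothetical helper: report the line numbers whose field count differs
# from the expected two when quotes are NOT treated specially.
find_irregular_lines <- function(path, expected = 2L) {
  n <- count.fields(path, sep = ",", quote = "",
                    comment.char = "",          # passwords may contain '#'
                    blank.lines.skip = FALSE)   # keep indices aligned with line numbers
  which(is.na(n) | n != expected)
}

# Small demo file modelled on the sample above (the scores are made up).
demo <- tempfile(fileext = ".csv")
writeLines(c('"password","score"',
             '"p@sswd123456",0',
             '""12345,./"",7',
             '"amdk62",0'), demo)

bad <- find_irregular_lines(demo)
print(bad)
```

Any line reported here contains a comma inside the password (or is blank), so it cannot be split correctly on commas alone.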

Wirsiing
  • Try `quote = "\""`. – nya Jun 15 '16 at 09:49
  • Use single quotes to quote a double quote, and vice versa. – SabDeM Jun 15 '16 at 09:49
  • Both ' " ' and " \" " give me an EOF within quoted string error. – Wirsiing Jun 15 '16 at 09:54
  • That error probably means that there are mismatched quotes, or the last line is incomplete. Check the last line at least, is it complete? – Panagiotis Kanavos Jun 15 '16 at 09:55
  • How big is the file, does it open normally in Excel? If yes, maybe save 2 columns as 2 files, then read into R? – zx8754 Jun 15 '16 at 10:00
  • If you can open the file in Excel, save it as tab-delimited and then import into R. Passwords won't contain `\t`. – nya Jun 15 '16 at 10:02
  • @PanagiotisKanavos I checked the last line and it was fine. But somewhere in between there are some strange password attempts. – Wirsiing Jun 15 '16 at 11:04
  • @nya I will do that now. Tabs seem like a nice idea. If that does not work I will try to remove all quotes and see if that helps – Wirsiing Jun 15 '16 at 11:05
  • @Wirsiing what does "strange password attempts" mean? Tabs won't help at all if the data is bad, i.e. it contains unescaped double quotes inside double-quoted fields – Panagiotis Kanavos Jun 15 '16 at 11:09
  • UPDATE: Tabs did not help and neither did removing my quotes at the beginning of each password. Is it possible to let R escape the column? – Wirsiing Jun 15 '16 at 11:10
  • @PanagiotisKanavos It means the data is bad. Like in my example in the original question - everything seems to be possible – Wirsiing Jun 15 '16 at 11:11
  • @Wirsiing: a workaround that may work is using 'fread' , http://www.inside-r.org/packages/cran/data.table/docs/fread – Ruthger Righart Jun 15 '16 at 11:12
  • @RuthgerRighart package fread is not available for R3.0 unfortunately and since I am really new to R I do not know a workaround – Wirsiing Jun 15 '16 at 11:21
  • @nya I split them into 2 different files which I was able to read. Unfortunately, when I try `plot(pws, scores)`, the lengths differ, although the nrow of both is the same – Wirsiing Jun 15 '16 at 11:32
  • try `sep = "\n"` and deal with splitting the columns once inside? – Bryan Goggin Jun 15 '16 at 11:36
  • @Wirsiing: if you have a possibility to upgrade your R then do not hesitate. Using the data you provided `fread` works. An alternative is to open your .csv in Excel, select and copy your data cells, and then use `my.data <- read.clipboard()` from the `psych` package. This basically is a handy copy-paste function to bring data manually into R. – Ruthger Righart Jun 15 '16 at 12:00
  • @Wirsiing Check structure of the two variables `str()`. Is it what you expect? – nya Jun 15 '16 at 12:23
  • It finally worked using two separate imports. Thanks for your help everyone! – Wirsiing Jun 15 '16 at 13:05
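
The `sep = "\n"` idea from the comments (read whole lines, then split inside R) can be sketched in base R roughly as follows. This is a minimal sketch, not a tested solution for the real file: `read_password_csv` is a hypothetical helper name, and it assumes the first line is the header and every data line ends with a comma followed by an integer score. Because the split happens at the last comma, quotes and commas inside the password do not matter.

```r
# Hypothetical helper: read "<password>",<score> lines without relying on
# R's quote handling. Assumes a header line and an integer score at the
# end of every data line.
read_password_csv <- function(path) {
  lines <- readLines(path, warn = FALSE)
  lines <- lines[nzchar(lines)]   # drop empty lines
  lines <- lines[-1]              # drop the header line

  # Greedy ".*" consumes everything up to the LAST comma, so the capture
  # group holds only the trailing score. Lines that do not match the
  # pattern end up as NA (with a warning).
  score <- as.integer(sub("^.*,([0-9]+)[[:space:]]*$", "\\1", lines))

  # Remove the trailing ",<score>" and then one pair of enclosing quotes.
  password <- sub(",[0-9]+[[:space:]]*$", "", lines)
  password <- sub('^"(.*)"$', "\\1", password)

  data.frame(password = password, score = score, stringsAsFactors = FALSE)
}

# Small demo file modelled on the sample in the question (scores made up).
demo <- tempfile(fileext = ".csv")
writeLines(c('"password","score"',
             '"p@sswd123456",0',
             '""12345,./"",7',
             '"amdk62",100'), demo)

mydata <- read_password_csv(demo)
str(mydata)
```

Both columns come back with the same number of rows, so something like `plot(nchar(mydata$password), mydata$score)` should no longer complain about differing lengths.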
