
I am trying to read a CSV file in R with the read.transactions() function from the arules package.

When opened in Notepad++, the CSV file shows extra commas for every missing value, so I have to delete them manually before passing the file to read.transactions(). For example, the actual file looks like this in Notepad++:

D115,DX06,Slz,,,,
HC,,,,,,
DX06,,,,,,
DX17,PG,,,,,
DX06,RT,Dty,Dtcr,,

I want it to look like this when it is passed to read.transactions():

D115,DX06,Slz
HC
DX06
DX17,PG
DX06,RT,Dty,Dtcr

Is there a way to make that change within read.transactions() itself, or some other way? Also, those extra commas are not visible in R (the output shown above is from Notepad++).

So how can we remove them in R when we can't even see them?

zx8754
LearneR
In CSV 'Each record "should" contain the same number of comma-separated fields.' ([Wiki](http://en.wikipedia.org/wiki/Comma-separated_values)) so a file with "removed" commas is *not* a valid CSV. Also notice that **we don't know** what the `read.transactions()` function is, so we cannot help you with it. – Tim May 08 '15 at 11:11

1 Answer


A simple way to create a new file without the trailing commas is:

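# Read the original file as plain text, one element per line
file_lines <- readLines("input.txt")
# Drop the trailing commas from each line and write the result to a new file
writeLines(gsub(",+$", "", file_lines),
           "without_commas.txt")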

In the gsub command, ",+$" matches one or more (+) commas (,) at the end of a line ($).
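
To see what the pattern does, here is a small sketch that applies the same gsub call to the sample lines from the question, held in a character vector rather than read from a file. The variable names raw_lines and clean_lines are made up for illustration.

```r
# Sample lines copied from the question, trailing commas included
raw_lines <- c("D115,DX06,Slz,,,,",
               "HC,,,,,,",
               "DX06,,,,,,",
               "DX17,PG,,,,,",
               "DX06,RT,Dty,Dtcr,,")

# ",+$" removes only the run of commas at the end of each line;
# commas that separate items are left alone
clean_lines <- gsub(",+$", "", raw_lines)
print(clean_lines)
```

Because the pattern is anchored with $, it can match at most once per line, so sub() would give the same result here.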

Since you're using Notepad++, you could also do the substitution there: open Search > Replace, set Search Mode to Regular expression, enter ,+$ in "Find what", leave "Replace with" empty, and click Replace All.
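
If you would rather stay in R, a rough end-to-end sketch could look like the following. It assumes the arules package is installed and that the file is in basket format with a comma separator. The temporary files stand in for the real CSV, and the names input_path, clean_path and trans are hypothetical.

```r
# Write the sample data to a temporary file that stands in for the real CSV
input_path <- tempfile(fileext = ".csv")
writeLines(c("D115,DX06,Slz,,,,",
             "HC,,,,,,",
             "DX06,,,,,,",
             "DX17,PG,,,,,",
             "DX06,RT,Dty,Dtcr,,"),
           input_path)

# Strip the trailing commas and save a cleaned copy
clean_path <- tempfile(fileext = ".csv")
writeLines(gsub(",+$", "", readLines(input_path)), clean_path)

# Read the cleaned file as basket-format transactions.
# This step is skipped if arules is not available.
if (requireNamespace("arules", quietly = TRUE)) {
  trans <- arules::read.transactions(clean_path, format = "basket", sep = ",")
  arules::inspect(trans)
}
```

Writing to a separate file keeps the original CSV untouched, so the cleaning step can be rerun if the source data changes.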

James Trimble