
I have a weird CSV file, which has two separators: "\t" and ",".

I used to parse it with CSV.parse("file", col_sep: "\t"), but now I have to split fields on "," as well.

Any suggestions on how this can be achieved?

the Tin Man
benams
  • Is the position of each separator constant? For example, are tabs always used for columns 1-4 and commas always used for columns 5-8? – Jason Sperske Oct 06 '13 at 14:48
  • @JasonSperske yes, it's constant. Only columns 1-2 are separated by tab, the rest are by commas. I have to note that the number of columns may change from row to row – benams Oct 06 '13 at 14:54
  • Can you give some sample entries, so that I can test it? – Arup Rakshit Oct 06 '13 at 14:57
  • Does the data contain any tabs, or are the only tabs the odd separators? – Linuxios Oct 06 '13 at 15:02
  • Are there unescaped commas in the tab columns or unescaped tabs in the comma columns? – Jason Sperske Oct 06 '13 at 15:42
  • You need to supply samples of the data you're working with, along with your expected output. Making us concoct samples wastes time. – the Tin Man Oct 06 '13 at 17:13
  • I would rather focus on fixing the input file. Every reasonable text editor should be able to replace the tabs with commas. – Patrick Oscity Oct 06 '13 at 17:14
  • CSV is a very abused file format, but one thing that is sacred is that the delimiter has to remain constant throughout the file, otherwise madness ensues. TSV and CSV are alternate forms of the same idea: a separator character defines the field boundaries, and any use of that character inside a field requires it to be escaped. How is the file getting multiple delimiters? Can you control that? Fixing the problem before handing the data to CSV is the best course. – the Tin Man Oct 06 '13 at 17:18

1 Answer


Try this:

CSV.parse(File.read('csvfile').gsub("\t", ","), col_sep: ',')
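
To make the one-liner above concrete, here is a minimal sketch run on an inline string instead of a file. The sample data is invented for illustration (the question never supplied any); it follows the layout described in the comments, with columns 1-2 separated by a tab and the rest by commas. The helper name parse_mixed is hypothetical and not part of the CSV library. It shows a possible alternative that splits off the first column on the tab and hands only the remainder to CSV, which might be safer if comma-separated fields can be quoted.

```ruby
require 'csv'

# Invented sample: columns 1-2 are tab-separated, the rest comma-separated,
# and the number of columns differs between rows.
sample = "a\tb,c,d\ne\tf,g\n"

# Approach from the answer: turn every tab into a comma, then parse as plain CSV.
# This assumes no field contains a literal tab.
rows = CSV.parse(sample.gsub("\t", ","), col_sep: ',')

# Hypothetical alternative: split each line at the first tab only, then let CSV
# parse the comma-separated remainder so that quoted commas are respected.
def parse_mixed(text)
  text.each_line(chomp: true).map do |line|
    first, rest = line.split("\t", 2)
    # CSV.parse_line returns nil for an empty string; the splat turns that into no fields.
    [first, *CSV.parse_line(rest.to_s, col_sep: ',')]
  end
end

p rows
p parse_mixed(sample)
```

Both calls should give the same rows for this sample. They would differ only when a field itself contains a tab or a quoted comma, which is exactly what the comments above ask about.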
Linuxios