
I have a list like this:

GTPYANJ         695848
GTPYANJ         27811
FPORTAL3        432532

I want to turn it into this using regular expressions:

GTPYANJ,695848,27811
FPORTAL3,432532

Suggestions?

Maladon
  • Seems to me that would be a lot easier by not using regular expressions. – Carl Norum Feb 11 '10 at 21:31
  • My suggestion: don't use regular expressions. – Matt Ball Feb 11 '10 at 21:34
  • This is for a one-time report where writing some code would be overkill. Otherwise it'd be simple to read it and populate a hashtable and use that. – Maladon Feb 11 '10 at 21:40
  • Even with regular expressions, this would require multiple passes. You are best off writing a quick script to do this. – Jeff B Feb 11 '10 at 21:41
  • perl -e 'while(<>) { chomp; ($tag, $num) = split /\s+/; $tmp{$tag} .= ",$num"; } foreach $t (sort keys %tmp) { print $t.$tmp{$t}."\n" } ' myfile.txt – Jeff B Feb 11 '10 at 21:44

2 Answers


Perl one-liner:

perl -e 'while(<>) { chomp; ($tag, $num) = split /\s+/; $tmp{$tag} .= ",$num"; } foreach $t (sort keys %tmp) { print $t.$tmp{$t}."\n" } '  myfile.txt

Much easier than trying to cobble together a multi-pass regex that will most likely break a couple of times before you get it right, depends on the data being sorted, and might require a second regex to reformat everything at the end.
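If Perl isn't handy, the same idea (read each line, append the number to a per-tag list, print the tags in sorted order) can be sketched in Python. This is only a rough equivalent, not a translation anyone in this thread posted: the function name `group_values` is made up for illustration, and it assumes every useful line is exactly a tag and a number separated by whitespace.

```python
from collections import defaultdict


def group_values(lines):
    """Collect the numbers for each tag and return 'tag,num,num,...' strings."""
    groups = defaultdict(list)
    for line in lines:
        parts = line.split()  # split on any run of whitespace
        if len(parts) != 2:
            continue  # assumption: skip blank or malformed lines
        tag, num = parts
        groups[tag].append(num)
    # Sorted by tag, mirroring `sort keys %tmp` in the Perl one-liner.
    return [",".join([tag, *nums]) for tag, nums in sorted(groups.items())]


if __name__ == "__main__":
    sample = [
        "GTPYANJ         695848",
        "GTPYANJ         27811",
        "FPORTAL3        432532",
    ]
    for row in group_values(sample):
        print(row)
```

Because the tags are sorted, FPORTAL3 comes out before GTPYANJ. Dropping the `sorted(...)` call would keep the order in which each tag was first seen, since Python dicts preserve insertion order. Either way the input does not need to be sorted beforehand.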

Jeff B

Load the file into jEdit (or Notepad++, or some other editor that can search/replace via regex).

Step 1 is to make sure that the delimiter is a tab.

Then, search for

^(.*)\t(.*)\n\1

and replace that with

$1\t$2,

Repeat the find/replace all until no more matches are found.
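For anyone who would rather not click "replace all" over and over, the repeat-until-no-matches loop can be sketched in Python with `re.subn`, which returns the number of substitutions made. Treat this as a sketch under a few assumptions, not the exact editor procedure: the function name `merge_adjacent` is made up, the pattern has an extra trailing `\t` so the tab after the repeated key is consumed along with the key, and a final tab-to-comma replacement is added to reach the format asked for in the question.

```python
import re

# Adjusted form of the pattern above: key, tab, value(s), newline, then the
# same key again followed by a tab. The trailing \t is an addition.
PAIR = re.compile(r"^(.*)\t(.*)\n\1\t", re.MULTILINE)


def merge_adjacent(text):
    """Merge consecutive lines that share a key into 'key,val,val,...'."""
    # Step 1: make sure the delimiter is a single tab.
    text = re.sub(r"[ \t]+", "\t", text)
    # Repeat the find/replace until a pass makes no substitutions.
    while True:
        text, count = PAIR.subn(r"\1\t\2,", text)
        if count == 0:
            break
    # Final cleanup (also an addition): the remaining key/value tab becomes a comma.
    return text.replace("\t", ",")


if __name__ == "__main__":
    sample = (
        "GTPYANJ         695848\n"
        "GTPYANJ         27811\n"
        "FPORTAL3        432532\n"
    )
    print(merge_adjacent(sample), end="")
```

Like the editor version, this only merges lines that sit next to each other, so the input has to be sorted (or at least grouped) by key first.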

Maladon
  • This of course only works for your very particular file. If you throw in any extra white space, or your list is not sorted, this will break miserably. Not to mention, if you have thousands, or even hundreds of entries, you are stuck repeating the search/replace manually, because I am not aware of any search replace that starts over at the beginning indefinitely until there are no matches left. – Jeff B Feb 11 '10 at 21:50
  • Good point. For my one-off data manipulation task it worked great. As a programmatic solution it would need a different approach. – Maladon Feb 28 '12 at 14:49