How to replace one character inside parentheses and keep everything else as is

Question

The data looks like this:

There is stuff here (word, word number phrases)
(word number anything, word phrases), even more
...

There are a lot of these lines, spread across different files. They are surrounded by other kinds of data that aren't in the same format. The data inside the parentheses must not change otherwise, and it's always on a single line. I do not have to deal with:

(stuff number,
maybe more here)

I would like to be able to replace the comma with a colon.

The desired output would be:

There is stuff here (word: word number phrases)
(word number anything: word phrases), even more
...

randomir · Accepted Answer · 2017-11-02T20:57:55.303

5

Assuming there's only one comma to be replaced inside parentheses, this POSIX BRE sed expression will replace it with colon:

sed 's/(\(.*\),\(.*\))/(\1:\2)/g' file

If there is more than one comma, only the last one will be replaced, because the first `.*` is greedy.

In the multiple-comma scenario, you can replace only the first one with:

sed 's/(\([^,]*\),\([^)]*\))/(\1:\2)/g' file
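
To make the difference between the two expressions concrete, here is a small sketch that runs both on one made-up line with two commas inside the parentheses. The function names `replace_last` and `replace_first` and the sample text are only illustrative; they are not part of the question's data.

```shell
#!/bin/sh
# Sample line with two commas inside one pair of parentheses (made-up data).
line='There is stuff here (a, b, c) and more'

# Greedy .* in the first group: only the LAST comma inside is replaced.
replace_last() {
    printf '%s\n' "$1" | sed 's/(\(.*\),\(.*\))/(\1:\2)/g'
}

# [^,]* stops at the first comma: only the FIRST comma is replaced.
replace_first() {
    printf '%s\n' "$1" | sed 's/(\([^,]*\),\([^)]*\))/(\1:\2)/g'
}

replace_last "$line"    # prints: There is stuff here (a, b: c) and more
replace_first "$line"   # prints: There is stuff here (a: b, c) and more
```

Neither variant touches text outside the parentheses on this input, but both assume a single parenthesized group per line.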

edited Nov 02 '17 at 20:57

answered Nov 02 '17 at 20:27

randomir


the \1 and \2 are they like in `mmv` with #1 and #2? It looks like it. – Maxime Roussin-Bélanger Nov 02 '17 at 20:33
`\1` and `\2` hold what's captured with first `(.*)` and second group `(.*)`. – randomir Nov 02 '17 at 20:35
Because `.*` is greedy, this will only match the **last** comma on a line. It will not work as expected with a line like `foo (bar,baz) qux,blah (hello,fine,world)` -- this will work better: `sed ':a;s/(\([^,)]*\),\([^)]*\))/(\1:\2)/;ta'` – glenn jackman Nov 02 '17 at 20:41
@glennjackman, that's true. But I'm not sure if OP needs to handle the first comma, the last comma, or all of them.. – randomir Nov 02 '17 at 20:43
@Lorac, any comments? – glenn jackman Nov 02 '17 at 20:44
If you need sed, `sed 's/(\([^,)]*\),\([^)]*\))/(\1:\2)/g'` would do I think. – glenn jackman Nov 03 '17 at 18:55
@Lorac, for only the first comma, see the second part of my answer. – randomir Nov 03 '17 at 18:56

score 5 · Answer 2 · answered Nov 02 '17 at 20:36

Here's a version for awk that uses the parentheses as record separators:

awk -v RS='[()]' 'NR%2 == 0 {sub(/,/,":")} {printf "%s%s", $0, RT}' file

The stuff between parentheses will be every even-numbered record. The RT variable holds the character that matched the RS pattern for this record.

Note that this only replaces the first comma of the parenthesized text. If you want to replace all of them, use gsub in place of sub.

answered Nov 02 '17 at 20:36

glenn jackman

Note: `RT` is specific to GNU awk – glenn jackman Nov 02 '17 at 20:46

score 2 · Answer 3 · answered Nov 08 '17 at 10:19

While @randomir's sed solution focuses on replacing a single comma inside parentheses, there is a way to replace multiple commas inside parentheses with sed, too.

Here is the code:

sed '/(/ {:a s/\(([^,()]*\),/\1:/; t a}'

or

sed '{:a;s/\(([^,()]*\),/\1:/;ta}'

or

sed -E '{:a;s/(\([^,()]*),/\1:/;ta}'

In all cases, the main part is between the curly braces. Here are the details for the POSIX ERE (sed with the -E option) pattern:

:a; - define a label named a
s/(\([^,()]*),/\1:/; - find a match, capturing part of it into Group 1:
- \( - a ( char
- [^,()]* - zero or more chars other than a comma, ( and ) (so only those commas are replaced that sit between the closest ( and ) chars, not inside (..,.(...,.); remove ( from the bracket expression to also match in the latter patterns)
- , - the comma itself, matched outside Group 1
- \1: - replace the match with the Group 1 contents plus a colon
ta - loop back to :a if the preceding substitution made a replacement.
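
As a quick check of the loop described above, here is a sketch that wraps the ERE variant in a function and runs it on a made-up line. The name `colonize` is hypothetical, and the script is split into separate -e parts on the assumption that this is friendlier to non-GNU seds than the semicolon form.

```shell
#!/bin/sh
# Replace every comma between the closest ( and ) by repeating one
# substitution. The three -e parts are: the label, the substitution that
# turns one comma into a colon, and the jump back to the label after a
# successful substitution.
colonize() {
    printf '%s\n' "$1" | sed -E -e ':a' -e 's/(\([^,()]*),/\1:/' -e 'ta'
}

colonize 'i,j,k (a,b,c) bar (1,2)'
# prints: i,j,k (a:b:c) bar (1:2)
```

Each pass replaces one comma, so a group with three commas takes three substitutions plus a final pass that fails and ends the loop.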

score 1 · Answer 4 · answered Nov 02 '17 at 20:25

Using awk

$ awk -v FS="" -v OFS="" '{ c=0; for(i=1; i<=NF; i++){ if( $i=="(" || $i ==")" ) c=1-c; if(c==1 && $i==",") $i=":" } }1' file
There is stuff here (word: word number phrases)
(word number anything: word phrases), even more

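-v FS="" -v OFS="" sets FS and OFS to the empty string so that each character is treated as a field. This is a common awk extension that POSIX does not specify.

Set the variable c=0 at the start of each line. Iterate over the fields with a for loop and toggle the value of c whenever ( or ) is encountered.
If c==1 and the field is a comma, replace it with a colon. The trailing 1 prints the rebuilt line.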
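
If your awk does not split on an empty FS, the same toggle idea can be sketched with substr() instead. This is only an illustration under that assumption; the function name `paren_commas` and the variables `inside`, `ch` and `out` are made up for the example.

```shell
#!/bin/sh
# Walk each line one character at a time with substr(), so no empty-FS
# field splitting is needed. "inside" flips between 0 and 1 on every
# parenthesis, exactly like c in the one-liner.
paren_commas() {
    awk '{
        inside = 0; out = ""
        for (i = 1; i <= length($0); i++) {
            ch = substr($0, i, 1)
            if (ch == "(" || ch == ")") inside = 1 - inside
            else if (inside && ch == ",") ch = ":"
            out = out ch
        }
        print out
    }'
}

printf '%s\n' 'i,j,k (a,b,c) bar (1,2)' | paren_commas
# prints: i,j,k (a:b:c) bar (1:2)
```

Like the original, this treats every parenthesis as a toggle, so it assumes parentheses are balanced and not nested.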

score 1 · Answer 5 · answered Nov 03 '17 at 05:39

With perl

$ perl -pe 's/\([^()]+\)/$&=~s|,|:|gr/ge' ip.txt
There is stuff here (word: word number phrases)
(word number anything: word phrases), even more

$ echo 'i,j,k (a,b,c) bar (1,2)' | perl -pe 's/\([^()]+\)/$&=~s|,|:|gr/ge'
i,j,k (a:b:c) bar (1:2)

$ # since only single character is changed, can also use tr
$ echo 'i,j,k (a,b,c) bar (1,2)' | perl -pe 's/\([^()]+\)/$&=~tr|,|:|r/ge'
i,j,k (a:b:c) bar (1:2)

The e modifier allows Perl code in the replacement section.
\([^()]+\) matches non-nested () with one or more characters inside.
$&=~s|,|:|gr performs another substitution on the matched text; the r modifier returns the modified text rather than modifying its operand.
