I have a CSV file with the following contents:

INTERB-MNT,2008-09-10T21:05:38Z,2008-09-10T21:05:38Z,MARIA

How can I use sed to replace the characters 'T' and 'Z', so that the contents of the file are changed to the following?

INTERB-MNT,2008-09-10,21:05:38,UTC,2008-09-10,21:05:38,UTC,MARIA

I tried the following, but I'm obviously missing something, because it does not produce the desired result:

sed -e 's/[0-9]{4}-[0-9]{2}-[0-9]{2}.T.[0-9]{2}:[0-9]{2}:[0-9]{2}Z/[0-9]{4}-[0-9]{2}-[0-9]{2},[0-9]{2}:[0-9]{2}:[0-9]{2}UTC/g'

Toby Speight
user2965031

1 Answer

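The replacement part of an s command is literal text, not a pattern, so a [0-9]{4} written there is inserted verbatim. To keep parts of the matched text, capture them with parentheses in the pattern and refer to them as \1 through \9 in the replacement. With the -E option (or -r in GNU sed) you can write the groups as plain ( ) and the repetition counts as {4}; without it, sed uses basic regular expressions, where they have to be written \( \) and \{4\}.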
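As a minimal sketch of capturing and backreferencing, here is the same substitution written in both syntaxes. The input `aTb` and the variable names `ere_out` and `bre_out` are made up for illustration only.

```shell
# ERE (-E, or -r on GNU sed): groups are written as plain ( ).
ere_out=$(printf 'aTb\n' | sed -E 's/(a)T(b)/\1,\2/')

# BRE (the default): groups are written as \( \); \1 and \2 work the same way.
bre_out=$(printf 'aTb\n' | sed 's/\(a\)T\(b\)/\1,\2/')

printf '%s\n%s\n' "$ere_out" "$bre_out"
# prints "a,b" twice
```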

The command will look like this:

sed -r 's/(.+)T(.+)Z/\1,\2,UTC/g'
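To see what this first attempt does, here is a small sketch that runs it on the sample line from the question. The variable names `line` and `greedy_out` are only for illustration, and `-E` is used in place of `-r` because both GNU and BSD sed accept it.

```shell
# Sample line from the question.
line='INTERB-MNT,2008-09-10T21:05:38Z,2008-09-10T21:05:38Z,MARIA'

# The first (.+) takes as much as it can, so it runs up to the last T.
greedy_out=$(printf '%s\n' "$line" | sed -E 's/(.+)T(.+)Z/\1,\2,UTC/g')

printf '%s\n' "$greedy_out"
# prints: INTERB-MNT,2008-09-10T21:05:38Z,2008-09-10,21:05:38,UTC,MARIA
```

The first timestamp is left untouched; only the last one is rewritten.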

But this does not give the desired result: (.+) is greedy, so the first group extends up to the last T on the line, and only the final timestamp is rewritten. So your idea of matching the 2008-09-10 and 21:05:38 patterns explicitly is a good one. You end up with this:

sed -r 's/([0-9]{4}-[0-9]{2}-[0-9]{2})T([0-9]{2}:[0-9]{2}:[0-9]{2})Z/\1,\2,UTC/g'

This works. You could also use this simpler command:

sed -r 's/(....-..-..)T(..:..:..)Z/\1,\2,UTC/g'

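It is easier to read and write. Because . matches any character, it would also rewrite non-digit text of the same shape, but a false positive is very unlikely in data like yours. Which one to use depends on your needs.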
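As a quick check, the following sketch runs both commands on the sample line from the question. The variable names are only for illustration, and `-E` is used in place of `-r` because both GNU and BSD sed accept it.

```shell
# Sample line from the question.
line='INTERB-MNT,2008-09-10T21:05:38Z,2008-09-10T21:05:38Z,MARIA'

# Strict pattern: only digits are accepted in the date and time fields.
strict_out=$(printf '%s\n' "$line" |
  sed -E 's/([0-9]{4}-[0-9]{2}-[0-9]{2})T([0-9]{2}:[0-9]{2}:[0-9]{2})Z/\1,\2,UTC/g')

# Loose pattern: any character is accepted in those positions.
loose_out=$(printf '%s\n' "$line" |
  sed -E 's/(....-..-..)T(..:..:..)Z/\1,\2,UTC/g')

printf '%s\n%s\n' "$strict_out" "$loose_out"
# both lines print:
# INTERB-MNT,2008-09-10,21:05:38,UTC,2008-09-10,21:05:38,UTC,MARIA
```

To convert a whole file, pass the file name to sed and redirect the output to a new file, for example `sed -E '...' input.csv > output.csv` (both file names are placeholders).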

cbliard
  • You do not have to use specific options to use a regex with *sed*, and the first `.+` in `/(.+)T(.+)Z/` won't stop at the first `T`; it will run up to the last `T` because `+` is greedy. – Wiktor Stribiżew Aug 10 '16 at 14:18
  • I tried without `-r` at first, but got the following error: `sed: -e expression #1, char 24: invalid reference \2 on 's' command's RHS`. That's why I added `-r` to the command. About the other remark, you are totally right, I'll update the answer. – cbliard Aug 10 '16 at 14:45
  • That is because the unescaped parentheses your backreferences point to are ERE syntax; without `-r` or `-E`, sed uses BRE syntax. You always use a regex with sed. – Wiktor Stribiżew Aug 10 '16 at 14:48