
What is wrong with the regex below on Unix?

echo AB345678  | sed -n '/^\([a-zA-Z]\{2\}[0-9]\{6\}|[0-9]\{8\}\)$/p'
echo 12345678  | sed -n '/^\([a-zA-Z]\{2\}[0-9]\{6\}|[0-9]\{8\}\)$/p'

I am not getting any output.

Why doesn't the string I echoed match my regex? What's wrong with my regex?

2 Answers


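In GNU sed's BRE syntax, the alternation operator must be written as an escaped pipe, \| (just as grouping uses \( and \)). POSIX BRE itself defines no alternation operator, so other sed implementations may not support it: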

echo "AB345678"  | sed -n '/^\([a-zA-Z]\{2\}[0-9]\{6\}\|[0-9]\{8\}\)$/p'
                                                      ^^


Wiktor Stribiżew
  • I tried both of the commands below, but still get no output: echo AB345678 | sed -n '/^\([a-zA-Z]\{2\}[0-9]\{6\}\|[0-9]\{8\}\)$/p' and echo "AB345678" | sed -n '/^\([a-zA-Z]\{2\}[0-9]\{6\}\|[0-9]\{8\}\)$/p' – Lalit Somnathe Sep 21 '16 at 10:39
  • Does it mean you are on Sun OS? See [this thread](http://unix.stackexchange.com/a/151020). – Wiktor Stribiżew Sep 21 '16 at 10:53
  • @WiktorStribiżew That looks like a red herring to me. The XPG4 `sed` does have a slightly different feature set, but as per the documentation I was able to google, your fix should work, unless there is something the OP is not telling us. (I suppose there are multiple versions of XPG4 `sed` as well; perhaps the OP should [edit] their question to indicate the OS version etc.) – tripleee Sep 21 '16 at 11:22
  • Quoting these strings should not matter in any reasonable shell. – tripleee Sep 21 '16 at 11:22
  • @WiktorStribiżew: Yes, I am on SunOS. I tried all the combinations posted in this forum, but I still cannot get any output. – Lalit Somnathe Sep 21 '16 at 12:12
  • Try `egrep '^([a-zA-Z][a-zA-Z]|[0-9][0-9])[0-9][0-9][0-9][0-9][0-9][0-9]$'` - I am afraid the regex is implemented very poorly in Sun OS. From what I know, the [bound quantifiers do not work well there](http://stackoverflow.com/questions/38822261/regex-failed-to-match-using-m-n-on-sunos). – Wiktor Stribiżew Sep 21 '16 at 12:20
  • I've tried some combinations on SunOS 5.8. It seems it's the pipe operator that doesn't work, escaped or not. Using the first part alone, everything works fine. – Ingo Leonhardt Sep 21 '16 at 17:24
  • This pattern won't work without a pipe as the alternation operator. The only way out then is to install GNU sed. Sorry, I cannot help with that. – Wiktor Stribiżew Sep 21 '16 at 17:38
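Since the comments report that each alternative works on its own when no pipe is involved, one possible workaround is to give sed two separate patterns instead of one alternation. This is only a sketch: the function name match_id is made up for illustration, and it assumes the \{n\} interval syntax is honoured by the sed in use, which the thread suggests may not hold on every SunOS version.

```shell
# Hypothetical helper (not from the thread): print the argument only if it is
# two letters followed by six digits, or exactly eight digits.
# Two -e commands replace the \| alternation, so only plain BRE is needed.
match_id() {
  printf '%s\n' "$1" | sed -n \
    -e '/^[a-zA-Z]\{2\}[0-9]\{6\}$/p' \
    -e '/^[0-9]\{8\}$/p'
}

match_id AB345678   # prints AB345678
match_id 12345678   # prints 12345678
match_id A2345678   # prints nothing
```

The two patterns cannot both match the same line, so no input is printed twice.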

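For a more complicated expression, you can pass the -r option to GNU sed instead of escaping the special characters. With extended regular expressions, ( ), { } and | act as operators without backslashes.

From the sed manual: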

-r, --regexp-extended
use extended regular expressions in the script.

Answer:

echo AB345678 | sed -nr '/^([a-zA-Z]{2}[0-9]{6}|[0-9]{8})$/p'
                      ^
echo 12345678 | sed -nr '/^([a-zA-Z]{2}[0-9]{6}|[0-9]{8})$/p'
                      ^
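The same extended-regex pattern can also be written with -E, which GNU sed accepts as a synonym for -r and which BSD sed uses for extended regular expressions. The sketch below assumes a sed that supports -E; the function name match_id_ere is made up for illustration.

```shell
# Hypothetical helper (not from the thread): same check as above, written as
# a single extended regular expression and selected with -E instead of -r.
match_id_ere() {
  printf '%s\n' "$1" | sed -nE '/^([a-zA-Z]{2}[0-9]{6}|[0-9]{8})$/p'
}

match_id_ere AB345678   # prints AB345678
match_id_ere 12345678   # prints 12345678
match_id_ere ABC45678   # prints nothing
```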
hbadger
  • @LalitSomnathe Why don't you look at my answer? I think it is more elegant. This question should also be tagged #sed, because the problem is not with regex or Unix systems in general but with sed's syntax. – hbadger Sep 22 '16 at 08:31