I've recently been practicing both regexes and shell scripting, mostly to gain basic literacy. I tried to match an input value against a regex to decide whether the script should continue. However, I feel I've had to write too much code to make it happen, because the simple way, for some reason, didn't work for me.

The regex is designed to match a 4- or 5-digit string. The regex itself works; it is not the issue (unless its syntax has to be changed inside the conditional).

This was the easy method I tried. I've tried different regex notations, brackets, and quotation marks (something about single brackets being POSIX / requiring brackets?).

Assume that I've run the code with 'foo 4535'

if [[ $1 =~ '^\b\d{4}\b|\b\d{5}\b$' ]]; then
  echo "foo matches regex"
fi

So my main question is: how do I get this short version to work?

I have looked at similar questions and came up with a workaround like this:

foo=$1
echo $foo | grep -P -q '^\b\d{4}\b|\b\d{5}\b$'
bar=$?

if [[ $bar != '0' ]]; then
  echo "foo matches regex"
fi

And this works, which is fine. But there are a few things in there that I don't understand and would like some clarity on (solely for the purpose of exploratory learning ;) ), so feel free to ignore them.

When I tried reducing the first section by replacing it with

foo=$(echo "$bar" | grep -P -q '^\b\d{4}\b|\b\d{5}\b$')
echo $foo

It would give me an empty line, indicating that $foo is empty/falsy? Only when going through $? (I don't understand what kind of variable this is; how can I google for such concepts?) do I get a value (which shows as 0 or 1 when echoed; I am unsure whether this is a string or a boolean). Why is this?

And secondly, why would input matching the regex give me 0, and input not matching give me 1? Isn't this counterintuitive? What kind of value is this?

My apologies if I haven't asked or formatted this question according to the house style. I am not experienced with asking questions in the Stack Overflow format.

If you have any suggestions on how to learn more about shell scripting, I would love to hear them!

Thank you very much!

  • The syntax bash uses for built-in regex matching is POSIX ERE. The syntax you were *trying* to use is PCRE. (And anything you quote on the right-hand side of `=~` is treated as literal, not regex syntax, so leave that content unquoted -- or put it in a variable, as Tom's answer shows). – Charles Duffy Jan 16 '20 at 18:12
  • ...as for matching being 0, that's completely normal -- *all* UNIX commands return an exit status of 0 for success, and nonzero exit status for failure. That said, it's bad practice to test `$?` explicitly unless you have a very good reason to; less error-prone to just use `if somecommand; then ...` instead of `somecommand; if [ "$?" -eq 0 ]; then ...` – Charles Duffy Jan 16 '20 at 18:14
  • On that line: foo=$(echo "$bar" | grep -P -q '^\b\d{4}\b|\b\d{5}\b$') , most likely you should be matching the program argument ($1) with the regex, and not $bar which you defined as the returned status of the last command. – Anicet Rakotonirina Jan 16 '20 at 18:40
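
To illustrate the comments above about testing a command directly rather than inspecting `$?`, here is a minimal sketch. The function name `is_four_or_five_digits` is made up for this example, and `grep -E` (POSIX ERE) is swapped in for the `grep -P` used in the question, so `[0-9]` replaces `\d`. It assumes single-line input, since `grep` matches line by line.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) when the argument
# consists of exactly 4 or 5 digits, and fails (exit status 1) otherwise.
is_four_or_five_digits() {
  # -E selects POSIX extended regex syntax; -q suppresses output,
  # so only grep's exit status is used.
  printf '%s\n' "$1" | grep -Eq '^[0-9]{4,5}$'
}

# "if" runs the command and branches on its exit status directly,
# so there is no need to save or compare $?.
if is_four_or_five_digits "4535"; then
  echo "4535 matches regex"
fi

if ! is_four_or_five_digits "123456"; then
  echo "123456 does not match regex"
fi
```

Because this version only relies on `grep` and plain `if`, it should also behave the same when run with `sh` rather than `bash`.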

1 Answer

You cannot use PCRE shorthand notation like \d and \b in Bash, because its =~ operator uses POSIX extended regular expressions. If you change the regex to use the [[:digit:]] character class, it will work:

re='^([[:digit:]]{4}|[[:digit:]]{5})$'
if [[ $var =~ $re ]]; then
  echo 'var matches regex'
fi

(You could probably also use a bracket expression like [0-9] instead, if they are the only characters that you consider to be a digit).
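
As a rough sketch of that bracket-expression variant, the two alternatives can also be folded into a single `{4,5}` interval. The function name `matches_4_or_5_digits` and the sample values are invented for illustration, and the script needs to be run with bash rather than sh.

```shell
#!/bin/bash
# Hypothetical function; requires bash because [[ ... =~ ... ]] is not POSIX sh.
matches_4_or_5_digits() {
  # Keep the pattern in a variable and leave it unquoted on the right of =~;
  # a quoted pattern would be compared as a literal string.
  local re='^[0-9]{4,5}$'
  [[ $1 =~ $re ]]
}

# Try a few sample inputs, valid and invalid.
for value in 4535 12345 123 123456 45a5; do
  if matches_4_or_5_digits "$value"; then
    echo "$value matches regex"
  else
    echo "$value does not match regex"
  fi
done
```

The function returns the exit status of the `[[ ... ]]` test, so it can be used directly in an `if` like any other command.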

I removed the word boundaries \b because they are not needed if you are matching the whole string.

Regarding your attempts with grep, I think that the key to understanding it is to know that $? is the exit code of the previous command (0 on success), whereas foo=$(cmd) assigns the output of the command to the variable foo.
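
The difference between the two can be seen side by side in a small experiment. This is only a sketch: the variable names `output` and `status` are chosen for the example, and `grep -E` with `[0-9]` stands in for the `grep -P` pattern from the question.

```shell
#!/bin/sh
# With -q, grep prints nothing, so the command substitution captures
# an empty string even though the input matches.
output=$(echo "4535" | grep -Eq '^[0-9]{4,5}$')
# After a plain assignment, $? holds the exit status of the command
# that ran inside $( ).
status=$?
echo "output='$output' status=$status"

# Without -q, grep prints the matching line, so the output is non-empty.
output=$(echo "4535" | grep -E '^[0-9]{4,5}$')
status=$?
echo "output='$output' status=$status"

# A non-matching input prints nothing and grep exits with status 1.
output=$(echo "abc" | grep -E '^[0-9]{4,5}$')
status=$?
echo "output='$output' status=$status"
```

This is why `echo $foo` printed an empty line in the question: `-q` tells grep to stay silent, and the match result is only available through the exit status.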

Tom Fenech
  • Perfect. This has worked next to another fix that I noticed myself; I ran the script with sh instead of bash! I will find out what else this means ;) – Ivan Dikmans Jan 17 '20 at 16:34
  • Yeah, if you run with `sh` then you can't use Bash-specific features such as regular expressions and arrays. If you put a `#!/bin/bash` at the top of your script, then make it executable, then you can run `./script.sh` and it will work. – Tom Fenech Jan 17 '20 at 16:50