5

I have a string like this:

msg='123abc456def'

I need to split msg to get the following result:

['123', 'abc', '456', 'def']

In Python, I can do it like this:

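import re

# The capturing group keeps the digit runs in the result;
# [1:] drops the empty string before the leading digits.
pattern = re.compile(r'(\d+)')
res = pattern.split(msg)[1:]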

How can I get the same result in a bash script?
I tried the following, but it doesn't work:

IFS='[0-9]'    # how to define IFS with regex?
echo ${msg[@]}
Yves

3 Answers

7

Get the substrings with grep and put the output in an array using command substitution:

$ msg='123abc456def'

$ out=( $(grep -Eo '[[:digit:]]+|[^[:digit:]]+' <<<"$msg") )

$ echo "${out[0]}"
123

$ echo "${out[1]}"
abc

$ echo "${out[@]}"
123 abc 456 def
  • The regex (ERE) pattern [[:digit:]]+|[^[:digit:]]+ matches one or more digits ([[:digit:]]+) OR (|) one or more non-digits ([^[:digit:]]+).
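
One caveat with the unquoted $(...) form above: the grep output goes through word splitting and glob expansion, so a piece containing spaces or * may not end up as a single array element. A minimal sketch of an alternative, assuming bash 4 or newer (mapfile is not available in older versions), reads one element per output line instead:

```bash
msg='123abc456def'

# mapfile -t stores each line of grep's output as one array element
# and strips the trailing newline; no word splitting or globbing happens.
mapfile -t out < <(grep -Eo '[[:digit:]]+|[^[:digit:]]+' <<<"$msg")

printf '%s\n' "${out[@]}"
```

For this particular msg the resulting array is the same as before; the difference only shows up with input that contains whitespace or glob characters.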
heemayl
4

Given that you already know how to solve this in Python, you can solve it using the code shown in the question:

MSG=123abc456def;
python -c "import re; print('\n'.join(re.split(r'(\\d+)', '${MSG}')[1:]))"

While python is not as standard an executable as, say, grep or awk, does that really matter to you?
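
If the string can contain quotes or backslashes, pasting it into the Python source as above can break the one-liner. A possible variation, assuming a python3 executable is on your PATH (the name may differ on your system), passes the string as a command-line argument instead:

```bash
msg='123abc456def'

# The string arrives in Python as sys.argv[1], so its content is never
# parsed as Python code. Single quotes keep bash from touching the regex.
out=$(python3 -c 'import re, sys; print("\n".join(re.split(r"(\d+)", sys.argv[1])[1:]))' "$msg")

printf '%s\n' "$out"
```

The regex and the [1:] slice are the same as in the question; only the way the input reaches Python changes.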

Mad Physicist
2

I would do matching instead of splitting. Here I used grep, but you can also use the same regex in pure bash.

$ msg='123abc456def'
$ grep -oE '[0-9]+|[^0-9]+' <<<"$msg"
123
abc
456
def
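
For the pure bash route, here is a rough sketch of one way to do it, using the [[ =~ ]] operator in a loop. The variable names rest, re and parts are just illustrative, and the pattern is anchored with ^ so that each iteration peels one run off the front of the string:

```bash
msg='123abc456def'
rest=$msg
parts=()

# Same alternation as the grep pattern, anchored at the start of the string.
re='^([0-9]+|[^0-9]+)'

# Each pass matches the leading run of digits or non-digits,
# stores it, and removes it from the front of $rest.
# The loop ends when $rest is empty and the pattern no longer matches.
while [[ $rest =~ $re ]]; do
  parts+=("${BASH_REMATCH[0]}")
  rest=${rest:${#BASH_REMATCH[0]}}
done

printf '%s\n' "${parts[@]}"
```

This avoids spawning an external process, which may matter if you do the split many times in a loop.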
Avinash Raj