I have a string in a Linux shell. The string contains underscores.

I want to extract a substring from the string.

I want to extract the substring after the third occurrence of an underscore, counted from the end of the string.

file_name='email_Tracking_export_history_2018_08_15'
string_name="${file_name#*_*_*_}"
file_name2='email_Tracking_export_2018_08_15'
string_name2="${file_name2#*_*_*_}"

echo "$string_name"
echo "$string_name2"

The result:

history_2018_08_15
2018_08_15

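As you can see, string_name="${file_name#*_*_*_}" does not work properly: it counts underscores from the start of the string, so the first result still contains history_.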

Desired result:

2018_08_15
2018_08_15

How can I achieve my desired result?

Cyrus
nmr

6 Answers

You can do it in a single step, but it's a bit convoluted. After setting the filename

file_name='email_Tracking_export_history_2018_08_15'

we first get the substring that contains everything except the part we ultimately want:

$ echo "${file_name%_*_*_*}"
email_Tracking_export_history

This is almost what we want, just an underscore missing, so we add that:

$ echo "${file_name%_*_*_*}_"
email_Tracking_export_history_

Now we know what has to be removed from the beginning of the string, so we use that as the pattern in a ${word#pattern} expansion:

$ echo "${file_name#"${file_name%_*_*_*}_"}"
2018_08_15

or we assign it to a variable for further use:

string_name=${file_name#"${file_name%_*_*_*}_"}
              └───┬───┘ │  └───┬───┘ └─┬──┘  │
             outer word │  inner word  └────────inner pattern
                        └───outer pattern────┘

The second string works analogously.
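
As a minimal sketch, the nested expansion can be wrapped in a small function so the same logic serves both strings. The function name last_three_fields and the variable names prefix and result are hypothetical, not part of the answer above. Only POSIX parameter expansion is used, and a string with fewer than three underscores comes back unchanged because neither pattern matches.

```shell
# Hypothetical helper; the names are assumptions for illustration only.
last_three_fields() {
    # Inner expansion: drop the shortest suffix of the form _*_*_*,
    # which leaves everything before the third underscore from the end.
    prefix=${1%_*_*_*}
    # Outer expansion: strip that prefix plus one underscore from the front.
    # The quotes make the prefix match literally rather than as a glob.
    result=${1#"${prefix}_"}
    printf '%s\n' "$result"
}

last_three_fields 'email_Tracking_export_history_2018_08_15'
last_three_fields 'email_Tracking_export_2018_08_15'
```

Both calls should print 2018_08_15.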

Benjamin W.

Use a temporary variable:

file_name='email_Tracking_export_history_2018_08_15'
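# Everything before the third underscore from the end
temp="${file_name%_*_*_*}"
# Strip that prefix; the inner quotes make it match literally, anchored at the start
string_name="${file_name#"${temp}_"}"
file_name2='email_Tracking_export_2018_08_15'
temp="${file_name2%_*_*_*}"
string_name2="${file_name2#"${temp}_"}"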

echo "$string_name"
echo "$string_name2"
Ipor Sircer

How about using regex in bash:

#!/bin/bash

# Extract substring from string after 3rd occurrence in reverse
function extract() {
    if [[ "$1" =~ _([^_]+_[^_]+_[^_]+$) ]]; then
        echo "${BASH_REMATCH[1]}"
    fi
}

file_name='email_Tracking_export_history_2018_08_15'
string_name=$(extract "$file_name")

file_name2='email_Tracking_export_2018_08_15'
string_name2=$(extract "$file_name2")

echo "$string_name"
echo "$string_name2"
tshiono

% echo "$file_name" | rev | cut -d '_' -f 1-3 | rev
2018_08_15
% echo "$file_name2" | rev | cut -d '_' -f 1-3 | rev
2018_08_15

rev reverses the string, so the last three underscore-separated fields become the first three, which cut can select directly. The extracted part is then reversed back.
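
To make the steps visible, here is a rough sketch that runs the pipeline one stage at a time. The variable names reversed and reversed_tail are assumptions added for illustration, and rev is assumed to be available (it ships with util-linux on most Linux systems).

```shell
file_name='email_Tracking_export_history_2018_08_15'

# Step 1: reverse the string so the last fields come first.
reversed=$(printf '%s\n' "$file_name" | rev)

# Step 2: keep the first three underscore-delimited fields of the reversed string.
reversed_tail=$(printf '%s\n' "$reversed" | cut -d '_' -f 1-3)

# Step 3: reverse again to restore the original character order.
string_name=$(printf '%s\n' "$reversed_tail" | rev)

# Print each stage on its own line.
printf '%s\n' "$reversed" "$reversed_tail" "$string_name"
```

The intermediate value after step 2 is 51_80_8102, which reverses back to 2018_08_15.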

user1551605

Using BRE, which works with most sed implementations:

sed 's/.*_\([^_]*\(_[^_]*\)\{2\}\)$/\1/' <<< "$file_name"
2018_08_15

Using GNU sed and ERE:

sed -r 's/.*_([^_]*(_[^_]*){2})$/\1/' <<< "$file_name"
2018_08_15
oliv

Has expr already been banished to the deepest hell, even for string matching?

$ expr "$file_name" : '.*_\([^_]*_[^_]*_[^_]*\)'
2018_08_15
$ expr "$file_name2" : '.*_\([^_]*_[^_]*_[^_]*\)'
2018_08_15

From https://www.tldp.org/LDP/abs/html/string-manipulation.html :

expr "$string" : '.*\($substring\)'

    Extracts $substring at end of $string, where $substring is a regular expression.
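
A possible way to package this, sketched under a few assumptions: the function name tail_fields, the fallback to the unchanged input, and the leading x guard are all additions for illustration and not part of the quoted documentation. expr exits with a non-zero status when the pattern does not match, which the sketch uses to detect strings with fewer than three underscores.

```shell
# Hypothetical helper; the name and the fallback behaviour are assumptions.
tail_fields() {
    # expr anchors the pattern at the start of the string, so the greedy .*
    # consumes everything up to the third underscore from the end.
    # The leading "x" keeps inputs that look like expr operators from
    # being misread; it is matched by the pattern and never captured.
    result=$(expr "x$1" : 'x.*_\([^_]*_[^_]*_[^_]*\)') || result=$1
    printf '%s\n' "$result"
}

tail_fields 'email_Tracking_export_history_2018_08_15'
tail_fields 'email_Tracking_export_2018_08_15'
```

Both calls should print 2018_08_15; an input such as 2018_08_15 with only two underscores is returned as is.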
James Brown