0

Is there an easy way to trim the suffix from my filenames? The problem is that the suffix length varies. The only string common to all filenames is _L001.

See the example:

NAME-code_code2_L001_sufix
NAME-code_L001_sufix_sufix2_sufix3
NAME-code_code2_code3_L001_sufix_sufix2_sufix3

I need to output everything before _L001:

NAME-code_code2
NAME-code
NAME-code_code2_code3

I was thinking of doing something like this (which only works when the suffix has a fixed length):

echo NAME-code_code2_L001_sufix | rev | cut -c 12- | rev

But of course my suffix length varies. Is there a bash or awk solution?

Thank you.

Paul

6 Answers

4

Using pure bash parameter expansion (no external tools):

$ string="NAME-code_code2_L001_sufix"; printf "%s\n" "${string%_L001*}"
NAME-code_code2
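
The % operator removes the shortest match of the pattern from the end of the value, while %% removes the longest. For the names in the question both give the same result, because _L001 occurs only once. A small sketch of the difference, assuming a hypothetical name (not from the question) that contains _L001 twice:

```shell
#!/usr/bin/env bash
# Hypothetical name containing the marker twice, for illustration only.
name="NAME-code_L001_mid_L001_sufix"

shortest="${name%_L001*}"    # shortest suffix match: cuts from the last _L001 onward
longest="${name%%_L001*}"    # longest suffix match: cuts from the first _L001 onward

printf '%s\n' "$shortest" "$longest"
```

If the marker can appear more than once, %% is probably the safer choice for "everything before _L001".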

For all the lines in the file, you can do the same in bash by reading the file into memory and performing the extraction:

# Setting a variable to the contents of a file using 'command-substitution'
$ mystringfile="$(<stringfile)"                 

# Read the new-line de-limited string into a bash-array for per-element operation
$ IFS=$'\n' read -d '' -ra inputArray <<< "$mystringfile"

# Run the sub-string extraction for each entry in the array
$ for eachString in "${inputArray[@]}"; do printf "%s\n" "${eachString%_L001*}"; done

NAME-code_code2
NAME-code
NAME-code_code2_code3

You can write the output to a new file by changing the printf in the for loop to:

printf "%s\n" "${eachString%_L001*}" >> output-file
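
If the file is large, a streaming loop avoids holding it in memory first. A minimal sketch of that variant; the function name strip_l001 is made up for this example, and stringfile / output-file are the file names assumed above:

```shell
#!/usr/bin/env bash
# strip_l001 is a hypothetical helper: it reads lines on stdin and prints
# each one with everything from the last _L001 onward removed.
strip_l001() {
    # The "|| [ -n "$line" ]" part also handles a final line without a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "${line%_L001*}"
    done
}

# Usage with files (names assumed):
#   strip_l001 < stringfile > output-file
printf '%s\n' NAME-code_code2_L001_sufix NAME-code_L001_sufix_sufix2_sufix3 | strip_l001
```

Redirecting the whole loop once, rather than appending with >> on every iteration, opens the output file a single time.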
Inian
2

You can use _L001 as field separator in awk and print first field:

awk -F '_L001' '{print $1}' file

NAME-code_code2
NAME-code
NAME-code_code2_code3
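
One thing to be aware of: a line that does not contain _L001 has only one field, so the command above prints it unchanged. A short sketch of how to skip such lines instead, assuming a hypothetical second name (NAME-other) that has no marker:

```shell
#!/usr/bin/env bash
# Sample input: the second name is hypothetical and has no _L001 marker.
input=$(printf '%s\n' NAME-code_L001_sufix NAME-other)

# Prints the first field of every line; unmarked lines pass through whole.
all=$(printf '%s\n' "$input" | awk -F '_L001' '{print $1}')

# NF > 1 is true only when the separator occurred, so unmarked lines are skipped.
marked=$(printf '%s\n' "$input" | awk -F '_L001' 'NF > 1 {print $1}')

printf '%s\n' "$all" "---" "$marked"
```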
anubhava
1

I would propose sed.

sed 's|\(.*\)_L001.*|\1|'

example:

$ for LINE in NAME-code_code2_L001_sufix NAME-code_L001_sufix_sufix2_sufix3 NAME-code_code2_code3_L001_sufix_sufix2_sufix3; do echo "$LINE"|sed 's|\(.*\)_L001.*|\1|';done
NAME-code_code2
NAME-code
NAME-code_code2_code3
pan-mroku
1

Here is a grep solution. It prints each line from the start up to (but not including) the first _L001. The -P option requires GNU grep with PCRE support.

grep -oP '^.*?(?=_L001)' inputfile
NAME-code_code2
NAME-code
NAME-code_code2_code3
P....
1

Many ways to do this:

# Here is your Input text.
bash$> cat a.txt
NAME-code_code2_L001_sufix
NAME-code_L001_sufix_sufix2_sufix3
NAME-code_code2_code3_L001_sufix_sufix2_sufix3
bash$>

# Desired output using perl (matches the full _L001 marker, not just _L).
bash$> cat a.txt |perl -nle 'if (/^(.+)_L001.*$/){print $1}'
NAME-code_code2
NAME-code
NAME-code_code2_code3
bash$>

# Desired output using sed.
bash$> cat a.txt |sed 's#\(.*\)_L001_.*#\1#g'
NAME-code_code2
NAME-code
NAME-code_code2_code3
bash$>

# Desired output using cut (works only if no capital "L" occurs before _L001).
bash$> cat a.txt |cut -f1 -d "L"|sed 's/_$//g'
NAME-code_code2
NAME-code
NAME-code_code2_code3
bash$>
User9102d82
1

You can also use parameter expansion in a loop, something like:

for i in NAME-code_code2_L001_sufix NAME-code_L001_sufix_sufix2_sufix3 NAME-code_code2_code3_L001_sufix_sufix2_sufix3
do
    echo "${i%_L001*}"
done
M. Modugno