1

My goal is to remove the trailing "1S" together with the character immediately before it, in this case "M". How can I achieve that? My non-working code:

echo "14M3856N61M1S" | gawk '{gensub(/([^(1S)]*)[a-zA-Z](1S$)/, "\\1", "g") ; print $0}'
>14M3856N61M1S

The desired result is:

>14M3856N61

Some additional information: 1. I do not think substr alone will work here, since my actual target strings vary in length. 2. I would rather not define a special delimiter, because this will be used together with "if" as part of a conditional in an awk script where the delimiter is already defined globally. Thank you in advance!

Aron
  • 1
    `echo "14M3856N61M1S" | sed 's/.1S$//'` – Cyrus Oct 08 '18 at 04:51
  • @Cyrus, thanks. I prefer an awk solution, since this will be part of an awk script with conditional operations. – Aron Oct 08 '18 at 04:55
  • 1
    It looks like you are hoping that `[^(1S)]` will do something it doesn't do. It matches a single character which is not `(` or `1` or `S` or `)`. – tripleee Oct 08 '18 at 05:04
  • Possible duplicate of [sed: Can my pattern contain an “is not” character? How do I say “is not X”?](https://stackoverflow.com/questions/7520704/sed-can-my-pattern-contain-an-is-not-character-how-do-i-say-is-not-x/) – tripleee Oct 08 '18 at 05:05

3 Answers

2

Why not use a simple substitution that matches the 1S at the end of the string together with the single character before it?

echo "14M3856N61M1S" | awk '{sub(/[[:alnum:]]{1}1S$/,"")}1'
14M3856N61

Here [[:alnum:]] is the POSIX character class that matches one alphanumeric character (a digit or a letter), and {1} means exactly one occurrence, so it can be omitted. If you are sure that only letters can occur before the 1S, replace [[:alnum:]] with [[:alpha:]].

To answer the OP's question about putting the result in a separate variable: use match(), because sub() does not return the modified string, only the number of substitutions made.

echo "14M3856N61M1S" | awk 'match($0,/[[:alnum:]]{1}1S$/){str=substr($0,1,RSTART-1); print str}'
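
If $0 itself has to stay intact (for instance because later rules still use the fields), one option is to run sub() on a copy of the line. This is only a sketch: str and n are illustrative names that do not appear in the question, and [A-Za-z0-9] stands in for [[:alnum:]] in case the awk at hand lacks POSIX character classes.

```shell
result=$(echo "14M3856N61M1S" | awk '{
  str = $0                             # work on a copy, $0 is left untouched
  n = sub(/[A-Za-z0-9]1S$/, "", str)   # n is the number of substitutions (0 or 1)
  print n, str
}')
echo "$result"
```

On the sample string, n is 1 and str holds 14M3856N61. The third argument to sub() names the variable that gets modified.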
Inian
2

EDIT: Following the OP's comment, here is a solution that also stores the result in a bash variable:

var=$(echo "14M3856N61M1S" | awk 'match($0,/[a-zA-Z]1S$/){print substr($0,1,RSTART-1)}' )
echo "$var"
14M3856N61


You could also try the following.

echo "14M3856N61M1S" | awk 'match($0,/[a-zA-Z]1S$/){$0=substr($0,1,RSTART-1)} 1'
14M3856N61

Explanation of above command:

echo "14M3856N61M1S" |        ## Print the sample string and pipe it to awk as standard input.
awk '                         ## Start the awk program.
  match($0,/[a-zA-Z]1S$/){    ## match() succeeds when the line ends in a letter (upper or lower case) followed by 1S.
   $0=substr($0,1,RSTART-1)   ## On a match, keep only the first RSTART-1 characters of the line; match() sets RSTART and RLENGTH.
  }                           ## End of the match block.
1'                            ## The always-true pattern 1 prints every line, edited or not.
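
To see how the same one-liner behaves on lines that do not match, here is a small sketch with made-up extra inputs (20M5S and 31S are not from the question). Only lines ending in a letter followed by 1S are shortened.

```shell
out=$(printf '%s\n' "14M3856N61M1S" "20M5S" "31S" |
  awk 'match($0, /[a-zA-Z]1S$/) { $0 = substr($0, 1, RSTART - 1) } 1')
echo "$out"
```

The second and third lines pass through unchanged: 20M5S does not end in 1S, and in 31S the character before 1S is a digit, which [a-zA-Z] does not match.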
RavinderSingh13
1
echo "14M3856N61M1S" | awk -F '.1S$' '{print $1}'

Output:

14M3856N61
Cyrus
  • That's the first thing I thought of myself too. I prefer not to define the delimiter as it's already defined globally in the overall awk code. Thank you though, for taking the time to respond. – Aron Oct 08 '18 at 05:25
  • 1
    With its own array and separator: `echo "14M3856N61M1S" | awk '{split($1,a,".1S$"); print a[1]}'` – Cyrus Oct 08 '18 at 05:36
  • A neat solution. Cool. Thank you ! – Aron Oct 08 '18 at 05:42
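
The split() variant from the comment above can be wrapped in a user-defined function so that it can be called inside an "if" without touching the global field separator. This is a sketch under assumptions: strip_tail is a hypothetical helper name, and the two-column input (a label followed by the string) is invented for illustration.

```shell
out=$(printf '%s\n' "read1 14M3856N61M1S" "read2 20M5S" | awk '
# strip_tail: hypothetical helper; parts is a local array (extra parameter idiom)
function strip_tail(s,    parts) {
  split(s, parts, ".1S$")   # private separator, FS is not changed
  return parts[1]           # text before the match, or all of s if nothing matched
}
{
  if ($2 ~ /1S$/)
    $2 = strip_tail($2)
  print
}')
echo "$out"
```

With this input the first line becomes "read1 14M3856N61" and the second is printed as is. Note that assigning to $2 rebuilds $0 using OFS.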