How can I use awk to insert something in the middle of the word?

Question

I have an input:

This is a test

And I want to insert some letters in the middle of the word, like:

This is a teSOMETHINGst

I know I can define the needed word by $i, but how can I modify the word that way?

I'm trying to do it like this:

{
    i=4 # finding somehow
    print (substr($i,1,(length($i)/2)) "SOMETHING" substr($i,(length($i)/2),(length($i)/2)))
}

As I'm new to awk, I wonder whether this is the right way.

Ed Morton · Answer 1 · 2018-04-08T17:21:41.510

This may be what you're looking for:

$ awk 'match($0,/\<test\>/){mid=int(RLENGTH/2); $0=substr($0,RSTART,mid) "SOMETHING" substr($0,RSTART+mid,RLENGTH-mid)} 1'

e.g. some test cases (no pun intended):

$ echo 'This is a test' |
awk 'match($0,/\<test\>/){mid=int(RLENGTH/2); $0=substr($0,RSTART,mid) "SOMETHING" substr($0,RSTART+mid,RLENGTH-mid)} 1'
teSOMETHINGst

$ echo 'These are tests' |
awk 'match($0,/\<tests\>/){mid=int(RLENGTH/2); $0=substr($0,RSTART,mid) "SOMETHING" substr($0,RSTART+mid,RLENGTH-mid)} 1'
teSOMETHINGsts

$ echo 'These contestants are in a test' |
awk 'match($0,/\<test\>/){mid=int(RLENGTH/2); $0=substr($0,RSTART,mid) "SOMETHING" substr($0,RSTART+mid,RLENGTH-mid)} 1'
teSOMETHINGst
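
As the outputs above show, the command as written replaces the whole record with just the modified word, and `\<` / `\>` are GNU awk extensions. If the rest of the line should be kept and the awk at hand is not gawk, something along these lines may work. It is only a sketch: `insert_mid` is a made-up helper name, and it assumes words are separated by single spaces and that the word contains no regex metacharacters.

```shell
# insert_mid WORD TEXT: hypothetical helper, filters stdin to stdout.
insert_mid() {
  awk -v word="$1" -v ins="$2" '
  # Pad the record with a space on each side so that " word " can only
  # match a whole space-delimited word. Because of the leading pad, RSTART
  # (the position of the space before the word in the padded string) equals
  # the position of the first letter of the word in $0.
  match(" " $0 " ", " " word " ") {
    mid = int(length(word) / 2)                       # round down for odd lengths
    $0 = substr($0, 1, RSTART + mid - 1) ins substr($0, RSTART + mid)
  }
  1'                                                  # print every record
}

echo 'These contestants are in a test' | insert_mid test SOMETHING
# These contestants are in a teSOMETHINGst
```

Only the first whole-word occurrence on each line is changed; punctuation next to the word would need a wider boundary test than a plain space.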

Inian · Accepted Answer · 2018-04-08T17:28:51.230

Assuming your requirement is to find the column containing test and operate on it, loop over the columns up to NF and match using the regex match operator ~, or for fixed strings use an equality test such as $i == "test"

awk '
{
  for(i=1;i<=NF;i++) {
    if ($i ~ "test") {
      halfLength=(length($i)/2)
      $i=(substr($i,1,halfLength) "SOMETHING" substr($i,(halfLength+1),halfLength))
    }
  }
}1' <<<"This is a test"

This produces the expected output. Note that the second substr() call is substr($i,(halfLength+1),halfLength). The +1, which your attempt was missing, makes the second half start right after the first half instead of repeating its last character. The result is then assigned back to the column containing test, i.e. $i=..

Also, assigning to a field makes awk rebuild the record from its fields, joined by OFS, and the trailing 1 in {..}1 is an always-true pattern that prints the record. In our case only the column containing the string you wanted is changed.

Also note that this attempt fails if the target string has an odd number of characters, or if it is a substring of a larger string (the equality operator avoids the second problem, but the regex match does not).
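
For what it's worth, both caveats can be worked around while keeping the field loop: compare with == instead of ~, and round the midpoint down with int(). A minimal sketch, assuming default whitespace field splitting (`insert_in_field` is a hypothetical helper name, not something from the question):

```shell
# insert_in_field WORD TEXT: hypothetical helper, filters stdin to stdout.
insert_in_field() {
  awk -v word="$1" -v ins="$2" '
  {
    for (i = 1; i <= NF; i++) {
      if ($i == word) {                      # exact match, so "contestants" is skipped
        half = int(length($i) / 2)           # int() keeps odd lengths well defined
        # Omitting the third argument of substr() takes the rest of the field.
        $i = substr($i, 1, half) ins substr($i, half + 1)
      }
    }
  }
  1'                                         # print the (possibly rebuilt) record
}

echo 'These contestants are in a test' | insert_in_field test SOMETHING
# These contestants are in a teSOMETHINGst
```

Because assigning to $i rebuilds the record with OFS, runs of spaces or tabs between fields collapse to single spaces on any line that gets modified.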

that will fail when the target string is an odd number of characters (e.g. `tests`) and when the target word is part of another word (e.g. `contestant`) — Ed Morton, Apr 08 '18 at 17:19
@EdMorton : Sure Ed! Knew that, just wanted to fix OP’s attempt. Will add a note of the cases you mentioned. — Inian, Apr 08 '18 at 17:23
@EdMorton : Done Ed! Now they would jump straight to yours after seeing my answer ;) — Inian, Apr 08 '18 at 17:29

James Brown · Answer 3 · 2018-04-08T19:33:24.113

Another another one that grew from curiosity to personal vendetta (:

$ echo This is a contestant test | 
awk -v s="test" '
BEGIN {
    FS=OFS=""
}
{
    if(i=match($0, "(^| )" s "( |$)")) {   # match() instead of index() for regex support
        j=(i+length(s)/2+!!(i-1))          # !!(i-1) is 0 at the beginning of the record, else 1 (skips the matched space)
        $j="SOMETHING" $j
    }
}1'
This is a contestant teSOMETHINGst

~~Another one using empty separators, mostly to satisfy personal curiosity:~~

$ echo This is a test | 
awk -v s="test" '
BEGIN {
    FS=OFS=""                # empty separators
}
{
    if(i=index($0,s)) {      # index finds the beginning of test
        j=(i+length(s)/2)    # midpoint
        $j="SOMETHING" $j    # insert string
    }
}1'                          # output
This is a teSOMETHINGst

edited Apr 08 '18 at 19:33

answered Apr 08 '18 at 17:44


that will incorrectly find `test` if it was within `contestant`. It's also relying on undefined behavior (what awk does with `FS=""` is unspecified by POSIX) so YMMV. – Ed Morton Apr 08 '18 at 17:54
Sure will, not boiler-plate proof at all. I was just interested in testing the scheme. Also surprised that it worked with mawk and original-awk too. – James Brown Apr 08 '18 at 17:56
consider adding a test for the character before and after the target string being non-alpha-or-^ and non-alpha-or-$ respectively, – Ed Morton Apr 08 '18 at 17:57
I was toying around with word boundary testing but got bored. :D Lazy Sunday. – James Brown Apr 08 '18 at 17:59
Yeah, it is too bad there's no provided way to say "look for this string not within another string" but down that road lies the dreaded p### syntax explosion so instead we have to write something like `i=index($0,s) && ((i==1) || (substr($0,i-1,1) !~ /[[:alpha:]]/) && (((i+length(s))==length($0)) || (substr($0,i+length(s),1) !~ /[[:alpha:]]/))` – Ed Morton Apr 08 '18 at 18:07
Besides, it will fail if `contestant` was before `test` as there is no loop to go over the first hit. Doomed from the beginning. How many ways can a man fail? (Add _in_ to the beginning of that sentence and bring the salt) – James Brown Apr 08 '18 at 18:57
Why `!!` in `!!(i-1)`? – Ed Morton Apr 08 '18 at 20:27
If `i==1` `!!(i-1)==0` but for example `i==23` -> `!!(i-1)==1`, because of the misaligning of `(^| )`. Almost midnight, I'll be ashamed tomorrow... – James Brown Apr 08 '18 at 20:30
So it's in place of `(i>1)` - ok, got it, thanks for explaining. – Ed Morton Apr 08 '18 at 20:32
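
The boundary test discussed in these comments (the character before and after the hit must be non-alphabetic, or the hit must sit at the start or end of the record), together with a loop that steps past rejected hits, could be sketched roughly as follows. `insert_at_word` is a hypothetical name, the search string is assumed to be non-empty, and this avoids `FS=""` altogether:

```shell
# insert_at_word WORD TEXT: hypothetical helper, filters stdin to stdout.
insert_at_word() {
  awk -v s="$1" -v ins="$2" '
  {
    n = length(s); pos = 0; rest = $0
    # index() is a fixed-string search, so s needs no regex escaping.
    while (n > 0 && (i = index(rest, s)) > 0) {
      start  = pos + i                                  # start of this hit in $0
      before = (start > 1) ? substr($0, start - 1, 1) : ""
      after  = substr($0, start + n, 1)                 # empty at end of record
      if (before !~ /[[:alpha:]]/ && after !~ /[[:alpha:]]/) {
        mid = int(n / 2)
        $0 = substr($0, 1, start + mid - 1) ins substr($0, start + mid)
        break                                           # change the first whole word only
      }
      pos += i                                          # hit was inside a larger word:
      rest = substr(rest, i + 1)                        # keep searching after it
    }
  }
  1'
}

echo 'This is a contestant test' | insert_at_word test SOMETHING
# This is a contestant teSOMETHINGst
```

Here a "word" is delimited by anything that is not a letter, so `test.` and `test,` count as whole-word hits while `tests` does not.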
