How do I include symbols in awk's record separator? I know the basic syntax, like this:

awk 'BEGIN{RS="[:.!]"}{if (tolower($0) ~ "$" ) print $0 }'

which splits a single line into separate records on !, . and :. I also want to split on symbols such as the green check mark (✅). I am having trouble understanding the syntax, so I wrote it like this:

awk 'BEGIN{RS="[:.!\u2705]"}{if (tolower($0) ~ "$" ) print $0 }'

which doesn't seem to work.

Sample input is this:

✅  Team collaboration  ✅  Project organisation✅  SSO support✅  API Access✅  Priority Support 
Ardie

1 Answer

You need to use a regex with an alternation operator (|) because the character you want to split on is encoded as three separate UTF-8 code units (bytes): E2, 9C and 85.
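To check those three bytes yourself, here is a minimal sketch that dumps the check mark as hex. It assumes the script file is saved as UTF-8 and that the POSIX tools od and tr are available; the variable name bytes is only for illustration.

```shell
# Dump the check mark one byte at a time as two-digit hex values,
# then strip the spaces and newline that od adds between them.
bytes=$(printf '%s' '✅' | od -An -tx1 | tr -d ' \n')

# Prints e29c85, i.e. the bytes E2, 9C and 85.
echo "$bytes"
```

The same approach works for any other symbol you want to add to the separator: dump its bytes, then write them as \xHH escapes in the alternation.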

You can use

awk 'BEGIN{RS="[:.!]|\xE2\x9C\x85"} tolower($0) ~ "$"'

See the online demo:

#!/bin/bash
s='✅ Team collaboration ✅ Project organisation✅ SSO support✅ API Access✅ Priority Support'
awk 'BEGIN{RS="[:.!]|\xE2\x9C\x85"} tolower($0) ~ "$"' <<< "$s"

Output:


 Team collaboration 
 Project organisation
 SSO support
 API Access
 Priority Support

Note that print $0 is the default action, so there is no need to write it explicitly.
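As a quick check of the default action, the sketch below runs the same pattern with and without an explicit print and compares the results. It makes two assumptions that are not in the answer above: it uses an ASCII-only sample string, and it swaps in NF as the pattern (true only when a record has at least one field), which also drops empty records such as the blank first line in the output above. It also assumes an awk that treats a multi-character RS as a regex, such as gawk or mawk.

```shell
# Hypothetical sample text using the three ASCII separators from the question.
s='one:two.three!four'

# Explicit action: print every record that has at least one field.
explicit=$(printf '%s\n' "$s" | awk 'BEGIN{RS="[:.!]"} NF { print $0 }')

# Pattern only: awk falls back to its default action, print $0.
implicit=$(printf '%s\n' "$s" | awk 'BEGIN{RS="[:.!]"} NF')

# Both forms should produce identical output.
if [ "$explicit" = "$implicit" ]; then
  echo "same output"
fi
printf '%s\n' "$implicit"
```

Since the regex "$" matches the end of any string, tolower($0) ~ "$" is true for every record, so any always-true pattern, for example a bare 1, selects the same records.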

Wiktor Stribiżew