I need to compare two strings in alphabetical order, not just test them for equality. Is there a way to do string comparison in awk?
31
-
Of course you can - it's primarily a string-processing language. – May 26 '11 at 13:17
-
This is a misconception. For instance the expression `$1 == $2` will falsely report that the strings `001` and `1.0` are equal. – Kaz Aug 15 '21 at 02:35
3 Answers
35
Sure it can:
pax$ echo 'hello
goodbye' | gawk '{if ($0 == "hello") {print "HELLO"}}'
HELLO
You can also do ordered (less-than, greater-than) testing:
pax> printf 'aaa\naab\naac\naad\n' | gawk '{if ($1 < "aac"){print}}'
aaa
aab
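
As a minimal sketch of ordering beyond the first character (the sample strings are made up for illustration, and `LC_ALL=C` is an assumption added here to keep the collation order predictable), the rows below differ only in their third character. Comparing a field against a string constant is always a string comparison:

```shell
# Three hypothetical rows that share the first two characters.
# Only the row that sorts strictly before "Jps" should be printed.
result=$(printf 'Jpr\nJps\nJpt\n' | LC_ALL=C awk '$1 < "Jps"')
echo "$result"   # prints: Jpr
```

Using `<=` instead of `<` would also print the `Jps` row.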

paxdiablo
-
The operator < will only compare first letter per my experience. Hence it will not compare strings. You have to use != operator. – Sumod Jul 22 '14 at 09:49
-
@Sumod, then your implementation of `awk` is broken. In any case `!=` is useless for ordering strings as per the question. See the update for string comparison beyond the first character, and I'd suggest switching to using the GNU variant. – paxdiablo Jul 22 '14 at 11:16
-
OK. I am using the awk that comes with CentOS 6.4. It says GNU awk 3.1.7. Please see the input of my commands. `jps` prints `29420 Jps`, `28009 RunJar`, `27501 DseDaemon`. If I give the command `jps | awk '{if ($2 < "Jps") {print $2}}'`, then only DseDaemon is printed. If I use "!=", then both RunJar and DseDaemon are printed. Hence I reached this conclusion. Please excuse typos. Not able to copy paste exact commands. – Sumod Jul 22 '14 at 11:34
-
@Sumod, if your three lines are `29420 Jps`, `28009 RunJar` and `27501 DseDaemon`, then it's acting correctly. The `DseDaemon` string is the **only** one less than `Jps`. `RunJar` is greater and `Jps` is **equal**, so neither of those will print. Try another line containing `11111 Jpr` and see what happens; I think you'll find it prints out fine. If you want to include the `Jps` line in your output, you should be using `<=` rather than `<`. – paxdiablo Jul 22 '14 at 12:48
-
Beware that `awk` has no explicit typing and tries to convert everything to numbers first, which sometimes leads to "interesting" results: `awk -v a=0200 -v b=02E2 'BEGIN{print(a==b)}'`. Instead of a string comparison you get a numeric comparison: "02E2" is scientific notation for 02*10²=200, so the result is true. You can force string comparison by prefixing a string that is guaranteed not to be numeric, e.g. `awk -v a=0200 -v b=02E2 'BEGIN{print(("x" a)==("x" b))}'` – pmhahn Jun 08 '22 at 13:46
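
The numeric-versus-string behaviour described in the comments can be reproduced with a short sketch. It uses the `001` and `1.0` values from the earlier comment as input fields, and forces string comparison by concatenating an empty string, which is a common alternative to the `"x"` prefix (the variable names `numeric` and `forced` are only for illustration):

```shell
# Input fields that look like numbers are compared numerically: 1 == 1.0
numeric=$(echo '001 1.0' | awk '{ print ($1 == $2) }')

# Concatenating "" turns both operands into strings, so "001" is compared with "1.0"
forced=$(echo '001 1.0' | awk '{ print (($1 "") == ($2 "")) }')

echo "numeric=$numeric forced=$forced"   # prints: numeric=1 forced=0
```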
6
You can do string comparison in awk using the standard comparison operators, unlike in C where you would have to use strcmp().
echo "xxx yyy" > test.txt
cat test.txt | awk '$1!=$2 { print($1 $2); }'
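
If a strcmp()-style three-way result is wanted, a small user-defined function can wrap the operators. This is only a sketch: `strcmp3` is a hypothetical name, not a built-in, and `LC_ALL=C` is an assumption added to keep the ordering predictable:

```shell
# strcmp3 (hypothetical helper) returns -1, 0 or 1.
# Appending "" forces both arguments to be treated as strings.
cmp_out=$(LC_ALL=C awk '
function strcmp3(a, b) {
    a = a ""; b = b ""
    return (a < b) ? -1 : ((a > b) ? 1 : 0)
}
BEGIN { print strcmp3("aaa", "aab"), strcmp3("abc", "abc"), strcmp3("10", "9") }')
echo "$cmp_out"   # prints: -1 0 -1
```

The last call returns -1 because, as strings, "10" sorts before "9".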

Ilya Matveychikov
-
Don't use cat here, awk is perfectly capable of reading a file by itself. Useless Use of Cat Award springs to mind... http://partmaps.org/era/unix/award.html – Fredrik Pihl May 26 '11 at 13:24
-
Well, you are right. It might be as follows: `awk '...' test.txt`. Thanks for the link. – Ilya Matveychikov May 26 '11 at 13:28
-
4
You can check the answer in the nawk manual
echo aaa bbb | awk '{ print ($1 >= $2) ? "true" : "false" }'

Denim Datta

Ian Chang