The "invisible" FS in a comma delimited array entry

Question

Suppose I have:

awk 'BEGIN{
         c["1","2","3"]=1 
         c["12","3"]=2 
         c["123"]=3          # fleeting...
         c["1","23"]=4 
         c["1" "2" "3"]=5    # will replace c["123"] above...
         for (x in c) {
            print length(x), x, c[x]
            split(x, d, "")      # is there something that would split c["12", "3"] into "12, 3"?
                                 # better: some awk / gawkism in one step?
            for (i=1; i <= length(x); i++)
               printf("|%s|", d[i])
            print "\n"
            }
        }'

Prints:

4 123 4
|1||||2||3|

3 123 5
|1||2||3|

4 123 2
|1||2||||3|

5 123 1
|1||||2||||3|

In each case, using `,` to form the array index produces a visually identical result (123) when printed in the terminal, but a distinct key. There appears to be an 'invisible' separator between the elements that is lost when printing (i.e., what delimiter makes c["12", "3"] hash differently than c["123"]?).

What value should I pass to split to determine where in the string the comma was placed when the array index was created? I.e., if I created an array entry with c["12","3"], what is the easiest way to print "12","3" so that it is visually distinct (in the terminal) from c["123"]?

(I know that I could do c["12" "," "3"] when creating the array entry. But what makes c["12","3"] hash differently than c["123"] and how to print those so they are seen differently in the terminal...)

score 3 · Accepted Answer · answered Mar 23 '17 at 17:16

c["12","3"] is the same array element as c["12" SUBSEP "3"]

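See SUBSEP in the awk man pages. It is the string awk inserts between comma-separated subscripts; in gawk the default is "\034", a non-printing character, which is why the keys look identical in the terminal. You can set SUBSEP=FS in the BEGIN section if you have a CSV and want to write c["12","3"] instead of c["12" FS "3"] and have commas printed as the separator in the array indices.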
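As a minimal sketch of how the loop in the question could recover the original subscripts, the key can be split on SUBSEP instead of on the empty string. The function name `show_keys` and the quoting of each part are illustrative choices, not part of the original answer; the output is piped through `sort` only because the order of `for (x in c)` is unspecified.

```shell
# Hypothetical helper: prints each key of c with its parts quoted and comma-joined.
show_keys() {
  awk 'BEGIN {
    c["12","3"] = 2      # stored under the key "12" SUBSEP "3"
    c["123"]    = 3      # stored under the plain key "123"
    for (x in c) {
      n = split(x, d, SUBSEP)          # n = number of comma-separated subscripts
      out = ""
      for (i = 1; i <= n; i++)
        out = out (i > 1 ? "," : "") "\"" d[i] "\""
      print out, c[x]
    }
  }' | LC_ALL=C sort                   # for-in order is unspecified, so sort for stable output
}

show_keys
```

With this, the two keys print as `"12","3"` and `"123"`, so they are distinguishable in the terminal. Alternatively, setting SUBSEP to a visible string such as "," in BEGIN should make a plain `print x` show the separator, at the cost of ambiguity if the subscripts themselves can contain that string.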

Ed Morton

Perfect. Thanks. `SUBSEP` is `gawk` only though, correct? – dawg Mar 23 '17 at 17:21
No, SUBSEP is in every awk version. – Ed Morton Mar 23 '17 at 17:29
Hmmm. Bruce Barnett says [NAWK / GAWK only](http://www.grymoire.com/Unix/Awk.html#uh-62) but it is clearly in the [POSIX man page](http://pubs.opengroup.org/onlinepubs/009695399/utilities/awk.html). Bruce needs an edit. – dawg Mar 23 '17 at 17:36
Well.... nawk is a very old, pre-POSIX awk, so maybe he means as opposed to old, broken awk (aka oawk)? – Ed Morton Mar 23 '17 at 18:36
