4

The intent of this question is to post a canonical answer to a problem with a non-obvious solution - copying arrays of arrays (requires GNU awk for arrays of arrays).

Given an array of arrays such as shown in the gawk manual on the section about traversing arrays:

BEGIN {
    a[1] = 1
    a[2][1] = 21
    a[2][2] = 22
    a[3] = 3
    a[4][1][1] = 411
    a[4][2] = 42

    walk_array(a, "a")
}

function walk_array(arr, name,      i)
{
    for (i in arr) {
        if (isarray(arr[i]))
            walk_array(arr[i], (name "[" i "]"))
        else
            printf("%s[%s] = %s\n", name, i, arr[i])
    }
}

how would you write a copy_array function that can handle arrays of arrays to copy an existing array to a new array such that a subsequent call to walk_array() for the newly copied array would output the same values for the new array as for the original, i.e. so that this:

BEGIN {
    a[1] = 1
    a[2][1] = 21
    a[2][2] = 22
    a[3] = 3
    a[4][1][1] = 411
    a[4][2] = 42

    walk_array(a, "a")

    copy_array(a, b)

    print "----------"

    walk_array(b, "b")
}

would output:

a[1] = 1
a[2][1] = 21
a[2][2] = 22
a[3] = 3
a[4][1][1] = 411
a[4][2] = 42
----------
b[1] = 1
b[2][1] = 21
b[2][2] = 22
b[3] = 3
b[4][1][1] = 411
b[4][2] = 42
Ed Morton
  • 188,023
  • 17
  • 78
  • 185

1 Answers1

5
$ cat tst.awk
BEGIN {
    a[1] = 1
    a[2][1] = 21
    a[2][2] = 22
    a[3] = 3
    a[4][1][1] = 411
    a[4][2] = 42

    walk_array(a, "a")

    copy_array(a, b)

    print "----------"

    walk_array(b, "b")
}

function copy_array(orig, copy,      i)
{
    delete copy         # Empty "copy" for first call and delete the temp
                        # array added by copy[i][1] below for subsequent.
    for (i in orig) {
        if (isarray(orig[i])) {
            copy[i][1]  # Force copy[i] to also be an array by creating a temp
            copy_array(orig[i], copy[i])
        }
        else {
            copy[i] = orig[i]
        }
    }
}

function walk_array(arr, name,      i)
{
    for (i in arr) {
        if (isarray(arr[i]))
            walk_array(arr[i], (name "[" i "]"))
        else
            printf("%s[%s] = %s\n", name, i, arr[i])
    }
}

.

$ awk -f  tst.awk
a[1] = 1
a[2][1] = 21
a[2][2] = 22
a[3] = 3
a[4][1][1] = 411
a[4][2] = 42
----------
b[1] = 1
b[2][1] = 21
b[2][2] = 22
b[3] = 3
b[4][1][1] = 411
b[4][2] = 42

The use of copy[i][1] to create a temp array before the internal call to copy_array() which is then deleted on entry to copy_array() is to avoid the subsequent code from assuming that what exists at copy[i] is a scalar - this is the same as how you have to create a temp array before using split() (which internally first deletes the arry you pass as an argument) to populate a subarray because the content of an array element is assumed to be a scalar by default for backward compatibility with code written for awks that do not support arrays of arrays (e.g. POSIX awks):

$ printf 'a b\nc d\n' |
  awk '{split($0,arr[NR])} END{for (i in arr) for (j in arr[i]) print i,j,arr[i][j]}'
awk: cmd. line:1: (FILENAME=- FNR=1) fatal: split: second argument is not an array

$ printf 'a b\nc d\n' |
  awk '{arr[NR][1]; split($0,arr[NR])} END{for (i in arr) for (j in arr[i]) print i,j,arr[i][j]}'
1 1 a
1 2 b
2 1 c
2 2 d
Ed Morton
  • 188,023
  • 17
  • 78
  • 185