I have some Python code that converts a Python list into an XML element. It's meant for interacting with LabVIEW, hence the unusual XML array format. Here's the code:

def pack(data):
  # create the result element
  result = xml.Element("Array")

  # report the dimensions
  ref = data
  while isinstance(ref, list):
    xml.SubElement(result, "Dimsize").text = str(len(ref))
    ref = ref[0]

  # flatten the data
  while isinstance(data[0], list):
    data = sum(data, [])

  # pack the data
  for d in data:
    result.append(pack_simple(d))

  # return the result
  return result

Now I need to write an unpack() function to convert the packed XML Array back into a Python list. I can extract the array dimensions and data just fine:

def unpack(element):
  # retrieve the array dimensions and data
  lengths = []
  data = []
  for entry in element:
    if entry.tag == "Dimsize":
      lengths.append(int(entry.text))

    else:
      data.append(unpack_simple(entry))

  # now what?

But I am not sure how to unflatten the array. What would be an efficient way to do that?

Edit: Here's what the Python list and the corresponding XML look like. Note: the arrays are n-dimensional.

data = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]

And then the XML version:

<Array>
  <Dimsize>2</Dimsize>
  <Dimsize>2</Dimsize>
  <Dimsize>2</Dimsize>
  <I32>
    <Name />
    <Val>1</Val>
  </I32>

  ... 2, 3, 4, etc.
</Array>

The actual format isn't important, though. I just don't know how to unflatten the list from:

data = [1, 2, 3, 4, 5, 6, 7, 8]

back into:

data = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]

given:

lengths = [2, 2, 2]

Assume pack_simple() and unpack_simple() do the same as pack() and unpack() for the basic data types (int, long, string, boolean).

ctrlc-root

2 Answers

Start from the inside and work outward:

def group(seq, k):
    return [seq[i:i+k] for i in range(0, len(seq), k)]

unflattened = group(group(data, 2), 2)

Your example might be easier to follow if the dimensions were not all the same, but I think the above code should work.
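For an arbitrary number of dimensions, the same idea can be put in a loop. The following is only a sketch: `unflatten` is a hypothetical name, and it assumes `lengths` lists the dimensions outermost first, as in the question's example. The outermost dimension needs no grouping pass, because it is simply whatever is left after the inner ones have been grouped.

```python
def group(seq, k):
    # split seq into consecutive chunks of length k
    return [seq[i:i + k] for i in range(0, len(seq), k)]


def unflatten(flat, lengths):
    # hypothetical helper; lengths is assumed to be outermost-first,
    # e.g. [2, 3] for 2 rows of 3 items each
    result = flat
    # group by the innermost dimension first and work outward;
    # lengths[0] is skipped because the top level forms on its own
    for k in reversed(lengths[1:]):
        result = group(result, k)
    return result


nested = unflatten([1, 2, 3, 4, 5, 6, 7, 8], [2, 2, 2])
table = unflatten([1, 2, 3, 4, 5, 6], [2, 3])
```

Here `nested` reproduces the list from the question, and `table` comes out as two rows of three items, which makes the ordering of the dimensions easier to see than the all-twos example.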

Daren Thomas
  • Daren: I'm going to try it right now. Though, I will have to call group() in a loop for each dimension. – ctrlc-root Jun 22 '12 at 14:56
  • Daren: Yup, that did it. One minor change though, you only need to call group(group(data, 2), 2) twice. Yeah, the example was suboptimal. Also, the dimensions need to be reversed i.e. the innermost number is the last dimension in the file. But this will work, thanks. – ctrlc-root Jun 22 '12 at 15:01
  • correct. argh! i hate getting stuff messed up in my head. I updated the code to only group twice. – Daren Thomas Jun 22 '12 at 15:03

Try the following:

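from functools import reduce  # built in on Python 2, must be imported on Python 3
from operator import mul

def validate(array, sizes):
    # the product of all dimension sizes must equal the number of elements
    if reduce(mul, sizes) != len(array):
        raise ValueError("Array dimension incompatible with desired sizes")

    return array, sizes

def reshape(array, sizes):
    # group repeatedly, starting with the innermost dimension (sizes[0])
    for s in sizes:
        array = [array[i:i + s] for i in range(0, len(array), s)]

    # the last pass wraps everything in one extra list, so unwrap it
    return array[0]

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
length = [2, 2, 3]

# validate() returns a tuple, so unpack it into reshape()'s two arguments
print(reshape(*validate(data, length)))

length = [2, 2, 2]

print(reshape(*validate(data, length)))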

Output being:

[[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]]
Traceback:
   (...)
ValueError: Array dimension incompatible with desired sizes
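Note the order in which reshape() consumes the sizes: the first entry is the innermost dimension, so [2, 2, 3] produces three blocks of 2 x 2 rather than two blocks of 2 x 3. A small check can make this visible. In the sketch below, `shape` is a hypothetical helper (not part of the answer) that reads the dimensions back from a nested list, and `file_dims` stands for sizes stored outermost first, as in the question's XML.

```python
def reshape(array, sizes):
    # same grouping loop as in the answer above
    for s in sizes:
        array = [array[i:i + s] for i in range(0, len(array), s)]
    return array[0]


def shape(nested):
    # hypothetical helper: follow the first element down each level
    # and record the length found there (outermost first)
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return dims


data = list(range(1, 13))

# sizes are consumed innermost-first, so the result is 3 x 2 x 2
innermost_first = reshape(data, [2, 2, 3])

# for dimensions stored outermost-first, reverse them before the call
file_dims = [2, 2, 3]
outermost_first = reshape(data, file_dims[::-1])
```

With this, `shape(innermost_first)` is `[3, 2, 2]` while `shape(outermost_first)` is `[2, 2, 3]`, which matches the reversal the asker mentions in the comments on the other answer.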

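An alternative is to use numpy arrays. For a task this simple, numpy is a rather big dependency, but most common array-related tasks already have an implementation there. Note that numpy's reshape takes the sizes outermost first, the reverse of the reshape() function above, and the number of elements in data must equal the product of the sizes: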

from numpy import array

print(array(data).reshape(*length))  # optionally add .tolist() to convert to list

EDIT: Added data validation

EDIT: Example using numpy arrays (thanks to J.F.Sebastian for the hint)
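To tie this back to the question, here is one way the grouping step might slot into unpack(). It is a sketch under assumptions, not a definitive implementation: `unpack_simple` below is a stand-in that only reads an integer from a `Val` child, the `xml` module from the question is assumed to be `xml.etree.ElementTree`, and the `Dimsize` entries are assumed to be stored outermost first.

```python
import xml.etree.ElementTree as ET


def unpack_simple(entry):
    # stand-in (assumption): read the integer stored in the <Val> child
    return int(entry.find("Val").text)


def unpack(element):
    # retrieve the array dimensions and the flat data
    lengths = []
    data = []
    for entry in element:
        if entry.tag == "Dimsize":
            lengths.append(int(entry.text))
        else:
            data.append(unpack_simple(entry))

    # regroup from the innermost dimension outward; the outermost
    # dimension (lengths[0]) needs no pass of its own
    for k in reversed(lengths[1:]):
        data = [data[i:i + k] for i in range(0, len(data), k)]
    return data


# build a small 2 x 2 example in the question's XML layout
doc = ET.fromstring(
    "<Array><Dimsize>2</Dimsize><Dimsize>2</Dimsize>"
    + "".join("<I32><Name /><Val>%d</Val></I32>" % n for n in range(1, 5))
    + "</Array>"
)
unpacked = unpack(doc)
```

For the 2 x 2 document built here, `unpacked` is `[[1, 2], [3, 4]]`. A size check like validate() could be added before the loop if malformed input is a concern.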

unode
  • Unode: oh nice, this also works, and I don't have to change the code at all. Unfortunately, I already accepted Daren's answer. But I like this. Now why can't I come up with code like this haha. – ctrlc-root Jun 22 '12 at 15:03
  • @root.ctrlc you can unaccept my answer and accept Unodes instead, if it solves your problem better. I think you should! – Daren Thomas Jun 22 '12 at 15:05
  • @Daren: haha, I didn't want to be a jerk. but yes, his is slightly better :) – ctrlc-root Jun 22 '12 at 15:13
  • `reshape()` could be a more specific name than `transform()`. The parameter might be called `array1d` instead of generic `array`. – jfs Jun 22 '12 at 21:04
  • @J.F.Sebastian technically speaking there is nothing that indicates the array should be exclusively 1d. I agree about `reshape`. It actually pointed me to the same action in the numpy toolkit. Editing answer to include this alternative. – unode Jun 23 '12 at 10:24
  • you're right `array1d` is unnecessary as long as the array passes `validate()` function. – jfs Jun 24 '12 at 18:46