2

I have a server that fetches data in the form of a multidimensional array, so the end result is:

 [
    [
        [n1t1:1, n1s1:2, n1o1:5],
        [n1t2:3, n1s2:8, n1o2:9]
    ],
    [
        [n2t1:9, n2s1:3, n2o1:2],
        [n2t2:5, n2s2:1, n2o2:7]
    ],
    [
        [n3t1:4, n3s1:9, n3o1:2],
        [n3t2:7, n3s2:1, n3o2:5]
    ]
 ]

I need to go through that array, pick out only the s1 values (the second value in each innermost list), and store them in a new array that will be returned as the result.

Option 1:

result = []
parent_enum = 0
while len(array) > parent_enum:
    child_enum = 0
    result.append([])
    while len(array[parent_enum]) > child_enum:
        result[parent_enum].append(array[parent_enum][child_enum][1])
        child_enum += 1
    parent_enum += 1

Option 2:

result = [[] for i in range(len(array))]
parent_enum = 0
while len(array[0]) > parent_enum:
    child_enum = 0
    while len(array) > child_enum:
        result[child_enum].append(array[child_enum][parent_enum][1])
        child_enum += 1
    parent_enum += 1

Is there a difference, and if so, which way would be more efficient and fast, considering that the 2nd dimension holds up to 20 items and the 3rd dimension up to 500?
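One way to settle the speed question empirically is the builtin timeit module (also suggested in the comments below). A minimal sketch, where the test data shape, the variable name array, and the wrapper functions option_1/option_2 are only assumptions for timing purposes:

import timeit

# Throwaway test data; the shape (20 x 500 x 3) is only an assumption for timing
array = [[["t:1", "s:2", "o:5"] for _ in range(500)] for _ in range(20)]

def option_1():
    # Option 1: outer loop over the 1st dimension, inner loop over the 2nd
    result = []
    parent_enum = 0
    while len(array) > parent_enum:
        child_enum = 0
        result.append([])
        while len(array[parent_enum]) > child_enum:
            result[parent_enum].append(array[parent_enum][child_enum][1])
            child_enum += 1
        parent_enum += 1
    return result

def option_2():
    # Option 2: outer loop over the 2nd dimension, inner loop over the 1st
    result = [[] for i in range(len(array))]
    parent_enum = 0
    while len(array[0]) > parent_enum:
        child_enum = 0
        while len(array) > child_enum:
            result[child_enum].append(array[child_enum][parent_enum][1])
            child_enum += 1
        parent_enum += 1
    return result

print(timeit.timeit(option_1, number=100))  # seconds for 100 runs of Option 1
print(timeit.timeit(option_2, number=100))  # seconds for 100 runs of Option 2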

Ademir Gotov
  • 1) I loop through the array and access only 1 value per row instead of 3. 2) A smaller array with less data – Ademir Gotov Jul 02 '19 at 19:48
  • 3
    For questions like this, you can use the [timeit](https://docs.python.org/3/library/timeit.html) builtin module on test data to see the actual runtime differences – G. Anderson Jul 02 '19 at 19:50
  • Thank you, is there a way to check the processing? Or will it be unnoticeable? – Ademir Gotov Jul 02 '19 at 19:52
  • both "options" don't look as most efficient – RomanPerekhrest Jul 02 '19 at 19:54
  • Read [this article](https://hackernoon.com/neat-optimization-trick-reduce-the-number-of-jumps-in-a-nested-loop-a97fdbfd4c2b). – joshwilsonvu Jul 02 '19 at 19:54
  • 1
    Using timeit and noting the time differences between two code snippets will generally give a good approximation of the difference in _processing time_ between them. It may also tell you that there's virtually no difference between the two. Also worth noting, if you're concerned about memory and function calls more than time, and if you're using ipython (or jupyter) you can use [prun](https://stackoverflow.com/questions/7069733/how-do-i-read-the-output-of-the-ipython-prun-profiler-command) to profile the code – G. Anderson Jul 02 '19 at 19:59
  • @JoshWilson Thank you for the article. – Ademir Gotov Jul 02 '19 at 20:00
  • @G.Anderson awesome, thank you for the information! – Ademir Gotov Jul 02 '19 at 20:03
  • @RomanPerekhrest what would be the better way if you don't mind me asking? – Ademir Gotov Jul 02 '19 at 20:04
  • @AdemirGotov, what are `array_num` and `array` in your Option1 as it eventually uses `array_item` ? – RomanPerekhrest Jul 02 '19 at 20:04
  • It was a typo on my side – Ademir Gotov Jul 02 '19 at 20:07

2 Answers

2

The following code should be more readable and still perform well by using the builtin map function (wrapped in list() so that, in Python 3, the nested result is actually materialized):

data = [ ...your data... ]
result = list(map(lambda first:  # for each first-level entry
                  list(map(lambda second:  # for each second-level entry within first
                           second[1],  # take the second value
                           first)),
                  data))

Result:
[
    [
        2,
        8
    ],
    [
        3,
        1
    ],
    [
        9,
        1
    ]
 ]
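A self-contained usage sketch of this map approach, with the sample data written as strings (entries like n1s1:2 are not valid Python literals on their own; the variable names here are only illustrative):

data = [
    [["n1t1:1", "n1s1:2", "n1o1:5"], ["n1t2:3", "n1s2:8", "n1o2:9"]],
    [["n2t1:9", "n2s1:3", "n2o1:2"], ["n2t2:5", "n2s2:1", "n2o2:7"]],
    [["n3t1:4", "n3s1:9", "n3o1:2"], ["n3t2:7", "n3s2:1", "n3o2:5"]],
]

# Materialize the lazy map objects into nested lists
result = list(map(lambda first: list(map(lambda second: second[1], first)), data))
print(result)  # [['n1s1:2', 'n1s2:8'], ['n2s1:3', 'n2s2:1'], ['n3s1:9', 'n3s2:1']]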
joshwilsonvu
  • @AdemirGotov, did you even test it? It'll throw an error. `map` function signature is `map(function_to_apply, list_of_inputs)`. This approach passes iterable as 1st argument – RomanPerekhrest Jul 02 '19 at 20:20
  • 1
    Sorry, that was off the top of my head. Switched the order, the concept is still the same. – joshwilsonvu Jul 03 '19 at 12:24
2

Why not use a simple list comprehension:

arr = [
    [
        ["n1t1:1", "n1s1:2", "n1o1:5"],
        ["n1t2:3", "n1s2:8", "n1o2:9"]
    ],
    [
        ["n2t1:9", "n2s1:3", "n2o1:2"],
        ["n2t2:5", "n2s2:1", "n2o2:7"]
    ],
    [
        ["n3t1:4", "n3s1:9", "n3o1:2"],
        ["n3t2:7", "n3s2:1", "n3o2:5"]
    ]
 ]


result = [[arr_lev3[1] for arr_lev3 in arr_lev2] for arr_lev2 in arr]

print(result)

Sample output:

[['n1s1:2', 'n1s2:8'], ['n2s1:3', 'n2s2:1'], ['n3s1:9', 'n3s2:1']]

And it's more than 2 times faster than the map approach:

In [38]: %timeit result = [[arr_lev3[1] for arr_lev3 in arr_lev2] for arr_lev2 in arr]
753 ns ± 2.24 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [39]: %timeit result2 = list(map(lambda first: list(map(lambda second: second[1], first)), arr))
1.63 µs ± 20.4 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
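
If only the numeric part after the colon is needed (the question's sample shows values like n1s1:2), a hedged variant of the same comprehension, assuming every string contains exactly one colon followed by an integer, would be:

result_nums = [[int(item[1].split(":")[1]) for item in arr_lev2] for arr_lev2 in arr]

print(result_nums)  # [[2, 8], [3, 1], [9, 1]]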
RomanPerekhrest