
I am trying to speed up a program. I found this post, and I want to implement a solution that resembles the fourth case given in that question.

Here is the relevant part of the code I am using:

count = 0
hist_dat = np.zeros(r**2)
points = np.zeros((r**2, 2))
for a in range(r):
    for b in range(r):
        for i in range(N):
            for j in range(N):
                hist_dat[count] += retval(a/r, (a+1)/r, data_a[i][j])*retval(b/r, (b+1)/r, data_b[i][j])/N
                points[count][0], points[count][1] = (a+0.5)/r, (b+0.5)/r
        count += 1

This code generates the values of a normalized 2D histogram (with "r" divisions in each direction) and the coordinates of those values, both as numpy.ndarray. As the other linked question shows, I am currently using the second-worst possible solution, and it takes several minutes to run.

To start, I want to change how the code builds the points array (I think that once I see how that is done, I can figure something out for hist_dat). What it does is basically this:

[Diagram: creation of the "points" numpy.ndarray]

In the particular case I am working on, both A and B are the same. So for example, it could be like going from array([0, 0.5, 1]) to array([[0,0], [0,0.5], [0,1], [0.5,0], [0.5,0.5], [0.5,1], [1,0], [1,0.5], [1,1]])

Is there a numpy.ndarray method, or an operation on the result of np.arange(), that does what the diagram above shows without requiring for loops?

Or is there an alternative that can do this as fast as what the linked post showed for np.arange()?

  • What exactly does that diagram represent? – gmds Jun 21 '19 at 01:26
  • You have two `numpy.ndarray`'s and you combine them into one that contains all possible combinations. You can think of it as a Cartesian product of sets if you prefer. Also, it is what the code snippet above is doing. –  Jun 21 '19 at 01:30
  • By "set", you mean a mathematical set? In that case, is it possible for the sizes of the two arrays (`data_a` and `data_b`, right?), after being converted to sets, to be unequal (because they have different numbers of unique elements)? – gmds Jun 21 '19 at 01:35
  • No, they are the same size (in fact, in the case I am using they are exactly the same). For example, it could be like going from `array([0, 0.5, 1])` to `array([[0,0], [0,0.5], [0,1], [0.5,0], [0.5,0.5], [0.5,1], [1,0], [1,0.5], [1,1]])`. So it is like an array of "x" and "y" coordinates. –  Jun 21 '19 at 01:39
  • Does my answer address your problem? If it does, you might want to edit your question to focus on the abstract problem of, as you said, taking the Cartesian product of sets (although my answer actually assumes that the elements are unique, it is trivial to add a `np.unique` call) – gmds Jun 21 '19 at 01:42
  • Yes it does, but what is the `n_elements` in the `output` assignment? –  Jun 21 '19 at 01:46

2 Answers


You can use np.c_ to combine the result of np.repeat and np.tile:

import numpy as np

start = 0.5
end = 5.5
step = 1.0

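points = np.arange(start, end, step)  # [0.5, 1.5, 2.5, 3.5, 4.5]
n_elements = len(points)  # number of values along each axis

# repeat gives the slow-varying first column, tile the fast-varying second
output = np.c_[np.repeat(points, n_elements), np.tile(points, n_elements)]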

print(output)

Output:

[[0.5 0.5]
 [0.5 1.5]
 [0.5 2.5]
 [0.5 3.5]
 [0.5 4.5]
 [1.5 0.5]
 [1.5 1.5]
 [1.5 2.5]
 [1.5 3.5]
 [1.5 4.5]
 [2.5 0.5]
 [2.5 1.5]
 [2.5 2.5]
 [2.5 3.5]
 [2.5 4.5]
 [3.5 0.5]
 [3.5 1.5]
 [3.5 2.5]
 [3.5 3.5]
 [3.5 4.5]
 [4.5 0.5]
 [4.5 1.5]
 [4.5 2.5]
 [4.5 3.5]
 [4.5 4.5]]
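
The same idea can be wrapped in a small helper that takes two existing arrays instead of building one with np.arange. This is only a sketch: the name cartesian_pairs is hypothetical (not a NumPy function), and it assumes both inputs are one-dimensional.

```python
import numpy as np


def cartesian_pairs(a, b):
    """Return every (a_i, b_j) pair as the rows of a two-column array."""
    a = np.asarray(a)
    b = np.asarray(b)
    # Repeat each element of `a` len(b) times, and tile all of `b` len(a) times,
    # so the first column varies slowly and the second column varies quickly.
    return np.c_[np.repeat(a, len(b)), np.tile(b, len(a))]


# The example from the question: the same array used for both coordinates.
pairs = cartesian_pairs([0, 0.5, 1], [0, 0.5, 1])
print(pairs)
```

The result has len(a) * len(b) rows, in the same order as the nested a/b loops in the question.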
gmds
  • What is the `n_elements`? –  Jun 21 '19 at 01:45
  • @SV In this case, it is merely an aid to generating input data with `np.arange`. If you have an input array, you would simply assign that to `points` and replace `n_elements` with `len(points)`. – gmds Jun 21 '19 at 01:47

Maybe np.mgrid would help?

import numpy as np
np.mgrid[0:2:.5,0:2:.5].reshape(2,4**2).T

Output:

array([[0. , 0. ],
       [0. , 0.5],
       [0. , 1. ],
       [0. , 1.5],
       [0.5, 0. ],
       [0.5, 0.5],
       [0.5, 1. ],
       [0.5, 1.5],
       [1. , 0. ],
       [1. , 0.5],
       [1. , 1. ],
       [1. , 1.5],
       [1.5, 0. ],
       [1.5, 0.5],
       [1.5, 1. ],
       [1.5, 1.5]])
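
np.mgrid takes slices, so if the coordinates already exist as an array, np.meshgrid may be easier to use. The sketch below builds the bin centers (a + 0.5) / r from the question and pairs them up; r = 4 is an assumed value chosen only for illustration, and centers and grid_points are hypothetical names.

```python
import numpy as np

r = 4  # assumed number of divisions per axis, for illustration only

# Bin centers (a + 0.5) / r for a = 0 .. r-1, as in the question's loop.
centers = (np.arange(r) + 0.5) / r

# indexing="ij" makes the first coordinate vary slowest, like np.mgrid does.
xx, yy = np.meshgrid(centers, centers, indexing="ij")

# Flatten both grids and pair them up as (x, y) rows.
grid_points = np.stack([xx.ravel(), yy.ravel()], axis=1)
print(grid_points.shape)  # (16, 2)
```

With indexing="ij", the row order matches the nested a/b loops in the question, so row a * r + b holds the center of bin (a, b).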
some_name.py