1

Suppose we have a numpy array.

import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

Now, if I want to extract columns 0 and 2, I can index with a list of column numbers:

b = a[:, [0, 2]]

However, if I inspect the properties of b with b.flags, I get

C_CONTIGUOUS : False
F_CONTIGUOUS : True

As can be seen, b is F-contiguous even though a, the array it was extracted from, is C-contiguous. This usually poses no problem when I run my code on a single core. However, when I use mpi4py to scatter data across multiple cores, the data must be C-contiguous, or else the scattering is incorrect.

My question is: how can I prevent b from automatically becoming F-contiguous?

Thanks,

SLB

suvayu
  • This swap, which can be seen in `b.strides` (`(8, 24)`), is AFAIK undocumented and usually goes unnoticed. I only became aware of it recently in another SO question. – hpaulj Aug 25 '21 at 21:26

3 Answers

5

First off, you can get a C-contiguous version of b by copying it to a new array: fixedb = b.copy().

As for why this happens, it might be an efficient implementation of mixing basic and advanced indexing. It looks like a hidden array is created when copying from a, and b is then made as an F-contiguous view of that hidden array:

a

#array([[1, 2, 3],
#       [4, 5, 6],
#       [7, 8, 9]])

a.base

#None
# means a is the original array

b = a[:, [0, 2]]

#array([[1, 3],
#       [4, 6],
#       [7, 9]])

b.base

#array([[1, 4, 7],
#       [3, 6, 9]])
# b does NOT own its data: it is a view of a hidden array
# and that hidden array isn't a either

fixedb = b.copy()
#array([[1, 3],
#       [4, 6],
#       [7, 9]])

fixedb.base

#None
# new array is original, and by default copy makes C-contiguous
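
One way to check the relationship described above is to compare b with the transpose of its base. This is only a minimal sketch: the memory layout produced by mixed indexing is an implementation detail that may vary between NumPy versions, so the flags and strides are printed rather than assumed. Only the last step, that b.copy() gives a C-contiguous array that owns its data, relies on documented behavior.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
b = a[:, [0, 2]]

# Layout of the indexing result (may differ between NumPy versions)
print(b.flags['C_CONTIGUOUS'], b.flags['F_CONTIGUOUS'])
print(b.strides)

# If b is a view of a hidden array, check whether it is that array's transpose
if b.base is not None:
    print(np.array_equal(b, b.base.T))

# copy() defaults to order='C', so the result is C-contiguous and owns its data
fixedb = b.copy()
print(fixedb.flags['C_CONTIGUOUS'], fixedb.base is None)
```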
BatWannaBe
  • Gentle reminder that as the poster, you can properly resolve this post by clicking the check mark under the votes of a solution. It helps keep the Stack Overflow system moving. – BatWannaBe Aug 31 '21 at 19:31
1

I'm just guessing here, but I think the F_CONTIGUOUS flag is due to how numpy handles multi-column slicing. You'll also see the flag OWNDATA : False in b.flags, which suggests that b is a view of another array. In this case it is not a view of a, but rather of:

>>> b.base
array([[1, 4, 7],
       [3, 6, 9]])

with the flags:

>>> b.base.flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
...

This suggests to me that numpy sliced the array by creating an empty array large enough to hold the columns, and then copying the columns one by one into it, like:

>>> b.base.ravel()
array([1, 4, 7, 3, 6, 9])

But by viewing that array as F-contiguous instead of C-contiguous, numpy presents the data column-wise.

If you want a new C-contiguous array containing the same shape and data you could simply copy the view:

>>> c = b.copy()
>>> c.flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False
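
Since advanced indexing always returns a copy of the selected data, b should never share memory with a, whatever b.base turns out to be. A quick sketch to confirm this, using the same a and b as in the question:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
b = a[:, [0, 2]]

# b was built from copied data, so it does not overlap with a in memory
shares = np.shares_memory(a, b)
print(shares)

# Writing to b therefore leaves a untouched
b[0, 0] = 100
print(a[0, 0], b[0, 0])
```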
Iguananaut
    So `b = b.base.T`. `b.strides` is `(8, 24)`, which is the kind of reversal that we see in a transpose. – hpaulj Aug 25 '21 at 21:16
0

You can use np.ascontiguousarray or np.require to ensure that b is in C-order:

b = np.ascontiguousarray(b)

or:

b = np.require(b, requirements='C')
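
If the column selection happens in several places, it can be wrapped in a small helper. The name select_columns_c below is made up for illustration and is not part of numpy; this is just a sketch of the idea above.

```python
import numpy as np

def select_columns_c(arr, cols):
    """Return the given columns of arr as a C-contiguous array (hypothetical helper)."""
    picked = arr[:, cols]
    # np.ascontiguousarray copies only when the input is not already C-contiguous
    return np.ascontiguousarray(picked)

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
b = select_columns_c(a, [0, 2])
print(b.flags['C_CONTIGUOUS'])  # True
```

The result should be safe to pass to a scatter call like the one in the question, though the MPI side is not exercised here.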
Jan Christoph Terasa