I have an array of arrays called x, and I am trying to call ravel on it, but the result is the same as x: nothing is flattened. I have also tried flatten(). Can someone explain why this is happening?

import numpy as np

x = np.array([np.array(['0 <= ... < 200 DM', '< 0 DM', 'no checking account'], dtype=object),
       np.array(['critical account/ other credits existing (not at this bank)',
       'existing credits paid back duly till now'], dtype=object),
       np.array(['(vacation - does not exist?)', 'domestic appliances'],
      dtype=object)], dtype=object)

np.ravel(x)

I am actually trying to reproduce the code in this question: "One-hot-encoding multiple columns in sklearn and naming columns", but I am stuck at the ravel() step.

Thanks

jpf
DroppingOff
  • Well, you created an array of objects, not a 2d array. Hence it is already flat: for numpy, the objects are the items in the array. The fact that they *happen* to be arrays does not matter. – Willem Van Onsem Apr 16 '20 at 16:23
  • Thank you for your comment. How could I reproduce the answer to the linked question? From what I can see, it is the same situation that I have, and it worked in that case. – DroppingOff Apr 16 '20 at 16:52

1 Answer

In [455]: x = np.array([np.array(['0 <= ... < 200 DM', '< 0 DM', 'no checking account'], dtype=object),
     ...:        np.array(['critical account/ other credits existing (not at this bank)',
     ...:        'existing credits paid back duly till now'], dtype=object),
     ...:        np.array(['(vacation - does not exist?)', 'domestic appliances'],
     ...:       dtype=object)], dtype=object)
In [456]: x                                                                                            
Out[456]: 
array([array(['0 <= ... < 200 DM', '< 0 DM', 'no checking account'], dtype=object),
       array(['critical account/ other credits existing (not at this bank)',
       'existing credits paid back duly till now'], dtype=object),
       array(['(vacation - does not exist?)', 'domestic appliances'],
      dtype=object)], dtype=object)
In [457]: x.shape                                                                                      
Out[457]: (3,)
In [458]: [i.shape for i in x]                                                                         
Out[458]: [(3,), (2,), (2,)]

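x is a 1d array with 3 elements. Those elements are themselves arrays with differing shapes, so numpy cannot treat x as a 2d array. ravel and flatten only act on the outer array, which is already 1d, so they have nothing to flatten.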
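To make the point concrete, here is a minimal sketch using shortened placeholder strings (an assumption, not the data from the question). It builds the outer object array explicitly, then compares ravel on it with ravel on a genuine 2d array:

```python
import numpy as np

# Shortened placeholder strings standing in for the data in the question.
parts = [
    np.array(['a', 'b', 'c'], dtype=object),
    np.array(['d', 'e'], dtype=object),
    np.array(['f', 'g'], dtype=object),
]

# Fill a 1d object array slot by slot, so each slot holds one inner array.
x = np.empty(len(parts), dtype=object)
for i, p in enumerate(parts):
    x[i] = p

raveled = np.ravel(x)
print(raveled.shape)             # (3,)  still three items
print(raveled[0] is parts[0])    # True: the item is the inner array itself

# A real 2d array, by contrast, is flattened as expected.
grid = np.array([[1, 2, 3], [4, 5, 6]])
print(np.ravel(grid).shape)      # (6,)
```

The outer array only stores references to the inner arrays, so ravel sees three opaque items and never looks inside them.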

One way to flatten it is:

In [459]: np.hstack(x)                                                                                 
Out[459]: 
array(['0 <= ... < 200 DM', '< 0 DM', 'no checking account',
       'critical account/ other credits existing (not at this bank)',
       'existing credits paid back duly till now',
       '(vacation - does not exist?)', 'domestic appliances'],
      dtype=object)
In [460]: _.shape                                                                                      
Out[460]: (7,)
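For the column-naming use case mentioned in the question, it may help to keep track of which column each label came from while flattening. The following is only a sketch: the column names `checking`, `credit_history` and `purpose` are hypothetical and do not appear in the original post, and np.concatenate is used as an alternative that should give the same result as np.hstack for 1d inputs.

```python
import numpy as np

cats = [
    np.array(['0 <= ... < 200 DM', '< 0 DM', 'no checking account'], dtype=object),
    np.array(['critical account/ other credits existing (not at this bank)',
              'existing credits paid back duly till now'], dtype=object),
    np.array(['(vacation - does not exist?)', 'domestic appliances'], dtype=object),
]

# Hypothetical column names, one per inner array (not from the original post).
columns = ['checking', 'credit_history', 'purpose']

# Join the inner arrays end to end into one flat array.
flat = np.concatenate(cats)
print(flat.shape)    # (7,)

# Prefix every label with the name of the column it belongs to.
names = [f"{col}_{cat}" for col, arr in zip(columns, cats) for cat in arr]
print(len(names))    # 7
print(names[0])      # checking_0 <= ... < 200 DM
```

Iterating over the inner arrays directly, as in the list comprehension, avoids the need to flatten first when the per-column grouping still matters.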
hpaulj