
I have a numpy array that looks like the following:

np.array([
[23, 12, 4, 103, 87, 0.6],
[32, 18, 3, 120, 70, 0.6],
[43, 12, 8, 109, 89, 0.4],
[20, 13, 7, 111, 77, 0.8]
])

I want to transform this array so that the last column of each row becomes its own array, like this:

np.array([
[[23, 12, 4, 103, 87], [0.6]],
[[32, 18, 3, 120, 70], [0.6]],
[[43, 12, 8, 109, 89], [0.4]],
[[20, 13, 7, 111, 77], [0.8]]
])

What would be the best way to go about this? I am relatively new to Python and have tried a few loops, but to no avail. Thanks!

himi64

1 Answer


NumPy requires every row of an array to have the same length; the layout you describe pairs a 5-element row with a 1-element row, so it cannot be a regular numeric array. You can either use two separate variables (i.e. parallel arrays):

X = data[:, :-1]  # all rows, every column except the last
y = data[:, -1]   # all rows, last column only (1-D)

# Result:
X = np.array([
[23, 12, 4, 103, 87],
[32, 18, 3, 120, 70],
[43, 12, 8, 109, 89],
[20, 13, 7, 111, 77],
])


y = np.array([
0.6, 0.6, 0.4, 0.8
])
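As a minimal, self-contained sketch of the parallel-arrays approach, assuming the question's array is bound to the name `data` (`y_col` is a hypothetical extra name, added only to show how to keep the last column two-dimensional):

```python
import numpy as np

# The array from the question; the name `data` is an assumption.
data = np.array([
    [23, 12, 4, 103, 87, 0.6],
    [32, 18, 3, 120, 70, 0.6],
    [43, 12, 8, 109, 89, 0.4],
    [20, 13, 7, 111, 77, 0.8],
])

X = data[:, :-1]      # every column except the last
y = data[:, -1]       # last column as a 1-D array
y_col = data[:, -1:]  # slicing with -1: keeps the column 2-D

print(X.shape)      # (4, 5)
print(y.shape)      # (4,)
print(y_col.shape)  # (4, 1)
```

Row `i` of `X` still corresponds to `y[i]`, so the two arrays can be indexed or iterated together.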

Or you can store a list of pairs:

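# Pair the first five values of each row with a one-element list holding the last value.
my_list = [(row[:-1], [row[-1]]) for row in data]
# Result (shown as plain lists for readability):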
my_list = [
([23, 12, 4, 103, 87], [0.6]),
([32, 18, 3, 120, 70], [0.6]),
([43, 12, 8, 109, 89], [0.4]),
([20, 13, 7, 111, 77], [0.8])
]
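If plain Python lists are wanted rather than NumPy slices, one possible variation is sketched below. It assumes the same `data` array as the question; the name `pairs` and the `.tolist()` / `float()` conversions are additions for illustration, not part of the original answer.

```python
import numpy as np

# The array from the question; the name `data` is an assumption.
data = np.array([
    [23, 12, 4, 103, 87, 0.6],
    [32, 18, 3, 120, 70, 0.6],
    [43, 12, 8, 109, 89, 0.4],
    [20, 13, 7, 111, 77, 0.8],
])

# Convert each slice to a plain list, and wrap the last value in its own list.
pairs = [(row[:-1].tolist(), [float(row[-1])]) for row in data]

print(pairs[0])  # ([23.0, 12.0, 4.0, 103.0, 87.0], [0.6])
```

The values print as floats because the `0.6` column makes NumPy store the whole array as floating point.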

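The best strategy depends on your use case: parallel arrays keep NumPy's vectorized operations available, while a list of pairs matches the nested layout you asked for.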

Arya McCarthy
  • This question isn't an *exact* duplicate, so there's some additional insight to be found here: http://stackoverflow.com/questions/3386259/how-to-make-a-multidimension-numpy-array-with-a-varying-row-size – Arya McCarthy Apr 08 '17 at 21:15
  • I tried using this, but it didn't print the last column as its own array. This is what I got: [ ([23, 12, 4, 103, 87], 0.6), ([32, 18, 3, 120, 70], 0.6), ([43, 12, 8, 109, 89], 0.4), ([20, 13, 7, 111, 77], 0.8) ] – himi64 Apr 09 '17 at 15:30
  • Be sure to include the brackets: [row[-1]]. They're critical. – Arya McCarthy Apr 09 '17 at 16:37
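To illustrate the point about the brackets, here is a small sketch using a single row (the names `row`, `scalar`, and `wrapped` are hypothetical):

```python
import numpy as np

row = np.array([23, 12, 4, 103, 87, 0.6])

scalar = row[-1]     # a bare NumPy scalar, as in the output reported above
wrapped = [row[-1]]  # a one-element list, which is what the answer intends

print(isinstance(scalar, list))   # False
print(isinstance(wrapped, list))  # True
print(len(wrapped))               # 1
```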