
I'm trying to cast a numpy matrix that I have already defined:

    matrix = numpy.array([['name','23','45','1'],
                         ['name2','223','43','5'],
                         ['name3','12','33','2']])

resulting in this:

    array([['name', '23', '45', '1'],
           ['name2', '223', '43', '5'],
           ['name3', '12', '33', '2']],
          dtype='|S5')

I would like to name and cast each column of my matrix to the following types:

    dt = numpy.dtype({'names': ['name', 'x', 'y', 'n'],
                      'formats': ['S10', 'S10', 'S10', 'S10']})

For now I treat every column as a string, because nothing else works. What I actually want is a format like 'formats': ['S10', 'f3', 'f3', 'i'], applied with something like this:

    matrix.astype(dtype=dt, casting='safe')

Result:

    array([[('name', 'name', 'name', 'name'), ('23', '23', '23', '23'),
            ('45', '45', '45', '45'), ('1', '1', '1', '1')],
           [('name2', 'name2', 'name2', 'name2'), ('223', '223', '223', '223'),
            ('43', '43', '43', '43'), ('5', '5', '5', '5')],
           [('name3', 'name3', 'name3', 'name3'), ('12', '12', '12', '12'),
            ('33', '33', '33', '33'), ('2', '2', '2', '2')]],
          dtype=[('name', 'S10'), ('x', 'S10'), ('y', 'S10'), ('n', 'S10')])

What am I missing? How can I define a type for each column of the matrix using numpy?

Wouter
ePascoal

1 Answer


Creating/filling a structured array is a little tricky. There are various ways, but I think the simplest to remember is to use a list of tuples:

    In [11]: np.array([tuple(row) for row in matrix], dtype=dt)
    Out[11]:
    array([('name', '23', '45', '1'),
           ('name2', '223', '43', '5'),
           ('name3', '12', '33', '2')],
          dtype=[('name', 'S10'), ('x', 'S10'), ('y', 'S10'), ('n', 'S10')])

The result is a 1d array, with the dtype fields replacing the columns of the original 2d array. Each element of the new array has the same type, as specified by dt.
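To make the "1d array with fields" point concrete, here is a minimal sketch of the same construction with the shape and field access spelled out. It assumes Python 3, so I swapped `'S10'` for `'U10'` (unicode) to avoid bytes literals; that substitution is mine, not part of the original `dt`.

```python
import numpy as np

# Same data as in the question; on Python 3 these are unicode strings.
matrix = np.array([['name', '23', '45', '1'],
                   ['name2', '223', '43', '5'],
                   ['name3', '12', '33', '2']])

# Assumption: 'U10' instead of 'S10' so the fields hold str rather than bytes.
dt = np.dtype({'names': ['name', 'x', 'y', 'n'],
               'formats': ['U10', 'U10', 'U10', 'U10']})

# One tuple per row becomes one record of the structured array.
arr = np.array([tuple(row) for row in matrix], dtype=dt)

print(arr.shape)   # (3,)
print(arr['x'])    # the former second column: ['23' '223' '12']
print(arr[1])      # the whole second record
```

Indexing with a field name picks what used to be a column; indexing with an integer picks what used to be a row.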

Or you can create an empty array of the desired dtype, and fill it, row by row or field by field:

    In [14]: arr = np.zeros((3,), dt)
    In [16]: arr[0] = tuple(matrix[0, :])  # tuple of row
    In [17]: arr['name'] = matrix[:, 0]    # field

    In [18]: arr
    Out[18]:
    array([('name', '23', '45', '1'),
           ('name2', '', '', ''),
           ('name3', '', '', '')],
          dtype=[('name', 'S10'), ('x', 'S10'), ('y', 'S10'), ('n', 'S10')])
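The session above only fills one row and one field. A sketch of filling every field, with a type conversion on the way in, might look like this. The `'U10'`, `'f8'` and `'i8'` formats and the explicit `astype` calls are my assumptions about what the numeric version should be; they are not in the original code.

```python
import numpy as np

matrix = np.array([['name', '23', '45', '1'],
                   ['name2', '223', '43', '5'],
                   ['name3', '12', '33', '2']])

# Assumption: unicode names, float64 for x and y, int64 for n.
dt = np.dtype([('name', 'U10'), ('x', 'f8'), ('y', 'f8'), ('n', 'i8')])

# Start from an empty (zeroed) 1d structured array, one record per row.
arr = np.zeros(len(matrix), dtype=dt)

# Fill field by field, converting each string column explicitly.
arr['name'] = matrix[:, 0]
arr['x'] = matrix[:, 1].astype(float)
arr['y'] = matrix[:, 2].astype(float)
arr['n'] = matrix[:, 3].astype(int)

print(arr)
```

Converting explicitly with `astype` keeps the string parsing visible instead of relying on whatever the assignment does implicitly.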

With a compatible dt1 (one whose fields add up to the same number of bytes as a row of matrix, here 4 x 5 = 20), view would also work:

    dt1 = numpy.dtype({'names': ['name', 'x', 'y', 'n'],
                       'formats': ['S5', 'S5', 'S5', 'S5']})
    matrix.view(dt1)

This doesn't change the data; it just interprets the bytes differently.
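A small sketch of what the view does to the shape and the memory. One assumption to flag: I force `dtype='S5'` when building `matrix`, because on Python 3 numpy would otherwise pick a unicode dtype with a different byte size, and `dt1` would no longer line up with the rows.

```python
import numpy as np

# Assumption: dtype='S5' is forced so every element is exactly 5 bytes,
# matching the '|S5' array shown in the question.
matrix = np.array([['name', '23', '45', '1'],
                   ['name2', '223', '43', '5'],
                   ['name3', '12', '33', '2']], dtype='S5')

dt1 = np.dtype({'names': ['name', 'x', 'y', 'n'],
                'formats': ['S5', 'S5', 'S5', 'S5']})

print(dt1.itemsize)                     # 20, the same as one row of matrix

rec = matrix.view(dt1)                  # each 20-byte row is read as one record
print(rec.shape)                        # (3, 1)

flat = rec.reshape(-1)                  # drop the leftover axis
print(flat['name'])
print(np.shares_memory(matrix, flat))   # no copy was made
```

Because the memory is shared, writing to `flat['x']` also changes `matrix[:, 1]`.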


Converting the strings to numbers is easy with the list of tuples:

    In [40]: dt2 = numpy.dtype({'names':['name','x','y','n'],'formats': ['S5', 'f', 'f', 'i']})

    In [41]: np.array([tuple(row) for row in matrix], dtype=dt2)
    Out[41]:
    array([('name', 23.0, 45.0, 1),
           ('name2', 223.0, 43.0, 5),
           ('name3', 12.0, 33.0, 2)],
          dtype=[('name', 'S5'), ('x', '<f4'), ('y', '<f4'), ('n', '<i4')])
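If part of the data is still needed as an ordinary 2d array, the numeric fields can be stacked back into one. This is only a sketch: the `'U5'`, `'f8'` and `'i8'` formats and the use of `np.column_stack` are my choices, and the result has a single dtype for all its columns.

```python
import numpy as np

# Assumption: unicode names and 64-bit numeric fields instead of 'S5', 'f', 'i'.
dt2 = np.dtype([('name', 'U5'), ('x', 'f8'), ('y', 'f8'), ('n', 'i8')])

rows = [('name', 23.0, 45.0, 1),
        ('name2', 223.0, 43.0, 5),
        ('name3', 12.0, 33.0, 2)]
arr = np.array(rows, dtype=dt2)

# Stack the numeric fields side by side; the int field is upcast to float.
numeric = np.column_stack([arr['x'], arr['y'], arr['n']])

print(numeric.shape)   # (3, 3)
print(numeric.dtype)   # float64
```

The names stay behind in `arr['name']`, since a plain 2d array cannot mix strings and floats.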
hpaulj
  • None of these approaches gives a matrix. My question is about a 2d array, which in my case should have a shape of (3,4), not (3,) – ePascoal May 09 '15 at 16:37
  • But your `dtype` specifies 4 fields, and you only have 4 columns of data in the source. Do you want to replicate values to each column of the new 2d array? – hpaulj May 09 '15 at 16:40
  • Please don't consider my dtype, the main question is 'How can I define types for each matrix columns using numpy module?' – ePascoal May 09 '15 at 16:45
  • You can't specify type on a column-by-column basis. It's one dtype for every element of the array. – hpaulj May 09 '15 at 16:53
  • Is there no other way to cast each column of a 2d array? – ePascoal May 09 '15 at 17:07
  • To handle the common case of 2-D-like data (i.e. rows and columns) in numpy, you either have a 2-D array of all the same type, or a 1-D array with a structured dtype (i.e. each element of your 1-D array is a structure, with fields `name`, `x`, etc.). For more flexible handling of 2-D-like data, with indexing and labeled columns, take a look at pandas, http://pandas.pydata.org/ – Warren Weckesser May 09 '15 at 17:26
  • Why does it have to be a `(3,4)` array? What are you doing that requires 2 dimensions? – hpaulj May 09 '15 at 22:05