I am trying to sort a NumPy array by one of its columns, and to do that I have to change the dtype so the columns have names. But when I change the dtype, the array transforms itself (columns become rows) and every value is duplicated. I am using .astype, as you can see below:

>>> edist
array([[2.50000000e+00, 1.45000000e+02, 4.46500000e+03, 1.41958256e+01],
       [2.00000000e+00, 1.45500000e+02, 4.46500000e+03, 1.51561499e+01],
       [1.50000000e+00, 1.46000000e+02, 4.46500000e+03, 1.60814095e+01],
       ...,
       [1.24828883e+02, 2.34000000e+02, 4.55500000e+03, 1.18762398e+01],
       [1.25175876e+02, 2.34500000e+02, 4.55500000e+03, 6.60787582e+00],
       [1.25523902e+02, 2.35000000e+02, 4.55500000e+03, 1.16466343e+00]])
>>> edist.astype(typ)
array([[(2.50000000e+00, 2.50000000e+00, 2.50000000e+00, 2.50000000e+00),
        (1.45000000e+02, 1.45000000e+02, 1.45000000e+02, 1.45000000e+02),
        (4.46500000e+03, 4.46500000e+03, 4.46500000e+03, 4.46500000e+03),
        (1.41958256e+01, 1.41958256e+01, 1.41958256e+01, 1.41958256e+01)],
       [(2.00000000e+00, 2.00000000e+00, 2.00000000e+00, 2.00000000e+00),
        (1.45500000e+02, 1.45500000e+02, 1.45500000e+02, 1.45500000e+02),
        (4.46500000e+03, 4.46500000e+03, 4.46500000e+03, 4.46500000e+03),
        (1.51561499e+01, 1.51561499e+01, 1.51561499e+01, 1.51561499e+01)],
       [(1.50000000e+00, 1.50000000e+00, 1.50000000e+00, 1.50000000e+00),
        (1.46000000e+02, 1.46000000e+02, 1.46000000e+02, 1.46000000e+02),
        (4.46500000e+03, 4.46500000e+03, 4.46500000e+03, 4.46500000e+03),
        (1.60814095e+01, 1.60814095e+01, 1.60814095e+01, 1.60814095e+01)],
       ...,
       [(1.24828883e+02, 1.24828883e+02, 1.24828883e+02, 1.24828883e+02),
        (2.34000000e+02, 2.34000000e+02, 2.34000000e+02, 2.34000000e+02),
        (4.55500000e+03, 4.55500000e+03, 4.55500000e+03, 4.55500000e+03),
        (1.18762398e+01, 1.18762398e+01, 1.18762398e+01, 1.18762398e+01)],
       [(1.25175876e+02, 1.25175876e+02, 1.25175876e+02, 1.25175876e+02),
        (2.34500000e+02, 2.34500000e+02, 2.34500000e+02, 2.34500000e+02),
        (4.55500000e+03, 4.55500000e+03, 4.55500000e+03, 4.55500000e+03),
        (6.60787582e+00, 6.60787582e+00, 6.60787582e+00, 6.60787582e+00)],
       [(1.25523902e+02, 1.25523902e+02, 1.25523902e+02, 1.25523902e+02),
        (2.35000000e+02, 2.35000000e+02, 2.35000000e+02, 2.35000000e+02),
        (4.55500000e+03, 4.55500000e+03, 4.55500000e+03, 4.55500000e+03),
        (1.16466343e+00, 1.16466343e+00, 1.16466343e+00, 1.16466343e+00)]],
  dtype=[('Eucli Dist', '<f8'), ('x', '<f8'), ('y', '<f8'), ('Magn', '<f8')])

typ is the new dtype

 typ = [('Eucli Dist', float), ('x', float), ('y', float), ('Magn', float)]

UPDATE: I then tried the following:

tes = [tuple(i) for i in edist]
np.reshape(np.array(tes, dtype=typ), (len(tes),1))

which results in:

array([[(  2.5       , 145. , 4465., 14.19582558)],
       [(  2.        , 145.5, 4465., 15.15614986)],
       [(  1.5       , 146. , 4465., 16.08140945)],
       ...,
       [(124.82888288, 234. , 4555., 11.87623978)],
       [(125.17587627, 234.5, 4555.,  6.60787582)],
       [(125.52390211, 235. , 4555.,  1.16466343)]],
      dtype=[('Eucli Dist', '<f8'), ('x', '<f8'), ('y', '<f8'), ('Magn', '<f8')])

But I want it to look like:

[[..., ..., ..., ...],
 [..., ..., ..., ...]]

(Accessing the values works the same way, but...)

TheGame

1 Answer

Try

import numpy as np
import numpy.lib.recfunctions as rf
rf.unstructured_to_structured(edist, np.dtype(typ))

That should convert the dtype without duplicating the values, giving one record per row of edist.

In [221]: arr = np.arange(12).reshape(3,4)
In [222]: typ = [('Eucli Dist', float), ('x', float), ('y', float), ('Magn', float)]

In [224]: rf.unstructured_to_structured(arr, dtype=np.dtype(typ))
Out[224]: 
array([(0., 1.,  2.,  3.), (4., 5.,  6.,  7.), (8., 9., 10., 11.)],
      dtype=[('Eucli Dist', '<f8'), ('x', '<f8'), ('y', '<f8'), ('Magn', '<f8')])
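
For contrast, here is a minimal sketch of why .astype gave the duplicated output in the question. When a plain float array is cast to a structured dtype, each scalar is treated as its own element and its value is copied into every field, so the 2-D shape is kept. unstructured_to_structured instead packs each row into a single record. The small arr is made-up sample data, and the names cast and packed are only for illustration.

```python
import numpy as np
import numpy.lib.recfunctions as rf

typ = [('Eucli Dist', float), ('x', float), ('y', float), ('Magn', float)]
arr = np.arange(12, dtype=float).reshape(3, 4)  # made-up sample data

# astype casts every element separately: each scalar fills all four fields,
# so the result is still 2-D and holds 3 x 4 records.
cast = arr.astype(typ)
print(cast.shape)

# unstructured_to_structured consumes the last axis: one record per row.
packed = rf.unstructured_to_structured(arr, dtype=np.dtype(typ))
print(packed.shape)
```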

Your typ is a list, which is turned into a dtype with np.dtype(typ).

Often you can supply the list, and the conversion to dtype is automatic. But apparently this function (at least in the version shown here) doesn't do that and requires an actual dtype object:

In [225]: rf.unstructured_to_structured(arr, dtype=typ)
Traceback (most recent call last):
  File "<ipython-input-225-f8d886d54037>", line 1, in <module>
    rf.unstructured_to_structured(arr, dtype=typ)
  File "<__array_function__ internals>", line 5, in unstructured_to_structured
  File "/usr/local/lib/python3.8/dist-packages/numpy/lib/recfunctions.py", line 1067, in unstructured_to_structured
    fields = _get_fields_and_offsets(dtype)
  File "/usr/local/lib/python3.8/dist-packages/numpy/lib/recfunctions.py", line 875, in _get_fields_and_offsets
    for name in dt.names:
AttributeError: 'list' object has no attribute 'names'

Specifically:

In [226]: typ.names
Traceback (most recent call last):
  File "<ipython-input-226-f621147aab56>", line 1, in <module>
    typ.names
AttributeError: 'list' object has no attribute 'names'

In [227]: np.dtype(typ).names
Out[227]: ('Eucli Dist', 'x', 'y', 'Magn')
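
Since the original goal was sorting by one column, here is a rough sketch of two ways to get there, using the first three rows shown in the question as sample data. The variable names (rec, rec_sorted, back_to_2d, by_column) are only for illustration. The first route goes through the structured dtype and back; the second skips the dtype change and reorders the rows of the plain 2-D array directly.

```python
import numpy as np
import numpy.lib.recfunctions as rf

typ = [('Eucli Dist', float), ('x', float), ('y', float), ('Magn', float)]
# First three rows shown in the question, used as sample data.
edist = np.array([[2.5, 145.0, 4465.0, 14.1958256],
                  [2.0, 145.5, 4465.0, 15.1561499],
                  [1.5, 146.0, 4465.0, 16.0814095]])

# Option 1: structured route. Convert, sort by field name, convert back.
rec = rf.unstructured_to_structured(edist, dtype=np.dtype(typ))
rec_sorted = np.sort(rec, order='Eucli Dist')
back_to_2d = rf.structured_to_unstructured(rec_sorted)

# Option 2: keep the plain 2-D array and reorder its rows by column 0.
by_column = edist[np.argsort(edist[:, 0], kind='stable')]

print(back_to_2d)
print(by_column)
```

If the named fields are not needed for anything else, the second option is probably the simpler one, because the array keeps the [[..., ..., ..., ...], ...] layout throughout.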

rf.unstructured_to_structured docs (accessed via ipython '?')

Signature:
rf.unstructured_to_structured(
    arr,
    dtype=None,
    names=None,
    align=False,
    copy=False,
    casting='unsafe',
)
Parameters
----------
arr : ndarray
   Unstructured array or dtype to convert.
dtype : dtype, optional
   The structured dtype of the output array
...
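
The names parameter in the signature above takes field names only, so a list of (name, type) tuples such as typ is not a valid value for it. Here is a small sketch, assuming it is acceptable for every field to inherit the dtype of the input array (float64 here). The two-row edist is sample data taken from the question, and the names list is built from typ for illustration.

```python
import numpy as np
import numpy.lib.recfunctions as rf

typ = [('Eucli Dist', float), ('x', float), ('y', float), ('Magn', float)]
# Two rows from the question, used as sample data.
edist = np.array([[2.5, 145.0, 4465.0, 14.1958256],
                  [2.0, 145.5, 4465.0, 15.1561499]])

# Keep only the name part of each (name, type) pair.
names = [name for name, _ in typ]

# With names= (and no dtype=), each field takes the dtype of the input array.
structured = rf.unstructured_to_structured(edist, names=names)
print(structured.dtype.names)
print(structured['x'])
```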
hpaulj
  • Trying "rf.unstructured_to_structured(edist, typ)", I get the following error: for name in dt.names: AttributeError: 'list' object has no attribute 'names'. – TheGame Jan 29 '21 at 09:09
  • Then I changed it to "rf.unstructured_to_structured(edist, names=typ)" and I get the following error: in unstructured_to_structured out_dtype = np.dtype([(n, arr.dtype) for n in names], align=align) TypeError: Field name must be a str – TheGame Jan 29 '21 at 09:12
  • What exactly should I read? Can't you give a solution? – TheGame Jan 29 '21 at 15:19
  • Your `typ` is a list, which this function does not automatically convert to `dtype`. Example added. – hpaulj Jan 29 '21 at 16:30