
I think the answer is no, but is there any way to fix some columns of a numpy array as integer and the rest as float?

Why I'm asking:

I have a series of items, each of the form (i, j, x, y, z), where each (i, j) is a unique, non-recurring integer pair and x, y, z are floats. I need to keep them together, and there are several ways to do it that will work, including but not limited to:

  • a list of 5-tuples
  • a dictionary where i, j is the key
  • or even a pandas DataFrame

The following are true:

  • Speed is not a critical concern
  • the series does not need to be in any specific order or sorted
  • integer values will be small, less than +/- 500
  • total length will be small, less than 10,000 5-tuples

The easiest way for me to work with and manipulate them at the moment would be an n x 5 numpy array, but with dtype np.float64 I am concerned that I can't always count on i and j remaining exact integers, and that this may cause trouble down the line that I can't anticipate.
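To make the concern concrete, here is a minimal sketch of the single-dtype approach, assuming made-up sample rows (the values below are illustrative, not real data). float64 represents every integer of magnitude up to 2**53 exactly, so values under +/- 500 survive storage unchanged; the risk would come from later arithmetic on those columns, not from storing them.

```python
import numpy as np

# Hypothetical sample rows of the form (i, j, x, y, z)
rows = [(3, 2, 3.3, 2.2, 1.1), (-499, 500, 1.1, 2.2, 3.3)]
arr = np.array(rows, dtype=np.float64)  # shape (2, 5), everything stored as float

# Check that the first two columns still hold whole numbers
ij = arr[:, :2]
ij_is_integral = bool(np.all(ij == np.round(ij)))

# Recover an integer copy of the (i, j) columns when integers are needed
ij_int = ij.astype(np.int64)
```

A check like `ij_is_integral` could be run after any step that touches the first two columns, as a guard against accidental non-integer values.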

uhoh
  • I'm voting to close my own question as a duplicate of [Store different datatypes in one NumPy array?](https://stackoverflow.com/questions/11309739/store-different-datatypes-in-one-numpy-array) It turns out that [this answer](https://stackoverflow.com/a/50442728/3904031) sort-of works but it seems that we can't actually have different columns of different data types in a 2D array. – uhoh Dec 02 '19 at 07:15
  • You can define a `compound dtype`, and make a 1d `structured` array. You access `fields` by name. It can be convenient, but you can't do math across fields with the same ease as with a 2d array with just one `dtype`. But you might find a `pandas` dataframe to be more convenient (and you can transfer data between the structures). – hpaulj Dec 02 '19 at 07:51
  • @hpaulj yes I suspect that pandas will be the best solution for me if I want to use an array-like container. With your confirmation that I can't do what I've asked about, I'm going to go ahead and finish closing this as duplicate. Thanks for your help! – uhoh Dec 02 '19 at 08:17
  • You could define a `dtype` that has a 2 element integer field, and a 3 element float field. `[('ij',int,2), ('xyz',float, 3)]`. Then `data['xyz']` will be a (n,3) float array. – hpaulj Dec 02 '19 at 08:22
  • @hpaulj wait, maybe you do have an answer for me, perhaps I closed too soon. But when I try `wow = [(3, 2, 3.3, 2.2, 1.1), (1, 2, 1.1, 2.2, 3.3)]` and `dtype = [('ij', int, 2), ('xyz', float, 3)]` and `array = np.array(wow, dtype=dtype)` I get `ValueError: could not assign tuple of length 5 to structure with 2 fields.` – uhoh Dec 02 '19 at 08:29
  • same when including quotes on the data types `dtype = [('ij', 'int', 2), ('xyz', 'float', 3)]` – uhoh Dec 02 '19 at 08:31
  • @hpaulj have just asked [How to build a numpy structured array with 2 int columns and 3 float columns?](https://stackoverflow.com/q/59135931/3904031) – uhoh Dec 02 '19 at 09:30
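For reference, a sketch of one way the compound dtype from the comments can be populated, assuming the `ValueError` arises because each record must be a tuple with one entry per field (two entries here, not five). The `nested` name is introduced for illustration only; `wow`, `dtype`, and `array` are the names used in the comments.

```python
import numpy as np

wow = [(3, 2, 3.3, 2.2, 1.1), (1, 2, 1.1, 2.2, 3.3)]
dtype = [('ij', int, 2), ('xyz', float, 3)]

# Regroup each flat 5-tuple into a 2-tuple: ((i, j), (x, y, z)),
# so that every record has exactly one entry per field of the dtype
nested = [(row[:2], row[2:]) for row in wow]
array = np.array(nested, dtype=dtype)  # 1d structured array, shape (2,)

ij = array['ij']    # integer array of shape (2, 2)
xyz = array['xyz']  # float array of shape (2, 3)
```

The result is a 1d array of records, not a 2d array with mixed column types, so math across `ij` and `xyz` together still requires pulling the fields out separately.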
