Python Numpy return elements in ndarray that are strings

Question

I am trying to get the elements in an ndarray that are strings. That is, exclude the elements that are integers and floats.

Let's say I have this array:

x = np.array([1,'hello',2,'world'])

I want it to return:

array(['hello','world'],dtype = object)

I've tried doing np.where(x == np.str_) to get the indices where that condition is true, but it's not working.

Any help is much appreciated.

If you check `x` after your first line, you'll notice the entire array has `dtype=' — Cory Kramer, Jul 30 '21 at 19:02
See [this post](https://stackoverflow.com/questions/11309739/store-different-datatypes-in-one-numpy-array) that discusses heterogeneous data in numpy. — Cory Kramer, Jul 30 '21 at 19:03
Run `type(x[0])` and you will see that the integers are actually being stored as `numpy.str_`. — cheese12345, Jul 30 '21 at 19:06

score 1 · Accepted Answer · answered Jul 30 '21 at 20:36

You can make a function to do it, and loop over the array:

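def getridofnumbers(num):
    # Return True for values that cannot be parsed as a number.
    try:
        float(num)  # float() also accepts integer strings such as '1'
    except ValueError:
        return True
    return False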

output = np.array([i for i in x if getridofnumbers(i)])

If we want to keep NumPy's conveniences (broadcasting, etc.), we can wrap the function with np.vectorize, or turn it into a ufunc with np.frompyfunc:

import numpy as np
# vectorize the function, with a boolean return type
getrid = np.vectorize(getridofnumbers, otypes=[bool])

x[getrid(x)]
array(['hello', 'world'], dtype='<U11')

# or build a ufunc with frompyfunc; it returns an object array, so cast to bool:
getrid = np.frompyfunc(getridofnumbers, 1, 1)
x[getrid(x).astype(bool)]

Sidx · Answer 2 · 2021-07-30T20:43:12.533

When you run x = np.array([1,'hello',2,'world']), numpy converts everything to string type.
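
To see the coercion, and one way around it, here is a minimal sketch. It assumes you control how the array is built and can pass dtype=object so that the original Python types survive. The names x_str, x_obj, is_str, and strings_only are illustrative and do not come from the question.

```python
import numpy as np

# Without an explicit dtype, NumPy picks one common type: a Unicode string type.
x_str = np.array([1, 'hello', 2, 'world'])
print(x_str.dtype.kind)      # 'U' means every element is now a string
print(type(x_str[0]))        # numpy.str_, no longer an int

# With dtype=object, each element keeps its original Python type.
x_obj = np.array([1, 'hello', 2.5, 'world'], dtype=object)

# Build a boolean mask by checking each element's type.
is_str = np.array([isinstance(v, str) for v in x_obj], dtype=bool)
strings_only = x_obj[is_str]
print(strings_only)          # ['hello' 'world']
```

With an object array the type check is exact, so no string parsing is needed. The trade-off is that object arrays lose most of NumPy's speed advantages.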

If it is a one-dimensional array, you can use:

y = np.array([i for i in x if not i.replace(".","",1).replace("e+","").replace("e-","").replace("-","").isnumeric()])

to get all non-numeric values.

It also handles floats with a negative sign and with e+/e- exponents.

For example, for the input x = np.array([1,'hello',+2e-50,'world', 2e+50,-2, 3/4, 6.5 , "!"]), the output is array(['hello', 'world', '!'], dtype='<U5')
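
The chained replace calls can misclassify some inputs: a string such as '1-2' has its dash stripped and then passes isnumeric(), while '1e5' (no sign after the e) fails it. As an alternative sketch, not part of the original answer, the check can be delegated to float(), which accepts the strings Python itself can parse as a number. The helper name looks_numeric is hypothetical.

```python
import numpy as np

def looks_numeric(value):
    """Return True if float() can parse the value (hypothetical helper)."""
    try:
        float(value)
    except ValueError:
        return False
    return True

# NumPy converts every element to a string here, as described above.
x = np.array([1, 'hello', 2e-50, 'world', -2, 6.5, '1-2', '1e5', '!'])

# Keep only the elements that float() rejects.
y = np.array([s for s in x if not looks_numeric(s)])
print(y)   # ['hello' 'world' '1-2' '!']
```

Note that float() also parses 'nan' and 'inf', so those strings would be treated as numbers under this approach.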
