I am trying to get the elements in an ndarray that are strings. That is, exclude the elements that are integers and floats.

Let's say I have this array:

x = np.array([1,'hello',2,'world'])

I want it to return:

array(['hello','world'],dtype = object)

I've tried doing np.where(x == np.str_) to get the indices where that condition is true, but it's not working.

Any help is much appreciated.

JoshPickel

2 Answers

You can make a function to do it, and loop over the array:

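def getridofnumbers(num):
    # Return True for values that cannot be parsed as a number.
    try:
        float(num)  # float() accepts both integer and float strings
    except (TypeError, ValueError):
        return True
    return False

output = np.array([i for i in x if getridofnumbers(i)])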

If we want to keep all the numpy goodness (broadcasting etc.), we can wrap the function with np.vectorize, or turn it into a ufunc with np.frompyfunc:

import numpy as np
# vectorize the function, with a boolean return type
getrid = np.vectorize(getridofnumbers, otypes=[bool])

x[getrid(x)]
array(['hello', 'world'], dtype='<U11')

# or a ufunc, which returns an object array and so requires casting to bool:
getrid = np.frompyfunc(getridofnumbers, 1, 1)
x[getrid(x).astype(bool)]
jeremycg

When you run x = np.array([1,'hello',2,'world']), numpy converts everything to string type.
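The conversion can be checked directly. The following is a minimal sketch, assuming you control how the array is created: passing dtype=object keeps the original Python types, so an isinstance test becomes possible. The names x_obj, mask, and strings_only are illustrative and not from the question.

```python
import numpy as np

# Without an explicit dtype, the mixed list is coerced to a unicode string array.
x = np.array([1, 'hello', 2, 'world'])
print(x.dtype.kind)  # 'U' (every element is now a string)

# Assumption: the array can be built with dtype=object, which keeps
# the ints as ints and the strings as strings.
x_obj = np.array([1, 'hello', 2, 'world'], dtype=object)

# Boolean mask that is True where the element is a Python str.
mask = np.array([isinstance(v, str) for v in x_obj])
strings_only = x_obj[mask]
print(strings_only)
```

With an object array the type information survives, so no string parsing is needed. If the array has already been created as a string array, that information is gone and the values have to be inspected as text instead.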

If it is a one-dimensional array, you can use:

y = np.array([i for i in x if not i.replace(".","",1).replace("e+","").replace("e-","").replace("-","").isnumeric()])

to get all non-numeric values.

It also recognizes floats with a negative sign or an exponent (e+/e-).

For example, for the input x = np.array([1,'hello',+2e-50,'world', 2e+50,-2, 3/4, 6.5 , "!"]) the output is array(['hello', 'world', '!'], dtype='<U5').
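One caveat with the replace chain: a string such as "1-2" is reduced to "12" and is therefore treated as numeric. A possible alternative, sketched here under the assumption that "numeric" means "anything float() can parse", is to attempt the conversion instead. The helper name is_numeric_string is hypothetical, and note that float() also accepts strings like "nan" and "inf".

```python
import numpy as np

def is_numeric_string(s):
    """Hypothetical helper: True if float() can parse s."""
    try:
        float(s)
    except ValueError:
        return False
    return True

# Same input as above, plus "1-2", which the replace chain would drop.
x = np.array([1, 'hello', +2e-50, 'world', 2e+50, -2, 3/4, 6.5, "!", "1-2"])

# Keep only the elements that do not parse as a number.
y = np.array([s for s in x if not is_numeric_string(s)])
print(y)
```

Here "1-2" is kept because float("1-2") raises ValueError.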

Sidx