5

I have a numpy array where each element looks something like this:

['3' '1' '35' '0' '0' '8.05' '2']
['3' '1' '' '0' '0' '8.4583' '0']
['1' '1' '54' '0' '0' '51.8625' '2']

I would like to replace all empty strings, like the one in the second row above, with some default value such as 0. How can I do this with numpy?

The ultimate goal is to be able to run S.astype(np.float) (S.astype(float) on current NumPy, which no longer has the np.float alias), but I suspect the empty strings are causing problems in the conversion.

Oleksi

3 Answers

19

If your array is t:

t[t=='']='0'

and then convert it.

Explanation:

t=='' creates a boolean array with the same shape as t that is True wherever the corresponding element of t is an empty string. This boolean array is then used as an index, so '0' is assigned only to those positions in the original t.
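
To make the mask visible, here is a minimal sketch using the rows from the question (NumPy's documentation describes this construct as boolean array indexing). The name mask is introduced only for illustration and is not part of the original answer.

```python
import numpy as np

# Sample rows taken from the question; every element is a string.
t = np.array([
    ['3', '1', '35', '0', '0', '8.05', '2'],
    ['3', '1', '', '0', '0', '8.4583', '0'],
    ['1', '1', '54', '0', '0', '51.8625', '2'],
])

# Boolean array with the same shape as t; True only where the string is empty.
mask = (t == '')

# Boolean-mask assignment: overwrite just the empty entries, in place.
t[mask] = '0'

# With no empty strings left, the conversion to float succeeds.
values = t.astype(float)
```

Note that the assignment modifies t itself, so keep a copy first if you still need the original strings.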

Bitwise
  • Cool, I didn't remember this construct. I was just about to post an answer with list comprehensions, but this is much shorter! – user1778770 Apr 10 '13 at 21:42
  • @user1778770 Use a list comprehension only if you have a list. As long as you have a numpy array, you should take advantage of that. – Bitwise Apr 10 '13 at 21:45
  • I think you could make your answer even better by adding a link to a page where the construct you've exposed is documented. That's for those beginners who will wonder what's going on. I haven't found one – user1778770 Apr 10 '13 at 21:53
6

Here is an approach that uses map. Note that it does not produce the same data type as calling .astype(): the result is a plain Python list of floats, not a numpy array.

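def FloatOrZero(value):
    # Convert to float; fall back to 0.0 for values float() cannot parse (e.g. '').
    try:
        return float(value)
    except (ValueError, TypeError):
        return 0.0


# In Python 3, map returns an iterator, so wrap it in list() before printing.
print(list(map(FloatOrZero, ['3', '1', '', '0', '0', '8.4583', '0'])))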

Outputs:

[3.0, 1.0, 0.0, 0.0, 0.0, 8.4583, 0.0]

This approach may give you more flexibility to clean up the data, but it can also be harder to reason about if you want to keep working with a numpy.array.
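
If you still want a NumPy array at the end, one option is to wrap the mapped values in np.array. This is a minimal sketch, assuming Python 3 and a 2-D input like the one in the question; float_or_zero is a renamed copy of the helper above, and rows and cleaned are hypothetical names.

```python
import numpy as np

def float_or_zero(value):
    """Return value as a float, or 0.0 if float() cannot parse it."""
    try:
        return float(value)
    except (ValueError, TypeError):
        return 0.0

# Two sample rows from the question, as plain Python lists of strings.
rows = [
    ['3', '1', '35', '0', '0', '8.05', '2'],
    ['3', '1', '', '0', '0', '8.4583', '0'],
]

# Map the helper over each row, then build a float array from the results.
cleaned = np.array([list(map(float_or_zero, row)) for row in rows])
```

Because the helper swallows anything it cannot parse, a typo such as '8.4x' would silently become 0.0, so this fits best when that fallback is really what you want.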

Jason Sperske
5

Just do this first:

import numpy as np
s = np.array(['1', '0', ''])
s[s==''] = '0'

s.astype(float)
#array([ 1.,  0.,  0.])
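
If you would rather not modify s in place, np.where can build the cleaned values in one step. This is a variation on the snippet above, not part of the original answer; cleaned is a hypothetical name.

```python
import numpy as np

s = np.array(['1', '0', ''])

# Pick '0' where the element is empty, otherwise keep the original string,
# then convert the resulting string array to floats.
cleaned = np.where(s == '', '0', s).astype(float)
```

Here s keeps its empty string, and cleaned is a separate float array.
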
askewchan
  • Is this faster than map? – mLstudent33 Feb 12 '22 at 06:57
  • I was curious, so I wrote a toy example using cProfile, and they are not exactly the same. I suspect the overall work is roughly the same, but calling map produces Python float values, while calling astype(float) returns a NumPy float array (a converted copy rather than a view of the original strings). I now think this answer is better and just as readable, so I updated my answer so people don't blindly copy bad information. – Jason Sperske Feb 16 '22 at 23:28