
In a Python datatable Frame, I want to replace empty strings with NaN. When I try, I get the error below. The equivalent call works in pandas. Thanks in advance for the help.

Datatable Syntax I tried:

dt[:,"column_name"].replace('',np.nan)

Error Received:

Cannot replace string value '' with a value of type <class 'float'>

pandas syntax that worked:

pd["column_name"]=pd["column_name"].replace('',np.nan)
topchef
jeganathan velu

1 Answer


In Python datatable, you can update (replace) values in a column based on the column's own value by combining a row filter with update():

import datatable as dt
mydt = dt.Frame(a=['a','b','c','','d','e'])
mydt[dt.f.a == '', dt.update(a = None)]

Datatable before update:

mydt
   | a 
-- + --
 0 | a 
 1 | b 
 2 | c 
 3 |   
 4 | d 
 5 | e 

Datatable after update:

mydt
   | a 
-- + --
 0 | a 
 1 | b 
 2 | c 
 3 | NA
 4 | d 
 5 | e 

This works with datatable version 0.10.0 or later.

Bonus answer: to do the opposite, that is, replace missing values with some constant value, filter the rows with the isna() function:

mydt = dt.Frame(a=['a','b','c', None,'d','e'])
mydt[dt.isna(dt.f.a), dt.update(a = 'NULL')]
topchef