pandas -> cuDF
Converting some python written for pandas to run on rapids
pandas
temp=df_train.copy()
temp['buildingqualitytypeid']=temp['buildingqualitytypeid'].fillna(-1)
temp=temp.groupby("buildingqualitytypeid").filter(lambda x: x.buildingqualitytypeid.size > 3)
temp['buildingqualitytypeid'] = temp['buildingqualitytypeid'].replace(-1,np.nan)
print(temp.buildingqualitytypeid.isnull().sum())
print(temp.shape)
Anyone know what to use in place of pandas.Series.filter
for same outcome in cuDF
?