I have a dictionary whose keys are column names and whose values are lists of the values allowed in each column.

How can I replace the values that do not occur in a column's list with '0'?

FinalCat_ is the list of column names, CombinedCat is a Vaex DataFrame, and AllowedCatColValuesFast is the dictionary.

def returnVal(x, li):
    if x in li:
        return x
    else:
        return '0'

for i in FinalCat_:
    CombinedCat[i+"Mod"] = CombinedCat.apply(returnVal, [CombinedCat[i], AllowedCatColValuesFast[i]])

When I call .value_counts() on the newly created columns, I get the error "list index out of range".

1 Answer

You can use the map method with the default_value parameter set to "0".

If you have a list of accepted values for each column, you can build an identity mapping, which maps each accepted value to itself, and pass it to map. Any value missing from the mapping is then replaced by default_value.

Here is a quick example with vaex 3.0.0:

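import pandas as pd
import vaex

# Build a small vaex DataFrame; "z" is not an accepted value
df = pd.DataFrame({"column": ["x", "y", "z"]})

df = vaex.from_pandas(df)
accepted_values = ["x", "y"]
default_value = "0"

# Identity mapping {"x": "x", "y": "y"}; unmapped values become default_value
df["column"].map(dict(zip(accepted_values, accepted_values)), default_value=default_value)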

which gives the expected output:

Expression = _choose_masked(_ordinal_values(column, map_key_set), map_...
Length: 3 dtype: str (expression)
---------------------------------
0  x
1  y
2  0

You have to make sure that the default_value used has the same type as the column. For example, if you have a column with strings you cannot use an integer as a default value.
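
To apply this to the setup in the question (several columns, each with its own list of allowed values), the same map call can be wrapped in a loop. The following is only a minimal sketch: the helper names identity_mapping and add_mod_columns are hypothetical, and it assumes that df[column] supports map(mapper, default_value=...) as in the example above and that the result can be assigned back as a new column.

```python
def identity_mapping(accepted_values):
    """Map every accepted value to itself, e.g. ["x", "y"] -> {"x": "x", "y": "y"}."""
    return {value: value for value in accepted_values}


def add_mod_columns(df, allowed_values, columns, default_value="0", suffix="Mod"):
    """Add a '<column><suffix>' column for each name in `columns`.

    Values not listed in allowed_values[column] are replaced by default_value.
    Assumption: df[column] offers map(mapper, default_value=...), as the
    vaex expression in the answer above does.
    """
    for column in columns:
        mapping = identity_mapping(allowed_values[column])
        df[column + suffix] = df[column].map(mapping, default_value=default_value)
    return df
```

With the names from the question, the call would be add_mod_columns(CombinedCat, AllowedCatColValuesFast, FinalCat_). The default "0" is a string, so this sketch fits string columns; for other column types, pass a default_value of the matching type.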