Can someone please give me a Vaex alternative for this code?
df_train = vaex.open('../input/ms-malware-hdf5/train.csv.hdf5')
total = df_train.isnull().sum().sort_values(ascending = False)
Vaex does not at this time support counting missing values on a dataframe level, only on an expression (column) level. So you will have to do a bit of work yourself.
Consider the following example:
import vaex
import vaex.ml
import pandas as pd
df = vaex.ml.datasets.load_titanic()
count_na = []  # number of missing values per column
for col in df.column_names:
    count_na.append(df[col].isna().sum().item())
# Sort descending, as in the question, so the columns with the most missing values come first
s = pd.Series(data=count_na, index=df.column_names).sort_values(ascending=False)
If you think this is something you might need to use often, it might be worth it to create your own dataframe method following this example.
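As a rough sketch of what such a reusable helper could look like, here is the loop above wrapped in a plain function. The name `count_missing` is hypothetical and not part of Vaex. It is an ordinary function rather than a registered dataframe method, and it only assumes the object exposes `column_names` and a per-column `isna().sum()`, as Vaex dataframes do. It returns a plain list instead of a pandas Series, so it needs no extra imports.

```python
def count_missing(df):
    """Return (column, n_missing) pairs, largest count first.

    Hypothetical helper, not part of Vaex: it relies only on
    `df.column_names` and on `df[col].isna().sum()` per column.
    """
    # One pass per column; int() turns the 0-d result into a plain Python int
    counts = {col: int(df[col].isna().sum()) for col in df.column_names}
    # Sort by the count, descending, to mirror sort_values(ascending=False)
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)
```

With the Titanic dataframe from the example, this would be called as `count_missing(df)`. If you prefer the pandas output, passing `dict(count_missing(df))` to `pd.Series` should give a Series in the same order.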