Here is the data:
ID  VAR1             VAR2             VAR3
1   [12, 'a', 'ok']  [4, 'b', 'duk']  NaN
2   NaN              NaN              NaN
3   [1, 'f', 'sd']   NaN              [34, 'daa']
I want to create a new variable called MIN_VALUE that compares the first list item of all three variables and extracts the lowest value. This would give the following:
ID  VAR1             VAR2             VAR3         MIN_VALUE
1   [12, 'a', 'ok']  [4, 'b', 'duk']  NaN          4
2   NaN              NaN              NaN          NaN
3   [1, 'f', 'sd']   NaN              [34, 'daa']  1
I tried to create and apply a function as below, and I want it to be flexible in the number of variables selected (hence the *args). But it doesn't work correctly:
def extract_min_value_from_first_list_item_across_multiple_columns(df, *args):
    return min(df[args][0])

df['MIN_VALUE'] = df.apply(
    extract_min_value_from_first_list_item_across_multiple_columns, 'VAR1', 'VAR2', 'VAR3', axis=1)
The resulting error is: TypeError: apply() got multiple values for argument 'axis'.
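For context on the error: DataFrame.apply takes axis as its second positional parameter, so 'VAR1' is bound to axis and the later axis=1 keyword collides with it. Extra positional arguments for the function are meant to go through the args= tuple instead. Below is a minimal sketch of one way this could work. It assumes the cells hold real Python lists (not strings), and the helper name min_first_item is hypothetical, chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the sample data: list cells are real Python lists.
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "VAR1": [[12, "a", "ok"], np.nan, [1, "f", "sd"]],
    "VAR2": [[4, "b", "duk"], np.nan, np.nan],
    "VAR3": [np.nan, np.nan, [34, "daa"]],
})

def min_first_item(row, *cols):
    # Hypothetical helper: collect the first element of every selected cell
    # that holds a non-empty list. NaN cells are floats, so they are skipped.
    firsts = [row[c][0] for c in cols if isinstance(row[c], list) and row[c]]
    # Return NaN when none of the selected columns has a list in this row.
    return min(firsts) if firsts else np.nan

# Column names are forwarded to the function through args=, not positionally.
df["MIN_VALUE"] = df.apply(min_first_item, axis=1, args=("VAR1", "VAR2", "VAR3"))
print(df)
```

With this layout the number of columns stays flexible: changing the args tuple is enough to compare a different set of variables.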