How to create dummy variables within a loop in Python?

Question

So I have a dataframe with a bunch of features, some of which I want to turn into dummy variables and some of which I want to leave alone, and I wanted a lazier/faster way to do this than just typing:

dum_A = pd.get_dummies(df['A'],prefix='A')
dum_B = pd.get_dummies(df['B'],prefix='B')
...
dum_N = pd.get_dummies(df['N'],prefix='N')

So this is the code I came up with below.

List_of_dummy_names = []
List_of_dummy_col = []

for col in list(df1.columns.values):
    if len(df1[col].value_counts()) <= 7:
        List_of_dummy_names.append('dum_'+col)
        List_of_dummy_col.append(col)

for (dummy, col) in zip(List_of_dummy_names, List_of_dummy_col):
    dummy = pd.get_dummies(df1[col], prefix=col)

But this only leaves the variable dummy holding the dummy dataframe of the nth feature in the lists. What am I doing wrong here? I thought each pass through the loop would take a new name from the list, but instead it looks like it's assigning the new dummy DF to the variable dummy each time.

Many thanks in advance guys.

How about using a dict? `d[col] = pd.get_dummies(df1[col], prefix=col)` – eumiro, Jan 19 '16 at 10:18
Thanks, I think that takes me most of the way there, but then how do I turn that dict into a dataframe I can join to the rest of my DF? – pakkunrob, Jan 19 '16 at 10:29
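
One possible way to follow up on the dict suggestion, as a minimal sketch: build the dict of dummy frames, then join them to the original frame column-wise with `pd.concat(..., axis=1)`. The sample data and the names `dummies` and `result` are made up for illustration; only `df1` and the threshold of 7 come from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for df1
df1 = pd.DataFrame({
    "A": ["x", "y", "x", "z"] * 2,
    "B": [1, 2] * 4,
    "C": [0.5 * i for i in range(8)],  # 8 distinct values, so it is left alone
})

# One dummy frame per column with at most 7 distinct values, keyed by column name
dummies = {
    col: pd.get_dummies(df1[col], prefix=col)
    for col in df1.columns
    if df1[col].nunique() <= 7
}

# Join every dummy frame to the original frame side by side (axis=1)
result = pd.concat([df1] + list(dummies.values()), axis=1)
print(result.columns.tolist())
```

The original columns stay in `result`; dropping them afterwards with `result.drop(columns=list(dummies))` is optional and depends on what the model needs.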

score 1 · Answer 1 · answered Jan 19 '16 at 10:54

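import pandas as pd

for col in list(df.columns.values):
    # Encode only columns with at most 7 distinct values
    if len(df[col].value_counts()) <= 7:
        # axis=1 adds the dummy columns side by side; axis=0 would stack them as extra rows full of NaN
        df = pd.concat([df, pd.get_dummies(df[col], prefix=col)], axis=1)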

5nv


hey, seems to work too, but running into MemoryError issues, is there a way to get around that? – pakkunrob Jan 20 '16 at 05:02
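
Regarding the MemoryError, a sketch of one thing that might help: concatenating inside the loop copies the growing frame on every pass, so encoding all selected columns in a single `pd.get_dummies` call with a compact `dtype` can reduce the overhead. Whether this is enough depends on the data, and the sample frame and the names `cols` and `encoded` are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for df
df = pd.DataFrame({
    "A": ["x", "y", "x", "z"] * 2,
    "B": [1, 2] * 4,
    "C": [0.5 * i for i in range(8)],
})

# Columns with at most 7 distinct values, as in the question
cols = [col for col in df.columns if df[col].nunique() <= 7]

# A single call encodes every selected column; uint8 stores each indicator in one byte.
# Passing sparse=True instead is another option when most entries are zero.
encoded = pd.get_dummies(df, columns=cols, dtype="uint8")
print(encoded.dtypes)
```

Note that with `columns=` the encoded source columns are replaced by their dummies, while the other columns (here `C`) are kept unchanged.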
