I am working on the Walmart Kaggle competition and I'm trying to create dummy columns from the "FinelineNumber" column. For context, df.shape returns (647054, 7). I am trying to make dummy columns for df['FinelineNumber'], which has 5,196 unique values. The result should be a dataframe of shape (647054, 5196), which I then plan to concat to the original dataframe.
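For a sense of scale, here is a rough back-of-the-envelope estimate of how large that dense result would be. The row and column counts come from the shapes above; the bytes-per-cell figures are assumptions, since the dtype that get_dummies produces depends on the pandas version (1 byte for a uint8 or bool cell, 8 bytes for a float64 or int64 cell).

```python
# Rough size estimate for the dense dummy matrix (shape taken from the question).
n_rows = 647054
n_cols = 5196
n_cells = n_rows * n_cols

# Bytes per cell are assumptions: 1 for uint8/bool, 8 for float64/int64.
estimates = {}
for label, bytes_per_cell in [("1 byte per cell", 1), ("8 bytes per cell", 8)]:
    gib = n_cells * bytes_per_cell / float(2 ** 30)
    estimates[label] = gib
    print("%s: about %.1f GiB" % (label, gib))
```

Under these assumptions the dense matrix is somewhere between roughly 3 GiB and 25 GiB, before counting any temporary copies made while building it or during the later concat.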
Nearly every time I run fineline_dummies = pd.get_dummies(df['FinelineNumber'], prefix='fl'), I get the following error message: The kernel appears to have died. It will restart automatically.
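For comparison, below is a minimal sketch of the same call with sparse output, which stores only the non-zero entries instead of a full dense matrix. It assumes a pandas version where get_dummies accepts sparse=True; the tiny toy_df frame is hypothetical stand-in data, not the competition dataset, and whether this avoids the kernel crash on the real data is untested here.

```python
import pandas as pd

# Hypothetical toy data standing in for the real df['FinelineNumber'] column.
toy_df = pd.DataFrame({"FinelineNumber": [10, 20, 10, 30, 20, 10]})

# sparse=True asks pandas to back the dummy columns with sparse arrays,
# so cells equal to zero are not stored individually.
toy_dummies = pd.get_dummies(toy_df["FinelineNumber"], prefix="fl", sparse=True)

print(toy_dummies.shape)          # one column per unique value
print(list(toy_dummies.columns))  # column names carry the 'fl' prefix
```

Each row of a one-hot encoding has exactly one non-zero cell, so with 5,196 columns almost all of the dense matrix would be zeros, which is the case sparse storage is designed for.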
I am running Python 2.7 in a Jupyter notebook on a MacBook Pro with 16 GB of RAM.
Can someone explain why this is happening (and why it happens most of the time but not every time)? Is it a Jupyter or pandas bug? I also thought it might be caused by insufficient RAM, but I get the same error on a Microsoft Azure Machine Learning notebook with >100 GB of RAM. On Azure ML, the kernel dies every time, almost immediately.