I am getting a KeyError on the 'town' column while creating dummy variables. As far as I can tell my syntax is correct, so I don't understand what the problem is. Please help.

import pandas as pd
import numpy as np
df= pd.read_csv('homeprices.csv')
df

dummies=pd.get_dummies(df['town'])
dummies

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
c:\users\saurabh singh\appdata\local\programs\python\python37\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2894             try:
-> 2895                 return self._engine.get_loc(casted_key)
   2896             except KeyError as err:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'town'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
<ipython-input-30-b0961e3e5942> in <module>
      1 # df = pd.concat([df, pd.get_dummies(df['town'])], axis=1)
----> 2 dummies=pd.get_dummies(df['town'])
      3 dummies

c:\users\saurabh singh\appdata\local\programs\python\python37\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   2904             if self.columns.nlevels > 1:
   2905                 return self._getitem_multilevel(key)
-> 2906             indexer = self.columns.get_loc(key)
   2907             if is_integer(indexer):
   2908                 indexer = [indexer]

c:\users\saurabh singh\appdata\local\programs\python\python37\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2895                 return self._engine.get_loc(casted_key)
   2896             except KeyError as err:
-> 2897                 raise KeyError(key) from err
   2898 
   2899         if tolerance is not None:

KeyError: 'town'

df.columns
Index(['town ', 'area', 'price'], dtype='object')

1 Answer

Your column name has a trailing space: it is 'town ', not 'town', as the df.columns output shows.

Rename the columns as follows:

df.columns = ['town', 'area', 'price']

After this, the original call works:

dummies=pd.get_dummies(df['town'])

Alternatively, leave the column names as they are and change df['town'] to df['town '] (with the trailing space).
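If you would rather not retype every column name, a more general option is to strip whitespace from all of them at once. The snippet below is only a sketch: the CSV text is a made-up stand-in for homeprices.csv (the town names and numbers are hypothetical), used so the example runs on its own.

```python
import io
import pandas as pd

# Hypothetical stand-in for homeprices.csv; the rows are invented for illustration.
# Note the trailing space in the first header, matching the 'town ' column above.
csv_text = "town ,area,price\nA,2600,550000\nB,3000,565000\nA,3200,610000\n"
df = pd.read_csv(io.StringIO(csv_text))

# Remove leading/trailing whitespace from every column name.
df.columns = df.columns.str.strip()

# The lookup now succeeds because the column is named 'town'.
dummies = pd.get_dummies(df['town'])
print(list(df.columns))
print(list(dummies.columns))
```

This avoids hard-coding the column list, so it keeps working if the file gains or reorders columns.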

Eduardo Coltri