
I would like to select rows using a condition on a column, such as "sex" == "male". I normally use loc on a DataFrame for this.

import pandas as pd
dane = pd.read_csv('insurance.csv')
dane.info()

The result is:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1338 entries, 0 to 1337
Data columns (total 7 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   age       1338 non-null   int64  
 1   sex       1338 non-null   object 
 2   bmi       1338 non-null   float64
 3   children  1338 non-null   int64  
 4   smoker    1338 non-null   object 
 5   region    1338 non-null   object 
 6   charges   1338 non-null   float64
dtypes: float64(2), int64(2), object(3)
memory usage: 73.3+ KB

a = dane.loc(dane["sex"] == "male")

After running this cell, I get this error:


TypeError                                 Traceback (most recent call last)
<ipython-input-9-18dd4823c7e4> in <module>()
----> 1 a = dane.loc(dane["sex"] == "male")

1 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/generic.py in _get_axis_number(cls, axis)
    544     def _get_axis_number(cls, axis: Axis) -> int:
    545         try:
--> 546             return cls._AXIS_TO_AXIS_NUMBER[axis]
    547         except KeyError:
    548             raise ValueError(f"No axis named {axis} for object type {cls.__name__}")

TypeError: unhashable type: 'Series'

If I run an example from the Internet, everything works:

boxes = {'Color': ['Green','Green','Green','Blue','Blue','Red','Red','Red'],
         'Shape': ['Rectangle','Rectangle','Square','Rectangle','Square','Square','Square','Rectangle'],
         'Price': [10,15,5,5,10,15,15,5]
        }

df = pd.DataFrame(boxes, columns= ['Color','Shape','Price'])
df.info()

select_color = df.loc[df['Color'] == 'Green']
print (select_color)

The result is:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8 entries, 0 to 7
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   Color   8 non-null      object
 1   Shape   8 non-null      object
 2   Price   8 non-null      int64 
dtypes: int64(1), object(2)
memory usage: 320.0+ bytes
   Color      Shape  Price
0  Green  Rectangle     10
1  Green  Rectangle     15
2  Green     Square      5

What is the reason for the problem in my case? It is a normal CSV file, with the same data format, etc.

Nick ODell

1 Answer

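You are calling loc as a function, with parentheses: dane.loc(dane["sex"] == "male")

loc is an indexer, not a method, so you should index it with square brackets instead: dane.loc[dane["sex"] == "male"]

When loc is called with parentheses, pandas treats the argument as an axis name, which is why the traceback ends in _get_axis_number with "unhashable type: 'Series'". The example from the Internet works because it already uses square brackets.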
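To see the difference side by side, here is a minimal sketch. It uses a small hand-made DataFrame as a stand-in for insurance.csv (the values are made up for illustration, and only a few of the columns are included), so it runs without the file:

```python
import pandas as pd

# Hypothetical stand-in for insurance.csv; the values are invented for illustration.
dane = pd.DataFrame({
    "age": [19, 18, 28, 33],
    "sex": ["female", "male", "male", "female"],
    "charges": [16884.92, 1725.55, 4449.46, 21984.47],
})

# Boolean Series with one True/False entry per row.
mask = dane["sex"] == "male"

# Square brackets: index loc with the mask to keep only the matching rows.
males = dane.loc[mask]
print(males)

# Parentheses: loc(...) interprets its argument as an axis, so a Series is rejected.
try:
    dane.loc(mask)
except (TypeError, ValueError) as exc:
    print(type(exc).__name__, exc)
```

The selected rows keep their original index labels. If you want a fresh 0-based index afterwards, you can chain .reset_index(drop=True) onto the result.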

Z Li