
I have an ordinary CSV file imported as a pandas DataFrame. I then use the code below to remove some rows from it.

df.loc[:,"Tobacco"].fillna('No involvement', inplace=True)

df = df[df.loc[:,"Tobacco"] == 'No involvement']

Spyder shows the number of rows in df dropping from 8000 to 7000 (1000 rows removed), and len(df) returns 7000. However, when I double-click df in Spyder's variable window to view it, the removed rows are still there, grouped together at the end of df (index 7000 to 8000).

This prevents me from continuing: in the next part, when I use len(df) to loop over df, the loop does not reach those 1000 rows, and I don't know how to remove them (I also tried removing them by index, 7000 to 8000).
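One possible explanation, not confirmed by anything in this post: boolean filtering keeps the original index labels, so a frame with 7000 rows can still carry labels that run up to 8000. The sketch below uses a small made-up frame (the values are hypothetical stand-ins for the CSV data) to show the retained labels and how `reset_index(drop=True)` renumbers them so that a `range(len(df))` loop lines up with the labels again.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV data described in the question
df = pd.DataFrame({"Tobacco": ["Yes", np.nan, "No involvement", "Yes", np.nan]})

# Fill missing values, then keep only the 'No involvement' rows
df["Tobacco"] = df["Tobacco"].fillna("No involvement")
filtered = df[df["Tobacco"] == "No involvement"]

print(len(filtered))             # number of rows that survived the filter
print(filtered.index.tolist())   # labels are the ORIGINAL ones, with gaps

# Renumber the rows 0..len-1 and discard the old labels
filtered = filtered.reset_index(drop=True)
print(filtered.index.tolist())

# A positional loop now matches the labels
for i in range(len(filtered)):
    print(i, filtered.loc[i, "Tobacco"])
```

If the labels do not need to be renumbered, `filtered.iloc[i]` (position-based) or `for idx, row in filtered.iterrows()` would also avoid the mismatch.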

I have tested this code in Anaconda Spyder with Python 3 on Windows and Linux, in PyCharm on Linux, and in native Python on Linux, and I get the same behavior everywhere.

I also tried

df = df[df['Tobacco'] == 'No involvement']

Edit: I got this warning message (it appears on some runs, but not on others):

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py:3660: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self._update_inplace(new_data)
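This warning is typically raised when a method with `inplace=True` is called on a selection of a DataFrame rather than on the DataFrame itself. A minimal sketch of a common way to avoid it, assuming a small made-up frame in place of the real CSV data: assign the filled column back to the frame, and take an explicit `.copy()` of the filtered result before modifying it. The names `kept` and `flag` are hypothetical and used only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV data
df = pd.DataFrame({"Tobacco": ["Yes", np.nan, "No involvement", np.nan]})

# Assign the filled column back instead of calling
# fillna(..., inplace=True) on a selection of the frame
df["Tobacco"] = df["Tobacco"].fillna("No involvement")

# .copy() makes the filtered frame independent of df,
# so later assignments to it are unambiguous
kept = df[df["Tobacco"] == "No involvement"].copy()
kept["flag"] = True  # hypothetical later modification

print(kept.shape)
```

Whether this removes the warning in the original script depends on how df was created earlier, which the post does not show.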

Nhi Vo
  • Related / possible duplicate: [How to deal with SettingWithCopyWarning in Pandas?](https://stackoverflow.com/questions/20625582/how-to-deal-with-settingwithcopywarning-in-pandas) – jpp Mar 24 '18 at 00:14

1 Answer


I'm not able to replicate this. Try the code below and see whether the same thing happens.

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [4, 6, 7, 65, 75, 645, 5, 7, 5, 75, 5]})
df.iloc[3:8] = np.nan  # set some values to NaN to simulate the missing data
print(df.shape)

df.loc[:, "A"].fillna('No involvement', inplace=True)
print(df.shape)

df = df[df.loc[:, "A"] == 'No involvement']
print(df.shape)
print(len(df))
Arun