0

I have a dataset:

    id   name  m
0   1    mina  0
1   1    sara  0
2   2    travi 0
3   3    caty  0
5   4    el    0
6   6    tom   0

I wrote the following code to change my dataframe:

for index, row in df.iterrows():
    if row['m'] == 0:
        df.loc[df['id'] == row['id'], 'm'] = 1
        print(row['name'])

and the result is

  mina
  sara
  travi
  caty
  el
  tom

My question is: why is the second row printed? Is there any way to fix it?

Henry Ecker
ash
  • The code you wrote sets column m to 1 for every row whose id matches the id of the current row. Nothing should be deleted; it's just an inefficient way of setting everything to m=1. – Stael Jan 10 '20 at 09:15
  • What are you expecting the code to do? – Stael Jan 10 '20 at 09:16
  • What do you mean "the second row is repeated"? I don't see any repeated rows. – 9769953 Jan 10 '20 at 09:24
  • @Stael I think the second row shouldn't be printed. – ash Jan 10 '20 at 16:40

2 Answers

1

Is that what you need?

for item in df['id']:
    # look at the current m value of the first row with this id
    if df.loc[df['id'] == item, 'm'].values[0] == 0:
        df.loc[df['id'] == item, 'm'] = 1
        print(item)
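If a loop is not strictly needed, the same effect can probably be had without one. The following is only a sketch, not the approach above: it assumes the DataFrame looks like the table in the question (the construction below is my reconstruction), and `first_of_id`, `needs_update` and `ids_printed` are names made up for illustration.

```python
import pandas as pd

# Assumed reconstruction of the question's data (index 4 is missing there too).
df = pd.DataFrame(
    {
        "id": [1, 1, 2, 3, 4, 6],
        "name": ["mina", "sara", "travi", "caty", "el", "tom"],
        "m": [0, 0, 0, 0, 0, 0],
    },
    index=[0, 1, 2, 3, 5, 6],
)

# True only for the first row of each id.
first_of_id = ~df["id"].duplicated()
# True where m has not been set yet.
needs_update = df["m"] == 0

# Ids the loop version would print: first occurrence, still at m == 0.
ids_printed = df.loc[first_of_id & needs_update, "id"].tolist()

# Set m to 1 on every row that shares one of those ids.
df.loc[df["id"].isin(ids_printed), "m"] = 1

print(ids_printed)
```

Because the mask is computed once up front, nothing is modified while it is being iterated over.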
Elham
0

Check the pandas documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.iterrows.html

It says:

You should never modify something you are iterating over. This is not guaranteed to work in all cases. Depending on the data types, the iterator returns a copy and not a view, and writing to it will have no effect.

That is why it happens. Use something like this:

print('\n'.join(df.drop_duplicates(subset='id')['name']))
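To make the quoted warning concrete, here is a small sketch. It assumes the DataFrame matches the table in the question (the construction is my reconstruction), and `seen_m` and `printed` are illustrative names. The first loop records what `row['m']` reports while `df` is being modified; the second reads the live value from `df` through `df.at` instead of the row copy.

```python
import pandas as pd

# Assumed reconstruction of the question's data.
df = pd.DataFrame(
    {
        "id": [1, 1, 2, 3, 4, 6],
        "name": ["mina", "sara", "travi", "caty", "el", "tom"],
        "m": [0, 0, 0, 0, 0, 0],
    },
    index=[0, 1, 2, 3, 5, 6],
)

# 1) iterrows(): each row is a copy, so updates to df are not visible in it.
seen_m = []
for index, row in df.iterrows():
    seen_m.append(row["m"])
    if row["m"] == 0:
        df.loc[df["id"] == row["id"], "m"] = 1

# 2) Reset m and read the current value from df on every step.
df["m"] = 0
printed = []
for index in df.index:
    if df.at[index, "m"] == 0:
        df.loc[df["id"] == df.at[index, "id"], "m"] = 1
        printed.append(df.at[index, "name"])

print(seen_m)
print(printed)
```

With mixed column types like these, I would expect `seen_m` to stay at 0 for every row, which is why "sara" still gets printed in the question. In the second loop "sara" is skipped, because the first step has already set `m` to 1 for both rows with id 1.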

  • My problem is not the printing. I think the second row should be printed. My problem is not removing duplicates. – ash Jan 10 '20 at 16:41