
I want to replace values greater than 70 in column CT_feat7, but the replacement only works up to about index 59,000. After that, I have to run the loop again, starting from a different index value.

Please explain why this happens. Is there a better way?

[Image: dataset before replacement]

After I run this code:

for index,j in enumerate(df['CT_feat7']):
  if j>70:
    df.loc[index,'CT_feat7'] = 11+random.random()

the values are changed only up to index 59180. To handle the rest, I run a second loop:

i,j = 59180,2
while i <= 99195:
  if df.loc[i,'CT_feat7']>70:
    df.loc[i,'CT_feat7'] = j
    j+=0.1
    if j>12:
      j=2
  i+=1
Julia Meshcheryakova

1 Answer

I think it is because enumerate() is not the proper iterator to use with .loc. Try:

import random

# .items() yields (label, value) pairs, so index is a valid .loc label
for index,j in df['CT_feat7'].items():
  if j>70:
    df.loc[index,'CT_feat7'] = 11+random.random()

enumerate() works only on the first ~59,000 rows because that is (I suspect) how many rows are in df. enumerate() iterates over the values j in the passed Series and pairs each one with its position in the Series, counting from 0 up to the length of the Series minus 1. However, when indexing with .loc, you must give the label (not the position) of the item(s) you want. Series.items() yields (label, value) pairs, so the index it returns is safe to pass to .loc. See this answer for more information.
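As for the "better way" part of the question, here is a minimal sketch, assuming the goal is simply to replace every value above 70 with a random number between 11 and 12. The small frame and its index labels (10, 20, ...) are made up for illustration and are not the asker's data. It first shows that enumerate() yields positions while .items() yields labels, then does the replacement in one step with a boolean mask instead of a loop.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: 5 rows whose labels are 10..50, not 0..4
df = pd.DataFrame(
    {"CT_feat7": [5.0, 80.0, 12.0, 95.0, 71.0]},
    index=[10, 20, 30, 40, 50],
)

# enumerate() counts positions from 0; .items() returns the index labels
positions = [i for i, _ in enumerate(df["CT_feat7"])]
labels = [label for label, _ in df["CT_feat7"].items()]

# Vectorized replacement: select the rows with a boolean mask and
# assign one random value in [11, 12) per selected row
rng = np.random.default_rng(0)
mask = df["CT_feat7"] > 70
df.loc[mask, "CT_feat7"] = 11 + rng.random(int(mask.sum()))
```

Because the mask is aligned on the index, this works regardless of what the labels are, and it avoids a Python-level loop over ~100,000 rows.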

evces