I have a dataframe with 5 columns (Participants, duration_1, duration_2, duration_3, duration_4). The "Participant" column contains subject IDs with either the IDCY or the IDCO label, for example: IDCY06, IDCO02, IDCY31, etc. I want to create two new dataframes: one with the IDCY participants and one with the IDCO participants. I have been using the code:

df[df["Participant"].str.contains("IDCY")]

and I keep getting a KeyError for Participants even though everything is spelled as it should be.

Is there any other method to iterate over rows and to get a new dataframe with the participants that have a set substring?

Thank you.

Barmar
Isabella C
  • You mentioned you have a column "Participants" (plural), but your code has `df['Participant']`. If this is a typo, try printing the columns with `print(df.columns)` and check whether the name has any leading or trailing spaces. – Emma Mar 04 '22 at 16:42
  • Sorry, the name is "Participant", and I have it as "Participant" in both the code and the dataframe. Thank you – Isabella C Mar 04 '22 at 16:52
  • What is the result of `print(df.columns)`? – Emma Mar 04 '22 at 16:59
  • If you're getting a key error, changing the iteration method won't solve that problem. – Barmar Mar 04 '22 at 17:14
  • After Emma's initial help, I noticed the "Participant" column is no longer part of the dataframe. Earlier I used the code `df3=df2.groupby('Participant').agg({'current page':'first','Duration_1':'sum'})` to sum the Duration_1 values for each participant, since many of them have multiple durations of the same type. After that, the "Participant" column disappeared. Any thoughts on how to add the participant column back? – Isabella C Mar 04 '22 at 17:25
  • That means `Participant` is now the index. One option is to just filter the index: `df3[df3.index.str.contains("IDCY")]` – tdy Mar 04 '22 at 17:32
  • Another option is to use `as_index=False` when grouping, which will leave `Participant` as a column: `df3=df2.groupby('Participant', as_index=False).agg({'current page':'first','Duration_1':'sum'})` – tdy Mar 04 '22 at 17:32
  • Or use `reset_index()`, see https://stackoverflow.com/a/21768034/13138364 – tdy Mar 04 '22 at 17:36
  • Thank you! I had plotted a bar graph with the participant ID (e.g. IDCY06) on the x axis, and after I used `as_index=False`, the x axis changed to plain numbers. I have tried setting an index to plot the bar graph, but it does not work. – Isabella C Mar 04 '22 at 19:40
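
The options suggested in the comments above can be sketched together as follows. This is a minimal illustration, not the asker's actual data: the rows in `df2` are invented, and the names `agg_spec`, `df3_flat`, `df3_reset`, `idcy`, and `idco` are hypothetical. Only the column names and the `groupby` call come from the thread.

```python
import pandas as pd

# Hypothetical sample data: column names follow the thread, values are made up.
df2 = pd.DataFrame({
    "Participant": ["IDCY06", "IDCY06", "IDCO02", "IDCY31", "IDCO02"],
    "current page": ["a", "b", "a", "c", "b"],
    "Duration_1": [1.0, 2.0, 3.0, 4.0, 5.0],
})
agg_spec = {"current page": "first", "Duration_1": "sum"}

# Default groupby: "Participant" becomes the index, not a column,
# so df3["Participant"] would raise a KeyError here.
df3 = df2.groupby("Participant").agg(agg_spec)

# Option 1: filter on the index directly.
idcy_by_index = df3[df3.index.str.contains("IDCY")]
idco_by_index = df3[df3.index.str.contains("IDCO")]

# Option 2: keep "Participant" as a regular column while grouping.
df3_flat = df2.groupby("Participant", as_index=False).agg(agg_spec)
idcy = df3_flat[df3_flat["Participant"].str.contains("IDCY")]
idco = df3_flat[df3_flat["Participant"].str.contains("IDCO")]

# Option 3: turn the index back into a column after the fact.
df3_reset = df3.reset_index()
```

Regarding the bar graph: once `Participant` is a column rather than the index, pandas labels the bars with the default integer index unless the x column is named explicitly. Assuming matplotlib is installed, something like `df3_flat.plot.bar(x="Participant", y="Duration_1")` should restore the participant IDs on the x axis; alternatively, plotting from the index-based `df3` keeps the IDs as labels.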

0 Answers