I have a df containing rows that need to be duplicated based on the number of letters, separated by '-', in the 'Group' column. Each duplicated row should contain only a single letter from the 'Group' column. XYZ does not contain a '-' and should remain a single, non-duplicated row. Beginning df:
Date       End Time  Group  Assignment
2/2/2021   1130      A-B-C  quiz
2/2/2021   1230      XYZ    test
1/22/2021  1330      B-D    paper
1/22/2021  1130      A-E-C  homework
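For reproducibility, the beginning df above can be built roughly as follows. This is a minimal sketch: it assumes 'End Time' is stored as integers and 'Date' as plain strings, which may differ from the real data.

```python
import pandas as pd

# Sample data copied from the table above.
# Assumption: 'End Time' holds integers and 'Date' holds strings.
df = pd.DataFrame({
    "Date": ["2/2/2021", "2/2/2021", "1/22/2021", "1/22/2021"],
    "End Time": [1130, 1230, 1330, 1130],
    "Group": ["A-B-C", "XYZ", "B-D", "A-E-C"],
    "Assignment": ["quiz", "test", "paper", "homework"],
})

print(df)
```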
I have made several attempts at this, but can't get it to work. Here is one example of what I tried:
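import numpy as np
import pandas as pd

# Split 'Group' on '-' into one column per letter (missing pieces become None/NaN)
df[['Group_1', 'Group_2', 'Group_3']] = df['Group'].str.split('-', expand=True)
df.drop(columns=['Group'], inplace=True)
df.to_csv('baz_schedule_modified.csv', index=False)
# Repeat a row twice when it has a second letter, otherwise keep it once
reps = [2 if pd.notna(val) else 1 for val in df['Group_2']]
df = df.loc[np.repeat(df.index.values, reps)]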
But I did not know where to go from there.
I want the df to end up as follows:
Date       End Time  Group_1  Assignment
1/22/2021  1130      A        homework
1/22/2021  1330      B        paper
1/22/2021  1130      C        homework
1/22/2021  1330      D        paper
1/22/2021  1130      E        homework
2/2/2021   1130      A        quiz
2/2/2021   1130      B        quiz
2/2/2021   1130      C        quiz
2/2/2021   1230      XYZ      test
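One approach that seems to produce this layout is to split 'Group' into lists and then call DataFrame.explode (available in pandas 0.25 and later), which gives each list element its own row. The sketch below is not necessarily the only or best way. It assumes the sample data shown above, and it assumes the target ordering is by date (parsed as month/day/year) and then by group letter, which is inferred from the desired output rather than stated anywhere. The `_date` helper column is a hypothetical name used only for sorting.

```python
import pandas as pd

# Sample data from the question.
# Assumption: 'End Time' is an integer column and 'Date' is a string column.
df = pd.DataFrame({
    "Date": ["2/2/2021", "2/2/2021", "1/22/2021", "1/22/2021"],
    "End Time": [1130, 1230, 1330, 1130],
    "Group": ["A-B-C", "XYZ", "B-D", "A-E-C"],
    "Assignment": ["quiz", "test", "paper", "homework"],
})

# Turn 'A-B-C' into ['A', 'B', 'C'], then give every list element its own row.
# A value without '-' (such as 'XYZ') becomes a one-element list and stays a single row.
out = (
    df.assign(Group=df["Group"].str.split("-"))
      .explode("Group")
      .rename(columns={"Group": "Group_1"})
)

# Assumed ordering: chronological by date, then alphabetical by group.
# '_date' is a temporary helper column and is dropped after sorting.
out = (
    out.assign(_date=pd.to_datetime(out["Date"], format="%m/%d/%Y"))
       .sort_values(["_date", "Group_1"])
       .drop(columns="_date")
       .reset_index(drop=True)
)

print(out)
```

Because explode repeats the other columns for every list element, there is no need to compute repeat counts by hand with np.repeat.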
Thank you for your help on this!