In the picture below, I have a huge dataframe. For each stroke, the machine outputs penetration values and then gives zeros. I want to calculate the average for each stroke: for example, the average of [0.762, 0.766] alone, then of [0.66, 1.37, 2.11, 2.29] alone, and so forth until the end of the dataframe. Note that a stroke has no fixed length.
Hi again. What should be written for the indices following "result"? Should it be the average as well, or 0? – Sunshine Sep 04 '21 at 19:06
1 Answer
import pandas as pd

# Generate example data based on your image
df = pd.DataFrame({'penetration':
                   [0, 1, 0, 2, 3, 0, 0,
                    5, 6, 7, 0, 0, 0, 0]})
# Flag rows with nonzero penetration depth
df['segment'] = df['penetration'].ne(0)
# Flag rows which represent a change from non-segment
# to segment, or from segment to non-segment
df['group'] = df['segment'].ne(df['segment'].shift())
# Label each segment with a unique integer
df['group'] = df['group'].cumsum()
# Zero out the group number of rows which belong to non-segments
df['group'] = df['group'].where(df['segment'], 0)
# Get mean of penetration depths for each group
df['mean'] = df.groupby('group')['penetration'].transform('mean')
# Print result
print(df)
    penetration  segment  group  mean
0             0    False      0   0.0
1             1     True      2   1.0
2             0    False      0   0.0
3             2     True      4   2.5
4             3     True      4   2.5
5             0    False      0   0.0
6             0    False      0   0.0
7             5     True      6   6.0
8             6     True      6   6.0
9             7     True      6   6.0
10            0    False      0   0.0
11            0    False      0   0.0
12            0    False      0   0.0
13            0    False      0   0.0
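
If you would rather have one average per stroke instead of a mean repeated on every row, the same idea can be condensed. This is only a sketch: the sample data reuses the two strokes quoted in the question, the zeros placed between them are an assumption based on your description, and the names `in_stroke`, `stroke_id`, and `stroke_means` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample: the two strokes quoted in the question,
# separated by zeros (the exact zero layout is an assumption)
df = pd.DataFrame({'penetration':
                   [0, 0.762, 0.766, 0, 0, 0.66, 1.37, 2.11, 2.29, 0]})

# True for rows that belong to a stroke (nonzero penetration)
in_stroke = df['penetration'].ne(0)

# A stroke starts where a nonzero row follows a zero row (or the top
# of the frame); a running count of those starts numbers the strokes
starts = in_stroke & ~in_stroke.shift(fill_value=False)
stroke_id = starts.cumsum().rename('stroke')

# Drop the zero rows, then average each stroke separately
stroke_means = (df.loc[in_stroke, 'penetration']
                  .groupby(stroke_id[in_stroke])
                  .mean())
print(stroke_means)
```

For this sample the result has two entries, 0.764 for stroke 1 and 1.6075 for stroke 2. Because the zero rows are filtered out before grouping, they never dilute an average, and strokes of any length are handled the same way.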

Peter Leimbigler