In the picture below, I have a huge DataFrame. For each stroke, the machine outputs penetration values and then gives zeros. I want to calculate the average for each stroke: for example, the average of [0.762, 0.766] on its own, then the average of [0.66, 1.37, 2.11, 2.29] on its own, and so forth until the end of the DataFrame. Note that a stroke has no fixed length.

  • Hi again. What should be written for the indices following "result"? Should it be the average as well, or 0? – Sunshine Sep 04 '21 at 19:06

1 Answer

import pandas as pd

# Generate example data based on your image
df = pd.DataFrame({'penetration': 
                   [0, 1, 0, 2, 3, 0, 0, 
                    5, 6, 7, 0, 0, 0, 0]})

# Flag rows with nonzero penetration depth
df['segment'] = df['penetration'].ne(0)

# Flag rows which represent a change from non-segment
# to segment, or from segment to non-segment
df['group'] = df['segment'].ne(df['segment'].shift())

# Label each run (segment or non-segment) with a unique integer
df['group'] = df['group'].cumsum()

# Zero out the group number of rows which belong to non-segments
df['group'] = df['group'].where(df['segment'], 0)

# Get mean of penetration depths for each group
df['mean'] = df.groupby('group')['penetration'].transform('mean')

# Print result
df

    penetration  segment  group  mean
0             0    False      0   0.0
1             1     True      2   1.0
2             0    False      0   0.0
3             2     True      4   2.5
4             3     True      4   2.5
5             0    False      0   0.0
6             0    False      0   0.0
7             5     True      6   6.0
8             6     True      6   6.0
9             7     True      6   6.0
10            0    False      0   0.0
11            0    False      0   0.0
12            0    False      0   0.0
13            0    False      0   0.0
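
If only one average per stroke is needed, rather than a value repeated on every row, the same idea can be condensed. The following is a minimal sketch, not part of the original answer: the sample values are the ones quoted in the question, and the names `nonzero`, `starts`, `stroke_id` and `stroke_means` are made up for illustration. It assumes a zero never occurs inside a stroke.

```python
import pandas as pd

# Hypothetical sample built from the values quoted in the question
df = pd.DataFrame({'penetration': [0, 0.762, 0.766, 0, 0,
                                   0.66, 1.37, 2.11, 2.29, 0]})

# True for rows that belong to a stroke (assumes no zeros inside a stroke)
nonzero = df['penetration'].ne(0)

# A stroke starts where a nonzero row follows a zero row
# (or where the very first row is already nonzero)
starts = nonzero & ~nonzero.shift(fill_value=False)

# Running count of stroke starts gives each stroke its own integer label
stroke_id = starts.cumsum().rename('stroke')

# Average the nonzero rows only, one value per stroke
stroke_means = (df.loc[nonzero, 'penetration']
                  .groupby(stroke_id[nonzero])
                  .mean())
print(stroke_means)
```

Because the zero rows are filtered out before grouping, the result contains exactly one entry per stroke, numbered 1, 2, 3, ... in order of appearance.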
Peter Leimbigler