I have the following code, which builds a DataFrame with `Group`, `Test1`, `Test2`, and `Value` columns.

import pandas as pd
  
data = {'Group': ['1', '1', '1', '1', '1', '1',
                  '2', '2', '2', '2', '2', '2',
                  '3', '3', '3', '3', '3', '3',
                  '4', '4', '4', '4', '4', '4',],
        'Test1': ['ABC', 'CDE', 'EFG', 'GHI', 'IJK', 'KLM',
                  'MNO', 'OPQ', 'QRS', 'STU', 'UVW', 'WXYZ',
                  'ABC', 'CDE', 'EFG', 'GHI', 'IJK', 'KLM',
                  'MNO', 'OPQ', 'QRS', 'STU', 'UVW', 'WXYZ',],
        'Test2': ['1234','4567', '8910', '1112', '1314', '1415',
                  '1516', '1718', '1920', '2122', '2324', '2526',
                  '2728', '2930', '3132', '3334', '3536', '3738',
                  '2940', '4142', '4344', '4546', '4748', '4950'],
        'Value': [True, True, False, False, False, True,
                  True, True, True, True, True, True,
                  True, True, True, True, True, False,
                  True, True, True, False, True, True,],
        }
  
df = pd.DataFrame(data)

print(df)

For each group, I want to check the last 2, 3, or 4 rows. If any of those rows is False, I want to return False for all of them; if all of them are True, I want to return True for all of them. With the data above, checking the last 3 rows in each group, the expected outcome is:

Group | Value
----- | -----  
  1   |   False 
  1   |   False
  1   |   False
  2   |   True
  2   |   True
  2   |   True
  3   |   False
  3   |   False
  3   |   False
  4   |   False
  4   |   False
  4   |   False
Bad Coder

1 Answer

Update, per updated question and comments below:

df[['Test1','Test2']].merge(df.groupby('Group')['Value'].apply(lambda x: x.iloc[-3:].mul(x.iloc[-3:].min(), level=0))\
  .reset_index(), left_index=True, right_on='level_1').drop('level_1', axis=1)

Output:

   Test1 Test2 Group  Value
0    GHI  1112     1  False
1    IJK  1314     1  False
2    KLM  1415     1  False
3    STU  2122     2   True
4    UVW  2324     2   True
5   WXYZ  2526     2   True
6    GHI  3334     3  False
7    IJK  3536     3  False
8    KLM  3738     3  False
9    STU  4546     4  False
10   UVW  4748     4  False
11  WXYZ  4950     4  False

IIUC, try this:

df.groupby('Group')['Value'].apply(lambda x: x.iloc[-3:].mul(x.iloc[-3:].min(), level=0))\
  .reset_index()\
  .drop('level_1', axis=1)

Output:

   Group  Value
0      1  False
1      1  False
2      1  False
3      2   True
4      2   True
5      2   True
6      3  False
7      3  False
8      3  False
9      4  False
10     4  False
11     4  False
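
An alternative sketch of the same idea, using `groupby.tail` and `transform('all')` instead of the `mul`/`min` approach above. The helper name `flag_last_n`, its parameters, and the small sample frame are hypothetical and only serve the illustration; the window size `n` covers the "last 2, 3, or 4 rows" cases from the question.

```python
import pandas as pd

def flag_last_n(df, n=3, group_col='Group', value_col='Value'):
    """Keep the last n rows of each group; value_col is True only if all of them are True."""
    # tail(n) keeps the last n rows per group, preserving the original row order
    tail = df.groupby(group_col).tail(n).copy()
    # transform('all') broadcasts the per-group "all True?" result back to every row
    tail[value_col] = tail.groupby(group_col)[value_col].transform('all')
    return tail.reset_index(drop=True)

# Small hypothetical sample, not the data from the question
sample = pd.DataFrame({
    'Group': ['1', '1', '1', '1', '2', '2', '2', '2'],
    'Test1': ['ABC', 'CDE', 'EFG', 'GHI', 'MNO', 'OPQ', 'QRS', 'STU'],
    'Value': [True, True, False, True, False, True, True, True],
})

result = flag_last_n(sample, n=3)
print(result)
```

Because the filtering happens on the whole frame, every other column (here `Test1`) is carried along without a merge. Replacing `'all'` with `'any'` would instead give True whenever at least one of the last `n` rows is True, which may be what the later comment is asking for.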
Scott Boston
  • Huh, smart: group by Group, take last 3 entries, multiply each by the min. So if any of those 3 are zero, they're all zeroed out. Great way to ensure True only happens if they are all True. One question: what does `level = 0` here do? I can't tell from the `.mul()` documentation. – Vincent Rupp Nov 07 '22 at 20:39
  • Hi, I am getting `IndexError: single positional indexer is out-of-bounds`. There are other columns as well between `Group` and `Value`. If we have other columns, how can we fix the code? – Bad Coder Nov 07 '22 at 21:02
  • @BadCoder Can you produce a data set that generates this error? – Scott Boston Nov 08 '22 at 01:11
  • `mul` lines up on level 0 of the index. In the groupby you are going to get a MultiIndex, and `min` will give a single-level index, so you are telling `mul` to line up the single-index pd.Series with the level=0 index of the other pd.Series. – Scott Boston Nov 08 '22 at 01:14
  • @ScottBoston I have edited the question, and it returned just the Group and Value columns. I would like to return all the columns from the data frame along with the updated column. – Bad Coder Nov 08 '22 at 02:54
  • @ScottBoston What do I have to change in the above code if I want to return True if any row contains a True value, and return False only if all the values are False? – Bad Coder Feb 28 '23 at 23:28
  • @BadCoder Can you create a new question with your expected results. I don't understand this request. Each row already only has a True or False value. – Scott Boston Mar 01 '23 at 01:36
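
The `level=0` alignment described in the comments above can be sketched on a small hand-built Series. The index values, the level name `row`, and the variable names are hypothetical; the point is only to show a single-index Series being matched against the first level of a MultiIndex.

```python
import pandas as pd

# Hypothetical two-level index: (Group, original row label)
idx = pd.MultiIndex.from_tuples(
    [('1', 3), ('1', 4), ('2', 9), ('2', 10)], names=['Group', 'row']
)
values = pd.Series([True, False, True, True], index=idx)

# One minimum per group, indexed by Group only
group_min = values.groupby(level='Group').min()

# level=0 matches group_min's index against the first level of values' index,
# so every row is multiplied by the minimum of its own group
aligned = values.mul(group_min, level=0)
print(aligned)
```

Group `'1'` contains a False, so both of its rows come out False; group `'2'` is all True and stays True.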