I understand how to perform a conditional sum on columns, but I am wondering how to do something similar in Polars and end up with the result as a dataframe.

import pandas as pd
import polars as pl
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'C'],
                'conference': ['East', 'East', 'East', 'West', 'West', 'East'],
                'points': [11, 8, 10, 6, 6, 5],
                'rebounds': [7, 7, 6, 9, 12, 8]})

pl.from_pandas(df)

┌──────┬────────────┬────────┬──────────┐
│ team ┆ conference ┆ points ┆ rebounds │
│ ---  ┆ ---        ┆ ---    ┆ ---      │
│ str  ┆ str        ┆ i64    ┆ i64      │
╞══════╪════════════╪════════╪══════════╡
│ A    ┆ East       ┆ 11     ┆ 7        │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ A    ┆ East       ┆ 8      ┆ 7        │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ A    ┆ East       ┆ 10     ┆ 6        │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ B    ┆ West       ┆ 6      ┆ 9        │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ B    ┆ West       ┆ 6      ┆ 12       │
├╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ C    ┆ East       ┆ 5      ┆ 8        │
└──────┴────────────┴────────┴──────────┘

Pandas solution:

df.loc[(df['points'] >= 8) & (df['team'] != 'B'), 'rebounds'].sum()
df.query("points >= 8 and team != 'B' ")['rebounds'].sum()

# Result
20
┌─────────┬──────────┐
│ column  ┆ column_0 │
│ ---     ┆ ---      │
│ str     ┆ u32      │
╞═════════╪══════════╡
│ group_a ┆ 20       │
├╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ group_b ┆ 10       │
└─────────┴──────────┘
codedancer

1 Answer

# df here is the Polars frame; Expr.filter is the current name for the older Expr.where
df.select([
    pl.col("rebounds").filter((pl.col("points") >= 8) & (pl.col("team") != 'B')).sum()
])
shape: (1, 1)
┌──────────┐
│ rebounds │
│ ---      │
│ i64      │
╞══════════╡
│ 20       │
└──────────┘
ritchie46
Thanks! It works nicely. My ultimate aim is to create a table with multiple sum calculations, and I was able to achieve it with `with_columns()` and `transpose()` given the shape of the dataframe. Do you have a suggestion for doing it more efficiently? – codedancer Oct 16 '22 at 10:51