I understand that categorical columns created in different data frames can't be joined or stacked unless they were created under the same global string cache. However, when a new data frame is derived from an existing one by aggregation, shouldn't it be possible to join the two without a global string cache?

import polars as pl
df = pl.DataFrame(data={'column': ['a', 'a', 'b'], 'more': [1, 2, 3]}, columns=[('column', pl.Categorical), ('more', pl.Int32)])
df_agg = df.groupby('column').agg(pl.col('more').mean())
df.join(df_agg, on='column')

Can this join be done without recasting under a global string cache?

P.S. The example only illustrates the problem; it is not a best-practice way to add a group mean as a column ;-)

datenzauber.ai

1 Answer

Support for this has since been added by ritchie46, and the join now works as expected.
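As a rough intuition for why a derived frame can in principle be joined back without a global string cache, here is a toy model in plain Python. It is only a sketch under the assumption that a categorical column is stored as integer codes plus a string-to-code table; the names `encode`, `table_1`, `mean_by_code` and so on are made up for illustration and are not Polars API or Polars internals.

```python
def encode(values, table=None):
    """Map strings to integer codes, extending `table` (string -> code) as needed."""
    table = {} if table is None else table
    codes = [table.setdefault(v, len(table)) for v in values]
    return codes, table


# Two columns encoded independently get unrelated code tables.
codes_1, table_1 = encode(["a", "a", "b"])
codes_2, table_2 = encode(["b", "a"])
# Code 0 means "a" in table_1 but "b" in table_2, so matching on raw
# codes across the two columns would pair the wrong strings.

# An aggregate derived from the first column keeps using table_1,
# so its codes are directly comparable with the original column.
more = [1, 2, 3]
groups = {}
for code, value in zip(codes_1, more):
    groups.setdefault(code, []).append(value)
mean_by_code = {code: sum(vals) / len(vals) for code, vals in groups.items()}

# "Join" the original rows with the aggregate on the shared codes.
decode_1 = {code: s for s, code in table_1.items()}
joined = [(decode_1[c], m, mean_by_code[c]) for c, m in zip(codes_1, more)]
```

In this toy model the independently built tables disagree on what code 0 means, while the aggregated result shares `table_1` with its source, so joining on codes is safe there.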

datenzauber.ai