
When evaluating list elements, I would like to access the index of the current element. Is there already a way to do this?

Something like pl.element().idx()?

import polars as pl

data = {"a": [[1,2,3],[4,5,6]]}
schema = {"a": pl.List(pl.Int8)}

df = pl.DataFrame(data, schema=schema).with_columns([
    pl.col("a").list.eval(pl.element() * pl.element().idx() )
])

Expected result:

+-------------+
¦ a           ¦
¦ ---         ¦
¦ list[u8]    ¦
¦-------------¦
¦ [0, 2, 6]   ¦
¦ [0, 5, 12]  ¦
+-------------+
codester_09

2 Answers


The best way that I know of is to add a row index, explode the list column, use cumcount over that row index as a window function to build the per-element index (I'm calling it j), multiply, and then put the lists back together with groupby/agg:

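(
    df
        .with_row_count('i')                               # i = outer row index
        .explode('a')                                      # one row per list element
        .with_columns(j=pl.first().cumcount().over('i'))   # j = running count within row i
        .with_columns(new=pl.col('a')*pl.col('j'))         # element times its index
        .groupby('i', maintain_order=True)                 # rebuild one list per original row
        .agg(pl.col('new'))
        .drop('i')
)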
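
To make the steps easier to follow, here is a rough pure-Python sketch of the same idea. It does not use Polars at all, and the names rows, exploded and regrouped are only illustrative. It assumes j is the zero-based position of each element inside its list.

```python
# Pure-Python sketch of the row index / explode / regroup steps.
# `rows`, `exploded` and `regrouped` are illustrative names, not Polars objects.
rows = [[1, 2, 3], [4, 5, 6]]

# Row index + explode: one (i, j, value) triple per list element, where
# i is the outer row index and j the zero-based position inside that row.
exploded = [
    (i, j, value)
    for i, row in enumerate(rows)
    for j, value in enumerate(row)
]

# Group by i: multiply each value by j and collect the products per row.
regrouped = {}
for i, j, value in exploded:
    regrouped.setdefault(i, []).append(value * j)

# Restore the original row order and drop the helper index.
result = [regrouped[i] for i in sorted(regrouped)]
print(result)  # [[0, 2, 6], [0, 5, 12]]
```

The helper index i only exists so the exploded elements can find their way back to the right row, which is why it is dropped at the end.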
Dean MacGregor

You can use the apply method (renamed map_elements in later Polars releases) together with Python's enumerate function to get the index of each element in a list column. Here's an example:

import polars as pl

data = {"a": [[1,2,3],[4,5,6]]}
schema = {"a": pl.List(pl.Int8)}

df = pl.DataFrame(data, schema=schema).with_columns([
    pl.col("a").apply(lambda arr: [x * i for i, x in enumerate(arr)])
])

print(df)

Output

shape: (2, 1)
┌────────────┐
│ a          │
│ ---        │
│ list[i64]  │
╞════════════╡
│ [0, 2, 6]  │
│ [0, 5, 12] │
└────────────┘
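
If you want to check the per-row logic on its own, the lambda can be pulled out into a plain function. This is only a sketch: scale_by_index is a hypothetical helper name, and it runs on ordinary Python lists rather than on a Polars column.

```python
def scale_by_index(values):
    """Multiply each element by its zero-based position in the list."""
    return [x * i for i, x in enumerate(values)]


# Same data as in the question, as plain Python lists.
rows = [[1, 2, 3], [4, 5, 6]]
scaled = [scale_by_index(row) for row in rows]
print(scaled)  # [[0, 2, 6], [0, 5, 12]]
```

The same function could be passed to apply in place of the lambda.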
codester_09