I have 2 columns with similar Structs (same field names, field types, etc.).
import polars as pl

nest = pl.DataFrame({
    'a': [{'x': 1, 'y': 10}, {'x': 2, 'y': 20}],
    'b': [{'x': 3, 'y': 30}, {'x': 4, 'y': 40}],
})
print(nest)
shape: (2, 2)
┌───────────┬───────────┐
│ a ┆ b │
│ --- ┆ --- │
│ struct[2] ┆ struct[2] │
╞═══════════╪═══════════╡
│ {1,10} ┆ {3,30} │
│ {2,20} ┆ {4,40} │
└───────────┴───────────┘
print(nest.schema)
{'a': Struct([Field('x', Int64), Field('y', Int64)]),
'b': Struct([Field('x', Int64), Field('y', Int64)])}
I want to unnest both those columns and get a flat data frame, with the fields suffixed to disambiguate them:
shape: (2, 4)
┌─────┬─────┬─────┬─────┐
│ x_a ┆ y_a ┆ x_b ┆ y_b │
│ --- ┆ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 ┆ i64 │
╞═════╪═════╪═════╪═════╡
│ 1 ┆ 10 ┆ 3 ┆ 30 │
│ 2 ┆ 20 ┆ 4 ┆ 40 │
└─────┴─────┴─────┴─────┘
I tried:
nest.unnest('a','b')
but (of course) got DuplicateError for the names x and y.
Ideally, I'd like something that recursively flattens nested structs and disambiguates the names using field paths :-(