How do I transform multiple columns simultaneously in polars dataframe?

Question

I have two dataframes, one of which has just a single row. I would like to transform each column in the first dataframe using the matching value from that single row. How do I do this? Here's what I want to achieve:

import polars as pl

df1 = pl.DataFrame({'c1': [2,4,6],'c2': [20,40,60],'c3': [10,20,30]})
df2 = pl.DataFrame({'c1': [2],'c2': [20],'c3': [10]})
# divide each column of df1 by the single value in the same-named column of df2
df = df1.select([
    pl.col('c1')/df2['c1'],
    pl.col('c2')/df2['c2'],
    pl.col('c3')/df2['c3'],
])

Now, imagine I have hundreds of columns. The code above doesn't scale, so what is the best way to do this? Thanks!

Perhaps this operation has been simplified since I last checked, but you can use `.lazy()` + `.with_context()` - https://stackoverflow.com/a/74835365 — jqurious, Jun 12 '23 at 15:14
@jqurious seems overkill - he only has 1 row in the second data frame, which means, that thing is actually a mapping (let's call it `m`), in which case `df1.select([pl.col(c)/m[c] for c in df1.columns])` would be just fine. — Radu, Jun 12 '23 at 15:19
Radu's solution works as is, without mapping to a dict. The good thing about it is I can modify only a subset of columns if I like by using with_columns and providing the list of columns to modify. — ste_kwr, Jun 12 '23 at 16:01

score 1 · Accepted Answer · answered Jun 13 '23 at 19:52

If df2 is guaranteed to be a single row AND the names from df1 and df2 will always match then you can do:

df1.select(pl.col(x)/df2[x] for x in df1.columns)

If df2 has more than a single row, or if a column name in df1 doesn't exist in df2, then this will error out.
