I'm migrating code from pandas to polars. I have time-series data consisting of a timestamp column and a value column, and I need to compute a number of features from it. For example:
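import numpy as np
import polars as pl
from datetime import datetime, timedelta

# One year of 15-minute timestamps in a time zone that observes DST
df = pl.DataFrame({
    "timestamp": pl.date_range(
        datetime(2017, 1, 1),
        datetime(2018, 1, 1),
        timedelta(minutes=15),
        time_zone="Australia/Sydney",
        time_unit="ms",
        eager=True,
    ),
})
# Random values, one per timestamp
value = np.random.normal(0, 1, len(df))
df = df.with_columns(pl.Series("value", value))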
I need to generate a column indicating whether each timestamp falls in standard time or daylight saving time. I'm currently using apply because, as far as I can see, there isn't a temporal expression for this. My current code is:
def dst(timestamp: datetime):
    return int(timestamp.dst().total_seconds() != 0)

df = df.with_columns(pl.struct(["timestamp"]).apply(lambda x: dst(**x)).alias("dst"))
(This uses a trick that effectively checks whether the tzinfo.dst(dt) offset is zero or not.)
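To show the trick in isolation, here is a minimal sketch using only the standard library. It assumes Python 3.9+ (for zoneinfo) and an available tz database; the helper name is_dst and the two sample dates are just for illustration and are not part of my real code.

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def is_dst(ts: datetime) -> int:
    """Return 1 if ts is in daylight saving time, else 0."""
    # datetime.dst() delegates to tzinfo.dst(ts); it returns None for
    # naive datetimes, which is treated here as "not DST".
    offset = ts.dst()
    return int(offset is not None and offset.total_seconds() != 0)


sydney = ZoneInfo("Australia/Sydney")
# January is summer in Sydney, so daylight saving time is in effect.
summer = datetime(2017, 1, 15, 12, 0, tzinfo=sydney)
# July is winter in Sydney, so standard time applies.
winter = datetime(2017, 7, 15, 12, 0, tzinfo=sydney)

print(is_dst(summer), is_dst(winter))
```

This is the per-row check that apply ends up calling once for every timestamp in the column.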
Is there a (fast) way of doing this using polars expressions rather than the (slow) apply?