
Is there a built-in function in Polars, or a better way, to convert time durations to a numeric type at a chosen time resolution (e.g. days, hours, minutes)?

import polars as pl

# Create a dataframe
df = pl.DataFrame(
    {
        "from": ["2023-01-01", "2023-01-02", "2023-01-03"],
        "to": ["2023-01-04", "2023-01-05", "2023-01-06"],
    }
)


# Convert to date and calculate the time difference
df = df.with_columns(
    [
        pl.col("from").str.strptime(pl.Date, "%Y-%m-%d").alias("from_date"),
        pl.col("to").str.strptime(pl.Date, "%Y-%m-%d").alias("to_date"),
    ]
).with_columns((pl.col("to_date") - pl.col("from_date")).alias("time_diff"))

# Convert the time difference to int (in days)
df = df.with_columns(
    ((pl.col("time_diff") / (24 * 60 * 60 * 1000)).cast(pl.Int8)).alias("time_diff_int")
)

df

FObersteiner
Guz

1 Answer


The `dt` accessor lets you obtain individual components. Is that what you're looking for?

df["time_diff"].dt.days()
Series: 'time_diff' [i64]
[
    3
    3
    3
]

df["time_diff"].dt.hours()
Series: 'time_diff' [i64]
[
    72
    72
    72
]

df["time_diff"].dt.minutes()
Series: 'time_diff' [i64]
[
    4320
    4320
    4320
]

docs: API reference, series/timeseries

FObersteiner
  • Is it possible to get years? Except the obvious `df["time_diff"].dt.days() / 365`? – Björn Apr 17 '23 at 11:24
  • @Björn I don't fully understand your comment. Are you asking how to get years, or are you suggesting a solution? In general, note that 'year' is an ambiguous duration; not all years have 365 days. – FObersteiner Apr 17 '23 at 13:36
  • The former. I was wondering: if I subtract two date objects and want the resulting `pl.Duration` in years, what would be the best way to obtain this? I guess a rough approximation with `/ 365` is good enough in most cases, because you are absolutely correct that a year is ambiguous (if you want extremely high precision). – Björn Apr 17 '23 at 14:01
  • Yeah, it gets ambiguous starting with months, so anything below that should be covered by the duration type. Anything above: you're on your own, afaik. 365 or 365.25 days should be OK in most cases to get fractional years. – FObersteiner Apr 17 '23 at 14:09
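
The fractional-year approximation from the comments can be checked with plain Python `datetime`, independent of Polars. This is only a sketch of the arithmetic: the helper `duration_in_years` and its `days_per_year` parameter are hypothetical names introduced for illustration, and 365.25 is the rough average suggested above, not an exact calendar rule.

```python
from datetime import date, timedelta


def duration_in_years(delta: timedelta, days_per_year: float = 365.25) -> float:
    """Approximate a duration in fractional years.

    Hypothetical helper: 'year' is not a fixed-length unit, so the result
    depends on the chosen days_per_year (365 or 365.25 are common picks).
    """
    # Dividing one timedelta by another gives a plain float ratio.
    return delta / timedelta(days=days_per_year)


# One calendar year that contains a leap day, and one that does not.
leap_span = date(2021, 1, 1) - date(2020, 1, 1)    # 366 days
plain_span = date(2022, 1, 1) - date(2021, 1, 1)   # 365 days

leap_years = duration_in_years(leap_span)
plain_years = duration_in_years(plain_span)
plain_years_365 = duration_in_years(plain_span, days_per_year=365)

print(leap_span.days, leap_years)
print(plain_span.days, plain_years, plain_years_365)
```

Both spans cover exactly one calendar year, yet neither divides to exactly 1.0 with 365.25 days per year, which is the ambiguity the comments describe.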