For each row, I want to efficiently find the distance to the previous occurrence of the same value in column b. I know polars doesn't have an index, but the formula would roughly be:
if prior_occurrence {
    current_row_index - prior_occurrence_index - 1
} else {
    -1
}
This is the input dataframe:
use polars::prelude::*;

let df_a = df![
    "a" => [1, 2, 2, 1, 4, 1],
    "b" => ["c", "a", "b", "c", "c", "a"]
].unwrap();
println!("{}", df_a);
a - i32 | b - str |
---|---|
1 | c |
2 | a |
2 | b |
1 | c |
4 | c |
1 | a |
Wanted output:
a - i32 | b - str | b_dist - i32 |
---|---|---|
1 | c | -1 |
2 | a | -1 |
2 | b | -1 |
1 | c | 2 |
4 | c | 0 |
1 | a | 3 |
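To make the formula concrete, here is a minimal sketch of it in plain Rust, outside polars: a single pass over the values of column b that remembers the last index at which each value was seen. The function name `prev_occurrence_distance` is made up for illustration and is not a polars API, and this is not necessarily the polars-native approach I'm after.

```rust
use std::collections::HashMap;

/// For each element, returns the number of rows strictly between it and the
/// previous occurrence of the same value, or -1 if the value has not been
/// seen before. (Hypothetical helper, not part of polars.)
fn prev_occurrence_distance(values: &[&str]) -> Vec<i32> {
    // Maps each value to the index of its most recent occurrence.
    let mut last_seen: HashMap<&str, usize> = HashMap::new();
    values
        .iter()
        .enumerate()
        .map(|(i, &v)| {
            // `insert` returns the index previously stored for this key, if any.
            match last_seen.insert(v, i) {
                Some(prev) => (i - prev - 1) as i32,
                None => -1,
            }
        })
        .collect()
}
```

Calling it on the values of column b, `["c", "a", "b", "c", "c", "a"]`, gives the b_dist column from the wanted output.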
What's the most efficient way to go about this?