Objective and data
My goal is to look up each value of preceding in vehicle_id at a given frame_id and extract the corresponding value of v_vel into a new column called preceding_vel. I want to use the siuba Python package for this purpose. This is my dataframe:
import pandas as pd

df_mini_dict = {
    'vehicle_id': {884: 2, 885: 2, 886: 2, 14148: 44, 14149: 44, 14150: 44},
    'frame_id': {884: 338, 885: 339, 886: 340, 14148: 338, 14149: 339, 14150: 340},
    'preceding': {884: 44, 885: 44, 886: 44, 14148: 3355, 14149: 3355, 14150: 3355},
    'v_vel': {884: 6.299857770322456, 885: 6.427411525504063, 886: 6.590098168958994,
              14148: 7.22883474245701, 14149: 6.973590500351793, 14150: 6.727721962795176},
}
df_mini = pd.DataFrame.from_dict(df_mini_dict)
Working R solution
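In R, I can achieve the objective with dplyr using the following code (the structure() call already contains the expected preceding_vel values, which mutate() then recomputes):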
df_mini <- structure(list(vehicle_id = c(2L, 2L, 2L, 44L, 44L, 44L),
                          frame_id = c(338L, 339L, 340L, 338L, 339L, 340L),
                          preceding = c(44L, 44L, 44L, 3355L, 3355L, 3355L),
                          v_vel = c(6.29985777032246, 6.42741152550406,
                                    6.59009816895899, 7.22883474245701,
                                    6.97359050035179, 6.72772196279518),
                          preceding_vel = c(7.22883474245701, 6.97359050035179,
                                            6.72772196279518, NA, NA, NA)),
                     class = c("tbl_df", "tbl", "data.frame"),
                     row.names = c(NA, -6L))
library(dplyr)
df_mini <- df_mini |>
  dplyr::group_by(frame_id) |>
  dplyr::mutate(preceding_vel = v_vel[match(preceding, vehicle_id)]) |>
  dplyr::ungroup()
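For comparison, the same per-frame lookup can be sketched in plain pandas as a left self-merge. This is not siuba and not a direct translation of match(); it is a minimal sketch that assumes each (frame_id, vehicle_id) pair occurs at most once, since duplicates would multiply rows where match() would take only the first hit. The names lookup and result are illustrative.

```python
import pandas as pd

df_mini = pd.DataFrame({
    "vehicle_id": [2, 2, 2, 44, 44, 44],
    "frame_id": [338, 339, 340, 338, 339, 340],
    "preceding": [44, 44, 44, 3355, 3355, 3355],
    "v_vel": [6.299857770322456, 6.427411525504063, 6.590098168958994,
              7.22883474245701, 6.973590500351793, 6.727721962795176],
})

# Lookup table: the velocity of each vehicle at each frame, renamed so that
# vehicle_id lines up with the "preceding" column of the main frame.
# Assumption: (frame_id, vehicle_id) is unique in the data.
lookup = df_mini[["frame_id", "vehicle_id", "v_vel"]].rename(
    columns={"vehicle_id": "preceding", "v_vel": "preceding_vel"}
)

# A left join keeps every original row; a preceding id with no matching
# vehicle in the same frame (3355 here) gets NaN, like NA in the R output.
result = df_mini.merge(lookup, on=["frame_id", "preceding"], how="left")
print(result)
```

The merge plays the role of group_by(frame_id) plus match(): joining on frame_id restricts the lookup to the same frame, and joining on preceding finds the row of the preceding vehicle.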
Python attempt
Essentially, I am trying to do in siuba what dplyr does in R, but it seems that I need something like index() to do what match does. I tried the following, unsuccessfully:
def match(x, table):
    indicez = []
    for i in x:
        indicez.append(table.index(i))
    return indicez

from siuba import *

df_mini = (
    df_mini
    >> group_by(_.frame_id)  # grouping by frame id
    >> mutate(preceding_vel = _.v_vel[match(_.preceding, _.vehicle_id)])
)
This raises the following error:
TypeError: 'Symbolic' object is not iterable
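To make clear what behaviour I am after, here is a minimal pure-Python sketch of R's match semantics on concrete lists. Two things differ from my attempt above: list.index() raises ValueError for a missing value (3355 never appears in vehicle_id), whereas R returns NA, so the sketch returns None instead; and positions here are zero-based rather than one-based as in R. It does not by itself fix the error above, because inside mutate() the _ expressions are lazy Symbolic objects rather than actual columns. The sample values are a cut-down illustration of frame 338.

```python
def match(x, table):
    """For each element of x, return the position of its first occurrence
    in table, or None when it is absent (R's match returns NA there)."""
    first_pos = {}
    for pos, value in enumerate(table):
        # setdefault keeps the earliest position, mirroring "first match"
        first_pos.setdefault(value, pos)
    return [first_pos.get(value) for value in x]


# Illustrative data for a single frame (frame_id 338)
vehicle_id = [2, 44]
preceding = [44, 3355]
v_vel = [6.299857770322456, 7.22883474245701]

idx = match(preceding, vehicle_id)
# Index v_vel with the matched positions, propagating "no match" as None
preceding_vel = [v_vel[i] if i is not None else None for i in idx]
print(idx, preceding_vel)
```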
Please advise on the best way to define the match function, or suggest another approach that meets the objective. Thanks.