
I am analysing a data set race_df in R that contains races and their lap times. Each row has a categorical variable race_df$raceID (which I will treat as a factor, so that each race gets its own indicator variable) and an associated race_df$lap_time. I want to analyse the data set with linear regression and then apply shrinkage methods such as LASSO or ridge regression.

For shrinkage methods I need to standardise the data. However, the average lap time differs between races (because the tracks differ in length). So when standardising the column race_df$lap_time, shouldn't I use the mean and standard deviation of lap times within each race, rather than those of the whole column?
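To make the two options concrete, here is a minimal sketch in base R. The toy values and the new column names lap_time_global and lap_time_by_race are assumptions made up for illustration; only race_df, raceID and lap_time come from the question.

```r
# Hypothetical toy data: two races with very different typical lap times.
race_df <- data.frame(
  raceID   = factor(c("A", "A", "A", "B", "B", "B")),
  lap_time = c(80, 82, 84, 100, 105, 110)
)

# Option 1: whole-column standardisation (one mean and one sd for all races).
race_df$lap_time_global <- as.numeric(scale(race_df$lap_time))

# Option 2: per-race standardisation. ave() applies FUN within each raceID
# group and returns the results in the original row order.
race_df$lap_time_by_race <- ave(
  race_df$lap_time, race_df$raceID,
  FUN = function(x) (x - mean(x)) / sd(x)
)

print(race_df)
```

With option 2 every race ends up with mean 0 and standard deviation 1 in the new column, so differences in average lap time between races no longer appear in that column.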

Governor

0 Answers