-2

I have a Spark data frame with the columns id, category, timestamp, and price. I want to group the data by customer id and category, sort each group by timestamp, and get the last n rows in each group.

I tried the code below, but it returns just 3 rows for the whole data set.
a <- data1 %>% dplyr::group_by(customer_id, category) %>% dplyr::arrange(dplyr::desc(timestamp)) %>% head(., n = 3)

Please suggest an efficient solution.

Yashwanth

1 Answer

-1

Without example data we can't know if this will work.

In base R:

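# Sort by timestamp, then take the last rows of each customer_id/category group
data1 <- data1[order(data1$timestamp), ]
groups <- split(data1, list(data1$customer_id, data1$category), drop = TRUE)
lapply(groups, tail, n = 5)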
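
For what it's worth, the attempt in the question returns only 3 rows because head() ignores the grouping and truncates the whole result. A minimal dplyr sketch that ranks rows within each group instead is below. The sample data and the name n_last are hypothetical, made up for illustration. I have only run this on a local data frame; on a Spark table accessed through sparklyr the same verbs should translate to a SQL window function, but treat that as an assumption until you test it on your data.

```r
library(dplyr)

# Hypothetical sample data standing in for the real table
data1 <- data.frame(
  customer_id = c(1, 1, 1, 1, 2, 2),
  category    = c("a", "a", "a", "a", "b", "b"),
  timestamp   = as.POSIXct("2020-01-01", tz = "UTC") + 3600 * c(1, 2, 3, 4, 1, 2),
  price       = c(10, 20, 30, 40, 50, 60)
)

n_last <- 3  # number of most recent rows to keep per group (assumed value)

last_n <- data1 %>%
  group_by(customer_id, category) %>%
  # Rank rows within each group, newest first, and keep the first n_last
  filter(row_number(desc(timestamp)) <= n_last) %>%
  # Put the kept rows back in chronological order within each group
  arrange(customer_id, category, timestamp) %>%
  ungroup()

print(last_n)
```

Groups with fewer than n_last rows are kept whole, so customer 2 above keeps both of its rows.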
Daniel O