How to get last N rows of each group in sparklyr?

Question

I have a Spark data frame with the columns id, category, timestamp, and price. I want to group the data by customer id and category, sort by timestamp, and get the last n rows in each group.

I tried the code below, but it returns just 3 rows for the whole data set.
a <- data1 %>% dplyr::group_by(customer_id, category) %>% dplyr::arrange(dplyr::desc(timestamp)) %>% head(., n = 3)

Please suggest an efficient solution.

score -1 · Answer 1 · answered May 07 '20 at 12:59


Without example data we can't know if this will work.

In base R:

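# Sort by timestamp so the latest rows come last
data1 <- data1[order(data1$timestamp), ]
# Split by customer and keep the last 5 rows of each group
lapply(split(data1, data1$customer_id), tail, n = 5)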
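
For a Spark table, a window function is one way to get the last rows per group without collecting the data into R. In sparklyr, head() is translated to a LIMIT on the whole query, which would explain why the attempt in the question returns only 3 rows overall. The following is a minimal sketch, not a tested sparklyr recipe: it assumes the columns are named customer_id, category and timestamp as in the question's code, and last_n_per_group and n_rows are hypothetical names, not part of sparklyr or dplyr.

```r
library(dplyr)

# Hypothetical helper: keep the n_rows most recent rows per (customer_id, category).
# `tbl` is assumed to be a Spark table reference (tbl_spark); a local data frame
# also works, which makes the logic easy to try without a cluster.
last_n_per_group <- function(tbl, n_rows = 3) {
  tbl %>%
    group_by(customer_id, category) %>%
    # row_number(desc(timestamp)) numbers the rows newest-first within each group;
    # on Spark this should be translated to
    # ROW_NUMBER() OVER (PARTITION BY ... ORDER BY timestamp DESC)
    filter(row_number(desc(timestamp)) <= !!n_rows) %>%
    ungroup()
}
```

With the question's data this would be called as a <- last_n_per_group(data1, 3). Rows that share the same timestamp are numbered in an arbitrary order, so add a tie-breaking column to the ordering if that matters.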


Daniel O


I can implement this in R directly using dplyr::top_n, but I want it in sparklyr. – Yashwanth May 07 '20 at 13:30
