Get last row that matches conditions in a data.table in R

Question

I have data that looks like this:

d <- data.table(SEQ = c(1, 1, 2, 2,
                        1, 2, 2, 3,
                        1, 1, 2, 3,
                        3, 4, 5, 5),
                TYPE = c("A", "A", "B", "B",
                         "C", "C", "C", "C",
                         "D", "D", "D", "D",
                         "D", "D", "D", "D"), 
                CLASS = c(1.1, 2.1, 3.3, 4.5,
                          1.3, 2.4, 3.4, 4.6,
                          1.4, 2.5, 3.6, 4.0,
                          1.4, 2.7, 3.2, 4.5))

That looks like this:

I want to filter this down into a new table, call it b.
The logic for filtering would be for each unique(TYPE) only keep the last row for each unique(SEQ). I know I can get this done with some for loops but I would like to avoid that.

The expected output for the sample data would be:

If anyone has any suggestions of efficient ways to filter, I am all ears. Thanks!

`unique(d, by=c("SEQ", "TYPE"), fromLast=TRUE)` should work... seeing if I can find a dupe... — Frank, Jul 24 '18 at 20:53
Check this answer, if you need some ordering https://stackoverflow.com/a/50906379/5795592 — hannes101, Jul 24 '18 at 20:54

score 2 · Accepted Answer · answered Jul 24 '18 at 20:48

You want to group by using both groups, and then use .N to select the last line within each subset of data, ie .SD :

R> d[, .SD[.N], by=.(SEQ,TYPE)]
    SEQ TYPE CLASS
 1:   1    A   2.1
 2:   2    B   4.5
 3:   1    C   1.3
 4:   2    C   3.4
 5:   3    C   4.6
 6:   1    D   2.5
 7:   2    D   3.6
 8:   3    D   1.4
 9:   4    D   2.7
10:   5    D   4.5
R>

AndS. · Answer 2 · 2018-07-24T21:48:49.540

1

You could also try:

library(dplyr)
d %>% group_by(TYPE, SSEQ) %>% filter(row_number() == max(row_number()))
#     SEQ TYPE  CLASS
# 1     1 A       2.1
# 2     2 B       4.5
# 3     1 C       1.3
# 4     2 C       3.4
# 5     3 C       4.6
# 6     1 D       2.5
# 7     2 D       3.6
# 8     3 D       1.4
# 9     4 D       2.7
#10     5 D       4.5

edited Jul 24 '18 at 21:48

answered Jul 24 '18 at 20:51

AndS.

7,748
2
12
17

2

You need two groups, look at his reference output. – Dirk Eddelbuettel Jul 24 '18 at 20:52
Yes, I missed that. Editing now. Thanks. – AndS. Jul 24 '18 at 20:53
2

You could also try mentioning the package you are using – David Arenburg Jul 24 '18 at 21:06
Absolutely, I apologize for its omission. – AndS. Jul 24 '18 at 21:49

Get last row that matches conditions in a data.table in R

2 Answers2