Referring to the question "R - delete consecutive (ONLY) duplicates", I am using the same formula:

df[c(df$x[-1] != df$x[-nrow(df)],TRUE),]

But this keeps only the last value of each run of duplicates, and I want the first one. How can I change that? Thank you!

Phil
S B

1 Answer

Here are a few options.

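First, you can use rle to get the length of each run of consecutive equal values. The first run always starts at index 1, and each later run starts where the previous one ended, so a cumulative sum over 1 and the run lengths (dropping the last length) gives the index of the first row in every run.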

lens <- rle(df$x)$lengths
df[cumsum(c(1, lens[-length(lens)])), ]
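To see the indices this produces, here is a small sketch on made-up data. The data frame and its `id` column are assumptions for illustration only, not taken from the question.

```r
# Hypothetical example data; `id` just records the original row number
df <- data.frame(x = c(1, 1, 2, 2, 2, 3, 1, 1), id = 1:8)

lens <- rle(df$x)$lengths                    # run lengths: 2 3 1 2
starts <- cumsum(c(1, lens[-length(lens)]))  # run starts:  1 3 6 7

first_rows <- df[starts, ]
first_rows$id                                # 1 3 6 7
```

The last run length is dropped because it would only point past the end of the data.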

As an alternative, with dplyr you can flag every row where x differs from the row before it (the first row is always flagged) and keep only the flagged rows, which are the first rows of each run. Note that diff() needs x to be numeric.

library(dplyr)

df %>%
  group_by(grp = c(TRUE, diff(x) != 0)) %>%  # TRUE marks the first row of each run
  filter(grp) %>%
  ungroup() %>%
  select(-grp)
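The same flag can be built in base R, which may help to show what the pipeline filters on. This is only a sketch on hypothetical data (`df`, `id` and `ch` are made up). The second variant is the formula from the question with TRUE moved to the front, and it should also work when x is not numeric.

```r
# Hypothetical example data
df <- data.frame(x = c(1, 1, 2, 2, 2, 3, 1, 1), id = 1:8)

# Flag rows that differ from the previous row; the first row is always kept
is_first <- c(TRUE, diff(df$x) != 0)

# Same flag without diff(): compare each value with the one before it
is_first_cmp <- c(TRUE, df$x[-1] != df$x[-nrow(df)])

kept <- df[is_first_cmp, ]
kept$id                                      # 1 3 6 7

# The comparison form also handles character vectors
ch <- c("a", "a", "b", "a")
ch_first <- c(TRUE, ch[-1] != ch[-length(ch)])
```

Putting TRUE at the end of the vector, as in the question, shifts the flags by one position and so keeps the last row of each run instead.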

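Or with data.table you can use rleid, a function that generates a run-length type group id, so every row in the same run of consecutive values gets the same id. duplicated() is TRUE for every row after the first one with a given id, so negating it keeps only the first row of each run.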

library(data.table)

setDT(df)[!duplicated(rleid(x))]
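To illustrate the idea behind the run ids, here is a rough base R imitation. It is not how data.table implements rleid, and `run_id` and the example data are hypothetical names chosen for this sketch.

```r
# Hypothetical example data
df <- data.frame(x = c(1, 1, 2, 2, 2, 3, 1, 1), id = 1:8)

# A new run starts wherever a value differs from the previous one;
# the cumulative sum of those flags numbers the runs 1, 2, 3, ...
changed <- c(TRUE, df$x[-1] != df$x[-nrow(df)])
run_id <- cumsum(changed)                    # 1 1 2 2 2 3 4 4

# Keep the first row seen for each run id
first_rows <- df[!duplicated(run_id), ]
first_rows$id                                # 1 3 6 7
```

Because the ids number runs rather than values, the two separate runs of 1 get different ids (1 and 4), so only consecutive duplicates are dropped.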
Ben