I have a simple data frame with two columns, "RECORDS" and "FLAG", which looks like this:

RECORDS  FLAG
H12434   TRUE
W3211    FALSE
Maa      FALSE
Mab      FALSE
Mac      FALSE
Mad      FALSE
T1_12    FALSE
H7367    TRUE
R001     FALSE
W4810.5  FALSE
Maa      FALSE
Mab      FALSE
T2_12    FALSE

I want to replace the first TRUE, and every FALSE that follows it, with 1. When the second TRUE appears, the counter should increase by 1, so that the second TRUE and the FALSE values after it become 2. The result should look like this:

RECORDS  FLAG
H12434   1
W3211    1
Maa      1
Mab      1
Mac      1
Mad      1
T1_12    1
H7367    2
R001     2
W4810.5  2
Maa      2
Mab      2
T2_12    2

I have tried many for loops like the one below, but I don't have enough experience to get them to work.

counter = 0
for (i in seq_along(data))
  {
    if(data$flag == TRUE) 
    {
      counter <- counter + 1
      data$flag <- counter
    }
    else
    {
      data$flag <- counter
    }
  }

I was hoping someone could help me understand what I am doing wrong here. Thanks.

Ankit

1 Answer

I'm not entirely clear on what you're after, but isn't this just a simple matter of cumsum?

transform(df, FLAG = cumsum(FLAG))
#   RECORDS FLAG
#1   H12434    1
#2    W3211    1
#3      Maa    1
#4      Mab    1
#5      Mac    1
#6      Mad    1
#7    T1_12    1
#8    H7367    2
#9     R001    2
#10 W4810.5    2
#11     Maa    2
#12     Mab    2
#13   T2_12    2
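
This works because R coerces logical values to numbers in arithmetic: TRUE counts as 1 and FALSE as 0, so the running total goes up by one at every TRUE and stays flat across the FALSE rows. A minimal sketch on a short vector (the name `flag` and its values are made up for illustration):

```r
# Illustrative logical vector, not the data from the question
flag <- c(TRUE, FALSE, FALSE, TRUE, FALSE)

# TRUE/FALSE coerce to 1/0
as_int <- as.integer(flag)
as_int
# 1 0 0 1 0

# The running sum therefore increases only where flag is TRUE
counts <- cumsum(flag)
counts
# 1 1 1 2 2
```

Note that any FALSE rows before the first TRUE would get 0 under this approach.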

Or using dplyr:

library(dplyr)
df %>% mutate(FLAG = cumsum(FLAG))
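
As for the loop in the question, a few things appear to get in its way: `seq_along(data)` runs over the columns of a data frame rather than its rows, `data$flag` does not match the column `FLAG` because names are case sensitive, the `if` tests the whole column instead of one element, and each assignment overwrites the entire column. A repaired version might look like the sketch below; the small data frame and the helper vector `group` are assumptions made for illustration, not part of the original post.

```r
# Hypothetical small data frame mirroring the question's layout
data <- data.frame(
  RECORDS = c("H12434", "W3211", "Maa", "H7367", "R001"),
  FLAG    = c(TRUE, FALSE, FALSE, TRUE, FALSE)
)

counter <- 0L
group <- integer(nrow(data))        # results go into a separate vector

for (i in seq_len(nrow(data))) {    # iterate over rows, not columns
  if (data$FLAG[i]) {               # test a single element at a time
    counter <- counter + 1L         # a new TRUE starts a new group
  }
  group[i] <- counter               # assign only the i-th position
}

data$FLAG <- group                  # replace the column once, at the end
```

The loop gives the same numbers as `cumsum`, which is shorter and vectorised, so the loop is mainly useful for seeing what the one-liner does.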

Sample data

df <- read.table(text =
    "RECORDS  FLAG
H12434   TRUE
W3211    FALSE
Maa      FALSE
Mab      FALSE
Mac      FALSE
Mad      FALSE
T1_12    FALSE
H7367    TRUE
R001     FALSE
W4810.5  FALSE
Maa      FALSE
Mab      FALSE
T2_12    FALSE", header = TRUE)
Maurits Evers