I have a data similar to this.
B <- data.frame(State = c(rep("Arizona", 8), rep("California", 8), rep("Texas", 8)),
Account = rep(c("Balance", "Balance", "In the Bimester", "In the Bimester", "Expenses",
"Expenses", "In the Bimester", "In the Bimester"), 3), Value = runif(24))
You can see that Account
has 4 occurrences of the element "In the Bimester"
, two "chunks" of two elements for each state, "Expenses"
in between them.
The order here matters because the first chunk is not referring to the same thing as the second chunk.
My data is actually more complex, It has a 4th variable, indicating what each row of Account
means. The number of its elements for each Account
element (factor per se) can change. For example, In some state, the first "chunk" of "In the Bimester"
can have 6 rows and the second, 7; but, I cannot differentiate by this 4th variable.
Desired: I'd like to subset my data, spliting those two "In the Bimester"
by each state, subsetting only the rows of the first "chunks" by each state or the second "chunks".
I have a solution using data.table
package, but I'm finding it kind of poor. any thoughts?
library(data.table)
B <- as.data.table(B)
B <- B[, .(Account, Value, index = 1:.N), by = .(State)]
x <- B[Account == "Expenses", .(min_ind = min(index)), by = .(State)]
B <- merge(B, x, by = "State")
B <- B[index < min_ind & Account == "In the Bimester", .(Value), by = .(State)]