I have a data frame in which adjacent rows that share the same value should be grouped together and assigned a numeric identifier. The first group should get the value 1, the next group 2, and so on. I wrote a for loop to do this, but it takes too long to execute. Here's an example of what the data looks like:
Day Weather
1 Rainy
2 Rainy
3 Sunny
4 Sunny
5 Sunny
6 Rainy
7 Rainy
8 Windy
9 Windy
I would like to add the following column:
Day Weather Change.in.Weather
1 Rainy 1
2 Rainy 1
3 Sunny 2
4 Sunny 2
5 Sunny 2
6 Rainy 3
7 Rainy 3
8 Windy 4
9 Windy 4
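This is the loop I'm currently using:
# Start every row at group 1, then walk the rows in order
dataset$change.in.weather <- 1
for (i in 2:nrow(dataset)) {
  if (dataset$weather[i] == dataset$weather[i-1]) {
    # Same value as the previous row: stay in the same group
    dataset$change.in.weather[i] <- dataset$change.in.weather[i-1]
  } else {
    # Value changed: start a new group
    dataset$change.in.weather[i] <- dataset$change.in.weather[i-1] + 1
  }
}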
Since my dataset has over 1 million rows, the for loop takes too long to execute, so I'm looking for a faster alternative. Thanks!
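For reference, here is a minimal sketch of one way the same numbering might be computed without a loop. It is an assumption-laden illustration rather than a definitive answer: it assumes `weather` is a character vector with no `NA` values, it rebuilds the example data from the table above, and it uses the lowercase column names from the loop rather than the table headers. The idea is to compare each row with the one before it and take a running total of the changes; `rle()` is shown as a second option that should give the same result.

```r
# Example data rebuilt from the table in the question (column names follow the loop)
dataset <- data.frame(
  day = 1:9,
  weather = c("Rainy", "Rainy", "Sunny", "Sunny", "Sunny",
              "Rainy", "Rainy", "Windy", "Windy"),
  stringsAsFactors = FALSE
)

# TRUE wherever a row differs from the row directly before it
# (the first row has no predecessor, so this has nrow - 1 elements)
changed <- dataset$weather[-1] != dataset$weather[-nrow(dataset)]

# Start at 1 and add 1 at every change
dataset$change.in.weather <- cumsum(c(1, changed))
# 1 1 2 2 2 3 3 4 4

# Alternative: run-length encoding, then repeat each run's index by its length
runs <- rle(dataset$weather)
alt <- rep(seq_along(runs$lengths), times = runs$lengths)
```

Both versions operate on whole vectors, so they avoid the repeated element-by-element assignment into the data frame that the loop performs. If `weather` is stored as a factor, converting it with `as.character()` first is probably needed for `rle()`.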