I am trying to conditionally replace some fields in a data frame; however, my code is finding only about 25% of the instances actually present. I've searched through the other conditional search questions, but didn't find anything matching my problem; I apologize in advance if I missed one.
Specifically, I am trying to replace the numbers 1 to 9 in dta$day with the letters a to i.
Here are the first 99 items in that vector: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 1 2 3 4 5 6 7 8 9
When I conditionally search for values 1 to 9, using:
dta$day == c("1","2","3","4","5","6","7","8","9")
R reports that only the first and last runs of 1 to 9 match my condition, as shown below (I've bolded approximately what should be TRUE for your reference):
[1] **TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE** FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[17] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE **FALSE**
[33] **FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE** FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[49] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE **FALSE FALSE FALSE FALSE FALSE**
[65] **FALSE FALSE FALSE FALSE** FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[81] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE **TRUE TRUE TRUE TRUE TRUE TRUE**
[97] **TRUE TRUE TRUE**
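A possible reading of this output, offered as a sketch rather than a confirmed diagnosis: `==` compares element by element and recycles the shorter vector, so position k of the day vector is only compared with target number ((k - 1) %% 9) + 1, not with all nine targets. The snippet below rebuilds the 99 days listed above (assumed to match the real dta$day) and contrasts `==` with `%in%`, which tests set membership. The names `targets`, `by_equals`, `recycled`, `same`, and `by_in` are made up for illustration.

```r
# First 99 days, rebuilt from the listing above (assumed to match dta$day[1:99])
day <- c(1:31, 1:28, 1:31, 1:9)
targets <- as.character(1:9)

# `==` is element-wise: the 9 targets are recycled 11 times to reach length 99
by_equals <- day == targets

# The same comparison with the recycling written out explicitly
recycled <- rep(targets, length.out = length(day))
same <- identical(by_equals, as.character(day) == recycled)

# Positions flagged by `==`: only the runs that happen to line up with the recycling
which(by_equals)

# `%in%` checks each day against all nine targets, so every run of 1 to 9 is flagged
by_in <- day %in% targets
which(by_in)
```

In this sketch `==` flags positions 1 to 9 and 91 to 99 only, which matches the output above, while `%in%` also flags positions 32 to 40 and 60 to 68.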
The problem must be in that first step, but to show the result: after applying the code below, only the first and last runs of 1 to 9 in that part of my vector are replaced correctly:
dta[dta$day == c("1","2","3","4","5","6","7","8","9"), 1] <- c("a", "b", "c", "d", "e", "f", "g", "h", "i")
[1] **"a" "b" "c" "d" "e" "f" "g" "h" "i"** "10" "11" "12" "13" "14" "15" "16" "17" "18" "19"
[20] "20" "21" "22" "23" "24" "25" "26" "27" "28" "29" "30" "31" **"1" "2" "3" "4" "5" "6" "7"**
[39] **"8" "9"** "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "20" "21" "22" "23" "24" "25" "26"
[58] "27" "28" **"1" "2" "3" "4" "5" "6" "7" "8" "9"** "10" "11" "12" "13" "14" "15" "16" "17"
[77] "18" "19" "20" "21" "22" "23" "24" "25" "26" "27" "28" "29" "30" "31" **"a" "b" "c" "d" "e"
[96] "f" "g" "h" "i"**
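For comparison, here is a minimal sketch of one way the replacement might be written, assuming the intended mapping is 1 to a, 2 to b, and so on up to 9 to i. It works on a standalone copy of the 99 days rather than on dta, and the names `hit` and `out` are hypothetical. Indexing the built-in `letters` vector by the day value means the result does not depend on how the runs of 1 to 9 line up.

```r
# Standalone copy of the first 99 days (assumed to match dta$day[1:99])
day <- c(1:31, 1:28, 1:31, 1:9)

# Logical mask: TRUE wherever the day is one of 1..9
hit <- day %in% 1:9

# Work on a character copy so letters and numbers can share one vector
out <- as.character(day)

# Look up each matching day in the built-in `letters` vector (1 -> "a", ..., 9 -> "i")
out[hit] <- letters[day[hit]]

out[30:41]
```

Days 10 and above are left untouched as character strings, the same as in the output above.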
If useful, here is the initial state of that vector:
is.numeric(dta$day)
[1] TRUE
summary(dta$day)
Min. 1st Qu. Median Mean 3rd Qu. Max.
1.00 8.00 16.00 15.73 23.00 31.00
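Here is code that builds a toy version of the data frame:
# Days of the month for January through April (non-leap year)
day <- c(1:31, 1:28, 1:31, 1:30)
# Month number repeated once per day
month <- c(rep_len(1, 31), rep_len(2, 28), rep_len(3, 31), rep_len(4, 30))
# Random temperatures with mean 10 and sd 10 (values differ on every run)
temp <- rnorm(length(month), 10, 10)
dta <- as.data.frame(cbind(day, month, temp))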
Although I am able to reproduce the problem with this toy example, it also produces a warning that I do not get with my actual data (not reproduced here because it is very large): "longer object length is not a multiple of shorter object length".
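One guess about why the warning appears only with the toy data, not checked against the real data: the toy vector has 120 elements, and 120 is not a multiple of 9, so the nine targets cannot be recycled a whole number of times. The real data might simply have a length that divides evenly by 9. The sketch below checks the remainder and captures the warning text; `targets`, `leftover`, and `msg` are names invented for illustration.

```r
# Toy day vector from the example above: 31 + 28 + 31 + 30 = 120 elements
day <- c(1:31, 1:28, 1:31, 1:30)
targets <- as.character(1:9)

# Remainder left over when 9 targets are recycled across 120 days;
# a non-zero value means the last recycling pass is incomplete
leftover <- length(day) %% length(targets)
leftover

# Capture the warning message instead of letting it print;
# msg stays NA if the comparison raises no warning
msg <- tryCatch(
  {
    day == targets
    NA_character_
  },
  warning = function(w) conditionMessage(w)
)
msg
```

Running `length(dta$day) %% 9` on the real data would show whether this guess holds: a result of 0 would be consistent with the warning being absent there.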
I would love some help, and if I haven't provided something or haven't done so in the format needed, please kindly let me know!