I am relatively new to R and I am trying to split a continuous variable into two categories. Assume the following:
y = c(6.3, 6.2, 6.2, 5.5, 6.9, 6.8, 5.3, 5.3, 5.4, 5.2, 7.2, 7.1, 8.1, 8.2, 8.2, 7.4, 6.7, 7.2, 7.9, 8.0, 6.5, 6.6, 6.5, 7.2, 7.2, 6.8, 6.7)
cuts = cut(y, breaks=2)
cuts
[1] (5.197,6.7] (5.197,6.7] (5.197,6.7] (5.197,6.7] (6.7,8.203] (6.7,8.203] (5.197,6.7] (5.197,6.7] (5.197,6.7] (5.197,6.7] (6.7,8.203]
[12] (6.7,8.203] (6.7,8.203] (6.7,8.203] (6.7,8.203] (6.7,8.203] (6.7,8.203] (6.7,8.203] (6.7,8.203] (6.7,8.203] (5.197,6.7] (5.197,6.7]
[23] (5.197,6.7] (6.7,8.203] (6.7,8.203] (6.7,8.203] (6.7,8.203]
Levels: (5.197,6.7] (6.7,8.203]
I am especially interested in the value 6.7, which appears at the end of the vector. Why does 6.7 fall into the interval (6.7, 8.203] and not into (5.197, 6.7]? As far as I understand, the interval (6.7, 8.203] is open on the left, so 6.7 should NOT be part of it. Am I missing something? Thanks for the help!
Edit:
As pointed out in the comments, 6.7 is actually stored as 6.7000000000000001776:
options(digits=20);
y
[1] 6.2999999999999998224 6.2000000000000001776 6.2000000000000001776 5.5000000000000000000 6.9000000000000003553 6.7999999999999998224
[7] 5.2999999999999998224 5.2999999999999998224 5.4000000000000003553 5.2000000000000001776 7.2000000000000001776 7.0999999999999996447
[13] 8.0999999999999996447 8.1999999999999992895 8.1999999999999992895 7.4000000000000003553 6.7000000000000001776 7.2000000000000001776
[19] 7.9000000000000003553 8.0000000000000000000 6.5000000000000000000 6.5999999999999996447 6.5000000000000000000 7.2000000000000001776
[25] 7.2000000000000001776 6.7999999999999998224 6.7000000000000001776
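For reference, the break points themselves can be printed at full precision. The following is only a sketch of what cut seems to do when breaks is a single number, based on my reading of the source of cut.default (equally spaced breaks over the range, with the two outer breaks pushed out by 0.1% of the range). The names rx, dx and brks are my own and not part of any API.

```r
y <- c(6.3, 6.2, 6.2, 5.5, 6.9, 6.8, 5.3, 5.3, 5.4, 5.2, 7.2, 7.1, 8.1, 8.2,
       8.2, 7.4, 6.7, 7.2, 7.9, 8.0, 6.5, 6.6, 6.5, 7.2, 7.2, 6.8, 6.7)

# Assumed reconstruction of the breaks that cut(y, breaks = 2) uses internally
rx <- range(y)                                   # smallest and largest value
dx <- diff(rx)                                   # width of the range
brks <- seq.int(rx[1], rx[2], length.out = 3)    # 3 break points give 2 intervals
brks[c(1, 3)] <- c(rx[1] - dx / 1000, rx[2] + dx / 1000)

# Show the breaks and the stored value of 6.7 with more digits than the labels
print(brks, digits = 17)
sprintf("%.20f", 6.7)

# Compare the middle break with the literal 6.7
brks[2] < 6.7

# Check that the reconstructed breaks reproduce the grouping from cut(y, breaks = 2)
identical(as.integer(cut(y, breaks = brks)), as.integer(cut(y, breaks = 2)))
```

If the middle break turns out to be slightly smaller than the stored 6.7, that would explain the grouping: the labels show only 3 significant digits by default (dig.lab = 3), so the printed bound 6.7 is a rounded version of the break that is actually used.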
An additional question:
I want to save the interval bounds for later reference so that I can check which interval new elements fall into. Suppose cut has generated the intervals (5.197,6.7] and (6.7,8.203], and I then get a new element x = 6.7.
If I test 5.197 < x <= 6.7 against the printed bounds, x falls into the first interval, whereas the original 6.7 from the vector fell into the second interval.
Is cuts = cut(y, breaks=2, dig.lab=17)
really the way to go here to get both elements into the same interval?
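One possible alternative to parsing the labels, sketched here under the assumption that the breaks can be computed up front: keep the numeric break vector itself and pass it to cut for both the original data and any new element. The break computation mimics what cut appears to do for breaks = 2, and the names brks and x_new are hypothetical.

```r
y <- c(6.3, 6.2, 6.2, 5.5, 6.9, 6.8, 5.3, 5.3, 5.4, 5.2, 7.2, 7.1, 8.1, 8.2,
       8.2, 7.4, 6.7, 7.2, 7.9, 8.0, 6.5, 6.6, 6.5, 7.2, 7.2, 6.8, 6.7)

# Compute the numeric breaks once and store them (assumed to mirror cut's defaults)
rx <- range(y)
dx <- diff(rx)
brks <- seq.int(rx[1], rx[2], length.out = 3)
brks[c(1, 3)] <- c(rx[1] - dx / 1000, rx[2] + dx / 1000)

# Classify the original data with the stored breaks
cuts <- cut(y, breaks = brks)

# Classify a new element with exactly the same breaks
x_new <- 6.7
new_cut <- cut(x_new, breaks = brks)

# The new 6.7 and the last element of y are compared against identical
# numeric breaks, so they end up in the same interval
as.integer(new_cut) == as.integer(cuts[length(y)])
```

Because the comparison uses the stored doubles rather than the rounded labels, the result does not depend on dig.lab; dig.lab only changes how the bounds are printed.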