Let's say I have continuous sample data like this: `d <- rnorm(100)`

Now I want this variable stored as a set of levels that correspond to specific intervals. For example, anything x < -1 would be level 1, -1 < x < 0 would be level 2, and so on.

I know we can create a new variable to store the levels, but is there any way of doing it without creating an additional variable holding just the levels, thereby preserving the data, i.e., factoring the variable based on a condition?

I want something that looks like this:

d  
# [1] -0.129731527  0.832232654 -1.204235933  ...
str(d)  
# Factor w/ n levels "1", "2" ...
  • It's not possible to do EXACTLY what you described, because that would defeat the purpose of `factor` variables. It looks like you want to keep your original values and, at the same time, assign a single level to multiple different original values. In your example you want both `-0.129731527` and `-1.204235933` to correspond to level `1`, which is not possible, as they are different values. – AntoniosK Jun 24 '19 at 11:19
  • So I'll have to create a new variable then. – JRBatHolmes Jun 25 '19 at 12:22

You can use `cut` for this:

# The second argument specifies the break points you want
dc <- cut(d, c(-Inf, -1, 0, 1, Inf))

The output will be a factor whose levels are the intervals defined by those breaks. If you want, you can then change the levels to numbers:

levels(dc) <- 1:4

But I would suggest leaving them as they are, since a factor is stored as integer codes underneath anyway.
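To make the "numbers underneath" point concrete, here is a minimal sketch of inspecting the result. The `set.seed` call and the name `codes` are assumptions added for illustration and are not part of the original answer.

```r
set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible
d <- rnorm(100)

# By default cut() builds right-closed intervals (right = TRUE),
# so a value of exactly 0 would fall in the second interval.
dc <- cut(d, breaks = c(-Inf, -1, 0, 1, Inf))

levels(dc)            # the interval labels
table(dc)             # how many values fall into each interval
head(as.integer(dc))  # the integer codes stored underneath the factor

# labels = FALSE makes cut() return the integer codes directly
# (hypothetical variable name `codes`)
codes <- cut(d, breaks = c(-Inf, -1, 0, 1, Inf), labels = FALSE)
all(codes == as.integer(dc))
```

Note that `d` itself is left untouched; the interval information lives in `dc` (or `codes`), which is why a second variable is still needed.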

LyzandeR