
I'm very confused by the data structure concepts in R (they are much easier to understand in SAS).

Is there any difference between `x <- 1:5` and `x <- c(1,2,3,4,5)`? From the environment window, I know that one is int and the other is num.

Rich Scriven
Zhaoshuo Ge
  • `:` is a function for creating integer sequences. Both vectors are numeric, but the latter is numeric double. Double is numeric, integer is integer and also numeric. Integers generally make operations faster. – Rich Scriven Oct 26 '15 at 01:26
  • If you wanted to force the second option to be integer type instead of numeric, you could do `x <- c(1L, 2L, 3L, 4L, 5L)` or `x <- as.integer(c(1, 2, 3, 4, 5))`. – cocquemas Oct 26 '15 at 01:27
  • I believe SAS has only one data structure, so I imagine that it would be easier. – rawr Oct 26 '15 at 01:33

1 Answer


`x` and `y` below are not quite identical because they have different storage modes (integer versus double), as you saw in the environment window (`str(x)` and `str(y)` report the same thing). In my experience, this distinction is unimportant 99% of the time; R is loosely typed, and integers are automatically promoted to double (i.e. double-precision floating point) when necessary. Integers and whole-number doubles whose magnitude does not exceed the maximum integer value (`.Machine$integer.max`) can be converted back and forth without loss of information. (Integers take less space to store, 4 bytes per element versus 8 for doubles, and can be slightly faster to compute with, as @RichardScriven comments above.)
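The promotion and round-trip behaviour described above can be checked directly. This is a minimal sketch; the names `big`, `overflowed` and `n` are chosen only for illustration.

```r
x <- 1:5                       # integer vector
y <- c(1, 2, 3, 4, 5)          # double vector

# Promotion happens automatically when an operation needs it
typeof(x + 1L)                 # "integer": both operands are integer
typeof(x + 1)                  # "double": the literal 1 is a double
typeof(x / 2L)                 # "double": `/` always returns a double

# Whole numbers within the integer range survive a round trip
identical(as.double(as.integer(y)), y)   # TRUE
identical(as.integer(as.double(x)), x)   # TRUE

# Past .Machine$integer.max, integer arithmetic overflows to NA (with a
# warning), while double arithmetic carries on
big <- .Machine$integer.max
overflowed <- suppressWarnings(big + 1L)
is.na(overflowed)              # TRUE
typeof(big + 1)                # "double"

# Storage: an integer vector is smaller than a double vector of equal length
n <- 1e6
object.size(integer(n)) < object.size(numeric(n))   # TRUE
```

The overflow case is the main situation where the integer/double distinction is likely to matter in everyday code.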

If you want to create an integer vector, append `L` to each literal as below, or use `as.integer()` as suggested in the comments above.

x <- 1:5
y <- c(1,2,3,4,5)
z <- c(1L,2L,3L,4L,5L)
all.equal(x,y) ## test for _practical_ equivalence: TRUE
identical(x,y) ## FALSE
identical(x,z) ## TRUE

`storage.mode()` and `class()` may also be useful, as well as `is.integer()`, `is.double()`, `as.integer()`, `as.double()`, `is.numeric()` ...
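As a rough illustration of what these helpers report for the two kinds of vector (the name `y_int` is hypothetical, used only for this sketch):

```r
x <- 1:5
y <- c(1, 2, 3, 4, 5)

class(x)          # "integer"
class(y)          # "numeric"
storage.mode(x)   # "integer"
storage.mode(y)   # "double"

is.numeric(x)     # TRUE: integers count as numeric
is.numeric(y)     # TRUE
is.integer(y)     # FALSE
is.double(y)      # TRUE

# Converting the double vector gives something identical to 1:5
y_int <- as.integer(y)
identical(x, y_int)   # TRUE
```

Note that `is.numeric()` is `TRUE` for both vectors, so it cannot tell integer from double storage; `is.integer()` or `storage.mode()` is the one to use for that.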

Ben Bolker