15

From help("NA"):

There are also constants NA_integer_, NA_real_, NA_complex_ and NA_character_ of the other atomic vector types which support missing values: all of these are reserved words in the R language.

My question is why there is no NA_logical_ or similar, and what to do about it.

Specifically, I am creating several large very similar data.tables, which should be class compatible for later rbinding. When one of the data.tables is missing a variable, I am creating that column but with it set to all NAs of the particular type. However, for a logical I can't do that.

In this case, it probably doesn't matter too much (data.table dislikes coercing columns from one type to another, but it also dislikes adding rows, so I have to create a new table to hold the rbound version anyway), but I'm puzzled as to why the NA_logical_, which logically should exist, does not.

Example:

library(data.table)
Y <- data.table( a=NA_character_, b=rep(NA_integer_,5) )
Y[ 3, b:=FALSE ]
Y[ 2, a:="zebra" ]
> Y
       a  b
1:    NA NA
2: zebra NA
3:    NA  0
4:    NA NA
5:    NA NA
> class(Y$b)
[1] "integer"

Two questions:

  1. Why doesn't NA_logical_ exist, when its relatives do?
  2. What should I do about it in the context of data.table, or just to avoid coercion as much as possible? I assume using NA_integer_ buys me little in terms of coercion (it will coerce the logical I'm adding to 0L/1L, which isn't terrible, but isn't ideal).
Ari B. Friedman
  • 2
    I can't resist referring to thedailywtf.com, where people regularly explain that a `logical` has the possible values "TRUE, FALSE, File_Not_Found". Edit: this would have been funnier if Dirk E hadn't pointed out that `R` actually does this. – Carl Witthoft Oct 24 '13 at 13:35

2 Answers

16

NA is already logical so NA_logical_ is not needed. Just use NA in those situations where you need a missing logical. Note:

> typeof(NA)
[1] "logical"

Since the NA_*_ names are all reserved words there was likely a desire to minimize the number of them.

Example:

library(data.table)
X <- data.table( a=NA_character_, b=rep(NA,5) )
X[ 3, b:=FALSE ]
> X
    a     b
1: NA    NA
2: NA    NA
3: NA FALSE
4: NA    NA
5: NA    NA
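
As a minimal base R sketch of the point above (the variable names `na_types`, `promoted` and `missing_flag` are made up for illustration), the following checks the type of each NA constant and shows that a logical NA is promoted to whatever type it is combined with:

```r
# The plain NA constant is logical; the suffixed constants carry other types.
na_types <- c(
  plain     = typeof(NA),
  integer   = typeof(NA_integer_),
  real      = typeof(NA_real_),
  character = typeof(NA_character_)
)
print(na_types)

# A logical NA is promoted to the type of whatever it is combined with.
promoted <- c(
  with_integer   = typeof(c(NA, 1L)),
  with_double    = typeof(c(NA, 1.5)),
  with_character = typeof(c(NA, "zebra"))
)
print(promoted)

# An all-missing logical vector stays logical when a logical value is assigned.
missing_flag <- rep(NA, 5)
missing_flag[3] <- FALSE
print(typeof(missing_flag))
```

This only covers base R vectors; how a particular data.table version handles mixed column types when binding rows is not shown here.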
G. Grothendieck
  • 1
    Ah, that makes much sense! Could they add a variable `NA_logical_` which pointed to the reserved word by default, rather like how `T` points to `TRUE`? – Ari B. Friedman Oct 24 '13 at 11:49
  • If you really must have it then it's possible to define it yourself: `NA_logical_ <- NA` – G. Grothendieck Oct 24 '13 at 11:50
  • Sure, but then I would never use it because I'd be afraid it would fail on a new computer. Now that I know, `NA` is clearly the way to go. Just an interesting quirk. – Ari B. Friedman Oct 24 '13 at 11:53
  • 4
    It occurs to me that it makes total sense that `NA` is the logical version, since then it can be coerced "up" to anything else. – Ari B. Friedman Oct 24 '13 at 12:38
4

I think based on this

 #define NA_LOGICAL R_NaInt

from $R_HOME/R/include/R_ext/Arith.h, we can suggest using NA_integer_ or NA_real_, and hence plain old NA, in R code:

R> as.logical(c(0,1,NA))
[1] FALSE  TRUE    NA
R> 
R> as.logical(c(0L, 1L, NA_integer_))
[1] FALSE  TRUE    NA
R> 

which has

R> class(as.logical(c(0,1,NA)))
[1] "logical"
R> 
R> class(as.logical(c(0, 1, NA_real_)))
[1] "logical"
R> 

Or am I misunderstanding your question? R's logical type is three-valued: yay, nay and missing. And we can use the NA from either integer or numeric to cast. Does that help?
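
A small sketch of that cast, using base R only (the names `from_integer` and `from_real` are illustrative): converting a typed NA with `as.logical()` gives a value identical to plain `NA`.

```r
# Cast the integer and double missing values to logical.
from_integer <- as.logical(NA_integer_)
from_real    <- as.logical(NA_real_)

# Both results are logical and indistinguishable from plain NA.
print(typeof(from_integer))
print(identical(from_integer, NA))
print(identical(from_real, NA))
```

So, at the R level, the cast simply reproduces what writing `NA` directly already provides.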

Dirk Eddelbuettel