
What are the restrictions as to what characters (and maybe other restrictions) can be used for a variable name in R?

(This screams of general reference, but I can't seem to find the answer)

zx8754
Kyle Brandt
  • R FAQ 7.14: http://cran.r-project.org/doc/FAQ/R-FAQ.html#What-are-valid-names_003f – James Feb 08 '12 at 15:00
  • You might also be interested in the discussion here: http://stackoverflow.com/questions/8396577/check-if-character-value-is-a-valid-r-object-name/8396658#8396658 – Josh O'Brien Feb 08 '12 at 15:03
  • You should have found the link to `?make.names` in the help page for `read.table`. The help page I always have difficulty remembering is the one that describes the allowable escape characters, and the answer is `?Quotes`. – IRTFM Feb 08 '12 at 15:04
  • An Introduction to R, [Section 1.8: R commands, case sensitivity, etc.](http://cran.r-project.org/doc/manuals/R-intro.html#R-commands_003b-case-sensitivity-etc) – Joshua Ulrich Feb 08 '12 at 15:06

4 Answers


You might be looking for the discussion from `?make.names`:

A syntactically valid name consists of letters, numbers and the dot or underline characters and starts with a letter or the dot not followed by a number. Names such as ".2way" are not valid, and neither are the reserved words.

In the help file itself, there's a link to a list of reserved words, which are:

if else repeat while function for in next break

TRUE FALSE NULL Inf NaN NA NA_integer_ NA_real_ NA_complex_ NA_character_
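
A quick way to try the rule quoted above is to compare each candidate with what `make.names()` returns, since that function leaves a string unchanged only when it is already syntactically valid. This is a minimal sketch; the candidate names are made up for illustration.

```r
# Hypothetical candidate names, chosen to exercise each part of the rule
candidates <- c(
  "x",       # plain letter
  ".x",      # dot followed by a letter
  "my_var",  # underscore inside the name
  ".2way",   # dot followed by a number
  "2x",      # starts with a number
  "_x",      # starts with an underscore
  "if",      # reserved word
  "TRUE",    # reserved word
  "a b"      # contains a space
)

# make.names() rewrites anything that is not syntactically valid,
# so an unchanged string means the name was already valid
is_syntactic <- make.names(candidates) == candidates

data.frame(name = candidates, syntactic = is_syntactic)
```

Only the first three candidates should come back as syntactic; the rest are rewritten by `make.names()` (for example by prepending `X` or appending a dot).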

Other useful notes from the comments include James's pointer to the R FAQ entry on this issue and Josh's pointer to a related SO question about checking for syntactically valid names.

joran
  • Note that you could use anything if you quote your variable name with backticks, e.g. `` `TRUE` <- 2 `` is valid. – James Feb 08 '12 at 15:04
  • Of course, my problem was I was searching Google for the inverse ("restrictions") or basically *invalid* characters when I should have been searching for *valid* characters. – Kyle Brandt Feb 08 '12 at 15:05
  • Also, a dot on its own is valid. `. = 3; print(.)` – Aaron McDaid Sep 17 '14 at 14:17
  • What's the meaning of a name starting with a dot? I've seen many in R packages. – skan Sep 14 '17 at 13:51
  • @skan It is usually done in an attempt to avoid name collisions with common names of variables, columns of data frames, or other arguments passed along to different methods or other functions. – joran Sep 14 '17 at 17:22

Almost NONE! You can use `assign` to make ridiculous variable names:

assign("1",99)
ls()
# [1] "1"

Yes, that's a variable called `1`. Digit 1. Luckily it doesn't change the value of the number 1, and you have to work slightly harder to get its value:

1
# [1] 1
get("1")
# [1] 99

The "syntactic restrictions" some people might mention are purely imposed by the parser. Fundamentally, there's very little you can't call an R object. You just can't do it via the `<-` assignment operator with an unquoted name. `get` will set you free :)
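
As the comments on this answer point out, backtick quoting also gets a non-syntactic name past the parser, so neither `assign` nor `get` is strictly required. Here is a small sketch of that; the names `1` and `my var` are arbitrary examples, and everything runs inside `local()` only to keep the odd names out of the workspace.

```r
# Run inside local() so the unusual names do not leak into the workspace
result <- local({
  `1` <- 99              # assignment with a backtick-quoted name
  `my var` <- `1` + 1    # a name containing a space works the same way
  list(
    one     = `1`,       # read back with backticks
    my_var  = `my var`,
    via_get = get("1")   # get() reaches the same object
  )
})

str(result)
```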

MichaelChirico
Spacedman
  • Are there _literally_ no restrictions using `assign` and `get`, or do you run into some limitations at some point? – joran Feb 08 '12 at 15:08
  • I believe the phrase is "Enough rope to hang yourself with" :-P – Kyle Brandt Feb 08 '12 at 15:11
  • `?assign`: "There are no restrictions on ‘name’: it can be a non-syntactic name (see ‘make.names’)." This is of course a lie: `c=paste(rep(letters,10000),collapse=""); assign(c,123)` produces: `Error in assign(c, 123) : variable names are limited to 10000 bytes` – Spacedman Feb 08 '12 at 15:12
  • @joran -- From `?name`, "Names are limited to 10,000 bytes (and were to 256 bytes in versions of R before 2.13.0).", so there's at least one limit for you! – Josh O'Brien Feb 08 '12 at 15:12
  • @joran You could start the slow descent into madness though: `assign("get",ls)` – James Feb 08 '12 at 15:13
  • @JoshO'Brien A limit presumably only reached by German or Welsh programmers, with their love of compound words. :) – joran Feb 08 '12 at 15:18
  • You don't need `get()`, backtick quoting the name will reference it fine: `` `1` `` – Gavin Simpson Feb 08 '12 at 16:46
  • Good point, also works when assigning, so you don't need `assign` either – Spacedman Feb 08 '12 at 18:03
  • You guys are playing some dangerous games there. – Waldir Leoncio May 20 '15 at 20:27
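
The length limit mentioned in these comments can be checked without cluttering the workspace. The sketch below assumes a reasonably recent R (`strrep()` needs R 3.3.0 or later) and assigns into a throwaway environment; the variable names `too_long` and `outcome` are only illustrative.

```r
# A name one byte over the documented 10,000-byte limit
too_long <- strrep("a", 10001)

# Try the assignment in a scratch environment and capture the error, if any
outcome <- tryCatch(
  {
    assign(too_long, 1, envir = new.env())
    "assigned"
  },
  error = function(e) conditionMessage(e)
)

outcome
```

If the limit applies, `outcome` holds the error message rather than the string `"assigned"`.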

The following may not directly address your question, but it can help. Use `exists()` to check whether a name is already in use, so that you can avoid reusing system names for your own variables or functions. For example:

   > exists('for')
   [1] TRUE

   > exists('myvariable')
   [1] FALSE
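
The same check can be run over several names at once. This is a rough sketch: the last candidate is a made-up name assumed not to exist in your session. Note that `exists()` only reports whether a name is already bound somewhere on the search path; it says nothing about whether the name is syntactically valid.

```r
# Names to check; the last one is hypothetical and assumed to be unused
candidates <- c("for", "c", "mean", "hypothetical_unused_name_xyz")

# exists() takes one name at a time, so apply it to each candidate
in_use <- vapply(candidates, function(n) exists(n), logical(1))

in_use
```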
Stat-R

Using the `make.names()` function from the built-in base package may help:

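is_valid_name <- function(x)
{
  # Names are limited to 10,000 bytes (256 bytes before R 2.13.0)
  max_bytes <- if (getRversion() < "2.13.0") 256L else 10000L
  # The limit is in bytes, not characters
  is_short_enough <- nchar(x, type = "bytes") <= max_bytes
  # make.names() leaves a string unchanged only if it is already valid
  is_syntactic <- make.names(x) == x

  # Elementwise &, so x may be a vector of candidate names
  is_short_enough & is_syntactic
}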
omarflorez