
Short question: if I have a string, how can I test whether it is a valid color representation in R?

I tried two things. The first uses the function col2rgb() to test whether the string is a color:

isColor <- function(x)
{
  res <- try(col2rgb(x),silent=TRUE)
  return(!"try-error"%in%class(res))
}

> isColor("white")
[1] TRUE
> isColor("#000000")
[1] TRUE
> isColor("foo")
[1] FALSE

This works, but it doesn't seem very pretty and isn't vectorized. The second approach just checks whether the string is in the colors() vector or is a # followed by 6 to 8 hexadecimal digits:

isColor2 <- function(x)
{
  return(x%in%colors() | grepl("^#(\\d|[a-f]){6,8}$",x,ignore.case=TRUE))
}

> isColor2("white")
[1] TRUE
> isColor2("#000000")
[1] TRUE
> isColor2("foo")
[1] FALSE

This works, though I am not sure how stable it is. It seems that there should be a built-in function to make this check?
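One stability concern with the pattern above can be shown directly: the `{6,8}` quantifier also admits 7 hex digits, which `col2rgb()` rejects. The sketch below compares it with a tightened pattern; the `strict` pattern and the variable names are assumptions made for illustration, not part of the original question.

```r
# Pattern from isColor2(), and a hypothetical tightened alternative
loose  <- "^#(\\d|[a-f]){6,8}$"
strict <- "^#([0-9a-f]{6}|[0-9a-f]{8})$"  # assumption: only 6 or 8 digits allowed

x <- c("#000000", "#0000000", "#00000000")  # 6, 7 and 8 hex digits

grepl(loose,  x, ignore.case = TRUE)  # TRUE TRUE TRUE
grepl(strict, x, ignore.case = TRUE)  # TRUE FALSE TRUE

# col2rgb() signals an error for the 7-digit string
sevenFails <- inherits(try(col2rgb("#0000000"), silent = TRUE), "try-error")
sevenFails
```

Even the tightened pattern only covers the hex case, so it does not by itself settle which other strings R accepts as colors.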

Sacha Epskamp
  • I suppose doing `trycatch` on `Rgames> plot(1,2,col='phlogiston') Error in plot.xy(xy, type, ...) : invalid color name 'phlogiston'` is not helpful :-) – Carl Witthoft Nov 08 '12 at 12:24
  • Sorry - the SO timeout caught me in mid-edit. The choice of function depends on what you want to do with it. Is throwing an error sufficient (which `plot` does already), or do you want to "repair" a bad color spec? If the latter, you're going to have to roll your own function anyway, based on what you view as the proper correction algorithm – Carl Witthoft Nov 08 '12 at 12:34
  • You might have the alpha digits. `isColor( "#00000000" )` should return `TRUE` – Romain Francois Nov 08 '12 at 13:01
  • @Romain yes should be 6 to 8 digits, changed it. @Carl I like having arguments of functions very flexible. E.g. a `color` argument that can be assigned a color to directly use that color, or `TRUE` to use some algorithm to define the color, or `FALSE` to omit it. – Sacha Epskamp Nov 08 '12 at 13:24
  • As Gavin's answer and comment indicate, you're going down a #FFFFFFCC path. For comparison, would you think it sensible to parse arguments to an arbitrary function to ensure that said argument names exist in the current environment? (I'd hope the answer is "no".) And further, what if you have a variable `my_colors<-c('red','blue','boogersnot')` ? Is invoking `plot(x,y,col=my_colors)` an error or not? – Carl Witthoft Nov 08 '12 at 13:48
  • You could create a vector with the actual hex values & `colors`. Then use a hash table by placing the vector in an environment and use `if(exists(x, env=env)) TRUE else FALSE`. If this were in a package with the environment saved as a data set it would be pretty fast (I think *meekishly*). – Tyler Rinker Nov 08 '12 at 13:49

2 Answers

27

Your first idea (using col2rgb() to test color names' validity for you) seems good to me, and just needs to be vectorized. As for whether it seems pretty or not ... lots/most R functions aren't particularly pretty "under the hood", which is a major reason to create a function in the first place! It hides all those ugly internals from the user.

Once you've defined areColors() below, using it is easy as can be:

areColors <- function(x) {
  # Test each element separately; col2rgb() signals an error for an invalid color
  sapply(x, function(X) {
    tryCatch(is.matrix(col2rgb(X)),
             error = function(e) FALSE)
  })
}

areColors(c(NA, "black", "blackk", "1", "#00", "#000000"))
#   <NA>   black  blackk       1     #00 #000000 
#   TRUE    TRUE   FALSE    TRUE   FALSE    TRUE 
Josh O'Brien
  • vapply would be even better because it would return a slightly better response when the input is length 0. – hadley Nov 09 '12 at 00:50
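The vapply() suggestion in the comment above could look roughly like the sketch below. The function name `areColorsV` is made up for illustration, and the sketch assumes that a zero-length logical vector is the desired result for empty input.

```r
# Hypothetical vapply() variant of areColors(); the name is illustrative only
areColorsV <- function(x) {
  vapply(x, function(X) {
    tryCatch(is.matrix(col2rgb(X)),
             error = function(e) FALSE)
  }, logical(1))  # every element must yield exactly one logical value
}

areColorsV(c("black", "blackk", "#000000"))

# Empty input gives a zero-length logical vector,
# whereas sapply() would return an empty list here
emptyResult <- areColorsV(character(0))
emptyResult
```

Declaring the result type as `logical(1)` also means that an unexpected return value fails loudly instead of silently changing the shape of the output.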
7

Update, given the edit

?par gives a thorough description of the ways in which colours can be specified in R. Any check for a valid colour must consider:

  1. A named colour as listed in colors()
  2. A hexadecimal representation, as a character string, of the form "#RRGGBB" or "#RRGGBBAA", specifying the red, green, blue and (optionally) alpha channels. The alpha channel is for transparency, which not all devices support; hence, whilst it is valid to specify a colour in this way with 8 hex digits, it may not be valid on a specific device.
  3. NA is a valid "colour". It means transparent, but as far as R is concerned it is a valid colour representation.
  4. Likewise "transparent" is also valid, but not in colors(), so that needs to be handled as well
  5. 1 is a valid colour representation as it is the index of a colour in a small palette of colours as returned by palette()

    > palette()
    [1] "black"   "red"     "green3"  "blue"    "cyan"    "magenta" "yellow" 
    [8] "gray"
    

    Hence you need to cope with 1:8. Why is this important? Because ?par tells us that it is also valid to give the index for these colours as a character string, so you need to accept "1" as a valid colour representation. However (as noted by @hadley in the comments), this applies only to the default palette. A user may set another palette, in which case you will have to consider a character index into a vector of up to the maximum length allowed by your version of R.

Once you've handled all those you should be good to go ;-)
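As a rough illustration of the cases listed above, each kind of specification can be passed to `col2rgb()` to see whether it is accepted. This is only a sketch: the names `specs` and `accepted` are made up for the example, and the index cases assume the current palette has at least one entry.

```r
# One example per case from the list above (names are illustrative only)
specs <- list(
  named       = "white",        # 1. a name listed in colors()
  hexAlpha    = "#FF000080",    # 2. "#RRGGBBAA" with an alpha channel
  missing     = NA,             # 3. NA, i.e. transparent
  transparent = "transparent",  # 4. valid, but not in colors()
  charIndex   = "1",            # 5. palette index given as a string
  intIndex    = 1L              # 5. palette index given as a number
)

# TRUE where col2rgb() accepts the specification without an error
accepted <- vapply(specs, function(s) {
  tryCatch(is.matrix(col2rgb(s)), error = function(e) FALSE)
}, logical(1))
accepted

# "transparent" is accepted although it is not a listed colour name
"transparent" %in% colors()  # FALSE
```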

To the best of my knowledge there isn't a user-visible function that does this. All of this is buried away inside the C code that does the plotting; very quickly you end up in .Internal(....) land and there be dragons!


Original

[To be pedantic #000000 isn't a colour name in R.]

The only colour names R knows are those returned by colors(). Yes, #000000 is one of the colour representations that R understands but you specifically ask about a name and the definitive list or solution is x %in% colors() as you have in your second example.

This is about as stable as it gets. When you use a colour like col = "goldenrod", internally R matches this with a "proper" representation of the colour for whichever device you are plotting on. colors() returns the list of colour names that R can look up in this way. If it isn't in colors() then it isn't a colour name.
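A minimal example of the name-only check described here; the vector `x` is made up for illustration.

```r
# Only genuine colour names pass; a hex string is a representation, not a name
x <- c("goldenrod", "#000000", "foo")
isName <- x %in% colors()
isName  # TRUE FALSE FALSE
```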

Gavin Simpson
  • You're right. I changed the title/question to indicate I am looking for a valid representation, so including `#000000` and the like. – Sacha Epskamp Nov 08 '12 at 13:25
  • In that case, technically `"3"` is also a colour, a light green. – Gavin Simpson Nov 08 '12 at 13:35
  • @GavinSimpson that's only with the default palette. – hadley Nov 08 '12 at 13:48
  • @hadley Right, good point, so technically a palette could have as many entries as is allowed in a vector in R or is representable in R's internal colour scheme `16777216`, whichever is the smaller. – Gavin Simpson Nov 08 '12 at 13:51
  • @GavinSimpson You're not going to be able to do better than the OP's first idea of using `col2rgb` and checking for an error. That said the [C source code](https://github.com/wch/r-source/blob/trunk/src/main/colors.c#L69) for `col2rgb` isn't too horrible, and reveals another case: `col2rgb(0)` gives you the background colour of the plot. – hadley Nov 08 '12 at 14:08
  • @GavinSimpson I've been working on a guide to reading R's C source code at https://github.com/hadley/devtools/wiki/c-interface - feedback welcomed. – hadley Nov 08 '12 at 14:11
  • @hadley +1, that's a useful comment! Fully agree; writing a wrapper to `col2rgb` which can handle a vector of colours if needed would be the way to go. – Gavin Simpson Nov 08 '12 at 14:13
  • @hadley I think I read that a short while ago and found it very useful (I've only ever used the `.C` interface before). Will take another look and pass on anything that occurs to me. – Gavin Simpson Nov 08 '12 at 14:15
  • Thanks for the extensive answer. I think extending the logical check to `x%in%colors() | grepl("^#(\\d|[a-f]){6,8}$",x,ignore.case=TRUE) | grepl("^\\d+$",x) | x == "transparent" | is.na(x)` should work then, but I wonder if it is much faster than seeing if `col2rgb` returns an error. – Sacha Epskamp Nov 08 '12 at 14:24
  • @JoshO'Brien Fully agree - the aim here was to point out the futility of trying to keep up with all the possible definitions. Writing something to repeatedly call `col2rgb()` is clearly the "right" solution here. – Gavin Simpson Nov 08 '12 at 14:46
  • @SachaEpskamp The answer from Josh is the one to go for - I only meant this to point out the futility of having your second idea keep up with all the possible things that R considers colours. He deserves the accept. – Gavin Simpson Nov 08 '12 at 14:47
  • Yes I agree, it is also the one I am going to use. – Sacha Epskamp Nov 08 '12 at 14:49