In PHP we can call error_reporting(E_ALL) or error_reporting(E_ALL|E_STRICT) to get warnings about suspicious code. In g++ you can supply -Wall (and other flags) to get more checking of your code. Is there something similar in R?

As a specific example, I was refactoring a block of code into some functions. In one of those functions I had this line:

 if(nm %in% fields$non_numeric)...

Much later I realized that I had overlooked adding fields to the parameter list, but R did not complain about an undefined variable.

Gavin Simpson
Darren Cook
  • I'm not sure if the specific problem you mention could really be caught by stricter warnings. If it gave no error, that probably means `fields` was also a global, which means it's legit to refer to it. If you want to explicitly declare globals you could do this with a lot of environment manipulation. – Owen Aug 28 '11 at 06:51
  • R doesn't have particularly strict policies about declaring variables, and its scope is something you should read up on (which will explain why it didn't return an error in your example). You can try `options(warn=2)` to turn all warnings into errors, however. – Ari B. Friedman Aug 28 '11 at 10:46
  • The compiler-warnings tag is not really appropriate. Which compiler? – David Heffernan Aug 28 '11 at 11:35
  • How about `?codetools::checkUsage` (`codetools` is a built-in package) – Ben Bolker Aug 28 '11 at 12:35
  • Related question: "R force local scope" - http://stackoverflow.com/q/6216968/602276 – Andrie Aug 28 '11 at 14:00
  • What's with answering in comments...? ;-) – Gavin Simpson Aug 28 '11 at 21:08
  • I usually post as a comment when I have an idea or a pointer that I think would be useful but I'm not sure, or I can't/don't want to take the time to flesh it out into what I would consider to be a minimal answer ... sometimes in the hopes that someone else will take the idea and run with it (although I guess they could do that with a question too, then I could go ahead and delete my version) – Ben Bolker Aug 29 '11 at 06:17
  • There were some very good suggestions in the comments but no answers, several hours later. I think I got distracted when posting that comment. I did mean to say that the ideas here were good and should be posted as answers. – Gavin Simpson Aug 29 '11 at 10:28
  • Another closely related question: http://stackoverflow.com/questions/2140972/can-we-have-more-error-messages (Most helpful reply is: `checkUsageEnv(.GlobalEnv)` is how to test all functions you've imported so far) – Darren Cook Aug 30 '11 at 10:21

3 Answers

(Posting as an answer rather than a comment)

How about `?codetools::checkUsage` (`codetools` is a built-in package) ... ?
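
For what it's worth, here is a minimal sketch of what that looks like on the line from the question. The function name is_non_numeric and the way the messages are collected are assumptions made for illustration; checkUsage() and its report argument come from codetools.

```r
library(codetools)

# Hypothetical refactored helper: 'fields' was never added to the parameter list.
is_non_numeric <- function(nm) {
  nm %in% fields$non_numeric
}

# Collect the messages in a character vector instead of printing them with cat().
messages <- character(0)
checkUsage(is_non_numeric,
           name = "is_non_numeric",
           report = function(msg) messages <<- c(messages, msg))

# As long as no object called 'fields' is visible from the function,
# the messages include a "no visible binding for global variable" note about it.
print(messages)
```

This only flags the variable while nothing named fields is visible from the function's environment; once a global of that name exists, the reference counts as legitimate.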

Ben Bolker
  • checkUsage() works wonderfully (including pointing out a 2nd refactoring bug!) until I create fields as a global; then it cannot detect the problem. But combined with manipulating the environment (`environment(myfunc)=parent.env(.GlobalEnv);checkUsage(myfunc)`) it works as a lint – Darren Cook Aug 29 '11 at 09:33
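
The environment trick from that comment could be sketched roughly as follows. The names is_non_numeric, collect and detached are assumptions for illustration, and the check is run on a copy so the original function keeps its environment.

```r
library(codetools)

# A global 'fields' hides the refactoring bug from checkUsage().
fields <- list(non_numeric = c("name", "city"))

# Hypothetical helper that forgot to take 'fields' as a parameter.
is_non_numeric <- function(nm) nm %in% fields$non_numeric

# Small wrapper (an assumption, not part of codetools) that returns
# the checkUsage() messages as a character vector.
collect <- function(f) {
  out <- character(0)
  checkUsage(f, name = "is_non_numeric",
             report = function(msg) out <<- c(out, msg))
  out
}

# With the global in place, nothing is reported.
before <- collect(is_non_numeric)

# Re-home a copy one level above the global environment, then check again.
detached <- is_non_numeric
environment(detached) <- parent.env(globalenv())
after <- collect(detached)

print(before)
print(after)
```

Attached packages and base functions such as %in% stay visible from parent.env(globalenv()), so only names that live in the global environment get flagged.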

This is not really an answer, I just can't resist showing how you could declare globals explicitly. @Ben Bolker should post his comment as the Answer.

To avoid seeing globals, you can take a function "up" one environment -- it'll be able to see all the standard functions and such (mean, etc), but not anything you put in the global environment:

explicit.globals = function(f) {
    name = deparse(substitute(f))
    env = parent.frame()
    enclos = parent.env(.GlobalEnv)

    environment(f) = enclos
    env[[name]] = f
}

Then getting a global is just retrieving it from .GlobalEnv:

global = function(n) {
    name = deparse(substitute(n))
    env = parent.frame()
    env[[name]] = get(name, .GlobalEnv)
}
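# Make global() reachable from functions whose environment skips .GlobalEnv.
# Caveat: newer versions of R (4.1.0 onwards, as far as I know) lock the base
# environment, in which case this assignment fails.
assign('global', global, envir=baseenv())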

And it would be used like

a = 2
b = 3

f = function() {
    global(a)

    a
    b
}
explicit.globals(f)

And called like

> f()
Error in f() : object 'b' not found

I personally wouldn't go for this but if you're used to PHP it might make sense.
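
A variant of the same idea that does not write into baseenv() might look like the sketch below. It is an assumption-laden alternative rather than the code above: isolate_globals is a hypothetical name, and the helper puts global() into a fresh environment that sits between the function and parent.env(.GlobalEnv).

```r
# Hypothetical helper: returns a copy of f that cannot see .GlobalEnv,
# except through an explicit global(name) call.
isolate_globals <- function(f) {
  # Fresh environment whose parent skips the global environment.
  env <- new.env(parent = parent.env(globalenv()))

  # global(x) copies x from .GlobalEnv into the calling function's frame.
  env$global <- function(n) {
    name <- deparse(substitute(n))
    assign(name, get(name, envir = globalenv()), envir = parent.frame())
  }

  environment(f) <- env
  f
}

a <- 2
b <- 3

f <- isolate_globals(function() {
  global(a)   # explicitly imported, so 'a' is visible
  a + b       # 'b' was not imported, so this lookup fails
})

# Capture the error message instead of stopping the script.
result <- tryCatch(f(), error = function(e) conditionMessage(e))
print(result)
```

The trade-off is the same as before: every global a function needs has to be listed by hand.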

Owen

Summing up, there is really no correct answer: as Owen and gsk3 point out, R functions will use globals if a variable is not in the local scope. This may be desirable in some situations, so how could the "error" be pointed out?

checkUsage() does nothing that R's built-in error-checking does not (in this case, where fields also exists as a global). checkUsageEnv(.GlobalEnv) is a useful way to check a file of helper functions (and might be great as a pre-commit hook for svn or git, or as part of an automated build process).
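
As a rough sketch of the checkUsageEnv() idea, the helpers can be defined in their own environment and checked in one go. The environment name helpers and the two functions are made up for illustration; in a hook or build script the definitions would presumably come from sourcing the real file instead.

```r
library(codetools)

# Environment standing in for a file of helper functions. Its parent skips
# .GlobalEnv so that stray globals cannot hide a missing parameter.
helpers <- new.env(parent = parent.env(globalenv()))

local({
  # Hypothetical helper with the bug from the question.
  is_non_numeric <- function(nm) nm %in% fields$non_numeric
  # Hypothetical helper with no problems.
  add_one <- function(x) x + 1
}, envir = helpers)

# Check every function in the environment and collect the messages.
problems <- character(0)
checkUsageEnv(helpers, report = function(msg) problems <<- c(problems, msg))

print(problems)
```

A script like this could exit with a non-zero status when problems is not empty, which is what a commit hook or build step would need.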

I feel the best solution when refactoring is: at the very start, move all global code into a function (e.g. call it main()), so that the only global code left is the call to that function. Do this first, then start extracting functions, etc.
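
A small sketch of why this helps, with made-up names (is_non_numeric, and the contents of fields): once the data lives inside main(), an extracted function that forgot a parameter fails loudly instead of silently picking up a global.

```r
# Extracted helper that forgot to take 'fields' as a parameter.
is_non_numeric <- function(nm) {
  nm %in% fields$non_numeric
}

# All former top-level code now lives in main().
main <- function() {
  fields <- list(non_numeric = c("name", "city"))  # local to main(), not global
  is_non_numeric("name")
}

# The only global code is the call to main(); the error is captured here
# so the message can be inspected.
result <- tryCatch(main(), error = function(e) conditionMessage(e))
print(result)
```

R uses lexical scoping, so is_non_numeric() looks for fields where it was defined, not in main()'s frame, and the missing parameter shows up on the first run.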

Darren Cook