I'm trying to learn to write R code that I can reuse without running into problems later, specifically problems caused by names I assign inside my function conflicting with names in the data passed into the function. I haven't found any best practices for handling this written down anywhere. I'm looking for suggestions on how to improve what I'm doing (or validation that what I'm doing is already a best practice, though that seems unlikely).

I'm using my get_name() to find a name that is not used in the data; then I'm using assign() to assign results to that name so I can use it in the updated formula; and then I have to do the same thing again and use get() for the weights argument. All of this is to avoid the possibility that the incoming data/formula already contains the variable names I would have used.

The code:

fgls_harvey = function(frml, data) {
    reg = lm(frml, data)
    en = get_name('_lresid2_', 'e', data)
    assign(en, log(residuals(reg)^2))
    f = update.formula(frml, reformulate('. + 0', en))
    environment(f) = environment()
    reg2 = lm(f, data)
    exp_n = get_name('exppv', 'e', data)
    assign(exp_n, exp(fitted(reg2)) / sum(fitted(reg2)))
    environment(frml) = environment()
    reg_fgls = lm(frml, data, weights=get(exp_n))
}

get_name = function(base, suffix, df) {
    if ('data.frame' %in% class(df)) { # either a d.f-like object
        names = colnames(df)
    } else { # or an lm-like object
        names = colnames(df$model)
    }
    if (base %in% names) {
        get_name(sprintf('%s%s', base, suffix), suffix, df)
    } else {
        base
    }

}
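
As a quick check of what get_name() returns, here is a small self-contained example. The data frame `d` and its column names are made up for illustration, and the function is repeated only so the snippet runs on its own.

```r
# Copy of get_name() from above, repeated so this snippet is self-contained.
get_name = function(base, suffix, df) {
    if ('data.frame' %in% class(df)) { # either a d.f-like object
        names = colnames(df)
    } else { # or an lm-like object
        names = colnames(df$model)
    }
    if (base %in% names) {
        get_name(sprintf('%s%s', base, suffix), suffix, df)
    } else {
        base
    }
}

# Hypothetical data that already uses two of the candidate names.
d = data.frame(y = 1:3, exppv = 4:6, exppve = 7:9)

# 'exppv' and 'exppve' are taken, so the suffix is appended twice.
free_name = get_name('exppv', 'e', d)
# '_lresid2_' is not a column of d, so it comes back unchanged.
other_name = get_name('_lresid2_', 'e', d)

print(free_name)
print(other_name)
```

Note that this only checks the columns of the data, not variables that the formula might pick up from its own environment.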

James
  • Looks like you're reinventing the "UUID" problem. So why not just generate random numbers and `paste` them onto your data names to create unique local variables? – Carl Witthoft Feb 06 '15 at 19:38
  • Sure, I could alter get_name to do just that. I was more curious if there was some kind of programming pattern I was missing that could avoid the situation entirely. – James Feb 06 '15 at 19:54
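
The random-suffix idea from the comment above could be sketched roughly as follows. This is only an illustration under assumptions: `unique_name`, its arguments, and the example data frame `d` are hypothetical names, not part of the original code, and the loop simply retries in the unlikely case of a collision.

```r
# Hypothetical helper: paste random characters onto a base name until the
# result is not among the names already taken.
unique_name = function(base, taken, n_chars = 8) {
    repeat {
        suffix = paste(sample(c(letters, 0:9), n_chars, replace = TRUE),
                       collapse = '')
        candidate = paste(base, suffix, sep = '_')
        if (!(candidate %in% taken)) {
            return(candidate)
        }
    }
}

# Made-up data whose column name matches the base we would like to use.
d = data.frame(y = 1:3, resid = 4:6)
nm = unique_name('resid', colnames(d))
print(nm)
```

This makes a clash unlikely rather than designing it away, so it does not answer the question of whether a pattern exists that avoids the situation entirely.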
