
I use R's `loess` function for nonparametric data fitting. The dataset is two-dimensional. I have not found any proper documentation of its `weights` parameter.

My data points are normally distributed random variables, and I also have an estimate of their respective standard deviations. I am wondering whether the `weights` parameter lets me pass this information about the standard deviations to R. In other words: are the individual entries of `weights` (relative) measures of data quality, so that the fit can be improved by supplying some measure of data uncertainty through them?

EDIT: I suspect the entries in `weights` are used as weights in the weighted least-squares regressions on the local datasets in the LOESS procedure (maybe as additional prefactors multiplying the position-dependent kernel functions?). If so, then for data points that are independent, normally distributed random variables with different noise levels (i.e. different standard deviations), as in my case, the weights should be chosen as 1/\sigma_{i}^2, where \sigma_{i} is the standard deviation of the respective data point. If someone knows for sure, that would be nice to know.
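Written out, the suspicion above amounts to the following local fitting criterion at a target point x_0. This is only a sketch of the idea: the symbols K (kernel), h (local bandwidth), p_\beta (local polynomial with coefficients \beta) and w_i (the entries of `weights`) are notation assumed here for illustration, not taken from the R documentation.

```latex
\hat{\beta}(x_0) = \arg\min_{\beta} \sum_{i=1}^{n}
  w_i \, K\!\left(\frac{|x_i - x_0|}{h(x_0)}\right)
  \bigl(y_i - p_{\beta}(x_i)\bigr)^2,
\qquad w_i \propto \frac{1}{\sigma_i^{2}}
```

Under this reading only the relative sizes of the w_i matter, since multiplying all of them by a common constant does not change the minimizer.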

sperber
  • In short: yes. That is what you expect from weights. For some intuition see the same argument of e.g. `weighted.mean`. Of course you don't want to use your standard deviations as weights. – mts Jul 11 '15 at 10:56
  • @mts: thank you for the hint regarding weighted.mean. Intuitively I'd use the inverse standard deviation as weight. As I do not know much about the internal workings of R's loess implementation, I am not quite sure if this is a proper choice. – sperber Jul 11 '15 at 11:02
  • I'm not sure what a standard or good choice for the weights is. If you don't find anything through Google, you should ask on http://stats.stackexchange.com/, as it is not a programming question anymore. If you are interested in what `loess` is doing, consider checking the source code at this point. – mts Jul 11 '15 at 12:07

1 Answer

This page confirms my suspicion:

https://docs.tibco.com/pub/enterprise-runtime-for-R/3.1.0/doc/html/Language_Reference/stats/loess.html

Regarding the parameter weights of loess it says:

an optional expression for weights to give to individual observations in the sum of squared residuals that forms the local fitting criterion. By default, an unweighted fit is carried out. If it is supplied, weights is treated as an expression to evaluate in the same data frame as the model formula. It should evaluate to a non-negative numeric vector. If the different observations have nonequal variances, weights should be inversely proportional to the variances.
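As a minimal sketch of what this looks like in practice, the following fits simulated data twice, once unweighted and once with inverse-variance weights. The data, the `sigma` column, the sine curve and the `span` value are all made up for illustration; only `loess`, its `weights` argument, `fitted` and `predict` are actual R functionality.

```r
set.seed(1)

# Simulated data: a smooth curve with known, point-dependent noise levels.
# sigma is a hypothetical per-point standard deviation (noisy right half).
n     <- 200
x     <- sort(runif(n, 0, 10))
sigma <- ifelse(x > 5, 1.0, 0.1)
y     <- sin(x) + rnorm(n, sd = sigma)
d     <- data.frame(x = x, y = y, sigma = sigma)

# Unweighted fit: every observation counts equally.
fit_unweighted <- loess(y ~ x, data = d, span = 0.5)

# Weighted fit: weights inversely proportional to the variances.
# The weights expression is evaluated inside the data frame d.
fit_weighted <- loess(y ~ x, data = d, weights = 1 / sigma^2, span = 0.5)

# Rescaling all weights by a constant should leave the fit unchanged,
# because only the relative weights enter the local least-squares problems.
fit_scaled <- loess(y ~ x, data = d, weights = 10 / sigma^2, span = 0.5)

# Predictions of the weighted fit at the observed x values.
pred_w <- predict(fit_weighted, newdata = data.frame(x = x))
```

Comparing `fitted(fit_unweighted)` with `fitted(fit_weighted)` shows how much the noisy points pull the unweighted curve, while `fit_scaled` illustrates that the weights act as relative measures of data quality.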

sperber