-1

I have been getting this negative R² so many times when I add a trendline in excel as shown on the figure below.

Do I care about this negative sign?

Here is the data:

x       y    
0.059   0.13    
0.095   0.05    
0.097   0.02    
0.12    0.2    
0.146   0.05    
0.192   0.11    
0.231   0.16    
0.25    0.16    
0.28    0.09    
0.33    0.05    
0.36    0.18    
0.37    0.24    
0.47    0.14    
0.76    0.11    
1.2     0.07
1.86    0.12

Pic

Manser Dimor
  • 21
  • 2
  • 8
  • It means your linear regression is garbage. Looking at your plot, it's pretty clear there is no linear model to this shotgun of data points. – JNevill Apr 13 '18 at 14:12
  • I guess it begs the question "Why try to fit this to a model, when it's pretty unlikely one exists (definitely not a linear one for sure)?" – JNevill Apr 13 '18 at 14:25
  • I know what you mean. But, even after removed the last 3 points (which are considered as outliers) I still get a negative R² while the trendline fits better with the points. What bothers me, is the negative sign https://drive.google.com/open?id=13AFon8O9dy91VhmHrsZzhJ9idLffDtW9 – Manser Dimor Apr 13 '18 at 14:33
  • Even if you remove those it's still a shotgun. Until your datapoints look more like a line, you are going to have a bad R squared. Negative means that your R squared has no relationship to the data. Excel is doing its best here and its best is still no fit. – JNevill Apr 13 '18 at 15:35
  • @JNevill in real life, your measurements always looks scattered. Only in academic examples the data points are well aligned. For the second example (orange line, without last 3 points) the fitting is acceptable, but the R² still negative!!! – Manser Dimor Apr 13 '18 at 15:47
  • "In real life your measurements always look scattered" This is just plain wrong it also doesn't matter if it's true or false. You can't change the reality of your situation. You have a tiny sample of like 15 points, which is starting off bad. They plot into a shotgun. You are seeking a model to describe this data (Great!)... now comes the trouble. There are a lot of models out there, you first shot is a linear regression. Super basic. IT DOES NOT DESCRIBE YOUR DATA AT ALL. That's the reality. Don't use a linear regression for your model because it doesn't fit. Rsquared is barfing all of itself – JNevill Apr 13 '18 at 16:46
  • You come on to SO and try to explain that it's math and reality that is wrong or slightly more likely, that Excel's programming is wrong. But one can take a quick look at your data and see very very plainly that a linear regression model is poor choice. I can see that, even removing your three outliers, and have guessed that the R squared was going to be well outside of the bounds of suggesting that this is a good model. The fact that excel is taking it's best guess at this and is coming up negative makes it sort of comical. – JNevill Apr 13 '18 at 16:49

2 Answers2

0

So, a negative R² is possible based on how that value is computed (it's not purely a square of a number). For a properly-defined model, the value of the correlation coefficient will be between 0 and 1, and the interpretation is that "x" percentage of the variability in your data is explained by the model.

The interpretation of a negative value is that your trend line is a worse fit than a horizontal line. This answer provides a much more thorough explanation.

theMayer
  • 15,456
  • 7
  • 58
  • 90
  • I understand that very clearly. But as you can see on this figure https://drive.google.com/open?id=13AFon8O9dy91VhmHrsZzhJ9idLffDtW9 the orange trendline should give a positive R² because it is evident that it fits better than a horizontal line. – Manser Dimor Apr 13 '18 at 14:40
  • I'm sorry, that's not accurate. When I ran the model, the trend line was zero, the correlation was zero. I think you need to brush up on how to calculate one of these things manually. – theMayer Apr 13 '18 at 14:57
0

When you did your trendline you selected Set Intercept = option with intercept = 0.05. In this case Excel returns an R^2 that doesn't have it's customary meaning and can be negative --- see here.

To fix the problem, unselect the Set Intercept option. When I run the trendline with the option unselected I get

y = -0.0017x + 0.1182 (R^2 = 0.0002)

Hope that helps.

xidgel
  • 3,085
  • 2
  • 13
  • 22