10

I am trying to do regression in TensorFlow. I'm not positive I am calculating R^2 correctly, as TensorFlow gives me a different answer than sklearn.metrics.r2_score. Can someone please look at my code below and let me know if I implemented the pictured equation correctly? Thanks

The formula I am attempting to create in TF (the pictured equation is the standard coefficient of determination, R^2 = 1 - SS_res / SS_tot):

total_error = tf.square(tf.sub(y, tf.reduce_mean(y)))
unexplained_error = tf.square(tf.sub(y, prediction))
R_squared = tf.reduce_mean(tf.sub(tf.div(unexplained_error, total_error), 1.0))
R = tf.mul(tf.sign(R_squared),tf.sqrt(tf.abs(R_squared)))
Matt Camp

6 Answers

10

What you are computing as "R^2" is

$$R^2_{\text{wrong}} = \operatorname{mean}_i \left( \frac{(y_i - \hat y_i)^2}{(y_i - \mu)^2} - 1 \right)$$

Compared to the given expression, you are aggregating in the wrong place: the errors should be summed first, and the division done once afterwards.

# tf.sub/tf.div were renamed tf.subtract/tf.divide in TF 1.x
unexplained_error = tf.reduce_sum(tf.square(tf.subtract(y, prediction)))
total_error = tf.reduce_sum(tf.square(tf.subtract(y, tf.reduce_mean(y))))
R_squared = tf.subtract(1.0, tf.divide(unexplained_error, total_error))
Innat
kennytm
    in tf.div in the third line, you have unexplained_error and total_error in the wrong positions, they need to be switched. – Nikhil Shinday Feb 22 '18 at 16:59
  • In your formulation, the (y_i - mu) should be squared. It was reflected in the code, but it might accidentally confuse some people (like me). – Rui Nian Jan 02 '19 at 17:29
6

I would strongly recommend against using a recipe to calculate this! The examples I've found do not produce consistent results, especially with just one target variable. This gave me enormous headaches!

The correct thing to do is to use tensorflow_addons.metrics.RSquare(). TensorFlow Addons is on PyPI here and the documentation is a part of TensorFlow here. All you have to do is set y_shape to the shape of your output, which is often (1,) for a single output variable.
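A minimal usage sketch, assuming TensorFlow 2.x with the tensorflow-addons package installed (the data here is hypothetical; the metric follows the usual Keras update_state/result pattern):

```python
import tensorflow as tf
import tensorflow_addons as tfa

# Hypothetical single-output regression targets and predictions.
y_true = tf.constant([[1.0], [2.0], [3.0], [4.0]])
y_pred = tf.constant([[1.1], [1.9], [3.2], [3.8]])

metric = tfa.metrics.RSquare(y_shape=(1,))  # one output variable
metric.update_state(y_true, y_pred)
print(metric.result().numpy())
```

Like any Keras metric, it can also be passed directly in model.compile(metrics=[...]).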

Furthermore... I would recommend against using R squared at all. It shouldn't be used with deep networks.

R2 tends to optimistically estimate the fit of the linear regression. It always increases as the number of effects are included in the model. Adjusted R2 attempts to correct for this overestimation. Adjusted R2 might decrease if a specific effect does not improve the model.

IBM Cognos Analytics on Adjusted R Squared
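The correction the quote describes has a simple closed form. A plain-Python sketch, using the standard notation (n samples, p predictors — these symbols are not from the quote itself):

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2: penalizes predictors that do not improve the fit."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Plain R^2 can only go up as predictors are added; the adjusted value
# drops when the gain is too small to justify the extra parameter.
print(adjusted_r2(0.98, n=100, p=5))
```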

rjurney
  • I agree with using the addon. For anyone using Colab, you will need to install the addons before your import them as they are not included by default: `!pip install tensorflow_addons` followed by your list of imports `import tensorflow_addons as tfa` – Josh Weston Nov 07 '21 at 23:24
  • I would _strongly_ recommend against using R^2 with a deep network unless you know what you are doing. > R2 tends to optimistically estimate the fit of the linear regression. It always increases as the number of effects are included in the model. Adjusted R2 attempts to correct for this overestimation. Adjusted R2 might decrease if a specific effect does not improve the model. https://www.ibm.com/docs/fi/cognos-analytics/11.1.0?topic=terms-adjusted-r-squared – rjurney Apr 02 '23 at 00:34
5

The function is given here:

def R_squared(y, y_pred):
  # SS_res: sum of squared residuals
  residual = tf.reduce_sum(tf.square(tf.subtract(y, y_pred)))
  # SS_tot: total sum of squares around the mean of y
  total = tf.reduce_sum(tf.square(tf.subtract(y, tf.reduce_mean(y))))
  r2 = tf.subtract(1.0, tf.divide(residual, total))  # tf.div was renamed tf.divide
  return r2

The concept is explained here.

Shashank
1

The other solutions won't produce the right R-squared score for a multidimensional y. The right way to calculate R2 (variance-weighted) in TensorFlow is:

unexplained_error = tf.reduce_sum(tf.square(labels - predictions))
total_error = tf.reduce_sum(tf.square(labels - tf.reduce_mean(labels, axis=0)))
R2 = 1.0 - tf.divide(unexplained_error, total_error)

The result from this TF snippet matches exactly the result from sklearn's:

from sklearn.metrics import r2_score
R2 = r2_score(labels, predictions, multioutput='variance_weighted')
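The equivalence can be sketched with plain NumPy on hypothetical 2-D data: centering each output column by its own mean (axis=0) and then summing over all elements reproduces the variance-weighted score.

```python
import numpy as np

# Hypothetical multi-output targets: 3 samples, 2 output dimensions.
labels = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
predictions = np.array([[1.0, 2.0], [2.0, 5.0], [3.0, 6.0]])

unexplained_error = np.sum((labels - predictions) ** 2)
# Center each output column by its own mean, as in the TF snippet above.
total_error = np.sum((labels - labels.mean(axis=0)) ** 2)
R2 = 1.0 - unexplained_error / total_error
print(R2)  # → 0.9
```

Dropping axis=0 (centering by the global mean instead) is the subtle bug that makes the single-mean recipes disagree with sklearn on multidimensional y.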
Eric Aya
Ivelin
0

It should actually be the other way around on the right-hand side: unexplained variance divided by total variance.

Pierre
0

I think it should be like this:

total_error = tf.reduce_sum(tf.square(tf.subtract(y, tf.reduce_mean(y))))
unexplained_error = tf.reduce_sum(tf.square(tf.subtract(y, prediction)))
R_squared = tf.subtract(1.0, tf.divide(unexplained_error, total_error))
Mingfei Sun