
I am using standardized predictors (and a standardized outcome) in the training set to train the model. When I predict the outcome in the test set, how do I convert the predictions back to the original scale? It looks like I predicted the standardized score of the test outcome.

Please see the reproducible R code and output below:

> mtcars

> str(mtcars)
'data.frame':   32 obs. of  11 variables:
 $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
 $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
 $ disp: num  160 160 108 258 360 ...
 $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
 $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
 $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
 $ qsec: num  16.5 17 18.6 19.4 17 ...
 $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
 $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
 $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
 $ carb: num  4 4 1 1 2 1 4 2 2 4 ...

> set.seed(3422143)
> train.index=sample(32,20) 
> train=mtcars[train.index,]
> test=mtcars[-train.index,] 


> fit=lm(scale(hp)~scale(mpg)+scale(qsec)+scale(am),train)
> summary(fit)

Call:
lm(formula = scale(hp) ~ scale(mpg) + scale(qsec) + scale(am), 
    data = train)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.66237 -0.37891  0.08107  0.27530  0.82087 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)   
(Intercept)  4.331e-16  9.680e-02   0.000  1.00000   
scale(mpg)  -3.746e-01  2.205e-01  -1.699  0.10873   
scale(qsec) -4.000e-01  1.157e-01  -3.457  0.00324 **
scale(am)   -3.888e-01  2.073e-01  -1.876  0.07905 . 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.4329 on 16 degrees of freedom
Multiple R-squared:  0.8422,    Adjusted R-squared:  0.8126 
F-statistic: 28.46 on 3 and 16 DF,  p-value: 1.19e-06

> predict(fit,test)
          Mazda RX4       Mazda RX4 Wag      Hornet 4 Drive   Hornet Sportabout          Duster 360           Merc 240D            Merc 280 
        -0.02303164         -0.16196109         -0.01044866          0.73605764          1.26694385         -0.31174766          0.39144301 
Lincoln Continental       Toyota Corona      Ford Pantera L        Ferrari Dino       Maserati Bora 
         0.98680939         -0.15727132          0.74466200          0.28549328          0.76315171 

desertnaut
user11806155
  • Simply create the rescaled variables in your original data set, then split it into training and test sets and use the rescaled variants in both. – deschen Jan 10 '21 at 17:42
  • @deschen what do you mean by creating the rescaled variables in the original dataset? specifically, how? I am confused about this... – user11806155 Jan 11 '21 at 03:13
  • Instead of using e.g. scale(hp) in the lm formula, create a variable hp_scaled directly in your data set. And then use hp_scaled in your lm formula. Same with the other variables. – deschen Jan 11 '21 at 06:10
  • @deschen Thank you very much for the comments. By creating and using hp_scaled, I am treating standardized score of hp as the outcome, right? What I am trying to achieve is, how to reverse the predicted standardized score to the original scale? – user11806155 Jan 11 '21 at 15:09
  • This should answer your question: https://stats.stackexchange.com/questions/209784/rescale-predictions-of-regression-model-fitted-on-scaled-predictors – deschen Jan 11 '21 at 18:20
  • Dear @deschen, to confirm my understanding: first, obtain the mean and sd of both the outcome and the predictors in the training set. Second, fit the model with the scaled predictors and outcome in the training set. Third, convert the raw data of the test set using the mean and sd of the training set. Fourth, apply the trained model to the converted test set. Fifth, multiply the predicted scores for the test set by the sd of the training outcome, then add the mean of the training outcome, to bring the predictions back to the original scale. Correct? – user11806155 Jan 12 '21 at 04:40
  • Yes, that's how I understand it as well. – deschen Jan 12 '21 at 07:44
  • Great! Problem solved. Thank you very much. – user11806155 Jan 12 '21 at 09:15
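
The five steps agreed on in the comments can be sketched in R roughly as follows. This is a minimal illustration rather than a definitive implementation: it reuses the split and model from the question, while `vars`, `mu`, `sigma`, `standardize()`, `train_z`, `test_z`, `fit_z`, `pred_z`, `pred_hp` and `fit_raw` are names made up here for the example.

```r
# Reproduce the train/test split from the question
set.seed(3422143)
train.index <- sample(32, 20)
train <- mtcars[train.index, ]
test  <- mtcars[-train.index, ]

vars <- c("hp", "mpg", "qsec", "am")  # outcome first, then the predictors

# Step 1: mean and sd of the outcome and predictors, from the training set only
mu    <- sapply(train[vars], mean)
sigma <- sapply(train[vars], sd)

# Hypothetical helper: standardize any data set with the TRAINING mean and sd
standardize <- function(d) {
  as.data.frame(scale(d[vars], center = mu, scale = sigma))
}

# Step 2: fit the model on the standardized training data
train_z <- standardize(train)
fit_z   <- lm(hp ~ mpg + qsec + am, data = train_z)

# Step 3: convert the raw test data using the training mean and sd
test_z <- standardize(test)

# Step 4: predict on the converted test set (still on the standardized scale)
pred_z <- predict(fit_z, newdata = test_z)

# Step 5: multiply by the training sd of hp and add the training mean of hp
pred_hp <- pred_z * sigma["hp"] + mu["hp"]

# Sanity check: for ordinary least squares with an intercept, the
# back-transformed predictions should match a model fitted on the raw data
fit_raw <- lm(hp ~ mpg + qsec + am, data = train)
stopifnot(isTRUE(all.equal(unname(pred_hp), unname(predict(fit_raw, test)))))
```

The check at the end relies on the fact that an OLS fit with an intercept is unaffected by linear rescaling of the variables, so for a plain `lm` the standardization changes the coefficients but not the predictions once they are back on the original scale. The same five steps should still be useful for models where scaling does matter, such as penalized regression.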
