MinMaxScaler Vs StandardScaler for Scaling Features?

Question

I'm training a neural network to predict Bitcoin close prices. I'm comparing MinMaxScaler and StandardScaler for the input features (High, Low, Volatility) and using MSE (mean squared error) to evaluate the results.

MinMaxScaler

StandardScaler

My questions:

1. As the plots show, MinMaxScaler does a worse job of predicting prices, yet its MSE is 0.107, while StandardScaler has an MSE of 0.2. Why is that? Is it because MinMaxScaler scales to [0, 1], so the values are closer together than with StandardScaler?
2. Which type of scaling is used in research papers? Most of them don't mention it, so I can't tell whether my results are better or worse than theirs.
3. Both scalers scale each column individually, right? I ask because each feature has a very different range of values (Volatility vs. prices). I've also noticed that after fitting all features together, the relationship between features is lost, e.g. scaled low prices are higher than scaled high prices!

score 1 · Answer 1 · answered Aug 29 '21 at 13:16

StandardScaler is useful for features that follow a normal distribution. It centers each feature to mean 0 and scales it to unit variance.

MinMaxScaler may be used when the upper and lower boundaries are well known from domain knowledge. It scales each feature to the range [0, 1] by default; another range, such as [-1, 1], can be set with the feature_range parameter. When outliers are present, this scaling compresses all the inliers into a narrow range.
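As a rough illustration of the two descriptions above, here is a minimal sketch on a single made-up feature with one large outlier. The data values are assumptions chosen for illustration; only the scikit-learn calls are real.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical single feature: five inliers and one large outlier.
x = np.array([10.0, 11.0, 12.0, 13.0, 14.0, 1000.0]).reshape(-1, 1)

x_std = StandardScaler().fit_transform(x)      # mean 0, unit variance
x_mm = MinMaxScaler().fit_transform(x)         # default feature_range=(0, 1)
x_mm_sym = MinMaxScaler(feature_range=(-1, 1)).fit_transform(x)

print("standard: mean=%.3f std=%.3f" % (x_std.mean(), x_std.std()))
print("min-max range:", x_mm.min(), x_mm.max())
print("symmetric range:", x_mm_sym.min(), x_mm_sym.max())
# The five inliers end up squeezed close to 0 because the outlier sets the max.
print("inliers after min-max:", x_mm[:5].ravel())
```

With these numbers the inliers all land below 0.01 after min-max scaling, which is the compression effect mentioned above.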

The Bitcoin price distribution seems to be normal, so StandardScaler gives the better predictions.

Thanks for your comment. However, you didn't actually answer any of my questions!! — mr.m, Aug 29 '21 at 13:30

Utku Can · Answer 2 · 2023-03-20T19:14:18.240

1- If you do not want to read the whole answer: when comparing results obtained with different scalers, check the error with MASE. MSE is scale-dependent, whereas MASE is scale-free. This explains why you get a lower MSE with min-max normalization even though your predictions look better with standardization.
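To make the scale dependence concrete, here is a minimal sketch with synthetic numbers (the data, the seed, and the variable names are assumptions, not the asker's setup). The same predictions are scored in both scaled spaces. Min-max scaling divides the errors by the range of the target, while standardization divides them by its standard deviation, so the MSE shrinks by the range squared in one case and by the variance in the other. Since the range of a variable is always at least twice its standard deviation, the min-max MSE comes out smaller for identical predictions.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import MinMaxScaler, StandardScaler

rng = np.random.default_rng(0)
y_true = rng.uniform(100, 500, size=200).reshape(-1, 1)  # hypothetical prices
y_pred = y_true + rng.normal(0, 20, size=y_true.shape)   # hypothetical predictions

mse_raw = mean_squared_error(y_true, y_pred)             # error in original units

results = {}
for name, scaler in [("min_max", MinMaxScaler()), ("standard", StandardScaler())]:
    scaler.fit(y_true)  # fit once, on the true values only
    results[name] = mean_squared_error(scaler.transform(y_true),
                                       scaler.transform(y_pred))
    print(name, round(results[name], 4))

# The scaled MSE is the raw MSE divided by range**2 or by the variance.
value_range = y_true.max() - y_true.min()
print(np.isclose(results["min_max"], mse_raw / value_range**2))
print(np.isclose(results["standard"], mse_raw / y_true.var()))
```

One way to compare two runs fairly under this view is to apply inverse_transform to the predictions and compute the error in original price units.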

I tested the code below with different error functions (MAE, MSE, MASE). If you check the outputs, the min-max scaler (almost) always gives lower values, but that does not confirm that we did better with it (explained further down). I also observed that MAPE is not a good choice with the min-max scaler, so I removed it from the code.

import numpy as np
from sklearn.preprocessing import MinMaxScaler,StandardScaler
from sklearn.metrics import mean_squared_error,mean_absolute_error
   
np.random.seed(42)
   
y_true = np.random.randint(0,500,100)
y_pred = np.random.randint(0,500,100)
   
#scale
   
min_max_scaler = MinMaxScaler()
standard_scaler = StandardScaler()
   
y_true_min_max = min_max_scaler.fit_transform(y_true.reshape(-1,1))
y_pred_min_max = min_max_scaler.fit_transform(y_pred.reshape(-1,1))
y_true_standard = standard_scaler.fit_transform(y_true.reshape(-1,1))
y_pred_standard = standard_scaler.fit_transform(y_pred.reshape(-1,1))
   
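def error_function(y_true, y_pred):
    """Return MAE, MSE and a scaled absolute error, each rounded to 2 decimals."""
    mae = mean_absolute_error(y_true, y_pred)
    mse = mean_squared_error(y_true, y_pred)

    def mean_absolute_scaled_error(y_true, y_pred):
        # Absolute errors scaled by each point's distance from the mean of y_true.
        # Note: this is a simplified variant; the usual MASE divides the MAE
        # by the MAE of a naive forecast.
        return np.mean(np.abs(y_true - y_pred) / np.abs(y_true - np.mean(y_true)))

    mase = mean_absolute_scaled_error(y_true, y_pred)
    outputs = {'mae': round(mae, 2), 'mse': round(mse, 2), 'mase': round(mase, 2)}
    return outputs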
   
#error
min_max_error = error_function(y_true_min_max,y_pred_min_max)
standard_scaler_error = error_function(y_true_standard,y_pred_standard)
   
print(f"min_max_error: {min_max_error}", f"standard_scaler_error: {standard_scaler_error}", sep='\n')

Output:
min_max_error: {'mae': 0.35, 'mse': 0.19, 'mase': 5.53}
standard_scaler_error: {'mae': 1.2, 'mse': 2.16, 'mase': 5.57}

Now let's try with stock prices. Assume we have two sets of predictions, one from the min-max scaler setup and one from the standard scaler setup, and assume the standard scaler one predicted well. The question is: is checking the MSE value enough?


stock_price = np.array([50,123,100,213,123,22])
predicted_stock_price_min_max = np.array([123,100,213,123,22,55]) # assumed to be the worse prediction
predicted_stock_price_standard = np.array([40,113,90,113,129,33]) # assumed to be the better prediction
    
#scale
min_max_scaler = MinMaxScaler()
standard_scaler = StandardScaler()
    
stock_price_min_max = min_max_scaler.fit_transform(stock_price.reshape(-1,1))
predicted_stock_price_min_max = min_max_scaler.fit_transform(predicted_stock_price_min_max.reshape(-1,1))
    
stock_price_standard = standard_scaler.fit_transform(stock_price.reshape(-1,1))
predicted_stock_price_standard = standard_scaler.fit_transform(predicted_stock_price_standard.reshape(-1,1))
    
##error
    
min_max_error = error_function(stock_price_min_max,predicted_stock_price_min_max)
standard_scaler_error = error_function(stock_price_standard,predicted_stock_price_standard)
    
print(f"min_max_error: {min_max_error}", f"standard_scaler_error: {standard_scaler_error}", sep='\n')

Output:
min_max_error: {'mae': 0.38, 'mse': 0.17, 'mase': 5.23}
standard_scaler_error: {'mae': 0.49, 'mse': 0.36, 'mase': 1.26}

If you check the errors, the standard-scaled ones are supposed to be lower, because that prediction is clearly the better one. But MAE and MSE are still lower for min-max (MSE is the metric in your case). Only MASE agrees with what we can observe, namely that we performed better with the standard scaler.

As a result, MSE may not be a good accuracy measure for time series forecasting. From my point of view, MASE is more robust for this task.
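For reference, the commonly used definition of MASE divides the forecast MAE by the MAE of a one-step naive forecast on the training series. The sketch below applies that definition to the stock-price arrays from the example above, in original units. The function name `mase` and the choice to reuse the same short series as the "training" data are assumptions made for illustration.

```python
import numpy as np

def mase(y_true, y_pred, y_train):
    """Forecast MAE divided by the MAE of a one-step naive forecast on y_train."""
    y_true, y_pred, y_train = (np.asarray(a, dtype=float)
                               for a in (y_true, y_pred, y_train))
    naive_mae = np.mean(np.abs(np.diff(y_train)))  # |y_t - y_(t-1)| averaged
    return np.mean(np.abs(y_true - y_pred)) / naive_mae

stock_price = np.array([50, 123, 100, 213, 123, 22])
pred_worse = np.array([123, 100, 213, 123, 22, 55])
pred_better = np.array([40, 113, 90, 113, 129, 33])

# Assumption: the same short series stands in for the training data.
mase_worse = mase(stock_price, pred_worse, stock_price)
mase_better = mase(stock_price, pred_better, stock_price)
print(f"worse: {mase_worse:.2f}, better: {mase_better:.2f}")

# Rescaling every array by the same factor leaves the value unchanged.
rescaled = mase(stock_price / 1000, pred_better / 1000, stock_price / 1000)
print(np.isclose(rescaled, mase_better))
```

Because numerator and denominator are in the same units, the ratio does not change when all values are multiplied by a constant, which is what makes it usable across differently scaled runs.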

ref: https://towardsdatascience.com/time-series-forecast-error-metrics-you-should-know-cc88b8c67f27

2- Explained here: https://stackoverflow.com/a/58850139/12906920 and https://www.atoti.io/articles/when-to-perform-a-feature-scaling/#:~:text=What%20is%20Feature%20Scaling%3F,during%20the%20data%20preprocessing%20step. Long story short: if your data follows a normal distribution, use StandardScaler. If the data has lots of outliers, use RobustScaler. Otherwise, you can use MinMaxScaler (for example, for image pixel values/arrays).
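On the third question: both scalers fit each column independently, so values are only comparable within a column after scaling, not across columns. A small sketch with made-up numbers (the rows and the column order High, Low, Volatility are assumptions) shows how a scaled Low can exceed the scaled High of the same row even though the raw Low is smaller.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical rows; columns are High, Low, Volatility.
X = np.array([
    [110.0, 100.0, 0.02],
    [120.0, 118.0, 0.05],
    [130.0, 119.0, 0.01],
])

scaler = MinMaxScaler().fit(X)
X_scaled = scaler.transform(X)

# One min and one max per column: each feature is scaled on its own.
print("per-column min:", scaler.data_min_)
print("per-column max:", scaler.data_max_)
print(X_scaled)

# Row 1: raw Low (118) < raw High (120), but scaled Low > scaled High.
print(X[1, 1] < X[1, 0], X_scaled[1, 1] > X_scaled[1, 0])

# The original values, and their ordering, come back with inverse_transform.
print(np.allclose(scaler.inverse_transform(X_scaled), X))
```

The relationship is not lost, since inverse_transform restores it; it just is not visible when scaled columns are compared against each other.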
