While working my way through learning ML, I am confused by the MinMaxScaler provided by sklearn. The goal is to normalize numerical data into the range [0, 1].
Example code:
from sklearn.preprocessing import MinMaxScaler

data = [[1, 2], [3, 4], [4, 5]]
scaler = MinMaxScaler(feature_range=(0, 1))
scaledData = scaler.fit_transform(data)
print(scaledData)
Giving output:
[[0.         0.        ]
 [0.66666667 0.66666667]
 [1.         1.        ]]
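To check where these numbers come from, I tried to reproduce them by hand. As far as I can tell they match scaling each column independently via (x - column_min) / (column_max - column_min); the snippet below is just my own re-implementation for comparison, not sklearn internals:

import numpy as np

# My attempt to reproduce the output above by hand:
# each column appears to be scaled on its own via (x - col_min) / (col_max - col_min).
X = np.array([[1, 2], [3, 4], [4, 5]], dtype=float)
col_min = X.min(axis=0)  # [1. 2.]
col_max = X.max(axis=0)  # [4. 5.]
print((X - col_min) / (col_max - col_min))
# [[0.         0.        ]
#  [0.66666667 0.66666667]
#  [1.         1.        ]]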
The first row [1, 2] got transformed into [0, 0], which in my eyes means:
- The ratio between the numbers is gone (see the small check after this list).
- Neither value carries any information anymore, as both got mapped to the minimum value (0).
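To illustrate the first point: if I change the first row to one with a completely different ratio, e.g. [1, 2.5], it still comes out as [0, 0], because its values are still the per-column minima:

from sklearn.preprocessing import MinMaxScaler

# A first row with a different ratio (1:2.5 instead of 1:2) still maps to [0, 0],
# since both entries remain the smallest values of their respective columns.
other = [[1, 2.5], [3, 4], [4, 5]]
print(MinMaxScaler(feature_range=(0, 1)).fit_transform(other)[0])
# [0. 0.]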
Example of what I expected:
[[0.1, 0.2],
 [0.3, 0.4],
 [0.4, 0.5]]
This would have preserved the ratios while still putting the numbers into the range 0 to 1.
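For illustration, the kind of transformation I had in mind is something like dividing the whole array by a single global constant (here 10, picked arbitrarily just to reproduce my expected numbers above), so the ratios between all values survive:

import numpy as np

# One global scaling factor for the entire array, so all ratios are preserved.
# The divisor 10 is an arbitrary choice to match my expected output above.
X = np.array([[1, 2], [3, 4], [4, 5]], dtype=float)
print(X / 10)
# [[0.1 0.2]
#  [0.3 0.4]
#  [0.4 0.5]]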
What am I doing wrong or misunderstanding about MinMaxScaler here? Thinking of use cases like training on time series, it makes no sense to me to transform meaningful numbers like prices or temperatures into distorted values like the output above.