11

I am trying to scale some numbers to the range 0 to 1 using preprocessing from sklearn. This is what I did:

from sklearn import preprocessing

data = [44.645, 44.055, 44.54, 44.04, 43.975, 43.49, 42.04, 42.6, 42.46, 41.405]
min_max_scaler = preprocessing.MinMaxScaler(feature_range=(0, 1))
data_scaled = min_max_scaler.fit_transform([data])
print(data_scaled)

But data_scaled only contains zeros. What am I doing wrong?

Gizmo

5 Answers

22

I had the same problem when scaling with MinMaxScaler from sklearn.preprocessing. The scaler returned zeros when I passed the data as a nested list of shape [1, n], like this:

data = [[44.645, 44.055, 44.54, 44.04, 43.975, 43.49, 42.04, 42.6, 42.46, 41.405]]

I changed the shape of the array to [n, 1]. In your case it would look like this:

data = [[44.645], 
        [44.055], 
        [44.540], 
        [44.040], 
        [43.975], 
        [43.490], 
        [42.040], 
        [42.600], 
        [42.460], 
        [41.405]]

Then MinMaxScaler worked properly.
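
Instead of typing the nested list by hand, the same [n, 1] layout can be produced with NumPy's reshape. This is a minimal sketch, assuming NumPy and scikit-learn are installed; the variable names (column, scaled_flat, row_scaled) are only illustrative.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

data = [44.645, 44.055, 44.54, 44.04, 43.975, 43.49, 42.04, 42.6, 42.46, 41.405]

# reshape(-1, 1) turns the flat list into an (n, 1) array: n samples, one feature
column = np.asarray(data, dtype=float).reshape(-1, 1)

scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(column)

# ravel() flattens the (n, 1) result back to a 1-D array of n values
scaled_flat = scaled.ravel()
print(scaled_flat)

# For comparison: the (1, n) layout from the question yields only zeros
row_scaled = MinMaxScaler().fit_transform(np.asarray([data], dtype=float))
print(row_scaled)
```

With the column layout, the largest value (44.645) should map to 1 and the smallest (41.405) to 0.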

Future2020
Antonina
4

This is because data is an int32 or int64 and MinMaxScaler needs a float. Try this:

import numpy as np
from sklearn import preprocessing

data = [44.645, 44.055, 44.54, 44.04, 43.975, 43.49, 42.04, 42.6, 42.46, 41.405]
min_max_scaler = preprocessing.MinMaxScaler(feature_range=(0, 1))
data_scaled = min_max_scaler.fit_transform([np.float32(data)])
print(data_scaled)
Cslayer20
2
import numpy as np
from sklearn import preprocessing

data = np.array([44.645, 44.055, 44.54, 44.04, 43.975, 43.49, 42.04, 42.6, 42.46, 41.405])
min_max_scaler = preprocessing.MinMaxScaler(feature_range=(0, 1))
# Reshape to (10, 1) so the ten values form a single feature column
data_scaled = min_max_scaler.fit_transform(data.reshape(10, -1))
# Reshape the result back to the (1, 10) row layout
data = data_scaled.reshape(-1, 10)
print(data)

The reason is that when you apply the fit_transform method of a StandardScaler object to an array of shape (1, n), you get all zeros: each value is the only entry in its column, so you subtract from it the mean of that column, which equals the value itself, before dividing by the column's standard deviation. MinMaxScaler behaves the same way, because each column's minimum is its only value. To scale the array correctly, convert it to shape (n, 1).
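
To make the column-wise behavior concrete, here is a rough NumPy sketch of the min-max formula (x - min) / (max - min) applied per column. The helper min_max_by_column is hypothetical and is not sklearn's implementation; replacing a zero range with 1 is an assumption made here so that a constant column maps to 0, matching the all-zeros output seen in the question.

```python
import numpy as np

def min_max_by_column(x):
    """Hypothetical helper: scale each column of a 2-D array to [0, 1]."""
    x = np.asarray(x, dtype=float)
    col_min = x.min(axis=0)
    col_range = x.max(axis=0) - col_min
    # Assumption: a column with zero range is divided by 1 instead of 0,
    # so every value in a constant column maps to 0
    col_range[col_range == 0] = 1.0
    return (x - col_min) / col_range

data = [44.645, 44.055, 44.54, 44.04, 43.975, 43.49, 42.04, 42.6, 42.46, 41.405]

# Shape (1, 10): ten columns with one value each, so every entry becomes 0
as_row = min_max_by_column([data])
# Shape (10, 1): one column with ten values, scaled between its min and max
as_column = min_max_by_column(np.reshape(data, (-1, 1)))

print(as_row)
print(as_column.ravel())
```

In the (1, n) case every column's minimum is its only value, so the numerator is always zero; in the (n, 1) case the minimum and maximum are taken over all ten values.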

See the correct answer at this link:

alyssaeliyah
1

The other answers already give the right solution, but I solved my problem using the function numpy.vstack(<your array>). For your problem you can write it like this:

import numpy as np
from sklearn import preprocessing

data = [44.645, 44.055, 44.54, 44.04, 43.975, 43.49, 42.04, 42.6, 42.46, 41.405]
min_max_scaler = preprocessing.MinMaxScaler(feature_range=(0, 1))
# np.vstack stacks the values into an (n, 1) column
data_scaled = min_max_scaler.fit_transform(np.vstack(data))
print(data_scaled)
# To get back a flat array, use the hstack function
data_scaled = np.hstack(data_scaled)

Lucas
0

You're putting your data into a list for some reason, but you shouldn't:

data_scaled = min_max_scaler.fit_transform(data)
John Zwinck
  • But if I don't do that, this error occurs: TypeError: 'numpy.float64' object does not support item assignment – Gizmo Sep 17 '14 at 09:22
  • What do `sklearn.__version__` and `numpy.version.version` say on your system? Because the above code works for me with recent versions. – John Zwinck Sep 17 '14 at 10:19
  • I'm using the same sklearn, NumPy 1.8.1, and Python 2.7.8 and also Python 3.4.1. When I run the code in your question I get an array of zeros; when I use the line in my answer I get a non-zero array as expected, with the first value being 1 and the last being 0. You should test on another system. – John Zwinck Sep 17 '14 at 14:30