I'm working on a KNN algorithm in Python and tried to normalise my data frame with MinMaxScaler to transform the data into the range 0 to 1.

However, when I inspect the output, the min/max of some columns exceeds 1. Am I using it wrongly?

Below is a snippet of the min/max values returned: [image: min/max values per column]

The code used was:

kdd_data_10percent = pandas.read_csv("data/kdd_10pc", header=None, names = col_names)
features = kdd_data_10percent[num_features].astype(float)  # num_features contains the column labels I wish to extract
features.apply(lambda x: MinMaxScaler().fit_transform(x))

features is the dataframe containing the selected columns (e.g. wrong_fragment, urgent ...).

If I understand correctly, after MinMaxScaler runs, the values in each column should be normalised to the range 0 to 1 only. Am I right?

misctp asdas

1 Answer

You are right: MinMaxScaler scales each column to the range 0 to 1, so the column's minimum maps to 0 and its maximum maps to 1.
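
To make the mapping concrete, here is a minimal pure-Python sketch of the min-max formula, (x - min) / (max - min), applied to one column. The helper name min_max_scale and the sample values are made up for illustration, and sending a constant column to 0.0 is a choice made for this sketch.

```python
def min_max_scale(values):
    """Rescale a list of numbers to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A constant column has no range; this sketch maps every entry to 0.0
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


# Hypothetical column values, for illustration only
column = [3.0, 7.0, 5.0, 11.0]
scaled = min_max_scale(column)
```

The smallest value (3.0) becomes 0.0 and the largest (11.0) becomes 1.0; everything else lands in between.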

The apply function does not modify your features in place; it just returns a new dataframe with the transformed columns. So you need to assign the result of the transformation back to features:

features = features.apply(lambda x: MinMaxScaler().fit_transform(x))
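
As a rough, self-contained sketch of the same idea: the small dataframe below is a made-up stand-in for the KDD columns, and the name scaled is an assumption for illustration. It also passes the whole dataframe to the scaler at once, since recent scikit-learn versions expect 2-D input rather than a single column at a time.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical stand-in for the numeric KDD columns
features = pd.DataFrame({
    "wrong_fragment": [0.0, 1.0, 2.0],
    "urgent": [0.0, 2.0, 4.0],
})

# Calling the scaler without assigning the result leaves `features` unchanged
MinMaxScaler().fit_transform(features)
unchanged_max = features.max().max()  # still the original maximum

# Assign the result; fit_transform returns a NumPy array by default,
# so rebuild a DataFrame with the original labels
scaled = pd.DataFrame(
    MinMaxScaler().fit_transform(features),
    columns=features.columns,
    index=features.index,
)
```

Checking scaled.min() and scaled.max() afterwards is a quick way to confirm that every column now spans 0 to 1.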
Asclepius
Mohamed AL ANI