5

Given the following dataframe and left-x column:

|       | left-x | left-y | right-x | right-y |
|-------|--------|--------|---------|---------|
| frame |        |        |         |         |
| 0     | 149    | 181    | 170     | 175     |
| 1     | 149    | 181    | 170     | 175     |
| 2     | 149    | 181    | 170     | 175     |
| 3     | 149    | 181    | 170     | 175     |

How can I normalize left-x by its standard deviation using the scikit-learn library?

JP Ventura

2 Answers

9

You can normalize by standard deviation without using scikit-learn, as follows:

df['left-x'] = df['left-x'] / df['left-x'].std()

Or if you also want to mean-center the data:

df['left-x'] = (df['left-x'] - df['left-x'].mean())/df['left-x'].std()

Here df is your asl.df[l] variable.

The .std() method gives the standard deviation of a dataframe across the given axis. By selecting a column first, the standard deviation is calculated for that column alone.
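As a small illustration of that distinction, here is a sketch on a made-up frame. The values are hypothetical (they are not the asker's data), and the variable names are only for illustration.

```python
import pandas as pd

# Hypothetical sample data; the values are made up for illustration.
df = pd.DataFrame({"left-x": [149, 151, 153, 155],
                   "left-y": [181, 181, 183, 185]})

# Standard deviation of every column (axis=0 is the default): a Series.
per_column = df.std()

# Standard deviation of one column only: a single float.
left_x_std = df["left-x"].std()
```

Note that pandas computes the sample standard deviation by default (`ddof=1`), so the divisor is `n - 1` rather than `n`.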

If you need to do this a lot and want to avoid the clutter, you can wrap it into a function, e.g.

def std_norm(df, column):
    c = df[column]
    df[column] = (c - c.mean())/c.std()

You call it as:

std_norm(df, 'left-x')

Note that this updates the passed DataFrame in-place.
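A quick way to check the function is to run it on a small frame and inspect the result. The sketch below uses made-up values. It also includes a constant column like the four rows shown in the question, since a column with zero standard deviation cannot be scaled this way and ends up as NaN.

```python
import pandas as pd

def std_norm(df, column):
    """Mean-center df[column] and scale it to unit standard deviation, in place."""
    c = df[column]
    df[column] = (c - c.mean()) / c.std()

# Hypothetical data with some variation.
varying = pd.DataFrame({"left-x": [149.0, 151.0, 153.0, 155.0]})
std_norm(varying, "left-x")

# A constant column, like the four rows shown in the question:
# the standard deviation is 0, so every value becomes 0/0 = NaN.
constant = pd.DataFrame({"left-x": [149.0] * 4})
std_norm(constant, "left-x")
```

After the call, `varying['left-x']` has mean 0 and standard deviation 1, while `constant['left-x']` is all NaN. If constant columns can occur in your data, you may want to check `c.std()` before dividing.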

mfitzp
  • I am aware that I could do `(df['left-x'] - df['left-x'].mean())/df['left-x'].std()`, but it is such a common operation that I was wondering if there was a `normalize` function out of the box. – JP Ventura Mar 05 '18 at 13:13
  • 2
    @JPVentura If anywhere, I would expect it to be in pandas itself (rather than scikit-learn), but there is none that I know of. You could just wrap this into a function to re-use and keep down the noise. I've added an example. – mfitzp Mar 06 '18 at 20:34
3

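You can use the scaling functions from the sklearn.preprocessing module. StandardScaler expects 2-D input, so select the column with double brackets (a one-column DataFrame) rather than passing a Series:

from sklearn.preprocessing import StandardScaler

sc = StandardScaler()
# fit_transform() fits and transforms in one step; ravel() flattens the (n, 1) result
df['left-x'] = sc.fit_transform(df[['left-x']]).ravel()

Note that StandardScaler also mean-centers the data, and it divides by the population standard deviation (ddof=0), whereas the pandas .std() method uses ddof=1 by default, so the two results differ slightly.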
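The scaler's output will not match the pandas one-liner exactly, because StandardScaler divides by the population standard deviation (`ddof=0`) while pandas defaults to the sample one (`ddof=1`). The sketch below reproduces both conventions in plain pandas on made-up values, so it does not call scikit-learn itself. The variable names are only for illustration.

```python
import pandas as pd

# Hypothetical values for illustration.
c = pd.Series([149.0, 151.0, 153.0, 155.0], name="left-x")

# pandas default: sample standard deviation (divides by n - 1).
sample_scaled = (c - c.mean()) / c.std()

# Population standard deviation (divides by n), the convention StandardScaler uses.
population_scaled = (c - c.mean()) / c.std(ddof=0)

# The two results differ by a constant factor of sqrt(n / (n - 1)).
factor = (len(c) / (len(c) - 1)) ** 0.5
```

For large frames the factor is close to 1, so the difference only matters for small samples or when you need the two approaches to agree exactly.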
YOLO