54

This has become quite a frustrating question, but I've asked in the Coursera discussions and they won't help. Below is the question:

[Image: screenshot of the quiz question]

I've gotten it wrong 6 times now. How do I normalize the feature? Hints are all I'm asking for.

I'm assuming x_2^(2) is the value 5184, unless I am supposed to add the x_0 column of 1's, which they don't mention here but he certainly mentions in the lectures when talking about creating the design matrix X. In that case x_2^(2) would be the value 72. Assuming one or the other is right (I'm playing a guessing game), what should I use to normalize it? He talks about 3 different ways to normalize in the lectures: one using the maximum value, another using the range (the difference between max and min), and another using the standard deviation. They want an answer correct to the hundredths. Which one am I to use? This is so confusing.

bjd2385
  • 1
    I'm stuck at the same question. Did any of the answers work for you? – Alex Craciun Sep 13 '16 at 10:23
  • 2
    To anyone like me who seems to get the numbers right but still failed this question: In my case, I forgot to round as asked. My question randomised for the 4th feature, my calculations got me -0.469 which should've been rounded up to -0.47 and I posted -0.46. Doh! – ido Mar 17 '17 at 15:31
  • That's cheating bro!! – dicemaster Apr 23 '18 at 12:00

7 Answers

47

...use both feature scaling (dividing by the "max-min", or range, of a feature) and mean normalization.

So for any individual feature f:

f_norm = (f - f_mean) / (f_max - f_min)

e.g. for x2 = (midterm exam)^2 = {7921, 5184, 8836, 4761}

> x2 <- c(7921, 5184, 8836, 4761)
> mean(x2)
[1] 6676
> max(x2) - min(x2)
[1] 4075
> (x2 - mean(x2)) / (max(x2) - min(x2))
[1]  0.306 -0.366  0.530 -0.470

Hence norm(5184) = -0.366, i.e. -0.37 to the nearest hundredth

(using R language, which is great at vectorizing expressions like this)

I agree it's confusing that they used the notation x_2^(2) to mean x2 (norm) or x2'


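EDIT: in practice people often call the builtin scale(...) function. Note that by default it divides by the standard deviation rather than the range, so it only gives the same result if you pass center and scale explicitly.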
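A minimal sketch of how scale(...) relates to the hand-written formula, assuming the same x2 values as above. The variable names (by_hand, via_scale, z_score) are made up for illustration. With its defaults, scale() centers on the mean and divides by the standard deviation, so the range has to be passed in explicitly to reproduce the quiz formula:

```r
x2 <- c(7921, 5184, 8836, 4761)

# Mean normalization with range scaling, written out by hand.
by_hand <- (x2 - mean(x2)) / (max(x2) - min(x2))

# Same computation via scale(): pass the mean and the range explicitly.
# scale() returns a one-column matrix, so flatten it with as.vector().
via_scale <- as.vector(scale(x2, center = mean(x2), scale = max(x2) - min(x2)))

# Default scale() divides by the standard deviation instead (a z-score),
# which gives different numbers from the range-based version.
z_score <- as.vector(scale(x2))

round(by_hand, 2)    # 0.31 -0.37  0.53 -0.47
round(via_scale, 2)  # same values as by_hand
```

The z_score vector is only there for comparison; it is not what this quiz question asks for.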

smci
  • thanks for the answer. but i noticed the last line of code should be (x2-mean(x2)) / (max(x2) - min(x2)) – Fang Cao Mar 15 '16 at 20:10
  • @FangCao: Doh! how on earth did I reverse that!? – smci Mar 17 '16 at 00:50
  • 15
    Shouldn't it be `-.366`? – Jossie Calderon Jul 09 '16 at 18:47
  • As a side note, an often used alternative is to divide by the standard deviation instead of (f_max - f_min). – PlsWork May 03 '18 at 16:22
  • @AnnaVopureta: yes, "scaling" can mean either dividing by the min-max range or the s.d. The advantage of the former is the result is bounded to [0,1] or [-1,1], whereas dividing by the sd the result is not bounded, which can cause problems (esp. with outliers) in feature generation or some models. – smci May 03 '18 at 22:52
  • I have a doubt what is the difference between mean normalization and feature scaling and why we are using both techniques simultaneously..?? – Deepak Chawla Jun 25 '18 at 06:02
4

It's asking you to normalize the second value in the second feature column, using both feature scaling and mean normalization. Therefore,

(5184 - 6675.5) / 4075 = -0.366

syam
0

Usually we normalize all of the features to have zero mean and to lie within [-1, 1].

You can do that easily by dividing by the maximum of the absolute values and then subtracting the mean of the scaled samples.
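A rough sketch of that recipe in R, assuming the raw midterm scores 89, 72, 94, 69 quoted elsewhere in this thread (the names scaled and centered are made up for illustration). Note this is a different scaling from the range-based formula the quiz expects, so the numbers will not match the quiz answer:

```r
x1 <- c(89, 72, 94, 69)

# Step 1: divide by the largest absolute value, so every entry is in [-1, 1].
scaled <- x1 / max(abs(x1))

# Step 2: subtract the mean of the scaled values, so the result has zero mean.
centered <- scaled - mean(scaled)

round(centered, 3)  # 0.085 -0.096  0.138 -0.128
```

For non-negative data like these scores the result stays inside [-1, 1]; with mixed-sign data the subtraction in step 2 could push values outside that interval.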

Royi
0

"I'm assuming x_2^(2) is the value 5184": is this because it's the second item in the list and you are reading that from the subscript _2? x_2 is just a variable name in maths; it applies to all rows in the list. Note that the student with the highest raw mid-term exam result (i.e. the one that is not squared) goes down on the final test, and the student with the lowest raw mid-term result increases the most for the final exam result. Theta is a fixed value, a coefficient, so somewhere your normalisation of the x_1 and x_2 values must become (EDIT: not negative, less than 1) in order to allow for this behaviour. That should hopefully give you a starting basis, by identifying where the pivot point is.

roganjosh
0

I had the same problem. In my case, I was computing the average as the maximum x2 value (8836) minus the minimum x2 value (4761), divided by two, instead of the sum of all x2 values divided by the number of examples.
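To see how much that slip changes the answer, here is a small sketch assuming the x2 values from the accepted answer (wrong_avg, right_avg and rng are illustrative names, and the "wrong" formula is my literal reading of the mistake described above):

```r
x2 <- c(7921, 5184, 8836, 4761)

wrong_avg <- (max(x2) - min(x2)) / 2   # 2037.5, half the range, not a mean
right_avg <- sum(x2) / length(x2)      # 6675.5, the arithmetic mean
rng       <- max(x2) - min(x2)         # 4075

round((x2[2] - right_avg) / rng, 2)    # -0.37
round((x2[2] - wrong_avg) / rng, 2)    # 0.77
```

Even the sign comes out different, so a wildly off answer is a hint to re-check how the mean was computed.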

jordileft
0

For the same training set, I got the question as: what is the normalized feature x^(3)_1?

That is the 3rd training example and the 1st feature, which is 94 in the table above. The normalized form is

x_norm = (x - mean(x)) / range(x)

The values are:

x = 94
mean = (89 + 72 + 94 + 69) / 4 = 81
range = 94 - 69 = 25

Normalized x = (94 - 81) / 25 = 0.52
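The same arithmetic as a short R sketch, assuming the four midterm scores listed above (the names x, mu and r are just for illustration):

```r
x1 <- c(89, 72, 94, 69)    # feature 1: midterm exam scores

x  <- x1[3]                # 3rd training example (R indexes from 1): 94
mu <- mean(x1)             # (89 + 72 + 94 + 69) / 4 = 81
r  <- max(x1) - min(x1)    # 94 - 69 = 25

round((x - mu) / r, 2)     # 0.52
```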
0

I'm taking this course at the moment, and a really trivial mistake I made the first time I answered this question was using a comma instead of a dot in the answer, since I did it by hand and in my country we use a comma as the decimal separator (e.g. 0,52 instead of 0.52).

The second time I tried, I used a dot and it worked fine.