
I am wondering whether there is any difference between Spark's StandardScaler and a simple z-score calculation.

The formula for the z-score calculation is:

z = (x - mean) / std
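For example, with a made-up mean of 4 and standard deviation of 3, a value x = 10 scales to z = (10 - 4) / 3 = 2.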

However, for Spark's StandardScaler it is not clear to me how exactly it normalizes values; I could not find a formula for it. Assume that we set both setWithStd and setWithMean to true, as below:

import org.apache.spark.ml.feature.StandardScaler;

StandardScaler scaler = new StandardScaler()
  .setInputCol("features")
  .setOutputCol("scaledFeatures")
  .setWithStd(true)
  .setWithMean(true);

Would it be the same as using a simple z-score calculation?
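For what it's worth, here is a minimal, self-contained sketch one could use to check this empirically (the class name, the local master, and the tiny one-column dataset are all made up for illustration):

import java.util.Arrays;
import java.util.List;

import org.apache.spark.ml.feature.StandardScaler;
import org.apache.spark.ml.linalg.VectorUDT;
import org.apache.spark.ml.linalg.Vectors;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.Metadata;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

public class ScalerCheck {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder()
        .appName("ScalerCheck")
        .master("local[*]")
        .getOrCreate();

    // A single feature column holding the values 1, 2, 3.
    List<Row> data = Arrays.asList(
        RowFactory.create(Vectors.dense(1.0)),
        RowFactory.create(Vectors.dense(2.0)),
        RowFactory.create(Vectors.dense(3.0)));

    StructType schema = new StructType(new StructField[]{
        new StructField("features", new VectorUDT(), false, Metadata.empty())});

    Dataset<Row> df = spark.createDataFrame(data, schema);

    StandardScaler scaler = new StandardScaler()
        .setInputCol("features")
        .setOutputCol("scaledFeatures")
        .setWithStd(true)
        .setWithMean(true);

    // For this column, mean = 2 and (corrected sample) std = 1, so if
    // StandardScaler is a plain z-score we expect [-1.0], [0.0], [1.0].
    scaler.fit(df).transform(df).show(false);

    spark.stop();
  }
}

One detail to watch: as far as I can tell, Spark computes the standard deviation with the corrected sample estimator (dividing by N - 1), so on very small samples the result can differ slightly from a z-score computed with the population standard deviation (dividing by N).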

  • I think I found the answer in [Why is another term for StandardScaler is Z-Score Normalization if it uses Standardization?](https://stackoverflow.com/questions/75351604/why-is-another-term-for-standardscaler-is-z-score-normalization-if-it-uses-stand). It is the same. – Des0lat0r Mar 15 '23 at 12:27
