I am looking for a Scala implementation of Python's sklearn.preprocessing.QuantileTransformer class. There doesn't seem to be a single class in Scala that implements the entire functionality.
The Python implementation has 3 major parts:
1) Compute quantiles for the given data and percentile array using numpy.percentile(). If a quantile lies between two input data points, linear interpolation is used. The closest I can find in Scala is Breeze, which has a percentile() function. (Observation: DataFrame.stat.approxQuantile() does not perform the linear interpolation and thus can't be used here.)
2) Use numpy.interp() to map the input values onto a given range. E.g., if the input data range is 1-100, it can be mapped to any given range, say 0-1. Again, this uses linear interpolation when an input value falls between two quantiles. The closest I can find in Scala is the breeze.interpolation package.
3) Calculate the inverse CDF using scipy.stats.norm.ppf(). I believe I can use the NormalDistribution class for this, as one answer below suggests, or the StandardScaler class.
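To make steps 1 and 2 concrete, here is a minimal sketch in plain Scala with no library dependencies. The names QuantileSketch, percentile, and interp are hypothetical helpers of my own, not Breeze or Spark APIs. It assumes the input is already sorted ascending and non-empty, and it does not reproduce everything QuantileTransformer does (for example, subsampling or its handling of repeated values). Step 3 is left out of the sketch.

```scala
object QuantileSketch {
  // Percentile with linear interpolation between neighbouring data points,
  // which is the default behaviour of numpy.percentile.
  // Assumes `sorted` is ascending and non-empty, and 0 <= p <= 100.
  def percentile(sorted: Array[Double], p: Double): Double = {
    val pos  = (p / 100.0) * (sorted.length - 1) // fractional index into the data
    val lo   = math.floor(pos).toInt
    val hi   = math.min(lo + 1, sorted.length - 1)
    val frac = pos - lo
    sorted(lo) + frac * (sorted(hi) - sorted(lo))
  }

  // Piecewise-linear interpolation in the spirit of numpy.interp.
  // Assumes `xs` is ascending and the same length as `ys`.
  // Values outside [xs.head, xs.last] are clamped to the end points.
  def interp(x: Double, xs: Array[Double], ys: Array[Double]): Double = {
    if (x <= xs.head) ys.head
    else if (x >= xs.last) ys.last
    else {
      val i = xs.indexWhere(_ > x) // first knot strictly greater than x, so i >= 1
      val t = (x - xs(i - 1)) / (xs(i) - xs(i - 1))
      ys(i - 1) + t * (ys(i) - ys(i - 1))
    }
  }
}

object QuantileSketchDemo extends App {
  val data       = Array(1.0, 2.0, 3.0, 4.0)       // already sorted
  val references = Array(0.0, 0.5, 1.0)            // target range 0-1
  val quantiles  = references.map(r => QuantileSketch.percentile(data, r * 100))
  // Map a raw value onto the 0-1 range via the learned quantiles.
  println(QuantileSketch.interp(3.25, quantiles, references))
}
```

Here the quantiles at 0%, 50%, and 100% act as the knots, and interp maps a raw value onto the 0-1 reference range.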
Is there anything better that would keep the code short and simple?