
Here is how I would do it using sklearn's minmax_scale; however, sklearn doesn't integrate with PySpark. Is there an alternative way to do min-max scaling on an array in Spark? Thanks.

import numpy as np
from sklearn.preprocessing import minmax_scale

results = []
for i, a in enumerate(np.array_split(target, count)):
    start = q_l[i]
    # the last chunk tops out at 1.0; others end where the next range starts
    end = 1.0 if i == (count - 1) else q_l[i + 1]
    results.append(minmax_scale(a, feature_range=(start, end)))
results = np.concatenate(results)
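
In Spark itself, the same piecewise scaling could be sketched with window functions and a join. This is a rough, untested sketch assuming a DataFrame df with a numeric column "target", and count and q_l defined as above (all names illustrative):

from pyspark.sql import SparkSession, functions as F, Window

spark = SparkSession.builder.getOrCreate()

# assign each row to one of `count` ordered, roughly equal buckets
# (ntile is 1-based, so shift to 0-based to line up with q_l; note that
# a global orderBy window collects all rows onto one partition)
df = df.withColumn("bucket", F.ntile(count).over(Window.orderBy("target")) - 1)

# per-bucket feature range, mirroring q_l[i] .. q_l[i + 1] (1.0 for the last)
ranges = spark.createDataFrame(
    [(i, float(q_l[i]), 1.0 if i == count - 1 else float(q_l[i + 1]))
     for i in range(count)],
    ["bucket", "start", "end"],
)
df = df.join(ranges, "bucket")

# min-max scale each bucket into its [start, end] range
# (a constant bucket makes b_max - b_min zero and yields null here)
w = Window.partitionBy("bucket")
b_min, b_max = F.min("target").over(w), F.max("target").over(w)
df = df.withColumn(
    "target_scaled",
    F.col("start") + (F.col("target") - b_min) / (b_max - b_min)
    * (F.col("end") - F.col("start")),
)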
data_coder
https://spark.apache.org/docs/latest/api/python/reference/api/pyspark.ml.feature.MinMaxScaler.html? – Emma Jan 11 '22 at 21:53
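
For what the comment links to: pyspark.ml.feature.MinMaxScaler rescales a whole Vector column into a single common (min, max) range, so it covers the plain min-max case but not the per-chunk ranges above. A minimal sketch (column names illustrative):

from pyspark.ml.feature import MinMaxScaler
from pyspark.ml.linalg import Vectors
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# MinMaxScaler expects a Vector column, so wrap each scalar in a dense vector
df = spark.createDataFrame(
    [(Vectors.dense([float(v)]),) for v in [3.0, 7.0, 1.0, 9.0]],
    ["features"],
)

scaler = MinMaxScaler(min=0.0, max=1.0, inputCol="features", outputCol="scaled")
model = scaler.fit(df)   # learns the column min and max
model.transform(df).show()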
