Is it possible to specify that certain coefficients should be held constant (at a pre-determined value) during the training of a regression model in PySpark?

For example, given the simple, single-feature data shown below, I can fit a straight line with linear regression and let both coefficients (slope and intercept) be fit. That gives the green line.

However, if I somehow know the slope is 2.3, I can fix that coefficient to 2.3 and fit the intercept, which is the blue line.
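To make the fixed-slope case concrete: with the slope held constant, least squares for the intercept has a closed form, namely the mean of `y - slope * x`. Below is a minimal sketch in plain Python, not PySpark. The function name and the data are hypothetical and chosen only for illustration; they are not the points in the plot.

```python
from statistics import mean


def fit_intercept_fixed_slope(xs, ys, slope):
    """Least-squares intercept when the slope is held at a known value.

    Minimizing sum((y - slope*x - b)**2) over b alone gives
    b = mean(y - slope*x).
    """
    return mean(y - slope * x for x, y in zip(xs, ys))


# Hypothetical data lying exactly on y = 2.3*x + 1.0 (not the plotted data).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.3, 5.6, 7.9]

# Hold the slope at 2.3 and fit only the intercept.
b = fit_intercept_fixed_slope(xs, ys, slope=2.3)
```

So in the single-feature case the fit reduces to one average over the offset-corrected targets, which is why I call the example trivial. What I am after is the general mechanism for models with many coefficients.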

This is a trivial example, but is there a means to do this in Spark (PySpark especially)?

Or is there a hook to add a custom cost function? (Then I could make the cost extremely large if certain coefficients are far from the expected value.)
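To illustrate the kind of penalized cost I have in mind, here is a minimal sketch in plain Python, not a Spark API. It minimizes `sum((y - a*x - b)**2) + lam * (a - a0)**2`, which for a single feature has a closed-form solution. The function name, the weight `lam`, and the data are all hypothetical.

```python
from statistics import mean


def fit_with_slope_penalty(xs, ys, target_slope, lam):
    """Fit y = a*x + b with a quadratic penalty pulling a toward target_slope.

    Minimizes sum((y - a*x - b)**2) + lam * (a - target_slope)**2.
    Eliminating b (b = ybar - a*xbar) and setting the derivative in a to zero
    gives a = (Sxy + lam*target_slope) / (Sxx + lam).
    """
    xbar, ybar = mean(xs), mean(ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    a = (sxy + lam * target_slope) / (sxx + lam)
    b = ybar - a * xbar
    return a, b


# Hypothetical noisy data (not the plotted data).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.5, 3.1, 4.9, 7.5]

# lam = 0 recovers ordinary least squares (both coefficients free).
a_free, b_free = fit_with_slope_penalty(xs, ys, target_slope=2.3, lam=0.0)

# A very large lam effectively pins the slope at the target value.
a_pinned, b_pinned = fit_with_slope_penalty(xs, ys, target_slope=2.3, lam=1e9)
```

With `lam = 0` this is the unconstrained fit, and as `lam` grows the slope is driven toward the target, so a large enough penalty behaves like a fixed coefficient.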

example of two fits, one with a fixed slope

Corey