
I am working on a regression problem where the predicted value must be a positive integer. One approach could be to just train a model, make predictions, and round the predicted values. However, I want to try a different approach of modifying the loss function. I tried this in Keras like so:

import tensorflow as tf
from tensorflow.keras import backend as K

def my_custom_loss_fn(y_actual, y_predicted):
    # Round predictions to the nearest integer before taking the RMSE
    y_predicted_rounded = K.round(y_predicted)
    return K.sqrt(tf.keras.losses.mean_squared_error(y_actual, y_predicted_rounded))

which throws an error: No gradients provided for any variable: ... This is likely because K.round has no gradient: rounding is piecewise constant, so its derivative is zero almost everywhere.
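A minimal check (assuming TensorFlow 2.x) seems to confirm this:

import tensorflow as tf

x = tf.Variable([1.3, 2.7])
with tf.GradientTape() as tape:
    y = tf.round(x)  # Round is registered as not differentiable in TensorFlow

print(tape.gradient(y, x))  # None, hence "No gradients provided for any variable"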

My question is: is there any other elegant way, or even a different framework (like XGBoost etc.), where I could modify the loss function such that the loss is the root mean squared error of y_actual and a rounded y_predicted?

user2202866

1 Answer


An alternative would be to reframe your entire problem as a classification one.

That is, you transform your dataset from a regression one to a classification one. The more formal term for this is bucketizing (or binning). For example, if a value falls between 15.0 and 30.0, you assign category X to every datapoint that belongs to this interval, as sketched below.
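A minimal sketch of the bucketizing step, assuming NumPy; the bin edges below are illustrative, not taken from any real dataset:

import numpy as np

y = np.array([3.2, 17.5, 28.9, 42.0])  # continuous regression targets
bin_edges = [0.0, 15.0, 30.0, 45.0]    # hypothetical interval boundaries

# Map each value to the index of the interval it falls into.
y_class = np.digitize(y, bin_edges) - 1
print(y_class)  # [0 1 1 2] -- e.g. 17.5 and 28.9 both land in the 15.0-30.0 bucket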

The predicted positive integer would then of course be the category_id.

Then, depending on the number of datapoints, you can extend or shrink the number of intervals. You also get rid of the problem of implementing a custom loss function entirely; a standard classification loss works out of the box, as in the sketch below.
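As a hypothetical sketch (the feature count, layer sizes, and number of buckets below are all assumptions, not from your problem), the classifier trains with a standard built-in loss:

import tensorflow as tf

num_buckets = 3  # assumed: one class per interval from the bucketizing step

model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(10,)),  # 10 input features assumed
    tf.keras.layers.Dense(num_buckets, activation="softmax"),         # one probability per bucket
])

# Standard loss over integer class labels; no custom loss needed.
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# model.fit(X_train, y_class, epochs=10)  # y_class: the bucketized labels from above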

If this does not fit your problem exactly, then rounding the final result without "tampering" with the loss function, as you suggested in your question, is a good approach, for example:
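A minimal sketch of that post-hoc rounding, with hypothetical prediction values:

import numpy as np

preds = np.array([0.2, 3.6, 7.1])  # hypothetical raw regression outputs
# Round to the nearest integer, then clip to keep predictions positive.
preds_int = np.clip(np.rint(preds).astype(int), 1, None)
print(preds_int)  # [1 4 7]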

Timbus Calin