
I'm exploring follow-the-regularized-leader (FTRL-Proximal) gradient descent: paper, reference implementation.

Everywhere FTRL is mentioned, the loss surface for the gradient descent is log loss, and the model used for prediction is logistic regression.

Can I use the same algorithm for a linear least-squares model? I have a problem I want to fit with a linear model, defining the loss by least squares, and then run FTRL to find the optimal solution - do you see any problem with that?

Thanks.

ihadanny

1 Answer


I don't believe I have studied FTRL deeply enough yet, but I'm trying to. I've been doing research on this algorithm, and I believe this Python code will help you, since it uses least squares for the loss. Because the code is written for regression, the return value of the `predict` method is changed: applying a sigmoid to the output is useless for regression, so the raw linear prediction is returned instead. https://www.kaggle.com/scirpus/grupo-bimbo-inventory-demand/ftlr-use-pypy/code

I hope you are familiar with Python, but either way, I believe the answer to your question is yes: I've tested this same code on the Kaggle Bimbo competition, and it produced a score on the test data similar to the one on the public leaderboard.
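To make the point concrete, here is a minimal sketch of per-coordinate FTRL-Proximal with a squared loss (my own illustration, not the Kaggle kernel and not the paper's reference code; all names are hypothetical). The key observation is that for the loss 0.5*(p - y)^2 the per-example gradient is (p - y)*x, which has exactly the same form as log loss with a sigmoid link, so the update rules carry over unchanged and only the prediction (identity instead of sigmoid) differs:

```python
import math

class FTRLRegressor:
    """Sketch of FTRL-Proximal for linear least squares.

    Loss is 0.5 * (p - y)**2, so the per-example gradient is
    (p - y) * x -- the same form as log loss with a sigmoid link,
    which is why the algorithm carries over with only the
    prediction changed from sigmoid(w.x) to w.x."""

    def __init__(self, n_features, alpha=1.0, beta=1.0, l1=0.0, l2=0.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = [0.0] * n_features   # FTRL "z" accumulators
        self.n = [0.0] * n_features   # per-coordinate sum of squared gradients

    def _weight(self, i):
        # Closed-form FTRL-Proximal weight with L1/L2 regularization:
        # w_i = 0 when |z_i| <= l1, else the soft-thresholded value.
        if abs(self.z[i]) <= self.l1:
            return 0.0
        sign = -1.0 if self.z[i] < 0 else 1.0
        return -(self.z[i] - sign * self.l1) / (
            (self.beta + math.sqrt(self.n[i])) / self.alpha + self.l2)

    def predict(self, x):
        # x is a sparse example: dict {feature_index: value}.
        # Identity link -- no sigmoid, this is the only change vs. log loss.
        return sum(self._weight(i) * v for i, v in x.items())

    def update(self, x, y):
        p = self.predict(x)
        for i, v in x.items():
            g = (p - y) * v                            # squared-loss gradient
            sigma = (math.sqrt(self.n[i] + g * g)
                     - math.sqrt(self.n[i])) / self.alpha
            self.z[i] += g - sigma * self._weight(i)
            self.n[i] += g * g
```

For example, feeding the single-feature example `{0: 1.0}` with target `2.0` a few hundred times drives the prediction toward 2, just as ordinary least squares would.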

Phill Donn
  • Very nice! Can you give a reason why the code uses `max(p, 0)` instead of just `p`? Why don't we increase the loss for large negative predictions? – ihadanny Jul 22 '16 at 16:39
  • My bad for not mentioning that the code is not mine; I just found it while doing research on FTRL. However, I believe the author does this because the target values can only be positive, so setting negative predictions to 0 reduces the error. The target being predicted in this example is returns of expired products in a store; obviously you cannot return a negative number of products, so to improve the public leaderboard score you clip all negative predictions to the smallest possible target value, which is zero. – Phill Donn Jul 22 '16 at 20:13
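  • In other words, the clipping described above is a post-processing step applied at prediction time, not a change to the training loss. A one-line sketch (hypothetical function name) of the idea:

```python
def clip_prediction(p, lo=0.0):
    # Returned-product counts cannot be negative, so clip the raw
    # linear prediction at zero before scoring, as max(p, 0) does.
    return max(p, lo)
```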