8

In the paper A Tutorial on Energy Based Learning I have seen two definitions:

  • The energy function E(X, Y) is minimized by the inference process: the goal is to find the value of Y for which E(X, Y) takes its minimal value.
  • The loss function is a measure of the quality of an energy function, evaluated on a training set.

I understand the meaning of a loss function (a good example is the mean squared error). But can you explain the difference between an energy function and a loss function? Can you give me an example of an energy function in ML or DL?

nbro
DY92

1 Answer

7

In short, the energy function describes your problem. In contrast, the loss function is just something that an ML algorithm takes as input. The two might be the same function, but this is not necessarily the case.

The energy of a system in physics might be the movement inside this system. In a ML context, you might want to minimize the movement by adjusting the parameters. One way to achieve this is to use the energy function as a loss function and minimize it directly. In other cases the energy function might not be easy to evaluate or to differentiate, and then other functions might be used as the loss for your ML algorithm. The same happens in classification: you care about the accuracy of the classifier, but you still use cross entropy on the softmax as the loss function, not accuracy.
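To make the distinction concrete, here is a minimal sketch of a toy classifier written in energy-based terms. Everything in it (the weight matrix `W`, the linear `energy`, the tiny `train` set, and the function names) is a hypothetical choice for illustration, not something taken from the paper. The energy scores a single (x, y) pair and inference minimizes it over y, while the loss is a number computed over the whole training set, here the negative log-likelihood, which is the cross entropy on a softmax over negative energies.

```python
import math

# Hypothetical toy model: a linear scorer with 3 classes and 2 input features.
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]

def energy(W, x, y):
    """E(W, x, y): low energy means label y is compatible with input x."""
    return -sum(w * xi for w, xi in zip(W[y], x))

def infer(W, x):
    """Inference: pick the label y that minimizes the energy for this x."""
    return min(range(len(W)), key=lambda y: energy(W, x, y))

def nll_loss(W, data):
    """Loss: rates the energy function on a training set.

    Negative log-likelihood, i.e. cross entropy on softmax(-energy).
    """
    total = 0.0
    for x, y in data:
        log_z = math.log(sum(math.exp(-energy(W, x, k)) for k in range(len(W))))
        total += energy(W, x, y) + log_z
    return total / len(data)

# Assumed toy training set of (input, correct label) pairs.
train = [([2.0, 0.5], 0), ([0.1, 1.5], 1), ([-1.0, -2.0], 2)]

predictions = [infer(W, x) for x, _ in train]  # uses only the energy
loss = nll_loss(W, train)                      # uses energy + training labels
```

Training would adjust `W` to lower `loss`; prediction never touches the loss and only needs `energy`. For a regression energy such as a squared error between a model output and y, averaging the energy of the correct answers over the training set is itself the mean squared error, which is the case where the two functions coincide.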

nbro
Thomas Pinetz
  • The loss function is an input to a ML algorithm? "In a ML context, you might want to minimize the movement", movement of what? – nbro Apr 08 '19 at 10:31
  • The context was a physical system, where particles are floating around and you want to adjust the parameters of those particles in such a way that the energy in that system is minimized, hence the ML context. As I explain and the author accepted, the loss function is used in your ML system to train your model. It is not necessarily the thing you are actually trying to do, but a surrogate with good properties for optimization, e.g. convex, differentiable, smooth, etc. Sometimes the energy function can also be a good loss function, but this is not necessarily the case. – Thomas Pinetz Apr 09 '19 at 11:10