Hi, I am an old hand at programming but very new to Julia, so the answer may be obvious.
Forward Mode AD is often compared with the forward pass of a neural net, and Reverse Mode AD with back propagation, and clearly you cannot replace back propagation with a forward pass. Forward and Reverse Mode AD both compute the gradient. But are they the same function, or, ignoring efficiency, is Reverse Mode doing something that Forward Mode is not? Alternatively, are there applications using Reverse Mode where, ignoring efficiency, Forward Mode could not be used?
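To make the question concrete, here is a minimal illustrative sketch (in Python, not Julia, and not any real AD library) of both modes differentiating the same function f(x, y) = x*y + sin(x). Both return the same gradient; only the order of the two passes differs:

```python
import math

# --- Forward mode via dual numbers: one seeded pass per input ---
class Dual:
    def __init__(self, val, dot):
        self.val, self.dot = val, dot
    def __add__(self, other):
        return Dual(self.val + other.val, self.dot + other.dot)
    def __mul__(self, other):
        # product rule carried alongside the value
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)

def dsin(d):
    return Dual(math.sin(d.val), math.cos(d.val) * d.dot)

def forward_grad(x, y):
    f = lambda a, b: a * b + dsin(a)
    # seed each input direction separately: two passes for two inputs
    df_dx = f(Dual(x, 1.0), Dual(y, 0.0)).dot
    df_dy = f(Dual(x, 0.0), Dual(y, 1.0)).dot
    return (df_dx, df_dy)

# --- Reverse mode via a tiny tape: one backward sweep for all inputs ---
class Var:
    def __init__(self, val, parents=()):
        self.val, self.parents, self.grad = val, parents, 0.0
    def __add__(self, other):
        return Var(self.val + other.val, [(self, 1.0), (other, 1.0)])
    def __mul__(self, other):
        return Var(self.val * other.val,
                   [(self, other.val), (other, self.val)])

def vsin(v):
    return Var(math.sin(v.val), [(v, math.cos(v.val))])

def reverse_grad(x, y):
    vx, vy = Var(x), Var(y)
    out = vx * vy + vsin(vx)
    # backpropagate: seed the output adjoint, push adjoints to parents
    stack = [(out, 1.0)]
    while stack:
        node, adj = stack.pop()
        node.grad += adj
        for parent, local in node.parents:
            stack.append((parent, adj * local))
    return (vx.grad, vy.grad)
```

At (x, y) = (2, 3) both functions return (3 + cos(2), 2), i.e. the gradient (y + cos(x), x) of f, so as mathematical functions from inputs to gradients they coincide; the difference is purely in how the work is organized.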
The reason for asking this is that the paper https://richarde.dev/papers/2022/ad/higher-order-ad.pdf, "Provably Correct, Asymptotically Efficient, Higher-Order Reverse-Mode Automatic Differentiation", defines Reverse Mode AD as a correct and efficient way to compute the gradient when there is a large number of inputs. Yet their definition computes the same function as Forward Mode AD.
Any correction of my understanding would be much appreciated.
Clarification:
Let Forward Algorithm mean an algorithm that takes dx and computes df(x), and Reverse Algorithm mean an algorithm that takes df(x) and computes a possible dx.
The easiest Automatic Differentiation algorithm to understand is a Forward Algorithm called Forward Mode Automatic Differentiation, but it is not efficient when there is a large vector of inputs (x). Hence Reverse Algorithms were invented that are efficient with a large vector of inputs; these were called Reverse Mode Automatic Differentiation. These Reverse Algorithms appear much harder both to understand and to implement.
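The inefficiency I mean is the one-pass-per-input cost: for a scalar function of n inputs, a Forward Algorithm computes one directional derivative per pass, so recovering the full gradient takes n passes, while a Reverse Algorithm gets all n partials in one backward sweep. A rough sketch of the forward-mode side (in Python, with a finite-difference stand-in for one forward pass, purely for illustration):

```python
def f(xs):
    # example scalar-valued function of n inputs: sum of squares
    return sum(x * x for x in xs)

def jvp(f, xs, v, eps=1e-6):
    # directional derivative of f at xs along direction v, approximated
    # by finite differences to stand in for one forward-mode pass
    shifted = f([x + eps * vi for x, vi in zip(xs, v)])
    return (shifted - f(xs)) / eps

def grad_via_forward(f, xs):
    # one seeded pass per basis direction: n passes for n inputs
    n = len(xs)
    passes = 0
    grad = []
    for i in range(n):
        e = [1.0 if j == i else 0.0 for j in range(n)]  # i-th basis vector
        grad.append(jvp(f, xs, e))
        passes += 1
    return grad, passes

xs = [1.0, 2.0, 3.0]
g, passes = grad_via_forward(f, xs)
# gradient of sum(x^2) is 2*x, and this took len(xs) = 3 passes;
# a reverse-mode sweep would produce all three partials in one pass
```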
In the paper cited, a Haskell forward algorithm for Automatic Differentiation was given that is efficient with a large vector of inputs. Because of that efficiency, it was, not unreasonably, called Reverse Mode Automatic Differentiation. On the assumption both that they are correct and that their algorithm can be implemented for Julia programs, does this mean that Reverse Algorithms are no longer useful? Or are there use cases (not Automatic Differentiation) for which Reverse Algorithms are still needed?