
Does an LSTM take into account a trend in a feature? Or does it only see trends in the previous outputs (the predicted Y)?

To illustrate, imagine we have a trend in feature A. In our problem, we know that in the real world Y tends to decrease as A increases (an inverse relationship). However, Y is NOT related to the actual value of A.

For example, if A increases from 10 to 20, Y decreases by 1. If A increases from 40 to 50, Y decreases by 1 as well. Similarly, if A decreases from 30 to 20, Y should increase by 1, and if A decreases from 60 to 50, Y should increase by 1 as well.

In the above example, the LSTM will do well if it can pick up on the change in feature A over time. However, if it is only using the actual value of feature A, that will not be useful at all. The value of "10" or "50" for A above is in itself meaningless, because there is no direct correlation between the value of A and Y; only the trend in A influences Y. A real-world example of this is the relationship between SPY (the stock market) and VIX (the volatility index): the VIX has an inverse correlation to the movement of SPY, but the actual value of SPY doesn't matter.
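To state the intended rule explicitly (the function name and numbers here are just for illustration), the change in Y depends only on the change in A, never on A's absolute level:

```python
# Hypothetical rule for the toy example above: Y moves opposite to the
# change in A, regardless of A's absolute value.
def next_y(prev_y, prev_a, curr_a):
    return prev_y - (curr_a - prev_a) / 10.0

print(next_y(5.0, 10, 20))  # 4.0  (A up by 10  -> Y down by 1)
print(next_y(5.0, 60, 50))  # 6.0  (A down by 10 -> Y up by 1)
```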

I have spent a lot of time learning about LSTMs in general, but it's not clear whether the "memory" remembers the trend in a feature's value. From what I can see, it does not, and only remembers the previous "weights" of the features and the trend in the output (Y).

EWhite
  • I would suggest asking your question on a more specific StackExchange, like [Cross-Validated](https://stats.stackexchange.com/) or [Data Science](https://datascience.stackexchange.com/). – Kins Jun 29 '22 at 13:58

1 Answer


There are no "previous weights"; the weights are fixed at evaluation time. The network remembers whatever function of the previous inputs and the previous state it learns to remember, based on the recurrent weights it learned during training. The difference between an input at the current time-step and the previous time-step, or an approximation of a longer-range derivative, is certainly something that it could learn if that was useful.

hobbs