I'm using the MXNet implementation of the TFT model, and I want to get the feature importance for every timestep from the trained model. Unfortunately, there is no built-in function that does this. According to the original TFT paper, feature importance can be obtained from the weights of the variable selection network. However, its softmax function returns an embedded, 3-dimensional matrix, and I don't know how to turn that into per-feature importances. I'm stuck on this problem because there is so little documentation about TFT in MXNet.
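To make the goal concrete, here is a minimal sketch of what I'm trying to compute. It assumes the softmax output has the layout (batch, time, num_features), which I have not been able to confirm for the MXNet implementation. The function name `per_timestep_importance` and the random stand-in data are hypothetical and only there for illustration; plain NumPy is used instead of any MXNet call.

```python
import numpy as np


def per_timestep_importance(weights):
    """Average variable selection weights over the batch axis.

    ASSUMPTION: `weights` has shape (batch, time, num_features) and each
    vector along the last axis is a softmax output (sums to 1).
    Returns an array of shape (time, num_features).
    """
    weights = np.asarray(weights, dtype=float)
    if weights.ndim != 3:
        raise ValueError("expected a 3-dimensional array")
    # Averaging over samples keeps one weight vector per timestep.
    return weights.mean(axis=0)


# Random stand-in for the softmax output of the variable selection network:
# 4 samples, 6 timesteps, 3 features.
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 6, 3))
exp = np.exp(logits - logits.max(axis=-1, keepdims=True))
fake_weights = exp / exp.sum(axis=-1, keepdims=True)

importance = per_timestep_importance(fake_weights)
print(importance.shape)  # (6, 3)
```

If the axes are ordered differently, or the last axis is an embedding dimension rather than the feature dimension, the averaging axis would have to change.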
Any help is highly appreciated.