I need to extract a single "keyframe" from a video of a particular human action (the actions could be generic) such that the frame is discriminative rather than descriptive (Finding an interesting frame in a video).
In short, I need to find the one frame in a basketball video that distinguishes it from, say, a coffee-drinking video.
Most of the papers I've seen describe some kind of video summarization technique, but the frames extracted that way are not necessarily the best ones for separating action categories. This is my stumbling block: at test time I only have the test video to extract a keyframe from, yet I need some model that lets me pick the frame most different from videos of other action categories.
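
To make the test-time constraint concrete, here is a minimal sketch of one possible formulation (an illustration, not a method taken from any of the papers mentioned). It assumes a frame-level classifier has already been trained on labeled videos from all action categories, so that at test time every frame of the single test video gets a vector of class probabilities. The keyframe is then taken to be the frame whose distribution has the lowest entropy, i.e. the frame most confidently assigned to one category. The classifier itself, the function names, and the toy probabilities are all assumptions.

```python
import math
from typing import Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def pick_discriminative_frame(frame_probs: Sequence[Sequence[float]]) -> int:
    """Return the index of the frame with the lowest-entropy class distribution.

    frame_probs[i] is assumed to be the output of a hypothetical frame-level
    classifier for frame i: one probability per action category.
    """
    if not frame_probs:
        raise ValueError("video has no frames")
    return min(range(len(frame_probs)), key=lambda i: entropy(frame_probs[i]))


# Toy per-frame probabilities over three made-up categories
# (basketball, coffee-drinking, other); these numbers are invented.
frame_probs = [
    [0.40, 0.35, 0.25],  # ambiguous frame
    [0.90, 0.05, 0.05],  # confidently one category
    [0.55, 0.30, 0.15],  # somewhat ambiguous
]
best = pick_discriminative_frame(frame_probs)
```

In this sketch the other action categories enter only through the training of the classifier, which is why nothing but the test video is needed at test time. Whether low entropy is actually a good proxy for "most discriminative" is an open assumption.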