This slide shows an equation for Q(state, action) as a weighted sum of feature functions, something like Q(s, a) = Σᵢ wᵢ · fᵢ(s, a). I'm confused about how to write the feature functions.
Given an observation, I understand how to extract features from it. But one doesn't know, before taking an action, what effect that action will have on the features. So how does one write a function that maps an (observation, action) pair to a numerical value?
In the Pacman example shown a few slides later, one knows, given a state, what the effect of an action will be. But that's not always the case. For example, consider the cart-pole problem in OpenAI Gym. There, the observation itself consists of four feature values: cart position, cart velocity, pole angle, and pole angular velocity. There are two actions: push left and push right. But one doesn't know in advance how those actions will change the four feature values. So how does one compute Q(s, a)? That is, how does one write the feature functions fᵢ(state, action)?
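To make my question concrete, here's the one approach I can imagine: give each discrete action its own block of weights, so fᵢ(s, a) copies the observation into the block for action a and is zero elsewhere. This sidesteps predicting the action's effect, but I'm not sure it's what the slides intend. All the names below are my own, not from the lecture:

```python
# A sketch of "per-action weight blocks" for cart-pole.
# The observation is copied into the slot for the chosen action;
# all other slots are zero. This is equivalent to learning a
# separate weight vector for each action.

N_FEATURES = 4  # cart position, cart velocity, pole angle, pole angular velocity
N_ACTIONS = 2   # 0 = push left, 1 = push right

def features(obs, action):
    """Return f(s, a): obs placed in the block for `action`, zeros elsewhere."""
    f = [0.0] * (N_FEATURES * N_ACTIONS)
    f[action * N_FEATURES:(action + 1) * N_FEATURES] = obs
    return f

def q_value(weights, obs, action):
    """Q(s, a) = sum_i w_i * f_i(s, a)."""
    return sum(w * x for w, x in zip(weights, features(obs, action)))

weights = [0.1] * (N_FEATURES * N_ACTIONS)
obs = [0.0, 0.5, 0.02, -0.1]  # a made-up cart-pole observation
print(q_value(weights, obs, 0))
print(q_value(weights, obs, 1))
```

With this construction, fᵢ never has to predict what the action does to the state; the action only selects which weights see the observation. Is that the intended reading of fᵢ(state, action), or is there a more general recipe?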
Thanks.