Given a deterministic environment (or, as you say, a "perfect" environment in which you know the resulting state after performing an action), you can simulate the effect of all possible actions in a given state (i.e., compute all possible next states) and choose the action that leads to the next state with the maximum value V(state), as sketched below.
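For instance, a minimal sketch of that one-step lookahead in Python (the `next_state`, `reward` and `V` names here are hypothetical stand-ins for your known transition model, reward function and value estimates):

```python
def greedy_action(state, actions, next_state, V, reward, gamma=0.99):
    """Pick the action whose deterministic successor state maximizes
    the one-step return r + gamma * V(s')."""
    best_action, best_value = None, float("-inf")
    for a in actions:
        s_next = next_state(state, a)                      # known deterministic transition
        value = reward(state, a, s_next) + gamma * V[s_next]
        if value > best_value:
            best_action, best_value = a, value
    return best_action
```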
However, it should be taken into account that both the value function V(state) and the Q function Q(state, action) are defined for a given policy. In a sense, the value function is an average of the Q function over the actions selected by the policy: V(s) "evaluates" the state s by taking all possible actions into account. So, to compute a good estimate of V(s), the agent still needs to try all the possible actions in s.
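Concretely, for a stochastic policy this relation is just V(s) = sum over a of pi(a|s) * Q(s, a). A tiny illustrative sketch (again with hypothetical `policy` and `Q` objects, not any particular library's API):

```python
def state_value_from_q(state, actions, policy, Q):
    """Compute V(s) as the policy-weighted average of Q(s, a)."""
    return sum(policy(a, state) * Q[(state, a)] for a in actions)
```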
In conclusion, I think that although V(s) is simpler than Q(s,a), they likely need a similar amount of experience (or time) to reach a stable estimate.
You can find more info about value (V and Q) functions in this section of the Sutton & Barto RL book.