I'm currently reading Russell and Norvig's Artificial Intelligence: A Modern Approach, Chapter 15 (Probabilistic Reasoning over Time), and I'm not able to follow the derivation for filtering and prediction (page 572).
Given the result of filtering up to time t, the agent needs to compute the result for t + 1 from the new evidence et+1,
P(Xt+1|e1:t+1) = f(et+1, P(Xt|e1:t)),
for some function f. This is called recursive estimation. We can view the calculation as being composed of two parts: first, the current state distribution is projected forward from t to t + 1; then it is updated using the new evidence et+1. This two-part process emerges quite simply when the formula is rearranged:

P(Xt+1|e1:t+1) = P(Xt+1|e1:t, et+1) (dividing up the evidence)
= α P(et+1|Xt+1, e1:t) P(Xt+1|e1:t) (using Bayes' rule)
How does applying Bayes' rule result in the last formula? Shouldn't it be
α P(e1:t, et+1|Xt+1) P(Xt+1)
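For reference, here is how I understand the two-part process the book describes (project forward, then update on the new evidence and normalize by α). This is a minimal Python sketch, assuming the umbrella-world parameters from the book's running example; the function name `filter_step` and the list-based representation are my own.

```python
def filter_step(prior, transition, sensor_given_evidence):
    """One recursive filtering step: predict, then update and normalize.

    prior:                 P(X_t | e_{1:t}) as a list over states
    transition:            transition[i][j] = P(X_{t+1}=j | X_t=i)
    sensor_given_evidence: P(e_{t+1} | X_{t+1}=j) for each state j
    """
    n = len(prior)
    # Prediction: project the state distribution forward from t to t+1,
    # summing over the current state.
    predicted = [sum(prior[i] * transition[i][j] for i in range(n))
                 for j in range(n)]
    # Update: weight each predicted state by the likelihood of the
    # new evidence, then normalize (this is the alpha in the formula).
    unnormalized = [sensor_given_evidence[j] * predicted[j] for j in range(n)]
    alpha = 1.0 / sum(unnormalized)
    return [alpha * p for p in unnormalized]

# Umbrella world from the chapter (states: Rain, NotRain),
# with the umbrella observed on day 1.
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
umbrella_likelihood = [0.9, 0.2]          # P(umbrella | Rain), P(umbrella | NotRain)
belief = filter_step(prior, transition, umbrella_likelihood)
print(belief)  # ≈ [0.818, 0.182], matching the book's day-1 result
```

The normalization constant α here is exactly the 1/P(et+1|e1:t) factor that lets the book write the update as a proportionality.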