Wikipedia gives a very nice explanation of the vector space model:
http://en.wikipedia.org/wiki/Vector_space_model
However, it skips one part that is not self-explanatory to me: the definition of the query vector. The text starts with
d_j = ( w_{1,j} ,w_{2,j} , .... ,w_{t,j} ) // document vector
q = ( w_{1,q} ,w_{2,q} , ... ,w_{t,q} ) // query vector
and proceeds to explain how d_j is defined in terms of tf-idf for a document in a corpus. That's all fine, but I am not able to translate that explanation to the query vector. In the idf part, how would you apply
| {d' E D | t E d' }| ? ( I am using E to represent 'member of set').
In the case of the query vector, even though a term is part of the query, the query itself is not a document in the corpus, so the above normalization term has no equivalent.
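
To make the document side concrete, here is a minimal Python sketch of how I read the tf-idf weights for d_j. The toy corpus, the function names, and the choice of raw-count tf with idf = log(N / df) are my own assumptions for illustration; the article lists other weighting variants. The document-frequency count runs over the corpus D, which is exactly the piece I can't see how to define for a query.

```python
import math
from collections import Counter

# Toy corpus D (hypothetical data, for illustration only).
corpus = [
    ["vector", "space", "model"],
    ["vector", "query"],
    ["document", "space", "space"],
]

def document_frequency(term, docs):
    # |{d' in D | t in d'}|: number of corpus documents containing the term.
    return sum(1 for d in docs if term in d)

def tfidf_vector(tokens, docs, vocabulary):
    # Weight per vocabulary term: raw-count tf times log(N / df).
    # This is one assumed variant; others exist.
    counts = Counter(tokens)
    n_docs = len(docs)
    weights = []
    for term in vocabulary:
        df = document_frequency(term, docs)
        idf = math.log(n_docs / df) if df else 0.0
        weights.append(counts[term] * idf)
    return weights

# One vector component per distinct term in the corpus.
vocabulary = sorted({t for d in corpus for t in d})

# Document vector for the third document in the corpus.
d_3 = tfidf_vector(corpus[2], corpus, vocabulary)
```

Here `tfidf_vector` only ever receives token lists that are members of `corpus`. What I don't know is what should be passed for a query, which is not in D.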
Are any experts in the vector space model able to clarify?