Hi, I'm trying to write a small program that indexes documents from an XML collection, using the tf-idf method. When my program reads a query, it returns a list of tuples ('tf-idf','docid'), one for each query word in each document.
This is an example:
Query: "Dog water"
Documents: [(0.212,1),(0.334,1),(0.111,2),(0,2)]
In this case, document 2 contains only one of the query words, so its second weight is 0.
Now my question is: I know that I have to compute the dot product between those documents and the query, but how can I do it? How can I translate the query into a vector of weights?
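To make the question concrete, here is a rough sketch of what I think the computation might look like. It assumes that the tuples for each document come in the same order as the query words, and that every query word simply gets a weight of 1.0 (that weighting is my guess, and is exactly the part I am unsure about). The names group_by_doc and dot are just helpers I made up for this example.

```python
from collections import defaultdict


def group_by_doc(pairs):
    """Group (tfidf, docid) pairs into one weight vector per document.

    Assumes the pairs for each document follow the query word order.
    """
    vectors = defaultdict(list)
    for weight, doc_id in pairs:
        vectors[doc_id].append(weight)
    return dict(vectors)


def dot(u, v):
    """Dot product of two equal-length weight vectors."""
    return sum(a * b for a, b in zip(u, v))


# The example output from above for the query "Dog water".
pairs = [(0.212, 1), (0.334, 1), (0.111, 2), (0, 2)]

# Assumption: each query word gets weight 1.0. A real query vector might
# use the idf of each word (or tf * idf within the query) instead.
query_terms = ["dog", "water"]
query_vector = [1.0] * len(query_terms)

doc_vectors = group_by_doc(pairs)
scores = {doc_id: dot(vec, query_vector) for doc_id, vec in doc_vectors.items()}

# Document ids ordered from highest to lowest score.
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

With all query weights equal to 1.0 this just sums the tf-idf values per document, so I suspect the query vector should carry real weights, but I don't know how to compute them.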
Thank you.