Lucene's similarity measures are based on a query, i.e. you query your Lucene document set and you get back a set of documents with relative scores.
If you want to compare every document with every other document (is that right? it's hard to tell from the question) then you need to use a feature of each document as the basis for the queries.
For example, you could extract the top N terms (by frequency, excluding stop words) from each document and use them as that document's query. If you have X documents then you will have X queries. You then execute each of the X queries against the index and get back the relative similarity of each document to every other one. The result is an X-by-X matrix of scores that you could use for classification.
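To make the top-N-terms idea concrete, here is a minimal sketch in plain Python. It does not use Lucene at all: the `score` function is a simple term-count stand-in for whatever relevance score the index would return, and `STOP_WORDS`, `top_terms`, and `similarity_matrix` are hypothetical names made up for this illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "on", "at"}

def tokenize(text):
    """Lowercase, split on non-alphanumerics, drop stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower())
            if t not in STOP_WORDS]

def top_terms(text, n):
    """The n most frequent non-stop-word terms: this is the document's 'query'."""
    return [term for term, _ in Counter(tokenize(text)).most_common(n)]

def score(query_terms, text):
    """Stand-in for the engine's relevance score (NOT Lucene's formula):
    total number of occurrences of the query terms in the document."""
    counts = Counter(tokenize(text))
    return sum(counts[t] for t in query_terms)

def similarity_matrix(docs, n=3):
    """Row i holds the scores of document i's query against every document."""
    queries = [top_terms(d, n) for d in docs]
    return [[score(q, d) for d in docs] for q in queries]

docs = [
    "the cat sat on the mat and the cat slept",
    "a cat and a dog played in the garden",
    "stock markets fell as interest rates rose",
]
matrix = similarity_matrix(docs)
for row in matrix:
    print(row)
```

Note that a matrix built this way is generally not symmetric: the score of document B for document A's query need not equal the score of A for B's query, so you may want to normalize or average the two directions before feeding it to a classifier.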
Alternatively, you could use the title or synopsis of each document as the basis for the query (again, excluding stop words).