
Is there an easy way, using either:

  1. a tool, or
  2. a code fragment

to export TF-IDF vectors from a Lucene index into a human-friendly format such as JSON? Preferred implementation languages are Java and Python.

Thanks.

NOTE:

  1. My objective here is not to debug/browse the index -- for that I can use Luke.
  2. My objective is to be able to dump the TF-IDF vectors into a more portable data format.
user1172468
  • I don't know exactly what you are trying to accomplish. If you are looking for a diagnostic tool or some such, have you looked into [Luke](https://code.google.com/p/luke/)? Or do you really need JSON or a similar format for some reason? – femtoRgon Mar 18 '14 at 18:59
  • @femtoRgon, thanks for the note -- I'm not looking for something like Luke, but rather a way to export the vectors in a more common format such as JSON. – user1172468 Mar 19 '14 at 06:34
  • In that case, you can access that information through [IndexReader.getTermVectors](https://lucene.apache.org/core/4_7_0/core/org/apache/lucene/index/IndexReader.html#getTermVectors(int)). – femtoRgon Mar 19 '14 at 18:08
  • @femtoRgon I'll give that a spin -- if it works I'll post the code fragment as an answer. Thanks. – user1172468 Mar 20 '14 at 11:55
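
The export step discussed in the comments could be sketched roughly as follows. This is only a minimal illustration in plain Python and does not call Lucene: it assumes the per-document term frequencies have already been read out of the index (for example from the term vectors mentioned above) into a dictionary. The function names, the sample data, and the weighting (square root of the term frequency times `1 + ln(numDocs / (docFreq + 1))`, the form used by Lucene's classic TF-IDF similarity) are assumptions made for the example, not something taken from the question.

```python
import json
import math


def document_frequencies(term_freqs):
    """Count, for each term, how many documents contain it."""
    doc_freqs = {}
    for freqs in term_freqs.values():
        for term in freqs:
            doc_freqs[term] = doc_freqs.get(term, 0) + 1
    return doc_freqs


def tfidf_vectors(term_freqs):
    """Turn {doc_id: {term: raw_frequency}} into {doc_id: {term: tf-idf weight}}.

    Assumed weighting: tf = sqrt(freq), idf = 1 + ln(num_docs / (doc_freq + 1)).
    """
    num_docs = len(term_freqs)
    doc_freqs = document_frequencies(term_freqs)
    vectors = {}
    for doc_id, freqs in term_freqs.items():
        vectors[doc_id] = {
            term: math.sqrt(freq)
            * (1.0 + math.log(num_docs / (doc_freqs[term] + 1)))
            for term, freq in freqs.items()
        }
    return vectors


def export_json(vectors):
    """Serialize the vectors to a human-readable JSON string."""
    return json.dumps(vectors, indent=2, sort_keys=True)


# Hypothetical term frequencies standing in for data read from an index.
sample_term_freqs = {
    "doc0": {"lucene": 4, "index": 1},
    "doc1": {"index": 1, "json": 1},
}

sample_vectors = tfidf_vectors(sample_term_freqs)
sample_json = export_json(sample_vectors)
print(sample_json)
```

Keeping the weighting in its own function means a different TF-IDF variant can be swapped in without touching the JSON export.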

0 Answers