I have been given the task of creating a Google-like Ngram view/chart of a data set. The chart is just a line chart of terms (ngrams) over time.

I don't have any experience with Solr, but I have been given a core containing a lot of data and understand that I need to use shingles to pull the data out. Apparently it has already been indexed to use ngrams, although I still need to find out exactly how.

So I think I can get the ngrams/shingles out for the whole of the data, but how do I get results over time, say for each month over five years? The data is newspaper data, so the day and date are part of the index, as is the full text.

Is there a Solr call to get the data over time, or should I make many requests to Solr, one for each day/month?

Any suggestions or experience of doing this would be much appreciated.

Paul M

1 Answer

Shingles and ngrams are usually generated when indexing content, because they need to be indexed as separate terms before you can get any useful counts for them. You can generate these counts using faceting on the field, but the easiest way to do it over time is to issue several queries, as you've guessed. You can use a filter query (&fq=) to limit the set returned to each time period (or just the regular q= if you don't use it for anything else).
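As a rough sketch of the several-queries approach, something like the following could build one request per month, with the phrase in q= and the month in fq=. The core URL and the field names (`text_shingles`, `pub_date`) are made up for illustration and will need to match whatever the schema actually defines. Note that `numFound` is the number of matching documents in the month, not the number of times the term occurs.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical names: replace with the real core URL and schema fields.
SOLR_SELECT_URL = "http://localhost:8983/solr/newspapers/select"
TEXT_FIELD = "text_shingles"
DATE_FIELD = "pub_date"


def month_starts(year, month, count):
    """Yield (year, month) pairs for `count` consecutive months."""
    for _ in range(count):
        yield year, month
        month += 1
        if month > 12:
            year, month = year + 1, 1


def monthly_params(term, year, month):
    """Build the query parameters for one term in one calendar month."""
    start = f"{year:04d}-{month:02d}-01T00:00:00Z"
    # Solr date math: the range ends one month after the start.
    # The closing "}" makes the upper bound exclusive.
    date_filter = f"{DATE_FIELD}:[{start} TO {start}+1MONTH}}"
    return {
        "q": f'{TEXT_FIELD}:"{term}"',  # assumes the term contains no quotes
        "fq": date_filter,
        "rows": 0,  # only the count is needed, not the documents
        "wt": "json",
    }


def count_matches(params):
    """Send one request and return the number of matching documents."""
    with urlopen(f"{SOLR_SELECT_URL}?{urlencode(params)}") as resp:
        return json.load(resp)["response"]["numFound"]


def term_over_time(term, year, month, count):
    """Return [(YYYY-MM, document count), ...], one request per month."""
    return [
        (f"{y:04d}-{m:02d}", count_matches(monthly_params(term, y, m)))
        for y, m in month_starts(year, month, count)
    ]
```

Calling `term_over_time("climate change", 2010, 1, 60)` would then issue 60 requests covering five years. If that turns out to be too many round trips, Solr's range faceting (`facet.range` with `facet.range.gap=+1MONTH`) may be able to return all the monthly buckets from a single request, depending on how the date field is indexed.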

Anything more would be hard to say without knowing more about your content, how it has been indexed, and what you want to get back.

MatsLindh
  • OK Mat, thanks that gives me a bit more to work with. I will have a play and see what I can come up with. – Paul M Aug 28 '14 at 13:51