I have created a Solr field with the following index analyzer:

<analyzer type="index">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>              
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <filter class="solr.ShingleFilterFactory" minShingleSize="3" maxShingleSize="5"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern=".*_.*" replacement=""/>
</analyzer>

It creates shingles of the documents as expected. I want to get all the shingles for the documents matching a specific filter query, but I have not found a way to do that. I tried using Luke to inspect the index, but it gives me all the shingles in the index, not only those matching the filter query. Is there a way to get this data?

cheffe

1 Answer

Faceting on that field will give you all the tokens, each with a count of how many of the matching documents it occurs in. This might be sufficient.

If you are doing this for testing individual inputs, you can also just try it in the Web Admin UI's Analysis screen.
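As a rough sketch of the faceting approach, the snippet below builds a select URL that restricts the facet to the documents matched by the query. The core name `shingleTest`, the field `myText`, and the document id are taken from the comment on this answer; the helper name `shingle_facet_url` is made up for illustration. It assumes the standard facet parameters `facet.limit=-1` (no limit) and `facet.mincount=1`; the latter matters because, as far as I know, the default minimum count is 0, so terms that do not occur in the matched documents would otherwise still be listed with a count of 0.

```python
from urllib.parse import urlencode


def shingle_facet_url(base_url, core, field, doc_query):
    """Build a Solr select URL that facets on `field` for matching docs only.

    `shingle_facet_url` is a hypothetical helper, not part of any Solr client.
    """
    params = {
        "q": doc_query,         # the query must be passed as q=..., e.g. q=id:3232843
        "rows": 0,              # only the facet counts are needed, not the documents
        "wt": "json",
        "facet": "true",
        "facet.field": field,
        "facet.limit": -1,      # -1 asks for all terms rather than the top N
        "facet.mincount": 1,    # skip terms that do not occur in the matched docs
    }
    return f"{base_url}/{core}/select?{urlencode(params)}"


url = shingle_facet_url(
    "http://localhost:8983/solr", "shingleTest", "myText", "id:3232843"
)
print(url)
```

Note that the document restriction has to go in `q` (or `fq`); a bare `id:3232843` in the URL without a parameter name is not treated as the query.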

Alexandre Rafalovitch
  • I am using the following query: http://localhost:8983/solr/shingleTest/select?id:3232843&wt=json&indent=true&facet=true&facet.field=myText&facet.limit=10000 but it gives me all the shingles in the core. I only want the shingles stored for that specific document. Is there any way to achieve this? – Sanjay Lama Sep 03 '15 at 01:46
  • Facets are counting from the matches for the query. If your query only matches one document..... – Alexandre Rafalovitch Sep 04 '15 at 00:04