16

I want to find all documents in the index that have a certain field, regardless of the field's value. If at all possible using the query language, not the API.

Is there a way?

skaffman
Michael Böckling

3 Answers

5

If you know the type of data stored in your field, you can try a range query. For example, if your field contains string data, a query like field:[a* TO z*] would return all documents that have a string value in that field.

Pascal Dimassimo
  • This should work. It would be slightly more complex in case the field has values starting with numbers or capital letters. Should be easy to do with an OR query. – Shashikant Kore Sep 14 '10 at 16:58
  • Good point! But if it starts with a capital, a range starting with a* should catch it because the javadoc of TermRangeQuery states that it uses String.compareTo to determine if a string is part of the range. – Pascal Dimassimo Sep 14 '10 at 17:13
  • This looks good. Not sure about catching records starting with numbers, but this is a good start. Thanks! – Michael Böckling Sep 14 '10 at 17:33
  • 1
    The lexicographic order of strings puts numbers before letters, so a range [0* TO z*] would catch values starting with either letters or numbers. And capital letters appear before lower-case ones (contrary to what I said in my previous comment). Don't forget to check how your values are indexed: they may all be lower-cased! – Pascal Dimassimo Sep 14 '10 at 18:07
  • 1
    According to http://stackoverflow.com/questions/2686033/lucene-search-for-documents-that-have-a-particular-field/2726574#2726574 field:[* TO *] will also work; however, you may have to enable `SetAllowLeadingWildcard` on the `QueryParser`. – devios1 Mar 23 '11 at 15:46
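
The ordering discussed in these comments can be checked without an index. The following is a minimal sketch, assuming that range endpoints such as a* are compared as literal strings with String.compareTo (the '*' is not expanded), as the comment citing the TermRangeQuery javadoc suggests. The class name RangeCheck and the helper inRange are hypothetical and not part of Lucene; an analyzer or a different Lucene version may behave differently.

```java
// Sketch only: models inclusive range membership with String.compareTo.
// Assumption: endpoints like "a*" are literal strings, so '*' is an
// ordinary character (code point 42), not a wildcard.
final class RangeCheck {

    // True when lower <= value <= upper in String.compareTo order.
    static boolean inRange(String value, String lower, String upper) {
        return lower.compareTo(value) <= 0 && value.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        String[] values = {"apple", "Apple", "42", "_x", "#tag", "zebra"};
        for (String v : values) {
            // Print whether each sample value falls inside the two ranges.
            System.out.println(v
                + "  [a* TO z*]=" + inRange(v, "a*", "z*")
                + "  [0* TO z*]=" + inRange(v, "0*", "z*"));
        }
    }
}
```

Under that assumption, "Apple" and "42" fall outside [a* TO z*] but inside [0* TO z*], while "#tag" (which sorts before "0") and "zebra" (which sorts after the literal "z*") are missed by both ranges.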
3

I've done some experimenting, and it seems the simplest way to achieve this is to create a QueryParser and call SetAllowLeadingWildcard( true ) and search for field:* like so:

var qp = new QueryParser( Lucene.Net.Util.Version.LUCENE_29, field, analyzer );
qp.SetAllowLeadingWildcard( true );
var query = qp.Parse( "*" );

(Note that I am setting the default field of the QueryParser to field in its constructor, hence the search for just "*" in Parse().)

I cannot vouch for how efficient this method is compared with other methods, but since it is the simplest one I could find, I would expect it to be at least as efficient as field:[* TO *]. It also avoids hackish workarounds like field:[0* TO z*], which may not account for all possible values, such as values starting with non-alphanumeric characters.

devios1
2

Another solution is to use a ConstantScoreQuery with a FieldValueFilter:

new ConstantScoreQuery(new FieldValueFilter("field"))
Jerven