
I am trying to find the document with the latest date in Solr. Is there an efficient way of doing this with the Solr query syntax?

So far I have been reading the documentation, but I only get "numFound": 0 when I query any date in the fq parameter, even though that date does exist in a document.

Example query:

q box:

*:*

fq box:

date_published:"2019-02-28T11:57:29.926Z"

I defined this field in the schema like so:

<field name="date_published" type="pdate" indexed="true" stored="true" required="true"/>

<fieldType name="pdate" class="solr.DatePointField" docValues="true"/>

The above date does exist in a document, but the query returns "numFound":0. This issue is only a first step, though: what I actually want is to find the document with the latest date.

blah
  • Do you have an example that shows how you've tried querying using `fq` and what kind of sort you have applied to the query? Fetching the latest date should just be a matter of sorting by the date field and using `rows=1` to get the document (and if you use `fl=` with that field, it should be the only one returned) – MatsLindh Feb 09 '21 at 10:52
  • I get an error when I try to sort it by the date field name: ``` "msg":"Can't determine a Sort Order (asc or desc) in sort spec 'date_published', pos=16", "code":400}}``` – blah Feb 09 '21 at 11:00
  • `date_published desc` to order from newest to oldest – MatsLindh Feb 09 '21 at 11:08
  • Thanks, but if I change 'desc' to 'asc' it shows the same results. Is that weird? – blah Feb 09 '21 at 11:22
  • Yes, that sounds weird. It's hard to say without knowing your documents and how they differ. Did you change the field type after indexing? – MatsLindh Feb 09 '21 at 11:25
  • Yes, so basically I had to update from Solr 6 to Solr 7.6 with already indexed documents (50k docs). Since the field types were deprecated I had to change these according to the documentation. E.g. TrieDateField to DatePointField. – blah Feb 09 '21 at 11:31
  • That would require reindexing. If you just change the field type without changing the content, things will be broken in mysterious ways. The TrieDateField is still present in Solr 8, but deprecated (there are still use cases where it's better than the point date field). – MatsLindh Feb 09 '21 at 11:58
  • Do you know whether optimizing the cores could work instead of deleting all the data and reindexing? I got this idea from: https://stackoverflow.com/questions/6954358/how-to-optimize-solr-index – blah Feb 10 '21 at 11:46
  • No, that won't change anything. Indexing is a lossy process and optimizing only collapses the index files to a single index file and removes deleted documents (i.e. it optimizes the index). It does not change tokens or field types indexed. Tokens are generated from the input text, and that can be (usually is) a lossy process, so there is no decent way to do that automagically. If all fields are stored it could theoretically be done, but as far as I know there is no such operation yet. Retrieving and re-submitting each document would be necessary. – MatsLindh Feb 10 '21 at 13:26
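
For reference, the approach suggested in the comments (sort by the date field in descending order and ask for a single row) can be sketched as a small Python helper that only builds the request URL and sends nothing. The host, the core name `mycore`, and the function names are hypothetical and chosen for illustration. As noted in the comments, the sort is only expected to behave correctly once the documents have been reindexed after the field type change.

```python
from urllib.parse import urlencode


def latest_document_params(date_field="date_published", only_date=False):
    """Build Solr select parameters that ask for the single newest document."""
    params = {
        "q": "*:*",                    # match every document
        "sort": f"{date_field} desc",  # newest first; the direction is mandatory
        "rows": 1,                     # keep only the top hit
    }
    if only_date:
        # Restrict the returned fields to the date field itself.
        params["fl"] = date_field
    return params


def select_url(base_url, params):
    """Append URL-encoded parameters to the core's /select handler."""
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical local core; adjust host and core name to your setup.
    print(select_url("http://localhost:8983/solr/mycore", latest_document_params()))
```

Leaving out `desc` (or `asc`) in the `sort` value is what produces the "Can't determine a Sort Order" error quoted above.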

0 Answers