0

I have multiple documents that have dates and timestamps really close relative to each other. Here is the data I have:

/* 1 */
{
    "startDate" : ISODate("2022-11-29T14:28:38.166Z"),
    "startTimestamp" : 1669732118166,
    "customerName" : "Eduardo Bechtelar"
}

/* 2 */
{
    "startDate" : ISODate("2022-11-29T14:28:38.258Z"),
    "startTimestamp" : 1669732118258,
    "customerName" : "Sylvia Wolf"
}

/* 3 */
{
    "startDate" : ISODate("2022-11-29T14:28:38.284Z"),
    "startTimestamp" : 1669732118284,
    "customerName" : "Jeremy McLaughlin"
}

/* 4 */
{
    "startDate" : ISODate("2022-11-29T14:28:38.298Z"),
    "startTimestamp" : 1669732118298,
    "customerName" : "Laura Lynch"
}

/* 5 */
{
    "startDate" : ISODate("2022-11-29T14:28:38.311Z"),
    "startTimestamp" : 1669732118311,
    "customerName" : "Noel Lubowitz"
}

/* 6 */
{
    "startDate" : ISODate("2022-11-29T14:28:38.435Z"),
    "startTimestamp" : 1669732118435,
    "customerName" : "Natalie Cummerata DVM"
}

Here is the index definition:

{
  "mappings": {
    "dynamic": false,
    "fields": {
      "customerName": {
        "type": "autocomplete"
      },
      "startDate": {
        "type": "date"
      },
      "startTimestamp": {
        "representation": "int64",
        "type": "number"
      }
    }
  }
}

As you can see the timestamps are really closed to each other. The difference is in milliseconds. What I'm trying to do is to sort the items asc or desc based on start timestamp using near operator. What I saw is that search index cannot handle big numbers using timestamps.

I tried to sort items descending using the near operator, like this:

var now = Date.now()

db.getCollection('search-tests').aggregate([
    {
        $search: {
            index: "testingSearchIndex",
            compound: {
                should: [
                       {
                           near: {
                               path: "startTimestamp",
                               origin: now,
                               pivot: 1,
                           }
                       },
                ]
            }
        }
    },
    {
        $project: {
            startDate: 1,
            startTimestamp: 1,
            score: { $meta: "searchScore" },
        }
    }
])

I was expecting to see items sorted descending based on startTimestamp. But the sorting is kind of random. What am I doing wrong?

1 Answers1

0

For this to work, you need to add a large boost factor, which is somewhat hacky but less stress on disk.

var now = Date.now()

db.getCollection('search-tests').aggregate([
    {
        $search: {
            index: "testingSearchIndex",
            compound: {
                should: [
                       {
                           near: {
                               path: "startTimestamp",
                               origin: now,
                               pivot: 1,
                               score: { "boost": { "value": 1000 } }
                           }
                       },
                ]
            }
        }
    },
    {
        $project: {
            startDate: 1,
            startTimestamp: 1,
            score: { $meta: "searchScore" },
        }
    }
])

There's a beta sort coming using Lucene. That could be a better option depending on the data size and the associated workload.

Nice-Guy
  • 1,457
  • 11
  • 20
  • Thanks for your answer! Just tried what you suggested and the sort is still random, all items having the same score. – Andrei Batinas Feb 25 '23 at 06:29
  • We had a private preview of the sort using Lucene that took a while to build safely. It's now out. I can fix the answer. This was a workaround. – Nice-Guy Jun 29 '23 at 06:15