I have a feeds collection with documents like this:

{
  "created": 1510000000,
  "find": [
    "title of the document",
    "body of the document"
  ],
  "filter": [
    "/example.com",
    "-en"
  ]
}
  • created contains an epoch timestamp
  • find contains an array of fulltext snippets, e.g. the title and the body of a text
  • filter is an array with further search tokens, such as hashtags, domains, locales

The problem is that find contains full-text snippets, which we want to tokenize (e.g. with a text analyzer), whereas filter contains final tokens, which we want to compare as a whole (e.g. with the identity analyzer).

The goal is to cover both find and filter, either with a single custom analyzer or by combining two analyzers, for example in two SEARCH expressions.

I managed to query by either find or filter successfully, but not by both at once. This is how I query by filter:

I created a feeds_search view:

{
  "writebufferIdle": 64,
  "type": "arangosearch",
  "links": {
    "feeds": {
      "analyzers": [
        "identity"
      ],
      "fields": {
        "find": {},
        "filter": {},
        "created": {}
      },
      "includeAllFields": false,
      "storeValues": "none",
      "trackListPositions": false
    }
  },
  "consolidationIntervalMsec": 10000,
  "writebufferActive": 0,
  "primarySort": [],
  "writebufferSizeMax": 33554432,
  "consolidationPolicy": {
    "type": "tier",
    "segmentsBytesFloor": 2097152,
    "segmentsBytesMax": 5368709120,
    "segmentsMax": 10,
    "segmentsMin": 1,
    "minScore": 0
  },
  "cleanupIntervalStep": 2,
  "commitIntervalMsec": 1000,
  "id": "362444",
  "globallyUniqueId": "hD6FBD6EE239C/362444"
}

and I created a sample query:

FOR feed IN feeds_search
SEARCH ANALYZER(feed.created < 9990000000 AND feed.created > 1500000000 
AND (feed.find == "title of the document")
AND (feed.`filter` == "/example.com" OR feed.`filter` == "-uk"), "identity")
SORT feed.created
LIMIT 20
RETURN feed

The sample query works because, with the identity analyzer, find is indexed as the complete string, so it matches the full text. As soon as I switch to a text analyzer, single-word tokens work for find, but filter no longer matches.

I tried a combination of SEARCH and FILTER, which gives me the desired result, but I assume it performs worse than letting SEARCH handle the whole condition. I see that analyzers is an array in the view definition, but I don't seem to be able to assign individual fields to each analyzer.

Oliver Hausler

1 Answer

An analyzers property can also be set on each individual field in fields. The analyzers value specified at the link level is the default; it is used for any field that does not set a more specific analyzer of its own.

      "analyzers": [
        "identity"
      ],
      "fields": {
        "find": {
          "analyzers": [
            "text_en"
          ]
        },
        "filter": {},
        "created": {}
      },

Credits: Simran at ArangoDB
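
With a view configured like that, a query can apply a different analyzer to each part of the SEARCH expression. The following is a minimal sketch and not part of the original answer. It assumes that find is indexed with text_en and filter with identity, as in the fragment above, and the search terms are only illustrative. Numeric range comparisons on created do not depend on the analyzer.

```aql
FOR feed IN feeds_search
  SEARCH feed.created > 1500000000 AND feed.created < 9990000000
    // find is indexed with text_en: match any token of the search phrase
    AND ANALYZER(feed.find IN TOKENS("title document", "text_en"), "text_en")
    // filter is indexed with identity: compare the stored tokens as a whole
    AND ANALYZER(feed.`filter` == "/example.com" OR feed.`filter` == "-uk", "identity")
  SORT feed.created
  LIMIT 20
  RETURN feed
```

Since identity is the default analyzer in a SEARCH expression, the second ANALYZER() wrapper could be dropped; it is spelled out here only to make the contrast with text_en explicit.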

Oliver Hausler