
Well guys, I hope you're doing fine in these epidemic times. I'm having trouble handling special characters in an Elasticsearch query. Here is what I want to do:

SELECT * FROM table WHERE ext LIKE '%6500%' AND start_time LIKE '%-01-%'

Here is what I did:

   "query": {
       "bool": {
           "must": [
               {
                   "query_string": {
                       "query": "*6500*",
                       "fields": [
                           "extension"
                       ],
                       "analyze_wildcard": true
                   }
               },
               {
                   "query_string": {
                       "query": "*\\-01\\-*",
                       "fields": [
                           "start_time"
                       ],
                       "analyze_wildcard": true
                   }
               }
           ]
       }
   }

The first one works but the second doesn't give what I want. By the way, the field start_time looks like this, for example: 2020-01-03 15:03:45, and it's a keyword type (I found it like that).

James Z
2 Answers


If you're forced to use the keyword type in your start_time, the following works -- no need for the leading & trailing wildcards since your start_time will adhere to a certain format:

GET karim/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "-01-",
            "fields": [
              "start_time"
            ]
          }
        }
      ]
    }
  }
}
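If the leading and trailing wildcards do turn out to be necessary, a dedicated wildcard query on the keyword field sidesteps the query_string operator parsing (in query_string syntax a leading - is otherwise interpreted as the NOT operator). A sketch, reusing the karim index from above:

GET karim/_search
{
  "query": {
    "wildcard": {
      "start_time": {
        "value": "*-01-*"
      }
    }
  }
}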

It's advisable, though, to use date whenever working with date(time)s. So set your index up like so:

PUT karim
{
  "mappings": {
    "properties": {
      "start_time": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss"
      }
    }
  }
}
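With that mapping in place, a document in the question's format indexes cleanly, for example:

POST karim/_doc
{
  "start_time": "2020-01-03 15:03:45"
}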

and query like so

GET karim/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "start_time": {
              "gte": "01",
              "lt": "02",
              "format": "MM"
            }
          }
        }
      ]
    }
  }
}

for the month of January of any given year. Adjust the format to match a specific year etc.
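For instance, to pin the range down to January of one specific year, you can spell out full timestamps instead of a partial format (a sketch using the sample value from the question):

GET karim/_search
{
  "query": {
    "range": {
      "start_time": {
        "gte": "2020-01-01 00:00:00",
        "lt": "2020-02-01 00:00:00"
      }
    }
  }
}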

This approach is guaranteed to be faster than wildcard textual queries, especially when you're querying multiple ranges and possibly intend to aggregate down the road.

Joe - GMapsBook.com
  • Thank you for responding. Actually, I'm doing a bulk request every day with data taken from a local IPBX server; I used the Elasticsearch Java API to do so. While trying to put mappings, I got errors saying that mappings are deprecated, the index is already created, and so on, even though I put the mapping as advised in the documentation. So I left it as it is, intending to work with it as I thought I could in the question, but it didn't work. Now I'm dumping all the indexes and reindexing them, slicing the date: putting the year, month, and day each in a single parameter. – Karim Barrane May 24 '20 at 19:39
  • OK. So are you saying the question is not relevant anymore? – Joe - GMapsBook.com May 24 '20 at 21:26
  • Yes, I tried the method that you gave me, but unfortunately it didn't work for me – Karim Barrane May 24 '20 at 23:59

You are indexing your field with type text and a sub-field of keyword type. Text fields are broken into tokens, e.g. "2020-01-12" will be stored as ["2020","01","12"]. You need to run your query on the keyword field using "start_time.keyword":

{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "*-01-*",
            "fields": [
              "start_time.keyword"
            ],
            "analyze_wildcard": true
          }
        }
      ]
    }
  }
}
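You can see the tokenization described above for yourself with the _analyze API (using the standard analyzer, which is the default for text fields):

GET _analyze
{
  "analyzer": "standard",
  "text": "2020-01-12"
}

This returns the tokens 2020, 01 and 12 — which is why a wildcard like *-01-* can never match across the dashes on the analyzed text field.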

As @Joe mentioned, wildcard queries have poor performance; it is better to use a date field.

jaspreet chahal