I have documents that contain Turkish words such as "şa, za, sb, şc, sd, şe" in the customer_address property.

I indexed my documents as described in the "Sorting and Collations" documentation, because I want to sort documents by the customer_address field. Sorting works well.

Now I'm trying to run a range query on the "customer_address" field. When I send the query below, I get an empty result. (expected result: sb, sd, şa, şd)

curl -XGET http://localhost:9200/sampleindex/_search?pretty -d '{"query":{"bool":{"filter":[{"range":{"customer_address.sort":{"from":"plaj","to":"şcam","include_lower":true,"include_upper":true,"boost":1.0}}}],"disable_coord":false,"adjust_pure_negative":true,"boost":1.0}}}'

When I ran the aggregation below, I saw that the field values are stored in an encoded form (collation keys), as described in the documentation.

curl -XGET http://localhost:9200/sampleindex/_search?pretty -d '{"aggs":{"myaggregation":{"terms":{"field":"customer_address.sort","size":10000}}},"size":0}'

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
"hits" : {
    "total" : 6,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "myaggregation" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "⚕䁁䀠怀\u0001",
          "doc_count" : 1
        },
        {
          "key" : "⚗䁁䀠怀\u0001",
          "doc_count" : 1
        },
        {
          "key" : "✁ੀ⃀ၠ\u0000\u0000",
          "doc_count" : 1
        },
        {
          "key" : "✁ୀ⃀ၠ\u0000\u0000",
          "doc_count" : 1
        },
        {
          "key" : "✁ీ⃀ၠ\u0000\u0000",
          "doc_count" : 1
        },
        {
          "key" : "ⶔ䁁䀠怀\u0001",
          "doc_count" : 1
        }
      ]
    }
  }
}

So, how should I pass the bounds in the range query to get the expected result?

Thanks in advance.

My Mapping:

curl -XGET http://localhost:9200/sampleindex?pretty
{
  "sampleindex" : {
    "aliases" : { },
    "mappings" : {
      "invoice" : {
        "properties" : {
          "customer_address" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword"
              },
              "sort" : {
                "type" : "text",
                "analyzer" : "turkish",
                "fielddata" : true
              }
            }
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "number_of_shards" : "5",
        "provided_name" : "sampleindex",
        "max_result_window" : "2147483647",
        "creation_date" : "1521732167023",
        "analysis" : {
          "filter" : {
            "turkish_phonebook" : {
              "variant" : "@collation=phonebook",
              "country" : "TR",
              "language" : "tr",
              "type" : "icu_collation"
            },
            "turkish_lowercase" : {
              "type" : "lowercase",
              "language" : "turkish"
            }
          },
          "analyzer" : {
            "turkish" : {
              "filter" : [
                "turkish_lowercase",
                "turkish_phonebook"
              ],
              "tokenizer" : "keyword"
            }
          }
        },
        "number_of_replicas" : "1",
        "uuid" : "ChNGX459TUi8VnBLTMn-Ng",
        "version" : {
          "created" : "5020099"
        }
      }
    }
  }
}
gul.cabuk

1 Answer

I solved my problem by defining an analyzer with a char filter at index creation. I don't know whether it is a good solution, but I could not solve it with the ICU "turkish_phonebook" filter, and this approach seems to work for now.

First, I created an index with a "turkish_collation_analyzer". Then, for each property that needs it, I added a sub-field "property.tr" that uses this analyzer. Finally, for range queries, I convert the query values into the form this field expects.

"settings": {
  "index": {
    "number_of_shards": "5",
    "provided_name": "sampleindex",
    "max_result_window": "2147483647",
    "creation_date": "1522050241730",
    "analysis": {
      "analyzer": {
        "turkish_collation_analyzer": {
          "char_filter": [
            "turkish_char_filter"
          ],
          "tokenizer": "keyword"
        }
      },
      "char_filter": {
        "turkish_char_filter": {
          "type": "mapping",
          "mappings": [
            "a => x01",
            "b => x02",
            .,
            .,
            .,

          ]
        }
      }
    },
    "number_of_replicas": "1",
    "uuid": "hiEqIpjYTLePjF142B8WWQ",
    "version": {
      "created": "5020099"
    }
  }
}
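
The last step, converting the range bounds the same way the char filter converts indexed values, could be sketched in Python roughly as follows. This is only an illustration under assumptions: the real mapping table is elided above, so the table here (Turkish alphabet order, codes in the same "x01", "x02" style) is hypothetical, and the field name "customer_address.tr" and the helper names are made up for the example.

```python
# Lowercase Turkish alphabet in dictionary order. The answer's real mapping
# table is elided, so this table is an assumption made for illustration.
TURKISH_ALPHABET = "abcçdefgğhıijklmnoöprsştuüvyz"

# Hypothetical fixed-width codes in the same "x01", "x02", ... style.
CODES = {ch: "x%02x" % i for i, ch in enumerate(TURKISH_ALPHABET, start=1)}


def to_sort_key(text):
    """Replace each mapped letter with its code; other characters pass through."""
    return "".join(CODES.get(ch, ch) for ch in text)


def char_filter_mappings():
    """Entries in the 'source => target' form used by a mapping char filter."""
    return ["%s => %s" % (ch, code) for ch, code in CODES.items()]


def range_query(field, lower, upper):
    """Build a range query body whose bounds are converted like indexed values."""
    bounds = {"gte": to_sort_key(lower), "lte": to_sort_key(upper)}
    return {"query": {"bool": {"filter": [{"range": {field: bounds}}]}}}


if __name__ == "__main__":
    import json
    # "customer_address.tr" is a hypothetical sub-field name.
    print(json.dumps(range_query("customer_address.tr", "plaj", "şcam")))
```

Because every code has the same width and increases with alphabet position, plain byte-wise comparison of the converted strings follows Turkish letter order, which is what the range query needs. Characters outside the table (uppercase letters, digits, spaces) are left unchanged in this sketch and would need their own entries.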
gul.cabuk