
I'm using Elasticsearch and creating an index with the settings and mapping below. The problem is that my field `geography.locality`, which should use `name_analyser`, doesn't seem to use it.

{
  "index": "programs",
  "body": {
    "settings": {
      "number_of_shards": 5,
      "analysis": {
        "filter": {
          "elision": {
            "type": "elision",
            "articles": [
              "l",
              "m",
              "t",
              "qu",
              "n",
              "s",
              "j",
              "d"
            ]
          },
          "multi_words": {
            "type": "shingle",
            "min_shingle_size": 2,
            "max_shingle_size": 10
          },
          "name_filter": {
            "type": "edgeNGram",
            "max_gram": 100,
            "min_gram": 2
          }
        },
        "tokenizer": {
          "name_tokenizer": {
            "type": "edgeNGram",
            "max_gram": 100,
            "min_gram": 2
          }
        },
        "analyser": {
          "name_analyser": {          // <-- analyser I want to use on geography.locality
            "tokenizer": "whitespace",
            "type": "custom",
            "filter": [
              "lowercase",
              "multi_words",
              "name_filter",
              "asciifolding"
            ]
          },
          "french": {
            "tokenizer": "letter",
            "filter": [
              "asciifolding",
              "lowercase",
              "elision",
              "stop"
            ]
          },
          "city_name": {
            "type": "custom",
            "tokenizer": "letter",
            "filter": [
              "lowercase",
              "asciifolding"
            ]
          }
        }
      }
    },
    "mappings": {
      "program": {
        "properties": {
          "nid": {
            "type": "integer",
            "index": "not_analyzed"
          },
          "title": {
            "type": "string"
          },
          "language": {
            "type": "string",
            "index": "not_analyzed"
          },
          "regulation": {
            "type": "integer"
          },
          "sales_state": {
            "type": "integer"
          },
          "enabled_dwell": {
            "type": "boolean"
          },
          "enabled_invest": {
            "type": "boolean"
          },
          "delivery_date": {
            "type": "date"
          },
          "address": {
            "properties": {
              "country": {
                "type": "string",
                "index": "not_analyzed"
              },
              "locality": {
                "type": "string",
                "analyser": "name_analyser"
              },
              "postal_code": {
                "type": "integer"
              },
              "thoroughfare": {
                "type": "string",
                "index": "not_analyzed"
              },
              "premise": {
                "type": "string",
                "index": "not_analyzed"
              }
            }
          },
          "location": {
            "type": "geo_point"
          },
          "geography": {
            "properties": {
              "locality": {
                "type": "string",
                "analyser": "name_analyser"  // ... here :-/
              },
              "department": {
                "type": "string",
                "index": "not_analyzed"
              },
              "region": {
                "type": "string",
                "index": "not_analyzed"
              }
            }
          },
          "lots": {
            "type": "nested",
            "include_in_all": false,
            "properties": {
              "lot_type": {
                "type": "integer"
              },
              "rooms": {
                "type": "integer"
              },
              "price_vat_inc": {
                "type": "integer"
              },
              "price_reduced_vat_inc": {
                "type": "integer"
              },
              "price_vat_ex": {
                "type": "integer"
              }
            }
          }
        }
      }
    }
  }
}

Here's the output given by ES for the mapping registered for this index.

{
  "program": {
    "properties": {
      "address": {
        "properties": {
          "country": {
            "index": "not_analyzed",
            "type": "string"
          },
          "premise": {
            "index": "not_analyzed",
            "type": "string"
          },
          "locality": {
            "type": "string"
          },
          "postal_code": {
            "type": "integer"
          },
          "thoroughfare": {
            "index": "not_analyzed",
            "type": "string"
          }
        }
      },
      "sales_state": {
        "type": "integer"
      },
      "nid": {
        "type": "integer"
      },
      "language": {
        "index": "not_analyzed",
        "type": "string"
      },
      "title": {
        "type": "string"
      },
      "enabled_invest": {
        "type": "boolean"
      },
      "geo_point": {
        "type": "string"
      },
      "lots": {
        "include_in_all": false,
        "type": "nested",
        "properties": {
          "rooms": {
            "include_in_all": false,
            "type": "integer"
          },
          "price_vat_inc": {
            "include_in_all": false,
            "type": "integer"
          },
          "price_vat_ex": {
            "include_in_all": false,
            "type": "integer"
          },
          "lot_type": {
            "include_in_all": false,
            "type": "integer"
          },
          "price_reduced_vat_inc": {
            "include_in_all": false,
            "type": "integer"
          }
        }
      },
      "enabled_dwell": {
        "type": "boolean"
      },
      "delivery_date": {
        "format": "dateOptionalTime",
        "type": "date"
      },
      "regulation": {
        "type": "integer"
      },
      "geography": {
        "properties": {
          "locality": {
            "type": "string"      // name_analyser should show up here right?????
          },
          "department": {
            "index": "not_analyzed",
            "type": "string"
          },
          "region": {
            "index": "not_analyzed",
            "type": "string"
          }
        }
      },
      "location": {
        "type": "geo_point"
      }
    }
  }
}

Does anybody know what I am doing wrong? I'm kind of lost about this.

jchatard
  • The mapping ES returns contains `"geo_point": { "type": "string" }` which is not defined in your static mapping. And there are some other fields that don't match the settings in your static mapping, not only `locality` and `geo_point`. How are you creating the index? It should be something like `PUT /programs { "settings": { "analysis": { "analyzer": {....`. Where is `{ "index": "programs", "body": {` coming from in your output? – Andrei Stefan May 06 '15 at 10:33
  • I'm using a php client library, so I'm just using something like: `$client->indices()->create($params);` – jchatard May 06 '15 at 10:53
  • Then I'd say there is an issue in how you create the index. I'd suggest testing the index creation only, outside your php client. I'd say it should work. After this, look closely into the php code for creating it. – Andrei Stefan May 06 '15 at 10:59
  • Ok, I'll give it a try then and update here. – jchatard May 06 '15 at 11:24
  • Ok, I just did a clean index creation from kopf, using PUT, and blabla. The mapping result is exactly the same... So I guess the php-client is not involved. – jchatard May 06 '15 at 12:40
  • Do you have any templates that might match the index you are creating? – Andrei Stefan May 06 '15 at 12:46
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/77090/discussion-between-jchatard-and-andrei-stefan). – jchatard May 06 '15 at 12:48
  • Can you provide the exact command (PUT + JSON) you used to create the index from kopf? – Andrei Stefan May 06 '15 at 13:22
  • `/programs` using PUT, with this JSON https://gist.github.com/jchatard/a52e1b38e50dc3877e59 – jchatard May 06 '15 at 13:50

2 Answers


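I am guessing that the index already exists and you are trying to update its settings with a new analyzer. This is not permitted on an open index: analysis settings can only be changed while the index is closed, after which it has to be reopened.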

Do you have any errors when you submit the updated settings?

Have a look at this thread - Change settings and mappings on existing index in Elasticsearch

and here http://www.elastic.co/guide/en/elasticsearch/reference/1.x/indices-update-settings.html#update-settings-analysis

jrao77

You have a typo :-), actually two:

    "locality": {
      "type": "string",
      "analyser": "name_analyser"
    },

in both `address` and `geography`. The mapping parameter should be `analyzer`, not `analyser` (with an s).

Also, the same here:

    "analyser": {
      "name_analyser": {
        "tokenizer": "whitespace",
    ...
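
For reference, a trimmed sketch of how the request body might look with both keys corrected. It assumes the index is created fresh with `PUT /programs` and keeps only the pieces needed for `geography.locality`; the other filters, analyzers and fields from the question are left out for brevity. The analyzer's own name, `name_analyser`, is just a label chosen in the question, so it can keep its spelling as long as the settings and the mapping refer to the same name.

```json
{
  "settings": {
    "analysis": {
      "filter": {
        "multi_words": {
          "type": "shingle",
          "min_shingle_size": 2,
          "max_shingle_size": 10
        },
        "name_filter": {
          "type": "edgeNGram",
          "max_gram": 100,
          "min_gram": 2
        }
      },
      "analyzer": {
        "name_analyser": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [
            "lowercase",
            "multi_words",
            "name_filter",
            "asciifolding"
          ]
        }
      }
    }
  },
  "mappings": {
    "program": {
      "properties": {
        "geography": {
          "properties": {
            "locality": {
              "type": "string",
              "analyzer": "name_analyser"
            }
          }
        }
      }
    }
  }
}
```

If this is accepted, the mapping returned for the index should list `"analyzer": "name_analyser"` under `geography.locality`; the same change applies to `address.locality`.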
Andrei Stefan