
I have a problem with the Elasticsearch indexer. I am trying to index the title "Ali Baba et les quarante voleurs", but with the "beider_morse" phonetic filter I get this error:

"IllegalArgumentException[TokenStream expanded to 1024 finite strings. Only <= 256 finite strings are supported]"

I have this configuration:

{
    "settings": {
        "index": {
            "analysis": {
                "filter": {
                    "beider_morse": {
                        "type": "phonetic",
                        "encoder": "beider_morse",
                        "languageset": ["french"]
                    }
                },
                "analyzer": {
                    "personId-person": {
                        "tokenizer": "standard",
                        "filter": "asciifolding"
                    },
                    "beider_morse-title": {
                        "tokenizer": "whitespace",
                        "filter": "beider_morse"
                    }
                }
            },
            "number_of_shards": 4,
            "number_of_replicas": 1
        }
    },
    "mappings": {
        "film": {
            "properties": {
                "personIds": {
                    "type": "string",
                    "fields": {
                        "personId": {
                            "type": "string",
                            "analyzer": "personId-person"
                        }
                    }
                },
                "title": {
                    "type": "string",
                    "fields": {
                        "completionb": {
                            "type": "completion",
                            "analyzer": "beider_morse-title",
                            "max_input_length": 50,
                            "payloads": false,
                            "preserve_separators": true,
                            "preserve_position_increments": true
                        }
                    }
                },
                "poster": {
                    "type": "string",
                    "index": "not_analyzed"
                }
            }
        }
    }
}

And the document being indexed:

{"title":"Ali Baba et les quarante voleurs","poster":["35362","http:\/\/demo.com\/g456fg879.jpg","2012-02-13 18:13:28.468422","Ali Baba et les quarante voleurs"],"personIds":["37058","37059","37060"]}

Do you have an idea how to increase this limit or otherwise fix this?
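The limit is hit because the completion field flattens the analyzed token stream into every possible path through the per-token alternatives, and Beider-Morse emits several phonetic variants per word, so the count multiplies across the six words of the title. A minimal sketch of that combinatorial growth (the variant lists below are invented purely for illustration, not actual Beider-Morse output):

```python
from itertools import product

# Hypothetical phonetic variants per token of "Ali Baba et les quarante voleurs".
# Beider-Morse typically emits multiple alternatives per word; these particular
# strings and counts are made up to illustrate the expansion, not real encodings.
variants = {
    "Ali": ["ali", "alj"],
    "Baba": ["baba", "bava"],
    "et": ["et"],
    "les": ["les", "lez"],
    "quarante": ["karant", "kvarant", "korant", "kuarant"],
    "voleurs": ["volur", "volurz", "valur", "folur"],
}

# The completion suggester expands the stream into every path through the
# variants, i.e. the Cartesian product of the per-token alternatives.
finite_strings = list(product(*variants.values()))
print(len(finite_strings))  # 2 * 2 * 1 * 2 * 4 * 4 = 128 paths from six words
```

Even these modest made-up counts give 128 strings; slightly richer variants per word easily push past the 256 finite-string cap, which is why a multi-word title fails while single words index fine.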

user3253361
  • I have never had the occasion to try it, but try adding the [`max_token_length`](http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-standard-analyzer.html) to the `analyzer`. – pickypg Apr 13 '14 at 19:35
  • I just tested your suggestion and had hopes for it, but unfortunately it still does not work :( I tried setting it on the analyzer directly, via a custom tokenizer, and both together, without results. – user3253361 Apr 14 '14 at 09:47

0 Answers