The type of the "level" field in the document was changed from "keyword" to "short", and I'm trying to reindex the existing data so that I can use it in Kibana charts. The old data contains values like "100%", "error", or just an empty string "".

I want only integers in the new index. I use the reindex API (line breaks added to make the snippet more readable):

curl -s -X POST -H 'Content-Type: application/json' https://search-host.us-east-1.es.amazonaws.com/_reindex -d '{
  "source": {
    "index": "old-index"
  },  
  "dest": {
    "index": "new-index"
  },  
  "script": {
    "inline": "
        if (ctx._source.level == \"error\" || ctx._source.level == \"\")
        {
            ctx._source.level = -1
        } else {
            ctx._source.level = Integer.valueOf(ctx._source.level)
        }
    "
  }
}'

But I'm getting the error: "java.lang.String cannot be cast to java.lang.Number" because of the "%" symbol at the end of a value.

I also don't have regular expressions enabled on AWS Elasticsearch, and as far as I can tell it's not possible to enable them there, so the replaceAll variant doesn't work for me. On a self-hosted ES it could be something like this (untested): /(%)?/.matcher(doc['level'].value).replaceAll('$1')

But from AWS ES I see this:

Regexes are disabled. Set [script.painless.regex.enabled] to [true] in elasticsearch.yaml to allow them. Be careful though, regexes break out of Painless's protection against deep recursion and long loops.

Is it possible to replace part of a string in Painless without regular expressions?

kivagant
  • The same question in the Discuss ES: https://discuss.elastic.co/t/how-to-replace-string-without-regexp-inside-painless-inline-script-for-aws-elasticsearch/110707 – kivagant Dec 13 '17 at 09:18

2 Answers

"script": {
    "lang":"painless",
    "source": """

      // Replace every occurrence of oldValue in word with newValue, without regexes.
      String replace(String word, String oldValue, String newValue) {
        // Split on the literal token, then rejoin the pieces with the new value.
        // If oldValue does not occur, word comes back unchanged.
        String[] pieces = word.splitOnToken(oldValue);
        return String.join(newValue, Arrays.asList(pieces));
      }

      // Usage sample
      ctx._source["date"] = replace(ctx._source["date"], "+0000", "Z");

    """
}
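
For the "level" field from the question, the same idea can be sketched without any regex by checking for a trailing "%" and cutting it off before parsing. Painless syntax is close to Java, so the logic below is written as plain Java to make it easy to check. The names `LevelParser` and `parseLevel` are hypothetical, and mapping every unparseable value to -1 is an assumption that extends the question's handling of "error" and "". Whether each String method is available in your Painless version should be verified against its API reference.

```java
class LevelParser {
    // Strip one trailing '%' (if present) and parse the rest as an integer.
    // Returns -1 for null, "", "error", or anything that is not a plain integer.
    static int parseLevel(String level) {
        if (level == null || level.isEmpty() || level.equals("error")) {
            return -1;
        }
        // Literal suffix check and substring, no regular expression involved.
        String digits = level.endsWith("%")
                ? level.substring(0, level.length() - 1)
                : level;
        try {
            return Integer.parseInt(digits);
        } catch (NumberFormatException e) {
            // Assumption: unexpected values share the same sentinel as "error".
            return -1;
        }
    }
}
```

In a reindex script, the body of `parseLevel` would be applied to ctx._source.level and the result assigned back to the same field.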
codelovesme

I was trying to do the same thing: a full find-and-replace on a string field in one of my indexes. Unfortunately, I didn't have access to regular expressions either.

This is the solution I came up with, using an ingest pipeline which looks like this:

PUT _ingest/pipeline/my-pipeline-id
{
    "description": "Used to update in place",
    "processors": [
        {
            "grok": {
                "field": "myField",
                "patterns": ["%{PART1:field1}%{REMOVAL}%{PART2:field2}"],
                "pattern_definitions": {
                    "PART1": "start",
                    "REMOVAL": "(toRemove){0,1}",
                    "PART2": ".+"
                },
                "ignore_missing": true
            }
        },
        {
            "script": {
                "lang": "painless",
                "inline": "ctx.myField = ctx.field1 + ctx.field2"
            }
        },
        {
            "script": {
                "lang": "painless",
                "inline": "ctx.remove('field1'); ctx.remove('field2')"
            }
        }
    ]
}

Then you run it (I did it with an update by query):

POST /index/type/_update_by_query?pipeline=my-pipeline-id
{
    "query": {
        "match": {
            "id": "123456789"
        }
    }
}

Please note

I am using ES 5.5. Some syntax has changed in version 6, but the process stays the same.

blo0p3r
  • Thank you, this looks interesting. At the moment I have added an additional field with the "_int" suffix and just put integer values there from AWS Lambda. And I left all old data as is with no changes. I hope that ES will add some simple library of functions to work with strings. – kivagant Dec 13 '17 at 09:16