I'm trying to parse an Elasticsearch string field (named Request.Body) which contains XML. The field holds a SOAP request string like this:
<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:ns1="myURL">
    <SOAP-ENV:Body>
        <ns1:find>
            <token>myData</token>
            <login>myData</login>
            <language>myData</language>
            <search>myData</search>
            <contains>false</contains>
        </ns1:find>
    </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
My goal is to extract the value of the search tag in a Kibana scripted field, using the Painless language. I tried this:
def field = doc['Request.Body'].value;
if (field != null) {
    def matcher = /<search>(.*)<\/search>/.matcher(field);
    if (matcher.find()) {
        return matcher.group(1);
    }
    return "No match";
}
return "No field";
This code always returns No match.
To debug, I tried returning the value of doc['Request.Body'].value directly; in this example it returns only 1.0 instead of my full XML.
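For reference, the debug version of the scripted field is just this single line (nothing else in the script):

// Returns only "1.0" here instead of the whole SOAP envelope.
return doc['Request.Body'].value;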
I also tried to concatenate the entries of the values list of this object, with this code:
def field = doc['Request.Body'].getValues().stream().collect(Collectors.joining(""));
if (field != null) {
    def matcher = /<search>(.*)<\/search>/.matcher(field);
    if (matcher.find()) {
        return matcher.group(1);
    }
    return "No match";
}
return "No field";
Now the field variable is equal to the concatenation of the XML tag values, but I lose the XML tags themselves, so I can't extract my data with the regex, and like the first script it always returns No match.
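To make that easier to see, here is a small debugging variant (just a sketch, using an arbitrary " | " separator) that returns the individual values that doc['Request.Body'] actually holds:

// Joins each doc value with a visible separator; none of the entries is the full XML string.
def parts = doc['Request.Body'].getValues().stream().collect(Collectors.joining(" | "));
return parts;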
So my question is: how can I get the full XML value of my field into a variable in my script? And why is Elasticsearch "parsing" my XML?
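Ideally I would like to be able to do something like the following (only a sketch; I don't know whether _source access is even possible in a Kibana scripted field, and it assumes Request.Body is stored as an object field in _source):

// Hypothetical: read the raw string from _source instead of doc values.
def body = params['_source']['Request']['Body'];
def matcher = /<search>(.*)<\/search>/.matcher(body);
return matcher.find() ? matcher.group(1) : "No match";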
Any help would be appreciated. Thanks.