I'm trying to parse an Elasticsearch string field (named Request.Body) which contains XML. The field holds a SOAP request string like this:
<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:ns1="myURL">
    <SOAP-ENV:Body>
        <ns1:find>
            <token>myData</token>
            <login>myData</login>
            <language>myData</language>
            <search>myData</search>
            <contains>false</contains>
        </ns1:find>
    </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
My goal is to extract the value of the search tag in a Kibana scripted field, using the Painless language. I tried this:
def field = doc['Request.Body'].value;
if (field != null) {
    def matcher = /<search>(.*)<\/search>/.matcher(field);
    if (matcher.find()) {
        return matcher.group(1);
    }
    return "No match";
}
return "No field";
This code always returns No match.
To debug, I tried returning the value of doc['Request.Body'].value directly; in this example it returns only 1.0 instead of my full XML.
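For reference, the debug version of the scripted field is just this single line (nothing else in the script):

// Returns only "1.0" here instead of the whole SOAP envelope.
return doc['Request.Body'].value;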
I also tried to concatenate the entries of the values list of this object, with this code:
def field = doc['Request.Body'].getValues().stream().collect(Collectors.joining(""));
if (field != null) {
    def matcher = /<search>(.*)<\/search>/.matcher(field);
    if (matcher.find()) {
        return matcher.group(1);
    }
    return "No match";
}
return "No field";
Now the field variable is equal to the concatenation of the XML tag values, but I lose the XML tags themselves, so I can't extract my data with the regex, and like the first script it always returns No match.
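To make that easier to see, here is a small debugging variant (just a sketch, using an arbitrary " | " separator) that returns the individual values that doc['Request.Body'] actually holds:

// Joins each doc value with a visible separator; none of the entries is the full XML string.
def parts = doc['Request.Body'].getValues().stream().collect(Collectors.joining(" | "));
return parts;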
So my question is: how can I get the full XML value of my field into a variable in my script? And why is Elasticsearch "parsing" my XML?
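Ideally I would like to be able to do something like the following (only a sketch; I don't know whether _source access is even possible in a Kibana scripted field, and it assumes Request.Body is stored as an object field in _source):

// Hypothetical: read the raw string from _source instead of doc values.
def body = params['_source']['Request']['Body'];
def matcher = /<search>(.*)<\/search>/.matcher(body);
return matcher.find() ? matcher.group(1) : "No match";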
Any help would be appreciated. Thanks.