
I recently started learning ELK, but I'm having a hard time understanding how to parse XML data. I would like to parse an XML file that looks like this:

<Name nameID="xxxx">
  <Type p="1">xxxxxx</Type>
  <Type p="2">xxxxxx</Type>
    .
    .
  <Type p="9">xxxxx</Type>
  <Value obj="1"> 
    <r p="1">5.94</r>
    <r p="2">62.19</r>
    .
    .
    <r p="9">7.19</r>
  </Value>
  <Value obj="2"> 
    <r p="1">5.94</r>
    <r p="2">62.19</r>
    .
    .
    <r p="9">7.19</r>
  </Value>
</Name>
<Name nameID="yyyy">
  <Type p="1">yyyyy</Type>
  <Type p="2">yyyyyy</Type>
  <Type p="3">yyyy</Type>
  <Value obj="1"> 
    <r p="1">54.94</r>
    <r p="2">6.19</r>
    <r p="3">0</r>
  </Value>
</Name>

In the output, I would like to get something like this:

"NameID = name1
Type = Type1
obj = obj1
Value = xx
"
"NameID = name1
Type = Type2
obj = obj1
Value = xx
"
"NameID = name1
Type = Type3
obj = obj1
Value = xx
"
...etc
and then
"NameID = name1
Type = Type1
obj = obj2
Value = xx
"
"NameID = name1
Type = Type2
obj = obj2
Value = xx
"
....etc

I used this logstash.conf, but I didn't get what I need (I get an array for each field):

input {
    file {
        path => "/home/test/data.xml"
        start_position => "beginning"
        sincedb_path => "/dev/null"
        codec => multiline {
            pattern => "<Name"
            negate => true
            what => "previous"
        }
    }
}
filter {
    xml {
        source => "message"
        target => "parsed"
        add_tag => "xml"
        xpath => [
            "//Name/@nameID", "Name",
            "//Type/@p", "TypeID",
            "//Type/text()", "Type",
            "//Value/@obj", "Obj",
            "//r/text()", "value"
        ]
    }
}
  • You can use the ingest attachment plugin to do that easily, see https://www.elastic.co/guide/en/elasticsearch/plugins/current/ingest-attachment.html – sramalingam24 Jul 18 '18 at 04:22

2 Answers

  1. Use the Logstash ruby filter plugin (you may need to require a gem in your code).
  2. With Ruby, parse the XML.
  3. Build the JSON document ready for indexing.
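
For illustration, here is a minimal sketch of that approach. It assumes the multiline-grouped <Name> block arrives in the message field and the Logstash 5+ event API (event.get / event.set); the rows field and the output keys (NameID, Type, obj, Value) are made-up names for this example:

filter {
    ruby {
        # sketch only: parse the <Name> block with REXML (Ruby's standard library)
        # and collect one hash per Type/obj combination into a made-up "rows" field
        code => '
            require "rexml/document"
            doc  = REXML::Document.new(event.get("message"))
            name = doc.root
            types = {}
            name.elements.each("Type") { |t| types[t.attributes["p"]] = t.text }
            rows = []
            name.elements.each("Value") do |v|
                v.elements.each("r") do |r|
                    rows << {
                        "NameID" => name.attributes["nameID"],
                        "Type"   => types[r.attributes["p"]],
                        "obj"    => v.attributes["obj"],
                        "Value"  => r.text
                    }
                end
            end
            event.set("rows", rows)
        '
    }
    # fan the array out into one event per Type/obj combination
    split { field => "rows" }
}

Each resulting event then carries one NameID/Type/obj/Value combination, which matches the desired output above.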
  • Hi panchicore, I looked all day for examples of how to use the ruby filter plugin, but I didn't find a way to use it. In fact, I don't know how to use it to solve my problem. Can you give me a suggestion? Thanks a lot ^^ – C.med Jul 17 '18 at 14:02
  • the filter receives the var `event`; with it you can access the xml string. Here is a simple example of how to flatten fields into an array: https://github.com/panchicore/es-gtd/blob/master/logstash/ls-gtd-pipeline.conf#L36 Start from there. – panchicore Jul 18 '18 at 08:01

solution:

filter {
    xml { source => "message" store_xml => true target => "theXML" force_array => false }
    split { field => "[theXML][Type]" }
    split { field => "[theXML][Value]" }
    split { field => "[theXML][Value][r]" }
}
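
For reference, after the xml filter and the three splits, each event should look roughly like this (assuming the sample data above and the xml filter's default XmlSimple output, which stores element text under a content key):

{
    "theXML" => {
        "nameID" => "xxxx",
        "Type"   => { "p" => "1", "content" => "xxxxxx" },
        "Value"  => {
            "obj" => "1",
            "r"   => { "p" => "1", "content" => "5.94" }
        }
    }
}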

and then use this in the output:

output {
    if [theXML][Type][p] == [theXML][Value][r][p] {
        elasticsearch { .... }
    }
}
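
An alternative sketch, using the same field paths: drop the non-matching Type/r combinations in the filter stage instead, so every output only sees matched events:

filter {
    if [theXML][Type][p] != [theXML][Value][r][p] {
        drop { }
    }
}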

hope that could help someone ;)
