I'm trying to parse a log file that contains XML and other arbitrary output. In one specific case, I want to check whether reservations were successfully sent to the customer.
[11-28-51.440000] Sending reservation to customer
[11-28-51.492900] <?xml version="1.0" encoding="UTF-8"?><SendReservation><ReservationId>1289</ReservationId><Customer>2892</Customer>...</SendReservation>
[11-28-51.493000] Status: Successfull
[11-28-52.261000] Something different
[11-28-51.520000] Sending reservation to customer
[11-28-54.548900] <?xml version="1.0" encoding="UTF-8"?><SendReservation><ReservationId>2732</ReservationId><Customer>7856</Customer>...</SendReservation>
[11-28-54.600000] Status: Error: Reservation was rejected
With Logstash, I need to parse some fields of the reservation, including the ReservationId. For this I can use the Logstash XML filter. However, I have to combine the result with the success/error status, which is printed as plain text after the XML output.
I tried a multiline codec on the file input:
input {
  file {
    path => "test.log"
    start_position => "beginning"
    type => "reservation"
    codec => multiline {
      pattern => "\[(.*?)\](.*?)<\?xml[^>]*>"
      negate => true
      what => previous
    }
  }
}
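To make the grouping easier to reason about, here is a minimal Python sketch of how I understand the multiline codec to behave with negate => true and what => previous. It only mimics the codec and is not Logstash code; the function name group_lines is made up for illustration.

```python
import re

# Same pattern as in the multiline codec above.
XML_LINE = re.compile(r"\[(.*?)\](.*?)<\?xml[^>]*>")

def group_lines(lines):
    """Mimic negate => true, what => previous: a line that does NOT match
    the pattern is appended to the previous event."""
    events = []
    for line in lines:
        if XML_LINE.search(line) or not events:
            # A matching line (or the very first line) starts a new event.
            events.append(line)
        else:
            # A non-matching line joins the previous event.
            events[-1] += "\n" + line
    return events

log = [
    "[11-28-51.440000] Sending reservation to customer",
    '[11-28-51.492900] <?xml version="1.0" encoding="UTF-8"?><SendReservation><ReservationId>1289</ReservationId><Customer>2892</Customer>...</SendReservation>',
    "[11-28-51.493000] Status: Successfull",
    "[11-28-52.261000] Something different",
    "[11-28-51.520000] Sending reservation to customer",
    '[11-28-54.548900] <?xml version="1.0" encoding="UTF-8"?><SendReservation><ReservationId>2732</ReservationId><Customer>7856</Customer>...</SendReservation>',
    "[11-28-54.600000] Status: Error: Reservation was rejected",
]

events = group_lines(log)
```

On the sample log this yields three events. The second one starts with the XML line for reservation 1289 and also contains the three lines that follow it, including the "Sending reservation to customer" line that really belongs to the next reservation.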
With this input, the Logstash event contains a message like this:
"message" => "[11-28-51.492900] <?xml version="1.0" encoding="UTF-8"?><SendReservation><ReservationId>1289</ReservationId><Customer>2892</Customer>...</SendReservation>\n[11-28-51.493000] Status: Successfull\n[11-28-52.261000] Something different\n[11-28-51.520000] Sending reservation to customer"
To parse the XML with the XML filter, I need a source field that contains valid XML. Therefore I'm trying to cut away the timestamp in front of the XML and everything after it.
mutate {
  gsub => [ "message", "^(.*?)<\?xml[^>]*>", "" ]
}
mutate {
  gsub => [ "message", "(?<=<\/SendReservation>).*$", "" ]
}
At this point I see that the regex only matches within the first line of the message (before the first \n), so cutting away everything after the end tag has no effect. This is my first problem, and it might have something to do with multiline.
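The symptom can be reproduced outside Logstash. The sketch below uses Python's re module, so it is only an analogy: Logstash uses Ruby regular expressions, where the inline flag that lets "." match newlines is (?m), while in Python it is re.DOTALL or (?s). The sample message is shortened, and the variable names are made up for illustration.

```python
import re

message = (
    '[11-28-51.492900] <?xml version="1.0" encoding="UTF-8"?>'
    "<SendReservation><ReservationId>1289</ReservationId></SendReservation>\n"
    "[11-28-51.493000] Status: Successfull\n"
    "[11-28-52.261000] Something different"
)

# Same idea as the second gsub: remove everything after the closing tag.
TAIL = r"(?<=</SendReservation>).*$"

# By default "." does not match "\n", so the pattern cannot reach
# past the first line and the message stays as it was.
unchanged = re.sub(TAIL, "", message)

# With DOTALL, "." also consumes newlines and the whole tail is removed.
trimmed = re.sub(TAIL, "", message, flags=re.DOTALL)
```

Here unchanged is identical to message, while trimmed ends right after </SendReservation>. This suggests, though I have not confirmed it in Logstash, that the gsub pattern needs a flag that lets "." cross line breaks.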
The second problem is that I have no clue how to move the XML content that I cut out of 'message' into a new field, which I can then use as the source field for the XML filter. I tried grok's overwrite option, but it requires an existing field, and I need to create a new one.
In conclusion, all I want is to create a head field and a tail field from my multiline message. The head would contain the first line with the XML, which holds the main information, and the tail would contain the remaining lines with the additional information (such as the status) that I have to relate to it.
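To pin down the result I am after, here is a minimal Python sketch of the head/tail split. It is plain Python rather than a Logstash configuration, and the names split_head_tail, reservation_id and status are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

def split_head_tail(message):
    """Split a multiline message into the XML line (head) and the rest (tail)."""
    head, _, tail = message.partition("\n")
    # Drop the leading "[timestamp] " so that head starts with the XML declaration.
    head = re.sub(r"^\[[^\]]*\]\s*", "", head)
    return head, tail

message = (
    '[11-28-51.492900] <?xml version="1.0" encoding="UTF-8"?>'
    "<SendReservation><ReservationId>1289</ReservationId>"
    "<Customer>2892</Customer>...</SendReservation>\n"
    "[11-28-51.493000] Status: Successfull\n"
    "[11-28-52.261000] Something different\n"
    "[11-28-51.520000] Sending reservation to customer"
)

head, tail = split_head_tail(message)

# head is now valid XML; pass bytes because it carries an encoding declaration.
reservation = ET.fromstring(head.encode("utf-8"))
reservation_id = reservation.findtext("ReservationId")

# The status is on the first "Status:" line of the tail.
status_match = re.search(r"Status: (.*)", tail)
status = status_match.group(1) if status_match else None
```

With the sample message, head holds only the XML document, and the ReservationId from head can be related to the status text found in tail. What I am missing is the equivalent of this split inside the Logstash filter section.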