2

I am trying to figure out why some messages received by logstash have some text missing from them.

I am sending the logs from an Untangle NGFW (11.0 - licensed) to logstash so that I can keep historical records of web filter logs etc.

I first noticed an issue with the logs coming from Untangle shortly after setting up syslog, I set up a filter which looks for messages which have been split because they were too long, this joins them back together so they can be parsed by grok & json parsers in logstash. Most events processed fine, including the majority of events with multiline tags, however a number of the messages were coming through with jsonparsefailure tags as well. I copied these messages and put them in to a json parser and found they would fail because small parts of the message shortly after the message split were missing.

My thought when I first saw the issue was that my filter was doing something unexpected, but I checked it over and logically couldn't see how it would remove any of the text which was going missing randomly. Then I removed all filters from the logstash config and checked the messages that were coming in, some were still missing some text. I asked about this in the #logstash IRC channel and someone suggested cloning all the messages as they came in, that would allow me to continue filtering all the messages but would keep the original message intact. I could then check the messages received against what was saved in the untangle log files. I also raised a support query with Untangle, who raised a bugzilla, which was closed later than day as unreproducible.

My config from logstash:

input {
  tcp {
    port => 15005
    type => "untangle-syslog"
  }
}

filter {
  #looks for messages from untangle input, then clones every one, some to be saved in a file for future reference, some to be proces
sed by following filters
  if [type] == "untangle-syslog" {
    clone {
      #"cant_touch_this" will be saved to file "modify_this" will be filtered below
      clones => [ "cant_touch_this", "modify_this" ]
    }
  }
  if [type] == "modify_this" {
    #look for any lines that end with ... this would be a multiline message split by rsyslog when untangle NGFW sends the message
    multiline {
      pattern => "\.\.\.$"
      what => "next"
    }
    #remove the part between the ... delimiters with nothing, this should return the message to it's starting format
    if "multiline" in [tags] {
      mutate {
        gsub => [ "message", "\.\.\.\n.*\.\.\.", "" ]
      }
    }
    grok {
      match => { "message" => "<%{POSINT:priority}>%{SYSLOGTIMESTAMP:syslogtimestamp} localhost node-17: \[SyslogManagerImpl\] <(?:%
{WORD:Protocol}|)> %{WORD:Severity}\s*uvm\[0\]:\s*%{GREEDYDATA:jsonmessage}" }
      add_tag => "untangle-syslog"
      tag_on_failure => "untangle_syslog_grok_failure"
    }
    mutate {
      gsub => [ "jsonmessage", '"host"', '"websiteHost"' ]
    }
    json {
      source => "jsonmessage"
    }
    date {
      match => [ "timeStamp", "yyyy-MM-dd HH:mm:ss.SSS" ]
    }
  }
}

An example event with missing text, this is taken from the cloned messages file:

{"message":"<142>Apr  7 11:01:52 localhost node-17: [SyslogManagerImpl] <TCP93307337353045> INFO  uvm[0]:  {\"requestUri\":\"/pixel.gif?e=34&ud=0&up=0&qa=1366&qb=768&qc=0&qd=0&qf=1366&qe=643&qh=1349&qg=1652&qi=1366&qj=728&qk=0&ql=%3B%5BpwxnRd%7Dt%3Aa%5DmJVOG)%2C~%405%2F%5BGI%3Fi64CE)4b%3DhSIpIm%7C%7CXIN%26JZFEYooWU!%5BFZKU37xWzi%2BGr1VsEkeSKx%26aRo.)%3FJX%5E%255uA%5D%5BC%2BQVo7X%3Ch%2CeI%7C9x%5ELmFLBJIbEBE)8%25S%40xN~ReUobxEsct!6AMCaUw_N%5BoUz%2FAbi%5DpRAJa%3Ci6ikhMn%3CRRHj%5BG%2B1ck9Ln%3AA.eqRB4%2530980o(I%7DY(T0cC%236%3AOtNCp%2F%5B*bT1(ADac8.KG%26JaybqP%23Y%25a%2FqujM37rYgq%7B!%3Dy3mG%2BQV77zANVitbKxpwoFKL3l9EU%22y%2F.D%5BIAby)rx(%40J%3A%3AsLeLG)%3B.o%5DE%24d!L2HyXLy!%7DbSqVB4%25S%40%5EXYq4%3Cc3%2B%7BE%24C%23XGmuGTtDZF%5Eh...","@version":"1","@timestamp":"2015-04-07T10:01:54.569Z","host":"10.0.0.1","type":"cant_touch_this"}
{"message":"<142>Apr  7 11:01:52 localhost ...\"pixel.moatads.com\",\"timeStam \"2015-04-07 11:01:52.447\",\"sessionId\":93307337353045,\"requestId\":93307337050879,\"tag\":\"uvm[0]: \",\"class\":\"class com.untangle.node.http.HttpRequestEvent\",\"method\":\"GET\",\"sessionEvent\":{\"protocol\":6,\"timeStamp\":\"2015-04-07 11:01:52.356\",\"SClientAddr\":\"/192.168.1.11\",\"tag\":\"uvm[0]: \",\"CServerAddr\":\"/54.237.114.13\",\"protocolName\":\"TCP\",\"CClientAddr\":\"/10.0.248.31\",\"class\":\"class com.untangle.uvm.node.SessionEvent\",\"hostname\":\"10.0.248.31\",\"SClientPort\":38237,\"serverIntf\":1,\"CServerPort\":80,\"username\":\"USER\",\"clientIntf\":2,\"sessionId\":93307337353045,\"policyId\":14,\"SServerPort\":80,\"SServerAddr\":\"/54.237.114.13\",\"CClientPort\":22810},\"contentLength\":0}","@version":"1","@timestamp":"2015-04-07T10:01:54.570Z","host":"10.0.0.1","type":"cant_touch_this"}

As you can see in the second line, timeStam has been cut off, the end of the word is gone, as well as the closing \" and the :

I then connected to the Untangle server and checked the logs to see what was saved there:

Apr  7 11:01:52 localhost node-17: [SyslogManagerImpl] <TCP93307337353045> INFO  uvm[0]:  {"requestUri":"/pixel.gif?e=34&ud=0&up=0&qa=1366&qb=768&qc=0&qd=0&qf=1366&qe=643&qh=1349&qg=1652&qi=1366&qj=728&qk=0&ql=%3B%5BpwxnRd%7Dt%3Aa%5DmJVOG)%2C~%405%2F%5BGI%3Fi64CE)4b%3DhSIpIm%7C%7CXIN%26JZFEYooWU!%5BFZKU37xWzi%2BGr1VsEkeSKx%26aRo.)%3FJX%5E%255uA%5D%5BC%2BQVo7X%3Ch%2CeI%7C9x%5ELmFLBJIbEBE)8%25S%40xN~ReUobxEsct!6AMCaUw_N%5BoUz%2FAbi%5DpRAJa%3Ci6ikhMn%3CRRHj%5BG%2B1ck9Ln%3AA.eqRB4%2530980o(I%7DY(T0cC%236%3AOtNCp%2F%5B*bT1(ADac8.KG%26JaybqP%23Y%25a%2FqujM37rYgq%7B!%3Dy3mG%2BQV77zANVitbKxpwoFKL3l9EU%22y%2F.D%5BIAby)rx(%40J%3A%3AsLeLG)%3B.o%5DE%24d!L2HyXLy!%7DbSqVB4%25S%40%5EXYq4%3Cc3%2B%7BE%24C%23XGmuGTtDZF%5Eh...
Apr  7 11:01:52 localhost ..."pixel.moatads.com","timeStamp": "2015-04-07 11:01:52.447","sessionId":93307337353045,"requestId":93307337050879,"tag":"uvm[0]: ","class":"class com.untangle.node.http.HttpRequestEvent","method":"GET","sessionEvent":{"protocol":6,"timeStamp":"2015-04-07 11:01:52.356","SClientAddr":"/192.168.1.11","tag":"uvm[0]: ","CServerAddr":"/54.237.114.13","protocolName":"TCP","CClientAddr":"/10.0.248.31","class":"class com.untangle.uvm.node.SessionEvent","hostname":"10.0.248.31","SClientPort":38237,"serverIntf":1,"CServerPort":80,"username":"USER","clientIntf":2,"sessionId":93307337353045,"policyId":14,"SServerPort":80,"SServerAddr":"/54.237.114.13","CClientPort":22810},"contentLength":0}

What is saved on the Untangle machine includes the full message including the characters which were missing from logstashes point of view. I know this isn't what is sent via rsyslog from Untangle, as the priority is missing from the front of the log entry. However this should be the message that was read by rsyslog before sending.

What I need to know is how I now figure out where this pipeline is breaking down, the message is intact at the start on the Untangle machine, but by the time it is received and saved on the logstash server the message is missing some characters.

I am pretty confident the only time this happens is when the message is split by the sending party, I don't have any issues at all with single line messages. This happens frequently with messages which are split, but not every time, I managed to get 3 examples from different messages where there were characters missing shortly after the split, in around 15 minutes on a fairly quiet server (currently testing this new Untangle machine before rolling out to production).

I have thought about setting up a tcpdump to capture all the traffic from the untangle server to logstash, which I can do fine, but I am unsure about how I would then find the messages I am looking for in the PCAP file to prove that the info sent over was with/out the characters that went missing.

I am using rsyslogd 5.8.11 and logstash 1.5.0RC2 (I was using 1.4.2 but updated because clone -> filter didn't work in 1.4.2)

Any suggestions as to how best to investigate this so I can find the cause of the issue would be greatly appreciated.

EDIT:

Today I have done some further investigation, I did a packet capture, and then went through the capture after an event came through which was missing some text. I found that the text is missing in the packet received by the server, so I believe the issue is with rsyslog. I also found that rsyslog doesn't do the splitting (as far as I can tell) it seems to be done by logger.

Rumbles
  • 994
  • 1
  • 12
  • 28

0 Answers0