I am new to fluentd.
I have applications that run in Docker containers. They are Java apps that log in JSON format. The JSON messages are usually split over multiple lines.
I would like to use the Docker fluentd log driver to send these messages to a central fluentd server.
The Docker driver sends each line to fluentd as a separate event, so I need a way to combine these lines back into the original multiline messages.
I am looking for some pointers on how to achieve this.
With the out-of-the-box fluentd config, my logs look like this:
20170501T050820+0000 docker.fa5077070a33 {"log":"{\"timestamp\":\"2017-05-01T05:08:20.168Z\", \"applicationName\":\"my-event-publisher\", \"applicationVersion\":\"0.0.6-SNAPSHOT\",","container_id":"fa5077070a330f6a3a6f9400cc0ed04f2cf61c5eb2d66c5693385b67f3b09e2e","container_name":"/ecs-td-dev-my-event-publisher-12-my-event-publisher-dcb1b5f5a383d3852d00","source":"stdout"}
20170501T050820+0000 docker.fa5077070a33 {"container_name":"/ecs-td-dev-my-event-publisher-12-my-event-publisher-dcb1b5f5a383d3852d00","source":"stdout","log":" \"logLevel\":\"INFO\", \"pid\":\"1\", \"threadId\":\"Thread-4\", \"host\":\"fa5077070a33\",","container_id":"fa5077070a330f6a3a6f9400cc0ed04f2cf61c5eb2d66c5693385b67f3b09e2e"}
20170501T050820+0000 docker.fa5077070a33 {"source":"stdout","log":" \"logger\":\"org.springframework.context.support.DefaultLifecycleProcessor\",","container_id":"fa5077070a330f6a3a6f9400cc0ed04f2cf61c5eb2d66c5693385b67f3b09e2e","container_name":"/ecs-td-dev-my-event-publisher-12-my-event-publisher-dcb1b5f5a383d3852d00"}
20170501T050820+0000 docker.fa5077070a33 {"container_id":"fa5077070a330f6a3a6f9400cc0ed04f2cf61c5eb2d66c5693385b67f3b09e2e","container_name":"/ecs-td-dev-my-event-publisher-12-my-event-publisher-dcb1b5f5a383d3852d00","source":"stdout","log":" \"message\":\"Stopping beans in phase 2147483647\""}
20170501T050820+0000 docker.fa5077070a33 {"source":"stdout","log":"}","container_id":"fa5077070a330f6a3a6f9400cc0ed04f2cf61c5eb2d66c5693385b67f3b09e2e","container_name":"/ecs-td-dev-my-event-publisher-12-my-event-publisher-dcb1b5f5a383d3852d00"}
In what order should I approach this?
I need to:
- Extract the 'log' portion of each line
- Look for a regex /^{"timestamp/ to determine the start of the message
- Combine the log fragments into one string
- Parse the combined log string into actual JSON
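
To make the order I have in mind concrete, here is a minimal sketch of the four steps in plain Python. This is not fluentd configuration; it only illustrates the logic I am hoping to reproduce. The function name `combine_records` and the shape of the input (a list of dicts with a `log` key, as in the output above) are my own assumptions.

```python
import json
import re

# Step 2: a fragment that matches this pattern starts a new message.
START = re.compile(r'^\{"timestamp')


def combine_records(records):
    """Yield one parsed JSON object per multiline message.

    `records` is an iterable of dicts shaped like the Docker events
    above; only the "log" field is used, the rest is discarded.
    """
    buffer = []
    for record in records:
        fragment = record["log"]  # Step 1: extract the 'log' portion
        if START.match(fragment) and buffer:
            # A new message begins, so flush the previous one.
            yield json.loads("".join(buffer))  # Steps 3 and 4
            buffer = []
        buffer.append(fragment)
    if buffer:
        # Flush the final message once the input is exhausted.
        yield json.loads("".join(buffer))


if __name__ == "__main__":
    sample = [
        {"log": '{"timestamp":"2017-05-01T05:08:20.168Z",', "source": "stdout"},
        {"log": ' "logLevel":"INFO",', "source": "stdout"},
        {"log": ' "message":"Stopping beans in phase 2147483647"', "source": "stdout"},
        {"log": "}", "source": "stdout"},
    ]
    for message in combine_records(sample):
        print(json.dumps(message))
```

One limitation of this sketch is that a message is only flushed when the next start line arrives or the input ends, so on a live stream the last message would need some kind of timeout to be emitted.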
To be honest, I don't really care for the format fluentd produces, with the timestamp and Docker tag added to each line.
I would rather just have a file with my JSON messages with no additional fields added by fluentd.
I have seen the documentation on using a 'parser', but as I said, I'm just not sure about the order of the steps, since I'm trying to stitch together multiline JSON.