2

I have some difficulties getting Storm worker/topology logs into an ELK-compatible format (JSON) with Storm 2.1.0.

These are my current worker logging configs:

cluster.xml

<Configuration monitorInterval="60" shutdownHook="disable" packages="ch.qos.logback.core">
    <Properties>
        <property name="logstash">+ %msg +%n</property> <!-- only for 'bug' demonstration purposes -->
    </Properties>
    <Appenders>
        <Console name="CONSOLE">
            <PatternLayout>
                <Pattern>${logstash}</Pattern>
            </PatternLayout>
        </Console>
    </Appenders>
    <Loggers>
        <Root level="INFO">
            <AppenderRef ref="CONSOLE" />
        </Root>
    </Loggers>
</Configuration>

worker.xml

<Configuration monitorInterval="60" shutdownHook="disable" packages="ch.qos.logback.core">
        <Properties>
            <property name="logstash">{"@timestamp":"%d{yyyy-MM-ddTHH:mm:ss.SSS}", "logger": "%logger", "message":"%msg","thread_name":"%t","level":"%level"}%n</property>
    </Properties>
    <Appenders>
        <Console name="CONSOLE">
            <PatternLayout>
                    <Pattern>${logstash}</Pattern>
            </PatternLayout>
        </Console>
    </Appenders>
    <Loggers>
        <Logger name="org.com.package" level="DEBUG" additivity="false">
            <AppenderRef ref="CONSOLE"/>
        </Logger>
        <Root level="INFO">
            <AppenderRef ref="CONSOLE"/>
        </Root>
    </Loggers>
</Configuration>

According to my this configuration, I would've expected a log message formatted similarly to the following (newlines added for clarity:

{
  "@timestamp": "2020-02-11 11:32:40,748",
  "logger": "org.com.package.aggregation.SomeAggregation",
  "message": "joinStateStream: values: [33333333-4e30-49a6-8e1c-f7817633bb34, 7c777777-a622-4ae4-a504-2490db47cafe, 2020-02-11]",
  "thread_name": "Thread-23-b-25-joinState-groupedQ-revQ-executor[22, 22]",
  "level": "DEBUG"
}

However, it seems like the messages are being "wrapped" by either the supervisor process or even the worker itself. The log messages I get look like the following instead:

(newlines/whitespace added for readability)

+ Worker Process 273c05df-f087-43ca-a59a-e281bae98ab1:  
{  
    "@timestamp":"2020-02-11 11:32:40,748",
    "logger": "STDERR",
    "message":
        "{
            "@timestamp":"2020-02-11 11:32:40,748",
            "logger": "org.com.package.aggregation.SomeAggregation",
            "message":"joinStateStream: values: [33333333-4e30-49a6-8e1c-f7817633bb34, 7c777777-a622-4ae4-a504-2490db47cafe, 2020-02-11]",
            "thread_name":"Thread-23-b-25-joinState-groupedQ-revQ-executor[22, 22]",
            "level":"DEBUG"
        }",  
    "thread_name":"Thread-2",
    "level":"INFO"} +

Several things seem to be happening here (going inward):

  1. The supervisor seems to wrap all worker messages and prepend them with Worker Process <worker-id>:. This can be noted by the + that wraps the entire message.
  2. The worker seems to wrap its own log messages somehow. The message part of the log contains yet another log message.

As for my questions:

  1. Can I disable this supervisor log prefixing? Looking at the source code in org.apache.storm.daemon.supervisor.BasicContainer#launch, this seems to be hardcoded. I can't imagine that a correct topology deployment should result in a hardcoded prefix that is added to every log message.

  2. How can this worker message wrapping happen? As I can see, the 'wrappee' is my actual topology message (that I wish to parse), and the 'wrapper' is something else entirely (with logger STDERR? Why is it logging to STDERR? With level INFO???)

Basically, I want to simply log some messages during the execution of my topology and control the format of these messages. How can I do this reliably with Storm?

filpa
  • 3,651
  • 8
  • 52
  • 91
  • 2
    Just wondering if you considered swapping `PatternLayout` with [JSONLayout](https://logging.apache.org/log4j/2.x/manual/layouts.html#JSONLayout)? – timur Feb 18 '20 at 05:12
  • I have not tried that. I'll do so and get back to you. Thank you for the suggestion. – filpa Feb 18 '20 at 16:59

0 Answers0