
I'm trying to measure the latency of my service at a lower level. Poking around, I saw that it is possible to register a ServerStreamTracer.Factory via addStreamTracerFactory on the gRPC server builder.

I wrote this simple implementation and printed the logs:

val server = io.grpc.netty.NettyServerBuilder.forPort(ApplicationConfig.Service.bindPort).addStreamTracerFactory(ServerStreamTracerFactory)....

import com.typesafe.scalalogging.LazyLogging
import io.grpc.{Metadata, ServerStreamTracer, Status}
import io.grpc.ServerStreamTracer.Factory

class Telemetry(fullMethodName: String, headers: Metadata) extends ServerStreamTracer with LazyLogging {
  override def serverCallStarted(callInfo: ServerStreamTracer.ServerCallInfo[_, _]): Unit = {
    logger.info(s"Telemetry '$fullMethodName' '$headers' callinfo:$callInfo")
    super.serverCallStarted(callInfo)
  }

  override def inboundMessage(seqNo: Int): Unit = {
    logger.info(s"inboundMessage $seqNo")
    super.inboundMessage(seqNo)
  }
  override def inboundMessageRead(seqNo: Int, optionalWireSize: Long, optionalUncompressedSize: Long): Unit = {
    logger.info(s"inboundMessageRead $seqNo $optionalWireSize $optionalUncompressedSize")
    super.inboundMessageRead(seqNo, optionalWireSize, optionalUncompressedSize)
  }
  override def outboundMessage(seqNo: Int): Unit = {
    logger.info(s"outboundMessage $seqNo")
    super.outboundMessage(seqNo)
  }
  override def outboundMessageSent(seqNo: Int, optionalWireSize: Long, optionalUncompressedSize: Long): Unit = {
    logger.info(s"outboundMessageSent $seqNo $optionalWireSize $optionalUncompressedSize")
    super.outboundMessageSent(seqNo, optionalWireSize, optionalUncompressedSize)
  }
  override def streamClosed(status: Status): Unit = {
    logger.info(s"streamClosed $status")
    super.streamClosed(status)
  }

}

object ServerStreamTracerFactory extends Factory with LazyLogging {
  logger.info("called")
  override def newServerStreamTracer(fullMethodName: String, headers: Metadata): ServerStreamTracer = {
    logger.info(s"called with $fullMethodName $headers")
    new Telemetry(fullMethodName, headers)
  }
}
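
What I'm ultimately after is per-RPC latency, roughly along these lines (just a sketch: timing from when the tracer is created until streamClosed is called):

// Sketch only: record the time when the tracer is created and compute the
// elapsed time when the stream closes. TimingTelemetry is just an illustration,
// not part of the code above.
class TimingTelemetry(fullMethodName: String) extends ServerStreamTracer with LazyLogging {
  private val startNanos = System.nanoTime()

  override def streamClosed(status: Status): Unit = {
    val elapsedMillis = (System.nanoTime() - startNanos) / 1e6
    logger.info(s"'$fullMethodName' finished with ${status.getCode} after $elapsedMillis ms")
    super.streamClosed(status)
  }
}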

I'm running a simple gRPC client in a loop and examining the output of the server stream tracer.

I see that the "lifecycle" of logs repeats itself. Here is one iteration (it prints the exact same sequence again and again):

22:15:06 INFO  [grpc-default-worker-ELG-3-2] [newServerStreamTracer:38] [ServerStreamTracerFactory$] called with com.dy.affinity.service.AffinityService/getAffinities Metadata(content-type=application/grpc,user-agent=grpc-python/1.15.0 grpc-c/6.0.0 (osx; chttp2; glider),grpc-accept-encoding=identity,deflate,gzip,accept-encoding=identity,gzip)
22:15:06 INFO  [grpc-default-executor-0] [serverCallStarted:8] [Telemetry] Telemetry 'com.dy.affinity.service.AffinityService/getAffinities' 'Metadata(content-type=application/grpc,user-agent=grpc-python/1.15.0 grpc-c/6.0.0 (osx; chttp2; glider),grpc-accept-encoding=identity,deflate,gzip,accept-encoding=identity,gzip)' callinfo:io.grpc.internal.ServerCallInfoImpl@5badffd8
22:15:06 INFO  [grpc-default-worker-ELG-3-2] [inboundMessage:13] [Telemetry] inboundMessage 0
22:15:06 INFO  [grpc-default-worker-ELG-3-2] [inboundMessageRead:17] [Telemetry] inboundMessageRead 0 19 -1
22:15:06 INFO  [pool-1-thread-5] [outboundMessage:21] [Telemetry] outboundMessage 0
22:15:06 INFO  [pool-1-thread-5] [outboundMessageSent:25] [Telemetry] outboundMessageSent 0 0 0
22:15:06 INFO  [grpc-default-worker-ELG-3-2] [streamClosed:29] [Telemetry] streamClosed Status{code=OK, description=null, cause=null}

A few things that aren't quite clear to me from just looking at these logs:

  1. Why is a new stream being created for each request? I thought the gRPC client was supposed to re-use the connection, so "stream closed" shouldn't be called, right?
  2. If the stream is being re-used, why is the inboundMessage (and outboundMessage) number always "0"? (Even when I start multiple clients in parallel it is always 0.) In what case would the message number not be 0?
  3. If the stream isn't being re-used, how should I configure the clients differently so that they re-use the connection?
Avba

1 Answer


In gRPC, one HTTP/2 stream is created for each RPC (if retries or hedging is enabled, there can be more than one stream per RPC). HTTP/2 streams are multiplexed on a single connection, and it is cheap to open and close streams. So it is the connection that is being re-used, not the stream.
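
For your question 3: as long as the client re-uses the same channel for every call, the connection is already being re-used; only the per-RPC stream is opened and closed. A rough sketch (your client is Python, but the principle is the same; AffinityServiceGrpc and AffinityRequest here stand in for hypothetical generated classes):

import io.grpc.ManagedChannelBuilder

// Sketch: build one channel and re-use it for every RPC. Each call opens a new
// HTTP/2 stream, but all of them are multiplexed over this single connection.
val channel = ManagedChannelBuilder
  .forAddress("localhost", 8080)
  .usePlaintext()
  .build()

val stub = AffinityServiceGrpc.newBlockingStub(channel)

for (i <- 1 to 100) {
  // Every iteration is a separate RPC (separate stream) on the same connection.
  stub.getAffinities(AffinityRequest.newBuilder().build())
}

channel.shutdown()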

The seqNo you get from the tracer methods is the sequence number of messages within that stream, and it starts from 0. It looks like you are doing unary RPCs, which send one request, receive one response, and then close, so the only sequence number you will ever see is 0. What you see is totally normal.
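
To see seqNo values above 0 you would need a streaming RPC, where multiple messages flow over the same stream. For example (a sketch, assuming a hypothetical server-streaming method with made-up AffinityRequest/AffinityReply message types):

import io.grpc.stub.StreamObserver

// Sketch: a hypothetical server-streaming handler. Each onNext below is one
// outbound message on the same stream, so the tracer would log
// outboundMessage 0, outboundMessage 1, outboundMessage 2, ...
def getAffinitiesStream(request: AffinityRequest,
                        responseObserver: StreamObserver[AffinityReply]): Unit = {
  for (i <- 0 until 3) {
    responseObserver.onNext(AffinityReply.newBuilder().setScore(i).build())
  }
  responseObserver.onCompleted()
}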

Kun Zhang