0

I am raising this query as a result of seeing another query with no satisfactory answer (and reading the advice to not ask another question in an answer or comment). That reference is Enabling Sleuth slows requests down (a lot)

My issue is similar. I am not using Feign. I am using the following:

  1. Standalone postman to call my service
  2. A spring-boot application called LniProxy which calls an Arango database via a Spring RestTemplate
  3. A local zipkin server to http post spans to.

I am using spring-cloud-sleuth, logback and zipkin. When i remove the zipkin pom reference

<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-zipkin</artifactId> 

the performance is very quick. But when I put it back, the performance is very poor.

Changing my log level from INFO to DEBUG in the logging changes the non zipkin calls from 701ms to 1051ms. Adding zipkin http changes that timing from 1051ms to around 53 seconds. In the code, i make one call to my service which in turn makes 303 calls to an arango database (via its REST interface).

I am using spring.cloud.starter-sleuth and logstash-logback-encoder. By themselves there is no performance issue, it is only when i add the spring-cloud-starter-zipkin that the performance degrades. I am running zipkin on my local pc in a docker instance, and running my service out of eclipse as a standalone app (spring boot). Arango is on my local pc running as a service. Unit test methods of my controller (using mockito to mock the mvc calls) that usually take 0.3 to 0.5 seconds each without zipkin, end up taking ~16seconds each when zipkin is enabled. The Zipkin UI reports all the calls and the sum of the calls is the roughly the full time of the call as reported by Postman.

The Application class looks like this at the top:

@SpringBootApplication
@EnableSwagger2
@EnableAutoConfiguration(exclude={DataSourceAutoConfiguration.class})
public class Application extends SpringBootServletInitializer {

    @Bean
    public Sampler defaultSampler() {
        return new AlwaysSampler();
    }

    @Bean
    public RestTemplate restTemplate() {
        return new RestTemplate();
    }
    //...
}

The end of the log file looks like the following without the zipkin reference:

2017-07-12 10:43:10.764 DEBUG [LniProxy,cfa3eadd0eaa074c,86a81ebae0d1a5df,true] 6960 --- [nio-8090-exec-8] org.apache.http.wire                     : http-outgoing-0 >> "{"query":"FOR n IN network FILTER n._key == '1176300' FOR e IN networkEdges FILTER e._from == n._id && e.relationship == 'composition' FOR n2 IN network FILTER e._to == n2._id RETURN {child : n2,edge : e}","count":false,"batchSize":10,"ttl":10}"
2017-07-12 10:43:10.765 DEBUG [LniProxy,cfa3eadd0eaa074c,86a81ebae0d1a5df,true] 6960 --- [nio-8090-exec-8] org.apache.http.wire                     : http-outgoing-0 << "HTTP/1.1 201 Created[\r][\n]"
2017-07-12 10:43:10.765 DEBUG [LniProxy,cfa3eadd0eaa074c,86a81ebae0d1a5df,true] 6960 --- [nio-8090-exec-8] org.apache.http.wire                     : http-outgoing-0 << "Server: ArangoDB[\r][\n]"
2017-07-12 10:43:10.765 DEBUG [LniProxy,cfa3eadd0eaa074c,86a81ebae0d1a5df,true] 6960 --- [nio-8090-exec-8] org.apache.http.wire                     : http-outgoing-0 << "Connection: Keep-Alive[\r][\n]"
2017-07-12 10:43:10.765 DEBUG [LniProxy,cfa3eadd0eaa074c,86a81ebae0d1a5df,true] 6960 --- [nio-8090-exec-8] org.apache.http.wire                     : http-outgoing-0 << "Content-Type: application/json; charset=utf-8[\r][\n]"
2017-07-12 10:43:10.766 DEBUG [LniProxy,cfa3eadd0eaa074c,86a81ebae0d1a5df,true] 6960 --- [nio-8090-exec-8] org.apache.http.wire                     : http-outgoing-0 << "Content-Length: 223[\r][\n]"
2017-07-12 10:43:10.766 DEBUG [LniProxy,cfa3eadd0eaa074c,86a81ebae0d1a5df,true] 6960 --- [nio-8090-exec-8] org.apache.http.wire                     : http-outgoing-0 << "[\r][\n]"
2017-07-12 10:43:10.766 DEBUG [LniProxy,cfa3eadd0eaa074c,86a81ebae0d1a5df,true] 6960 --- [nio-8090-exec-8] org.apache.http.wire                     : http-outgoing-0 << "{"result":[],"hasMore":false,"cached":false,"extra":{"stats":{"writesExecuted":0,"writesIgnored":0,"scannedFull":0,"scannedIndex":1,"filtered":0,"executionTime":4.990100860595703e-4},"warnings":[]},"error":false,"code":201}"
2017-07-12 10:43:10.766 DEBUG [LniProxy,cfa3eadd0eaa074c,86a81ebae0d1a5df,true] 6960 --- [nio-8090-exec-8] org.apache.http.headers                  : http-outgoing-0 << HTTP/1.1 201 Created
2017-07-12 10:43:10.766 DEBUG [LniProxy,cfa3eadd0eaa074c,86a81ebae0d1a5df,true] 6960 --- [nio-8090-exec-8] org.apache.http.headers                  : http-outgoing-0 << Server: ArangoDB
2017-07-12 10:43:10.766 DEBUG [LniProxy,cfa3eadd0eaa074c,86a81ebae0d1a5df,true] 6960 --- [nio-8090-exec-8] org.apache.http.headers                  : http-outgoing-0 << Connection: Keep-Alive
2017-07-12 10:43:10.766 DEBUG [LniProxy,cfa3eadd0eaa074c,86a81ebae0d1a5df,true] 6960 --- [nio-8090-exec-8] org.apache.http.headers                  : http-outgoing-0 << Content-Type: application/json; charset=utf-8
2017-07-12 10:43:10.766 DEBUG [LniProxy,cfa3eadd0eaa074c,86a81ebae0d1a5df,true] 6960 --- [nio-8090-exec-8] org.apache.http.headers                  : http-outgoing-0 << Content-Length: 223
2017-07-12 10:43:10.766 DEBUG [LniProxy,cfa3eadd0eaa074c,86a81ebae0d1a5df,true] 6960 --- [nio-8090-exec-8] o.a.http.impl.execchain.MainClientExec   : Connection can be kept alive indefinitely
2017-07-12 10:43:10.767 DEBUG [LniProxy,cfa3eadd0eaa074c,86a81ebae0d1a5df,true] 6960 --- [nio-8090-exec-8] h.i.c.PoolingHttpClientConnectionManager : Connection [id: 0][route: {}->http://localhost:8529] can be kept alive indefinitely
2017-07-12 10:43:10.767 DEBUG [LniProxy,cfa3eadd0eaa074c,86a81ebae0d1a5df,true] 6960 --- [nio-8090-exec-8] h.i.c.PoolingHttpClientConnectionManager : Connection released: [id: 0][route: {}->http://localhost:8529][total kept alive: 1; route allocated: 1 of 5; total allocated: 1 of 10]

And with the zipkin reference it looks like this:

2017-07-12 10:46:18.217 DEBUG [LniProxy,4e849f3b63c73179,068a8ea6dd096488,true] 14432 --- [nio-8090-exec-2] h.i.c.PoolingHttpClientConnectionManager : Connection released: [id: 0][route: {}->http://localhost:8529][total kept alive: 1; route allocated: 1 of 5; total allocated: 1 of 10]
2017-07-12 10:46:18.617 DEBUG [LniProxy,4e849f3b63c73179,7a3998bf127b2daf,true] 14432 --- [nio-8090-exec-2] o.a.h.client.protocol.RequestAddCookies  : CookieSpec selected: default
2017-07-12 10:46:18.618 DEBUG [LniProxy,4e849f3b63c73179,7a3998bf127b2daf,true] 14432 --- [nio-8090-exec-2] o.a.h.client.protocol.RequestAuthCache   : Re-using cached 'basic' auth scheme for http://localhost:8529
2017-07-12 10:46:18.618 DEBUG [LniProxy,4e849f3b63c73179,7a3998bf127b2daf,true] 14432 --- [nio-8090-exec-2] o.a.h.client.protocol.RequestAuthCache   : No credentials for preemptive authentication
2017-07-12 10:46:18.618 DEBUG [LniProxy,4e849f3b63c73179,7a3998bf127b2daf,true] 14432 --- [nio-8090-exec-2] h.i.c.PoolingHttpClientConnectionManager : Connection request: [route: {}->http://localhost:8529][total kept alive: 1; route allocated: 1 of 5; total allocated: 1 of 10]
2017-07-12 10:46:18.618 DEBUG [LniProxy,4e849f3b63c73179,7a3998bf127b2daf,true] 14432 --- [nio-8090-exec-2] h.i.c.PoolingHttpClientConnectionManager : Connection leased: [id: 0][route: {}->http://localhost:8529][total kept alive: 0; route allocated: 1 of 5; total allocated: 1 of 10]
2017-07-12 10:46:18.618 DEBUG [LniProxy,4e849f3b63c73179,7a3998bf127b2daf,true] 14432 --- [nio-8090-exec-2] o.a.http.impl.execchain.MainClientExec   : Executing request POST /_db/lni/_api/cursor HTTP/1.1
2017-07-12 10:46:18.618 DEBUG [LniProxy,4e849f3b63c73179,7a3998bf127b2daf,true] 14432 --- [nio-8090-exec-2] o.a.http.impl.execchain.MainClientExec   : Proxy auth state: UNCHALLENGED

some logs removed to fit in the 30k character limit

2017-07-12 10:46:18.620 DEBUG [LniProxy,4e849f3b63c73179,7a3998bf127b2daf,true] 14432 --- [nio-8090-exec-2] org.apache.http.wire                     : http-outgoing-0 << "HTTP/1.1 201 Created[\r][\n]"
2017-07-12 10:46:18.620 DEBUG [LniProxy,4e849f3b63c73179,7a3998bf127b2daf,true] 14432 --- [nio-8090-exec-2] org.apache.http.wire                     : http-outgoing-0 << "Server: ArangoDB[\r][\n]"
2017-07-12 10:46:18.620 DEBUG [LniProxy,4e849f3b63c73179,7a3998bf127b2daf,true] 14432 --- [nio-8090-exec-2] org.apache.http.wire                     : http-outgoing-0 << "Connection: Keep-Alive[\r][\n]"
2017-07-12 10:46:18.620 DEBUG [LniProxy,4e849f3b63c73179,7a3998bf127b2daf,true] 14432 --- [nio-8090-exec-2] org.apache.http.wire                     : http-outgoing-0 << "Content-Type: application/json; charset=utf-8[\r][\n]"
2017-07-12 10:46:18.620 DEBUG [LniProxy,4e849f3b63c73179,7a3998bf127b2daf,true] 14432 --- [nio-8090-exec-2] org.apache.http.wire                     : http-outgoing-0 << "Content-Length: 223[\r][\n]"
2017-07-12 10:46:18.620 DEBUG [LniProxy,4e849f3b63c73179,7a3998bf127b2daf,true] 14432 --- [nio-8090-exec-2] org.apache.http.wire                     : http-outgoing-0 << "[\r][\n]"
2017-07-12 10:46:18.620 DEBUG [LniProxy,4e849f3b63c73179,7a3998bf127b2daf,true] 14432 --- [nio-8090-exec-2] org.apache.http.wire                     : http-outgoing-0 << "{"result":[],"hasMore":false,"cached":false,"extra":{"stats":{"writesExecuted":0,"writesIgnored":0,"scannedFull":0,"scannedIndex":1,"filtered":0,"executionTime":5.009174346923828e-4},"warnings":[]},"error":false,"code":201}"
2017-07-12 10:46:18.620 DEBUG [LniProxy,4e849f3b63c73179,7a3998bf127b2daf,true] 14432 --- [nio-8090-exec-2] org.apache.http.headers                  : http-outgoing-0 << HTTP/1.1 201 Created
2017-07-12 10:46:18.620 DEBUG [LniProxy,4e849f3b63c73179,7a3998bf127b2daf,true] 14432 --- [nio-8090-exec-2] org.apache.http.headers                  : http-outgoing-0 << Server: ArangoDB
2017-07-12 10:46:18.620 DEBUG [LniProxy,4e849f3b63c73179,7a3998bf127b2daf,true] 14432 --- [nio-8090-exec-2] org.apache.http.headers                  : http-outgoing-0 << Connection: Keep-Alive
2017-07-12 10:46:18.620 DEBUG [LniProxy,4e849f3b63c73179,7a3998bf127b2daf,true] 14432 --- [nio-8090-exec-2] org.apache.http.headers                  : http-outgoing-0 << Content-Type: application/json; charset=utf-8
2017-07-12 10:46:18.620 DEBUG [LniProxy,4e849f3b63c73179,7a3998bf127b2daf,true] 14432 --- [nio-8090-exec-2] org.apache.http.headers                  : http-outgoing-0 << Content-Length: 223
2017-07-12 10:46:18.620 DEBUG [LniProxy,4e849f3b63c73179,7a3998bf127b2daf,true] 14432 --- [nio-8090-exec-2] o.a.http.impl.execchain.MainClientExec   : Connection can be kept alive indefinitely
2017-07-12 10:46:18.620 DEBUG [LniProxy,4e849f3b63c73179,7a3998bf127b2daf,true] 14432 --- [nio-8090-exec-2] h.i.c.PoolingHttpClientConnectionManager : Connection [id: 0][route: {}->http://localhost:8529] can be kept alive indefinitely
2017-07-12 10:46:18.620 DEBUG [LniProxy,4e849f3b63c73179,7a3998bf127b2daf,true] 14432 --- [nio-8090-exec-2] h.i.c.PoolingHttpClientConnectionManager : Connection released: [id: 0][route: {}->http://localhost:8529][total kept alive: 1; route allocated: 1 of 5; total allocated: 1 of 10]
chenrui
  • 8,910
  • 3
  • 33
  • 43
TobyZ
  • 1
  • 3

1 Answers1

0

Take a look at Sampling interval in the docs :

In distributed tracing the data volumes can be very high so sampling can be important (you usually don’t need to export all spans to get a good picture of what is happening). Spring Cloud Sleuth has a Sampler strategy that you can implement to take control of the sampling algorithm. Samplers do not stop span (correlation) ids from being generated, but they do prevent the tags and events being attached and exported. By default you get a strategy that continues to trace if a span is already active, but new ones are always marked as non-exportable. If all your apps run with this sampler you will see traces in logs, but not in any remote store. For testing the default is often enough, and it probably is all you need if you are only using the logs (e.g. with an ELK aggregator). If you are exporting span data to Zipkin or Spring Cloud Stream, there is also an AlwaysSampler that exports everything and a PercentageBasedSampler that samples a fixed fraction of spans.

http://cloud.spring.io/spring-cloud-sleuth/spring-cloud-sleuth.html#_sampling

Barath
  • 5,093
  • 1
  • 17
  • 42
  • Thanks, I will check this, however the case I am describing is a single service call with multiple internal span calls. In the case that the sampler selects it, it would still take sixty seconds whereas non sampled versions would only take one second. – TobyZ Jul 13 '17 at 02:16