Questions tagged [distributed-tracing]

Distributed Tracing aims to provide better observability into distributed systems and microservices for purposes of performance monitoring and troubleshooting issues.

Distributed Tracing

Modern Internet services are often implemented as complex, large-scale distributed systems. These applications are constructed from collections of software modules that may be developed by different teams, perhaps in different programming languages, and could span many thousands of machines across multiple physical facilities. Tools that aid in understanding system behavior and reasoning about performance issues are invaluable in such an environment.

Source: Dapper, a Large-Scale Distributed Systems Tracing Infrastructure

How it works in a nutshell

Distributed Tracing works by recording a request's entry and exit points, along with useful intermediate data and metrics, from the moment the request arrives until the final response is served to the requester. Some Distributed Tracing systems collect this information fully automatically, while others require manual instrumentation of the code.

When entering a system, the request is usually assigned a unique Trace ID. This ID is then propagated to every participating system. Each of them sends the information it gathers to a backend collector. The collector then groups the data by Trace ID, reconstructing the full path of the request through the distributed system.

Commonly collected metrics include request time, latency, errors, and status codes, among others.
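
The flow described above can be illustrated with a minimal, in-process sketch. It does not use any real tracing library: the `Span` and `Collector` classes, the helper functions, and the `x-trace-id` / `x-parent-span-id` header names are all hypothetical and chosen only for illustration. Real systems propagate the context over the network and use their own header formats.

```python
import time
import uuid
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Span:
    """One timed unit of work inside a single service (hypothetical model)."""
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    service: str
    name: str
    start: float
    end: float = 0.0

    @property
    def duration(self) -> float:
        return self.end - self.start


class Collector:
    """Stand-in for a tracing backend: groups reported spans by Trace ID."""

    def __init__(self) -> None:
        self._by_trace: Dict[str, List[Span]] = defaultdict(list)

    def report(self, span: Span) -> None:
        self._by_trace[span.trace_id].append(span)

    def trace(self, trace_id: str) -> List[Span]:
        return list(self._by_trace[trace_id])


def start_span(service: str, name: str, headers: Dict[str, str]) -> Span:
    # Reuse the incoming Trace ID if there is one; otherwise this service is
    # the entry point and assigns a fresh ID. Header names are assumptions.
    trace_id = headers.get("x-trace-id") or uuid.uuid4().hex
    parent_id = headers.get("x-parent-span-id")
    return Span(trace_id, uuid.uuid4().hex[:16], parent_id, service, name,
                time.perf_counter())


def finish_span(span: Span, collector: Collector) -> None:
    span.end = time.perf_counter()
    collector.report(span)


def backend_service(headers: Dict[str, str], collector: Collector) -> str:
    span = start_span("backend", "load-item", headers)
    result = "item"  # placeholder for real work
    finish_span(span, collector)
    return result


def frontend_service(headers: Dict[str, str], collector: Collector) -> str:
    span = start_span("frontend", "GET /item", headers)
    # Propagate the trace context to the downstream call.
    outgoing = {"x-trace-id": span.trace_id, "x-parent-span-id": span.span_id}
    backend_service(outgoing, collector)
    finish_span(span, collector)
    return span.trace_id


collector = Collector()
trace_id = frontend_service({}, collector)
for s in collector.trace(trace_id):
    print(s.service, s.name, "parent:", s.parent_id, f"{s.duration:.6f}s")
```

Both spans carry the same Trace ID, and the backend span names the frontend span as its parent, which is what lets a collector rebuild the request's path.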

Open Source implementations

Several Open Source implementations for Distributed Tracing exist:

  • http://opencensus.io

    A single distribution of libraries for metrics and distributed tracing with minimal overhead that allows you to export data to multiple backends.

  • http://opentracing.io

    Vendor-neutral APIs and instrumentation for distributed tracing.

  • http://zipkin.io

    Zipkin is a distributed tracing system. It helps gather timing data needed to troubleshoot latency problems in microservice architectures. It manages both the collection and lookup of this data.

  • http://www.jaegertracing.io

    Jaeger, inspired by Dapper and OpenZipkin, is a distributed tracing system released as open source by Uber Technologies. It is used for monitoring and troubleshooting microservices-based distributed systems.

There is also a W3C working group aiming to standardize context propagation across various Distributed Tracing systems.
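
One result of that standardization effort is the W3C Trace Context `traceparent` header, which also comes up in one of the questions listed further down. As a rough sketch, a version `00` header has four dash-separated lowercase hex fields: version, trace ID, parent (span) ID, and flags. The parser below is only an illustration, not a replacement for a tracing library: the `TraceParent` type and both function names are hypothetical, only the version `00` layout is handled, and only the "sampled" flag bit is interpreted.

```python
import re
from typing import NamedTuple, Optional

# version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
_TRACEPARENT_V00 = re.compile(
    r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})"
)


class TraceParent(NamedTuple):
    """Hypothetical container for the parsed header fields."""
    version: str
    trace_id: str
    parent_id: str
    sampled: bool


def parse_traceparent(value: str) -> Optional[TraceParent]:
    """Return the parsed header, or None if it is not a valid v00-style value."""
    match = _TRACEPARENT_V00.fullmatch(value.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # Version "ff" and all-zero IDs are invalid.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return TraceParent(version, trace_id, parent_id, bool(int(flags, 16) & 0x01))


def format_traceparent(tp: TraceParent) -> str:
    """Serialize back to a header value (keeps only the sampled flag bit)."""
    return f"{tp.version}-{tp.trace_id}-{tp.parent_id}-{'01' if tp.sampled else '00'}"


header = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
parsed = parse_traceparent(header)
print(parsed)
```

A service that receives this header would typically keep the trace ID, generate a new span ID for its own work, and send the updated header on to the services it calls.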

Because Distributed Tracing is crucial for application performance monitoring, most APM vendors have adopted it in one way or another. Notable APM vendors offering Distributed Tracing include AppDynamics, Dynatrace, Instana, Lightstep, and New Relic.

219 questions
2
votes
1 answer

Tracing Spring Boot Micro services with Jaeger deployed on AKS

I set up Jaeger in an Azure Kubernetes cluster in the monitoring namespace and deployed my container, which is instrumented with the Jaeger client libraries, in the monitoring domain. The service is up and running and I'm able to see the traces using actuator…
schilaka
2
votes
0 answers

Zipkin Server not logging Http request traces in debug mode

I have enabled distributed tracing for the microservices using the Zipkin tracer, as described in https://github.com/openzipkin/brave. I can see the service call info and time taken in the Zipkin server running on my local machine. I have a use case to…
jack
1
vote
1 answer

OpenTelemetry Propagation in Erlang/Elixir - an example

I have a gRPC API and want to add OTel-based tracing to it. Every request to this API contains a trace/span ID, but I am struggling to properly emit a child span. Here is an example from iex: require OpenTelemetry.Tracer alias…
user1453428
1
vote
1 answer

Do micronaut tracing annotations work on classes that are not beans?

I'm using the Micronaut tracing library to instrument an application. I want to use the tracing annotations (@NewSpan and @ContinueSpan) in classes that are not beans registered in the application context. I'm not getting spans for the methods in those…
Miguel Ferreira
1
vote
0 answers

Rust/Actix current traceId/spanId when out of scope

I'm pretty new to Actix/Rust (coming from Java+SpringBoot). I am setting up a Rust microservice and need to integrate it into the rest of the tracing ecosystem at my company. We are using ECS format for our logs so that we can easily see all of the…
Ross Sullivan
1
vote
1 answer

Golang tracing ending span early in same function as parent span

I have the following code: func handleMessage(msg Message, out chan<- string, w withdrawalSocketListener) { tracer := tracing.GetTracer() ctx, span := tracer.Start(context.Background(), "receive-withdrawal-msg-tcp-socket") defer func()…
BrianM
1
vote
1 answer

Get operation_Id/operation_ParentId in inbound section of APIM policy

I set up an Azure API Management service with the Correlation protocol set to W3C. It uses the traceparent header for context propagation. If the API client sets the traceparent header, the APIM service maps its content to the Azure Application Insights…
1
vote
0 answers

Extract trace context from x-cloud-trace-context request header

I am working on implementing OpenTelemetry tracing for a multi-microservice application deployed in Google Cloud's App Engine. App Engine provides tracing by default, and includes a context in the header (`x-cloud-trace-context`) with each…
1
vote
0 answers

How to use Google Cloud Trace to do distributed tracing for a message-driven system?

I am new to GCT. For most examples I found online, the start and end of a span are on the same machine (or even the same function like defer span.Close()). E.g. try (Scope ss = tracer.spanBuilder("ChildSpan").startScopedSpan()) { ... } Meanwhile…
1
vote
1 answer

Are traces sent from OpenCensus to OpenTelemetry recognizable

I have OpenCensus implemented in my Go application and am planning to send the SpanContext to a downstream service that uses OpenTelemetry. The SpanContext would include populated TraceID, SpanID, Tracestate and I'm planning to send the SpanContext in…
1
vote
0 answers

Using custom metrics in self-hosted sentry

I have started using sentry within my org and loving it so far. I've been trying to use its performance monitoring tool with custom metrics added. While I can add custom metrics to the transactions I'm generating in sentry_sdk (for Python), I can't…
Amir
1
vote
0 answers

How to send traces from rust + tracing_opentelemetry to Honeycomb?

I've set up a basic subscriber for Rust tracing with OpenTelemetry as follows: let rust_log = dotenv::var("RUST_LOG").unwrap(); if rust_log == "OTEL" { // Use OpenTelemetry subscriber let tracer = stdout::new_pipeline().install_simple(); …
Peteris
1
vote
0 answers

How does Envoy participate in tracing using the x-request-id header?

Is x-request-id a header that was standardized in OpenTracing? Do tracing libraries such as those of OpenTelemetry or OpenTracing recognize this header and extract any tracing context from it? If not, can Envoy populate alternative headers such as…
1
vote
0 answers

Different Trace Id while using Feign client in Distributed Tracing

I am implementing distributed tracing in microservices. I'm using a Feign client for inter-service communication and Micrometer for tracing. But I'm getting a different trace ID when I call Service B from Service A using the Feign client. I read a lot of…
1
vote
1 answer

End to end tracing of a flow from API/Scheduler in a Spring Boot application

I am working with a Spring Boot application. I would like to add a TraceId to every request from the moment it hits the application endpoint. In this context, I have added the dependency of spring-cloud-starter-sleuth similar to the following: …
Joy