Questions tagged [observability]

monitoring the internal state of a system by looking at its output

Observability is the ability to answer any question about a business or application through the collection and analysis of data. Succinctly, it’s an approach to understanding the operation of a system by reviewing output from the system. In the software world, observability is generally framed in terms of the ‘three pillars’, or telemetry data types: metrics, traces, and logs. Combining these three types of data gives you the power to answer questions about your business or application that you may not have known you’d need answers to when you set it up.

83 questions
0
votes
0 answers

Observability for DLQs? [AWS SQS]

I'm trying to understand the best way to implement some sort of observability for DLQs. I know it's possible to have some monitoring (e.g., "There are 3 messages in the DLQ XYZ"), but I'm trying to understand if there's a way to know what happened…
0
votes
1 answer

unknown field `healthcheck` with Vector

I am trying to expose the endpoint "http://localhost:8686/health" using Vector so that I can monitor Vector's health. After some research, I came across guides that recommend putting the following code at the beginning of the /etc/vector/vector.toml…
Stanley Ulili
0
votes
1 answer

Unable to send OpenTelemetry traces to OpenSearch

I am trying to send OpenTelemetry traces to OpenSearch, but I am unable to. The traces are visible in the Jaeger UI, but they are not getting ingested into OpenSearch. I have tried everything, including creating collectors and pipelines, but to no avail. This is my…
0
votes
0 answers

How to export logs for certain interval from Kibana Observability/Logs/Stream?

We need to export logs from the stream for a specific time interval (which is set at the top of the main stream frame) into JSON, CSV or any other human-readable format. The logs "live" for 7 days in our system and we often need to have logs to hand…
Arsenii
0
votes
1 answer

Datadog alert includes filtered out environment tag

I have a Datadog query alert like so: min(last_5m):sum:my_service.verification.failure{service:"my_service" AND env:"production"} by {region,env} >= 1. I also enabled the "include tags in the alert title" feature. The problem is that the notification…
0
votes
0 answers

Why use OpenTelemetry logging in a Kubernetes dotnet service?

I have a dotnet web API that has been deployed to a Kubernetes cluster. I see a lot of experienced folks suggesting using the OpenTelemetry SDK with the Console exporter instead of the built-in dotnet logger. However, when I configured (see the…
Abhijit
0
votes
0 answers

Spring Boot observability for event lifecycle

I am creating an application with Spring Boot 3 and its observability features, using Micrometer and Actuator. What my application does is this: it receives an alert, generates events based on this alert, these events trigger some JS execution and the…
0
votes
1 answer

Axios HTTP backend call generates OpenTelemetry span that is not correlated with the parent span

I'm trying to perform manual instrumentation in Node.js (v18.16.1) using OpenTelemetry. The scenario is very simple: I'm making an axios HTTP call to a dummy JSON API and I want to collect the generated trace. Dependencies: "@opentelemetry/api":…
0
votes
0 answers

Need to integrate Splunk with Git such that Splunk can clone and commit to Git

If Splunk detects certain events, it needs to trigger cloning, committing, and creating a pull request on Git. I can find add-ons on Splunkbase that let me send data from Git to Splunk, but is there anything that lets Splunk send commands to Git?
0
votes
1 answer

How do I set response_header_timeout in the Thanos Query Frontend manifest?

I want to increase the response_header_timeout in the Thanos query-frontend deployment manifest because my queries are timing out, but I can't find the expected syntax. Current syntax I am following: - -| …
0
votes
1 answer

Creating a log monitor based on multiple lines

I have a bunch of time-sensitive functions that I schedule to run asynchronously (using an in-house async job scheduler). I am trying to use my observability tool (Datadog) to get alerted if the run times of these functions do not meet specific SLAs…
shridharama
0
votes
0 answers

How to call other master components' /metrics API in a managed environment?

I have a managed Kubernetes cluster; that is, kube-apiserver / kcm / kube-scheduler / etcd are managed, only kube-apiserver can be directly accessed through the network, and the other master components cannot be accessed. Is there a…
flyer
0
votes
0 answers

Dynatrace Distributed Tracing (PurePath) confusing the contexts of requests and raising errors on the wrong API

Dynatrace is confusing the contexts of the requests. The error caused by Request A is being reported in Request B. In the Error tab, we can see that the stack shown for Request B is from Request A. General Timeline API B - Summary API B - Threads API A…
0
votes
0 answers

JVM Spring Boot Actuator metrics and JVM memory calculation from Linux

I'm monitoring a Spring Boot application through Actuator metrics. The Actuator endpoints report these metrics: jvm memory used: 362MB, jvm memory committed: 485MB, jvm memory max: 548MB. Using the Linux top command, I see that the memory used by the JVM is…
0
votes
0 answers

Is it possible to log queries killed for reaching the max_execution_time threshold?

I've set max_execution_time to a sensible value to protect MySQL from rogue queries run by users and prevent resource exhaustion. I still need to find which queries are terminated, so I can analyse and potentially improve them. Looking to the…
Baptiste Mille-Mathias