
I have a Kubernetes cluster with Istio set up, and one of the services is a REST API with some endpoints exposed through a load balancer.

The issue is that all my other endpoints work fine, but there is one endpoint that takes a long time to run, roughly 35s (in my local dev environment).

Whenever I call this endpoint, the request hangs for about 25s, the pod restarts, and the API returns:

upstream connect error or disconnect/reset before headers. reset reason: connection termination

and in my istio-proxy logs I see:

2023-04-12T13:20:43.449969Z error   Request to probe app failed: Get "http://10.0.79.104:8080/": context deadline exceeded (Client.Timeout exceeded while awaiting headers), original URL path = /app-health/server/readyz
app URL path = /

There are no errors from the API itself.
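For context, the probe behind that log line would look roughly like this (sketch only – the container name, port and path come from the log above, everything else is a placeholder). Raising timeoutSeconds / failureThreshold here is one of the knobs I was considering:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: server
  template:
    metadata:
      labels:
        app: server
    spec:
      containers:
        - name: server                            # matches the /app-health/server/readyz rewrite in the log
          image: example.registry/server:latest   # placeholder image
          ports:
            - containerPort: 8080                 # matches 10.0.79.104:8080 in the log
          readinessProbe:
            httpGet:
              path: /                             # "app URL path = /" from the log; Istio rewrites it to /app-health/server/readyz
              port: 8080
            timeoutSeconds: 5                     # how long each probe waits before "context deadline exceeded"
            periodSeconds: 10
            failureThreshold: 6                   # tolerate a few slow probes before the pod is marked unready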

I'm wondering if this happens because the Istio proxy expects the response to arrive in under ~25s. Can I raise that limit, or is there another workaround for this issue?
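If raising it is possible, would overriding the per-route timeout in a VirtualService along these lines be the right way to do it? (Again just a sketch – the host, namespace and port are placeholders for my service.)

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: server
spec:
  hosts:
    - server.default.svc.cluster.local   # placeholder host for the API service
  # add a `gateways:` list here if the route is exposed through an Istio ingress gateway
  http:
    - route:
        - destination:
            host: server.default.svc.cluster.local
            port:
              number: 8080
      timeout: 60s                        # give the slow endpoint more headroom before Envoy gives up on the request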

dasith
  • It seems like the underlying problem is that your request is 10-100x slower than one might normally expect an HTTP request to be. Have you figured out where the performance bottleneck is? Would an asynchronous channel be more appropriate? – David Maze Apr 12 '23 at 13:57
  • (IME there can be a 60-second timeout at multiple layers, and if you're getting even close to that you can hit intermittent timeouts. The Kubernetes probe requests often have an even shorter timeout, and if you're hitting some sort of thread-starvation issues, your stalled long-running requests can block the health checks, and your Pods then get unexpectedly killed.) – David Maze Apr 12 '23 at 13:58
  • Actually, I tested my endpoint locally a couple of minutes ago and it responded in around 6s. The thing is, my endpoint sends multiple requests to a gRPC server, fetches the data, and returns it as an array, which can take a while depending on the gRPC server's response times. I'm also starting to wonder if something is going wrong because I'm sending a lot of requests to that gRPC server. – dasith Apr 12 '23 at 14:48
  • Note that the gRPC server isn't going down, only the API. I have a cache, and some records are cache hits, which means the gRPC server did respond to some calls while others failed. The connection between the gRPC server and the API is also working properly. – dasith Apr 12 '23 at 14:49
  • I actually managed to figure out the issue: the pod was dying due to an out-of-memory (OOMKilled) condition. https://www.airplane.dev/blog/oomkilled-troubleshooting-kubernetes-memory-requests-and-limits (a requests/limits sketch along those lines is below, after these comments) – dasith Apr 13 '23 at 20:43
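A minimal sketch of the kind of requests/limits block the linked article describes – the names and numbers are placeholders that have to be sized for the endpoint's worst-case fan-out, not the actual fix:

# Fragment of the container spec (goes under spec.template.spec.containers in the deployment sketch above);
# values are placeholders.
containers:
  - name: server
    image: example.registry/server:latest   # placeholder image
    resources:
      requests:
        memory: "512Mi"   # what the scheduler reserves on the node for this container
        cpu: "250m"
      limits:
        memory: "1Gi"     # exceeding this gets the container OOMKilled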

0 Answers