0

I would like to know what happens if Dapr fails. For example, if my service's sidecar or even the Control Plane fails, what is the expected behavior of my application?

Oh, and would there be any way for me to simulate these error cases?

Context:

In my application I have a service that uses Dapr, but in a non-critical way. Therefore, I would like to ensure that it continues to run normally even if your sidecar or Dapr fails.

Ocimar
  • 53
  • 1
  • 6

1 Answers1

0

Very good question without a straight-forward answer, but I'll share how I look at it. Dapr can be used with monolithic, legacy applications, for migration and modernization purposes for example, but it is more commonly used with distributed applications. In a distributed application, there are many more components that can fail: database, transparent proxy (envoy/), ingress proxy, message broker, producer, consumer... In that regard, Dapr is no different, and it can fail, but there are a few reasons why that is less likely to happen:

  • Dapr is like a technical microservice, it has no business logic, and your app interacts with it over explicit APIs. It is harder for a failure in the sidecar to spread to your app.
  • If the sidecar is exploited, it is harder to get control of the application, acts as a security boundary.
  • As a popular open source project, Dapr has many eyes and users on it. You are more likely to get new bugs found and fixed early.
  • If that happens, upgrading Dapr is much easier than a library upgrade. You can upgrade Dapr control plane with little to no disruptions to your app, and then upgrade select sidecars (a canary release if you want) - I've also done many middleware/library patching/upgrades and I know how much work the latter is in comparison.
  • Each sidecar lives co-located with its app. Any hardware or network failure is likely to impact both the app and sidecar, rather than sidecar only.
  • With Dapr, you get many resiliency and observability benefits OOTB. See my blog on this topic here. It is more likely to improve the reliability of your app than reduce it.
  • When you follow the best practices, and enable k8s health checks, resource constraints, Kubernetes will deal with it. Dapr can even detect the health-status of your app, and stop interacting with it until it recovers.
  • In the end, if there is a bug in Dapr, it may fail. But that can happen wit a library implementing Dapr-like features too. With Dapr, you can isolate the failure, and upgrade faster, w/o a single line of code change, building, testing of the application, that is the difference from perspective of this question.

Disclaimer: I work for a company building products for running Dapr, and I'm highly biassed on this topic.

Bilgin Ibryam
  • 3,345
  • 2
  • 19
  • 20