
This problem seems to happen randomly, and between different services.

For example, we have a service A that needs to talk to service B. Sometimes we get this error, but after a while it goes away. It doesn't happen very often.

When this happens, service A logs the “upstream connect error” message, but service B logs nothing. So we think it might be related to the sidecars.

One thing we noticed is that in service B we get a lot of these error messages in the istio-proxy container:

[src/istio/mixerclient/report_batch.cc:109] Mixer Report failed with: UNAVAILABLE:upstream connect error or disconnect/reset before headers. reset reason: connection failure

According to the documentation, when a request comes in, Envoy asks Mixer whether everything is in order (authorization and other checks), and if Mixer doesn't reply, the request fails. That is why an option called policyCheckFailOpen exists. We have it set to false, which I guess is a sane default: we don't want the request to go through if Mixer cannot be reached. But why can't it be reached?

disablePolicyChecks: true
policyCheckFailOpen: false
controlPlaneSecurityEnabled: false

NOTE: istio-policy is running with the istio-proxy sidecar. Is that correct?

We don't see that error in another service that can also fail.

Another log message I see a lot, in all the services that are not running as root and have fsGroup defined in their YAML files, is:

watchFileEvents: "/etc/certs": MODIFY|ATTRIB
watchFileEvents: "/etc/certs/..2020_02_10_09_41_46.891624651": MODIFY|ATTRIB
watchFileEvents: notifying

One of the leads I'm chasing is the default circuitBreakers values. Could that be related to this?
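
For reference, circuit breaker limits in Istio are set per destination through the connectionPool section of a DestinationRule. The following is only a sketch of what pinning them explicitly for service B could look like: the resource name, namespace, host, and every numeric value are assumptions chosen for illustration, not values taken from our cluster or from Istio's defaults.

```yaml
# Sketch only: name, namespace, host and all limits below are illustrative assumptions.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: service-b-connection-pool   # hypothetical name
  namespace: default                # hypothetical namespace
spec:
  host: service-b.default.svc.cluster.local   # hypothetical host for service B
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 1024          # max TCP connections to the destination
      http:
        http1MaxPendingRequests: 1024 # max requests queued while waiting for a connection
        http2MaxRequests: 1024        # max concurrent requests to the destination
        maxRetries: 3                 # max concurrent retries to the destination
```

If the limits were the cause, raising them for one destination and watching whether the errors from service A change would be one way to test the theory.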

Thanks

codiaf

2 Answers


The error you are seeing is caused by a failure to establish a connection to istio-policy.

Based on this GitHub issue, community members posted two answers that could help with your issue:


If mTLS is enabled globally make sure you set controlPlaneSecurityEnabled: true


I was facing the same issue, then I read about protocol selection. I realised the name of the port in the service definition should start with a protocol prefix, for example http-. This fixed the issue for me. If you still face the issue, you might need to look at the tls-check for the pods and resolve it using destinationrules and policies.
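
To illustrate the port naming mentioned in that answer, here is a minimal sketch of a Service whose port name carries the protocol prefix. The service name, label, and port numbers are hypothetical; only the naming pattern (the protocol, optionally followed by a dash and a suffix) is the point.

```yaml
# Sketch only: service name, selector label and port numbers are hypothetical.
apiVersion: v1
kind: Service
metadata:
  name: service-b
spec:
  selector:
    app: service-b
  ports:
    - name: http-web      # "http" or "http-<suffix>" lets Istio treat this port as HTTP
      port: 8080
      protocol: TCP       # Kubernetes transport protocol, separate from the Istio port name
      targetPort: 8080
```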


istio-policy is running with the istio-proxy sidecar. Is that correct?

Yes, I just checked it and it's with sidecar.


Let me know if that helps.

Jakub
    Hi, thanks for the reply. I read that issue, but those 2 solutions don't apply to us. We do have the proper protocol selection, and we don't have mTLS enabled globally, only per namespace. Also, I now know that the Mixer report error only concerns telemetry, so I don't think it is the real issue. However, I found something interesting about the default circuit breaker values in our current Istio version, and saw that 1.4 (we have 1.3) removes these default values. Could this be a good candidate to investigate? We're trying the upgrade to see what happens. – codiaf Feb 12 '20 at 08:08
    I would recommend upgrading to the newest version, which is 1.4.4, because if this was caused by a bug, it may already be fixed there. If not, I will try to find the issue. Let me know if it's still happening after the upgrade. – Jakub Feb 13 '20 at 08:25
    Hi, were you able to find a solution for this? We are facing the same issue. Is there any direction for solving it? – Mohsin May 10 '20 at 11:08

Another possible cause is that the k8s Service cannot connect to the container because the container is exposed on a different port. Validate the port exposed by the container, e.g.:

K8s Service ports:

    ports:
      - name: http
        port: 8080
        protocol: TCP
        targetPort: 8080

The container running inside the pod should also expose port 8080.
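
As a sketch of the pod side of that check, a Deployment whose container port lines up with a Service targetPort of 8080 could look like the following. The names, labels, and image are hypothetical. Note that containerPort is mostly informational in Kubernetes, so the process inside the container must actually listen on that port.

```yaml
# Sketch only: names, labels and image are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: service-b
spec:
  replicas: 1
  selector:
    matchLabels:
      app: service-b
  template:
    metadata:
      labels:
        app: service-b            # must match the Service selector
    spec:
      containers:
        - name: service-b
          image: example.com/service-b:latest   # hypothetical image
          ports:
            - containerPort: 8080  # should equal the Service targetPort
```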

Lokesh kumar