OpenShift/K8s: project pods do not join the same grid but create multiple isolated grids when using TcpDiscoveryKubernetesIpFinder

Question

I have an issue when an OpenShift project is deployed with an autoscaler configuration like this:

Min Pods = 10
Max Pods = 15

I can see that the deployer immediately creates 5 pods, and TcpDiscoveryKubernetesIpFinder creates not one grid but multiple grids with the same igniteInstanceName.

This issue can be solved with the following workaround.

I changed the autoscaler configuration to start with ONE pod:

Min Pods = 1
Max Pods = 15

And then scaled up to 10 pods (or replicas=10):

Min Pods = 10
Max Pods = 15

It looks like TcpDiscoveryKubernetesIpFinder does not lock when it reads data from the Kubernetes service that maintains the list of IP addresses of all project pods. So when multiple pods start simultaneously, multiple grids are created. But when ONE pod is started first and a grid with this pod is created, new autoscaled pods join this existing grid.

PS: No issues with ports 47100 or 47500; comms and discovery are working.
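
To make the suspected race concrete, here is a minimal sketch in plain Java. It does not call Ignite or Kubernetes at all; the class `GridStartupRace`, the method `countGrids`, and the notion of a startup "wave" are hypothetical names introduced only for illustration. It assumes that every pod in a wave reads the service's address list before any pod of that wave is listed, and that a pod seeing an empty list starts its own grid. This is cruder than what was observed (in practice some pods do find each other), but it shows why starting one pod first avoids the split.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the startup race; no Ignite or Kubernetes APIs are used. */
public class GridStartupRace {
    /**
     * Pods start in waves. All pods of one wave read the address list
     * before any pod of that wave is added to it (the assumed race).
     * Returns how many separate grids exist after all waves have started.
     */
    public static int countGrids(int[] waveSizes) {
        // Hypothetical stand-in for the addresses exposed by the K8s service.
        List<String> readyPods = new ArrayList<>();
        int grids = 0;
        int nextPodId = 0;
        for (int size : waveSizes) {
            // Every pod in this wave gets the same view of the list.
            boolean sawEmptyList = readyPods.isEmpty();
            if (sawEmptyList) {
                // Each pod believes it is the first node and forms its own grid.
                grids += size;
            }
            // Otherwise all pods of the wave join a grid that already exists.
            for (int i = 0; i < size; i++) {
                readyPods.add("pod-" + nextPodId++);
            }
        }
        return grids;
    }

    public static void main(String[] args) {
        System.out.println(countGrids(new int[] {10}));   // prints 10
        System.out.println(countGrids(new int[] {1, 9})); // prints 1
    }
}
```

The second call mirrors the workaround: one pod first, then scale up to ten.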

That's interesting. TcpDiscoveryKubernetesIpFinder doesn't write or register nodes; instead, it just reads information from a K8s service. What are the exact stack trace and error that you had initially? IgniteInstanceName is optional and makes sense only if you start multiple instances within a single JVM, which should not be the case for pods. – Alexandr Shapkin, Feb 14 '22 at 23:02
No errors or stack trace, just multiple grids created with the same name, e.g. of 10 pods available: grid1 = 4 pods, grid2 = 3 pods, grid3 = 2 pods, grid4 = 1 pod (all grids with the same name, e.g. app-api-grid). It looks like the 10 pods/JVMs read the Ignite service almost concurrently, see an empty load balancer, and each pod thinks it can create a new grid (instead of joining an existing one). – Valeri Shibaev, Feb 14 '22 at 23:56
Oh yes, now I see... Agreed, some initialization delay might help here. It looks like a good candidate for an improvement, or a JIRA task at least. – Alexandr Shapkin, Feb 15 '22 at 00:31
Thank you, let me know when an early-adoption fix for TcpDiscoveryKubernetesIpFinder is available. For now I've switched my OpenShift microservice's IgniteConfiguration#discoverySpi to TcpDiscoveryJdbcIpFinder, which solved this issue (as it has this kind of lock, transactionIsolation=READ_COMMITTED). – Valeri Shibaev, Feb 15 '22 at 05:07
Do you mind reviewing my answer? I'm curious whether this could be resolved with a readiness probe. – Alexandr Shapkin, Feb 16 '22 at 19:45

score 1 · Answer 1 · answered Feb 15 '22 at 12:10

The OP confirmed in a comment that the problem is resolved:

Thank you, let me know when an early-adoption fix for TcpDiscoveryKubernetesIpFinder is available. For now I've switched my OpenShift microservice's IgniteConfiguration#discoverySpi to TcpDiscoveryJdbcIpFinder, which solved this issue (as it has this kind of lock, transactionIsolation=READ_COMMITTED).

You can read more about TcpDiscoveryJdbcIpFinder - here.

score 0 · Answer 2 · answered Feb 16 '22 at 19:43

Thanks for the information. Indeed, this might happen if multiple nodes are started simultaneously. I've filed IGNITE-16568 to keep track of it.

In the meantime, there are multiple workarounds. One of them is to use a different IP finder, as you did with TcpDiscoveryJdbcIpFinder.

Another option that I suppose will work is to configure a readinessProbe and, if required, set initialDelaySeconds. It's always recommended to have the probes configured; here is an example of their configuration in Apache Ignite:

readinessProbe:
    httpGet:
        path: /ignite?cmd=probe
        port: 8080
    initialDelaySeconds: 5
    failureThreshold: 3
    periodSeconds: 10
    timeoutSeconds: 10
livenessProbe:
    httpGet:
        path: /ignite?cmd=version
        port: 8080
    initialDelaySeconds: 5
    failureThreshold: 3
    periodSeconds: 10
    timeoutSeconds: 10

Alexandr Shapkin

I could not see how to determine which pod will be first. I think we need to write some kind of marker on the K8s Ignite service saying that "pod_N" is first and that the others should wait until TcpDiscoveryKubernetesIpFinder#getRegisteredAddresses() is not empty, e.g. by adding metadata.labels.pods [pod_N]. – Valeri Shibaev Feb 16 '22 at 22:29
I am not able to do this now, as Ignite REST is not enabled on this project. It will take time to arrange this and requires a few approvals... – Valeri Shibaev Feb 17 '22 at 22:00
Well, honestly, it doesn't seem to be working. A pod appears in the service only once it is fully initialized. – Alexandr Shapkin Feb 18 '22 at 12:23
I guess it's possible to enhance TcpDiscoveryKubernetesIpFinder with a delayed post-initialisation callback that analyses all pods within the OpenShift Ignite service (load balancer) and reorganises disconnected Ignite grids with the same name, e.g. chooses the oldest one to keep and restarts the others. This would be messy code, though; I would much prefer TcpDiscoveryKubernetesIpFinder to be fixed. – Valeri Shibaev Feb 20 '22 at 04:18
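
The "others should wait until the address list is not empty" idea from the comments can be sketched in plain Java as below. This is not part of Ignite: `PeerWait`, `shouldJoinExisting`, and the `designatedFirst` flag are hypothetical, and the `lookup` supplier merely stands in for whatever returns the registered addresses. How a pod learns that it is the designated first one is left open, as in the discussion. Because a pod shows up in the service only once it is ready, the wait is bounded so that pods cannot block each other forever.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical helper; it does not call Ignite or Kubernetes. */
public final class PeerWait {
    /**
     * Returns true if a peer address was seen (join the existing grid),
     * false if the caller should start as the first node.
     * The designated first pod checks once; all others retry up to maxAttempts.
     */
    public static boolean shouldJoinExisting(boolean designatedFirst,
                                             Supplier<Collection<String>> lookup,
                                             int maxAttempts,
                                             long pauseMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        int attempts = designatedFirst ? 1 : maxAttempts;
        for (int i = 0; i < attempts; i++) {
            if (!lookup.get().isEmpty()) {
                return true;
            }
            if (i < attempts - 1) {
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException e) {
                    // Stop waiting, but keep the interrupt flag for the caller.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Simulated lookup: empty on the first two calls, one address afterwards.
        Supplier<Collection<String>> lookup = () -> {
            List<String> addrs = new ArrayList<>();
            if (calls.incrementAndGet() >= 3) {
                addrs.add("10.0.0.5:47500"); // made-up address for the demo
            }
            return addrs;
        };
        System.out.println(shouldJoinExisting(false, lookup, 5, 10)); // prints true
    }
}
```

A bounded wait only narrows the window; if no pod is designated first and all of them time out together, they would still split, so this is a stopgap rather than a replacement for a fix in the IP finder.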
