A few days ago, Google published this article: https://cloud.google.com/blog/big-data/2018/07/developing-a-janusgraph-backed-service-on-google-cloud-platform
According to it, it is common to deploy JanusGraph as a separate instance behind an internal load balancer.
In my project we have pretty much the same architecture: Bigtable, GKE running JanusGraph, and an application that calls JanusGraph through a load balancer. The only difference (I don't know whether it matters) is that we don't have an internal load balancer; we have what I believe is an external one.
So the question is: what is the state of load balancing when using the Gremlin driver in a Java application? Our research shows that it does not work. Because the connections are stateful, traffic is not distributed across the JanusGraph replicas. Once a connection sticks to one replica, it stays with that replica until the end. Moreover, when that replica is killed, the connection simply hangs, with no exception, warning, or log message. There is no information about the state of the connection at all. This is bad because, if we assume an autoscaling setup that spins up additional replicas when needed, it will simply not work.
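To illustrate what we think is happening, here is a toy model in plain Java. It does not use the Gremlin driver or any real load balancer; the class and method names (`StickyConnectionDemo`, `ConnectionLevelBalancer`, and so on) are made up for illustration. It assumes the balancer picks a replica only when a connection is opened, which is our guess at the behavior, not something we have confirmed.

```java
import java.util.Arrays;

public class StickyConnectionDemo {

    /** Toy balancer (hypothetical): chooses a replica round-robin, but only when a connection is opened. */
    static class ConnectionLevelBalancer {
        private final int replicaCount;
        private int next = 0;

        ConnectionLevelBalancer(int replicaCount) {
            this.replicaCount = replicaCount;
        }

        /** Returns the index of the replica the new connection is pinned to. */
        int openConnection() {
            int replica = next;
            next = (next + 1) % replicaCount;
            return replica;
        }
    }

    /** All requests travel over one long-lived connection, so they all reach the same replica. */
    static int[] requestsPerReplicaWithSingleConnection(int replicas, int requests) {
        ConnectionLevelBalancer balancer = new ConnectionLevelBalancer(replicas);
        int[] hits = new int[replicas];
        int pinnedReplica = balancer.openConnection();
        for (int i = 0; i < requests; i++) {
            hits[pinnedReplica]++;
        }
        return hits;
    }

    /** A fresh connection per request lets the balancer choose a replica every time. */
    static int[] requestsPerReplicaWithConnectionPerRequest(int replicas, int requests) {
        ConnectionLevelBalancer balancer = new ConnectionLevelBalancer(replicas);
        int[] hits = new int[replicas];
        for (int i = 0; i < requests; i++) {
            hits[balancer.openConnection()]++;
        }
        return hits;
    }

    public static void main(String[] args) {
        // One persistent connection: prints [9, 0, 0]
        System.out.println(Arrays.toString(requestsPerReplicaWithSingleConnection(3, 9)));
        // One connection per request: prints [3, 3, 3]
        System.out.println(Arrays.toString(requestsPerReplicaWithConnectionPerRequest(3, 9)));
    }
}
```

If this model is right, adding replicas would not help a client that keeps a single long-lived connection, which matches what we observe.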
We are using JanusGraph 0.2.1 with the corresponding TinkerPop driver 3.2.9 (we have also tried many other combinations), and the pattern stays the same: load balancing does not work for us, and neither does failover when a pod gets killed. To make this even worse, it is not really deterministic: we had some tests where it worked, but when we returned to the same test after a while, it did not.
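For the silent hang specifically, one caller-side guard that might at least make the failure visible is to bound the wait on each request. The sketch below is plain Java and does not touch the Gremlin driver: the never-completed `CompletableFuture` merely stands in for a request sent to a killed replica, and `BoundedWaitDemo` and `awaitOrReport` are hypothetical names. It is not a fix for failover, only a way to turn an indefinite hang into an observable timeout.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedWaitDemo {

    /** Waits at most timeoutMillis for the result and reports what happened instead of blocking forever. */
    static String awaitOrReport(CompletableFuture<String> pending, long timeoutMillis) {
        try {
            return pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // No answer in time: give up on this request so the caller can react (log, reconnect, retry).
            pending.cancel(true);
            return "timed out";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag for the caller
            return "interrupted";
        } catch (ExecutionException e) {
            return "failed: " + e.getCause();
        }
    }

    public static void main(String[] args) {
        // Stand-in for a request to a killed replica: this future is never completed.
        CompletableFuture<String> deadReplica = new CompletableFuture<>();
        System.out.println(awaitOrReport(deadReplica, 200)); // prints "timed out"

        // Stand-in for a healthy replica that already answered.
        System.out.println(awaitOrReport(CompletableFuture.completedFuture("42"), 200)); // prints "42"
    }
}
```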
Do you, Stack Overflowers, have any idea what the state of this problem is?