We are running a Spark Thrift Server. The Thrift driver and the application master are separated by a firewall, and all ports between the two are open. The issue is that after 2 hr 11 min the application master dies because it can no longer connect to the Thrift driver. Which ports are used for communication between the Thrift driver and the application master? I know Thrift is an RPC protocol; does it run over TCP or UDP?

The 2 hr 11 min appears to correspond to net.ipv4.tcp_keepalive_time=7200 (seconds), which is the default on Linux.
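As a rough check of where the extra 11 minutes could come from, here is a minimal sketch of the arithmetic. It assumes that net.ipv4.tcp_keepalive_intvl and net.ipv4.tcp_keepalive_probes are also at their Linux defaults (75 seconds and 9 probes), which is not confirmed for this setup, and that the connection is idle and every keepalive probe goes unanswered. The function name is just for illustration.

```python
def keepalive_drop_seconds(idle=7200, interval=75, probes=9):
    """Seconds until an idle TCP connection is declared dead.

    idle     -- net.ipv4.tcp_keepalive_time (7200 is the value from this question)
    interval -- net.ipv4.tcp_keepalive_intvl (assumed Linux default, 75)
    probes   -- net.ipv4.tcp_keepalive_probes (assumed Linux default, 9)
    """
    # The first probe is sent after `idle` seconds; the connection is dropped
    # once `probes` probes, spaced `interval` seconds apart, all go unanswered.
    return idle + interval * probes


total = keepalive_drop_seconds()
hours, rem = divmod(total, 3600)
minutes, seconds = divmod(rem, 60)
print(f"{total} s = {hours} h {minutes} min {seconds} s")
```

Under those assumptions the total is 7200 + 9 × 75 = 7875 seconds, i.e. about 2 hr 11 min, which would be consistent with the keepalive probes being dropped somewhere between the two hosts.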

I can't increase this value, because that would also affect other applications. So if I get a clear picture of how the Thrift Server communicates, it will be easier for me to configure the firewall.

SARATH CHANDRAN
  • Did you configure Thrift according to this: https://spark.apache.org/docs/latest/sql-distributed-sql-engine.html? Did you try to connect via the command line from the server where your application runs? – Matt Oct 30 '20 at 13:29
  • Yes. The connection works fine for 2 hrs. Then the application master is not able to connect to the driver and it dies. I am wondering whether there is some heartbeat/keepalive mechanism in Thrift that we can tune? – SARATH CHANDRAN Oct 30 '20 at 13:45
  • What is the application master? Are you running a Spark standalone cluster with a master and the Thrift server, is that what you mean? I am not sure they are supposed to stay connected; the important thing is that the Thrift server can send commands to the master on demand. You can validate this by trying to execute queries. – Matt Oct 30 '20 at 14:02

0 Answers