
I have an issue where my outbound connections are using the wrong source address. I believe my routing is configured properly.

  • I have an active/standby database on two blades that use a Virtual IP. The VIP is reassigned to the blade with the active database.
  • I have processes running on the same blades that connect to the database.
  • If the DB VIP is on the other blade, database connections are made with the default interface address (172.18.3.12) as the source address, and all is good.
  • If the DB VIP is on the local blade, database connections are made with the database VIP (172.18.3.10) as the source address.

I run into problems when, as in that last case, the connection uses the DB VIP as its source address and the VIP is then reassigned to the other blade. The program stays up, but the address its connection is bound to is gone.

Here are the IP addresses when the database is local:

[root@xxxx-b1 ~]# ip addr list
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
2: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP qlen 1000
    link/ether redacted
3: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether redacted
    inet 172.18.3.12/22 brd 172.18.3.255 scope global bond0
    inet 172.18.3.10/22 brd 172.18.3.255 scope global secondary bond0:0
    inet6 redacted/64 scope link
       valid_lft forever preferred_lft forever

The database VIP (172.18.3.10) is the bond0:0 secondary address.

Here is my routing table:

[root@xxxx-b1 ~]# ip route list
172.18.0.0/22 dev bond0  proto kernel  scope link  src 172.18.3.12 
169.254.0.0/16 dev bond0  scope link  metric 1003 
default via 172.18.0.1 dev bond0  src 172.18.3.12 

Here is an example of using telnet to connect to the database when the database VIP is local:

[root@xxxx-b1 ~]# telnet 172.18.3.10 2315 &
[1] 13676
[root@xxxx-b1 ~]# Trying 172.18.3.10...
Connected to 172.18.3.10.
Escape character is '^]'.

[1]+  Stopped                 telnet 172.18.3.10 2315
[root@xxxx-b1 ~]# netstat -np | grep telnet
tcp  0  0 172.18.3.10:53583  172.18.3.10:2315   ESTABLISHED 13676/telnet        

What am I missing? Is there some way I can make that outbound connection use the bond0 address (172.18.3.12) instead of the database VIP? Setting the src parameter on the route does not seem to help. Maybe that is just not possible?
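To make the goal concrete: the behaviour I am after is what a client gets when it binds its source address explicitly before connecting. Below is a minimal Python sketch of that, using the addresses and port from the telnet example above. The helper name `connect_from` is hypothetical and not part of my setup. This only applies to clients whose code can be changed, and I have not verified that it avoids the hang after a failover; what I am really looking for is a way to get the same source address by default.

```python
import socket


def connect_from(src_ip, dst_ip, dst_port, timeout=5.0):
    """Open a TCP connection to (dst_ip, dst_port) using src_ip as the local address.

    source_address=(src_ip, 0) binds the socket to src_ip before connecting;
    port 0 lets the kernel pick an ephemeral source port.
    """
    return socket.create_connection(
        (dst_ip, dst_port),
        timeout=timeout,
        source_address=(src_ip, 0),
    )


# Intended use on the blade that currently holds the VIP (addresses from this post):
#   sock = connect_from("172.18.3.12", "172.18.3.10", 2315)
#   print(sock.getsockname())  # local (source) address and port of the connection
#   sock.close()
```

If this works as intended, netstat should show 172.18.3.12 rather than the VIP as the local address for such a connection.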

Thanks for any feedback!

Jamie
  • I would like to add that the end result is that, for example, I start psql on the database-local blade, and it connects using the VIP as its source address. Then I fail over the database, and psql hangs on (e.g.) "SELECT 1;" for as long as I let it sit there. Normally it would issue a notice saying the connection was broken. – Jamie Oct 26 '16 at 19:05
  • Okay, one more thing... I would also be happy if the sockets were torn down when the VIP goes away. But that does not appear to be the case, as shown in my previous comment (psql hangs indefinitely). – Jamie Oct 26 '16 at 20:14
