I have to run my initiator and acceptor on two different operating systems, and I am seeing a strange socket disconnection. I am using C++ with QuickFIX. The same setup has run on Debian/Ubuntu for a long time without issues, but establishing a connection between CentOS and Ubuntu fails. These are the scenarios I have tried:
Scenario-1 (Problem Scenario)
My initiator is running on Machine-1 which is CentOS
My acceptor is running on Machine-2 which is Ubuntu
When I try to connect, I get the following error:
<20121213-03:57:41.619, FIX.4.2:ft-trade->ES, event>
(Connecting to 10.0.0.40 on port 31209)
<20121213-03:57:41.620, FIX.4.2:ft-trade->ES, outgoing>
(8=FIX.4.2 9=77 35=A 34=1 49=ft-trade 52=20121213-03:57:41.620 56=ES 98=0 108=30 141=Y 10=230 )
<20121213-03:57:41.620, FIX.4.2:ft-trade->ES, event>
(Initiated logon request)
<20121213-03:57:41.621, FIX.4.2:ft-trade->ES, event>
(Socket Error: Connection reset by peer.)
<20121213-03:57:41.621, FIX.4.2:ft-trade->ES, event>
(Disconnecting)
Please ignore the checksum field in the packet above; I had to change the sender/target CompIDs before posting here.
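For anyone who wants to check why the edited message no longer validates: the FIX CheckSum (tag 10) is the sum of every byte of the message up to and including the SOH delimiter just before `10=`, taken modulo 256 and written as three digits. Below is a minimal sketch of that calculation. `fixChecksum` is a hypothetical helper name used only for illustration, not a QuickFIX function, and the input must contain the real SOH (0x01) delimiters rather than the spaces shown in the log.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not part of QuickFIX): computes the FIX CheckSum
// over the bytes that precede the "10=" field, including the final SOH.
std::string fixChecksum(const std::string& bytesBeforeTag10) {
    unsigned sum = 0;
    for (unsigned char c : bytesBeforeTag10) {
        sum += c;  // plain byte-wise sum
    }
    char out[4];   // three digits plus the terminating NUL
    std::snprintf(out, sizeof out, "%03u", sum % 256);
    return out;
}
```

Changing any character in tag 49 or tag 56 changes the sum, which is why the posted value does not match the posted message.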
Scenario-2 (Works fine)
I moved the same initiator code to a different machine, Machine-3. Both ends are Ubuntu now.
From Machine-3, I can successfully connect to Machine-2.
This scenario had no issues, so my settings file etc. are all good.
Scenario-3 (Works fine)
I moved the same acceptor code to Machine-1. Both ends are CentOS now.
Again, I could successfully connect.
I also checked for firewall-related issues, but there do not seem to be any, since telnet from Machine-1 to Machine-2 succeeds.
As I understand it, this TCP/IP error occurs when the peer disconnects or closes the socket after opening it. But from the way the message appears in the log, I cannot tell whether the error comes from TCP/IP or from QuickFIX. I do not see any reason for a TCP/IP handshake problem, since telnet also works fine.
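One way to separate the two layers is to take QuickFIX out of the picture: a bare telnet connect only exercises the TCP handshake, whereas the reset in the log arrives after the logon bytes were sent. The sketch below, using plain POSIX sockets on Linux, connects, sends a payload, and reports whether the peer answers, closes cleanly, or resets. `classifyRecv` and `probe` are hypothetical names for illustration, the 5-second receive timeout is an arbitrary choice, and the payload is assumed to be a logon message with real SOH (0x01) delimiters.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

#include <cerrno>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper: turn the result of recv() into a readable verdict.
std::string classifyRecv(ssize_t n, int err) {
    if (n > 0) return "data received";
    if (n == 0) return "peer closed the connection (FIN)";
    if (err == ECONNRESET) return "connection reset by peer (RST)";
    if (err == EAGAIN || err == EWOULDBLOCK) return "timed out waiting for a reply";
    return std::string("recv failed: ") + std::strerror(err);
}

// Hypothetical helper: connect to an IPv4 address, send the payload once,
// and describe what the peer does next.
std::string probe(const std::string& host, int port, const std::string& payload) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return std::string("socket failed: ") + std::strerror(errno);

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(static_cast<std::uint16_t>(port));
    if (inet_pton(AF_INET, host.c_str(), &addr.sin_addr) != 1) {
        close(fd);
        return "bad IPv4 address";
    }

    timeval tv{};
    tv.tv_sec = 5;  // assumed timeout so the probe cannot block forever
    setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);

    if (connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof addr) != 0) {
        std::string msg = std::string("connect failed: ") + std::strerror(errno);
        close(fd);
        return msg;
    }

    // MSG_NOSIGNAL avoids SIGPIPE if the peer has already gone away.
    if (send(fd, payload.data(), payload.size(), MSG_NOSIGNAL) < 0) {
        std::string msg = std::string("send failed: ") + std::strerror(errno);
        close(fd);
        return msg;
    }

    char buf[4096];
    ssize_t n = recv(fd, buf, sizeof buf, 0);
    int err = errno;  // capture before close() can overwrite it
    close(fd);
    return classifyRecv(n, err);
}
```

Calling `probe("10.0.0.40", 31209, logon)` from Machine-1 and again from Machine-3 would show whether the reset also happens without QuickFIX on the initiator side. If it does, the acceptor host or something on the path between the machines is dropping the connection after the first data arrives; if it does not, the difference is more likely in how the initiator behaves on CentOS.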