Using

Web-application. NodeJS + Socket.io (websockets only, heartbeats enabled).

Problem

My application works fine (it connects to Socket.io and emits/receives messages correctly), but I see random disconnections that are not caused by heartbeat timeouts. Even when the server has just received a heartbeat packet from a client, it may still disconnect that client a few seconds later, and I do not understand why. It happens almost randomly, roughly every 3-15 minutes. Changing the socket.io configuration does not seem to affect the frequency.

Here is the log and it clearly shows that the reason for disconnection is not heartbeat timeouts, but a transport end (socket end):

Log:

debug: emitting heartbeat for client cTVCsv2GS2R_lh3Ecao-
debug: websocket writing 2::
debug: set heartbeat timeout for client cTVCsv2GS2R_lh3Ecao-
debug: got heartbeat packet
debug: cleared heartbeat timeout for client cTVCsv2GS2R_lh3Ecao-
debug: set heartbeat interval for client cTVCsv2GS2R_lh3Ecao-
info: transport end (socket end) 
// ^ Why?
debug: set close timeout for client cTVCsv2GS2R_lh3Ecao-
debug: cleared close timeout for client cTVCsv2GS2R_lh3Ecao-
debug: cleared heartbeat interval for client cTVCsv2GS2R_lh3Ecao-
SOCK 25.12 22:23:20.675 disconnection from IO detected (USER1) 
// ^ This means disconnected event fired
debug: discarding transport
debug: client authorized
info: handshake authorized GPBLHsqhhrdsTeqTcao_
debug: setting request GET /socket.io/1/websocket/GPBLHsqhhrdsTeqTcao_?conftoken=20519986772
debug: set heartbeat interval for client GPBLHsqhhrdsTeqTcao_
debug: client authorized for
debug: websocket writing 1::
debug: client authorized for /io/activity
debug: websocket writing 1::/io/activity
SOCK 25.12 22:23:23.005 reconnected to IO (USER1)

Configuration

Client:

socket = io.connect('/io/activity',{'max reconnection attempts':Infinity})

Server:

io = require('socket.io').listen(server, { 
      log: true
    , "close timeout": 120
    , "heartbeat timeout": 120
    , "heartbeat interval": 30
    , "transports": ["websocket"]
})

io.enable('browser client minification')
io.enable('browser client etag')
io.enable('browser client gzip')
igorpavlov
  • from the looks of it though, it seems like a heartbeat timeout as the second time in your log the server was supposed to get a heartbeat packet, it logged a transport end. can you confirm that a heartbeat packet was received and then the transport ended? – Hayko Koryun Dec 28 '13 at 13:43
  • Yes, I am sure it is not heartbeat timeout. At first, when it is actually a timeout the log writes "info: transport end (heartbeat timeout)", not "info: transport end (socket end)". Also if I switch off heartbeats or set their timeouts to let's say 2 hours, the problem still appears. – igorpavlov Dec 28 '13 at 14:05
  • which version are you using? there seems to be a recurring problem with 0.9 versions: https://github.com/LearnBoost/socket.io/issues/777 – Hayko Koryun Dec 28 '13 at 14:20
  • Yes, I use 0.9.16. But I am not sure this bug is related to my issue. I do not have constant disconnects, like the people in that thread, but random ones. The frequency of my disconnects does not depend on the timeouts, while in that thread it clearly does: it says "after a minute" and "in 25 seconds", which are the default values of the heartbeat settings. Mine happen every 3-15 minutes. – igorpavlov Dec 28 '13 at 14:56
  • Ah, I am actually wrong here. It is written, that it happens after 3rd heartbeat sent, so it looks random as well. I will check it out. – igorpavlov Dec 28 '13 at 15:04
  • Unfortunately what browser/platform your client is running on may definitely make a difference. I've had the unfortunate experience of running into problems with iOS imperceptibly dropping the WiFi momentarily due to incorrect gateway address assignment for a static ip style connection. So what are you running on? Is it consistent across browsers/platforms? – J Trana Jan 04 '14 at 06:20
  • I'm having the exact same issue as you at the moment. I believe it started after adding authorization to my app using io.set 'authorization', function() { //do stuff here } Do you do any authorization with your app? – Tim Jan 08 '14 at 10:24
  • Sorry for late reply guys. Trana, at least in Chrome and Firefox. Tim, yes, I do have an authorization, are you sure it happens because of this? Because it is possible to switch off authorization and feed each connection with session cookie instead. – igorpavlov Jan 10 '14 at 07:44
  • Are you running on HTTP or HTTPS? We've found that (proper) HTTPS connections tend to be somewhat more reliable as they are less fiddled with by firewalls. – Simon Jan 26 '14 at 22:11
  • I run all website content on HTTPS and use WSS for websockets. – igorpavlov Jan 27 '14 at 18:51
  • How many websockets are connected? – Erin Ishimoticha Feb 04 '14 at 21:32
  • Always one websocket connection for each web-page. To be more specific, always one websocket for one browser (I restrict to open 2 connections in one browser). – igorpavlov Feb 06 '14 at 02:55
  • If you need only the WebSocket protocol, I would suggest using a pure WebSocket module rather than socket.io. Socket.io has a layer with its own "heartbeat" implementation for other transports and keeps it for WebSockets, which makes little sense: a WebSocket is a persistent TCP connection, and on disconnect both sides are aware of the loss of the TCP connection. Because it has to support multiple transports, the WebSocket path in socket.io carries extra layers. Using https://github.com/einaros/ws for WebSockets will be much more efficient. – moka Feb 06 '14 at 11:57
  • How do you solve the problem of loose internet connection (fast, but not stable, for example)? When you close the browser, it immediately sends a disconnection message and TCP connection gets closed. What if client just switches off wifi? TCP connection will live forever if there is no such timeout as heartbeat. – igorpavlov Feb 07 '14 at 12:33
  • @igorpavlov Node TCP has got timeout: http://nodejs.org/api/all.html#all_socket_settimeout_timeout_callback although I remember idle connections with no keep alive don't survive for a long time. – Farid Nouri Neshat Mar 21 '14 at 12:25
  • I ran into a problem like this on heroku bc my dynos were going to sleep despite websocket utilization. Just a shot in the dark. – CharlesTWall3 Mar 29 '14 at 03:14
  • Is this a problem you see locally, or only when talking to your service deployed in a data center somewhere? I ask because if it is not local, you are at the mercy of every router and firewall your connection passes through on the internet, and it could be that some router thinks the connection is corrupt because it stays open far longer than typical HTTP(S) connections and only a little data is sent. – Cellfish Apr 08 '14 at 19:33
  • Hi, I'm also facing the same issue. Is there any solution? – Dev Apr 17 '15 at 07:34
  • try to update to latest socket.io and let me know if issue still happens – igorpavlov Apr 17 '15 at 17:34
  • I'm facing the same issue. Can anyone help? https://stackoverflow.com/questions/56538814/socket-io-chat-disconnecting-clients-randomly-ping-timeout-and-transport-close – Faizan Jun 11 '19 at 07:53
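
As a rough sketch of the `socket.setTimeout` suggestion from the comments above: assuming you have direct access to a Node `net.Socket` (socket.io does not necessarily expose one, and `closeWhenIdle` is a made-up helper name, not part of any library), an idle connection could be closed like this. Node's `'timeout'` event only reports inactivity, so the handler has to close the socket itself.

```javascript
const net = require('net');

// Hypothetical helper (not a socket.io or Node API): closes a TCP socket
// that has seen no activity for `idleMs` milliseconds.
function closeWhenIdle(socket, idleMs) {
  // Ask Node to emit 'timeout' after `idleMs` of inactivity.
  socket.setTimeout(idleMs);

  socket.on('timeout', () => {
    // 'timeout' is only a notification; the connection stays open
    // unless we end or destroy it explicitly.
    socket.destroy();
  });

  return socket;
}
```

For example, `net.createServer((socket) => closeWhenIdle(socket, 120000))` would drop connections that stay silent for two minutes. Whether that is preferable to application-level heartbeats depends on the deployment.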

1 Answer

When I use a package whose inner workings I don't know, I normally code defensively.

I was getting random errors with socket.io a few years back, and I simply listened for error events and restarted the connection. Note that some of a client's data may be lost, so be careful when handling errors.

It would be helpful to take a look at socket.io errors. Good luck.
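
A minimal sketch of that defensive approach, under stated assumptions: `reconnectOnError` is a hypothetical helper, the socket is treated as a generic event emitter with `'error'` and `'connect'` events, and the actual reconnect call is supplied by the caller because it differs between socket.io versions. The backoff values are illustrative only.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: after an 'error' event on a socket-like emitter,
// schedule the caller-supplied `reconnect` function with a capped
// exponential backoff. Anything sent while disconnected may be lost,
// so the caller should buffer or re-send as needed.
function reconnectOnError(socket, reconnect, options) {
  const opts = options || {};
  const baseDelayMs = opts.baseDelayMs || 1000;
  const maxDelayMs = opts.maxDelayMs || 30000;
  let attempts = 0;
  let timer = null;

  socket.on('error', (err) => {
    if (timer) return; // a reconnect is already scheduled
    const delay = Math.min(baseDelayMs * Math.pow(2, attempts), maxDelayMs);
    attempts += 1;
    timer = setTimeout(() => {
      timer = null;
      reconnect(err);
    }, delay);
  });

  // A successful connection resets the backoff.
  socket.on('connect', () => {
    attempts = 0;
  });

  return {
    attempts: () => attempts,
    stop: () => {
      clearTimeout(timer);
      timer = null;
    },
  };
}
```

With the client from the question, `socket` would be the object returned by `io.connect(...)`. How the connection is actually restarted is left to the `reconnect` callback.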

markuz-gj
  • Take a look at another question about "data loss" solving http://stackoverflow.com/questions/20685208/websocket-transport-reliability-socket-io-data-loss-during-reconnection. – igorpavlov May 18 '14 at 20:43