Using Fiber v2.38.1, which in turn use fasthttp/websocket v1.5.0.
App crashed with following error:
2022-10-15T04:05:42.983563+00:00 app[web.1]: time="2022-10-15T04:05:42Z" level=info msg="close msg received: &{418 close}"
2022-10-15T04:05:42.983564+00:00 app[web.1]: conn is nil: false, msg: &{418 close <nil> conn 0}
2022-10-15T04:05:42.986035+00:00 app[web.1]: panic: runtime error: invalid memory address or nil pointer dereference
2022-10-15T04:05:42.986035+00:00 app[web.1]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x78 pc=0x8db967]
2022-10-15T04:05:42.986036+00:00 app[web.1]:
2022-10-15T04:05:42.986037+00:00 app[web.1]: goroutine 86 [running]:
2022-10-15T04:05:42.986037+00:00 app[web.1]: github.com/fasthttp/websocket.(*Conn).beginMessage(0x0, 0xc000348f60, 0x1)
2022-10-15T04:05:42.986037+00:00 app[web.1]: /tmp/codon/tmp/cache/go-path/pkg/mod/github.com/fasthttp/websocket@v1.5.0/conn.go:479 +0x27
2022-10-15T04:05:42.986038+00:00 app[web.1]: github.com/fasthttp/websocket.(*Conn).NextWriter(0x0, 0x1)
2022-10-15T04:05:42.986038+00:00 app[web.1]: /tmp/codon/tmp/cache/go-path/pkg/mod/github.com/fasthttp/websocket@v1.5.0/conn.go:520 +0x45
2022-10-15T04:05:42.986039+00:00 app[web.1]: github.com/fasthttp/websocket.(*Conn).WriteJSON(0xe79a80?, {0xc143a0, 0xc00041e600})
2022-10-15T04:05:42.986039+00:00 app[web.1]: /tmp/codon/tmp/cache/go-path/pkg/mod/github.com/fasthttp/websocket@v1.5.0/json.go:24 +0x45
2022-10-15T04:05:42.986039+00:00 app[web.1]: github.com/kuchaguangjie/go-fit/model.(*WsClient).WriteMsg(0xc00056a280, 0xc00041e600)
2022-10-15T04:05:42.986040+00:00 app[web.1]: /tmp/build_0b2fe533/model/ws_model.go:97 +0xf9
2022-10-15T04:05:42.986040+00:00 app[web.1]: github.com/kuchaguangjie/go-fit/controller.(*ContentController).Ws.func1.2()
2022-10-15T04:05:42.986041+00:00 app[web.1]: /tmp/build_0b2fe533/controller/content_ctl.go:623 +0x171
2022-10-15T04:05:42.986041+00:00 app[web.1]: created by github.com/kuchaguangjie/go-fit/controller.(*ContentController).Ws.func1
2022-10-15T04:05:42.986042+00:00 app[web.1]: /tmp/build_0b2fe533/controller/content_ctl.go:608 +0x10cc
2022-10-15T04:05:43.113930+00:00 heroku[web.1]: Process exited with status 2
It crashed at conn.go:479
, in function beginMessage()
:
if c.writer != nil {
I've checked that c
is not nil (as shown in log), so how it crashed ...
BTW, it's deployed on heroku, I saw following log on heroku:
2022-10-15T04:59:30.344791+00:00 heroku[router]: at=error code=H15 desc="Idle connection"
Not sure is it relevent.
Update: relevant code
- A separate goroutine that handle close msg (part of
content_ctl.go
):
// handle close - from a channel,
// TODO: use goroutine pool ?
go func() {
if r := recover(); r != nil { // TODO: is this proper ?,
fmt.Printf("recover from panic: %v\n", r)
}
closeMsg := <-(*client).CloseChan // handle close,
log.Infof("close msg received: %v", closeMsg)
// send close msg,
closeResp := &model.WsResp{
Status: closeMsg.HttpCode,
Source: model.RespSourceConn,
Msg: closeMsg.Msg,
}
log.Infof("is conn nil: %v, msg: %v\n", client.WsConn == nil, closeResp)
err = client.WriteMsg(closeResp)
if err != nil {
log.Errorf("error - on write close msg: %v", closeResp)
}
ctl.cwm.CloseAndCleanup(client.Id) // close & clean up,
}()
- Definition of
WsClient
struct, and its methodWriteMsg()
(part ofws_model.go
):
// websocket client,
type WsClient struct {
WsUser
ContentUuid string `json:"contentUuid"` // content uuid id,
WsConn *websocket.Conn `json:"-"` // websocket connection,
CloseChan chan *CloseMsg `json:"-"` // close channel,
WriteLock *sync.Mutex // write lock, to avoid concurrent write to ws connection, which will cause panic, acquire lock when write or close,
}
// write ws msg, with lock,
func (wsc *WsClient) WriteMsg(msg *WsResp) error {
if wsc.WsConn == nil {
return errutil.ErrNoConn
}
wsc.WriteLock.Lock()
err := wsc.WsConn.WriteJSON(msg)
wsc.WriteLock.Unlock()
return err
}
CloseMsg
struct (part ofws_model.go
):
// close msg sent to close chan,
type CloseMsg struct {
HttpCode int `json:"httpCode"`
Msg string `json:"msg"`
}
Problem status:
- Actually, before asking the question I've solved the problem by checking the
httpCode
in theCloseMsg
, if it's418
, then I won't send the close msg. - Though the issue is gone for now, but still I want to know the actual reason of the problem, so I asked the question.
Some background:
- I've several possible value for
CloseMsg
'sHttpCode
.418
is the default one, it happens only when unexpected io issue happen on the connection (e.g heroku terminate the connection due to idle ?).- For other cases, e.g if I want to kick out a connection, then I will send other http code.
- From the log, I found the problem happens only when the http code from close msg is
418
. - BTW, though
418
might not be the best http code to use, (it's from a joke), but that doesn't effect the problem I guess. - And, In
WsClient
'sWriteMsg()
method, I've always checked whether the conn nil, before send msg.
Update - more log from heroku
2022-10-16T05:19:03.088622+00:00 heroku[router]: at=error code=H15 desc="Idle connection" method=GET path="/api/v1/content/ws/RgPx8znal7AJb
2022-10-16T05:19:03.065493+00:00 app[web.1]: time="2022-10-16T05:19:03Z" level=info msg="error - on read: websocket: close 1006 (abnormal closure): unexpected EOF"
2022-10-16T05:20:57.758379+00:00 app[web.1]: time="2022-10-16T05:20:57Z" level=info msg="cleanup client, id = 'wscid-ee670cc5-4100-49d6-9857-8284d93a6d33'"
2022-10-16T05:20:57.758505+00:00 app[web.1]: time="2022-10-16T05:20:57Z" level=info msg="close msg received: &{418 close}"
Seems heroku will kill the websocket connection if there is no msg for a while. That's when 418
appear.
And, I've asked a question about heroku killing the websocket connection on SO: heroku keep killing idle connections (websocket)