Our applications use the ODBC driver to access an Impala database. We've discovered that, in certain difficult-to-replicate situations, the driver triggers a segfault within its cgo code, which surfaces as a fatal error once it propagates back up through the driver to our code. Since we want some cleanup and alerting to happen in these situations, I implemented a deferred panic catcher, hoping it would catch them.

However, it isn't working. The fatal error goes straight past the deferred function containing the recover() call, even though that function does catch other panics, so apparently this isn't a panic, despite the printed output looking similar. A GitHub issue suggests that cgo signals cannot be caught, and that applications should crash immediately and gracelessly if one occurs. That is an unacceptable failure mode for our production applications, so I'm wondering whether that has changed in the last 6 years, or whether anyone knows another way to run some cleanup code in the event of a cgo signal. It seems like extremely poor design to have no way at all to catch and handle these fatal errors.
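
For reference, a minimal sketch of the kind of deferred catcher described above. This is not the actual production code, and the helper name `safely` is hypothetical. It shows the cases where recover() does work: an explicit panic and an ordinary Go run-time panic. A segfault raised inside cgo code never reaches the deferred function, because the runtime reports it as a fatal error rather than a panic.

```go
package main

import "fmt"

// safely is a hypothetical helper: it runs fn and converts any Go panic
// into an error so the caller can do cleanup and alerting.
func safely(fn func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered panic: %v", r)
		}
	}()
	fn()
	return nil
}

func main() {
	// An explicit panic is caught by the deferred recover().
	fmt.Println(safely(func() { panic("boom") }))

	// So is a run-time panic raised by Go code, such as writing to a nil map.
	fmt.Println(safely(func() {
		var m map[string]int
		m["x"] = 1
	}))

	// A SIGSEGV raised inside C code called through cgo is different:
	// the runtime treats it as a fatal error and the process exits
	// without running the deferred function above.
}
```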

Kaedys
  • Are you sure it's a "panic", and not a runtime fatal error? – JimB Jan 29 '18 at 18:06
  • Ya, you're right, it prints as a fatal error, just has the same stack trace format as a panic. Any way to catch such a fatal error? I mean, I realize that that fatal error definitely means we can't continue *anything* in the cgo code, since it's in an undefined state, but I'd at least like to be able to do some cleanup/alerting in the go code instead of just gracelessly crashing. Having to build in a heartbeat check or something just seems like so much kludge for something like this. – Kaedys Jan 29 '18 at 18:41
  • Nope, a fatal error is the program crashing ASAP. The cgo code is running in the same address space, so technically the Go memory space could also be in just as bad a state. In order to catch it you're going to have to run it in another process. – JimB Jan 29 '18 at 18:55
  • «It seems like extremely poor design to have no way at all catch and handle these fatal errors.» How do you normally catch `SIGSEGV` signals in your normal C code (not wrapped with the Go runtime)? – kostix Jan 29 '18 at 19:05
  • I've not coded in C in years, so no idea, tbh. One of the big reasons I like Go is that it doesn't really have a concept of uncatchable fatal errors. If it's just flat not catchable, I think I may have to either see if I can't debug the cgo code (ugh), or find/write an alternative driver. – Kaedys Jan 29 '18 at 19:54
  • @Kaedys: except that Go does have a concept of uncatchable fatal errors, which you've just seen. Segfaults or memory corruption (through incorrect use of unsafe, or race conditions), OOM, concurrent writes to maps, etc. can all end up as a fatal error. Go's inherent memory safety just makes it harder to get there than in, for example, C. – JimB Jan 29 '18 at 20:35
  • Ya, I guess. I just hate having such a significant crash vulnerability in production code, and the alternatives are...annoying. Thanks for the help. – Kaedys Jan 29 '18 at 21:21
  • May I invite you to raise your question on [the Go mailing list](https://groups.google.com/forum/#!forum/golang-nuts)? While this SO tag is monitored by a fair number of professionals it is not read by the Go devs. I'd also note that if/when you decide to post there please include in it the Go version and the OS you're using—this might be crucial for properly dealing with the problem. – kostix Jan 30 '18 at 08:37
  • On the off chance that you haven't tried it yet: [signal.Notify](https://golang.org/pkg/os/signal/#Notify) from the os/signal package. – Sridhar Feb 24 '18 at 13:07
  • Is that even capable of catching a SIGSEGV? I've honestly never tried. – Kaedys Feb 27 '18 at 17:30
  • @Kaedys no, signal.Notify() is not capable of catching a SIGSEGV caused by program execution. It can catch a SIGSEGV sent by the Linux `kill` command, but that's of no use. – Martin Del Vecchio Oct 24 '19 at 13:53

0 Answers