
The overwhelming majority of my R analysis involves connecting to a MySQL database on a remote machine via DBI and RMySQL (using tidyverse-style connections). I'll typically build various and sundry lazy tables and eventually collect() some of them.

Invariably, if I leave my workstation for lunch or a meeting, when I return I get a "could not run statement: MySQL server has gone away" error when I try to access a lazy table, forcing me to re-initialize my database connection and re-run the script that builds the lazy tables.

I tried using the pool package to create my database connections, but I still encounter the error just as regularly.

My database connection typically looks something like this:

library(pool)
library(dplyr)  # needed for tbl(), %>%, and left_join()

myconnection <- dbPool(RMySQL::MySQL(), dbname = "mydb",
                       host = "the_remote_machine",
                       user = "myusername", password = "mypassword")

tbl1 <- tbl(myconnection, "table1")
tbl2 <- tbl(myconnection, "table2")

tbl_joined <- tbl1 %>%
  left_join(tbl2, by = "id")  # dplyr joins take `by =`, not `on =`

# go to lunch, then a meeting that runs long
# come back 3 hours later

tbl_joined
Error in .local(conn, statement, ...) : 
  could not run statement: MySQL server has gone away

pool::dbIsValid(myconnection)
[1] TRUE

I'm the only user on the remote machine and rarely run more than one query at a time, so optimizing performance for multiple connections isn't a consideration. Also, I'm not building an application (like a Shiny app or some such); I'm just looking for a way to avoid the "MySQL server has gone away" error when working interactively.

I considered using pool::dbIsValid(myconnection) to build in some sort of checking, but it returns TRUE even when the database has gone away.
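Since dbIsValid() doesn't detect the dropped server, a trivial query seems to be the only reliable check. A minimal sketch of what I mean (connect_pool() is a hypothetical helper wrapping my dbPool() call above, not an existing function):

```r
# Sketch: ping the server with a trivial query; if it fails,
# rebuild the pool via the caller-supplied connect_pool() helper.
ping_or_reconnect <- function(pool_conn, connect_pool) {
  ok <- tryCatch({
    DBI::dbGetQuery(pool_conn, "SELECT 1")  # cheap round-trip to the server
    TRUE
  }, error = function(e) FALSE)
  if (ok) pool_conn else connect_pool()
}

# usage (hypothetical):
# myconnection <- ping_or_reconnect(myconnection, connect_pool)
```

But wrapping every table access this way is clunky for interactive work, and any lazy tbl() objects built on the old connection would still need to be rebuilt.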

Also, I don't close my connections at the end of each script: because I'm working interactively, I want the database to always be available until I'm done with my analysis altogether. I don't want to close my database connection until I close R. On the MySQL server, wait_timeout is set to 2000000 seconds, which should be about 555 hours. I'm not interested in preserving my connections THAT long; more like 3 to 24 hours of idle time.
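One idea I've toyed with is a keep-alive that periodically issues a trivial query so the connection never sits idle long enough to drop. This is an untested sketch, assuming the later package is available (its callbacks run whenever the R console is idle, which suits interactive sessions):

```r
library(later)

# Sketch: re-schedule a cheap SELECT 1 every `interval` seconds.
keep_alive <- function(conn, interval = 60 * 30) {
  later::later(function() {
    # ignore errors so a dead connection doesn't kill the timer
    try(DBI::dbGetQuery(conn, "SELECT 1"), silent = TRUE)
    keep_alive(conn, interval)  # schedule the next ping
  }, delay = interval)
}

# keep_alive(myconnection)  # start pinging after connecting
```

I'd rather not have to do this, though, and I don't know whether MySQL drops the connection for reasons other than idleness.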

How can I avoid the "MySQL server has gone away" error during interactive work when I might be idle for a long while?

crazybilly
  • The "MySQL server has gone away" error can be caused by running a bad join without indexes. It looks like you are doing a left join here: `tbl_joined <- tbl1 %>% left_join(tbl2, on = 'id')`. I would check whether table1.id and table2.id have indexes. – Raymond Nijland Nov 22 '17 at 15:32
  •
    I'm really only concerned here about getting the error after long idle times--I haven't had any problems with long-running queries, or the connection dying during analysis, just in between breaks. – crazybilly Nov 22 '17 at 15:40
  •
    Any updates for this question? – Jian Aug 06 '18 at 20:49
  •
    I haven't found a good solution yet, other than periodically making a new connection with the same name. – crazybilly Aug 07 '18 at 00:04
  • A further update: I've been using RMariaDB and haven't noticed the problem as much. Is that because I'm doing something differently or because the RMariaDB connection is better? Not sure, but it works just fine with my MySQL db, and it's in active development, unlike RMySQL. – crazybilly Mar 23 '19 at 02:02
  • I'm using RMariaDB and it still happens to me. Doesn't need to be long either. @crazybilly did you find a better solution here? I'm getting this with a shiny app, though the principle should be the same – mariachi Jun 11 '19 at 14:16

0 Answers