0

I'm intending to populate multiple firebird databases from multiple goroutines concurrently and in order to do that my worker function has a map (dbConnections) that holds the connections to databases (maps the name of the database to the connection):

func worker() {
    dbConnections := map[string]*sql.DB {}
    for dbName, dbFileName := range dbFiles {
        connection, err := sql.Open("firebirdsql", ("sysdba:master@localhost:3050/" + url.PathEscape(dbsPath + dbFileName)))
        err = connection.Ping()
        if err != nil {
            fmt.Println("Ping failed: ", err.Error()
            return
        } else {
            dbConnections[dbName] = connection
            fmt.Println(fmt.Sprintf("Connected to the: %v", dbName))
            defer dbConnections[dbName].Close()
        }
    }

    // using the connections to populate databases...
    // ...
}

The problem is that when I run worker function as just 1 goroutine, everything works just fine, but as soon as I increase the number of goroutines, it seems that dbConnections from other goroutines gets messed up and sql executions complain about insertion into non existing tables!

How can I create dbConnections in such a way that every goroutine, has its own unique version of it?

Edit:

Jonathan Hall
  • 75,165
  • 16
  • 143
  • 189
wiki
  • 1,877
  • 2
  • 31
  • 47
  • 1
    This is not a map problem. It is likely that your worker goroutine has a race. Are those connections thread-safe? If not, maybe the same database is used from multiple goroutines. – Burak Serdar May 12 '20 at 16:15
  • Why not have each goroutine creating its own connection? – bereal May 12 '20 at 16:15
  • *"Are those connections thread-safe?"* @BurakSerdar `*sql.DB` is safe for concurrency, isn't it? – mkopriva May 12 '20 at 16:19
  • _"Why not have each goroutine creating its own connection?"_> I'm doing just that. Each goroutine creates its own set of connections (defined in the beginning of the function as code above shows). problem is that they somehow gets messed up with other connections from other goroutines. – wiki May 12 '20 at 16:19
  • @wiki is the map shared across workers or does each worker have their own map, with their own set of connections? – mkopriva May 12 '20 at 16:22
  • No sharing. each worker has its own map with their own set of connections. – wiki May 12 '20 at 16:23
  • *"sql executions complain about insertion into non existing tables"* @wiki are any of the workers creating the tables? or do the tables alredy exist in the database and you're just doing inserts? what other sql commands, apart from insert, are your workers executing/triggering? – mkopriva May 12 '20 at 16:24
  • @mkopriva workers creates tables but it is not a problem because I have a logic for that in each worker function that checks if the table successfully was created or not. Actually the SQL complains happen at the final part of the function when I want to insert the data into the tables. – wiki May 12 '20 at 16:28
  • @wiki each worker creates their own tables? Or some workers create tables others do inserts into those tables? If the latter then how are you synchronizing the workers to ensure that the ones that are doing the inserts are executed *after* the tables have been created? – mkopriva May 12 '20 at 16:38
  • @mkopriva each worker creates its own table (but in the same databases). – wiki May 12 '20 at 16:40
  • 1
    @wiki you should probably add `firebirdsql` tag to your question as it might be relevant, and in the question itself you should mention what driver you're using to communicate with `firebirdsql`. Nothing in the provided code stands out as "incorrect for concurrent use", the problem will be either in the code you've omitted from the question or in how the driver, or backend, is implemented. I'll retreat now as I'm not familiar with `firebirdsql` so I can't comment on it or any of its Go drivers. – mkopriva May 12 '20 at 16:46
  • Can you update your question? Your title and description seem to be two totally unrelated things. – Adrian May 12 '20 at 17:29
  • @Adrian changed the title. – wiki May 12 '20 at 17:56
  • Do you `commit` transaction after creating tables? // I think (maybe this is obsolete) that firebird client DLL connections are not thread-safe per se. The golden rule was that each OS (Windows, Linux, etc) thread has their own connection made. Alternatively a connection should be locked exclusively for every access. And that is another can of worms, because most `select` queries fo not featch all the data at once, they only fetch few first rows and keep fetching IF client app will read further. – Arioch 'The May 13 '20 at 06:39
  • it maybe will help for all your SQL workers to log their activity alone with `select connection_id from rdb$database` value. You may also read about `monitoring tables` but their access is heavy, not suggested for massive use. You may also consider using Trace - https://stackoverflow.com/questions/61271433 – Arioch 'The May 13 '20 at 06:41
  • It seems one needs here a thread-sade `connection pool` where each worker exclusively locks a connection from the pool, uses it for a while, then releases it back to pool. Would i design an applicaiton like that (not in Go, i don't know it), i would have two sets of threads - one set would have one thread per database, and would push the data into DB, another set will be producers of the data, they will use some queues to post ready data items to relevant thread of the first set. Thread jockeying is to be done within Go app. External Go to Firebird connections would be single threaded each. – Arioch 'The May 13 '20 at 06:50
  • @Arioch I think that the problem as you mentioned is because of thread-safety but the whole point of me using a database such as Firebird is to have the ability to work with it concurrently (multiple connections insert data without locking database) or else I would use SQLite for example and the problem would never arises in the first place. – wiki May 13 '20 at 09:41
  • Make sure there is no two threads using the same connection with races, make sure you work with transactions properly - and it should work. Though since your primary if not the only function is inserting data - making several connections to the same database gonna be waste of resources without speedup (server's CPU and HDD I/O would be bottlenecking your app anyway). Frankly, using stand-alone database server - and thus comparatively expensive interprocess commmunications - for mere sake of synchronizing threads within a program seems like overkill for me. – Arioch 'The May 13 '20 at 19:36
  • DDL in Firebird is transactional. If you create a table, the transaction needs to be committed before a table can be used, and - depending on the transaction isolation level - a table can only be used by transactions started after the commit of the transaction that created the table. Without a [mre] I guess this problem is not really easy to diagnose. If you have a reproducible case, it might also be better to report on https://github.com/nakagami/firebirdsql – Mark Rotteveel May 14 '20 at 15:06
  • @Arioch'The The firebirdsql Go driver does not use fbclient.dll, so thread safety of fbclient.dll is not relevant (afaik that was only a issue with Firebird Embedded, on Linux, back in version 2.1 or earlier). – Mark Rotteveel May 14 '20 at 15:08

1 Answers1

0

As many mentioned in the comment section, the problem was because of thread-safety in firebirdsql and especially when creating new tables. Defining a mutex and enclosing the sql execution method between mutex.Lock() and mutex.Unlock() (so that only one goroutine can create tables at any given time) solved the problem.

wiki
  • 1,877
  • 2
  • 31
  • 47