4

I've encountered some inconsistent behaviour with pg_locks on go routines and PostgreSQL 9.5.

When I create additional goroutines and call SELECT pg_try_advisory_lock(1); the results are inconsistent.

I can try to get the lock at some moment, fail, try again and manage to get it, without anyone explicitly releasing the lock in the meanwhile.

I've created the a small program to reproduce the issue.

Program flow

  1. Create 10 goroutines. Each one of them tries to get the same lock on initialization.
  2. Every second, each instance would try to get the lock again, if it didn't get it already.
  3. Every second I probe all instances and count how many already got the lock.

Expected behaviour:

Only 1 goroutine would have the lock at any given moment.

Actual results:

The number of goroutines that manage to aquire the lock increases over time.


package main

var dbMap *gorp.DbMap // init code omitted for brevity

func main() {
    setup(10)
    fmt.Println("after initialization,", countOwners(), "instances of lockOwners have the lock!")

    for {
        if _, err := dbMap.Exec("SELECT pg_sleep(1)"); err != nil {
            panic(err)
        }

        fmt.Println(countOwners(), "instances of lockOwners have the lock!")
    }
}

func countOwners() int {
    possessLock := 0
    for _, lo := range los {
        if lo.hasLock {
            possessLock++
        }
    }
    return possessLock
}

var los []*lockOwner

func setup(instanceCount int) {
    var wg sync.WaitGroup
    for i := 0; i < instanceCount; i++ {
        wg.Add(1)
        newInstance := lockOwner{}
        los = append(los, &newInstance)
        go newInstance.begin(time.Second, &wg, i+1)
    }
    wg.Wait()
}

type lockOwner struct {
    id      int
    ticker  *time.Ticker
    hasLock bool
}

func (lo *lockOwner) begin(interval time.Duration, wg *sync.WaitGroup, id int) {
    lo.ticker = time.NewTicker(interval)
    lo.id = id
    go func() {
        lo.tryToGetLock()
        wg.Done()
        for range lo.ticker.C {
            lo.tryToGetLock()
        }
    }()
}

func (lo *lockOwner) tryToGetLock() {

    if lo.hasLock {
        return
    }

    locked, err := dbMap.SelectStr("SELECT pg_try_advisory_lock(4);")
    if err != nil {
        panic(err)
    }

    if locked == "true" {
        fmt.Println(lo.id, "Did get lock!")
        lo.hasLock = true
    }
}

The output of this program varies, but usually something along the lines:

1 Did get lock!
after initialization, 1 instances of lockOwners have the lock!
1 instances of lockOwners have the lock!
2 Did get lock!
2 instances of lockOwners have the lock!
2 instances of lockOwners have the lock!
7 Did get lock!
3 instances of lockOwners have the lock!
3 instances of lockOwners have the lock!
6 Did get lock!
4 instances of lockOwners have the lock!

My question:

  1. What should I expect to be protected when using pg_locks this way?
  2. What is the reason some goroutine fails to acquire the lock?
  3. What is the reason the same goroutine succeeds doing so on the next attempt?

    • Could it be that the thread is the resource being locked and every time a goroutine triggers it does so from a different thread? That would explain the inconsistent behaviour.
Alechko
  • 1,406
  • 1
  • 13
  • 27
  • Well that's the part I'm not sure of.If the lock is go-routine-exclusive so to say, then I expect only one instance to get the lock (once). If it is not go-routine-exclusive, then I expect all ten instances to get it succesffully on the first attempt and then - yes, it would be locked 10 times. – Alechko Sep 04 '18 at 14:21
  • 4
    I'm not sure what `gorp.DBMap` is, but all your routines are sharing it, which means they are all sharing the same connection/pool. Postgres has no awareness of goroutines, only connections. A Postgres lock will exclude other processes connected to the same database; if you want to control exclusion among goroutines within a program, use Go synchronization methods (e.g. channels or mutexes). – Adrian Sep 04 '18 at 14:21
  • I misread--you're always trying to lock the same id (`4`). – Jonathan Hall Sep 04 '18 at 14:22
  • 1
    Ahhh I see now! So every time I call `dbMap.Exec(...)` it chooses a random connection from the connection pool. When it's the same connection that initially got the lock - I succeed in getting it again, even if it is done from a different goroutine. – Alechko Sep 04 '18 at 14:23
  • 1
    The answer to your first question, BTW, "What should I expect to be protected...?" -- Nothing. This is just an advisory lock. The semantics of what it means is 100% up to your application. See [here](https://www.postgresql.org/docs/9.4/static/explicit-locking.html) for more explanation. – Jonathan Hall Sep 04 '18 at 14:24
  • 1
    Indeed when I use separate transactions instead of the connection pool - only one transaction succeeds in getting the lock. Furthermore, the goroutine that owns the 'winning' transaction can use it to acquire the lock any number of times (before the it unlocks of course). – Alechko Sep 04 '18 at 14:52
  • @AlexGordon Sounds like this is more about understanding the semantics of the connection pool and the advisory locks functionality? Is this answered at this point? – stderr Sep 04 '18 at 21:40
  • @stderr It is indeed. I’ve reproduced with transactions instead of dbMap and it behaved as expected. – Alechko Sep 05 '18 at 07:49

1 Answers1

0

After some months of experience with PostgreSQL via gorp, I think I understand this behaviour:

  • gorp maintains a connection pool.
  • Whenever we create a transaction, one of these connections will be picked in random(?).
  • dbMap.SomeCommand() creates a transaction, executes some command and commits the transaction.
  • pg_try_advisory_lock works on the session level.
  • A session is synonymous with a TCP connection, so is by no means stable or permanent.
  • A connection from the pool keeps using the same session when possible, but will reset it when needed.

Whenever we execute dbMap.SelectStr("SELECT pg_try_advisory_lock(4);"), a connection is selected from the pool. Then a transaction is created, it acquires the lock on the session level and then it's committed.

When another go-routine attempts to do the same, there's a chance that it will use the same session, probably depending on the connection that is taken from the pool. Since the locking is done on the session level - the new transaction is able to acquire the lock again.

Alechko
  • 1,406
  • 1
  • 13
  • 27