
In answering another question, I wrote a little struct using sync.Map to cache API requests.

type PostManager struct {
    sync.Map
}

func (pc PostManager) Fetch(id int) Post {
    post, ok := pc.Load(id)
    if ok {
        fmt.Printf("Using cached post %v\n", id)
        return post.(Post)
    }
    fmt.Printf("Fetching post %v\n", id)
    post = pc.fetchPost(id)
    pc.Store(id, post)

    return post.(Post)
}

Unfortunately, if two goroutines both fetch the same uncached Post at the same time, both will make a request.

var postManager PostManager

wg.Add(3)

var firstPost Post
var secondPost Post
var secondPostAgain Post

go func() {
    // Fetches and caches 1
    firstPost = postManager.Fetch(1)
    defer wg.Done()
}()

go func() {
    // Fetches and caches 2
    secondPost = postManager.Fetch(2)
    defer wg.Done()
}()

go func() {
    // Also fetches and caches 2
    secondPostAgain = postManager.Fetch(2)
    defer wg.Done()
}()

wg.Wait()

I need to ensure that when there are simultaneous fetches of the same ID, only one of them actually makes the request; the others must wait and then use the cached Post. At the same time, fetches of different IDs must not block each other.

In the above example, I want exactly one call each to pc.fetchPost(1) and pc.fetchPost(2), and those two calls should be able to run concurrently.

Link to the full code.

Schwern
    "Package singleflight provides a duplicate function call suppression mechanism." https://godoc.org/golang.org/x/sync/singleflight – Peter Feb 12 '20 at 23:59
  • @Peter This is great, and by far the simplest solution. If you post an answer I'll accept it. However, `go run --race` is showing a data race at Do(). [Code here](https://gist.github.com/schwern/6209cfb4b72fa9f5d97102a6c7809970). – Schwern Feb 13 '20 at 02:00

3 Answers


It's possible to use a second map to wait when a fetch for that ID is already in progress.

type PostManager struct {
    sync.Map
    q sync.Map
}

func (pc *PostManager) Fetch(id int) Post {
    post, ok := pc.Load(id)
    if ok {
        fmt.Printf("Using cached post %v\n", id)
        return post.(Post)
    }
    if c, loaded := pc.q.LoadOrStore(id, make(chan struct{})); !loaded {
        // This goroutine won the race: fetch, cache, then signal any waiters.
        fmt.Printf("Fetching post %v\n", id)
        post = pc.fetchPost(id)
        pc.Store(id, post)
        close(c.(chan struct{}))
    } else {
        // Another goroutine is already fetching this ID; wait for it to finish.
        <-c.(chan struct{})
        post, _ = pc.Load(id)
    }
    return post.(Post)
}

Or, a little more complex, with the same map ;-)

func (pc *PostManager) Fetch(id int) Post {
    p, ok := pc.Load(id)

    if !ok {
        fmt.Printf("Fetching post %v\n", id)
        if p, ok = pc.LoadOrStore(id, make(chan struct{})); !ok {
            fetched := pc.fetchPost(id)
            pc.Store(id, fetched)
            close(p.(chan struct{}))
            return fetched
        }
    }

    if cached, ok := p.(Post); ok {
        fmt.Printf("Using cached post %v\n", id)
        return cached
    }

    fmt.Printf("Waiting for cached post %v\n", id)
    <-p.(chan struct{})
    return pc.Fetch(id)
}
Alexey Sudachen
  • This does not prevent multiple goroutines starting the fetch. – Burak Serdar Feb 13 '20 at 00:30
  • Why? It's fully concurrent for different topics. However, there will be only one fetchPost call for the same topic id. – Alexey Sudachen Feb 13 '20 at 00:38
  • Thanks for the answer, but neither works. Here's the full code for [two maps](https://gist.github.com/schwern/b03bff3c67f1d3b4e51b55b29c93002d) and [single map](https://gist.github.com/schwern/f50dee6ff2593f5d01ccad12aad168b9). – Schwern Feb 13 '20 at 01:06
  • My bad, it must be `func (pc *PostManager) Fetch(id int) Post`: the method must access pc by reference. – Alexey Sudachen Feb 13 '20 at 01:50

The golang.org/x/sync/singleflight package has been written for exactly this purpose.

Note that all cache accesses are supposed to happen inside the callback function passed to Do. In the code you link to in your comment you do the lookup outside; that somewhat defeats the purpose.

Also, you must use a pointer to singleflight.Group. That's the source for your data race and go vet points it out:

./foo.go:41:10: fetchPost passes lock by value: command-line-arguments.PostManager contains golang.org/x/sync/singleflight.Group contains sync.Mutex

Here's how I would write it (full example on the playground: https://play.golang.org/p/2hE721uA88S):

import (
    "strconv"
    "sync"

    "golang.org/x/sync/singleflight"
)

type PostManager struct {
    sf    *singleflight.Group
    cache *sync.Map
}

func (pc *PostManager) Fetch(id int) Post {
    x, _, _ := pc.sf.Do(strconv.Itoa(id), func() (interface{}, error) {
        post, ok := pc.cache.Load(id)
        if !ok {
            post = pc.fetchPost(id)
            pc.cache.Store(id, post)
        }

        return post, nil
    })

    return x.(Post)
}
Peter

You can do that with two maps, one keeping the cached values and the other keeping values that are currently being fetched. You'd also need to hold the lock a little longer, so there's no need for a sync.Map; a regular map protected by a mutex will do. Something like this should work (untested):

type PostManager struct {
    sync.Mutex
    cached map[int]Post
    loading map[int]chan struct{}
}

You may also need to handle the case where loading fails (see the comment in the code):

// Need to pass pointer pc
func (pc *PostManager) Fetch(id int) Post {
    pc.Lock()
    post, ok := pc.cached[id]
    if ok {
        pc.Unlock()
        return post
    }
    // See if it is being loaded
    loading, ok := pc.loading[id]
    if ok {
        // Wait for the loading to complete
        pc.Unlock()
        <-loading
        // Reload
        pc.Lock()
        post = pc.cached[id]
        // Maybe you need to handle the case where loading failed?
        pc.Unlock()
        return post
    }
    // Load it
    loading = make(chan struct{})
    pc.loading[id] = loading
    pc.Unlock()
    post = pc.fetchPost(id)
    pc.Lock()
    pc.cached[id] = post
    delete(pc.loading, id)
    pc.Unlock()
    close(loading)
    return post
}
Burak Serdar
  • Thanks. I couldn't make this work using a single mutex for all the maps, it would deadlock or race. I did make it work using two sync.Maps, but I'm getting occasional races. [Here's the code](https://gist.github.com/schwern/3e36c2d452b4ee413ba134d0931431d7). – Schwern Feb 13 '20 at 01:47
  • Yes, I see the race, and it should be fixed now. I agree that singleflight is the best solution though. – Burak Serdar Feb 13 '20 at 05:19
  • Thank you, but `go run --race` continues to report a race condition. I'm puzzled. [Code](https://gist.github.com/schwern/643de6ebf6ea7caba7247aafb2ab954e). – Schwern Feb 13 '20 at 05:40
  • fetchPost must get a pointer receiver. Without that, you're copying the mutex. – Burak Serdar Feb 13 '20 at 05:53