
I recently discovered some very strange behaviour with Go maps. The use case is to create a group of integers and have an O(1) check for IsMember(id int).

The current implementation is:

func convertToMap(v []int64) map[int64]void {
    out := make(map[int64]void, len(v))
    for _, i := range v {
        out[i] = void{}
    }
    return out
}

type Group struct {
    members map[int64]void
}

type void struct{}

func (g *Group) IsMember(input string) (ok bool) {
    memberID, _ := strconv.ParseInt(input, 10, 64)
    _, ok = g.members[memberID]
    return
}

When I benchmark the IsMember method, everything looks fine up to 6 million members. But above that, the map lookup appears to take about 1 second per operation!

The benchmark test:

func BenchmarkIsMember(b *testing.B) {
    b.ReportAllocs()
    b.ResetTimer()
    g := &Group{}
    g.members = convertToMap(benchmarkV)

    for N := 0; N < b.N && N < sizeOfGroup; N++ {
        g.IsMember(benchmarkKVString[N])
    }
}

var benchmarkV, benchmarkKVString = func(size int) ([]int64, []string) {
    v := make([]int64, size)
    s := make([]string, size)
    for i := range v {
        val := rand.Int63()
        v[i] = val
        s[i] = strconv.FormatInt(val, 10)
    }
    return v, s
}(sizeOfGroup)

Benchmark numbers:

const sizeOfGroup  = 6000000
BenchmarkIsMember-8      2000000           568 ns/op          50 B/op          0 allocs/op

const sizeOfGroup  = 6830000
BenchmarkIsMember-8            1    1051725455 ns/op    178767208 B/op        25 allocs/op

Anything above a group size of 6.8 million gives the same result.

Can someone explain why this is happening, and can anything be done to make this performant while still using maps?

Also, I don't understand why so much memory is being allocated. Even if the extra time were due to collisions and linked-list traversal, there shouldn't be any memory allocation during a lookup. Is my thought process wrong?

zaRRoc
  • Your code doesn't compile. – peterSO Nov 22 '18 at 05:29
  • `i dont understand why so much memory is being allocated`. That's a subtle statement given that you have written `make(map[int64]void, len(v))`. See also https://play.golang.org/p/JEEI4qkfYyn – mh-cbon Nov 22 '18 at 05:57
  • @mh-cbon if this was the case, then the benchmark for 6 million should have shown the same mem allocation as well. The benchmark for 6 million and 6.8 million has a wide contrast in both time to lookup and mem allocation as well – zaRRoc Nov 22 '18 at 07:44
  • @mh-cbon nevermind, the benchmark is calculating the allocation of converting slice to map as well. Understood the problem – zaRRoc Nov 22 '18 at 07:50
  • `The benchmark for 6 million and 6.8 million has a wide contrast in both time to lookup and mem allocation as well`. Some runtime behavior, not sure though. See pool.Buffer. – mh-cbon Nov 22 '18 at 10:32

1 Answer


There is no need to measure the extra allocation from converting the slice to a map, because we only want to measure the lookup operation.

I've slightly modified the benchmark:

func BenchmarkIsMember(b *testing.B) {
    fn := func(size int) ([]int64, []string) {
        v := make([]int64, size)
        s := make([]string, size)

        for i := range v {
            val := rand.Int63()
            v[i] = val
            s[i] = strconv.FormatInt(val, 10)
        }

        return v, s
    }

    for _, size := range []int{
        6000000,
        6800000,
        6830000,
        60000000,
    } {
        b.Run(fmt.Sprintf("size=%d", size), func(b *testing.B) {
            var benchmarkV, benchmarkKVString = fn(size)

        g := &Group{}
            g.members = convertToMap(benchmarkV)

            b.ReportAllocs()
            b.ResetTimer()

            for N := 0; N < b.N && N < size; N++ {
                g.IsMember(benchmarkKVString[N])
            }
        })
    }
}

And got the following results:

go test ./... -bench=. -benchtime=10s -cpu=1
goos: linux
goarch: amd64
pkg: trash
BenchmarkIsMember/size=6000000          2000000000               0.55 ns/op            0 B/op          0 allocs/op
BenchmarkIsMember/size=6800000          1000000000               1.27 ns/op            0 B/op          0 allocs/op
BenchmarkIsMember/size=6830000          1000000000               1.23 ns/op            0 B/op          0 allocs/op
BenchmarkIsMember/size=60000000         100000000                136 ns/op               0 B/op          0 allocs/op
PASS
ok      trash   167.578s

The degradation isn't as significant as in your example.

dshil