46

Here is my desired outcome:

slice1 := []string{"foo", "bar", "hello"}
slice2 := []string{"foo", "bar"}

difference(slice1, slice2)
=> ["hello"]

I am looking for the difference between the two string slices!

Jonathan Hall
samol
  • Can we assume order of strings doesn't matter? – ANisus Oct 15 '13 at 06:27
  • @ANisus I presumed that I had to compare each index with the equivalent index. Otherwise it'd involve sorting the slices, or doing a very slow comparison of every slice member to every member of the other slice. Hopefully that's all that's required! – Intermernet Oct 15 '13 at 06:42
  • @Intermernet Yes, my answer takes the approach that index position doesn't matter. Of course, mine is just a simple and "stupid" loop with O(n*m) time. For larger slices, a sort- or map-based solution may be better. – ANisus Oct 15 '13 at 06:52
  • [difflib](https://gowalker.org/github.com/aryann/difflib) – Ivan Black Sep 18 '14 at 18:18

11 Answers

83

Assuming Go map operations are ~O(1) on average, here is an ~O(n) difference function (more precisely, O(len(a)+len(b))) that works on unsorted slices.

// difference returns the elements in `a` that aren't in `b`.
func difference(a, b []string) []string {
    mb := make(map[string]struct{}, len(b))
    for _, x := range b {
        mb[x] = struct{}{}
    }
    var diff []string
    for _, x := range a {
        if _, found := mb[x]; !found {
            diff = append(diff, x)
        }
    }
    return diff
}
CAFxX
peterwilliams97
  • This is a better solution than all the answers above. – Viet Phan Aug 04 '17 at 03:30
  • it's good but `ab := []string{}` should really be `var ab []string` – WakkaDroid Mar 26 '19 at 17:55
  • This solution breaks down if the slices contain duplicated elements – CAFxX May 17 '19 at 03:46
  • I thought the question was for go slices that were sets. There is an obvious variant of this code for multisets: Make mb a map[string]int and subtract 1 from mb[x] when x is found. – peterwilliams97 May 22 '19 at 09:59
  • Assuming len(a)=n and len(b)=m, shouldn't this solution be O(n) + n*log(m)? – Cirelli94 May 04 '21 at 14:31
  • No, if I understand https://stackoverflow.com/questions/29677670/what-is-the-big-o-performance-of-maps-in-golang correctly. – peterwilliams97 May 09 '21 at 02:24
  • @WakkaDroid - Any particular reason? Your proposed change has the effect that `[]string{} != difference([]string{"a", "b", "c"}, []string{"a", "b", "c"})`, which seems counterintuitive and undesirable, so I expect there to be some serious performance implications from the `diff := []string{}` approach to compensate. – Inaimathi Nov 02 '22 at 16:07
  • This difference is not symmetric difference – Zhasulan Berdibekov Jan 27 '23 at 08:42
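The multiset variant mentioned in the comments could be sketched roughly as follows. The function name differenceMulti is made up for this illustration, and the sketch assumes "difference" means that each element of b cancels at most one matching element of a.

```go
package main

import "fmt"

// differenceMulti returns the elements of a that remain after each element
// of b has cancelled at most one matching occurrence in a.
// The name is hypothetical; it does not appear in the original answer.
func differenceMulti(a, b []string) []string {
    mb := make(map[string]int, len(b))
    for _, x := range b {
        mb[x]++
    }
    var diff []string
    for _, x := range a {
        if mb[x] > 0 {
            mb[x]-- // consume one occurrence from b
            continue
        }
        diff = append(diff, x)
    }
    return diff
}

func main() {
    a := []string{"foo", "foo", "bar", "hello"}
    b := []string{"foo", "bar"}
    fmt.Println(differenceMulti(a, b)) // [foo hello]
}
```

With the set-based version above, the second "foo" would be dropped as well; here it survives because b contains only one "foo".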
25

Depending on the size of the slices, different solutions might be best.

My answer assumes order doesn't matter.

Using simple loops, only to be used with smaller slices:

package main

import "fmt"

func difference(slice1 []string, slice2 []string) []string {
    var diff []string

    // Loop two times, first to find slice1 strings not in slice2,
    // second loop to find slice2 strings not in slice1
    for i := 0; i < 2; i++ {
        for _, s1 := range slice1 {
            found := false
            for _, s2 := range slice2 {
                if s1 == s2 {
                    found = true
                    break
                }
            }
            // String not found. We add it to return slice
            if !found {
                diff = append(diff, s1)
            }
        }
        // Swap the slices, only if it was the first loop
        if i == 0 {
            slice1, slice2 = slice2, slice1
        }
    }

    return diff
}

func main() {
    slice1 := []string{"foo", "bar", "hello"}
    slice2 := []string{"foo", "world", "bar", "foo"}

    fmt.Printf("%+v\n", difference(slice1, slice2))
}

Output:

[hello world]

Playground: http://play.golang.org/p/KHTmJcR4rg

ANisus
  • I just noticed that if you do something like `slice1 := []string{"foo"} slice2 := []string{"foo", "foo"}` you don't get any difference. I wish we had some more direction on the finer points of what constitutes a "difference"! Is it data + index, just data, or unique data? Anyway, very nice answer. – Intermernet Oct 15 '13 at 12:32
  • @Intermernet In my example, I intentionally added two "foo" in the second slice just to make it clear on how the function behaves in such cases. But yeah, both our answers are correct depending on how OP defines "difference". – ANisus Oct 15 '13 at 12:38
  • Yes I noticed, that was what triggered my thought, and I'm glad that you're illustrating the fact in the example. It's interesting how such a seemingly simple question involves so many caveats :-) Maybe we'll get some elucidation from the OP! – Intermernet Oct 15 '13 at 12:49
  • Note that this solution is 1) incredibly inefficient if slice1 and slice2 have many elements and 2) as noted in a different comment, incorrect if the slices can contain duplicated elements – CAFxX May 14 '19 at 01:18
15

I use a map to solve this problem:

package main

import "fmt"

func main() {
    slice1 := []string{"foo", "bar", "hello"}
    slice2 := []string{"foo", "bar", "world"}

    diffStr := difference(slice1, slice2)

    for _, diffVal := range diffStr {
        fmt.Println(diffVal)
    }

}

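// difference returns the strings that appear in only one of the two slices.
// It assumes slice2 contains no duplicates; a string repeated only in
// slice2 would be counted twice and therefore missed.
func difference(slice1 []string, slice2 []string) []string {
    diffStr := []string{}
    m := map[string]int{}

    for _, s1Val := range slice1 {
        m[s1Val] = 1
    }
    for _, s2Val := range slice2 {
        m[s2Val] = m[s2Val] + 1
    }

    // A count of exactly 1 means the string was seen on one side only.
    for mKey, mVal := range m {
        if mVal == 1 {
            diffStr = append(diffStr, mKey)
        }
    }

    return diffStr
}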

Output (the order may vary, because Go map iteration order is not specified):
hello
world
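
Because the result is collected by ranging over a map, its order can change between runs. If a stable order matters, one option is to sort before returning. The sketch below is only an illustration: the name sortedDifference and the bit-flag bookkeeping are assumptions of this example, not part of the answer above.

```go
package main

import (
    "fmt"
    "sort"
)

// sortedDifference returns the strings present in exactly one of the two
// slices, in sorted order. The name and the flag constants are illustrative.
func sortedDifference(slice1, slice2 []string) []string {
    const inFirst, inSecond = 1, 2
    seen := map[string]int{}
    for _, s := range slice1 {
        seen[s] |= inFirst
    }
    for _, s := range slice2 {
        seen[s] |= inSecond
    }
    diff := []string{}
    for s, flags := range seen {
        if flags != inFirst|inSecond {
            diff = append(diff, s)
        }
    }
    // Map iteration order is unspecified, so sort for a stable result.
    sort.Strings(diff)
    return diff
}

func main() {
    slice1 := []string{"foo", "bar", "hello"}
    slice2 := []string{"foo", "bar", "world"}
    fmt.Println(sortedDifference(slice1, slice2)) // [hello world]
}
```

Recording which side a string was seen on, rather than how often, also keeps the result unaffected by duplicates within either slice.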

alexis
3
func diff(a, b []string) []string {
    temp := map[string]int{}
    for _, s := range a {
        temp[s]++
    }
    for _, s := range b {
        temp[s]--
    }

    var result []string
    for s, v := range temp {
        if v != 0 {
            result = append(result, s)
        }
    }
    return result
}

If you want to handle duplicated strings, the count v in the map can do that: v > 0 means the string occurs more often in a, and v < 0 means it occurs more often in b. So you can pick a.Remove(b) (v > 0) or b.Remove(a) (v < 0).
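
One way to read the signed counts, sketched under the assumption that a surplus of k occurrences should be reported k times. The name splitDiff and its two return values are hypothetical, and the results are sorted only to make the output reproducible.

```go
package main

import (
    "fmt"
    "sort"
)

// splitDiff reports, with multiplicity, what a has beyond b (onlyA) and
// what b has beyond a (onlyB). Name and return shape are hypothetical.
func splitDiff(a, b []string) (onlyA, onlyB []string) {
    counts := map[string]int{}
    for _, s := range a {
        counts[s]++
    }
    for _, s := range b {
        counts[s]--
    }
    for s, v := range counts {
        for ; v > 0; v-- {
            onlyA = append(onlyA, s) // surplus occurrences in a
        }
        for ; v < 0; v++ {
            onlyB = append(onlyB, s) // surplus occurrences in b
        }
    }
    sort.Strings(onlyA)
    sort.Strings(onlyB)
    return onlyA, onlyB
}

func main() {
    onlyA, onlyB := splitDiff(
        []string{"foo", "foo", "bar"},
        []string{"foo", "baz"},
    )
    fmt.Println(onlyA, onlyB) // [bar foo] [baz]
}
```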

rrFeng
2
package main

import "fmt"

// unique returns the elements that occur exactly once in slice.
func unique(slice []string) []string {
    encountered := map[string]int{}
    diff := []string{}

    for _, v := range slice {
        encountered[v]++
    }

    for _, v := range slice {
        if encountered[v] == 1 {
            diff = append(diff, v)
        }
    }
    return diff
}

func main() {
    slice1 := []string{"hello", "michael", "dorner"}
    slice2 := []string{"hello", "michael"}
    slice3 := []string{}
    fmt.Println(unique(append(slice1, slice2...))) // [dorner]
    fmt.Println(unique(append(slice2, slice3...))) // [hello michael]
}
Michael Dorner
1

As mentioned by ANisus, different approaches will suit different sizes of input slices. This solution runs in linear time, O(n), but assumes that "equality" includes index position.

Thus, in the OP's examples of:

slice1 := []string{"foo", "bar","hello"}
slice2 := []string{"foo", "bar"}

The entries foo and bar are equal not just due to value, but also due to their index in the slice.

Given these conditions, you can do something like:

package main

import "fmt"

func difference(s1, s2 []string) string {
    var (
        lenMin  int
        longest []string
        out     string
    )
    // Determine the shortest length and the longest slice
    if len(s1) < len(s2) {
        lenMin = len(s1)
        longest = s2
    } else {
        lenMin = len(s2)
        longest = s1
    }
    // compare common indices
    for i := 0; i < lenMin; i++ {
        if s1[i] != s2[i] {
            out += fmt.Sprintf("=>\t%s\t%s\n", s1[i], s2[i])
        }
    }
    // add indices not in common
    for _, v := range longest[lenMin:] {
        out += fmt.Sprintf("=>\t%s\n", v)
    }
    return out
}

func main() {
    slice1 := []string{"foo", "bar", "hello"}
    slice2 := []string{"foo", "bar"}
    fmt.Print(difference(slice1, slice2))
}

Produces:

=> hello


If you change the slices to be:

func main() {
    slice1 := []string{"foo", "baz", "hello"}
    slice2 := []string{"foo", "bar"}    
    fmt.Print(difference(slice1, slice2))
}

It will produce:

=> baz bar
=> hello

Intermernet
  • But what about: `slice1 := []string{"foo", "bar", "hello"}` and `slice2 := []string{"foox","foo", "bar"}`? The program outputs `foo` and `bar`, which are in both slices. – topskip Oct 15 '13 at 08:43
  • @topskip Intermernet made the presumption that OP wanted to compare each index with the equivalent index. OP wasn't very clear in what kind of comparison he wanted considering order and index values. – ANisus Oct 15 '13 at 09:20
  • @ANisus you're right, I've missed that. I think I will leave my comment though, so that future users are aware of that issue. – topskip Oct 15 '13 at 09:35
  • @topskip , Yes, I presumed that the OP wanted to compare index by index. The solution provided by ANisus is probably what the OP was after, although I couldn't think of a way to achieve it while dealing with large slices without degrading performance. – Intermernet Oct 15 '13 at 12:06
1

Most of the other solutions here will fail to return the correct answer in case the slices contain duplicated elements.

This solution is O(n) time and O(n) space if the slices are already sorted, and O(n*log(n)) time and O(n) space if they are not, but has the nice property of actually being correct.

import (
    "sort"
    "strings"
)

func diff(a, b []string) []string {
    a = sortIfNeeded(a)
    b = sortIfNeeded(b)
    var d []string
    i, j := 0, 0
    for i < len(a) && j < len(b) {
        c := strings.Compare(a[i], b[j])
        if c == 0 {
            i++
            j++
        } else if c < 0 {
            d = append(d, a[i])
            i++
        } else {
            d = append(d, b[j])
            j++
        }
    }
    d = append(d, a[i:]...)
    d = append(d, b[j:]...)
    return d
}

func sortIfNeeded(a []string) []string {
    if sort.StringsAreSorted(a) {
        return a
    }
    s := append(a[:0:0], a...)
    sort.Strings(s)
    return s
}

If you know for sure that the slices are already sorted, you can remove the calls to sortIfNeeded. The defensive copy in sortIfNeeded exists because sort.Strings sorts in place, so without it we would be modifying the slices passed to diff.

See https://play.golang.org/p/lH-5L0aL1qr for tests showing correctness in the face of duplicated entries.
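
As a quick, self-contained illustration of how the sorted merge treats duplicates, here is a condensed sketch of the same idea. It is a simplification rather than the answer's exact code: the names mergeDiff and sortedCopy are made up, and it always copies and sorts instead of checking first.

```go
package main

import (
    "fmt"
    "sort"
)

// sortedCopy returns a sorted copy of s, leaving the caller's slice untouched.
func sortedCopy(s []string) []string {
    c := append([]string(nil), s...)
    sort.Strings(c)
    return c
}

// mergeDiff walks two sorted copies in lockstep and keeps every element
// that cannot be paired with an equal element on the other side.
func mergeDiff(a, b []string) []string {
    a, b = sortedCopy(a), sortedCopy(b)
    var d []string
    i, j := 0, 0
    for i < len(a) && j < len(b) {
        switch {
        case a[i] == b[j]: // paired off: skip both
            i++
            j++
        case a[i] < b[j]: // a[i] has no partner in b
            d = append(d, a[i])
            i++
        default: // b[j] has no partner in a
            d = append(d, b[j])
            j++
        }
    }
    d = append(d, a[i:]...)
    return append(d, b[j:]...)
}

func main() {
    fmt.Println(mergeDiff([]string{"a", "b", "a"}, []string{"c", "a"})) // [a b c]
}
```

The second "a" in the first slice has no partner in the second slice, so it is reported, which is the behavior the set-based answers cannot express.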

CAFxX
1

I have this example using generics, but it returns only the elements of the first slice that are not present in the second slice:

type HandleDiff[T comparable] func(item1 T, item2 T) bool

func HandleDiffDefault[T comparable](val1 T, val2 T) bool {
    return val1 == val2
}

func Diff[T comparable](items1 []T, items2 []T, callback HandleDiff[T]) []T {
    acc := []T{}
    for _, item1 := range items1 {
        find := false
        for _, item2 := range items2 {
            if callback(item1, item2) {
                find = true
                break
            }
        }
        if !find {
            acc = append(acc, item1)
        }
    }
    return acc
}

Usage:

diff := Diff(items1, items2, HandleDiffDefault[string])
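
When plain == equality is enough, the nested loop can be replaced by a map lookup, which should bring the cost down from O(n*m) to roughly O(n+m). The following is a minimal sketch under that assumption; DiffFast is a hypothetical name, and type parameters require Go 1.18 or later.

```go
package main

import "fmt"

// DiffFast returns the items of items1 that do not occur in items2.
// It relies on == through a map lookup, so no callback is needed.
// The name is hypothetical and not part of the answer above.
func DiffFast[T comparable](items1, items2 []T) []T {
    seen := make(map[T]struct{}, len(items2))
    for _, item := range items2 {
        seen[item] = struct{}{}
    }
    acc := []T{}
    for _, item := range items1 {
        if _, ok := seen[item]; !ok {
            acc = append(acc, item)
        }
    }
    return acc
}

func main() {
    fmt.Println(DiffFast([]int{1, 2, 3, 4}, []int{2, 4}))          // [1 3]
    fmt.Println(DiffFast([]string{"foo", "bar"}, []string{"foo"})) // [bar]
}
```

The callback version remains the more flexible choice when equality is something other than ==, for example a case-insensitive comparison.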
priolo priolus
0

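I would add a small change to the solution by @peterwilliams97, so that the order of the arguments does not matter: the longer slice is always checked against the shorter one. Note that when both slices have the same length, only the elements of a that are missing from b are returned.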

func difference(a, b []string) []string {
    // reorder the input,
    // so that we can check the longer slice over the shorter one
    longer, shorter := a, b
    if len(b) > len(a) {
        longer, shorter = b, a
    }

    mb := make(map[string]struct{}, len(shorter))
    for _, x := range shorter {
        mb[x] = struct{}{}
    }
    var diff []string
    for _, x := range longer {
        if _, found := mb[x]; !found {
            diff = append(diff, x)
        }
    }
    return diff
}
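
Picking the longer slice still reports elements from one side only. If the goal is a symmetric difference, both directions need to be checked. The sketch below assumes set semantics (an element repeated on one side is reported once per occurrence); the names toSet and symmetricDifference are illustrative.

```go
package main

import "fmt"

// toSet builds a lookup set from a slice.
func toSet(s []string) map[string]struct{} {
    m := make(map[string]struct{}, len(s))
    for _, x := range s {
        m[x] = struct{}{}
    }
    return m
}

// symmetricDifference returns the elements found in only one of a and b:
// first those only in a (in a's order), then those only in b (in b's order).
func symmetricDifference(a, b []string) []string {
    inA, inB := toSet(a), toSet(b)
    var diff []string
    for _, x := range a {
        if _, found := inB[x]; !found {
            diff = append(diff, x)
        }
    }
    for _, x := range b {
        if _, found := inA[x]; !found {
            diff = append(diff, x)
        }
    }
    return diff
}

func main() {
    a := []string{"foo", "bar", "hello"}
    b := []string{"foo", "world", "bar"}
    fmt.Println(symmetricDifference(a, b)) // [hello world]
}
```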
0

Input: s1 = ["this", "apple", "is", "sweet"], s2 = ["this", "apple", "is", "sour"]

Output: ["sweet","sour"]

func difference(s1, s2 []string) []string {
    dm := make(map[string]int)
    // Count occurrences across both slices. Ranging over them separately
    // avoids append(s1, s2...), which can write into s1's spare capacity.
    for _, s := range [][]string{s1, s2} {
        for _, v := range s {
            // A count above 1 marks the element for removal later.
            dm[v]++
        }
    }
    var retSlice []string
    for k, v := range dm {
        if v == 1 {
            retSlice = append(retSlice, k)
        }
    }
    return retSlice
}

// If we need only the elements of the first slice that are missing from the second, use the function below.

// Output: [sweet]

func diff(s1, s2 []string) []string {
    mp1 := make(map[string]bool)
    for _, v := range s2 {
        mp1[v] = true
    }
    var dif []string
    for _, v1 := range s1 {
        if _, ok := mp1[v1]; !ok {
            dif = append(dif, v1)
        }
    }
    return dif
}
-1

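The code below returns the runes that appear in the longer of two strings but not in the shorter one, regardless of argument order. Space complexity is O(n) and time complexity is O(n). It needs import "strings".

// difference returns the runes in the longer string that aren't in the shorter one.
// The order of the returned runes is unspecified because it follows map iteration order.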
func difference(a, b string) string {
    longest, shortest := longestString(&a, &b)
    var builder strings.Builder
    var mem = make(map[rune]bool)
    for _, s := range longest {
        mem[s] = true
    }
    for _, s := range shortest {
        if _, ok := mem[s]; ok {
            mem[s] = false
        }
    }
    for k, v := range mem {
        if v {
            builder.WriteRune(k)
        }
    }
    return builder.String()
}
func longestString(a *string, b *string) ([]rune, []rune) {
    if len(*a) > len(*b) {
        return []rune(*a), []rune(*b)
    }
    return []rune(*b), []rune(*a)
}
Remario
  • strings.Builder is recommended for building strings, since strings are immutable in Go. – Remario Mar 17 '19 at 16:42
  • The question was about diff of `[]string`; this solution is about diff of `string`. Also the solution is incorrect, as it also doesn't handle duplicated runes. – CAFxX May 17 '19 at 06:26