E.g. for `List(1, 1, 1, 2, 3, 3, 4)` it would be `Set(1, 3)`, because 1 and 3 are the only elements which occur multiple times.
-
possible duplicate of [Scala find duplicate elements in a list](http://stackoverflow.com/questions/24729544/scala-find-duplicate-elements-in-a-list) – Kigyo Aug 16 '14 at 19:25
-
I have just posted an answer on a related thread which should very efficiently do what you are requesting...and more: http://stackoverflow.com/a/35030746/501113 – chaotic3quilibrium Jan 27 '16 at 07:04
2 Answers
7
val s = List(1, 1, 1, 2, 3, 3, 4) // a list with non-unique elements
(s diff s.distinct).toSet // Set(1, 3); method-call syntax avoids the postfix-operator feature warning
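Why this works: `diff` is a multiset difference, so subtracting `s.distinct` removes exactly one occurrence of each value, and anything left over must have occurred at least twice. Below is a minimal sketch of the same idea wrapped in a named method. The name `duplicateElements` is borrowed from a comment on this answer, and the `Set[A]` return type is an assumption chosen to match the question.

```scala
// Sketch: duplicates via multiset difference.
// `duplicateElements` is an illustrative name, not a standard library method.
def duplicateElements[A](s: Seq[A]): Set[A] =
  // s.distinct holds one copy of every value; diff removes one occurrence
  // of each from s, so only the extra copies remain.
  s.diff(s.distinct).toSet

val s = List(1, 1, 1, 2, 3, 3, 4)
val leftovers = s.diff(s.distinct) // List(1, 1, 3)
val dups = duplicateElements(s)    // Set(1, 3)
```

Note that `leftovers` still contains `1` twice, which is why the final `toSet` is needed.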

corazza
-
If I were a co-worker of yours stumbling upon that line of code... well, you'd better run fast! It might seem like a clever trick, but it's nearly unreadable. When using combinators, always try to be clear in your intentions (see Ende's solution for instance) – Gabriele Petronella Aug 16 '14 at 18:36
-
By the way `s.toSet.toList` is just a convoluted way of writing `s.distinct`. – Gabriele Petronella Aug 16 '14 at 18:41
-
@GabrielePetronella Yes, that's why I posted this, I'm looking for better ways to do it. Didn't know about `distinct`, thank you. – corazza Aug 16 '14 at 19:32
-
Now that I'm using `(s diff s.distinct) toSet`, would you agree that it's more readable? My solution is essentially the same as [this one](http://stackoverflow.com/a/24744042/924313) - except I'd say even more readable. – corazza Aug 16 '14 at 19:34
-
Not quite. In order to understand what you meant by that code, I would need to mentally execute it, so it's not readable, at least for me. Ende's code is *much* more explicit in its intentions, and that's why I greatly prefer it. – Gabriele Petronella Aug 17 '14 at 02:10
-
I find this pretty readable, especially if it's written as `def duplicateElements[A](s: Seq[A]): Seq[A] = ...` – Daenyth Aug 18 '14 at 19:38
-
@Daenyth well of course if you write what it does in the method name, I'm not even reading through the implementation. – Gabriele Petronella Aug 18 '14 at 19:43
-
I wouldn't use either of these without putting them into a method, but between the two, this feels more natural to me; the other I'd have to mentally unwrap more than this one. – Daenyth Aug 18 '14 at 19:44
-
It might be the case that I don't have enough experience with idiomatic Scala code, but I honestly think that `s.groupBy(identity).collect { case (v, l) if l.length > 1 => v }` is much more convoluted than `(s diff s.distinct)`. I mean you'd understand the latter one without even knowing Scala (the difference between a list and its distinct elements are those elements which aren't distinct) - and the syntax is quite logical. – corazza Aug 18 '14 at 21:04
5
A bit more convoluted, but you can avoid having to call `toSet.toList`. First group the integers:
scala> s.groupBy(identity)
res13: scala.collection.immutable.Map[Int,List[Int]] =
Map(2 -> List(2), 4 -> List(4), 1 -> List(1, 1, 1), 3 -> List(3, 3))
Then collect only the ones where the list has a length greater than 1:
scala> s.groupBy(identity).collect { case (v, l) if l.length > 1 => v }
res17: scala.collection.immutable.Iterable[Int] = List(1, 3)
If you want a `Set`, just call `toSet`.
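Putting the two steps together, here is a minimal sketch of the whole thing as one method. The name `duplicatesByGrouping` and the generic signature are assumptions added for illustration, not something from the REPL session above.

```scala
// Sketch: duplicates via grouping.
// `duplicatesByGrouping` is an illustrative name, not a standard library method.
def duplicatesByGrouping[A](s: Seq[A]): Set[A] =
  s.groupBy(identity) // Map from each value to all of its occurrences
    .collect { case (v, occurrences) if occurrences.length > 1 => v }
    .toSet

val input = List(1, 1, 1, 2, 3, 3, 4)
val result = duplicatesByGrouping(input) // Set(1, 3)
```

One thing to keep in mind: `groupBy` materializes a full list of occurrences per value only to check its length, which is probably fine for small inputs but does more work than strictly necessary.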

Ende Neu
-
I was about to post the exact same solution, you beat me by a hair :) That's indeed more readable (hence probably preferable) than the other proposed solution. – Gabriele Petronella Aug 16 '14 at 18:38
-
By the way, you don't need parentheses around the `if` condition. – Gabriele Petronella Aug 16 '14 at 18:39
-
As soon as I read "duplicates" I thought about `groupBy`. I took my time because I felt Travis Brown behind my back, ready to downvote for a non-exhaustive explanation, but I made it in time before you :). The parentheses around the `if` are a habit I brought from other languages. – Ende Neu Aug 16 '14 at 18:45
-
@jco I would argue that my solution is more legible, but this is mostly subjective; there's no real benefit. – Ende Neu Aug 16 '14 at 21:11