I am trying to sort a list based on the frequency of its elements, but I get two different results depending on whether the list is sorted beforehand or not. Please see the code segment below.

Could someone explain the cause? Thank you.

from collections import Counter
l = [1,1,0,0,5,2,5,5,3,4,33,0]
# Stores the frequency of each element as {elem: freq}
c = Counter(l)

# Sorting the list based on the frequency of the elements
lst1 = sorted(l, key=lambda x: -c[x])
# lst1: [0, 0, 5, 5, 5, 0, 1, 1, 2, 3, 4, 33]

l.sort()
# Sorting the list based on the frequency of the elements
lst2 = sorted(l, key=lambda x: -c[x])
# lst2: [0, 0, 0, 5, 5, 5, 1, 1, 2, 3, 4, 33]
Paul P
Ram
  • `[0, 0, 5, 5, 5, 0, 1, 1, 2, 3, 4, 33]` is just as correct as the other answer you got. – user2357112 Feb 28 '21 at 09:51
  • @user2357112supportsMonica Thanks for your response. Could you elaborate a bit more? Don't you think `[0, 0, 5, 5, 5, 0, 1, 1, 2, 3, 4, 33]` is NOT sorted based on the frequency? – Ram Feb 28 '21 at 09:56
  • @Ram How is it not? 0 and 5 have the same frequency – khelwood Feb 28 '21 at 09:59
  • You need to think like a computer. A human would naturally put things in ascending order (as you've done in lst1), but like @user2357112supportsMonica says, both answers are equally valid (write the frequencies under your lists and you get the same answer). Python doesn't care if all the 0s and all the 5s aren't next to each other, and won't do more sorting than it needs to. – Chris Geatch Feb 28 '21 at 10:06

1 Answer

Both results are correct.

Both counts, c[0] and c[5], evaluate to 3 in this case, and that count alone is used as the sort key in both calls. The sorting algorithm therefore treats 0 and 5 as "equal" and leaves them in the order in which it encountered them in the input.

Looking at the documentation of sorted tells us that this is a feature of the sorting algorithm:

The built-in sorted() function is guaranteed to be stable. A sort is stable if it guarantees not to change the relative order of elements that compare equal
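
To make this concrete, here is a small sketch (reusing the list from the question; the names `freqs1`, `freqs2` and `ties_in_input` are only illustrative) that checks two things: both results carry the same sequence of frequencies, and the tied elements in `lst1` sit in exactly the order they had in the unsorted input.

```python
from collections import Counter

l = [1, 1, 0, 0, 5, 2, 5, 5, 3, 4, 33, 0]
c = Counter(l)

# Same key, two different input orders
lst1 = sorted(l, key=lambda x: -c[x])          # unsorted input
lst2 = sorted(sorted(l), key=lambda x: -c[x])  # pre-sorted input

# Writing the frequency under each element gives the same sequence,
# so both lists are validly "sorted by frequency"
freqs1 = [c[x] for x in lst1]
freqs2 = [c[x] for x in lst2]
print(freqs1)            # [3, 3, 3, 3, 3, 3, 2, 2, 1, 1, 1, 1]
print(freqs1 == freqs2)  # True

# Stability: elements with frequency 3 keep their input order
ties_in_input = [x for x in l if c[x] == 3]
print(ties_in_input)              # [0, 0, 5, 5, 5, 0]
print(lst1[:6] == ties_in_input)  # True
```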

If you'd like ties in frequency to be broken by the integer's value, you can have the key function return a tuple, e.g.:

lst = sorted(l, key=lambda x: (-c[x], x))
# lst: [0, 0, 0, 5, 5, 5, 1, 1, 2, 3, 4, 33]
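
As a rough check that the tuple key removes the dependence on input order, the sketch below shuffles the list repeatedly and confirms the result never changes. The helper name `by_freq_then_value` and the fixed seed are assumptions made for illustration, not part of the original code.

```python
import random
from collections import Counter

l = [1, 1, 0, 0, 5, 2, 5, 5, 3, 4, 33, 0]
c = Counter(l)

def by_freq_then_value(items):
    # Primary key: descending frequency; tie-breaker: ascending value
    return sorted(items, key=lambda x: (-c[x], x))

expected = by_freq_then_value(l)

# Seeded generator so the run is reproducible
rng = random.Random(0)
for _ in range(100):
    shuffled = l[:]
    rng.shuffle(shuffled)
    # Elements with equal keys are now equal values,
    # so the input order can no longer affect the output
    assert by_freq_then_value(shuffled) == expected

print(expected)  # [0, 0, 0, 5, 5, 5, 1, 1, 2, 3, 4, 33]
```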
Paul P