
The code below gives me nearly the output I want, but not quite.

def reducer(self, year, words):
        x = Counter(words)
        most_common = x.most_common(3) 
        sorted(x, key=x.get, reverse=True)    
        yield (year, most_common)

This gives me the output

"2020" [["coronavirus",4],["economy",2],["china",2]]

What I would like it to give me is

"2020" "coronavirus china economy"

If someone could explain why I am getting a list of lists instead of the output I require, I would be most grateful. An idea of how to improve the code to get what I need would also help.

Reti43
CKZ
  • Could you also show us the function call of `reducer()` and which arguments you pass to the function? – intedgar Nov 17 '21 at 00:08
  • You're sorting a copy of `x`, which in no way affects `most_common`, which is what you return from your function. Am I correct to assume that you want to sort the words from highest to lowest frequency and, if any words are tied on frequency, to sort those in alphabetical order? – Reti43 Nov 17 '21 at 00:11
  • @Reti43 Yes, that is correct. Sort the words from highest to lowest frequency (the top 3 words) and then in alphabetical order. – CKZ Nov 17 '21 at 00:21
  • @Reti43 I haven't used the Counter function before, so I am very unsure how it works. What should I be sorting on? Thanks. – CKZ Nov 17 '21 at 00:26

1 Answer


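The documentation for `Counter.most_common` explains why you get a list of lists: it returns a list of (element, count) pairs. In Python these are tuples; they show up as nested lists once your output is serialized (presumably as JSON, judging by the format you posted).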

most_common(n=None) method of collections.Counter instance
    List the n most common elements and their counts from the most
    common to the least.  If n is None, then list all element counts.
    
    >>> Counter('abracadabra').most_common(3)
    [('a', 5), ('b', 2), ('r', 2)]
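
To see where the nested lists come from, here is a small sketch. It assumes your framework writes its output as JSON, which is a guess based on the output format in the question; `json.dumps` stands in for whatever serializer is actually used.

```python
import json
from collections import Counter

# most_common returns a list of (element, count) tuples
pairs = Counter('abracadabra').most_common(3)
print(pairs)              # [('a', 5), ('b', 2), ('r', 2)]

# JSON has no tuple type, so each tuple is written out as a list
print(json.dumps(pairs))  # [["a", 5], ["b", 2], ["r", 2]]
```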

You want frequency sorted in descending order but ties sorted alphabetically, which is ascending order. To do both in one pass, use a tuple as the sort key in which the frequency is negated, `(-frequency, word)`, and sort everything in ascending order.

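from collections import Counter

# Sample input: a plain list of words, as the reducer would receive them
words = ['coronavirus'] * 4 + ['economy'] * 2 + ['china'] * 2 + ['whatever']
x = Counter(words)
most_common = x.most_common(3)
# Sort by descending frequency, then alphabetically.
# After sorting you need to discard the frequency from each (word, freq) tuple
result = ' '.join(word for word, _ in sorted(most_common, key=lambda item: (-item[1], item[0])))
# result is now 'coronavirus china economy'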
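
Plugged back into the reducer from the question, this could look like the sketch below. It assumes `words` is an iterable of strings; the helper name `top_words` is made up for illustration. One difference from the snippet above: it sorts all the counts before taking the top three, because `most_common(3)` breaks ties by first-seen order, so a tie at the cutoff would otherwise not be resolved alphabetically.

```python
from collections import Counter

def top_words(words, n=3):
    """Return the n most frequent words as one space-separated string.

    Ties on frequency are broken alphabetically.
    """
    counts = Counter(words)
    # Negating the count sorts frequency descending while words stay ascending
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ' '.join(word for word, _ in ranked[:n])

def reducer(self, year, words):
    # Same signature as in the question; yields a string instead of a list of pairs
    yield (year, top_words(words))
```

With the sample data from the question, this yields `"coronavirus china economy"` for the year.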
Reti43