Pretty straightforward idea: I just want to find the most frequent element, but I am lost on exactly what is happening in this small code snippet. I like that this code doesn't import anything and simply uses the built-ins, but I don't understand exactly how it works. Here it is:

def most_frequent(List):
    return max(set(List), key = List.count)

w3schools defines the python max() as doing the following:

The max() function returns the item with the highest value, or the item with the highest value in an iterable. If the values are strings, an alphabetically comparison is done.

Makes sense so far: if we have a list such as a_list = [3,2,5,1], then max(a_list) gives us 5. Okay, simple enough.

But in our function, why is set being used here? I understand set is a built-in data structure that ensures all the data inside it is unique. And what is key? Why are there two arguments inside max() like this? I've never seen anything like this before. Running something like max(4,2,6,3,2) makes sense, but wrapping the list in set, assigning something called key, and calling count on the list? What in the world is going on here? Can someone please break this down like I am five and explain how we are able to use max() like this when the definition is just to find the highest value? What are set and key doing here? Is it like a hashtable? I'm quite lost here and would truly appreciate the help.

Thank you

  • `set` is used because it eliminates duplicates. The `key` parameter to `max` lets you supply a filter function; it will return the element where that function has its largest value, not just where the element has its largest value. It's a bit tricky. – Tim Roberts Oct 27 '21 at 17:47
  • Note, this algorithm is bad, potentially very inefficient, don't use it. – juanpa.arrivillaga Oct 27 '21 at 17:50
  • "w3schools defines the python max() as doing the following" *No*. Don't go to w3schools. Just **go to the actual documentation**. Have you looked at it? What did it say about the `key` argument to `max`? – juanpa.arrivillaga Oct 27 '21 at 17:51
  • @juanpa.arrivillaga Thanks! What's a better approach to find the highest frequency? and where is the documentation for max() – Preston_Jarvis Oct 27 '21 at 18:00
  • @Preston_Jarvis [here's the relevant docs](https://docs.python.org/3/library/functions.html#max). Better would be to create a dictionary of counts; you could just use a `collections.Counter` object, something like `counts = collections.Counter(data)`, then you just want `max(counts.items(), key=lambda x: x[1])`. That is going to be linear time – juanpa.arrivillaga Oct 27 '21 at 18:02
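
The `Counter` suggestion from the comments can be sketched roughly as follows. The helper name `most_frequent_counter` and the sample list are made up for illustration. The motivation is that `list.count` rescans the whole list for every distinct value, so the original one-liner can approach quadratic time when most values are distinct, while counting everything once takes a single pass.

```python
from collections import Counter


def most_frequent_counter(data):
    """Hypothetical helper: count every element once, then pick the largest count."""
    counts = Counter(data)  # maps each value to its number of occurrences
    # counts.items() yields (value, count) pairs; compare the pairs by count.
    value, _count = max(counts.items(), key=lambda pair: pair[1])
    return value


print(most_frequent_counter([4, 2, 6, 3, 2]))  # 2
```

`Counter(data).most_common(1)[0][0]` should give the same value for a non-empty list, if a method call reads better than `max` with a `key`.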

1 Answer

The max documentation states that key is an ordering function: it is called on each item of the iterable, and the values it returns are used for the comparison instead of the items themselves. max still returns the original item, not the key value.

Without key: natural order

max([4, 3, 5, 6])
     4, 3, 5, 6    << values used for finding the max
              ^

With key: ordered by the values given by the key

max([ 4,  3,  5,  6], key=lambda x: -x) 
     -4, -3, -5, -6    << values used for finding the max
          ^

In your case, the key is list.count, which tells max that 2 occurs twice, and that is more than the other values, which occur once each

l = [4, 2, 6, 3, 2]

max([4, 2, 6, 3], key=lambda x: l.count(x))
     1  2  1  1      << counts in l, used for finding the max
        ^
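
To make the last case concrete, here is a rough hand-written version of what `max(set(l), key=l.count)` does for a non-empty list. The helper name `most_frequent_verbose` is invented for this sketch, and it only mimics the behavior; it is not how CPython implements `max`.

```python
def most_frequent_verbose(items):
    """Hypothetical helper: spells out max(set(items), key=items.count) by hand."""
    best = None
    best_count = -1
    for candidate in set(items):        # each distinct value is visited once
        count = items.count(candidate)  # this is the "key" value for the candidate
        if count > best_count:          # keep the candidate with the largest key
            best, best_count = candidate, count
    return best                         # the item itself is returned, not its count


l = [4, 2, 6, 3, 2]
print(most_frequent_verbose(l))         # 2
print(max(set(l), key=l.count))         # 2
```
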
azro
  • Thanks, still trying to wrap my head around it. W3 is a bad resource, didn't cover that detail at all. Going with the python docs all the way. So if I understand correctly, max() can take either a single iterable such as a list, or an iterable and a key for ordering. Not really understanding the ordering part, ordering how? And why use set()? Doesn't that remove duplicates that need to be counted? – Preston_Jarvis Oct 27 '21 at 18:25
  • @Preston_Jarvis ordering according to the value given by the `key` function. The `set` just lets max search among fewer values; the initial list is still used for counting, regardless of the set – azro Oct 27 '21 at 18:43
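
A quick experiment may help with the `set` question. The sketch below (the names `calls` and `counting_key` are made up for illustration) records how often the key function is called with and without `set`. The counts always come from the full list `l`, so removing duplicates from the candidates does not lose any information.

```python
l = [4, 2, 6, 3, 2]

calls = []  # records every value the key function is asked about


def counting_key(x):
    calls.append(x)
    return l.count(x)  # the count is always taken from the full list l


with_set = max(set(l), key=counting_key)
calls_with_set = len(calls)

calls.clear()
without_set = max(l, key=counting_key)
calls_without_set = len(calls)

print(with_set, without_set)              # 2 2
print(calls_with_set, calls_without_set)  # 4 5
```

One caveat: when several values are tied for the highest count, `max` returns the first one it encounters, and a set is not iterated in list order, so the two variants may pick different winners in that case.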