
I am working on a Python algorithm to find the most frequent element in a list.

def GetFrequency(a, element):
  # Count how many items of a are equal to element
  return sum(1 for x in a if x == element)

def GetMajorityElement(a):
  n = len(a)
  if n == 1:
    return a[0]
  k = n // 2

  elemlsub = GetMajorityElement(a[:k])
  elemrsub = GetMajorityElement(a[k:])
  if elemlsub == elemrsub:
    return elemlsub

  lcount = GetFrequency(a, elemlsub)
  rcount = GetFrequency(a, elemrsub)

  if lcount > k:
    return elemlsub
  elif rcount > k:
    return elemrsub
  else:
    return None

I tried some test cases. Some of them pass, but some of them fail.

For example, [1,2,1,3,4] should return 1, but I get None.

The implementation follows the pseudocode here: http://users.eecs.northwestern.edu/~dda902/336/hw4-sol.pdf The pseudocode finds the majority element, which must make up more than half of the list. I only want to find the most frequent element.

Can I get some help? Thanks!

chen
  • "but I get None" look for a branch of `GetMajorityElement` that returns `None` – DeepSpace Sep 04 '19 at 13:25
  • Why `k = n // 2`? – jboockmann Sep 04 '19 at 13:26
  • @jboockmann k needs to be an integer. – Valentin B. Sep 04 '19 at 13:36
  • 1
    Are you looking to return a result only if the count is greater than half or just the element with the highest frequency? – Asa Stallard Sep 04 '19 at 13:37
  • @AsaStallard Just the element with the highest frequency. Thanks – chen Sep 04 '19 at 13:39
  • 3
    @chen This algorithm is for finding the majority (if there is one) in a list of items, where the majority means appearing more than 50% of the time. So it's not for finding the most frequent item. In your example, there are five items [1,2,1,3,4], but no item appears more than 50% of the time, which would be three times in this case, so there is no majority. – DarrylG Sep 04 '19 at 13:44
  • http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.mode.html#scipy.stats.mode – Mark Setchell Sep 04 '19 at 13:45
  • According to the source of your algorithm, 'An array is said to have a majority element if more than half of its entries are the same.' 1 is not the solution here, and your output of `None` is the right one. There doesn't seem to be a bug in your code. – Thierry Lathuille Sep 04 '19 at 13:46
  • Your code returns none if there isn't an element with a greater than 50% share. @ValentinB. has a good answer if you want the highest frequency. – Asa Stallard Sep 04 '19 at 13:48
  • Thanks! I just realized that. But I actually do not want that constraint. I only want to find the most frequent element. @ThierryLathuille – chen Sep 04 '19 at 13:51
  • 3
    Possible duplicate of [Divide and Conquer strategy to determine if more than 1/3 same element in list](https://stackoverflow.com/questions/57787006/divide-and-conquer-strategy-to-determine-if-more-than-1-3-same-element-in-list) –  Sep 04 '19 at 13:52
  • 2
    @chen This is not what your question states, and not what the algorithm you implemented is supposed to do. And no, this can't be solved with a divide and conquer method, because you will always need the exact count of each value to be able to decide, in the end, which one is the most frequent. – Thierry Lathuille Sep 04 '19 at 13:54
  • Do you want to implement it yourself or could you just use `collections.Counter`? – Matthias Sep 04 '19 at 13:55
  • @ThierryLathuille Thanks. My purpose in finding the most frequent element is to see if the maximum occurrence is larger than 1/3. – chen Sep 04 '19 at 13:59
  • 1
    @chen Your last edit completely changes the question - and is still unclear: what you mean by 'majority element' is the 'most common element', while 'majority element''s meaning is 'the one which appears more than n/2 times, if it exists'. Please don't do that, as it makes all comments, answers and the efforts of their authors meaningless. – Thierry Lathuille Sep 04 '19 at 14:00
  • 1
    @chen That is again a completely different goal. Please stop changing it all the time ! – Thierry Lathuille Sep 04 '19 at 14:01

2 Answers

0
def majority_element(a):
    return max([(a.count(elem), elem) for elem in set(a)])[1]

EDIT

If there is a tie, the largest value is returned. E.g. a = [1,1,2,2] returns 2. That might not be what you want, but it can be changed.
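
If the "largest value wins" tie rule is not what you want, one possible alternative is sketched below. It counts every element once with `collections.Counter` and, on a tie, returns the element that appears first in the list. The name `most_frequent` and the first-occurrence tie rule are assumptions for illustration, not something the question asked for.

```python
from collections import Counter


def most_frequent(a):
    """Return the most frequent element of a, or None for an empty list."""
    if not a:
        return None
    counts = Counter(a)  # single counting pass; elements must be hashable
    best = max(counts.values())
    # Among tied elements, return the one that appears first in the list.
    for elem in a:
        if counts[elem] == best:
            return elem


print(most_frequent([1, 2, 1, 3, 4]))
print(most_frequent([1, 1, 2, 2]))
```

Compared with calling `a.count(elem)` for every distinct element, this avoids rescanning the list once per value.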

EDIT 2

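The pseudocode you gave splits the array into items 1 to k inclusive and items k + 1 to n, using 1-based indexing. Your code uses a[:k] and a[k:], i.e. 0-based indices 0 to k - 1 and k to the end, which are the same elements, so this probably does not change much. A literal transcription of the pseudocode's index values would be the following, but note that for a two-element list (k = 1) a[:k+1] is the whole list, so the recursion would never terminate; your original slices are the safer choice: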

elemlsub = GetMajorityElement(a[:k+1])  # this slice is indices 0 to k
elemrsub = GetMajorityElement(a[k+1:])  # this one is k + 1 to n.

Also, still according to the pseudocode you provided, lcount and rcount should be compared to k + 1, not k:

if lcount > k + 1:
  return elemlsub
elif rcount > k + 1:
  return elemrsub
else:
  return None

EDIT 3

Some people in the comments highlighted that the provided pseudocode does not solve for the most frequent item, but for the item that makes up more than 50% of the occurrences. So your output for your example is indeed correct. There is a good chance that your code already works as is.
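
To make the difference concrete, here is a minimal sketch of a strict majority check (more than half of the entries), written with `collections.Counter` rather than divide and conquer. The function name `majority_or_none` is hypothetical. On the example from the question it returns None, which matches what the recursive code does.

```python
from collections import Counter


def majority_or_none(a):
    """Return the element that fills more than half of a, else None."""
    if not a:
        return None
    # most_common(1) gives a one-item list holding the top (element, count) pair
    elem, count = Counter(a).most_common(1)[0]
    # Strict majority: the count must exceed half of the list length
    return elem if 2 * count > len(a) else None


print(majority_or_none([1, 2, 1, 3, 4]))  # 1 occurs twice out of five, no majority
print(majority_or_none([1, 1, 1, 2, 3]))  # 1 occurs three times out of five
```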

EDIT 4

If you want to return None when there is a tie, I suggest this:

def majority_element(a):
    n = len(a)
    if n == 1:
        return a[0]

    if n == 0:
        return None

    sorted_counts = sorted([(a.count(elem), elem) for elem in set(a)], key=lambda x: x[0])

    if len(sorted_counts) > 1 and sorted_counts[-1][0] == sorted_counts[-2][0]:
        return None

    return sorted_counts[-1][1]
Valentin B.
  • Thanks. But, I want a divide and conquer solution. Is it possible to fix the bug in my algorithm? – chen Sep 04 '19 at 13:34
  • Thanks. The code is correct. But I only want the algorithm to find the most frequent item; it doesn't need to make up at least 50% of the occurrences. @Valentin B. – chen Sep 04 '19 at 13:57
  • @chen Then you cannot achieve this with the provided method, I suggest you use either of my methods, they might not be super optimized but they should work fine. – Valentin B. Sep 04 '19 at 14:01
0

I wrote an iterative version instead of the recursive one you're using in case you wanted something similar.

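def GetFrequency(array):
    # An element is a majority once it occurs more than half the time
    majority = len(array) // 2
    result_dict = {}
    # Walk from the end, as array.pop() would, without emptying the caller's list
    for array_item in reversed(array):
        result_dict[array_item] = result_dict.get(array_item, 0) + 1
        if result_dict[array_item] > majority:
            return array_item
    # No majority found: fall back to the most frequent element
    return max(result_dict, key=result_dict.get)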

This will iterate through the array and return the value as soon as one hits more than 50% of the total (being a majority). Otherwise it goes through the entire array and returns the value with the greatest frequency.
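
The same early-exit idea can be adapted to the goal mentioned in the comments on the question, namely checking whether some element occurs in more than 1/3 of the list. The sketch below is only an illustration under that assumption: the name `first_above_fraction` and its `fraction` parameter are hypothetical, and it returns None instead of falling back to the most frequent element.

```python
from fractions import Fraction


def first_above_fraction(items, fraction=Fraction(1, 3)):
    """Return the first element whose count exceeds fraction * len(items), else None."""
    threshold = fraction * len(items)  # exact arithmetic, no float rounding
    counts = {}
    for item in items:
        counts[item] = counts.get(item, 0) + 1
        if counts[item] > threshold:
            return item  # early exit as soon as the threshold is passed
    return None


print(first_above_fraction([1, 2, 1, 3, 4]))                  # 2 of 5 is more than 1/3
print(first_above_fraction([1, 2, 1, 3, 4], Fraction(1, 2)))  # no element exceeds 1/2
```

Unlike the majority case, more than one element can exceed 1/3, so this returns whichever one crosses the threshold first while scanning from the start.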

Asa Stallard