9

So I'm trying to make a basic program to learn the basics of C++, I'm generating 100 random numbers from 0 to 100 and storing them in a vector, I am then displaying the sum, mean, median, mode, high and low of the vector. I have everything else done except the mode which is where I get stuck. Here is the code I have so far.

int modeFunction()
     {
         numMode = 0;
         count = 0;
         for (int n = 0; n < 100; n++)
         {
             for (int y = 0; y < 100; y++)
             {
                 if (numVector.at(y) == numVector.at(n))
                {
                    numMode = numVector.at(y);
                    count++;
                }
             }

         }
         return numMode;
     }

After that I get stuck because in my mind that should work but it doesn't. It just out puts the last number, usually 100. Any help would be much appreciated.

Cistoran
  • 1,587
  • 15
  • 36
  • 54
  • 1
    if `myVector` is a `std::vector` (seems like it atleast), you can index it like an array: `myVector[y]` and `myVector[n]` will yield the same as the `myVector.at` version, but looks nicer imho. :) – Xeo Mar 25 '11 at 22:35
  • 2
    @Xeo: the difference being that `at` has defined behaviour when the index is out of range. Arguably `operator[]` is a micro-optimization, although as you say it's also a bit of a style difference. – Steve Jessop Mar 25 '11 at 22:50
  • @Steve: Ah, thanks for that tidbit. :) Didn't bother with `at` yet, but a normal array also has undefined behaviour for out of range access, though it certainly is nice to have defined behavious when you need it. :) – Xeo Mar 25 '11 at 22:57
  • @Xeo: to be honest, I never use `at` myself. I occasionally wonder whether I should, but in practice I never write code that I want to throw an exception when the index is out of bounds, so it only serves as a debugging assist, it "should" never happen. Despite calling it a micro-optimization, it's a reasonable amount of redundant code, so in the end if I want bounds-checking I just switch to Python ;-) – Steve Jessop Mar 25 '11 at 23:02

7 Answers7

11

since all the values are between 0 and 100, you can find the mode efficiently with a histogram:

std::vector<int> histogram(101,0);
for( int i=0; i<100; ++i )
  ++histogram[ numVector[i] ];
return std::max_element( histogram.begin(), histogram.end() ) - histogram.begin();
  • Would this not work for values outside of [0,100] (this should work for any symmetric histogram, no?) – 10GeV Sep 25 '21 at 18:57
4

Since mode is the number that occurs most frequent you shouldn't change numMode unless the new number's count is greater than numMode's count.

EDIT: To clarify, you need to keep a separate count for the current element and the current number that you think is the mode. Ideally, setting newMode to the first element is a good approach.

In addition, mode isn't necessary unique (i.e. "1 1 2 2"). You may want to keep that in mind if you care about that.

newMode = element[0]
modeCount = # of occurrence of newMode

for ( i-th element from [1 to end] ) {
   tmpCount = # of occurrence of element[i]
   if tmpCount > modeCount {
     newMode = element[i]
     modeCount = tmpCount
   }
}
nevets1219
  • 7,692
  • 4
  • 32
  • 47
  • I am who downvoted. I did it because this answer is incomplete as it assumes the array of number of occurrences of the values is known, however, this array is the most important here. – LRDPRDX Jan 15 '21 at 01:48
3

bmcnett's approach works great if number of elements are small enough. If you have large number of elements but the all element value are with in a small range using map/hashmap works well. Something like

typedef std::pair<int, int> mode_pair;

struct mode_predicate
{
  bool operator()(mode_pair const& lhs, mode_pair const& rhs)
  {
    return lhs.second < rhs.second;
  }
};

int modeFunction()
{
  std::map<int, int> mode_map;
  for (int n = 0; n < 100; n++)
    mode_map[numVector[n]]++;
  mode_predicate mp;
  return std::max_element(mode_map.begin(), mode_map.end(), mp)->first;
}
jbp
  • 1,581
  • 1
  • 10
  • 15
1

Your algorithm is wrong - it outputs the last number in the array because that's all it can ever do. Every time the number at index y matches the number at index n you overwrite the results for the previous n. Since you're using the same loop conditions, y and n are always the same at at least one point in the nested loop for each possible n value - and you'll always end up with numMode being numVector.at(99).

You need to change your algorithm to save the count for each n index along the way (or at least which n index ended up with the greatest count), so that you can know at the end of the n loop which entry occured the most times.

Carl Norum
  • 219,201
  • 40
  • 422
  • 469
1

Alternative solutions. Note: untested.

int mode1(const std::vector<int>& values)
{   
    int old_mode = 0;
    int old_count = 0;
    for(size_t n=0; n < values.size(); ++n) 
    {
        int mode = values[n];
        int count = std::count(values.begin()+n+1, values.end(), mode);

        if(count > old_count) 
        {
            old_mode = mode;
            old_count = count;
        }
    }
    return old_mode;
}

int mode2(const std::vector<int>& values)
{   
    return std::max_element(values.begin(), values.end(), [](int value)
    {
        return std::count(values.begin(), values.end(), value);
    });
}
ronag
  • 49,529
  • 25
  • 126
  • 221
  • I think in `mode1()` you should compare `count + 1` with `old_count` because you start counting from the position one after the position of the counted value: i.e. you must add the initial one to `count`. – LRDPRDX Jan 15 '21 at 02:49
  • Moreover, I think it's not a good practice to assign the initial value for the value with the searched property as `0`. It should be better to use the first element for this purpose. – LRDPRDX Jan 15 '21 at 02:55
0

Mode means a number with highest frequency. The logic should be -

//Start of function

int mode = 0, globalCount = 0 ;  
// Start of outer for loop
for i = 0 to length - 1    
   int localCount = 0   

  // Start of inner for loop  
  for j = 0 to length - 1      
     if vec[i] == vec[j]    
     ++localCount    
 // End of Inner for loop 

 if ( localCount > globalCount )  
     globalCount = localCount  
     mode = vec[i]  
// End of outer for loop  

if globalCount > 1 // This should be checked whether vec has repetitions at all
   return mode
else
   return 0

// End of function
Mahesh
  • 34,573
  • 20
  • 89
  • 115
  • @Cistoran - The logic can be even better improving efficiency but this is what the algorithm should be according to your thought process. – Mahesh Mar 25 '11 at 22:57
0
    int number = array_list[0];
    int mode = number;
    int count = 1;
    int countMode = 1;

    for (int i=1; i<size_of_list; i++)
    {
          if (array_list[i] == number) 
          { // count occurrences of the current number
             count++;
             if (count > countMode) 
                {
                      countMode = count; // mode is the biggest ocurrences
                      mode = number;
                }
          }
          else
          { // now this is a different number
                if (count > countMode) 
                {
                      countMode = count; // mode is the biggest ocurrences
                      mode = number;
                }
               count = 1; // reset count for the new number
               number = array_list[i];
      }
    }