1

I am trying to simulate a probability problem, in which there are n clients and n servers. Each client randomly sends a request to any server, so each server can receive any number of requests, I have to calculate the expected number of maximum requests that any server can receive.

I am trying to simulate this by running 10,000 iterations, where in each iteration each client chooses a random server and a request is sent to it, servers are represented as an integer array of size N.

Client chooses a random number and then the value at that index in server array is incremented. As, for better results the question says N should be about 106.

So to make it a little faster , I used multithreading in which each thread runs 100 iterations and in total there are 10 threads.

But the multithreaded code produces very different results as that from normal code. Below are the code snippets with output for both of them

Normal Version

 #include <iostream>
 #include <random>
 #include <chrono>

 #define N 1000000
 #define iterations 1000

int servers[N];

// This array's i'th index will contain count of in how many
// iterations was i the maximum number of requests received by any  server
int distr[N+1]={0};

int main(int argc, char const *argv[])
{   
   // Initialising
   auto start = std::chrono::high_resolution_clock::now();

   std::srand(time(NULL));

   // Performing iterations
   for(int itr=1; itr<=iterations; itr++)
   {
       for(int i=0;i<N;i++)
       {
           servers[i]=0;
       }

       for(int i=1;i<=N;i++)
       {
           int index = std::rand()%N;
           servers[index]++;
       }

       int maxRes = -1;
       for(int i=0;i<N;i++)
       {
           maxRes = std::max(maxRes, servers[i]);
       }
       distr[maxRes]+=1;
   }

   for(int i=0;i<=15;i++)
   {
      std::cout<<(double)distr[i]<<std::endl;
   }

   auto stop = std::chrono::high_resolution_clock::now();
   auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(stop - start);
   std::cout<<duration.count()<<" milliseconds"<<std::endl;

   return 0;
}

Output

0
0
0
0
0
0
0
359
552
79
10
0
0
0
0
0
1730 milliseconds

Multithreaded Version

#include <iostream>
#include <random>
#include <chrono>
#include <thread>
#include <fstream>

#define N 100000
#define iterations 1000
#define threads 10

// This array's i'th index will contain count of in how many
// iterations was i the maximum number of requests received by any server
std::atomic<int> distr[N] = {};

void execute(int number)
{
    // Performing iterations
    int servers[N]={0};
    for(int itr=1; itr<=number; itr++)
    {

        for(int i=1;i<=N;i++)
        {
            int index = std::rand()%N;
            servers[index]++;
        }

        int maxRes = -1;
        for(int i=0;i<N;i++)
        {
            maxRes = std::max(maxRes, servers[i]);
            servers[i]=0;
        }

        distr[maxRes] += 1;
    }
}

int main(int argc, char const *argv[])
{   
    // Initialising
    auto start = std::chrono::high_resolution_clock::now();

    std::srand(time(NULL));

    std::thread t[threads];
    for(int i=0;i<threads;i++)
    {
        t[i] = std::thread(execute, iterations/threads);
    }   

    for(int i=0;i<threads;i++)
    {
        t[i].join();
    }

    for(int i=0;i<=15;i++)
    {
        double temp = (double)distr[i];
        std::cout<<i<<"\t"<<distr[i]<<std::endl;
    }

    auto stop = std::chrono::high_resolution_clock::now();

    auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(stop - start);
    std::cout<<duration.count()<<" milliseconds"<<std::endl;

    return 0;
}

Output

0   0
1   0
2   0
3   0
4   0
5   0
6   0
7   7
8   201
9   421
10  267
11  68
12  2
13  2
14  4
15  0

1385 milliseconds

Whereas I have run the normal code many times and each times the count for maximum = 9 > 500, and there is not so much scattering of data, I mean only maximum = 8,9,10,11 have significant values rest all are zeroes.

Can anyone please explain what am I doing wrong ?

Thanks in advance!

einpoklum
  • 118,144
  • 57
  • 340
  • 684

1 Answers1

1

I don't see "very different results", they're just somewhat different, so it seems it's something a bit subtle. I've noticed you're not seeding each thread separately - that might have something to do with it.

PS: You shouldn't use rand() % N if you want a uniform distribution. Why? See this explanation by Stephen Lavaveij. As commenters suggest, the skew may be small when N is small, but still.

einpoklum
  • 118,144
  • 57
  • 340
  • 684
  • *never use rand() % N if you want a uniform distribution!* That's more of a comment than an answer. If `N` is much less than `RAND_MAX`, any skew resulting from `RAND_MAX % N` not being zero will be negligible for anything but uses such as actual cryptography. Someone capable of successfully writing that type of code would already know not to do that. *You're not seeding each thread separately* Why? Seeding `std::rand()` more than once would likely be wrong. – Andrew Henle Aug 27 '18 at 11:27
  • @hellow: Fixed. – einpoklum Aug 27 '18 at 11:43
  • @AndrewHenle: You're right, but while `N` may be slow in OP's toy example, it may change, and it may not be low for other people reading this question. – einpoklum Aug 27 '18 at 11:43