13

Is std::random_shuffle threadsafe? I presume not, since the regular rand() is not threadsafe. If that is the case, how would I use rand_r with random_shuffle so that I can give each thread a unique seed? I've seen examples of using custom random generators with random_shuffle, but it is still unclear to me how to apply them here.

Thanks.

Mark
  • Generally you must make the assumption that *nothing* in the C++ library is thread-safe unless documentation states otherwise. – Mark Ransom Jul 05 '11 at 19:18
  • Also, 'threadsafe' is a very overloaded term. Some algorithms are safe only if they operate on safe data. Some are safe across threads as long as there's only 1 writer, and most can't guarantee that. Generally, when deciding what is safe (ie. correct), it requires that you specify the various read/write requirements. – Kylotan Jul 05 '11 at 19:34
  • Just to clarify, I want to do shuffles on different lists in parallel. So I'm not concerned about races in the data structure, just the generation of the random numbers for the shuffle. – Mark Jul 05 '11 at 19:37
  • @Kylotan: but specifically, in Posix `rand` is not required to be re-entrant, and hence is not required to be the thing that *Posix* defines "thread-safe" to mean. It's unambiguous there, and while the questioner probably should have mentioned if he's on a Posix system, `rand_r` is a Posix function, so... – Steve Jessop Jul 06 '11 at 00:52

3 Answers

4

To use rand_r with std::random_shuffle, you'll need to write a (fairly trivial) wrapper. The random number generator you pass to random_shuffle needs to accept a parameter that specifies the range of numbers to be produced, which rand_r does not.

Your wrapper would look something like this:

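#include <stdlib.h>   // rand_r (POSIX), RAND_MAX

class rand_x { 
    unsigned int seed;   // per-object state, so each thread can own its own generator
public:
    rand_x(unsigned int init) : seed(init) {}

    // random_shuffle calls this with limit = n and expects a result in [0, n).
    // Assumes 0 < limit <= RAND_MAX, otherwise divisor would be zero.
    int operator()(int limit) {
        int divisor = RAND_MAX/limit;
        int retval;

        // Discard values that fall in the uneven leftover bucket so that
        // every result is equally likely.
        do { 
            retval = rand_r(&seed) / divisor;
        } while (retval >= limit);

        return retval;
    }        
};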

You'd use it with random_shuffle something like:

std::random_shuffle(whatever.begin(), whatever.end(), rand_x(some_seed));
Jerry Coffin
  • Thanks. Seed should be an unsigned int. Also, why is there a do while loop as opposed to returning rand_r(&seed) % limit? Is there something subtle that I'm missing? – Mark Jul 05 '11 at 20:14
  • @Mark: Oops -- fixed. About the do loop, consider how you divide 10 candies equally between 3 children (and you can't break one candy into pieces). The answer is that you can't -- you can only distribute 9 of them. This is basically doing the same thing for dividing `RAND_MAX` candies between `limit` children, and discarding any left-overs so all the piles are equal. Using `%limit` (or `/divisor`) by itself *can't* divide the numbers evenly unless `RAND_MAX` happens to be an exact multiple of `limit` (and `RAND_MAX` is often prime, so it's not an exact multiple of *any* meaningful `limit`). – Jerry Coffin Jul 05 '11 at 20:39
  • Although I agree with everything Jerry said, one might argue that if you're using `rand_r`, then you're not entitled to assume that it has a uniform distribution to preserve. But at least this way, *if* `rand_r` is good, then your shuffle is good too, you aren't introducing any new bias. – Steve Jessop Jul 06 '11 at 00:48
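
The candy analogy in the comments above can be checked by brute force. The sketch below counts how many raw generator values land in each output bucket, once with plain modulo and once with the divide-and-reject scheme. The helper name `bucket_counts` and the toy range of 10 raw values are assumptions made for illustration; they are not part of the original answer.

```cpp
#include <vector>

// Count how many of the raw values 0 .. raw_count-1 map to each result in
// [0, limit). Hypothetical helper, for illustration only.
// Assumes 0 < limit <= raw_count so that the divisor is never zero.
std::vector<int> bucket_counts(int raw_count, int limit, bool reject)
{
    std::vector<int> counts(limit, 0);
    int divisor = raw_count / limit;      // size of each equal pile
    for (int raw = 0; raw < raw_count; ++raw) {
        if (reject) {
            int result = raw / divisor;
            if (result < limit)           // leftovers are discarded here; a real
                ++counts[result];         // generator would simply draw again
        } else {
            ++counts[raw % limit];        // plain modulo keeps the leftovers
        }
    }
    return counts;
}
```

With 10 raw values and a limit of 3, `bucket_counts(10, 3, false)` gives piles of 4, 3 and 3, while `bucket_counts(10, 3, true)` gives 3, 3 and 3: the rejected value is the tenth candy.
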
3

You need to provide a random number generator function or functor object that takes a value n of some integral type and returns a value of an integral type in the range [0, n), so that the result never indexes outside the range covered by the iterators you've passed into the function. A functor object must also implement operator() so that it can be called like a function. Because you need a thread-safe random number generator, using srand and rand from cstdlib is a bad idea. Instead, create a functor object that wraps a thread-safe random number generator, or one that keeps no globally accessible state, so that each thread works only with its own generator.
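
To make the calling convention concrete, here is a minimal sketch of how a shuffle can use such a generator, written as a plain Fisher-Yates loop. It is a stand-in for illustration, not the code of any particular standard library, and real implementations may differ in detail. The names `shuffle_sketch` and `always_zero` are hypothetical; `always_zero` is a deliberately degenerate generator that makes the loop easy to trace by hand.

```cpp
#include <algorithm>   // std::swap (pre-C++11 location)
#include <utility>     // std::swap (C++11 and later)

// Hypothetical stand-in for the three-argument random_shuffle.
template <typename RandomIt, typename Gen>
void shuffle_sketch(RandomIt first, RandomIt last, Gen gen)
{
    if (first == last) return;
    for (RandomIt it = first + 1; it != last; ++it) {
        // n counts the elements up to and including *it. The generator must
        // return an index in [0, n); n or more would reach past *it.
        long n = (it - first) + 1;
        std::swap(*it, *(first + gen(n)));
    }
}

// Degenerate generator used only to trace the loop: always picks index 0.
struct always_zero {
    long operator()(long) const { return 0; }
};
```

Running `shuffle_sketch` over `{0, 1, 2, 3}` with `always_zero()` swaps each element in turn with the first one, which shows why a generator that ignores its argument, or returns values outside [0, n), cannot produce a valid shuffle.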

For instance, suppose you have a random number generator from another library that produces values within a range you specify. Depending on which library you use, your functor could look something like the following:

class my_rand_gen
{
   private:
       //random_gen_type is a placeholder for the generator type your library provides
       random_gen_type random_range_gen;

   public:
       explicit my_rand_gen(const random_gen_type& gen):
                     random_range_gen(gen) {}

       int operator()(int value) 
       { 
           //random_shuffle requires a result in the range [0, value)
           //assumes random_range_gen(lo, hi) returns a value between lo and hi inclusive
           return random_range_gen(0, value - 1); 
       }
};

Now you can call the algorithm like:

random_shuffle(my_vector_start_iter, my_vector_end_iter, 
               my_rand_gen(rand_generator_lib));

and it will shuffle the elements between the start and end iterators without reading or writing outside that range, because every index the functor returns is smaller than the value random_shuffle passes in.

Jason
1

Probably not.

Use the second version of random_shuffle, which takes a template parameter for the random number generator: http://www.sgi.com/tech/stl/random_shuffle.html. The generator must meet the requirements described at: http://www.sgi.com/tech/stl/RandomNumberGenerator.html

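#include <stdlib.h>   // rand_r (POSIX)

struct MyRandomNumberGenerator
{
    unsigned int seed;   // assumption: one generator object, and one seed, per thread
    explicit MyRandomNumberGenerator(unsigned int s) : seed(s) {}

    int operator()(int limit)
    {
         // get threadsafe random number in [0, limit)
         // (plain modulo is slightly biased; shown only as one way to fill in this step)
         return rand_r(&seed) % limit;
    }
};

// Stuff
// thread_seed is a placeholder for a seed that differs per thread
random_shuffle(list.begin(), list.end(), MyRandomNumberGenerator(thread_seed));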
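
For readers on C++11 or later, a different technique from the one in this answer is available: std::shuffle takes a random engine object instead of a range-taking functor, and std::random_shuffle itself was deprecated in C++14 and removed in C++17. The sketch below gives each thread its own engine through `thread_local`. The function names `thread_engine` and `shuffle_in_this_thread` are hypothetical, and it assumes std::random_device is an adequate seed source on your platform.

```cpp
#include <algorithm>   // std::shuffle
#include <random>      // std::mt19937, std::random_device
#include <vector>

// Each thread that calls this gets its own engine, seeded once on first use.
// Assumption: std::random_device provides usable seeds on the target platform.
std::mt19937& thread_engine()
{
    thread_local std::mt19937 engine{std::random_device{}()};
    return engine;
}

// Shuffle one list using the calling thread's private engine, so no
// generator state is shared between threads.
void shuffle_in_this_thread(std::vector<int>& items)
{
    std::shuffle(items.begin(), items.end(), thread_engine());
}
```

Each worker thread would call `shuffle_in_this_thread` on its own list. Sharing one list between threads would still need separate synchronization.
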
Martin York