2

I've got a program which makes heavy use of random numbers to decide what it needs to be doing, and has many, many execution paths based on the output of the PRNG. After pulling my hair out trying to debug it, I decided to make all PRNG calls reference the same Random instance, which is seeded with a hard-coded number at instantiation. That way, every time I run the program, the same bug should appear. Unfortunately, I still get different bugs each time I run it (though it seems to behave almost the same way).

I've searched the code many many times for any missed calls to Math.random() and I assure you there are none.

Any ideas?

Edit: I have confirmed that numbers being generated are the same, yet behaviour is still non-deterministic. This program is not multi-threaded. Still completely baffled.

So the PRNG is behaving as expected, but I still have non-determinism. What are some ways that non-determinism might inadvertently be brought into a program?

Lynden Shields
  • 1. Post some code. 2. Do you have multiple threads using the generator? 3. There are some bugs in Sun's Random implementation when using WebStart, perhaps it is the case. – npe May 21 '12 at 13:03
  • I can't reproduce the problem within a small amount of code. There's a few thousand lines of code in the program at the moment. – Lynden Shields May 21 '12 at 13:04
  • No multiple threads, not using WebStart. – Lynden Shields May 21 '12 at 13:28

3 Answers

3

OK, I think I've located my source of non-determinism. I was iterating over a HashSet at one point. The HashSet would have been populated with the same things in the same order, but because I haven't overridden hashCode() in the class being added to the HashSet, it would be defaulting to the identity hash of each instance, which can differ from run to run (for example, depending on the memory location of each instance). The iteration order of the HashSet therefore changed between runs.

Changing each HashSet to a LinkedHashSet has given me consistent results for ~30 runs now, where before it would only give me the same behaviour a few times in a row at most.
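The effect described above can be sketched roughly as follows. This is a minimal illustration, not the original program: the class names IterationOrderDemo and Node are hypothetical, and Node deliberately does not override hashCode() or equals(). HashSet makes no guarantee about iteration order, whereas LinkedHashSet iterates in insertion order.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class IterationOrderDemo {

    // Hypothetical element type: no hashCode()/equals() override,
    // so it falls back to the identity hash inherited from Object.
    public static final class Node {
        final String name;
        Node(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    // Adds the same nodes in the same order and returns the order
    // in which the set hands them back during iteration.
    public static List<String> iterationOrder(Set<Node> set) {
        for (String name : new String[] {"a", "b", "c", "d", "e"}) {
            set.add(new Node(name));
        }
        List<String> order = new ArrayList<>();
        for (Node n : set) {
            order.add(n.name);
        }
        return order;
    }

    public static void main(String[] args) {
        // Depends on identity hashes, so the order may change between JVM runs.
        System.out.println("HashSet:       " + iterationOrder(new HashSet<>()));
        // Insertion order, regardless of hash values.
        System.out.println("LinkedHashSet: " + iterationOrder(new LinkedHashSet<>()));
    }
}
```

Running this several times in separate JVM processes may show the HashSet line changing while the LinkedHashSet line stays the same. Overriding hashCode() and equals() with stable values would be another way to get a repeatable order from a HashSet.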

Lynden Shields
2

OK, so this seems like voodoo. Try creating a custom PRNG that wraps a Random object and logs every call with a stack trace.

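import java.util.Random;
import java.util.logging.Level;
import java.util.logging.Logger;

// Wraps a single seeded Random and logs every call with a stack trace.
public class CustomRNG {

    private static final Logger logger = Logger.getLogger(CustomRNG.class.getName());
    // Fixed seed so every run draws the same sequence.
    private static final Random random = new Random(1234);

    public int nextInt() {
        int val = random.nextInt();
        log(val);
        return val;
    }

    // The Throwable is never thrown; it only carries the caller's stack trace.
    private void log(int value) {
        logger.log(Level.INFO, "value: " + value, new Throwable());
    }
}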

This will log every call to nextInt, including the value and a stack trace (add call counting if you like). Run your app a few times, diff the logs, and see at which point (which stack trace) the runs diverge.
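A variant with call counting and a nextDouble() method might look like the sketch below. The class name CountingRNG and the callCount() method are hypothetical additions for illustration, and the counter assumes single-threaded use, as in the question.

```java
import java.util.Random;
import java.util.logging.Level;
import java.util.logging.Logger;

public class CountingRNG {

    private static final Logger LOGGER = Logger.getLogger(CountingRNG.class.getName());

    private final Random random;
    // Number of values drawn so far (not thread-safe; single-threaded use assumed).
    private long calls = 0;

    public CountingRNG(long seed) {
        this.random = new Random(seed);
    }

    public int nextInt() {
        int val = random.nextInt();
        log(String.valueOf(val));
        return val;
    }

    public double nextDouble() {
        double val = random.nextDouble();
        log(String.valueOf(val));
        return val;
    }

    public long callCount() {
        return calls;
    }

    // Logs the call index, the value, and the caller's stack trace.
    private void log(String value) {
        calls++;
        LOGGER.log(Level.INFO, "call #" + calls + " value: " + value, new Throwable());
    }

    public static void main(String[] args) {
        CountingRNG rng = new CountingRNG(1234L);
        rng.nextInt();
        rng.nextDouble();
        System.out.println("calls: " + rng.callCount());
    }
}
```

With the call index in each log line, diffing the logs of two runs shows the first call whose stack trace differs, even when the values themselves are identical.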

npe
  • btw: it's Logger.getLogger instead of getInstance, and you have a superfluous closing paren after value in your log method. – Lynden Shields May 21 '12 at 21:48
  • added nextDouble() method as well, because that's what I was using, and a new log(double value) to match. Ran the program a couple of times, until it diverged, and diffed the results. Took a while, but eventually I could see that the program's execution path has diverged without the numbers generated diverging. I wonder if there's some way that the order of things going to stderr can change between runs? In Java, the buffer should flush whenever System.err.println is called, right? – Lynden Shields May 21 '12 at 21:49
  • are there any methods in the Java standard libraries that are non-deterministic? e.g. (which I know isn't true, but just to explain my point) maybe something like Math.round() will randomly choose which int to return if it's right in the middle. – Lynden Shields May 21 '12 at 21:56
  • @LyndenShields: 1. Fixed the code. 2. There is no non-determinism that I know of. If your code diverges and the RNG does not, then there is something wrong with the code, not the RNG. – npe May 30 '12 at 20:15
0

Possible causes of continued non-determinism (in decreasing order of likelihood):

  • You haven't replaced all the PRNG calls with the same random instance. You probably want to check this first :-) A good IDE should help you track down all the references to the Random class.
  • There are timing effects caused by concurrency (e.g. it matters which thread calls the PRNG first)
  • You have some form of external input into the system (e.g. user input? actions taken based on the system timer?)
  • Some library you are using has a non-deterministic element which is affecting the behaviour of your program (for example, some sorting algorithms use random numbers, which can affect the ordering of the results they return)
  • You are hitting some kind of environmental constraint (e.g. an OutOfMemoryError or IOException error which is happening sometimes but being caught and recovered from in different ways, or the GC deciding to clear some soft/weak references)
  • Cosmic rays / hardware errors corrupting memory
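As one illustration of the first and fourth bullets, a library call can draw on its own source of randomness unless you hand it your seeded instance. The sketch below uses Collections.shuffle from the standard library; the class name SeededShuffleDemo and the seed value are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class SeededShuffleDemo {

    // Shuffles a fresh list of 1..8 using a Random created from the given seed.
    public static List<Integer> shuffled(long seed) {
        List<Integer> items = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8));
        Collections.shuffle(items, new Random(seed));
        return items;
    }

    public static void main(String[] args) {
        // Same seed, same order: prints true.
        System.out.println(shuffled(1234L).equals(shuffled(1234L)));

        // No Random supplied: the library uses its own source of randomness,
        // so this order is not reproducible from run to run.
        List<Integer> items = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8));
        Collections.shuffle(items);
        System.out.println(items);
    }
}
```

Searching your own code for Math.random() would not find the second call, because the randomness comes from inside the library.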
mikera
  • 1) I've searched for any occurrence of text containing 'Math' or 'random', and also specifically for references to the Math class, the Math.random() method and the Random class, and there are none in the project (there are some in included external libraries, but they shouldn't affect it, right?) 2) Not doing any multithreading at all, so that shouldn't happen, I believe 3) I am reading some files, but only once at the start 4) Don't think so, but I will look into that 5) I did just move to a higher altitude :) – Lynden Shields May 21 '12 at 13:12
  • Regarding your 4th point, that shouldn't be an issue if I'm not catching any exceptions anywhere, should it? – Lynden Shields May 21 '12 at 13:14
  • @LyndenShields - no that's not correct. Threading can result in "exception recovery" if an unchecked exception is not caught by a child thread. The child thread dies, but this can go unnoticed / unnoted, and the rest of the program continues. – Stephen C May 21 '12 at 13:19
  • External libraries could be the issue (I've added an extra bullet explaining this). Regarding the environmental constraints - not everything throws an exception. The GC can, for example, clear weak references which could again affect your program execution without throwing an exception. – mikera May 21 '12 at 13:20
  • @StephenC Is that still true of my program if I'm not threading at all? – Lynden Shields May 21 '12 at 13:22
  • Also, shouldn't my instance of Random be completely isolated from anything else that could be playing with instances of random? – Lynden Shields May 21 '12 at 13:24
  • @LyndenShields - the point is that you HAVE got non-determinacy, and it isn't coming from your random number generator ... if it has been properly isolated. – Stephen C May 21 '12 at 22:29