I'm trying to understand how the random state propagates to the threads when using peach, but I can't find much documentation on it.

Empirically, from the test below, it seems that the threads do end up with different seeds, but these seeds may be unrelated to the random seed of the master.

q)\S 123
q){[x] show 20?10} peach til 10
5 1 5 7 4 4 2 2 4 9 0 7 6 4 5 2 1 7 1 9
0 3 9 6 4 3 6 0 9 1 6 7 4 4 0 0 7 0 8 4
5 7 9 4 6 8 8 6 4 6 4 7 7 6 8 9 8 7 4 2
2 8 4 2 3 2 0 3 6 3 1 2 7 8 8 3 9 2 7 6
1 0 3 9 5 2 7 5 5 3 1 2 6 1 8 9 5 2 5 5
2 6 6 7 2 0 6 9 1 4 7 9 9 8 2 7 1 4 4 3
3 2 2 0 0 9 2 6 3 1 6 1 6 5 3 4 6 0 8 9
2 3 0 3 4 3 9 0 8 1 5 7 1 1 3 3 0 5 3 4
0 9 5 2 1 2 0 2 6 4 5 7 9 7 2 6 9 1 9 8
5 7 8 0 5 6 0 2 0 0 0 5 6 4 4 8 3 9 9 2

However, running both lines a second time does not reproduce this output.

So my questions are:

  1. Is there a relationship between the seeds at the threads and the random state of the master?
  2. Are we guaranteed to get different seeds at each thread?
  3. Any suggestions on best practices for setting the seed at each thread or other ways of getting reproducible results?

Thanks

1 Answer

This wasn't always the case; it was fixed in v3.1. Compare the number of distinct results under v3.0 and v4.0:

q).z.K
3f
q)count distinct {10?10}peach til 1000
931

q).z.K
4f
q)count distinct {10?10}peach til 1000
1000

From the release notes:

2013.08.19
FIX
the random number generator was not thread-safe. Now the behaviour is as follows
 rng is thread local.
 \S 1234 sets the seed for the rng for the main thread only.
 the rng in a slave thread is assigned a seed based on the slave thread number.
 in multithreaded input mode, the seed is based on socket descriptor.
 instances started on ports 20000 thru 20099 (slave procs, used with e.g. q -s -4) have the main thread's default seed based on the port number.
q)\s
-2i
q)
q){value"\\q -p ",string x}each 6000+til 2;
q).z.pd:`u#hopen each 6000+til 2
q)count distinct{10?20}peach til 1000 // uniqueness not maintained across slave processes
502
q)count distinct({10?20}each til 1000),{10?20}peach til 1000 // uniqueness not maintained across master+slaves
1002
q)
q){value"\\q -p ",string x}each 20000+til 2; // ports 20000 & above as per notes
q).z.pd:`u#hopen each 20000+til 2
q)count distinct{10?20}peach til 1000 // unique across slaves
1000
q)count distinct({10?20}each til 1000),{10?20}peach til 1000 // unique across master+slaves
2000
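
On the reproducibility question (point 3), one approach that relies only on the behaviour quoted above is to do all the random draws on the main thread, where \S applies, and hand the results to peach as data. This is a minimal sketch rather than an official recipe; `draws` and `res` are illustrative names, and `sum` stands in for whatever work each task actually does:

```q
/ seed the main thread's generator (per the release notes, \S affects the main thread only)
\S 123
/ draw every random vector on the main thread: 10 vectors of 20 values each
draws:{20?10} each til 10
/ secondary threads receive the pre-drawn data and never call ? themselves
res:{sum x} peach draws
```

Because the secondary threads never draw random numbers, their thread-local seeds no longer matter, and rerunning from the \S line should give the same `draws`. The trade-off is that the random generation itself is no longer done in parallel.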
  • Thank you for the clarification. Luckily I'm only using v3.5, 3.6. I don't understand "have the main thread's default seed based on the port number." In particular, what does "based on the port number" mean? Does q take the main thread's seed and modify it based on the port #? – apoursh Oct 22 '20 at 13:25
  • I don't know the details, but it seems so. I've added an example to my original post. – jasonfealy Oct 22 '20 at 14:35
  • They have added a clarification of this behavior to the docs: https://code.kx.com/q/basics/syscmds/#s-random-seed (or maybe it was there before and I didn't see it) – apoursh Dec 09 '20 at 02:40