How to specify a random seed while using Python's numpy random choice?

Question

I have a list of four strings. In a pandas DataFrame, I want to create a column by randomly selecting a value from this list for each row. I am using NumPy's random.choice, but according to its documentation there is no seed option. How can I specify a random seed so that the random assignment is the same every time?

service_code_options = ['899.59O', '12.42R', '13.59P', '204.68L']
df['SERVICE_CODE'] = [np.random.choice(service_code_options) for _ in df.index]

score 9 · Accepted Answer · edited Jul 31 '20 at 21:42

You need to set the seed beforehand with numpy.random.seed. The list comprehension is also unnecessary, because numpy.random.choice accepts a size parameter:

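import numpy as np
import pandas as pd

# Seed NumPy's global random state so the draws below are reproducible
np.random.seed(123)

df = pd.DataFrame({'a': range(10)})

service_code_options = ['899.59O', '12.42R', '13.59P', '204.68L']
# Draw one code per row in a single vectorized call
df['SERVICE_CODE'] = np.random.choice(service_code_options, size=len(df))
print(df)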
   a SERVICE_CODE
0  0       13.59P
1  1       12.42R
2  2       13.59P
3  3       13.59P
4  4      899.59O
5  5       13.59P
6  6       13.59P
7  7       12.42R
8  8      204.68L
9  9       13.59P

edited Jul 31 '20 at 21:42 by Trenton McKinney

answered Oct 25 '18 at 14:27 by jezrael

Question: does np.random.seed(123) apply to all subsequent code that calls NumPy's random functions? If so, is there a way to cancel it? And if I want to create another variable using a different seed, do I call np.random.seed(897) again to affect the subsequent code? – KubiK888 Oct 25 '18 at 15:04
Got the answer here: https://stackoverflow.com/questions/49966770/how-to-cancel-the-effect-of-numpy-seed. Thanks. – KubiK888 Oct 25 '18 at 15:09
@KubiK888 - So sorry, I was offline. – jezrael Oct 26 '18 at 04:43
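
As a rough illustration of the question raised in the comments above: np.random.seed sets NumPy's global legacy random state, so it affects every later call to the np.random.* functions until the seed is set again. The sketch below (variable names are illustrative, not from the original posts) re-seeds the global state, and also shows np.random.RandomState as one way to keep separate, independently seeded streams.

```python
import numpy as np

service_code_options = ['899.59O', '12.42R', '13.59P', '204.68L']

# Seeding sets the global state; every later np.random.* call draws from it.
np.random.seed(123)
first = np.random.choice(service_code_options, size=5)

# Calling seed again simply resets the global state to a new starting point.
np.random.seed(897)
other = np.random.choice(service_code_options, size=5)

# Re-using the original seed replays the original sequence.
np.random.seed(123)
replay = np.random.choice(service_code_options, size=5)
print((first == replay).all())  # True

# Alternative: separate RandomState objects, each with its own seed,
# which do not touch the global state.
stream_a = np.random.RandomState(123)
stream_b = np.random.RandomState(897)
col_a = stream_a.choice(service_code_options, size=5)
col_b = stream_b.choice(service_code_options, size=5)
```

With separate RandomState objects, each variable gets its own reproducible stream and there is no need to "cancel" a global seed.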

score 3 · Answer 2 · piRSquared · 2018-10-25T14:30:47.973

See the documentation for numpy.random.seed.

np.random.seed(this_is_my_seed)

The seed can be an integer or a list of integers:

np.random.seed(300)

Or

np.random.seed([3, 1415])

Example

np.random.seed([3, 1415])

service_code_options = ['899.59O', '12.42R', '13.59P', '204.68L']
np.random.choice(service_code_options, 3)

array(['899.59O', '204.68L', '13.59P'], dtype='<U7')

Notice that I passed 3 to the choice function to specify the size of the output array.

See also the documentation for numpy.random.choice.

edited Oct 25 '18 at 14:30

answered Oct 25 '18 at 14:26 by piRSquared

What would a list of integers do? Use the n-th element as seed after n random() calls? – Guimoute Oct 25 '18 at 14:28
Nothing special except to provide a different seed – piRSquared Oct 25 '18 at 14:29
Yes, but is the list parsed in order? – Guimoute Oct 25 '18 at 14:30
No, the whole list is just a thing that provides a starting point for the randomization. – piRSquared Oct 25 '18 at 14:31
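
A small sketch of what the comments above describe: a list seed is treated as a single starting point, so seeding with the same list again reproduces the same draws. This only demonstrates reproducibility; it makes no claim about how NumPy combines the list elements internally.

```python
import numpy as np

service_code_options = ['899.59O', '12.42R', '13.59P', '204.68L']

# Seed the global state with a list of integers and draw three values.
np.random.seed([3, 1415])
draw_1 = np.random.choice(service_code_options, 3)

# Seeding with the same list again restarts the same sequence.
np.random.seed([3, 1415])
draw_2 = np.random.choice(service_code_options, 3)

print((draw_1 == draw_2).all())  # True
```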

score 2 · Answer 3 · answered Dec 22 '22 at 10:36

According to the notes for numpy.random.seed in NumPy v1.24:

Best practice is to use a dedicated Generator instance rather than the random variate generation methods exposed directly in the random module.

Such a Generator is constructed using np.random.default_rng.

Thus, instead of np.random.seed, the current best practice is to construct a Generator with np.random.default_rng and a seed, and then use that Generator for reproducible results.

Combining jezrael's answer and the current best practice, we have:

import pandas as pd
import numpy as np

rng = np.random.default_rng(seed=121)

df = pd.DataFrame({'a':range(10)})

service_code_options = ['899.59O', '12.42R', '13.59P', '204.68L']
df['SERVICE_CODE'] = rng.choice(service_code_options, size=len(df))

print(df)

   a SERVICE_CODE
0  0       12.42R
1  1       13.59P
2  2       12.42R
3  3       12.42R
4  4      899.59O
5  5      204.68L
6  6      204.68L
7  7       13.59P
8  8       12.42R
9  9       13.59P
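
One practical difference that may be worth sketching: a Generator carries its own state, so it should not be affected by np.random.seed or by other code that uses the global np.random.* functions. The minimal example below (variable names are illustrative) creates two generators with the same seed and checks that they produce the same choices regardless of what happens to the global state in between.

```python
import numpy as np

service_code_options = ['899.59O', '12.42R', '13.59P', '204.68L']

# First generator, seeded explicitly.
rng_1 = np.random.default_rng(seed=121)
codes_1 = rng_1.choice(service_code_options, size=10)

# Changing and using the global legacy state does not touch the generators.
np.random.seed(0)
np.random.choice(service_code_options, size=10)

# A second generator with the same seed replays the same sequence.
rng_2 = np.random.default_rng(seed=121)
codes_2 = rng_2.choice(service_code_options, size=10)

print((codes_1 == codes_2).all())  # True
```

Passing such a Generator into the functions that need randomness keeps each piece of code reproducible on its own.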
