I want to generate the following artificial dataset to test a contextual bandit algorithm. What is the easiest way to get it done in python may be? Can anyone point me to a link which demonstrates a code for it?
The unit vectors θ1 , ..., θK for K actions are drawn uniformly from Rd . in each iteration t of T complete iterations, a context xt is first sampled from an uniform distribution within ∥x| ≤ 1.