What methods can I use to predict the probability distribution through Reinforcement Learning?

Question

I'm wondering what methods I can use to predict a probability distribution. The details of the model I want to train are as follows:

Input: feature vector, weight vector, result value (the initial vector is generated by random sampling from a uniform distribution; the result value is a real number such as 120.12)

Assuming that 'score' is defined as 'score := dot(feature vector, weight vector)', I want to train a model that predicts a probability distribution over the weight vector such that the score approximates the result value (that is, the difference between the score and the result value is minimized). The program then draws a weight vector from the learned distribution by random sampling and uses it to compute the score.
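To make the setup concrete, here is a minimal sketch of sampling weight vectors from a distribution and scoring them. The independent Gaussian parameterization (mu, sigma), the vector length, and the helper name sample_scores are assumptions for illustration only; the question does not specify a distribution family.

```python
import numpy as np

rng = np.random.default_rng(0)

n_features = 5                                      # assumed vector length
feature = rng.uniform(0.0, 1.0, size=n_features)    # feature vector drawn uniformly
result = 120.12                                     # example result value from the question

# Assumed parameterization: one independent Gaussian per weight.
mu = np.zeros(n_features)
sigma = np.ones(n_features)

def sample_scores(mu, sigma, feature, n_samples, rng):
    """Draw weight vectors from N(mu, sigma^2) and return dot(feature, weight) for each draw."""
    weights = rng.normal(mu, sigma, size=(n_samples, mu.shape[0]))
    return weights @ feature

scores = sample_scores(mu, sigma, feature, 1000, rng)

# The quantity to minimize: how far sampled scores are from the result value.
mean_abs_error = np.mean(np.abs(scores - result))
print(mean_abs_error)
```

With an untrained distribution centered at zero, the error is roughly the size of the result value itself; training means adjusting mu and sigma so this error shrinks.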

In this case, what model can I use to learn the most appropriate probability distribution over the weight vector? I want to use reinforcement learning, but I'm not sure whether reinforcement learning fits this situation. What methods can I use?

score -1 · Answer 1 · answered Mar 13 '23 at 11:03


I think this is just a simple regression problem, which only requires a linear or non-linear function to predict the result, so you may not need any RL algorithm. That said, if you do want to use RL, I'd suggest trying a policy gradient algorithm such as simple REINFORCE or VPG/DDPG. The weight vector plays the role of the policy π, and the feature vector is equivalent to the state-action pair in RL.
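As a rough illustration of the policy gradient suggestion, the sketch below runs a REINFORCE-style update on the mean of a Gaussian over the weight vector. Everything here is an assumption for demonstration: the synthetic data, the hidden true_w, the fixed sigma, the learning rate, and the reward defined as negative mean squared error between score and result. It is not a tuned or definitive implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumption): results are generated by a hidden weight vector.
n_samples, n_features = 200, 4
true_w = np.array([2.0, -1.0, 0.5, 3.0])
X = rng.uniform(0.0, 1.0, size=(n_samples, n_features))
y = X @ true_w

# Policy: weight ~ N(mu, sigma^2 I). Only mu is learned; sigma is kept fixed for simplicity.
mu = np.zeros(n_features)
sigma = 0.1
lr = 0.1
n_draws = 64

def reward(W):
    """Negative mean squared error between scores and results, one value per weight vector."""
    scores = W @ X.T                      # shape (n_draws, n_samples)
    return -np.mean((scores - y) ** 2, axis=1)

initial_mse = -reward(mu[None, :])[0]

for step in range(2000):
    eps = rng.standard_normal((n_draws, n_features))
    W = mu + sigma * eps                  # sample weight vectors from the policy
    R = reward(W)
    advantage = R - R.mean()              # batch-mean baseline to reduce variance
    # Gradient of log N(w; mu, sigma^2 I) with respect to mu is (w - mu) / sigma^2 = eps / sigma.
    grad_mu = (advantage[:, None] * eps / sigma).mean(axis=0)
    mu += lr * grad_mu                    # gradient ascent on expected reward

final_mse = -reward(mu[None, :])[0]
print(mu, initial_mse, final_mse)
```

In this toy setting the learned mean should end up close to the hidden weights, which also illustrates the first point of the answer: plain regression solves the same problem directly, and the RL formulation mainly adds sampling noise.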

answered Mar 13 '23 at 11:03

Junyang zhao

1

If the feature vector is a boolean vector, such as [0, 0, 1, 1, 0, 0, ..., 1], do you think a linear/non-linear function would work well? The input vector is boolean, but I want to build a model that generates a score (np.dot(feature vector, weight)) close to the result value. Linear regression is for predicting a 'linear' value like a housing price, and logistic regression is for predicting a category like red (0) or blue (1). What would be the best method? – JaeHyeok Mar 17 '23 at 06:32
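On the boolean-feature concern: ordinary least squares places no restriction on the inputs being continuous, only the target needs to be real-valued. The sketch below fits weights to 0/1 feature vectors with np.linalg.lstsq. The data, the hidden true_w, and the noise level are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumption): boolean features, real-valued results.
n_samples, n_features = 300, 6
true_w = rng.uniform(-50.0, 50.0, size=n_features)
X = rng.integers(0, 2, size=(n_samples, n_features)).astype(float)
y = X @ true_w + rng.normal(0.0, 0.5, size=n_samples)   # results with a little noise

# Least-squares weights that minimize ||X @ w - y||^2.
w_hat, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)

scores = X @ w_hat                                       # np.dot(feature vector, weight) per row
rmse = np.sqrt(np.mean((scores - y) ** 2))
print(w_hat, rmse)
```

If a distribution over weights is still wanted rather than a point estimate, a Bayesian linear regression would be one standard option on the same kind of data.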
