I have a data set of about 1k records, and my job is to build a decision algorithm from those records. Here is what I can share:
The target is a continuous value.
Some of the predictors (attributes) are continuous, some are discrete, and some are arrays of discrete values (a single record can have more than one option selected at once).
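To make the layout concrete, here is a made-up example of what one record might look like (the field names and values are invented purely for illustration):

```python
# One hypothetical record (names and values invented for illustration):
record = {
    "age": 42.7,            # continuous predictor
    "category": "B",        # discrete predictor (single value)
    "tags": ["red", "xl"],  # array of discrete values (can hold several options)
    "target": 13.9,         # continuous target
}
```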
My initial thought was to split the arrays of discrete values apart and turn each possible option into an individual binary feature (predictor). For the continuous predictors I was thinking of picking a few candidate decision boundaries at random and seeing which one reduces the impurity the most (since the target is continuous, that would be variance/standard deviation reduction rather than entropy). Then I would build a decision tree (or a random forest) that uses standard deviation reduction when growing the tree. A rough sketch of this pipeline is below.
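In case it helps, here is a minimal sketch of the pipeline I have in mind, using pandas and scikit-learn (the column names, values, and library choice are my own assumptions, not part of the actual data set):

```python
import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.ensemble import RandomForestRegressor

# Toy frame standing in for my data; names and values are made up.
df = pd.DataFrame({
    "age":      [42.7, 31.0, 55.2, 28.4],
    "category": ["B", "A", "B", "C"],
    "tags":     [["red", "xl"], ["blue"], ["red"], ["blue", "xl"]],
    "target":   [13.9, 7.2, 21.5, 9.8],
})

# Expand the array-valued column into one 0/1 indicator per option.
mlb = MultiLabelBinarizer()
tag_cols = pd.DataFrame(mlb.fit_transform(df["tags"]),
                        columns=[f"tag_{t}" for t in mlb.classes_],
                        index=df.index)

# One-hot encode the single-valued discrete column; keep continuous as-is.
X = pd.concat([df[["age"]],
               pd.get_dummies(df["category"], prefix="category"),
               tag_cols], axis=1)
y = df["target"]

# A regression forest: scikit-learn scans candidate split thresholds for the
# continuous features and picks the one that most reduces squared error
# (i.e., variance), so the decision boundaries don't have to be chosen by hand.
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X, y)
```

My understanding is that the regressor's split criterion already does the variance-reduction search over thresholds for me, so randomly sampling a few boundaries myself may be unnecessary.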
My question is: Am I on the right path? Is there a better way to do that?