I am trying to model a process. My input data includes certain features and measurements about the product. I built random forest and gradient boosting models in Python and got good results. I am now trying to determine which features and measurements lead to the best product (almost like inverting an equation to recover the x values that give a particular y). How can I go about doing this?

Derek Langley
  • If you are using a random forest, you can check the feature importances. Roughly, a feature scores higher the more it is used for splits, especially near the top of the trees. – venkata krishnan Jul 25 '19 at 14:27

1 Answer

This is essentially feature selection, so here are some methods you could try.

Feature selection

I used some of the methods below for my own feature selection. They rank features by how well the spread of each feature's values separates the target classes:

  1. Fisher score
  2. F-score
  3. Chi-squared

I found the above useful.
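
As a rough illustration of the first method, here is a minimal sketch of a Fisher score in plain Python. It assumes the target has been turned into classes (for example good vs. bad product), which the question does not state. The function name `fisher_scores` and the toy data are made up for this example. The score for each feature is the between-class spread of its class means divided by its within-class variance, so a higher score means the feature separates the classes better.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def fisher_scores(X, y):
    """Return one Fisher score per feature column of X (a list of rows)."""
    n_features = len(X[0])

    # Group the rows by class label.
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)

    scores = []
    for j in range(n_features):
        overall_mean = fmean(row[j] for row in X)
        between = 0.0  # spread of the class means around the overall mean
        within = 0.0   # spread of the values inside each class
        for rows in rows_by_class.values():
            values = [row[j] for row in rows]
            n_k = len(values)
            between += n_k * (fmean(values) - overall_mean) ** 2
            within += n_k * pvariance(values)
        # A feature that is constant inside every class has no within-class spread.
        scores.append(between / within if within > 0 else float("inf"))
    return scores


# Hypothetical toy data: two measurements per product, label 1 = good product.
X = [[1.0, 5.0], [1.2, 3.0], [0.9, 4.0], [3.0, 5.0], [3.1, 3.0], [2.9, 4.0]]
y = [0, 0, 0, 1, 1, 1]

scores = fisher_scores(X, y)
# Feature indices ordered from most to least discriminative.
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
print(scores, ranking)
```

For the other two methods, scikit-learn provides `f_classif` and `chi2` in `sklearn.feature_selection`, usually combined with `SelectKBest`. Note that `chi2` expects non-negative feature values.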

Sundeep Pidugu