I need to compute mutual information (MI) for continuous/numeric features so that I can do feature selection based on it. My features look like this:
feature1: can take any value between 1 and 10000
feature2: measures time spent on something, so it can take any value, but always a (large) integer
.... and I have more features of this kind.
I am confused about how to apply the mutual information formula to these features. Wikipedia says that integration is required for continuous variables.
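
For reference, these are the two forms of the definition as I understand them (standard textbook notation, where p(x, y) is the joint density or probability and p(x), p(y) are the marginals; the symbols X and Y are my own labels for a feature and the target):

```latex
% Continuous variables: double integral over the joint density
I(X;Y) = \int_{\mathcal{Y}} \int_{\mathcal{X}} p(x,y) \,
         \log \frac{p(x,y)}{p(x)\,p(y)} \, dx \, dy

% Discrete variables: double sum over the joint probability mass function
I(X;Y) = \sum_{y \in \mathcal{Y}} \sum_{x \in \mathcal{X}} p(x,y) \,
         \log \frac{p(x,y)}{p(x)\,p(y)}
```

The integral form needs the densities themselves, which I do not have; I only have samples.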
Do I need to discretize the features before computing MI?
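
To make the question concrete, this is roughly what I understand "discretize first" to mean. It is only a minimal sketch, assuming equal-width bins, a discrete class label, and a plug-in (count-based) estimate of the discrete MI formula. The function names, the bin count, and the toy data are all made up for illustration.

```python
import math
from collections import Counter


def equal_width_bins(values, n_bins=10):
    """Map each numeric value to a bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant feature: everything falls into a single bin.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def discrete_mi(xs, ys):
    """Plug-in estimate of mutual information (in nats) from paired samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


# Hypothetical toy data: one numeric feature and a binary class label.
feature1 = [12, 480, 9500, 7300, 55, 8800, 130, 9900]
label = [0, 0, 1, 1, 0, 1, 0, 1]

mi = discrete_mi(equal_width_bins(feature1, n_bins=4), label)
print(mi)
```

With this approach the result depends on the number of bins I pick, which is part of why I am unsure whether discretizing is the right thing to do.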