Order of preprocessing steps in the mlr package in R

Question

I am working with both the built-in preprocessing wrappers and my own wrappers in mlr, and I am wondering in which order the preprocessing steps are executed in the following example:

classif.lrn.net = makePreprocWrapperCaret(classif.lrn.net, ppc.nzv = TRUE, ppc.corr = TRUE, ppc.conditionalX = TRUE, ppc.center = TRUE, ppc.scale = TRUE, ppc.spatialSign = TRUE)

classif.lrn.net = makeSMOTEWrapper(classif.lrn.net)

classif.lrn.net = makeImputeWrapper(learner = classif.lrn.net, classes = list(numeric = imputeMedian(), integer = imputeMedian()))

From the mlr tutorial I know that within the caret preprocessing wrapper (makePreprocWrapperCaret) the operations are applied in the following order:

near-zero variance filter, correlation filter, imputation, spatial sign.

Moreover, the SMOTE wrapper will be processed before the caret wrapper (because it comes after the caret wrapper in the code).

But when will the imputation wrapper be processed? I think it is important that the imputation happens before the spatial sign transformation (this order is also implemented in the caret preprocessing wrapper). Since I am using my own imputation wrapper, I am not sure whether, and how, I can ensure that the imputation is done in between the different caret preprocessing steps.

score 1 · Accepted Answer · answered Nov 15 '18 at 15:56

The way you have specified it, the imputation happens at the beginning of the entire process, and only once (i.e. not in between different caret steps). The outermost wrapper is run first and doesn't know anything about the inner workings of the learner it wraps.
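To make the ordering concrete, here is a minimal base-R sketch of the nesting idea. It is a toy model, not mlr code: `make_wrapper` and `base_learner` are hypothetical names invented for illustration, and each "wrapper" only records its name before handing over to whatever it wraps.

```r
# Toy model of wrapper nesting (NOT the mlr implementation).
# Each wrapper does its own step first, then calls the learner it wraps.
make_wrapper <- function(inner, name) {
  # force() is needed because the variable passed in is reassigned below;
  # without it, lazy evaluation would make the wrapper call itself.
  force(inner)
  force(name)
  function(data, log = character()) {
    inner(data, c(log, name))
  }
}

# Hypothetical innermost learner: just records that fitting happened.
base_learner <- function(data, log) c(log, "fit")

# Same wrapping order as in the question: caret, then SMOTE, then impute.
lrn <- base_learner
lrn <- make_wrapper(lrn, "caret preprocessing")
lrn <- make_wrapper(lrn, "SMOTE")
lrn <- make_wrapper(lrn, "impute")

steps <- lrn(data = NULL)
print(steps)
# [1] "impute"              "SMOTE"               "caret preprocessing" "fit"
```

The wrapper added last in the code is the outermost one, so its step is the first to run.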

Lars Kotthoff

Thank you! So if it's not possible to insert the imputation wrapper in between the caret preprocessing steps, would you recommend putting it before or after the caret preprocessing steps in the process (i.e. the other way around in the code)? Isn't it most important that the imputation happens before the spatial sign transformation (i.e. that the imputation wrapper comes after the caret preprocessing wrapper in the code)? – funkfux Nov 15 '18 at 17:52
The way you have it set up should do exactly what you want, I think. While it's not possible to insert something directly in between caret preprocessing steps, you can split this up into multiple wrappers, i.e. one caret preprocessing wrapper for ppc, one for spatial, etc., and then layer them accordingly. – Lars Kotthoff Nov 15 '18 at 18:23
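
As a small illustration of why imputation should run before the spatial sign step, here is a base-R sketch. It assumes a simplified spatial sign (each row divided by its Euclidean norm, with no centering or scaling), and `impute_median` and `spatial_sign` are hypothetical helpers written for this example, not mlr or caret functions.

```r
# Hypothetical helper: replace NAs in each column by that column's median.
impute_median <- function(x) {
  for (j in seq_len(ncol(x))) {
    x[is.na(x[, j]), j] <- median(x[, j], na.rm = TRUE)
  }
  x
}

# Simplified spatial sign: project every row onto the unit sphere.
spatial_sign <- function(x) x / sqrt(rowSums(x^2))

# Three observations, two features; the second row has a missing value.
x <- matrix(c(3, NA, 0,
              4,  4, 3), ncol = 2)

good <- spatial_sign(impute_median(x))  # impute first, then spatial sign
bad  <- spatial_sign(x)                 # spatial sign on data with NAs

print(good)
print(bad)
```

In `bad`, the whole second row becomes NA, because the row norm cannot be computed, so even the observed value in that row is lost. In `good`, every row has length 1.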
