Let's say I fit a random forest or k-means model in R and get back a model object. Now I want to save that model for future use. I thought PMML was a good format, but then realized that R can only write PMML; it can't read it back into an object that can be used for scoring. Is there any alternative to saving the R object with the `save` command? That seems like a bloated solution, since the data that was used for training is attached to the object.
-
Where in a randomForest object is the training data stored? – joran Jan 30 '13 at 02:08
-
Is there any reason that you can't just strip out the training data and then store it using `save` (or `saveRDS`)? – Ryan C. Thompson Jan 30 '13 at 02:09
-
If data is being stored within the object, then it's likely that that data is useful for some of that object's methods. Maybe not the specific methods you're hoping to apply, but the data is probably there for a reason. – Marius Jan 30 '13 at 02:10
-
I should point out that the reason I ask is that I'm reasonably sure that the training data is not returned in a rf object. – joran Jan 30 '13 at 02:13
-
OP here. Random forest does not store the data, but kmeans does. Actually, it stores only the cluster indices of the training data, not the data itself. Yes, those can be stripped off. I was just wondering whether there was a better, more portable format. I thought that PMML was that format until I realized that R cannot yet read it; it can only write it. I guess I'll just use `save` for now. – user1827975 Jan 30 '13 at 14:56
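
For anyone landing here later, the "strip it and save it" approach from the comments could look roughly like the sketch below. It is a minimal illustration under assumptions, not a definitive recipe: it uses the built-in `iris` data purely as a stand-in, keeps only the cluster centers from the `kmeans` result, and round-trips them through `saveRDS`/`readRDS`. Since base R has no `predict` method for `kmeans` objects, `assign_cluster` is a hypothetical helper written for this example that assigns each new row to its nearest center by Euclidean distance.

```r
# Fit k-means on a built-in dataset (illustration only).
set.seed(42)
x <- as.matrix(iris[, 1:4])
fit <- kmeans(x, centers = 3)

# Keep only what scoring needs: the cluster centers.
# fit$cluster holds one index per training row, so it grows with the data.
slim <- list(centers = fit$centers)

# Serialize the slimmed-down model to a single file.
path <- tempfile(fileext = ".rds")
saveRDS(slim, path)

# Later, e.g. in a fresh session, restore it.
restored <- readRDS(path)

# Hypothetical helper (not part of base R): nearest-center assignment.
assign_cluster <- function(model, newdata) {
  newdata <- as.matrix(newdata)
  centers_t <- t(model$centers)  # one column per cluster center
  unname(apply(newdata, 1, function(row) {
    # Squared Euclidean distance from this row to every center.
    which.min(colSums((centers_t - row)^2))
  }))
}

pred <- assign_cluster(restored, x)

# Compare the in-memory footprint of the full and stripped objects.
print(object.size(fit))
print(object.size(slim))
```

The resulting `.rds` file is still R-specific, so this addresses the bloat concern rather than the portability one. If portability matters more, writing `restored$centers` out with `write.csv` would be one plain-text option for k-means specifically, since the centers are all that scoring needs.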