I use RapidMiner and I have a data set with 40 lines of 14 columns each. The first line is the header with the metric names; each of the other lines holds different kinds of metrics of one Android application, with its Google Play ranking at the end of the line.
(So the goal is to predict the Google Play ranking from the metrics.)
The data set: http://pastebin.com/Cw1BR4K6
- columns 1-13: different kinds of metrics
- column 14: Google Play ranking
- lines 2-40: metrics of Android projects
I used PolynomialRegression in RapidMiner and I got this result:
- 6.723 * lloc ^ 1.000
+ 1.187 * nid ^ 2.000
- 47.730 * nle ^ 1.000
- 36.433 * nel ^ 1.000
- 1.466 * nip ^ 2.000
- 97.187 * activites ^ 1.000
- 50.080 * inside-permissions ^ 1.000
- 60.291 * outside-permissions ^ 1.000
- 52.472 * all-permissions ^ 4.000
- 2.309 * jtlloc ^ 1.000
+ 36.058 * jtnm ^ 1.000
+ 9.924 * jtna ^ 1.000
+ 40.504 * jtncl ^ 1.000
+ 9.455
My question: how can I check that this result is correct? How can I apply it to a line that is already in the data set? For example, I would like to apply it to line 25: 25,8,5,10,0,1,0,0,0,239,10,14,4,3.8
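To make clear what I mean by applying the result to a line, here is a minimal Python sketch (outside RapidMiner) that plugs line 25 into the formula above. It assumes that the 13 metric columns appear in the same order as the terms of the model, which I have not verified against the header; the names `MODEL`, `INTERCEPT` and `predict` are only illustrative.

```python
# Coefficients and exponents copied from the RapidMiner output above.
MODEL = [
    ("lloc", -6.723, 1),
    ("nid", 1.187, 2),
    ("nle", -47.730, 1),
    ("nel", -36.433, 1),
    ("nip", -1.466, 2),
    ("activites", -97.187, 1),
    ("inside-permissions", -50.080, 1),
    ("outside-permissions", -60.291, 1),
    ("all-permissions", -52.472, 4),
    ("jtlloc", -2.309, 1),
    ("jtnm", 36.058, 1),
    ("jtna", 9.924, 1),
    ("jtncl", 40.504, 1),
]
INTERCEPT = 9.455


def predict(row):
    """Evaluate the polynomial for one row given as {metric name: value}."""
    return INTERCEPT + sum(coef * row[name] ** exp for name, coef, exp in MODEL)


# ASSUMPTION: column order in the data set matches the order of the terms above.
line_25 = [25, 8, 5, 10, 0, 1, 0, 0, 0, 239, 10, 14, 4, 3.8]
names = [name for name, _, _ in MODEL]
metrics = dict(zip(names, line_25[:13]))  # first 13 values are the metrics
actual = line_25[13]                      # last value is the Google Play ranking

predicted = predict(metrics)
print(f"predicted={predicted:.3f} actual={actual} error={predicted - actual:.3f}")
```

The check I have in mind is comparing `predicted` with the ranking stored in column 14 of the same line (3.8 here); if the column order assumption is wrong, the mapping in `metrics` has to be adjusted first.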
My other question: which methods can I use to make predictions on this data set, and which of them is the best? I would like to ask you to explain it to me, if possible.
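One way I could imagine comparing candidate methods on such a small set is leave-one-out cross-validation: train on all rows except one, predict the held-out row, and repeat for every row. Below is a rough Python sketch of that idea, not something I ran in RapidMiner. The `fit` argument is a hypothetical placeholder for any method, the mean baseline is only there so the sketch runs, and the toy numbers are made up, not taken from my data set.

```python
import math


def loocv_rmse(rows, targets, fit):
    """Leave-one-out RMSE: train on all rows but one, predict the held-out one."""
    squared_errors = []
    for i in range(len(rows)):
        train_x = rows[:i] + rows[i + 1:]
        train_y = targets[:i] + targets[i + 1:]
        predict = fit(train_x, train_y)  # fit returns a function row -> prediction
        squared_errors.append((predict(rows[i]) - targets[i]) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def fit_mean_baseline(train_x, train_y):
    """Trivial baseline: always predict the mean ranking of the training rows."""
    mean = sum(train_y) / len(train_y)
    return lambda row: mean


# Made-up placeholder values, NOT taken from the real data set.
toy_rows = [[1.0], [2.0], [3.0], [4.0]]
toy_targets = [3.0, 4.0, 3.0, 4.0]

baseline_rmse = loocv_rmse(toy_rows, toy_targets, fit_mean_baseline)
print(f"baseline leave-one-out RMSE: {baseline_rmse:.3f}")
```

With something like this, a method would only look useful if its leave-one-out error is clearly lower than that of the mean baseline.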
Thanks in advance, Peter