I have data on a lot of students who were selected by colleges based on their marks. I am new to machine learning. Can I have some suggestions on how I can use Azure Machine Learning to predict the colleges they can get into based on their marks?
8 Answers
Try a multi-class logistic regression. Also look at this: https://gallery.cortanaanalytics.com/Experiment/da44bcd5dc2d4e059ebbaf94527d3d5b?fromlegacydomain=1
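As a rough illustration of the multi-class logistic regression idea outside Azure ML, here is a minimal sketch using scikit-learn. The data is synthetic and the two-mark layout, the three college labels, and the cutoffs are all assumptions made up for the example, not anything from the question.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data (an assumption): two marks per student and
# three hypothetical colleges labelled 0, 1, 2 according to total marks.
rng = np.random.default_rng(0)
marks = rng.uniform(40, 100, size=(300, 2))
total = marks.sum(axis=1)
college = np.digitize(total, bins=[120, 160])  # 0: <120, 1: 120-160, 2: >=160

# LogisticRegression handles more than two classes out of the box.
model = LogisticRegression(max_iter=1000)
model.fit(marks, college)

# Predict the college and the per-college probabilities for a new student.
new_student = np.array([[85.0, 90.0]])
predicted = model.predict(new_student)
probabilities = model.predict_proba(new_student)
print(predicted, probabilities)
```

With real data you would replace `marks` and `college` with your own columns and hold out part of the data to check accuracy.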

Yes, that can be done: logistic regression gives a True or False class as output, which lets you check for which values of marks the selection equals True. – Sir Tesla Sep 27 '17 at 05:39
Apart from logistic regression, as @neerajkh suggested, I would also try one-vs-all classifiers. This method usually works very well in multi-class problems; I assume you have many inputs (the students' marks) and many outputs (the different colleges).
To implement the one-vs-all algorithm I would use Support Vector Machines (SVM). It was one of the most powerful algorithms until deep learning came onto the scene, but you don't need deep learning here.
If you could consider changing frameworks, I would suggest using Python libraries. In Python it is very straightforward to solve the problem you are facing, and it runs very fast.
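A minimal sketch of the one-vs-all SVM approach in Python, assuming scikit-learn is available. The dataset is generated with `make_classification` purely as a placeholder for real marks and colleges, and the feature count and class count are arbitrary choices for the example.

```python
from sklearn.datasets import make_classification
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder data (an assumption): 4 features standing in for marks,
# 3 classes standing in for colleges.
X, y = make_classification(
    n_samples=300, n_features=4, n_informative=3, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)

# Scale the inputs, then train one binary SVM per college (one vs the rest).
clf = make_pipeline(StandardScaler(), OneVsRestClassifier(SVC(kernel="rbf")))
clf.fit(X, y)

predictions = clf.predict(X[:5])
n_binary_models = len(clf.named_steps["onevsrestclassifier"].estimators_)
print(predictions, n_binary_models)
```

Scaling matters for SVMs, which is why the sketch puts a `StandardScaler` in front of the classifier.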

Use a random forest and pass it to OneVsRestClassifier, which is a multi-class classifier.
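Assuming this refers to scikit-learn's `RandomForestClassifier` and `OneVsRestClassifier`, a minimal sketch could look like the following. The data is synthetic and only stands in for marks and colleges.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.multiclass import OneVsRestClassifier

# Placeholder data (an assumption): 4 mark-like features, 3 college-like classes.
X, y = make_classification(
    n_samples=300, n_features=4, n_informative=3, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)

# Wrap the forest so that one forest is trained per class (class vs the rest).
forest = RandomForestClassifier(n_estimators=50, random_state=0)
ovr = OneVsRestClassifier(forest)
ovr.fit(X, y)

# Per-class scores for the first three rows.
scores = ovr.predict_proba(X[:3])
print(scores)
```

Note that `RandomForestClassifier` already supports multi-class targets on its own, so the wrapper is optional; it mainly changes how the per-class scores are produced.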

Keeping in line with other posters' suggestions of using multi-class classification, you could use artificial neural networks (ANNs), i.e. a multilayer perceptron, to do this. Each output node could be a college and, because you would be using a sigmoid (logistic) transfer function, the output of each node could be viewed directly as the probability of that college accepting a particular student when making predictions.

Why don't you try softmax regression?
In extremely simple terms, softmax takes an input and produces a probability distribution over your classes. In other words, based on some input (the grade in this case), your model can output a probability distribution that represents the "chance" a given student has of being accepted to each college.
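To make the softmax step concrete, here is a small sketch of the function itself with NumPy. The three scores are made-up numbers standing in for whatever a trained model would produce for three hypothetical colleges.

```python
import numpy as np

def softmax(scores):
    """Turn raw per-class scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

# Hypothetical raw scores for three colleges (illustrative values only).
scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)
print(probs)
```

The college with the highest raw score also gets the highest probability, and the probabilities always add up to 1.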

I know this is an old thread but I will go ahead and add my 2 cents too.
I would recommend adding a multi-class, multi-label classifier. This allows you to find more than one college for a student. Of course this is much easier to do with an ANN, but an ANN is much harder to configure (the number of nodes and hidden nodes, or even the activation function, for that matter).
The easiest way to do this, as @Hoap Humanoid suggests, is to use a Support Vector Classifier.
For any of these methods it is a given that you need a suitably diverse data set. I can't say how many data points you need; you will have to experiment with that. The accuracy of the model depends on the number of data points and their diversity.
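A minimal sketch of the multi-label idea, assuming scikit-learn. The college names, the cutoffs, and the single-mark feature are all hypothetical, and logistic regression is swapped in as the base learner only to keep the example short; an SVC could be dropped in instead.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer, StandardScaler

# Synthetic data (an assumption): one mark per student and made-up cutoffs.
rng = np.random.default_rng(0)
marks = rng.uniform(40, 100, size=(200, 1))
cutoffs = {"College A": 50, "College B": 70, "College C": 85}
accepted = [
    [name for name, cut in cutoffs.items() if m >= cut] for m in marks[:, 0]
]

# Turn each student's list of colleges into a 0/1 indicator row.
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(accepted)

# One binary classifier per college; a student can match several of them.
base = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf = OneVsRestClassifier(base)
clf.fit(marks, Y)

pred = clf.predict(np.array([[95.0]]))
colleges = mlb.inverse_transform(pred)
print(colleges)
```

Each student ends up with a set of colleges rather than a single label, which is the difference from the plain multi-class setups suggested in the other answers.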

This is very subjective. Just applying any algorithm that classifies into categories won't be a good idea. Without performing exploratory data analysis and checking the following things, apart from missing values, you can't be sure that predictive analytics will work:
- Quantitative and qualitative variables.
- Univariate, bivariate and multivariate distributions.
- Each variable's relationship to your response (college) variable.
- Outliers (multivariate and univariate).
- Required variable transformations.
- Whether the Y variable can be broken down into chunks, for example by location: first whether a candidate is more likely to be part of colleges in California or New York, and then, if California is more likely, which college. In this way you could capture linear and non-linear relationships.
For base learners you can fit a softmax regression model or one-vs-all logistic regression (which of the two does not really matter much), and CART for non-linear relationships. I would also run k-NN and k-means to check for different groups within the data and then decide on the predictive learners.
I hope this makes sense!

The least-squares support vector machine (LSSVM) is a powerful algorithm for this application. Visit http://www.esat.kuleuven.be/sista/lssvmlab/ for more information.
