Questions tagged [supervised-learning]

Supervised learning is the machine learning task of inferring a function from labeled training data. The training data consist of a set of training examples. In supervised learning, each example is a pair consisting of an input object (typically a vector) and a desired output value (also called the supervisory signal). A supervised learning algorithm analyzes the training data and produces an inferred function, which can be used for mapping new examples.
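
As a minimal sketch of the description above (not tied to any particular library): the labeled pairs in `training_examples` are made-up values, and the helper name `fit_line` is hypothetical. It infers a linear function from the pairs by ordinary least squares, then applies that function to a new input.

```python
# Hypothetical labeled training data: (input, desired output) pairs.
training_examples = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]


def fit_line(examples):
    """Infer f(x) = slope * x + intercept by ordinary least squares."""
    n = len(examples)
    mean_x = sum(x for x, _ in examples) / n
    mean_y = sum(y for _, y in examples) / n
    # Covariance of inputs and outputs, and variance of inputs (both unnormalized).
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in examples)
    var_x = sum((x - mean_x) ** 2 for x, _ in examples)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


# The "inferred function" produced from the training data.
inferred_function = fit_line(training_examples)

# Map a new, unseen example to a predicted output.
prediction = inferred_function(5.0)
print(prediction)
```

A straight line is only one possible hypothesis class; the same fit-then-predict pattern applies to decision trees, SVMs, and neural networks.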

542 questions
5 votes, 2 answers

Interpreting the DecisionTreeRegressor score?

I am trying to evaluate the relevance of features, and I am using DecisionTreeRegressor(). The related part of the code is presented below: # TODO: Make a copy of the DataFrame, using the 'drop' function to drop the given feature new_data =…
5 votes, 1 answer

Different ways of implementing cross-validation for SVM model in MATLAB

Suppose that we have this code in MATLAB R2015b: SVMModel = fitcsvm(INPUT, output,'KernelFunction','RBF','BoxConstraint',1); CVSVMModel = crossval(SVMModel); z = kfoldLoss(CVSVMModel) In the first line, the model is trained on the whole data using fitcsvm.…
Eghbal
5 votes, 1 answer

How to handle class imbalance in sklearn random forests: should I use sample weights or the class weight parameter?

I am trying to solve a binary classification problem with a class imbalance. I have a dataset of 210,000 records in which 92% are 0s and 8% are 1s. I am using sklearn (v 0.16) in Python for random forests. I see there are two parameters…
NG_21
5 votes, 2 answers

TensorFlow MLP not training XOR

I've built an MLP with Google's TensorFlow library. The network is working but somehow it refuses to learn properly. It always converges to an output of nearly 1.0 no matter what the input actually is. The complete code can be seen here. Any…
5 votes, 1 answer

Is NDCG (normalized discounted cumulative gain) flawed? I have calculated a few alternative ranking quality measures, and I can't make heads or tails of it

I'm using Python for a learning-to-rank problem, and I am evaluating my success using the following DCG and NDCG code (from http://nbviewer.ipython.org/github/ogrisel/notebooks/blob/master/Learning%20to%20Rank.ipynb) def dcg(relevances, rank=20): …
5 votes, 2 answers

Anomaly Detection vs Supervised Learning

I have a very small amount of data belonging to the positive class and a large set of data from the negative class. According to Prof. Andrew Ng (anomaly detection vs supervised learning), I should use anomaly detection instead of supervised learning because of…
Junaid
5 votes, 4 answers

String Subsequence Kernel and SVM using Python

How can I use the String Subsequence Kernel (SSK) [Lodhi 2002] to train an SVM (Support Vector Machine) in Python?
gcedo
5 votes, 1 answer

What is the difference between the stacking, grading, and voting algorithms?

I'm writing a machine-learning solution for a problem that may have more than one possible classifier, depending on the data, so I've collected several classifiers, each of which performs better than the others under some conditions. I'm looking into…
Amir Arad
4 votes, 1 answer

In the LambdaRank algorithm (in Learning to Rank), what does |∆NDCG| mean?

This article describes the LambdaRank algorithm for information retrieval. In formula 8 on page 6, the authors propose to multiply the gradient (lambda) by a term called |∆NDCG|. I do understand that this term is the difference of two NDCGs when…
4 votes, 2 answers

Is there any supervised clustering algorithm or a way to apply prior knowledge to your clustering?

In my case, I have a dataset of letters and symbols detected in an image. The detected items are represented by their coordinates, type (letter, number, etc.), value, and orientation, not by the actual bounding box of the image. My goal is, using this…
4 votes, 0 answers

Is there a function to get the learning curves for training and validation sets in h2o used with R?

I am using h2o and R for a binary classification problem. I was wondering if there is any way to create a learning curve in h2o? I coded some splits myself and I am plotting the curve alright, but I'd like to know if there is a quick recipe…
maop
4 votes, 1 answer

How to predict if the number of features does not match the number of features available in the test set?

I am using pandas get_dummies to convert categorical variables into dummy/indicator variables, which introduces new features in the dataset. Then we fit a model on this dataset. Since the dimension of X_train and X_test remains the same, when…
4 votes, 1 answer

How is the input dataset fed into a neural network?

If I have 1000 observations in my dataset with 15 features and 1 label, how is the data fed to the input neurons for the forward pass and backpropagation? Is it fed row-wise for 1000 observations (one at a time) and weights are updated with each…
4 votes, 0 answers

scikit-learn MLPRegressor performance cap

Good morning, I am trying to fit a sklearn Multilayer Perceptron Regressor to a dataset with about 350 features and 1400 samples, with strictly positive targets (house prices). Doing a single-step GridSearch on the hidden-layer size shows that from 5…
4 votes, 1 answer

Plot SVM with Matplotlib?

I have some interesting user data. It gives some information on the timeliness of certain tasks the users were asked to perform. I am trying to find out, if late - which tells me if users are on time (0), a little late (1), or quite late (2) - is…
Rachel