125

I'm following a tutorial about machine learning basics, and it mentions that something can be a feature or a label.

From what I know, a feature is a property of the data being used. I can't figure out what a label is. I know the meaning of the word, but I want to know what it means in the context of machine learning.

desertnaut
Wojtek Wencel
  • 3
    Features are the fields used as input and labels are used as output. As a simple example, consider how to predict whether one should sell a car based on car mileage, year, etc. Yes/no is the label whereas the mileage and year would be the features. –  Jun 09 '20 at 19:31
  • 2
    I’m voting to close this question because [Machine learning (ML) theory questions are off-topic on Stack Overflow](https://meta.stackoverflow.com/questions/291009/do-pure-machine-learning-questions-belong-to-stack-overflow/291015#291015) - [gift-wrap candidate for Cross-Validated](https://meta.stackoverflow.com/questions/404799/lets-gift-wrap-our-good-machine-learning-theory-questions-for-cross-validated?noredirect=1#comment822113_404799) – Daniel F Feb 10 '21 at 13:53

7 Answers

249

Briefly, feature is input; label is output. This applies to both classification and regression problems.

A feature is one column of the data in your input set. For instance, if you're trying to predict the type of pet someone will choose, your input features might include age, home region, family income, etc. The label is the final choice, such as dog, fish, iguana, rock, etc.

Once you've trained your model, you will give it sets of new input containing those features; it will return the predicted "label" (pet type) for that person.
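As a rough illustration of that train-then-predict flow, here is a minimal sketch in plain Python. The training rows are made up for this example, and the 1-nearest-neighbour rule is just one simple stand-in for "a model"; none of it comes from a real dataset or library.

```python
# Hypothetical training data: each row is (features, label).
# Features are (age, family_income_in_thousands); the label is the pet chosen.
training_data = [
    ((25, 40), "dog"),
    ((32, 55), "dog"),
    ((61, 30), "fish"),
    ((70, 28), "fish"),
    ((19, 90), "iguana"),
    ((22, 85), "iguana"),
]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(features):
    """Return the label of the training row closest to `features`."""
    _, label = min(
        training_data,
        key=lambda row: squared_distance(row[0], features),
    )
    return label


# New input: only the features are known; the model supplies the label.
new_person = (65, 29)
print(predict(new_person))  # fish
```

The point is only the shape of the data: features go in, a predicted label comes out.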

Prune
  • 1
    So [age, home region, family income] would be "3 feature vectors". And in Keras, your NumPy array for your LSTM network would be [samples, time steps, 3] ? – naisanza Aug 20 '17 at 11:55
  • 3
    @naisanza: that's certainly one possibility. I'm not familiar with Keras, but this high-level organization could certainly be the start of a valid implementation. – Prune Aug 22 '17 at 17:41
  • is feature and variable the same thing? – Debadatta Apr 30 '18 at 12:55
  • 1
    I would like to state that "label" is also dependent on the context; for example, for training a model, you will use "labelled" data. In this case, the label is the ground truth to which your output data is compared. – N.Atanasov Jul 29 '18 at 12:26
  • 1
    Wow, great answer, thank you. This clears up a lot of lingering questions in this topic space. – Andrew Ray Jan 28 '19 at 17:30
41

Feature:

In machine learning, a feature is a property of your training data. You can also think of it as a column in your training dataset.

Suppose this is your training dataset:

Height   Sex   Age
 61.5     M     20
 55.5     F     30
 64.5     M     41
 55.5     F     51
 .     .     .
 .     .     .
 .     .     .
 .     .     .

Then here Height, Sex and Age are the features.

Label:

The output you get from your model after training it is called a label.

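Suppose you feed the above dataset to some algorithm, and it generates a model that predicts gender (the Sex column) as Male or Female. You pass the model features such as age and height. In that setup the Sex column is what the model is trained to predict, so it is no longer one of the input features.

So after computing, the model returns the gender as Male or Female. That is called the label.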
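A minimal sketch of that split, using only the four rows shown in the table. Treating `Sex` as the label column is the assumption here; the names `X` and `y` are just a common convention for features and labels.

```python
# The four rows shown in the table above.
rows = [
    {"Height": 61.5, "Sex": "M", "Age": 20},
    {"Height": 55.5, "Sex": "F", "Age": 30},
    {"Height": 64.5, "Sex": "M", "Age": 41},
    {"Height": 55.5, "Sex": "F", "Age": 51},
]

label_column = "Sex"  # assumption: this is the column we want to predict

# X holds the features (every column except the label), one list per row.
X = [[row[name] for name in row if name != label_column] for row in rows]
# y holds the label for each row, in the same order as X.
y = [row[label_column] for row in rows]

print(X)  # [[61.5, 20], [55.5, 30], [64.5, 41], [55.5, 51]]
print(y)  # ['M', 'F', 'M', 'F']
```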

Darshan Jain
Saurabh Agrawal
9

Here is a more visual approach to explaining the concept. Imagine you want to classify the animal shown in a photo.

The possible classes of animals are, e.g., cats or birds. In that case the label is the class, e.g. cat or bird, that your machine learning algorithm will predict for the photo.

The features are the patterns, colors, and shapes that are part of your images, e.g. fur or feathers, or, at a lower level of interpretation, the pixel values.

Bird

Label: Bird
Features: Feathers

Cat

Label: Cat
Features: Fur

mrk
5

Let's take an example where we want to recognize letters of the alphabet from photos of handwriting. We feed these sample images into the program, and the program classifies them on the basis of their features.

An example of a feature in this context: the letter 'C' can be thought of as a concave shape opening to the right.

A question now arises as to how to store these features. We need to name them, and this is where the label comes in. A label is given to such features to distinguish them from other features.

Thus, we obtain labels as output when provided with features as input.

Labels are not associated with unsupervised learning.

Darshan Jain
FutureJJ
5

Prerequisite: Basic Statistics and exposure to ML (Linear Regression)

It can be answered in a sentence -

They are alike, but which column counts as a feature and which as a label changes according to the problem at hand.

Explanation

Let me explain my statement. Suppose that you have a dataset; for this purpose, consider exercise.csv. Each column in the dataset is called a feature. Gender, Age, Height, Heart_Rate, Body_Temp, and Calories are some of its columns. Each column represents a distinct feature or property.

exercise.csv

User_ID   Gender  Age  Height  Weight  Duration  Heart_Rate  Body_Temp  Calories
14733363  male    68   190.0   94.0    29.0      105.0       40.8       231.0
14861698  female  20   166.0   60.0    14.0      94.0        40.3       66.0
11179863  male    69   179.0   79.0    5.0       88.0        38.7       26.0

To solidify the understanding and clear up the puzzle, let us take two different problems (prediction cases).

CASE1: In this case we might use Gender, Height, and Weight to predict the Calories burnt during exercise. The predicted value (Y), Calories, is the label here. Calories is the column that you want to predict using features such as x1: Gender, x2: Height, and x3: Weight.

CASE2: In the second case we might want to predict Heart_Rate using Gender and Weight as features. Here Heart_Rate is the label, predicted using the features x1: Gender and x2: Weight.
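The two cases above can be sketched in a few lines of Python. This is only an illustration: the three rows shown earlier stand in for the full file (rewritten here as comma-separated text), and `split_features_and_label` is a hypothetical helper, not part of any library.

```python
import csv
import io

# The three rows shown above, standing in for the full exercise.csv file.
EXERCISE_CSV = """User_ID,Gender,Age,Height,Weight,Duration,Heart_Rate,Body_Temp,Calories
14733363,male,68,190.0,94.0,29.0,105.0,40.8,231.0
14861698,female,20,166.0,60.0,14.0,94.0,40.3,66.0
11179863,male,69,179.0,79.0,5.0,88.0,38.7,26.0
"""


def split_features_and_label(rows, feature_names, label_name):
    """Hypothetical helper: pick the feature columns (X) and the label column (y)."""
    X = [[row[name] for name in feature_names] for row in rows]
    y = [float(row[label_name]) for row in rows]
    return X, y


rows = list(csv.DictReader(io.StringIO(EXERCISE_CSV)))

# CASE1: Calories is the label; Gender, Height, and Weight are the features.
X1, y1 = split_features_and_label(rows, ["Gender", "Height", "Weight"], "Calories")

# CASE2: Heart_Rate is the label; Gender and Weight are the features.
X2, y2 = split_features_and_label(rows, ["Gender", "Weight"], "Heart_Rate")

print(y1)  # [231.0, 66.0, 26.0]
print(y2)  # [105.0, 94.0, 88.0]
```

The same file serves both cases; only the choice of which column to predict changes. Note that the feature values are still raw strings here, so a real model would need them converted to numbers first.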

Once you have understood the above explanation you won't really be confused with Label and Features anymore.

Community
3

Briefly explained, a feature is the input you feed to the system, and the label is the output you are expecting. For example, you feed in many features of a dog, such as its height and fur color, and after computing, the system returns the breed of the dog, which is the label you want to know.

Aman pradhan
0

Suppose you want to predict climate. The features given to you would be historic climate data, current weather, temperature, wind speed, etc., and the labels would be months. This combination can help you derive predictions.

Darshan Jain