I am relatively new to machine learning and am trying to place decision tree induction into the grand scheme of things. Are decision trees (for example, those built with C4.5 or ID3) considered parametric or nonparametric? I would guess that they may indeed be parametric because the decision split points for real values may be determined from some distribution of feature values, for example the mean. However, they do not share the nonparametric characteristic of having to keep all the original training data (as one would do with kNN).
-
For reference: https://sebastianraschka.com/faq/docs/parametric_vs_nonparametric.html – teddcp May 15 '20 at 08:42
2 Answers
The term "parametric" refers to parameters that define the distribution of the data. Since decision trees such as C4.5 don't make an assumption regarding the distribution of the data, they are nonparametric. Gaussian Maximum Likelihood Classification (GMLC) is parametric because it assumes the data follow a multivariate Gaussian distribution (classes are characterized by means and covariances). With regard to your last sentence, retaining the training data (e.g., instance-based learning) is not common to all nonparametric classifiers. For example, artificial neural networks (ANN) are considered nonparametric but they do not retain the training data.
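To make the contrast concrete, here is a minimal sketch (illustrative code only, not GMLC or any library's API) of a parametric classifier in the sense this answer describes: each class is summarized by a fixed number of distribution parameters, a mean and a variance, no matter how much training data is seen.

```python
import math

# Parametric sketch (1-D, two classes): each class is reduced to exactly
# two numbers (mean, variance), regardless of the training set size.
def fit_gaussian(values):
    mu = sum(values) / len(values)
    var = sum((v - mu) ** 2 for v in values) / len(values)
    return mu, var

def log_likelihood(x, mu, var):
    # log of the 1-D Gaussian density at x
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)

class_a = [1.0, 1.2, 0.8, 1.1]    # training samples, class A
class_b = [3.0, 3.3, 2.9, 3.1]    # training samples, class B
params_a = fit_gaussian(class_a)  # 2 parameters, however large class_a grows
params_b = fit_gaussian(class_b)

x = 1.05
pred = 'A' if log_likelihood(x, *params_a) > log_likelihood(x, *params_b) else 'B'
print(pred)  # classify x by the higher Gaussian log-likelihood
```

A decision tree, by contrast, makes no such distributional commitment: its structure (and hence its effective parameter set) is carved out of the training data itself.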

-
What about the idea of the decision nodes' split point for real values being determined through some distribution? – stackoverflowuser2010 Dec 12 '12 at 18:44
-
A distribution is not required. You can sort all your instances by the value of your continuous attribute, then split between the two values that maximize the information gain. No assumption has been made regarding the distribution of the data (i.e., no assumption that the data are normally or otherwise distributed). – bogatron Dec 12 '12 at 18:48
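The sort-and-split procedure in this comment can be sketched in a few lines. This is illustrative code, not C4.5 itself: it tries every midpoint between adjacent sorted values and keeps the one with the highest information gain, with no distributional assumption anywhere.

```python
import math

def entropy(labels):
    # Shannon entropy of a list of class labels, in bits
    n = len(labels)
    return -sum((labels.count(c) / n) * math.log2(labels.count(c) / n)
                for c in set(labels))

def best_split(values, labels):
    # Sort instances by the continuous attribute, then evaluate the
    # information gain of splitting between each pair of adjacent values.
    pairs = sorted(zip(values, labels))
    base = entropy(labels)
    best_gain, best_threshold = -1.0, None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical values admit no split between them
        left = [l for _, l in pairs[:i]]
        right = [l for _, l in pairs[i:]]
        gain = base - (len(left) * entropy(left)
                       + len(right) * entropy(right)) / len(pairs)
        if gain > best_gain:
            best_gain = gain
            best_threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_threshold, best_gain

threshold, gain = best_split([1.0, 2.0, 2.5, 4.0], ['a', 'a', 'b', 'b'])
print(threshold, gain)  # 2.25 1.0 -- the midpoint that separates the classes
```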
-
But let's say that a particular implementation of a decision tree uses a distribution to perform splitting. Then that would make this implementation parametric, right? – stackoverflowuser2010 Dec 12 '12 at 19:46
-
The decision tree will still be a nonparametric classifier. Even though you may use a parametric model (e.g., a Gaussian distribution) for selecting potential branches, the ultimate decision surface produced by the tree will, in general, not correspond to Gaussian distributions of classes (neither implicitly nor explicitly). – bogatron Dec 12 '12 at 20:23
-
This isn't quite accurate: your explanation is more or less correct in an informal sense, but the actual meaning of nonparametric models (not quite the same as nonparametric tests, which I think you're confusing) is that the number of parameters and the model structure are decided by the data rather than fixed a priori. See Bayesian nonparametrics for a whole family of models where the data are assumed to follow a distribution, but the number of parameters grows with the data. – Ben Allison Dec 13 '12 at 11:10
-
@BenAllison: Are you saying that a DT could indeed be considered parametric? – stackoverflowuser2010 Dec 13 '12 at 18:58
-
No, sorry, I didn't mean to say that. Because the structure of the tree is decided by the training data, decision trees are nonparametric. However, it's not as simple as "parameterized probability distribution = parametric model", as I mention above. – Ben Allison Dec 14 '12 at 10:46
-
As @BenAllison mentioned, parametric vs. nonparametric models differ in whether the model's parameters are fixed or determined from the data, not in the assumptions the model makes about the distribution of the data. – GuSuku Jan 31 '17 at 16:48
-
So, even if you construct a simple logistic regression model that makes no assumption on the distribution of input data, it is still a parametric model. – GuSuku Jan 31 '17 at 17:08
-
See my post below. The term parametric is not linked to the data distribution, but to the model definition. – marc Dec 31 '19 at 22:57
The term parametric refers to the relation between the number of parameters of the model and the data.
If the number of parameters is fixed, the model is parametric.
If the number of parameters grows with the data, the model is nonparametric.
A decision tree is nonparametric, but if you cap its size for regularization, then the number of parameters is also capped and could be considered fixed. So it's not that clear-cut for decision trees.
KNN is definitely nonparametric because the parameter set is the data set: to predict new data points, the KNN model needs to have access to the training data points and nothing else (except the hyperparameter K).
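A minimal sketch of that last point (illustrative code, not any library's KNN): notice that prediction takes the entire training set as input, so the "parameters" grow exactly as the data grow.

```python
from collections import Counter

def knn_predict(train_x, train_y, query, k):
    # The model *is* the training data: every stored point is ranked by
    # distance to the query, and the k nearest vote on the label.
    by_distance = sorted(range(len(train_x)),
                         key=lambda i: abs(train_x[i] - query))
    votes = [train_y[i] for i in by_distance[:k]]
    return Counter(votes).most_common(1)[0][0]

train_x = [0.5, 1.0, 1.5, 5.0, 5.5, 6.0]
train_y = ['low', 'low', 'low', 'high', 'high', 'high']
print(knn_predict(train_x, train_y, 1.2, k=3))  # 'low'
```

There is nothing to discard after training: dropping any training point changes the predictor, which is the sense in which "the parameter set is the data set."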

-
What does "... the parameters are the data in KNN" mean? Hope you can extend your answer a little bit. – StoryMay Jan 04 '20 at 03:35
-
@ChangheeKang, in linear regression, for example, we use learned parameters to make predictions. In a KNN model, we use the k nearest data points to make predictions. This is what "the parameters are the data" means: rather than using parameters extracted/learned from the training data, we use the training data directly for prediction. Hope this helps. – Max Aug 21 '20 at 00:38