I found that scikit-learn's Linear Discriminant Analysis (discriminant_analysis.py) uses the class priors when computing the within-class covariance matrix.
I referred to some books, but they only explain that the priors are used to calculate the intercepts.
Do you know of any references that explain why the within-class covariance matrix is weighted by the priors?
This is the line in question:
    cov += priors[idx] * np.atleast_2d(_cov(Xg, shrinkage, covariance_estimator))
Here is the function from discriminant_analysis.py:
    def _class_cov(X, y, priors, shrinkage=None, covariance_estimator=None):
        """Compute weighted within-class covariance matrix.

        The per-class covariance are weighted by the class priors.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            Input data.

        y : array-like of shape (n_samples,) or (n_samples, n_targets)
            Target values.

        priors : array-like of shape (n_classes,)
            Class priors.

        shrinkage : 'auto' or float, default=None
            Shrinkage parameter, possible values:
              - None: no shrinkage (default).
              - 'auto': automatic shrinkage using the Ledoit-Wolf lemma.
              - float between 0 and 1: fixed shrinkage parameter.

            Shrinkage parameter is ignored if `covariance_estimator` is not None.

        covariance_estimator : estimator, default=None
            If not None, `covariance_estimator` is used to estimate
            the covariance matrices instead of relying the empirical
            covariance estimator (with potential shrinkage).
            The object should have a fit method and a ``covariance_`` attribute
            like the estimators in sklearn.covariance.
            If None, the shrinkage parameter drives the estimate.

            .. versionadded:: 0.24

        Returns
        -------
        cov : array-like of shape (n_features, n_features)
            Weighted within-class covariance matrix
        """
        classes = np.unique(y)
        cov = np.zeros(shape=(X.shape[1], X.shape[1]))
        for idx, group in enumerate(classes):
            Xg = X[y == group, :]
            cov += priors[idx] * np.atleast_2d(_cov(Xg, shrinkage, covariance_estimator))
        return cov
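To make the question concrete, here is a minimal NumPy sketch of what that loop computes. It does not call the private sklearn helpers. It assumes that `_cov` without shrinkage behaves like the biased empirical covariance (dividing by the class size n_k), which I have not verified for every version, and it uses made-up toy data and variable names. Under those assumptions, and when the priors are the empirical class frequencies n_k / n, the prior-weighted sum coincides with the pooled within-class scatter divided by n:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: three classes with unequal sizes (all values are made up).
sizes = [20, 50, 30]
X = np.vstack([rng.normal(loc=k, size=(n, 2)) for k, n in enumerate(sizes)])
y = np.repeat(np.arange(len(sizes)), sizes)

classes, counts = np.unique(y, return_counts=True)
priors = counts / counts.sum()  # empirical priors n_k / n

# Prior-weighted sum of per-class covariances.
# Assumption: biased estimator (bias=True divides by n_k), standing in for _cov.
weighted = np.zeros((X.shape[1], X.shape[1]))
for idx, group in enumerate(classes):
    Xg = X[y == group, :]
    weighted += priors[idx] * np.atleast_2d(np.cov(Xg, rowvar=False, bias=True))

# Pooled within-class scatter: center each sample on its own class mean,
# then divide the scatter matrix by the total sample count n.
class_means = np.vstack([X[y == g].mean(axis=0) for g in classes])
centered = X - class_means[y]  # labels are 0..K-1, so y indexes the means
pooled = centered.T @ centered / X.shape[0]

print(np.allclose(weighted, pooled))
```

With user-supplied priors that differ from the class frequencies, the two quantities no longer match, which is the case I am asking about.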
Here is the relevant part of the Linear Discriminant Analysis code.
I understand why the priors are used to calculate the intercept:
    self.means_ = _class_means(X, y)
    self.covariance_ = _class_cov(
        X, y, self.priors_, shrinkage, covariance_estimator
    )
    self.coef_ = linalg.lstsq(self.covariance_, self.means_.T)[0].T
    self.intercept_ = -0.5 * np.diag(np.dot(self.means_, self.coef_.T)) + np.log(
        self.priors_
    )
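For reference, the role of the priors in the intercept can be checked with a small standalone sketch. It mirrors the snippet above in plain NumPy, with `np.linalg.lstsq` standing in for `linalg.lstsq` and toy data that I made up, so treat it as an illustration rather than the library's implementation. It compares the `coef`/`intercept` form with the usual linear discriminant score x^T S^-1 mu_k - 0.5 mu_k^T S^-1 mu_k + log(pi_k), where S is the shared covariance:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-class data with unequal class sizes (values are made up).
sizes = [30, 70]
X = np.vstack([rng.normal(loc=2 * k, size=(n, 2)) for k, n in enumerate(sizes)])
y = np.repeat(np.arange(len(sizes)), sizes)

classes, counts = np.unique(y, return_counts=True)
priors = counts / counts.sum()
means = np.vstack([X[y == g].mean(axis=0) for g in classes])

# Prior-weighted within-class covariance, as in the quoted loop
# (assumption: biased per-class covariance, no shrinkage).
cov = sum(
    p * np.cov(X[y == g], rowvar=False, bias=True) for p, g in zip(priors, classes)
)

# Solve cov @ coef.T = means.T, so each row of coef is S^-1 mu_k.
coef = np.linalg.lstsq(cov, means.T, rcond=None)[0].T
intercept = -0.5 * np.diag(means @ coef.T) + np.log(priors)

# Scores in the coef/intercept form used by the snippet.
scores = X @ coef.T + intercept

# The same scores written directly from the discriminant function.
cov_inv = np.linalg.inv(cov)
direct = np.stack(
    [X @ cov_inv @ m - 0.5 * m @ cov_inv @ m + np.log(p) for m, p in zip(means, priors)],
    axis=1,
)

print(np.allclose(scores, direct))
```

So the log-prior term in the intercept follows directly from the discriminant function, whereas the weighting of the covariance by the priors is the part I cannot find a reference for.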