Connections between probability theory and statistics?

Question

Hello dear StackOverFlow.

I have a few conceptual questions regarding the intersection between probability theory and statistics.

I already know a few basic conceptual connections between probability and statistics. For example, I know that in statistics we use a sample to infer the parameters of a population probability distribution, which we can subsequently use to assess the probabilities of future events. In other words, statistics takes the "backward" view and uses collected data to infer the population parameters, which are then used for the "forward" view of the probabilities of future outcomes.

And I know that inferring the population parameters from a sample can involve, for example, maximum likelihood estimation. In linear regression, we would assume that each response is normally distributed with mean b0 + b1*x and some variance, plug candidate values for b0, b1 and the variance into the pdf of the normal distribution, and multiply the resulting densities over all (independent) observations to obtain the likelihood. The final parameter estimates are the candidate values that maximize this likelihood.
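To make the regression example concrete, here is a minimal sketch in Python, assuming independent normal errors with constant variance. The data and the helper names `gaussian_log_likelihood` and `fit_mle` are made up for illustration. It uses the known closed-form maximizer (least-squares b0 and b1, variance equal to the residual sum of squares divided by n) and checks that moving away from it lowers the log-likelihood.

```python
import math

def gaussian_log_likelihood(b0, b1, sigma2, xs, ys):
    """Log-likelihood of independent y_i ~ Normal(b0 + b1*x_i, sigma2)."""
    n = len(xs)
    rss = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    return -0.5 * n * math.log(2 * math.pi * sigma2) - rss / (2 * sigma2)

def fit_mle(xs, ys):
    """Closed-form maximizer: least-squares b0, b1 and sigma2 = RSS / n."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sigma2 = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys)) / n
    return b0, b1, sigma2

# Made-up data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1, sigma2 = fit_mle(xs, ys)
best = gaussian_log_likelihood(b0, b1, sigma2, xs, ys)

# Perturbing the coefficients or the variance can only lower the log-likelihood.
worse_coefs = gaussian_log_likelihood(b0 + 0.5, b1 - 0.1, sigma2, xs, ys)
worse_var = gaussian_log_likelihood(b0, b1, 2 * sigma2, xs, ys)
print(b0, b1, sigma2)
print(best, worse_coefs, worse_var)
```

In practice one would usually maximize the log-likelihood numerically or rely on a regression routine; the closed form is used here only because it exists for this particular model.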

The big-picture view that I have is that we take concepts from probability theory, such as probability density functions, and apply them in statistics so that we have a data-based reason for assigning specific parameter values to the probability distribution we use. Given these parameter estimates, we then make assessments about the probabilities of future outcomes.

However, and here come the questions, I am having trouble connecting further aspects of probability theory with concepts used in statistics. For example, I know that joint probability distributions are used to obtain marginal and conditional probabilities, with conditional probabilities being used in e.g. linear regression. But I lack a substantive understanding of the connection between the two.

Are joint probability distributions what underlie covariance matrices, for example? How are joint probability distributions and marginal distributions relevant in statistics (I know how conditional probability distributions are relevant)?

Or am I looking in an entirely wrongheaded direction? Am I asking the wrong questions here? In what other way would probability theory and statistics be connected? I am looking for depth of knowledge, looking for substantial knowledge. If anyone has any literature that is not excessively math heavy but more focused on the conceptual overview of the connections between probability theory and statistical analysis, I would be very happy.

score 0 · Answer 1 · answered Jan 09 '22 at 21:22

If you are a data-centric person, a good application of joint distributions is classical supervised and unsupervised learning. The best way to assign a bivariate observation to one of several classes is to choose the class whose joint density is highest at that observation (assuming equal misclassification costs and equal prior probabilities).
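As a rough sketch of that classification rule, assume each class has a known bivariate normal joint density (the class names and parameter values below are invented for illustration). With equal costs and priors, the rule simply evaluates each class density at the observation and picks the largest.

```python
import math

def bivariate_normal_pdf(x, y, mu_x, mu_y, sd_x, sd_y, rho):
    """Joint density of a bivariate normal at the point (x, y)."""
    zx = (x - mu_x) / sd_x
    zy = (y - mu_y) / sd_y
    q = (zx * zx - 2 * rho * zx * zy + zy * zy) / (1 - rho * rho)
    norm = 2 * math.pi * sd_x * sd_y * math.sqrt(1 - rho * rho)
    return math.exp(-0.5 * q) / norm

# Hypothetical class-conditional distributions (assumed known here;
# in practice their parameters would be estimated from training data).
CLASSES = {
    "A": dict(mu_x=0.0, mu_y=0.0, sd_x=1.0, sd_y=1.0, rho=0.5),
    "B": dict(mu_x=3.0, mu_y=3.0, sd_x=1.0, sd_y=1.0, rho=-0.3),
}

def classify(x, y):
    """Pick the class whose joint density is highest at (x, y)."""
    return max(CLASSES, key=lambda c: bivariate_normal_pdf(x, y, **CLASSES[c]))

print(classify(0.5, 0.2))
print(classify(2.8, 3.1))
```

With unequal priors or costs, each density would be weighted accordingly before taking the maximum.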

As another example, the bivariate normal joint distribution gives an insightful representation of the regression function in terms of the correlation and the standard deviations, which explains, e.g., regression to the mean very nicely.
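For reference, the standard result for a bivariate normal pair (X, Y) with means \mu_X, \mu_Y, standard deviations \sigma_X, \sigma_Y and correlation \rho is written out below; the symbols are the usual textbook notation rather than anything defined in the question.

```latex
\mathbb{E}[Y \mid X = x] = \mu_Y + \rho \, \frac{\sigma_Y}{\sigma_X} \, (x - \mu_X)
\qquad\Longleftrightarrow\qquad
\frac{\mathbb{E}[Y \mid X = x] - \mu_Y}{\sigma_Y} = \rho \, \frac{x - \mu_X}{\sigma_X}
```

In standardized units the predicted Y is \rho times the standardized x. Whenever |\rho| < 1, the prediction is therefore fewer standard deviations from its mean than x is from its own, which is exactly regression to the mean.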

As another example, it is the joint distribution that lets you see how to reverse the conditioning in Bayes' theorem.
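A small sketch of that reversal, using a made-up joint distribution over (health status, test result); the numbers and the helper names `marginal` and `conditional` are purely illustrative. Both conditionals are read off the same joint table, and Bayes' theorem is just the identity that links them.

```python
from fractions import Fraction as F

# Hypothetical joint probabilities P(status, result); they sum to 1.
joint = {
    ("disease", "positive"): F(9, 1000),
    ("disease", "negative"): F(1, 1000),
    ("healthy", "positive"): F(99, 1000),
    ("healthy", "negative"): F(891, 1000),
}

def marginal(axis, value):
    """Sum the joint over the other variable: P(variable on `axis` = value)."""
    return sum(p for key, p in joint.items() if key[axis] == value)

def conditional(target_axis, target, given_axis, given):
    """P(target | given) = P(target, given) / P(given)."""
    both = sum(
        p for key, p in joint.items()
        if key[target_axis] == target and key[given_axis] == given
    )
    return both / marginal(given_axis, given)

p_pos_given_dis = conditional(1, "positive", 0, "disease")  # P(positive | disease)
p_dis_given_pos = conditional(0, "disease", 1, "positive")  # P(disease | positive)

# Bayes' theorem: P(D | +) = P(+ | D) * P(D) / P(+)
via_bayes = p_pos_given_dis * marginal(0, "disease") / marginal(1, "positive")
print(p_pos_given_dis, p_dis_given_pos, via_bayes)
```

Exact fractions are used so that the two routes to P(disease | positive) can be compared without floating-point noise.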

But yes, largely the joint distribution is a tool to facilitate understanding, rather than one that has direct statistical application (a notable exception being the learning application I mentioned at the outset).
