How do you compute the BIC in R with high-dimensional data?

Question

I have a high-dimensional data set with 200 parameters and 50 observations. I am attempting to compute the BIC in R. I am aware that BIC = log(n)*df - 2*log(L), where L is the likelihood. I am just wondering how one computes L. I believe I need to compute the MSE, but I am not sure how to do that.

Hi! I think this question might require some cleanup in order to be upvoted - I wonder if it has been downvoted because it is not clear that you've done research on what L really is and how it works - rather than simply how to use it in R. Statistics is not my strong suit, but I did find an example [here on YouTube](https://www.youtube.com/watch?v=nKRCQY5tzaE) of working through a BIC calculation in R - perhaps it will help you? — J Trana, Feb 03 '20 at 02:14

score 0 · Answer 1 · answered Feb 02 '20 at 21:21

Maybe more of a Cross Validated question?

To compute BIC, you first need a model. There are different ways of estimating models, but the two most common ones in frequentist statistics are ordinary least squares (OLS) and maximum likelihood estimation (MLE). The gist of MLE is that you estimate the parameters of your model (for example, the slopes) by picking the values that maximize the likelihood of the observed data (see the following video: https://www.youtube.com/watch?v=XepXtl9YKwc).

To get BIC after you've fitted an MLE model, you can use the likelihood that you got after you fitted your MLE model. You can then use it to compare two different models with different numbers of parameters. That's what BIC is for.
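To connect this to the MSE mentioned in the question: the following is a sketch for one specific case, a linear model with independent Gaussian errors, which is an assumption and not something the question states. Here RSS is the residual sum of squares, and the maximum likelihood estimate of the error variance is RSS/n, i.e. the (uncorrected) MSE. Plugging that estimate into the Gaussian log-likelihood gives:

$$
\log L = -\frac{n}{2}\left[\log(2\pi) + \log\!\left(\frac{\mathrm{RSS}}{n}\right) + 1\right]
$$

$$
\mathrm{BIC} = \log(n)\,\mathrm{df} - 2\log L = \log(n)\,\mathrm{df} + n\log\!\left(\frac{\mathrm{RSS}}{n}\right) + n\left(1 + \log(2\pi)\right)
$$

The last term is the same for every model fitted to the same data, so it is often dropped when BIC is only used to compare models.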

If you fit a model in R using MLE, you should be able to get its log-likelihood from the fitted model object, for example with `logLik()`. I don't think you'd want to write a program to calculate the likelihood manually; you may need a fairly advanced maths background to do that, at least for more complex models.
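As a minimal sketch of what that looks like in practice, here is a small example using base R only. The built-in `mtcars` data and the formula `mpg ~ wt + hp` are placeholders chosen for illustration; they are not the data from the question, and an ordinary `lm()` fit like this will not work directly with 200 parameters and 50 observations.

```r
# Fit a small linear model on a built-in data set (illustrative only).
fit <- lm(mpg ~ wt + hp, data = mtcars)

ll <- logLik(fit)      # maximized log-likelihood, carries a "df" attribute
n  <- nobs(fit)        # number of observations used in the fit
k  <- attr(ll, "df")   # estimated parameters: coefficients plus error variance

# BIC from the formula in the question: log(n) * df - 2 * log(L)
bic_manual  <- log(n) * k - 2 * as.numeric(ll)

# BIC as computed by R's built-in function
bic_builtin <- BIC(fit)

print(c(manual = bic_manual, builtin = bic_builtin))
```

The two values should agree. Note that R counts the error variance as a parameter, so `k` here is the number of coefficients plus one.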

Thank you for the tip :) I am using glmnet and having trouble computing the BIC — Mistah White, Feb 02 '20 at 21:24
No worries! As far as I know, `glmnet` fits penalized models by coordinate descent, which is a different, more machine-learning-style way of estimating models. However, there should still be a way of using the coefficients you got from `glmnet` to get a likelihood estimate and use that to get the BIC. — Adam B., Feb 02 '20 at 21:37
It might be a bit of a wild-goose chase though, trying to get BIC from a `glmnet` fit. Since you're using machine learning to build models, it might be a better idea to use machine learning methods to compare models. For example, you can do cross-validation to see how the RMSE of different models compares. — Adam B., Feb 02 '20 at 21:39
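
For what it's worth, here is a rough sketch of both ideas from the comments above, assuming a Gaussian response and simulated data of the same shape as in the question (50 observations, 200 predictors). The simulated `x`, `y` and `beta` are made up for illustration. Using the number of nonzero coefficients (`fit$df`) as the degrees of freedom is a common approximation for the lasso, not something `glmnet` reports as a BIC. When there are more predictors than observations, the RSS can get close to zero for small lambda, so this BIC may favor the least-penalized fits and should be treated with caution.

```r
library(glmnet)

set.seed(1)
n <- 50
p <- 200
x <- matrix(rnorm(n * p), n, p)          # simulated predictors (assumption)
beta <- c(rep(2, 5), rep(0, p - 5))      # only the first 5 predictors matter
y <- as.numeric(x %*% beta + rnorm(n))   # simulated Gaussian response

# Lasso path over glmnet's default grid of lambda values
fit <- glmnet(x, y, family = "gaussian")

# Residual sum of squares for every lambda on the path
pred <- predict(fit, newx = x)           # n x length(lambda) matrix
rss  <- colSums((y - pred)^2)

# Approximate df by the number of nonzero coefficients (intercept not counted;
# counting it would shift every BIC by the same constant)
df <- fit$df

# Gaussian BIC up to an additive constant: n * log(RSS / n) + log(n) * df
bic <- n * log(rss / n) + log(n) * df
lambda_bic <- fit$lambda[which.min(bic)]

# Cross-validation alternative: pick lambda by out-of-sample error
cv <- cv.glmnet(x, y, family = "gaussian")
lambda_cv <- cv$lambda.min

print(c(bic = lambda_bic, cv = lambda_cv))
```

Comparing `lambda_bic` with `lambda_cv` on your own data is one way to see whether the BIC-based choice is behaving sensibly.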
