Questions tagged [bayesian]

Bayesian (after Thomas Bayes) refers to methods in probability and statistics that involve quantifying uncertainty about parameter or latent variable estimates by incorporating both prior and observed information. Bayesian modeling, inference, optimization, and model comparison techniques are on topic. A programming element is expected; theoretical/methodological questions should go to https://stats.stackexchange.com.

Overview

Bayesian inference is a method of statistical inference which uses Bayes' theorem - named after Thomas Bayes (1702-1761) - to quantify the uncertainty of parameters or latent variables. The statement of Bayes' theorem in Bayesian inference is

$$P(\theta \mid d) = \frac{P(d \mid \theta)\, P(\theta)}{P(d)}$$

Here θ represents the parameters to be inferred and d the data. P(θ|d) is the posterior probability and P(d|θ) is the likelihood function. P(θ) is the prior: a function encoding previous beliefs about θ within a model appropriate for the data. P(d) is a normalization factor, often called the evidence or marginal likelihood, which ensures that the posterior sums (or integrates) to one over θ.
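As a rough illustration of how these four terms fit together, the following Python sketch computes a posterior on a discrete grid for a coin-flip experiment. The grid of θ values, the uniform prior, the data (7 heads, 3 tails), and the function name `grid_posterior` are all assumptions made for this example; none of them come from the tag description.

```python
def grid_posterior(thetas, prior, heads, tails):
    """Return P(theta | d) on a discrete grid for coin-flip data d."""
    # Likelihood P(d | theta) for each candidate theta (binomial kernel;
    # the binomial coefficient cancels in the normalization).
    likelihood = [t ** heads * (1 - t) ** tails for t in thetas]
    # Unnormalized posterior: likelihood times prior.
    unnormalized = [lik * p for lik, p in zip(likelihood, prior)]
    # P(d): the normalization factor, summed over the grid.
    evidence = sum(unnormalized)
    return [u / evidence for u in unnormalized]


# Hypothetical setup: 11 candidate values of theta with a uniform prior.
thetas = [i / 10 for i in range(11)]
prior = [1 / len(thetas)] * len(thetas)

posterior = grid_posterior(thetas, prior, heads=7, tails=3)

# Grid point with the highest posterior probability.
best = thetas[posterior.index(max(posterior))]
print(best)
```

With a uniform prior the posterior is proportional to the likelihood, so the most probable grid point coincides with the observed proportion of heads. A finer grid or a non-uniform prior can be swapped in without changing the function.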

The formula is used as an updating procedure: as more data become available, the posterior can be updated successively. In the first instance, the prior must be specified by the user. In later updates, the prior is usually chosen to be the posterior from a previous updating procedure.
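The updating procedure can be sketched with a conjugate Beta-Binomial model, where the posterior has a closed form and can be passed on directly as the next prior. This is a minimal sketch under stated assumptions: the Beta(1, 1) starting prior, the three data batches, and the helper name `update_beta` are hypothetical choices for illustration only.

```python
def update_beta(prior, heads, tails):
    """Update a Beta(a, b) prior with binomial data; returns the posterior (a, b)."""
    a, b = prior
    # Conjugacy: a Beta prior combined with a binomial likelihood gives a Beta posterior.
    return (a + heads, b + tails)


# Assumed starting prior, specified by the user: Beta(1, 1), i.e. uniform on [0, 1].
prior = (1, 1)

# Hypothetical data arriving in batches of (heads, tails).
batches = [(3, 1), (2, 4), (5, 0)]

posterior = prior
for heads, tails in batches:
    # The posterior from the previous step serves as the prior for this step.
    posterior = update_beta(posterior, heads, tails)

# Processing all the data in a single step gives the same result.
all_at_once = update_beta(prior, 10, 5)

posterior_mean = posterior[0] / (posterior[0] + posterior[1])
print(posterior, all_at_once, posterior_mean)
```

For models without a conjugate form, the posterior is typically available only as samples, so reusing it as a prior requires an approximation; that case is outside the scope of this sketch.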

Tag usage

Questions on this tag should be about implementation and programming problems, not about the statistical or theoretical properties of the technique. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning and data analysis.

1808 questions
10
votes
1 answer

compare bayesian linear regression VS linear regression

Recently I learnt the Bayesian linear regression model, but I'm confused about in which situations we should use linear regression and when to use the Bayesian version. How about the performance of these two? And is the bayesian logistic…
FindBoat
9
votes
5 answers

Naive bayes calculation in sql

I want to use naive Bayes to classify documents into a relatively large number of classes. I'm looking to confirm whether a mention of an entity name in an article really is that entity, on the basis of whether that article is similar to articles…
ʞɔıu
9
votes
1 answer

PYMC3 Seasonal Variables

I'm relatively new to PyMC3 and I'm trying to implement a Bayesian Structural Time Series (BSTS) model without regressors, for instance the model fit here in R. The model is as follows: I can implement the local linear trend using a GaussianRandomWalk as…
Paul
9
votes
6 answers

Hyperparameter tune for Tensorflow

I am searching for a hyperparameter tuning package for code written directly in TensorFlow (not Keras or Tflearn). Could you make some suggestions?
9
votes
1 answer

Incremental model update with PyMC3

Is it possible to incrementally update a model in PyMC3? I can currently find no information on this. All the documentation works with data that is known a priori. But in my understanding, a Bayesian model also means being able to update a belief.…
Christian
9
votes
0 answers

Bayesian error-in-variables (total least squares) model in R using MCMCglmm

I am fitting some Bayesian linear mixed models using the MCMCglmm package in R. My data includes predictors that are measured with error. I'd therefore like to build a model that takes this into account. My understanding is that a basic mixed…
Alberto
9
votes
7 answers

Anyone can tell me why we always use the gaussian distribution in Machine learning?

For example, we always assume that the data or signal error follows a Gaussian distribution. Why?
laotao
9
votes
1 answer

Clustering and Bayes classifiers Matlab

So I am at a crossroads on what to do next. I set out to learn and apply some machine learning algorithms on a complicated dataset, and I have now done this. My plan from the very beginning was to combine two possible classifiers in an attempt to…
G Gr
8
votes
4 answers

Calculating the probability of a token being spam in a Bayesian spam filter

I recently wrote a Bayesian spam filter. I used Paul Graham's article "A Plan for Spam" and a C# implementation of it that I found on CodeProject as references to create my own filter. I just noticed that the implementation on CodeProject uses the total…
Waleed Eissa
8
votes
2 answers

What does a Bayesian Classifier score represent?

I'm using the ruby classifier gem whose classifications method returns the scores for a given string classified against the trained model. Is the score a percentage? If so, is the maximum difference 100 points?
Mike Buckbee
8
votes
2 answers

wondering if Bayes classifier is right approach?

I'm wondering if a Bayes classifier makes sense for an application where the same phrase "served cold" (for example) is "good" when associated with some things (beer, soda) but "bad" when related to other things (steak, pizza, burger)? What I'm wondering…
jpw
8
votes
3 answers

Way to infer the size of the userbase of a site from sampling taken usernames

Suppose you wanted to estimate the size of a userbase of a site which does not publicize this information. People are more likely to have acquired different usernames with different probabilities. For instance, if the username 'nick' doesn't exist…
ʞɔıu
8
votes
2 answers

Measuring the performance of classification algorithm

I've got a classification problem on my hands, which I'd like to address with a machine learning algorithm (Bayes, or Markovian probably; the question is independent of the classifier to be used). Given a number of training instances, I'm looking…
8
votes
0 answers

CausalImpact package in R doesn't work for Poisson bsts model

I'd like to use the CausalImpact package in R to estimate the impact of an intervention on infectious disease case counts. We typically characterize the distributions of case counts as either Poisson or negative binomial. The bsts() function allows…
salauer
8
votes
2 answers

How to use `Dirichlet Process Gaussian Mixture Model` in Scikit-learn? (n_components?)

My understanding of "an infinite mixture model with the Dirichlet Process as a prior distribution on the number of clusters" is that the number of clusters is determined by the data as they converge to a certain amount of clusters. This R…
O.rka