This Vocabulary and Concept Primer for Bayesian Analysis is provided to assist in learning the distinct scientific terminology used when discussing Bayesian analysis.
Inference is the process of estimating a quantity of interest, referred to as an unknown parameter, that cannot be observed or measured directly. Bayesian inference uses Bayes' theorem to estimate the unknown parameter.
A theorem is a mathematical statement that has been rigorously proved. Bayes' theorem is based on the work of Thomas Bayes and Pierre-Simon Laplace; it describes how new information can be used to increase the confidence, or reduce the uncertainty, associated with a conclusion based on prior existing information.
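To make the theorem concrete, here is a minimal Python sketch with hypothetical numbers (a screening test with 90% sensitivity and a 10% false-positive rate; none of these figures come from the primer itself):

def bayes_posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    # Bayes' theorem for two hypotheses H and not-H:
    #   P(H | E) = P(E | H) * P(H) / P(E)
    # where P(E) = P(E | H) * P(H) + P(E | not-H) * (1 - P(H)).
    p_evidence = (p_evidence_given_h * prior
                  + p_evidence_given_not_h * (1.0 - prior))
    return p_evidence_given_h * prior / p_evidence

# A 1-in-50 base rate (prior = 0.02) combined with a positive result.
print(bayes_posterior(0.02, 0.90, 0.10))  # about 0.155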
Bayesian probability refers to the degree of belief one may hold in some knowledge or conclusion under uncertain circumstances. This is in contrast to the frequentist notion of probability, which refers to the frequency of occurrences observed among a series of repeated or repeatable trials. Whereas frequentist probabilities can be discussed only with regard to events that are both observable and repeatable, Bayesian probabilities can be applied to a wider range of observable and unobservable phenomena.
The prior probability represents what is known about the likelihood of different possible outcomes before a scientific test or experiment. The prior probability distribution can be based on objective or empirical information such as a base rate or incidence rate. For example, if exactly four persons had access and opportunity to commit a crime, the prior probability is not less than 1 in 4. When little information is available, a prior probability of 1 in 2 is a reasonable default, because there are two possible conclusions, deception or truth-telling (an inconclusive result is not a conclusion). Outcomes can also be evaluated across a range of different possible prior probabilities, as in the sketch below.
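As a hedged illustration of that last point, the hypothetical bayes_posterior function sketched above can be run across several candidate priors, so a reader can see how sensitive the conclusion is to the starting assumption:

# Evaluate the same hypothetical positive result under different priors,
# from a weak 1-in-10 prior up to the 1-in-2 (even odds) default.
for prior in (0.10, 0.25, 0.50):
    post = bayes_posterior(prior, 0.90, 0.10)
    print(f"prior = {prior:.2f} -> posterior = {post:.2f}")
# Prints posteriors of 0.50, 0.75, and 0.90 respectively.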
A likelihood function assigns a score to each candidate value of the unknown parameter: the probability of observing the data at hand if that candidate value were true.
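For example, under a simple coin-flip (binomial) model, the likelihood function scores each candidate parameter value by how probable it makes the observed data; a brief Python sketch with made-up counts:

from math import comb

def binomial_likelihood(p, successes, trials):
    # Probability of the observed counts if the true success rate were p.
    return comb(trials, successes) * p**successes * (1 - p)**(trials - successes)

# Score candidate values of p against 7 successes in 10 trials.
for p in (0.3, 0.5, 0.7, 0.9):
    print(f"p = {p:.1f} -> likelihood = {binomial_likelihood(p, 7, 10):.4f}")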
The posterior probability tells us the probability associated with a test result or conclusion. It combines the prior probability, the likelihood function, and the observed data (e.g., test scores) via Bayes' theorem.
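Continuing the hypothetical binomial example, the posterior can be sketched on a grid by multiplying the prior by the likelihood at each candidate value and normalizing so the results sum to one:

# Grid approximation: posterior is proportional to prior times likelihood.
grid = [i / 100 for i in range(1, 100)]   # candidate success rates
prior = [1.0] * len(grid)                 # flat (uniform) prior
unnorm = [pr * binomial_likelihood(p, 7, 10) for p, pr in zip(grid, prior)]
total = sum(unnorm)
posterior = [u / total for u in unnorm]

# The posterior mass concentrates near the observed proportion of 0.7.
best = max(zip(posterior, grid))
print(best[1])  # 0.7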
Odds are a convenient and intuitive way of discussing probabilistic information. We can calculate the odds for any
probability or proportion. The relationship between odds and proportions (probabilities) is this:
odds = p / (1 - p)
Also, if the odds are known the proportion or probability can be calculated in this way:
p = odds / (1 + odds)
Although mathematically related to probabilities, odds can provide clearer and more intuitively useful information for many people.
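A minimal pair of Python helpers implementing the two conversions above:

def probability_to_odds(p):
    return p / (1.0 - p)          # odds = p / (1 - p)

def odds_to_probability(odds):
    return odds / (1.0 + odds)    # p = odds / (1 + odds)

print(probability_to_odds(0.75))  # 3.0, i.e. odds of 3 to 1
print(odds_to_probability(3.0))   # 0.75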
Bayes factor is the value by which we multiply the prior odds to obtain the posterior odds. When the prior odds are 1 (that is, a prior probability of 1 in 2), the Bayes factor is equal to the posterior odds. The Bayes factor provides useful information to people who want to replicate or inspect an analytic result in greater detail.
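In code form, the relationship is a single multiplication; the sketch below (reusing odds_to_probability from above, with a made-up Bayes factor of 9) shows even prior odds being updated:

# posterior odds = Bayes factor * prior odds
prior_odds = 1.0                        # prior probability of 1 in 2
bayes_factor = 9.0                      # hypothetical evidence strength
post_odds = bayes_factor * prior_odds
print(post_odds)                        # 9.0 (i.e., 9 to 1)
print(odds_to_probability(post_odds))   # posterior probability of 0.9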
A Bayesian credible interval tells us the range within which we can reasonably expect the parameter of interest to lie, and so conveys the variability (i.e., how sure we are) we can expect for an analytic result or conclusion. The credible interval is the Bayesian analog of the confidence interval in frequentist statistics.
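One common way to obtain a credible interval is from a conjugate Beta posterior; the sketch below assumes SciPy is available and reuses the made-up 7-successes-in-10-trials data:

from scipy.stats import beta

# A Beta(1, 1) (uniform) prior updated with 7 successes and 3 failures
# yields a Beta(8, 4) posterior.
posterior = beta(8, 4)

# Central 95% credible interval: the parameter lies in this range
# with 95% posterior probability.
lower, upper = posterior.ppf(0.025), posterior.ppf(0.975)
print(f"95% credible interval: ({lower:.3f}, {upper:.3f})")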
Naive Bayes is a widely used application of Bayes' theorem to statistical decision making, machine learning, and artificial intelligence. In this case, "naive" refers to the deliberate simplifying assumption that the different sources of data (e.g., from different sensors) are independent of one another given the class. Naive Bayes algorithms are advantageous in that they are simple to understand, fast, easy to develop, and often perform quite well compared to more complex classifiers.
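As a hedged sketch (assuming scikit-learn is available, with toy, made-up sensor readings), a Naive Bayes classifier can be fit and queried in a few lines:

from sklearn.naive_bayes import GaussianNB

# Two numeric "sensor" features per observation, two classes.
X = [[1.0, 2.1], [0.9, 1.9], [3.2, 4.0], [3.0, 4.2]]
y = [0, 0, 1, 1]

clf = GaussianNB()   # treats features as conditionally independent given the class
clf.fit(X, y)
print(clf.predict([[1.1, 2.0]]))        # expected: class 0
print(clf.predict_proba([[1.1, 2.0]]))  # posterior class probabilities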