Stick Breaking Process and Dirichlet Process Priors
Before the stick is broken for the first time, the remainder has length 1 (i.e. the whole stick). Each subsequent break affects only the remainder. Larger values of alpha lead to more, smaller pieces, because each break then removes a smaller fraction of the remainder on average (the mean of Beta(1, alpha) is 1/(1 + alpha)).
- b key breaks off a fraction of the remainder (lavender) using a draw from Beta(1, alpha)
- p key mimics pressing the b key until the remainder is tiny (0.001)
- r key resets everything
- t key throws darts and computes the O statistic
- shift-up/shift-down arrow keys increase/decrease alpha (the smallest allowed value is 0.1)
- shift-right/shift-left arrow keys increase/decrease the number of darts (within the range 10-100)
- E is the expected number of colored rectangles hit by at least one of the n darts
- O is the observed number of colored rectangles hit by at least one dart
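The key bindings above can be approximated in a few lines of Python. This is a minimal sketch, not the applet's own code: the function names (`break_stick`, `throw_darts`, `expected_hits`) are hypothetical, darts that land in the leftover remainder are assumed to hit no colored rectangle, and E is assumed to be computed conditional on the current set of pieces.

```python
import random


def break_stick(alpha, tol=0.001, rng=random):
    """Repeat the 'b' step until the remainder drops below tol (the 'p' key).

    Returns the list of piece lengths and the final remainder.
    """
    pieces = []
    remainder = 1.0  # before the first break, the remainder is the whole stick
    while remainder >= tol:
        fraction = rng.betavariate(1.0, alpha)  # draw from Beta(1, alpha)
        piece = fraction * remainder            # each break affects the remainder only
        pieces.append(piece)
        remainder -= piece
    return pieces, remainder


def throw_darts(pieces, n_darts, rng=random):
    """Return O: the number of distinct pieces hit by at least one dart.

    Assumption: a dart that lands beyond the last piece (in the remainder)
    counts as a miss.
    """
    hit = set()
    for _ in range(n_darts):
        x = rng.random()  # uniform dart position along the unit stick
        cumulative = 0.0
        for k, length in enumerate(pieces):
            cumulative += length
            if x < cumulative:
                hit.add(k)
                break
    return len(hit)


def expected_hits(pieces, n_darts):
    """Return E for fixed pieces: the sum over pieces of P(hit at least once)."""
    return sum(1.0 - (1.0 - w) ** n_darts for w in pieces)


# Usage: one simulated stick with alpha = 2 and 50 darts.
rng = random.Random(42)
pieces, remainder = break_stick(2.0, rng=rng)
observed = throw_darts(pieces, 50, rng=rng)
expected = expected_hits(pieces, 50)
print(len(pieces), observed, round(expected, 2))
```

A piece of length w is missed by all n darts with probability (1 - w)^n, which is where the expression in `expected_hits` comes from.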
This process provides a prior distribution known as the Dirichlet Process Prior (DPP). Imagine that the darts are sites and the colors of the rectangles represent different relative substitution rates. The stick breaking process illustrated in this applet shows what typical draws from a Dirichlet Process Prior look like for different values of alpha and different numbers of sites (darts). Used in a Bayesian MCMC analysis, a DPP would allow you to learn something about how many rate categories are present and which sites fall into which category. Normally, a hierarchical model would be used in which alpha is a hyperparameter and its hyperprior would be vague but nevertheless discourage alpha from getting too large.
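To see how alpha and the number of sites jointly control the number of rate categories, it can help to compute the prior expected number of distinct categories. The sketch below uses the standard Dirichlet process result that n sites occupy, on average, the sum of alpha / (alpha + i) for i = 0, ..., n - 1 categories. This formula averages over all possible sticks and is not taken from the applet, whose E may instead be conditioned on the stick currently displayed; the function name `expected_categories` is hypothetical.

```python
def expected_categories(alpha, n_sites):
    """Prior expected number of distinct categories among n_sites sites
    under a Dirichlet process with concentration parameter alpha."""
    return sum(alpha / (alpha + i) for i in range(n_sites))


# Usage: compare small and large alpha over the applet's range of dart counts.
for alpha in (0.1, 1.0, 10.0):
    for n_sites in (10, 100):
        print(alpha, n_sites, round(expected_categories(alpha, n_sites), 2))
```

The expectation grows with alpha and only slowly (roughly logarithmically) with the number of sites, which is why a hyperprior that keeps alpha moderate also keeps the number of rate categories moderate.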
Creative Commons Attribution 4.0 International License (CC BY 4.0). To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.