---
title: "Chapter 3 Beta-Binomial Bayesian Model Notes"
author: "Emanuel Rodriguez"
execute:
  message: false
  warning: false
format:
  html:
    monofont: "Cascadia Mono"
    highlight-style: gruvbox-dark
    css: styles.css
    callout-icon: false
    callout-appearance: simple
    toc: false
    html-math-method: katex
---

```{r}
library(bayesrules)
library(tidyverse)
```

The chapter is set up with an example of polling results. We are put into
the scenario where we are managing the campaign for a candidate. We know
that, on average, her support based on recent polls is around 45%. In the
next few sections we'll work through our Bayesian framework and incorporate
a new tool, the **Beta-Binomial** model. This model will develop a
continuous prior, as opposed to the discrete ones we've been working with
so far.

## The Beta prior

:::{.callout-note}
## Probability Density Function

Let $\pi$ be a continuous random variable with probability density
function (pdf) $f(\pi)$. Then $f(\pi)$ has the following properties:

1. $f(\pi) \geq 0$
2. $\int_{\pi}f(\pi)\,d\pi = 1$ (this is analogous to $\sum$ in the case of pmfs)
3. $P(a < \pi < b) = \int_a^b f(\pi)\,d\pi$ when $a\leq b$
:::

:::{.callout-tip icon="true"}
A quick note on (1) above: it places no upper bound on $f(\pi)$, so the
density can exceed 1. This means that we can't interpret values of
$f$ as probabilities. We can, however, use them to compare the plausibility
of two different values of $\pi$: the greater the value of $f$, the more
plausible the value. To calculate probabilities using $f$ we must determine
the area under the curve it defines, as shown in (3).
:::

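To make the tip above concrete, here is a minimal sketch in base R. The choice of Beta(5, 1) and the interval (0.8, 1) are illustrative assumptions, not values from the chapter. It shows that a density value can exceed 1 while the area under the density over an interval is still a valid probability.

```{r}
# Illustrative values only: Beta(5, 1) and the interval (0.8, 1)
# are assumptions chosen for this sketch.
f <- function(p) dbeta(p, shape1 = 5, shape2 = 1)

# Height of the density at pi = 0.95; it is larger than 1,
# so it cannot be read as a probability.
f(0.95)

# P(0.8 < pi < 1) as the area under f, by numerical integration
area <- integrate(f, lower = 0.8, upper = 1)$value

# The same probability from the Beta cdf
cdf_diff <- pbeta(1, 5, 1) - pbeta(0.8, 5, 1)

c(area = area, cdf_diff = cdf_diff)
```

The two numbers should agree up to the tolerance of `integrate()`, which is property (3) in action.
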
```{r}
# Evaluate three Beta densities on a grid over [0, 1]
x <- seq(0, 1, by = .05)
y1 <- dbeta(x = x, shape1 = 5, shape2 = 5)
y2 <- dbeta(x = x, shape1 = 5, shape2 = 1)
y3 <- dbeta(x = x, shape1 = 1, shape2 = 5)

# Reshape to long format and fix the facet order
d <- tibble(
  x,
  `beta(5, 1)` = y2,
  `beta(5, 5)` = y1,
  `beta(1, 5)` = y3
) |>
  pivot_longer(-x, names_to = "beta_shape", values_to = "beta") |>
  mutate(beta_shape = factor(beta_shape,
                             levels = c("beta(5, 1)",
                                        "beta(5, 5)",
                                        "beta(1, 5)")))
```

```{r}
#| label: fig-beta-shapes
#| fig-cap: The basic shapes of beta based on the hyperparameters

ggplot(data = d, aes(x, beta)) +
  geom_point() +
  geom_line() +
  facet_wrap(vars(beta_shape))
```

In general, the Beta distribution is left-skewed when
$\alpha > \beta$, symmetric when $\alpha = \beta$, and right-skewed
when $\alpha < \beta$; see @fig-beta-shapes.

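The direction of the skew can also be checked numerically. The sketch below uses the standard closed-form skewness of a Beta distribution, which is not given in these notes, and a hypothetical helper named `beta_skewness`. A negative value means left-skewed, zero means symmetric, and a positive value means right-skewed.

```{r}
# beta_skewness() is a hypothetical helper written for this sketch;
# it implements the closed-form skewness of Beta(alpha, beta).
beta_skewness <- function(alpha, beta) {
  2 * (beta - alpha) * sqrt(alpha + beta + 1) /
    ((alpha + beta + 2) * sqrt(alpha * beta))
}

skew <- c(
  `beta(5, 1)` = beta_skewness(5, 1),  # alpha > beta
  `beta(5, 5)` = beta_skewness(5, 5),  # alpha = beta
  `beta(1, 5)` = beta_skewness(1, 5)   # alpha < beta
)

# Only the sign matters for the direction of the skew
sign(skew)
```
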
:::{.callout-note}
## The Standard Uniform

When $\pi$ is equally likely to take on any value between 0 and 1,
we can model $\pi$ using the standard uniform model.

$$\pi \sim \text{Unif}(0, 1)$$

The pdf of $\text{Unif}(0, 1)$ is $f(\pi) = 1$ for $\pi \in [0, 1]$.

Note that $\text{Unif}(0, 1)$ is just a special case of the Beta with
hyperparameters $\alpha = \beta = 1$; see @fig-std-unif-as-beta.
:::

```{r}
#| label: fig-std-unif-as-beta
#| fig-cap: "The standard uniform is a special case of the beta distribution with a = b = 1"
std_unif <- tibble(
  x, `beta(1, 1)` = dbeta(x, 1, 1)
)

ggplot(data = std_unif, aes(x, `beta(1, 1)`)) +
  geom_point() +
  geom_line()
```

### Mean and Mode of the Beta

The mean and mode are both measures of centrality. The mean is the average
value, and the mode is the most "common" value. For a pmf, the mode is
the value that occurs most often; for a pdf, it is the value of $\pi$ at
which the density reaches its maximum.

For the Beta, these are:

$$E(\pi) = \frac{\alpha}{\alpha + \beta}$$
$$\text{Mode}(\pi) = \frac{\alpha - 1}{\alpha + \beta - 2}\;\; \text{when} \;\;\alpha,\beta > 1$$

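As a quick check of these formulas, the sketch below computes the mean and mode of a Beta(45, 55). That choice is an illustrative assumption, not taken from these notes, picked so the mean matches the 45% support in the polling scenario. The helper names `beta_mean` and `beta_mode` are hypothetical. The mode is compared with a numerical maximization of `dbeta()`.

```{r}
# Hypothetical helpers implementing the two formulas above
beta_mean <- function(alpha, beta) alpha / (alpha + beta)
beta_mode <- function(alpha, beta) {
  stopifnot(alpha > 1, beta > 1)  # the mode formula needs both > 1
  (alpha - 1) / (alpha + beta - 2)
}

# Assumed hyperparameters, for illustration only
a <- 45
b <- 55

m  <- beta_mean(a, b)  # 45 / 100
mo <- beta_mode(a, b)  # 44 / 98

# Locate the peak of the density numerically as a cross-check
num_mode <- optimize(function(p) dbeta(p, a, b),
                     interval = c(0, 1), maximum = TRUE)$maximum

c(mean = m, mode = mo, numerical_mode = num_mode)
```

The numerical mode should match the closed-form mode up to the default tolerance of `optimize()`.
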
We can also measure the variability of $\pi$. In @fig-beta-vars
we can see that the variability of $\pi$ differs based on the values
of $\alpha$ and $\beta$.

```{r}
#| label: fig-beta-vars
#| fig-cap: "Two symmetrical shapes of beta with different variance"
beta_variances <- tibble(
  x,
  `beta(5, 5)` = dbeta(x, 5, 5),
  `beta(20, 20)` = dbeta(x, 20, 20)
) |>
  pivot_longer(-x, names_to = "beta_shape", values_to = "beta")

ggplot(data = beta_variances, aes(x, beta)) +
  geom_point() +
  geom_line() +
  facet_wrap(vars(beta_shape))
```

We can formulate the variance of $\text{Beta}(\alpha, \beta)$ with

$$\text{Var}(\pi) = \frac{\alpha\beta}{(\alpha + \beta)^2(\alpha+\beta+1)}$$

It follows that

$$\text{SD}(\pi) = \sqrt{\text{Var}(\pi)}$$

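The variance formula can be applied to the two symmetric shapes plotted earlier, Beta(5, 5) and Beta(20, 20). This is a minimal sketch in base R; the helper names `beta_var` and `beta_sd` are hypothetical. As a sanity check, the closed-form variance of Beta(5, 5) is compared with a numerical integration of $(\pi - E(\pi))^2 f(\pi)$.

```{r}
# Hypothetical helpers implementing the variance and SD formulas
beta_var <- function(alpha, beta) {
  alpha * beta / ((alpha + beta)^2 * (alpha + beta + 1))
}
beta_sd <- function(alpha, beta) sqrt(beta_var(alpha, beta))

sds <- c(
  `beta(5, 5)`   = beta_sd(5, 5),
  `beta(20, 20)` = beta_sd(20, 20)
)
sds

# Cross-check: Var(pi) = integral of (p - mean)^2 * f(p) over [0, 1];
# the mean of Beta(5, 5) is 0.5
num_var <- integrate(function(p) (p - 0.5)^2 * dbeta(p, 5, 5),
                     lower = 0, upper = 1)$value

c(formula = beta_var(5, 5), numerical = num_var)
```

Both distributions have mean 0.5, but the larger hyperparameters give a smaller standard deviation, which is consistent with the narrower curve for Beta(20, 20).
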