---
title: "Chapter 2 Notes"
author: "Emanuel Rodriguez"
execute:
  message: false
  warning: false
format:
  html:
    monofont: "Cascadia Mono"
    highlight-style: gruvbox-dark
    css: styles.css
    callout-icon: false
    callout-appearance: simple
    toc: false
    html-math-method: katex
---

*Note: these notes are a work in progress*

In this chapter we step through an example of "fake" vs "real" news to build
a framework for determining the probability that a new news article titled
"The President has a secret!" is real or fake.

We then go on to build a probability model known as the Binomial model within
the Bayesian framework.

```{r}
#| message: false
#| warning: false
# libraries
library(bayesrules)
library(dplyr)
library(tidyr)
library(gt)
library(tibble)
library(ggplot2)

data(fake_news)

fake_news <- tibble::as_tibble(fake_news)
```

What is the proportion of news articles that were labeled fake vs real?

```{r}
fake_news |> head()

fake_news |>
  group_by(type) |>
  summarise(
    total = n(),
    prop = total / nrow(fake_news)
  )
```

If we let $B$ be the event that a news article is "fake" news, and
$B^c$ be the event that a news article is "real", we can write the following:

$$P(B) = .4$$
$$P(B^c) = .6$$

This is the first "clue" or set of data that we have to build into our framework.
Namely, the majority of articles are "real", so we could simply predict that the
new article is "real". This understanding of reality now becomes our prior.

Next we gather additional data and use it to update our prior. The new observation
we make is the use of exclamation marks "!". We note that the use of "!" is more
frequent in news articles labeled as "fake". We will want to incorporate this into
our framework to decide whether the new incoming article should be labelled as
real or fake.

### Likelihood

:::{.callout-note}
## Probability and Likelihood

When the event $B$ is known, we can evaluate the uncertainty of events
$A$ and $A^c$ given $B$:

$$P(A|B) \text{ vs } P(A^c|B)$$

If, on the other hand, we know event $A$, then we can evaluate the relative
compatibility of data $A$ with $B$ and $B^c$ using likelihood functions:

$$L(B|A) \text{ vs } L(B^c|A)$$
$$=P(A|B) \text{ vs } P(A|B^c)$$

:::

So in our case, we don't know whether this new incoming article is real or not,
but we do know that the title has an exclamation mark. This means we can evaluate
how compatible the "fake" and "real" labels are with a title that contains an "!",
using likelihood functions. We can formulate this as:

$$L(B|A) \text{ vs } L(B^c|A)$$

And perform the computation in R as follows:

```{r}
# within each type (fake vs real), what proportion of titles do and do not use "!"?
prop_of_excl_within_type <- fake_news |>
  group_by(type, title_has_excl) |>
  summarise(total = n(), .groups = "drop") |>
  group_by(type) |>
  mutate(prop_within_type = total / sum(total)) |>
  ungroup() |>
  select(type, has_excl = title_has_excl, prop_within_type)
```

```{r}
prop_of_excl_within_type |>
  pivot_wider(names_from = "type", values_from = prop_within_type) |>
  gt() |>
  gt::cols_label(
    has_excl = "Contains Exclamation",
    fake = "Fake",
    real = "Real") |>
  gt::fmt_number(columns = c("fake", "real"), decimals = 3) |>
  gt::cols_width(everything() ~ px(100))
```

The table above also shows the likelihoods for the case when an article does not
contain an exclamation point in the title. It's really important to note that these
are likelihoods, and it is not the case that $L(B|A) + L(B^c|A) = 1$; in fact, this
sum evaluates to a number less than one. However, since $L(B|A) = .267$ and
$L(B^c|A) = .022$, we have gained additional knowledge: the use of "!" in a title
is more compatible with a fake news article than with a real one.

Up to this point we can summarize our framework as follows:

| event      | $B$  | $B^c$ | Total |
|------------|------|-------|-------|
| prior      | .4   | .6    | 1     |
| likelihood | .267 | .022  | .289  |

Our next goal is to come up with normalizing factors in order to build our
probability table:

|      | $B$ | $B^c$ | Total |
|------|-----|-------|-------|
|$A$   | (1) | (2)   |       |
|$A^c$ | (3) | (4)   |       |
|Total | .4  | .6    | 1     |

A couple of things to note about our table: (1) + (3) = .4, (2) + (4) = .6, and
(1) + (2) + (3) + (4) = 1.

(1.) $P(A \cap B) = P(A|B)P(B)$. We know the likelihood $L(B|A) = P(A|B)$ and we also
know the prior, so we insert these to get
$$ P(A \cap B) = P(A|B)P(B) = .267 \times .4 = .1068$$

(3.) $P(A^c \cap B) = P(A^c|B)P(B)$. In this case we do know the prior $P(B) = .4$, but we
don't directly know the value of $P(A^c|B)$. However, we note that $P(A|B) + P(A^c|B) = 1$,
therefore we compute $P(A^c|B) = 1 - P(A|B) = 1 - .267 = .733$:
$$ P(A^c \cap B) = P(A^c|B)P(B) = .733 \times .4 = .2932$$

We can now confirm that $.1068 + .2932 = .4$.

Moving on to (2) and (4):

(2.) $P(A \cap B^c) = P(A|B^c)P(B^c)$. In this case we know the likelihood $L(B^c|A) = P(A|B^c)$ and
we know the prior $P(B^c)$, therefore
$$P(A \cap B^c) = P(A|B^c)P(B^c) = .022 \times .6 = .0132$$

(4.) $P(A^c \cap B^c) = P(A^c|B^c)P(B^c) = (1 - .022) \times .6 = .5868$

and we can confirm that $.0132 + .5868 = .6$.

Now we can fill in the rest of the table:

|      | $B$   | $B^c$ | Total |
|------|-------|-------|-------|
|$A$   | .1068 | .0132 | .12   |
|$A^c$ | .2932 | .5868 | .88   |
|Total | .4    | .6    | 1     |
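
To mirror the arithmetic above, here is a small sketch in R that rebuilds the
joint probability table from the prior and the likelihoods (the object names,
like `prior_fake`, are just illustrative):

```{r}
# prior and likelihoods taken from the summaries above
prior_fake <- .4
prior_real <- .6
lik_excl_given_fake <- .267   # P(A | B)
lik_excl_given_real <- .022   # P(A | B^c)

joint <- tibble::tibble(
  event = c("A (has !)", "A^c (no !)"),
  fake  = c(lik_excl_given_fake, 1 - lik_excl_given_fake) * prior_fake,
  real  = c(lik_excl_given_real, 1 - lik_excl_given_real) * prior_real
) |>
  mutate(total = fake + real)

joint
```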

An important concept we implemented above is the idea of **total probability**.

:::{.callout-tip}
## total probability

The **total probability** of observing a real article is made up of the sum of its
parts. Namely,

$$P(B^c) = P(A \cap B^c) + P(A^c \cap B^c)$$
$$=P(A|B^c)P(B^c) + P(A^c|B^c)P(B^c)$$
$$=.0132 + .5868 = .6$$
:::

In the above calculations we also step through **joint probabilities**.

:::{.callout-note}
## Joint and conditional probability

$$P(A \cap B) = P(A|B)P(B)$$

$A$ and $B$ are said to be independent events, if and only if

$$P(A \cap B) = P(A)P(B)$$

from this we can also derive the definition of a conditional probability

$$P(A|B) = \frac{P(A \cap B)}{P(B)}$$

:::

At this point we are able to answer the question, "What is the probability that
the new article is fake?" Given that the new article has an exclamation point,
we can zoom into the top row of the table of probabilities. Within this row we
have probabilities $.1068/.12 = .89$ for fake and $.0132 / .12 = .11$ for real.
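
As a quick numeric check, the same division can be done in R with the joint
probabilities from the table (a minimal sketch using the rounded values above):

```{r}
joint_fake_excl <- .1068                               # P(A and B)
joint_real_excl <- .0132                               # P(A and B^c)
total_excl      <- joint_fake_excl + joint_real_excl   # P(A) = .12

c(
  posterior_fake = joint_fake_excl / total_excl,
  posterior_real = joint_real_excl / total_excl
)
```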

This is essentially Bayes' Rule. We developed a posterior probability for an event
$B$ given some observation $A$. We did so by combining the likelihood of event $B$
given some new data $A$ and the prior probability of event $B$. More formally we
have the following definition:

:::{.callout-note}
## Bayes' Rule

The posterior probability of an event $B$ given an event $A$ is:

$$ P(B|A) = \frac{P(A \cap B)}{P(A)} = \frac{L(B|A)P(B)}{P(A)}$$

where $L$ is the likelihood function $L(B|A) = P(A|B)$ and $P(A)$ is the
total probability of $A$.

More generally,

$$ \text{posterior} = \frac{\text{likelihood} \cdot \text{prior}}{\text{normalizing constant}}$$
:::

### Simulation
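
Because the chunks below draw random samples, it can help to set a seed first so
that the counts and plots are reproducible from run to run (the seed value here
is arbitrary):

```{r}
# arbitrary seed so the simulation results below are reproducible
set.seed(1234)
```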

```{r}
articles <- tibble::tibble(type = c("real", "fake"))

priors <- c(.6, .4)

articles_sim <- sample_n(articles, 10000, replace = TRUE, weight = priors)
```

```{r}
articles_sim |>
  ggplot(aes(x = type)) + geom_bar()
```

and a summary table:

```{r}
articles_sim |>
  group_by(type) |>
  summarise(
    total = n(),
    prop = total / nrow(articles_sim)
  ) |>
  gt() |>
  gt::cols_width(everything() ~ px(100))
```

The simulation of 10,000 articles gives proportions very close to the priors we
took from the data. We can now add the exclamation-mark usage into the data.

```{r}
articles_sim <- articles_sim |>
  mutate(model_data = case_when(
    type == "fake" ~ .267,
    type == "real" ~ .022
  ))
```

The plan here is to iterate through the 10,000 samples and use the `model_data`
value to assign either "yes" or "no" using the `sample` function.

```{r}
data <- c("yes", "no")

articles_sim <- articles_sim |>
  mutate(id = row_number()) |>
  group_by(id) |>
  mutate(usage = sample(data, 1, prob = c(model_data, 1 - model_data)))
```
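
An equivalent and typically faster alternative, sketched here rather than taken
from the book, is to skip the per-row grouping and draw all of the usage flags in
one vectorized call to `rbinom()` (the name `articles_alt` is just illustrative):

```{r}
articles_alt <- articles_sim |>
  ungroup() |>  # drop the per-id grouping used above
  mutate(usage_alt = ifelse(rbinom(n(), size = 1, prob = model_data) == 1, "yes", "no"))

articles_alt |> count(type, usage_alt)
```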

```{r}
articles_sim |>
  group_by(usage, type) |>
  summarise(
    total = n()
  ) |>
  pivot_wider(names_from = type, values_from = total)
```

```{r}
articles_sim |>
  ggplot(aes(x = type, fill = usage)) +
  geom_bar() +
  scale_fill_discrete(type = c("gray8", "dodgerblue4"))
```

So far we have computed both the priors and the likelihoods. We can now simply
filter our data to reflect the incoming article and determine our posterior,
which should land close to the .89/.11 split we computed exactly above.

```{r}
articles_sim |>
  filter(usage == "yes") |>
  group_by(type) |>
  summarise(
    total = n()
  ) |>
  mutate(
    prop = total / sum(total)
  )
```

## Binomial Model and the chess example

The example used here is the case of a chess match between a human and a
computer, "Deep Blue". The setup is that the two faced each other in 1996,
and the human won. A rematch is scheduled for 1997. We would like to model
the number of games out of 6 that the human can win.

Let $\pi$ be the probability that the human wins any one game against
the computer. To simplify things greatly, we assume that $\pi$ can only take on
the values .2, .5, and .8. We also assume the following prior (we are told
in the book that we will learn how to build these later on):

| $\pi$   | .2  | .5  | .8  | Total |
|---------|-----|-----|-----|-------|
|$f(\pi)$ | .10 | .25 | .65 | 1     |

:::{.callout-tip}
## Note

It's important to note here that the values of $\pi$ themselves **do not** need
to add up to 1. $\pi$ represents the chance of winning any single game, so it can
take on any value in $[0, 1]$. On the other hand, $f$ is a function that maps each
value of $\pi$ to a probability; this is next.
:::

:::{.callout-note}
## Discrete Probability Model

Let $Y$ be a discrete random variable. The probability model for $Y$ is
described by a **probability mass function** (pmf) defined as:
$$f(y) = P(Y = y)$$

and has the following properties

1. $0 \leq f(y) \leq 1\;\; \forall y$
2. $\sum_{\forall y}f(y) = 1$
:::

:::{.callout-tip}
## in emanuel's words

What does this mean? Well, it's very straightforward: a pmf is a function that
takes in some value $y$ and outputs the probability that the random variable
$Y$ equals $y$.
:::
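
As a tiny concrete example (an illustration, not from the book), the pmf of a
fair six-sided die assigns probability 1/6 to each face, and we can check both
properties in R:

```{r}
# pmf of a fair six-sided die: f(y) = 1/6 for y = 1, ..., 6
f_die <- rep(1 / 6, 6)

all(f_die >= 0 & f_die <= 1)  # property 1: every value lies between 0 and 1
sum(f_die)                    # property 2: the values sum to 1
```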

Next we would like to add the dependency of $Y$ on $\pi$; we do so by introducing
the conditional pmf.

:::{.callout-note}
## Conditional probability model of data $Y$

Let $Y$ be a discrete random variable that depends on some parameter
$\pi$. We define the conditional probability model of $Y$ as the
conditional pmf,

$$f(y|\pi) = P(Y = y | \pi)$$

and has the following properties,

1. $0 \leq f(y|\pi) \leq 1\;\; \forall y$
2. $\sum_{\forall y}f(y|\pi) = 1$
:::

:::{.callout-tip}
## in emanuel's words

This is essentially the same probability model we defined above, except
now we are conditioning the probabilities on some parameter $\pi$.
:::

In the example of the chess match we must make some assumptions:

1. The chance of winning any game in the match stays constant. So if in game 1
the human has a .65 probability of winning, the same holds for games 2 through 6.

2. Winning or losing a game does not affect the chances of winning or losing
the next game, i.e. games are independent of one another.

These two assumptions lead us to the **Binomial Model**.

:::{.callout-note}
## The Binomial Model

Let the random variable $Y$ represent the number of successes in $n$ trials.
Assume that each trial is independent, and the probability of success in a
given trial is $\pi$. Then the conditional dependence of $Y$ on $\pi$ can
be modeled by the **Binomial Model** with parameters $n$ and $\pi$. We can
write this as,

$$Y|\pi \sim Bin(n, \pi)$$

the binomial model is specified by the pmf:

$$f(y|\pi) = {n \choose y} \pi^y(1 - \pi)^{n-y}$$
:::

Knowing this, we can represent $Y$, the total number of games out of 6
that the human wins, as

$$Y|\pi \sim Bin(6, \pi)$$

with conditional pmf:

$$f(y|\pi) = {6 \choose y}\pi^y(1 - \pi)^{6 - y}\;\; \text{for } y \in \{0, 1, 2, 3, 4, 5, 6\}$$

With the pmf we can now determine the probability of the human winning $y$ games
out of 6 for any given value of $\pi$.

```{r}
chess_pmf <- function(y, p, n = 6) {
  choose(n, y) * (p ^ y) * (1 - p)^(n - y)
}

# what is the probability that the human wins 5 of the 6 games given a pi value of .8
chess_pmf(y = 5, p = .8)
```
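
As a sanity check, this hand-rolled pmf should agree with R's built-in binomial
density `dbinom()`:

```{r}
# both lines should return the same value
chess_pmf(y = 5, p = .8)
dbinom(5, size = 6, prob = .8)
```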

:::{.callout-tip}
## intuition for the binomial pmf

The formula for the binomial is actually pretty intuitive. First you have the
scalar ${n \choose y}$, which counts the total number of ways the player can win
$y$ games out of the possible $n$. This is multiplied by the probability of the
$y$ successes, since $p^y$ can be re-written as $p \times p \times \cdots \times p$,
and then multiplied by the probability of the $n-y$ failures, $(1 - p)^{n - y}$.
:::

```{r}
pies <- seq(0, 1, by = .05)
py <- chess_pmf(y = 4, p = pies)

d <- data.frame(pies = pies, py = py)

d |>
  ggplot(aes(pies, py)) + geom_col()
```

```{r}
pies <- c(.2, .5, .8)
ys <- 0:6

d <- tidyr::expand_grid(pies, ys)
fys <- purrr::map2_dbl(d$ys, d$pies, chess_pmf, n = 6)

d$fys <- fys
d$display_pi <- as.factor(paste("pi =", d$pies))

d |>
  ggplot(aes(x = ys, y = fys)) +
  geom_col() +
  scale_x_continuous(breaks = 0:6) +
  facet_wrap(vars(display_pi))
```

The plot shows the three possible values of $\pi$, along with the value of the
pmf for each possible number of games the human can win in the match. The values
of $f(y|\pi)$ are pretty intuitive: we would expect the random variable $Y$ to be
lower when the value of $\pi$ is lower and higher when the value of $\pi$ is higher.

For the sake of the exercise, let's add more values of $\pi$ so that we can see
this shift happen in more detail.

```{r}
pies <- seq(.1, .9, by = .1)
ys <- 0:6

d <- tidyr::expand_grid(pies, ys)
fys <- purrr::map2_dbl(d$ys, d$pies, chess_pmf, n = 6)

d$fys <- fys
d$display_pi <- as.factor(paste("pi =", d$pies))

d |>
  ggplot(aes(x = ys, y = fys)) +
  geom_col() +
  scale_x_continuous(breaks = 0:6) +
  facet_wrap(vars(display_pi), nrow = 3)
```

As it turns out, the human ended up winning just one game in the 1997 rematch,
so $Y = 1$. The next step in our analysis is to determine how compatible this
new data is with each value of $\pi$, that is, the likelihood.

This is very easy to do with all the work we have done so far:

```{r}
d |>
  filter(ys == 1) |>
  ggplot(aes(pies, fys)) +
  geom_col() +
  scale_x_continuous(breaks = seq(.1, .9, by = .1))
```

It's very important to note the following:

```{r}
# this will sum to a value greater than 1!!
d |>
  filter(ys == 1) |>
  pull(fys) |>
  sum()
```

:::{.callout-important icon="true"}
This has been mentioned before, but it's an important message to drive home.
The reason why these values sum to more than 1 is that they are **not**
probabilities, they are likelihoods. We are determining how likely each value of
$\pi$ is given that we have observed $Y = 1$.
:::

We can formalize the likelihood function $L$ in our example as follows:

$$L(\pi|y=1) = f(y=1|\pi) = {6 \choose 1}\pi^1(1-\pi)^{6-1}$$
$$ = 6\pi(1 - \pi)^5$$

We can test this out for $\pi = .2$:

```{r}
6 * .2 * (.8 ^ 5)
```

which matches the height of the bar at $\pi = .2$ highlighted in the plot below.

```{r}
#| echo: false

d |>
  filter(ys == 1) |>
  mutate(hl = ifelse(pies == .2, "y", "n")) |>
  ggplot(aes(pies, fys, fill = hl)) +
  geom_col() +
  scale_x_continuous(breaks = seq(.1, .9, by = .1)) +
  scale_fill_manual(values = c("y" = "darkblue", "n" = "darkgrey"), guide = "none")
```

The likelihood values for $Y = 1$ are shown here:

```{r}
d |>
  filter(ys == 1) |>
  select(-display_pi) |>
  knitr::kable()
```

The overall take-away from having observed the new data $Y=1$ is that it is most
compatible with the $\pi$ value of .2. This means it's safe to assume that the
human is the weaker player, since we can think of $\pi$ as a measure of the
relative strength of the human compared to the computer, with 0 being the weakest
and 1 being the strongest.

:::{.callout-note}
## Probability mass functions vs likelihood functions

When $\pi$ is known, the conditional pmf $f(\cdot | \pi)$ allows us to compare
the probabilities of the different values of $Y$ occurring given that value of
$\pi$.

On the other hand, when $Y = y$ is known, the likelihood function
$L(\cdot|Y=y) = f(Y=y|\cdot)$ allows us to compare the relative likelihoods of
observing data $y$ under different values of $\pi$.
:::

Now that we have the priors for $\pi$ and the likelihoods for $Y=1$, all we need
is the **normalizing constant** to make use of Bayes' Rule, update our priors with
this new information, and develop our posterior. Recall that the normalizing
constant is just the total probability of observing $Y = 1$. To get it we
calculate the probability of observing $Y = 1$ for each value of $\pi$ and weight
each of these by the corresponding prior:

$$f(y = 1) = f(y=1|\pi = .2)f(\pi = .2)$$
$$+ f(y=1|\pi = .5)f(\pi = .5) + f(y = 1|\pi = .8)f(\pi=.8)$$
$$\approx .0637$$
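
We can reproduce this weighted sum directly with `chess_pmf()` and the priors
(a quick check; the object names are just illustrative):

```{r}
pi_vals  <- c(.2, .5, .8)
prior_pi <- c(.10, .25, .65)

# total probability of observing Y = 1: sum of likelihood * prior
normalizing_constant <- sum(chess_pmf(y = 1, p = pi_vals) * prior_pi)
normalizing_constant
```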

### Posterior

Our posterior distribution has pmf:

$$f(\pi|y = 1)$$

we can write this out as:

$$f(\pi | y = 1) = \frac{f(\pi)\times L(\pi| y = 1)}{f(y = 1)}$$

for $\pi \in \{0.2, 0.5, 0.8\}$

Using just our simplified set of $\pi$ values we have the following posterior
(entries rounded to three decimals):

| $\pi$            | 0.2   | 0.5   | 0.8   | Total |
|------------------|-------|-------|-------|-------|
|$f(\pi)$          | 0.10  | 0.25  | 0.65  | 1     |
|$f(\pi \mid y=1)$ | 0.617 | 0.368 | 0.016 | 1     |
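
The same three posterior values can be computed in R by reusing `chess_pmf()`,
the priors, and the normalizing constant (a minimal sketch):

```{r}
pi_vals  <- c(.2, .5, .8)
prior_pi <- c(.10, .25, .65)

likelihood_y1 <- chess_pmf(y = 1, p = pi_vals)
posterior_pi  <- likelihood_y1 * prior_pi / sum(likelihood_y1 * prior_pi)
round(posterior_pi, 3)
```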

### Chess Posterior Simulation

To set up the scenario, we have the possible values of $\pi$ and the
corresponding prior probability for each one:

```{r}
chess <- tibble::tibble(pi = c(.2, .5, .8))
prior <- c(.1, .25, .65)
```

Next we use `sample_n()` to draw 10,000 values of $\pi$ from our data frame;
we will use each of these to simulate a 6-game match.

```{r}
chess_sim <- sample_n(chess, size = 10000, weight = prior,
                      replace = TRUE)
```

Simulate the 6-game match for each of the 10,000 sampled values of $\pi$:

```{r}
chess_sim <- chess_sim |>
  mutate(y = rbinom(10000, size = 6, prob = pi))
```

Let's check how close this simulation is to our known conditional pmfs.

```{r}
chess_sim |>
  ggplot(aes(x = y)) +
  stat_count(aes(y = after_stat(prop))) +
  facet_wrap(~pi)
```

Let's now focus on the events where $Y = 1$ and tally up the results to see how
well they approximate the values we formally computed as our posterior.

```{r}
chess_sim |>
  filter(y == 1) |>
  group_by(pi) |>
  tally() |>
  mutate(
    prop = n / sum(n)
  )
```