more work done

Emanuel Rodriguez 2022-09-14 22:32:54 -07:00
parent 2c6cfadb0c
commit 0ee8c085e8
7 changed files with 400 additions and 182 deletions

File diff suppressed because it is too large

R/ch2.qmd

@@ -11,7 +11,8 @@ format:
css: styles.css
callout-icon: false
callout-appearance: simple
-toc: true
+toc: false
html-math-method: katex
---
*Note: these notes are a work in progress*
@@ -537,7 +538,7 @@ d |>
:::{.callout-important icon="true"}
this has been mentioned before, but it's an important message
-to drive home. Note that the reason why thes values sum to a
+to drive home. Note that the reason why the values sum to a
value greater than 1 is that they are **not** probabilities, they
are likelihoods. We are determining how likely each value of
$\pi$ is given that we have observed $Y = 1$.
@@ -554,7 +555,20 @@ We can test this out
```{r}
6 * .2 * (.8 ^ 5)
```
-which is the value we get as .2 in the bar plot above.
+which is the value we get at $\pi = .2$ in the bar plot.
```{r}
#| echo: false
d |>
filter(ys == 1) |>
mutate(hl = ifelse(pies == .2, "y", "n")) |>
ggplot(aes(pies, fys, fill=hl)) +
geom_col() +
scale_x_continuous(breaks = seq(.1, .9, by = .1)) +
scale_fill_manual(values = c("y" = "darkblue", "n" = "darkgrey"), guide = "none")
```
The likelihood values for $Y = 1$ are shown here:
@@ -565,5 +579,107 @@ d |>
knitr::kable()
```
The overall take-away from having observed the new data
$Y=1$ is that it is most compatible with the $\pi$ value
of .2. This means that it's safe to assume the human is
the weaker player, since we can think of the value $\pi$ as
a measure of the relative weakness/superiority of the human compared
to the computer, with 0 being the weakest and 1 being the
strongest.
:::{.callout-note}
## Probability mass functions vs likelihood functions
When $\pi$ is known, the conditional pmf $f(\cdot | \pi)$
allows us to compare the probabilities of the different values
of $Y$ occurring with that $\pi$.
On the other hand, when $Y = y$ is known, the likelihood
function $L(\cdot|Y=y) = f(Y=y|\cdot)$ allows us to compare
the relative likelihoods of observing data $y$ under different
values of $\pi$.
:::
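The distinction can be seen directly with `dbinom()`, the Binomial pmf in base R (a quick sketch):
```{r}
# pmf view: fix pi = .2 and vary y -- these are probabilities, so they sum to 1
sum(dbinom(0:6, size = 6, prob = .2))

# likelihood view: fix y = 1 and vary pi -- these need not sum to 1
dbinom(1, size = 6, prob = c(.2, .5, .8))
```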
Now that we have the priors for $\pi$ and the likelihoods
for $Y=1$, all we need is the **normalizing constant**
to make use of Bayes' Rule to update our priors
with this new information and obtain our posterior. Recall that the normalizing
constant is just the total probability of observing $Y = 1$.
To get this we simply calculate the probability of observing
$Y = 1$ under each value of $\pi$ and weight each of these
by the corresponding prior for that value of $\pi$.
$$
\begin{aligned}
f(y = 1) &= f(Y=1|\pi = .2)f(\pi = .2) \\
&\quad + f(Y=1|\pi = .5)f(\pi = .5) + f(Y = 1|\pi = .8)f(\pi = .8) \\
&\approx .0637
\end{aligned}
$$
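As a quick check, the normalizing constant can be computed directly (a sketch using `dbinom()` for the Binomial likelihood):
```{r}
pies  <- c(.2, .5, .8)
prior <- c(.10, .25, .65)
# likelihood of observing exactly 1 win in 6 games under each value of pi
lik <- dbinom(1, size = 6, prob = pies)
# total probability of Y = 1: each likelihood weighted by its prior
sum(lik * prior)  # approximately 0.0638
```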
### Posterior
Our posterior distribution has pmf:
$$f(\pi|y = 1)$$
We can write this out as:
$$f(\pi | y = 1) = \frac{f(\pi)\times L(\pi| y = 1)}{f(y = 1)}$$
for $\pi \in \{0.2, 0.5, 0.8\}$
Using just our simplified set of $\pi$ values, we have the following
posterior:
| $\pi$ | 0.2 | 0.5 | 0.8 | Total |
|-------|-----|-----|-----|-------|
| $f(\pi)$ | 0.10 | 0.25 | 0.65 | 1 |
| $f(\pi \mid y = 1)$ | 0.617 | 0.368 | 0.016 | 1 |
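The posterior values can be reproduced in a few lines (a sketch of the same Bayes' Rule arithmetic):
```{r}
pies  <- c(.2, .5, .8)
prior <- c(.10, .25, .65)
lik   <- dbinom(1, size = 6, prob = pies)
# Bayes' Rule: posterior is prior times likelihood, normalized
posterior <- prior * lik / sum(prior * lik)
round(posterior, 3)
```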
### Chess Posterior Simulation
Set up the scenario: we have the possible values of $\pi$
and the corresponding prior probability for each one.
```{r}
chess <- tibble::tibble(pi = c(.2, .5, .8))
prior <- c(.1, .25, .65)
```
Next we use `sample_n()` to draw 10,000
values of $\pi$ from our data frame, weighted by the prior probabilities;
we will use each of these to simulate a 6-game match.
```{r}
chess_sim <- sample_n(chess, size = 10000, weight = prior,
replace = TRUE)
```
Now simulate the 10,000 matches, recording the number of wins in each:
```{r}
chess_sim <- chess_sim |>
mutate(y = rbinom(10000, size = 6, prob = pi))
```
Let's check how close this simulation comes to our known
conditional pmfs:
```{r}
chess_sim |>
ggplot(aes(x = y)) +
stat_count(aes(y = after_stat(prop))) +
facet_wrap(~pi)
```
Let's now focus on the events where $Y = 1$ and tally up the
results to see how well they approximate the values
we formally computed as our posterior.
```{r}
chess_sim |>
filter(y == 1) |>
group_by(pi) |>
tally() |>
mutate(
prop = n / sum(n)
)
```
