more work done

460  R/ch2.html
122  R/ch2.qmd

@ -11,7 +11,8 @@ format:
css: styles.css
callout-icon: false
callout-appearance: simple
toc: false
html-math-method: katex
---

*Note: these notes are a work in progress*

@ -537,7 +538,7 @@ d |>

:::{.callout-important icon="true"}
This has been mentioned before, but it's an important message
to drive home. Note that the reason why the values sum to a
value greater than 1 is that they are **not** probabilities, they
are likelihoods. We are determining how likely each value of
$\pi$ is given that we have observed $Y = 1$.

@ -554,7 +555,20 @@ We can test this out

```{r}
6 * .2 * (.8 ^ 5)
```

which is the value we get at $\pi = .2$ in the bar plot.
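
As a quick cross-check (a sketch added here, not in the original notes), `dbinom()` gives the same value directly, and summing the likelihood of $Y = 1$ across the grid of $\pi$ values used in the bar plot (assumed to be `seq(.1, .9, by = .1)`) shows that these values are not probabilities over $\pi$: they sum to more than 1.

```{r}
# same quantity via the binomial pmf: P(Y = 1 | pi = .2) with 6 games
dbinom(1, size = 6, prob = .2)

# likelihoods of Y = 1 across the grid of pi values from the bar plot;
# they sum to well over 1, so they are not probabilities over pi
sum(dbinom(1, size = 6, prob = seq(.1, .9, by = .1)))
```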

```{r}
#| echo: false

# highlight the bar at pi = .2
d |>
  filter(ys == 1) |>
  mutate(hl = ifelse(pies == .2, "y", "n")) |>
  ggplot(aes(pies, fys, fill = hl)) +
  geom_col() +
  scale_x_continuous(breaks = seq(.1, .9, by = .1)) +
  scale_fill_manual(values = c("y" = "darkblue", "n" = "darkgrey"), guide = "none")
```

The likelihood values for $Y = 1$ are here:

@ -565,5 +579,107 @@ d |>
  knitr::kable()
```

The overall take-away from having observed the new data
$Y = 1$ is that it is most compatible with the $\pi$ value
of .2. This means that it's safe to assume that the human is
the weaker player, since we can think of the value $\pi$ as
a measure of the relative strength of the human compared
to the computer, with 0 being the weakest and 1 being the
strongest.

:::{.callout-note}
## Probability mass functions vs likelihood functions

When $\pi$ is known, the conditional pmf $f(\cdot | \pi)$
allows us to compare the probabilities of the different values
of $Y$ occurring under that $\pi$.

On the other hand, when $Y = y$ is known, the likelihood
function $L(\cdot | Y = y) = f(Y = y | \cdot)$ allows us to compare the
relative likelihoods of observing data $y$ under different
values of $\pi$.
:::

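To make the distinction concrete, here is a small illustrative check (a sketch, not from the original notes): with $\pi$ fixed the pmf sums to 1 over the possible $y$ values, while with $y = 1$ fixed the likelihoods across the $\pi$ values do not.

```{r}
# pmf: pi fixed at .2, probabilities over all possible y values sum to 1
sum(dbinom(0:6, size = 6, prob = .2))

# likelihood: y fixed at 1, evaluated across the three pi values;
# there is no reason for these to sum to 1
dbinom(1, size = 6, prob = c(.2, .5, .8))
```
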
Now that we have the priors for $\pi$ and the likelihoods
for $Y = 1$, all we need is the **normalizing constant**
to make use of Bayes' Rule in order to update our priors
with this new information and develop our posterior. Recall that the normalizing
constant is just the total probability of observing $Y = 1$.
To get this we simply calculate the probability of observing
$Y = 1$ under each value of $\pi$ and weight each of these
by the corresponding prior for that value of $\pi$.

$$
\begin{aligned}
f(y = 1) &= f(Y = 1|\pi = .2)f(\pi = .2) + f(Y = 1|\pi = .5)f(\pi = .5) + f(Y = 1|\pi = .8)f(\pi = .8) \\
&\approx .0637
\end{aligned}
$$

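As a sanity check on this arithmetic, the normalizing constant can be computed directly (a sketch; `pi_vals` is just a local name used here, not from the original notes):

```{r}
# total probability of observing Y = 1: likelihood of each pi weighted by its prior
pi_vals <- c(.2, .5, .8)
prior   <- c(.1, .25, .65)
sum(dbinom(1, size = 6, prob = pi_vals) * prior)
```
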
### Posterior

Our posterior distribution has pmf:

$$f(\pi|y = 1)$$

We can write this out as:

$$f(\pi | y = 1) = \frac{f(\pi)\times L(\pi| y = 1)}{f(y = 1)}$$

for $\pi \in \{0.2, 0.5, 0.8\}$.

Using just our simplified set of $\pi$ values we have the following
posterior:

| $\pi$               | 0.2   | 0.5   | 0.8   | Total |
|---------------------|-------|-------|-------|-------|
| $f(\pi)$            | 0.10  | 0.25  | 0.65  | 1     |
| $f(\pi \mid y = 1)$ | 0.617 | 0.368 | 0.016 | 1     |

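A minimal sketch of this update in R (the `likelihood` and `posterior` column names are mine, not from the original notes); it reproduces the posterior row of the table above:

```{r}
# Bayes' Rule on the discrete grid: prior * likelihood, rescaled to sum to 1
tibble::tibble(
  pi         = c(.2, .5, .8),
  prior      = c(.1, .25, .65),
  likelihood = dbinom(1, size = 6, prob = pi),
  posterior  = prior * likelihood / sum(prior * likelihood)
)
```

Rescaling by `sum(prior * likelihood)` is the same as dividing by the normalizing constant $f(y = 1)$ computed above.
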
### Chess Posterior Simulation

To set up the scenario, we list the possible values of $\pi$
and the corresponding prior probability for each one.

```{r}
# candidate values of pi and their prior probabilities
chess <- tibble::tibble(pi = c(.2, .5, .8))
prior <- c(.1, .25, .65)
```

Next we use `sample_n()` to draw 10,000
values of $\pi$ from our data frame, weighted by the prior. We will use
each of these to simulate a 6-game match.

```{r}
chess_sim <- sample_n(chess, size = 10000, weight = prior,
                      replace = TRUE)
```

Now simulate the 10,000 matches, recording the number of wins in each set of 6 games.

```{r}
chess_sim <- chess_sim |>
  mutate(y = rbinom(10000, size = 6, prob = pi))
```

Let's check how close this simulation is to our known
conditional pmfs.

```{r}
chess_sim |>
  ggplot(aes(x = y)) +
  stat_count(aes(y = after_stat(prop))) +
  facet_wrap(~pi)
```

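Beyond the visual comparison, a quick numeric check (a sketch; the `sim_prop` and `exact_prob` column names are mine) puts the simulated proportions next to the exact conditional pmf values from `dbinom()`:

```{r}
chess_sim |>
  count(pi, y) |>
  group_by(pi) |>
  mutate(
    sim_prop   = n / sum(n),                     # simulated proportion of each y within pi
    exact_prob = dbinom(y, size = 6, prob = pi)  # known conditional pmf f(y | pi)
  )
```
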
Let's now focus on the simulations where $Y = 1$ and tally up the
results to see how well these approximate the values
we formally computed as our posterior.

```{r}
chess_sim |>
  filter(y == 1) |>
  group_by(pi) |>
  tally() |>
  mutate(
    prop = n / sum(n)
  )
```

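For a direct side-by-side comparison (a sketch; it recomputes the exact posterior from the prior and `dbinom()` rather than reusing earlier output, and the column names are mine):

```{r}
# exact posterior from Bayes' Rule, using the same grid and prior as above
pi_vals <- c(.2, .5, .8)
prior   <- c(.1, .25, .65)
exact   <- prior * dbinom(1, size = 6, prob = pi_vals) /
  sum(prior * dbinom(1, size = 6, prob = pi_vals))

chess_sim |>
  filter(y == 1) |>
  count(pi) |>
  mutate(
    sim_posterior   = n / sum(n),              # proportion of Y = 1 simulations at each pi
    exact_posterior = exact[match(pi, pi_vals)]
  )
```
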