css: styles.css
callout-icon: false
callout-appearance: simple
toc: false
html-math-method: katex
---
*Note: these notes are a work in progress*
:::{.callout-important icon="true"}
This has been mentioned before, but it's an important message to drive home. Note that the reason the values sum to a value greater than 1 is that they are **not** probabilities; they are likelihoods. We are determining how likely each value of $\pi$ is given that we have observed $Y = 1$.
:::

We can test this out:
```{r}
6 * .2 * (.8 ^ 5)
```
which is the value we get at $\pi = .2$ in the bar plot below.
```{r}
#| echo: false
d |>
  filter(ys == 1) |>
  mutate(hl = ifelse(pies == .2, "y", "n")) |>
  ggplot(aes(pies, fys, fill = hl)) +
  geom_col() +
  scale_x_continuous(breaks = seq(.1, .9, by = .1)) +
  scale_fill_manual(values = c("y" = "darkblue", "n" = "darkgrey"), guide = "none")
```
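We can also get this value from base R's `dbinom()`, which evaluates the same binomial pmf (a quick sketch added to these notes):

```{r}
# probability of exactly 1 win in 6 games when pi = .2
dbinom(1, size = 6, prob = .2)
```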
The likelihood values for $Y = 1$ are here:
```{r}
d |>
  # ...
  knitr::kable()
```
The overall take-away from having observed the new data $Y = 1$ is that it is most compatible with the $\pi$ value of .2. This means it's safe to assume that the human is the weaker player, since we can think of $\pi$ as a measure of the relative strength of the human compared to the computer, with 0 being the weakest and 1 being the strongest.
:::{.callout-note}
## Probability mass functions vs likelihood functions

When $\pi$ is known, the conditional pmf $f(\cdot \mid \pi)$ allows us to compare the probabilities of different values of $Y$ occurring with that $\pi$.

On the other hand, when $Y = y$ is known, the likelihood function $L(\cdot \mid Y = y) = f(Y = y \mid \cdot)$ allows us to compare the relative likelihoods of observing data $y$ under different values of $\pi$.
:::
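To make the distinction concrete, here is a small added sketch using `dbinom()` with our chess numbers: with $\pi$ fixed, the pmf values over all possible $y$ sum to 1, while with $y = 1$ fixed, the likelihoods across our three $\pi$ values need not.

```{r}
# pmf: fix pi = .8, vary y over all outcomes; sums to 1
sum(dbinom(0:6, size = 6, prob = .8))

# likelihood: fix y = 1, vary pi; these need not sum to 1
dbinom(1, size = 6, prob = c(.2, .5, .8))
```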
Now that we have the priors for $\pi$ and the likelihoods for $Y = 1$, all we need is the **normalizing constant** to make use of Bayes' Rule, update our priors with this new information, and develop our posterior. Recall that the normalizing constant is just the total probability of observing $Y = 1$. To get this we simply calculate the probability of observing $Y = 1$ for each value of $\pi$ and weight each of these by the corresponding prior of that value.
$$
\begin{aligned}
f(y = 1) &= f(Y = 1 \mid \pi = .2)f(\pi = .2) + f(Y = 1 \mid \pi = .5)f(\pi = .5) \\
&\quad + f(Y = 1 \mid \pi = .8)f(\pi = .8) \\
&\approx 0.0637
\end{aligned}
$$
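As a check on that arithmetic, a minimal sketch (the names `pi_vals` and `prior_probs` are ours, not from the notes) that computes the weighted sum directly:

```{r}
pi_vals <- c(.2, .5, .8)
prior_probs <- c(.10, .25, .65)

# f(y = 1): each conditional probability weighted by its prior
sum(dbinom(1, size = 6, prob = pi_vals) * prior_probs)
```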
### Posterior

Our posterior distribution has pmf:

$$f(\pi \mid y = 1)$$

We can write this out as:

$$f(\pi \mid y = 1) = \frac{f(\pi) \times L(\pi \mid y = 1)}{f(y = 1)}$$

for $\pi \in \{0.2, 0.5, 0.8\}$.

Using just our simplified set of $\pi$ values, we have the following posterior alongside the prior:

| $\pi$ | 0.2 | 0.5 | 0.8 | Total |
|-------|-----|-----|-----|-------|
| $f(\pi)$ | 0.10 | 0.25 | 0.65 | 1 |
| $f(\pi \mid y = 1)$ | 0.617 | 0.368 | 0.016 | 1 |
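And a matching sketch for the posterior row of the table, reusing `pi_vals` and `prior_probs` from above:

```{r}
likelihood <- dbinom(1, size = 6, prob = pi_vals)  # L(pi | y = 1)

# Bayes' Rule: prior times likelihood, divided by f(y = 1)
prior_probs * likelihood / sum(prior_probs * likelihood)
```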
### Chess Posterior Simulation

To set up the scenario, we define the possible values of $\pi$ and the corresponding prior probability for each one:

```{r}
# candidate pi values and their prior weights
chess <- tibble::tibble(pi = c(.2, .5, .8))
prior <- c(.1, .25, .65)
```
Next we use `sample_n()` to draw 10,000 values of $\pi$ from our data frame, weighted by the prior; we will use each of these to simulate a six-game match.

```{r}
# sample pi values in proportion to their prior probabilities
chess_sim <- sample_n(chess, size = 10000, weight = prior,
                      replace = TRUE)
```
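Before simulating any games, a quick added check that the sampled $\pi$ values land near the prior weights:

```{r}
# observed share of each pi value; should be close to .1, .25, .65
chess_sim |>
  count(pi) |>
  mutate(prop = n / sum(n))
```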
Now simulate the 10,000 matches, six games each:

```{r}
# for each sampled pi, count wins in a 6-game match
chess_sim <- chess_sim |>
  mutate(y = rbinom(10000, size = 6, prob = pi))
```
Let's check how close this simulation is to our known conditional pmfs:

```{r}
chess_sim |>
  ggplot(aes(x = y)) +
  stat_count(aes(y = after_stat(prop))) +
  facet_wrap(~pi)
```
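The empirical proportions in each panel can be compared against the exact pmf heights; a small added sketch for one panel:

```{r}
# exact pmf for pi = .8, to compare with that facet
dbinom(0:6, size = 6, prob = .8)
```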
Let's now focus on the events where $Y = 1$ and tally up the results to see how well they approximate the values we formally computed as our posterior:

```{r}
chess_sim |>
  filter(y == 1) |>
  group_by(pi) |>
  tally() |>
  mutate(
    prop = n / sum(n)
  )
```
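These simulated proportions should land close to the exact posterior values of roughly 0.617, 0.368, and 0.016 that we computed above.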