@@ -1345,7 +1342,7 @@ Bayes’ Rule
type total prop
<chr> <int> <dbl>
1 fake 1076 0.897
-2 real 124 0.103
+2 real 123 0.103
@@ -1373,7 +1370,7 @@ Bayes’ Rule
-
+
@@ -1404,7 +1401,7 @@ Discrete Probability Model
-
+
@@ -1417,8 +1414,7 @@ in emanuel’s words
what does this mean? well, it’s very straightforward: a pmf is a function that takes in some value \(y\) and outputs the probability that the random variable \(Y\) equals \(y\).
-
-
The Binomial Model
+
next we would like to add the dependency of \(Y\) on \(\pi\); we do so by introducing the conditional pmf.
@@ -1438,7 +1434,7 @@ Conditional probability model of data \(Y\)
-
+
@@ -1451,7 +1447,145 @@ in emanuel’s words
this is essentially the same probability model we defined above, except now we are conditioning probabilities on some parameter \(\pi\).
-
+
in the example of the chess player we must make some assumptions:
+
+
the chances of winning any match stay constant. So if at match 1 the human has a .65 probability of winning, then the same probability holds for matches 2 through 6.
+
Winning or losing a game does not affect the chances of winning or losing the next game, i.e., matches are independent of one another.
+
+
These two assumptions lead us to the Binomial Model.

The Binomial Model

Let the random variable \(Y\) represent the number of successes in \(n\) trials. Assume that each trial is independent, and the probability of success in a given trial is \(\pi\). Then the conditional dependence of \(Y\) on \(\pi\) can be modeled by the Binomial Model with parameters \(n\) and \(\pi\). We can write this as,
+
\[Y|\pi \sim Bin(n, \pi)\]
+
the binomial model is specified by the pmf:
+
\[f(y|\pi) = {n \choose y} \pi^y(1 - \pi)^{n-y}\]
+
+
+
knowing this, we can represent \(Y\), the total number of matches out of 6 that the human can win.
with the pmf we can now determine the probability of the human winning \(y\) matches out of 6 for any given value of \(\pi\).
+
+
chess_pmf <- function(y, p, n = 6) {
  choose(n, y) * (p ^ y) * (1 - p)^(n - y)
}
+
# what is the probability that the human wins 5 games given a pi value of .8
chess_pmf(y = 5, p = .8)
+
+
[1] 0.393216
the formula for the binomial is actually pretty intuitive. First you have the scalar \({n \choose y}\); this counts the total number of ways the player can win \(y\) games out of the possible \(n\). This is multiplied by the probability of \(y\) successes, since \((p ^ y)\) can be re-written as \(p\times p\times \cdots \times p\), and then by the probability of \(n-y\) failures, \((1 - p)^{n - y}\).
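As a quick sanity check, plugging \(y = 5\), \(n = 6\), \(\pi = .8\) into the pmf reproduces the value computed above:

\[f(5|.8) = {6 \choose 5}(.8)^5(.2)^{1} = 6 \times 0.32768 \times 0.2 = 0.393216\]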
The plot shows the three possible values for \(\pi\) along with the value of the pmf for each possible number of matches the human can win. The values of \(f(y|\pi)\) are pretty intuitive: we would expect the random variable \(Y\) to be lower when the value of \(\pi\) is lower and higher when the value of \(\pi\) is higher.
+
For the sake of the exercise let’s add more values of \(\pi\) so that we can see this shift happen in more detail.
as it turns out, the human ended up winning just one game in the 1997 rematch, so \(Y = 1\). The next step in our analysis is to determine how compatible this new data is with each value of \(\pi\); that is, the likelihood.
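Before plotting, the likelihoods can be read straight off the pmf; a minimal sketch, repeating the `chess_pmf` definition from above so the snippet runs on its own:

```r
# chess_pmf as defined earlier, repeated so this snippet is self-contained
chess_pmf <- function(y, p, n = 6) {
  choose(n, y) * (p ^ y) * (1 - p)^(n - y)
}

# likelihood of a few candidate values of pi given the observed Y = 1
pies <- c(.2, .5, .8)
chess_pmf(y = 1, p = pies)
#> [1] 0.393216 0.093750 0.001536
```

A small \(\pi\) like .2 is far more compatible with \(Y = 1\) than a large one like .8.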
+
This is very easy to do with all the work we have done so far:
+
+
d |>
  filter(ys == 1) |>
  ggplot(aes(pies, fys)) +
  geom_col() +
  scale_x_continuous(breaks = seq(.1, .9, by = .1))
+
+
It’s very important to note the following:
+
+
# this will sum to a value greater than 1!!
d |>
  filter(ys == 1) |>
  pull(fys) |>
  sum()
+
+
[1] 1.37907

Important

this has been mentioned before but it’s an important message to drive home. The reason these values sum to a value greater than 1 is that they are not probabilities; they are likelihoods. We are determining how likely each value of \(\pi\) is given that we have observed \(Y = 1\).
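For contrast, a quick sketch (again repeating the `chess_pmf` definition so it runs on its own): fixing \(\pi\) and summing the pmf over every possible outcome \(y\) does give exactly 1, because in that direction it is a genuine probability distribution.

```r
# chess_pmf as defined earlier, repeated so this snippet is self-contained
chess_pmf <- function(y, p, n = 6) {
  choose(n, y) * (p ^ y) * (1 - p)^(n - y)
}

# fix pi and sum over all possible outcomes y = 0, ..., 6
sum(chess_pmf(0:6, p = .5))
#> [1] 1
```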
+
+
diff --git a/R/ch2.qmd b/R/ch2.qmd
index c4b1d05..4701623 100644
--- a/R/ch2.qmd
+++ b/R/ch2.qmd
@@ -339,7 +339,7 @@ in the book that we will learn how to build these later on):
|--------|----|----|----|-------|
|$f(\pi)$|.10 |.25 |.65 | 1 |
-:::{.callout-caution}
+:::{.callout-tip}
## Note
it's important to note here that the sum of the values of $\pi$ **do
@@ -364,14 +364,15 @@ and has the following properties
:::
-:::{.callout-caution}
+:::{.callout-tip}
## in emanuel's words
what does this mean? well, it's very straightforward: a pmf is a function
that takes in some value y and outputs the probability that the random
variable $Y$ equals $y$.
:::
-### The Binomial Model
+next we would like to add the dependency of $Y$ on $\pi$; we do so by
+introducing the conditional pmf.
:::{.callout-note}
## Conditional probability model of data $Y$
@@ -388,8 +389,158 @@ and has the following properties,
2. $\sum_{\forall y}f(y|\pi) = 1$
:::
-:::{.callout-caution}
+:::{.callout-tip}
## in emanuel's words
this is essentially the same probability model we defined above, except
now we are conditioning probabilities on some parameter $\pi$
-:::
\ No newline at end of file
+:::
+
+in the example of the chess player we must make some assumptions:
+
+1. the chances of winning any match stay constant. So if at match 1
+the human has a .65 probability of winning, then the same probability
+holds for matches 2 through 6.
+
+2. Winning or losing a game does not affect the chances of winning
+or losing the next game, i.e., matches are independent of one another.
+
+These two assumptions lead us to the **Binomial Model**.
+
+:::{.callout-note}
+## The Binomial Model
+
+Let the random variable $Y$ represent the number of successes in $n$ trials.
+Assume that each trial is independent, and the probability of success in a
+given trial is $\pi$. Then the conditional dependence of $Y$ on $\pi$ can
+be modeled by the **Binomial Model** with parameters $n$ and $\pi$. We can
+write this as,
+
+$$Y|\pi \sim Bin(n, \pi)$$
+
+the binomial model is specified by the pmf:
+
+$$f(y|\pi) = {n \choose y} \pi^y(1 - \pi)^{n-y}$$
+:::
+
+knowing this, we can represent $Y$, the total number of matches out of 6
+that the human can win, as
+
+$$Y|\pi \sim Bin(6, \pi)$$
+
+and conditional pmf:
+
+$$f(y|\pi) = {6 \choose y}\pi^y(1 - \pi)^{6 - y}\;\; \text{for } y \in \{0, 1, 2, 3, 4, 5, 6\}$$
+
+with the pmf we can now determine the probability of the human winning $y$ matches
+out of 6 for any given value of $\pi$.
+
+```{r}
+chess_pmf <- function(y, p, n = 6) {
+ choose(n, y) * (p ^ y) * (1 - p)^(n - y)
+}
+
+# what is the probability that the human wins 5 games given a pi value of .8
+chess_pmf(y = 5, p = .8)
+
+```
+
+:::{.callout-tip}
+##
+
+the formula for the binomial is actually pretty intuitive. First you have
+the scalar ${n \choose y}$; this counts the total number of ways
+the player can win $y$ games out of the possible $n$. This is multiplied
+by the probability of $y$ successes, since $(p ^ y)$ can be
+re-written as $p\times p\times \cdots \times p$, and then by
+the probability of $n-y$ failures, $(1 - p)^{n - y}$.
+:::
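+
+As a quick sanity check, plugging $y = 5$, $n = 6$, $\pi = .8$ into the
+pmf reproduces the value returned by `chess_pmf(y = 5, p = .8)` above:
+
+$$f(5|.8) = {6 \choose 5}(.8)^5(.2)^{1} = 6 \times 0.32768 \times 0.2 = 0.393216$$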
+
+```{r}
+pies <- seq(0, 1, by = .05)
+py <- chess_pmf(y = 4, p = pies)
+
+d <- data.frame(pies = pies, py = py)
+
+d |>
+ ggplot(aes(pies, py)) + geom_col()
+```
+
+
+```{r}
+pies <- c(.2, .5, .8)
+ys <- 0:6
+
+d <- tidyr::expand_grid(pies, ys)
+fys <- purrr::map2_dbl(d$ys, d$pies, ~chess_pmf(.x, .y, n = 6))
+
+d$fys <- fys
+d$display_pi <- as.factor(paste("pi =", d$pies))
+
+d |>
+ ggplot(aes(x = ys, y = fys)) +
+ geom_col() +
+ scale_x_continuous(breaks = 0:6) +
+ facet_wrap(vars(display_pi))
+```
+
+The plot shows the three possible values for $\pi$ along
+with the value of the pmf for each possible number of
+matches the human can win. The values of $f(y|\pi)$
+are pretty intuitive: we would expect the random variable $Y$
+to be lower when the value of $\pi$ is lower and higher when
+the value of $\pi$ is higher.
+
+For the sake of the exercise let's add more values of $\pi$
+so that we can see this shift happen in more detail.
+
+```{r}
+pies <- seq(.1, .9, by = .1)
+ys <- 0:6
+
+d <- tidyr::expand_grid(pies, ys)
+fys <- purrr::map2_dbl(d$ys, d$pies, ~chess_pmf(.x, .y, n = 6))
+
+d$fys <- fys
+d$display_pi <- as.factor(paste("pi =", d$pies))
+
+d |>
+ ggplot(aes(x = ys, y = fys)) +
+ geom_col() +
+ scale_x_continuous(breaks = 0:6) +
+ facet_wrap(vars(display_pi), nrow = 3)
+```
+
+as it turns out, the human ended up winning just
+one game in the 1997 rematch, so $Y = 1$. The next step in our
+analysis is to determine how compatible this new data is with
+each value of $\pi$; that is, the likelihood.
+
+This is very easy to do with all the work we have done so far:
+
+```{r}
+d |>
+ filter(ys == 1) |>
+ ggplot(aes(pies, fys)) +
+ geom_col() +
+ scale_x_continuous(breaks = seq(.1, .9, by = .1))
+```
+
+It's very important to note the following
+
+```{r}
+# this will sum to a value greater than 1!!
+d |>
+ filter(ys == 1) |>
+ pull(fys) |>
+ sum()
+```
+
+:::{.callout-important icon="true"}
+this has been mentioned before but it's an important message
+to drive home. The reason these values sum to a
+value greater than 1 is that they are **not** probabilities; they
+are likelihoods. We are determining how likely each value of
+$\pi$ is given that we have observed $Y = 1$.
+:::
+
+
diff --git a/R/ch2_files/figure-html/unnamed-chunk-11-1.png b/R/ch2_files/figure-html/unnamed-chunk-11-1.png
index d6b4e1a..7057949 100644
Binary files a/R/ch2_files/figure-html/unnamed-chunk-11-1.png and b/R/ch2_files/figure-html/unnamed-chunk-11-1.png differ
diff --git a/R/ch2_files/figure-html/unnamed-chunk-14-1.png b/R/ch2_files/figure-html/unnamed-chunk-14-1.png
new file mode 100644
index 0000000..847f110
Binary files /dev/null and b/R/ch2_files/figure-html/unnamed-chunk-14-1.png differ
diff --git a/R/ch2_files/figure-html/unnamed-chunk-15-1.png b/R/ch2_files/figure-html/unnamed-chunk-15-1.png
new file mode 100644
index 0000000..0dc24a7
Binary files /dev/null and b/R/ch2_files/figure-html/unnamed-chunk-15-1.png differ
diff --git a/R/ch2_files/figure-html/unnamed-chunk-16-1.png b/R/ch2_files/figure-html/unnamed-chunk-16-1.png
new file mode 100644
index 0000000..0f99fa6
Binary files /dev/null and b/R/ch2_files/figure-html/unnamed-chunk-16-1.png differ
diff --git a/R/ch2_files/figure-html/unnamed-chunk-17-1.png b/R/ch2_files/figure-html/unnamed-chunk-17-1.png
new file mode 100644
index 0000000..66a3b69
Binary files /dev/null and b/R/ch2_files/figure-html/unnamed-chunk-17-1.png differ
diff --git a/R/ch2_files/figure-html/unnamed-chunk-6-1.png b/R/ch2_files/figure-html/unnamed-chunk-6-1.png
index 303bf1d..d2ed0b1 100644
Binary files a/R/ch2_files/figure-html/unnamed-chunk-6-1.png and b/R/ch2_files/figure-html/unnamed-chunk-6-1.png differ