start qmd doc for this:

2022-09-03 23:50:30 -07:00
parent 601bfc9411
commit f9340fd7aa
14 changed files with 3039 additions and 6 deletions
--- a/R/ch2.qmd
+++ b/R/ch2.qmd
@@ -0,0 +1,48 @@
+---
+title: "Chapter 2 Notes"
+author: "Emanuel Rodriguez"
+format:
+    html:
+        mainfont: arial
+        monofont: "Cascadia Mono"
+        highlight-style: ayu-dark
+---
+
+In this chapter we step through an example of "fake" vs "real" news to build
+a framework for estimating the probability that a new article, titled
+"The President has a secret!", is real or fake.
+
+```{r}
+#| message: false
+#| warning: false
+# libraries
+library(bayesrules)
+library(dplyr)
+data(fake_news)
+fake_news <- tibble::as_tibble(fake_news)
+```
+
+What proportion of the news articles were labeled fake vs real?
+
+```{r}
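+# inspect the column names and types
+fake_news |> glimpse()
+
+# count articles by type and compute each type's share of all articles
+fake_news |>
+    group_by(type) |>
+    summarise(
+        total = n(),
+        prop = total / nrow(fake_news)
+    )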
+```
+
+If we let $B$ be the event that a news article is "fake" news, and
+$B^c$ be the event that a news article is "real", we can write the following:
+
+$$P(B) = 0.4$$
+$$P(B^c) = 0.6$$
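+
+As a quick check, these values should match the `prop` column of the summary
+above. Assuming the table reports 60 fake and 90 real articles out of 150
+(counts not shown in these notes), the arithmetic is:
+
+$$P(B) = \frac{60}{150} = 0.4, \qquad P(B^c) = 1 - P(B) = \frac{90}{150} = 0.6$$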
+
+This is the first "clue", or set of data, that we build into our framework.
+Namely, the majority of articles are "real", so we could simply predict that
+the new article is "real". This updated sense of reality becomes our prior.
+
+Next, we gather additional data and use it to update our prior.
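+
+A sketch of what that update could look like, using Bayes' rule. Here $A$ is a
+hypothetical event standing for the additional data; it is not defined in the
+notes so far:
+
+$$P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A \mid B)\,P(B) + P(A \mid B^c)\,P(B^c)}$$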