---
title: "Chapter 2 Notes"
author: "Emanuel Rodriguez"
format:
  html:
    mainfont: arial
    monofont: "Cascadia Mono"
    highlight-style: ayu-dark
---

In this chapter we step through an example of "fake" vs "real" news to build a framework for determining the probability that a new news article titled "The President has a secret!" is real or fake.

```{r}
#| message: false
#| warning: false
# libraries
library(bayesrules)
library(dplyr)

# load the fake_news data set shipped with bayesrules
data(fake_news)
fake_news <- tibble::as_tibble(fake_news)
```

What proportion of news articles were labeled fake vs real?

```{r}
fake_news |> glimpse()

# count of articles and share of the total for each type
fake_news |>
  group_by(type) |>
  summarise(
    total = n(),
    prop = total / nrow(fake_news)
  )
```

If we let $B$ be the event that a news article is "fake" news, and $B^c$ be the event that a news article is "real", we can write the following:

$$P(B) = .4$$
$$P(B^c) = .6$$

This is the first "clue" or set of data that we have to build into our framework. Namely, the majority of articles are "real", so we could simply predict that the new article is "real". This updated sense of reality now becomes our prior. As we get additional data, we update that prior based on what the new data tells us.
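
As a minimal sketch of the prior-only prediction described above, the two probabilities can be stored in a named vector and the more probable type picked out. The names `prior` and `prior_guess` are illustrative assumptions, not part of the chapter, and the values are simply $P(B)$ and $P(B^c)$ as stated above.

```{r}
# Prior probabilities for the two complementary events
# (`prior` and `prior_guess` are hypothetical names used for illustration)
prior <- c(fake = 0.4, real = 0.6)

# B and B^c are complements, so their probabilities must sum to 1
stopifnot(isTRUE(all.equal(sum(prior), 1)))

# With no other information, predict the type with the larger prior probability
prior_guess <- names(prior)[which.max(prior)]
prior_guess
```

Here `prior_guess` evaluates to `"real"`, which matches the reasoning that, before seeing anything about the article itself, "real" is the safer bet.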