bayes-rules-notes/R/ch2.qmd
2022-09-03 23:50:30 -07:00


---
title: "Chapter 2 Notes"
author: "Emanuel Rodriguez"
format:
  html:
    mainfont: arial
    monofont: "Cascadia Mono"
    highlight-style: ayu-dark
---
In this chapter we step through an example of "fake" vs "real" news to build
a framework for determining the probability that a new news article,
titled "The President has a secret!", is real or fake.
```{r}
#| message: false
#| warning: false
# libraries
library(bayesrules)
library(dplyr)
data(fake_news)
fake_news <- tibble::as_tibble(fake_news)
```
What proportion of news articles were labeled fake vs real?
```{r}
# inspect the structure of the data
fake_news |> glimpse()
# count articles by type and convert the counts to proportions
fake_news |>
  group_by(type) |>
  summarise(
    total = n(),
    prop = total / nrow(fake_news)
  )
```
If we let $B$ be the event that a news article is "fake" news, and
$B^c$ be the event that a news article is "real", we can write the following:
$$P(B) = 0.4$$
$$P(B^c) = 0.6$$
This is the first "clue", or set of data, that we build into our framework.
Namely, the majority of articles are "real", so we could simply predict that
the new article is "real". This updated sense of reality now becomes our prior.
As we get additional data, we update our prior based on it.
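The prior-only prediction described above can be sketched in a few lines. This is a minimal illustration that hardcodes the proportions 0.4 and 0.6 from the text rather than recomputing them from `fake_news`; the names `prior` and `prior_prediction` are made up for this example.

```{r}
# Prior probabilities, copied from P(B) and P(B^c) above
# (`prior` and `prior_prediction` are illustrative names only)
prior <- c(fake = 0.4, real = 0.6)

# A valid prior over the two categories must sum to 1
stopifnot(isTRUE(all.equal(sum(prior), 1)))

# With no other information, predict the category with the larger prior
prior_prediction <- names(prior)[which.max(prior)]
prior_prediction
```

Because "real" has the larger prior probability, this rule labels every new article "real" regardless of its title, which is why additional data is needed to do better.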