start qmd doc for this:

2022-09-03 23:50:30 -07:00
parent 601bfc9411
commit f9340fd7aa
14 changed files with 3039 additions and 6 deletions

R/ch2.qmd Normal file

@@ -0,0 +1,48 @@
---
title: "Chapter 2 Notes"
author: "Emanuel Rodriguez"
format:
  html:
    mainfont: arial
    monofont: "Cascadia Mono"
    highlight-style: ayu-dark
---
In this chapter we step through an example of "fake" vs. "real" news to build
a framework for determining the probability that a new article, titled
"The President has a secret!", is real or fake.
```{r}
#| message: false
#| warning: false
# libraries
library(bayesrules)
library(dplyr)
data(fake_news)
fake_news <- tibble::as_tibble(fake_news)
```
What proportion of news articles were labeled fake vs. real?
```{r}
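# inspect the columns and their types
fake_news |> glimpse()

# count the articles of each type and their share of all articles
fake_news |>
  group_by(type) |>
  summarise(
    total = n(),
    prop = total / nrow(fake_news)
  )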
```
If we let $B$ be the event that a news article is "fake" news, and
$B^c$ be the event that a news article is "real", we can write the following:
$$P(B) = 0.4$$
$$P(B^c) = 0.6$$
This is the first "clue", or set of data, that we build into our framework.
Namely, the majority of articles are "real", so we could simply predict that
the new article is "real". This sense of reality becomes our prior.
As we get additional data, we update the prior based on what that data tells us.
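
The idea of updating a prior with additional data can be sketched in a few lines of base R. This is only a rough illustration, assuming the prior values $P(B) = 0.4$ and $P(B^c) = 0.6$ from above. The likelihood values and the helper `update_prior()` are hypothetical placeholders made up for the sketch; they are not estimated from `fake_news`.

```{r}
# Prior probabilities from the proportions above: B = "fake", B^c = "real"
prior <- c(fake = 0.4, real = 0.6)

# HYPOTHETICAL likelihoods of observing some new piece of data A under each
# event, P(A | B) and P(A | B^c); illustrative values, not from the data
likelihood <- c(fake = 0.25, real = 0.05)

# Bayes' rule: the posterior is proportional to prior * likelihood,
# normalized so that the probabilities sum to 1
# (update_prior is a hypothetical helper, not part of any package)
update_prior <- function(prior, likelihood) {
  unnormalized <- prior * likelihood[names(prior)]
  unnormalized / sum(unnormalized)
}

posterior <- update_prior(prior, likelihood)
posterior
```

With these placeholder values, data that is more typical of fake articles shifts the belief toward "fake", even though the prior alone favored "real".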