more work
This commit is contained in:
parent
c01c507087
commit
838f9e05f3
342
R/ch2.html
|
@ -103,10 +103,23 @@ code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warni
|
|||
<link rel="stylesheet" href="styles.css">
|
||||
</head>
|
||||
|
||||
<body class="fullcontent">
|
||||
<body>
|
||||
|
||||
<div id="quarto-content" class="page-columns page-rows-contents page-layout-article">
|
||||
|
||||
<div id="quarto-margin-sidebar" class="sidebar margin-sidebar">
|
||||
<nav id="TOC" role="doc-toc" class="toc-active">
|
||||
<h2 id="toc-title">Table of contents</h2>
|
||||
|
||||
<ul>
|
||||
<li><a href="#likelihood" id="toc-likelihood" class="nav-link active" data-scroll-target="#likelihood">Likelihood</a></li>
|
||||
<li><a href="#simualation" id="toc-simualation" class="nav-link" data-scroll-target="#simualation">Simulation</a></li>
|
||||
<li><a href="#binomial-model-and-the-chess-example" id="toc-binomial-model-and-the-chess-example" class="nav-link" data-scroll-target="#binomial-model-and-the-chess-example">Binomial Model and the chess example</a>
|
||||
<ul class="collapse">
|
||||
<li><a href="#the-binomial-model" id="toc-the-binomial-model" class="nav-link" data-scroll-target="#the-binomial-model">The Binomial Model</a></li>
|
||||
</ul></li>
|
||||
</ul>
|
||||
</nav>
|
||||
</div>
|
||||
<main class="content" id="quarto-document-content">
|
||||
|
||||
<header id="title-block-header" class="quarto-title-block default">
|
||||
|
@ -131,7 +144,9 @@ code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warni
|
|||
|
||||
</header>
|
||||
|
||||
<p><em>Note: these notes are a work in progress</em></p>
|
||||
<p>In this chapter we step through an example of “fake” vs “real” news to build a framework to determine the probability that a new news article titled “The President has a secret!” is real or fake.</p>
|
||||
<p>We then go on to build a probability model known as the Binomial model using the Bayesian framework.</p>
|
||||
<div class="cell">
|
||||
<div class="sourceCode cell-code" id="cb1"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="co"># libraries</span></span>
|
||||
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(bayesrules)</span>
|
||||
|
@ -145,40 +160,24 @@ code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warni
|
|||
</div>
|
||||
<p>What proportion of news articles were labeled fake vs real?</p>
|
||||
<div class="cell">
|
||||
<div class="sourceCode cell-code" id="cb2"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>fake_news <span class="sc">|></span> <span class="fu">glimpse</span>()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="sourceCode cell-code" id="cb2"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>fake_news <span class="sc">|></span> <span class="fu">head</span>()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="cell-output cell-output-stdout">
|
||||
<pre><code>Rows: 150
|
||||
Columns: 30
|
||||
$ title <chr> "Clinton's Exploited Haiti Earthquake ‘to Stea…
|
||||
$ text <chr> "0 SHARES Facebook Twitter\n\nBernard Sansaric…
|
||||
$ url <chr> "http://freedomdaily.com/former-haitian-senate…
|
||||
$ authors <chr> NA, NA, "Sierra Marlee", "Jack Shafer,Nolan D"…
|
||||
$ type <fct> fake, real, fake, real, fake, real, fake, fake…
|
||||
$ title_words <int> 17, 18, 16, 11, 9, 12, 11, 18, 10, 13, 10, 11,…
|
||||
$ text_words <int> 219, 509, 494, 268, 479, 220, 184, 500, 677, 4…
|
||||
$ title_char <int> 110, 95, 96, 60, 54, 66, 86, 104, 66, 81, 59, …
|
||||
$ text_char <int> 1444, 3016, 2881, 1674, 2813, 1351, 1128, 3112…
|
||||
$ title_caps <int> 0, 0, 1, 0, 0, 1, 0, 2, 1, 1, 0, 1, 0, 0, 0, 0…
|
||||
$ text_caps <int> 1, 1, 3, 3, 0, 0, 0, 12, 12, 1, 2, 5, 1, 1, 6,…
|
||||
$ title_caps_percent <dbl> 0.000000, 0.000000, 6.250000, 0.000000, 0.0000…
|
||||
$ text_caps_percent <dbl> 0.4566210, 0.1964637, 0.6072874, 1.1194030, 0.…
|
||||
$ title_excl <int> 0, 0, 2, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0…
|
||||
$ text_excl <int> 0, 0, 2, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 2, 0, 0…
|
||||
$ title_excl_percent <dbl> 0.0000000, 0.0000000, 2.0833333, 0.0000000, 0.…
|
||||
$ text_excl_percent <dbl> 0.00000000, 0.00000000, 0.06942034, 0.00000000…
|
||||
$ title_has_excl <lgl> FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE…
|
||||
$ anger <dbl> 4.24, 2.28, 1.18, 4.66, 0.82, 1.29, 2.56, 3.47…
|
||||
$ anticipation <dbl> 2.12, 1.71, 2.16, 1.79, 1.23, 0.43, 2.05, 1.74…
|
||||
$ disgust <dbl> 2.54, 1.90, 0.98, 1.79, 0.41, 1.72, 2.05, 1.35…
|
||||
$ fear <dbl> 3.81, 1.90, 1.57, 4.30, 0.82, 0.43, 5.13, 4.25…
|
||||
$ joy <dbl> 1.27, 1.71, 1.96, 0.36, 1.23, 0.86, 1.54, 1.35…
|
||||
$ sadness <dbl> 4.66, 1.33, 0.78, 1.79, 0.82, 0.86, 2.05, 1.93…
|
||||
$ surprise <dbl> 2.12, 1.14, 1.18, 1.79, 0.82, 0.86, 1.03, 1.35…
|
||||
$ trust <dbl> 2.97, 4.17, 3.73, 2.51, 2.46, 2.16, 5.13, 3.86…
|
||||
$ negative <dbl> 8.47, 4.74, 3.33, 6.09, 2.66, 3.02, 4.10, 4.63…
|
||||
$ positive <dbl> 3.81, 4.93, 5.49, 2.15, 4.30, 2.16, 4.10, 4.25…
|
||||
$ text_syllables <int> 395, 845, 806, 461, 761, 376, 326, 891, 1133, …
|
||||
$ text_syllables_per_word <dbl> 1.803653, 1.660118, 1.631579, 1.720149, 1.5887…</code></pre>
|
||||
<pre><code># A tibble: 6 × 30
|
||||
title text url authors type title…¹ text_…² title…³ text_…⁴ title…⁵
|
||||
<chr> <chr> <chr> <chr> <fct> <int> <int> <int> <int> <int>
|
||||
1 Clinton's E… "0 S… http… <NA> fake 17 219 110 1444 0
|
||||
2 Donald Trum… "\n\… http… <NA> real 18 509 95 3016 0
|
||||
3 Michelle Ob… "Mic… http… Sierra… fake 16 494 96 2881 1
|
||||
4 Trump hits … "“Cr… http… Jack S… real 11 268 60 1674 0
|
||||
5 Australia V… "Whe… http… Blair … fake 9 479 54 2813 0
|
||||
6 It’s “Trump… "Lik… http… View A… real 12 220 66 1351 1
|
||||
# … with 20 more variables: text_caps <int>, title_caps_percent <dbl>,
|
||||
# text_caps_percent <dbl>, title_excl <int>, text_excl <int>,
|
||||
# title_excl_percent <dbl>, text_excl_percent <dbl>, title_has_excl <lgl>,
|
||||
# anger <dbl>, anticipation <dbl>, disgust <dbl>, fear <dbl>, joy <dbl>,
|
||||
# sadness <dbl>, surprise <dbl>, trust <dbl>, negative <dbl>, positive <dbl>,
|
||||
# text_syllables <int>, text_syllables_per_word <dbl>, and abbreviated
|
||||
# variable names ¹title_words, ²text_words, ³title_char, ⁴text_char, …</code></pre>
|
||||
</div>
|
||||
<div class="sourceCode cell-code" id="cb4"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>fake_news <span class="sc">|></span></span>
|
||||
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">group_by</span>(type) <span class="sc">|></span> </span>
|
||||
|
@ -245,12 +244,12 @@ Probability and Likelihood
|
|||
<span id="cb7-9"><a href="#cb7-9" aria-hidden="true" tabindex="-1"></a> gt<span class="sc">::</span><span class="fu">cols_width</span>(<span class="fu">everything</span>() <span class="sc">~</span> <span class="fu">px</span>(<span class="dv">100</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="cell-output-display">
|
||||
|
||||
<div id="cgeetizxio" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
|
||||
<div id="ibvcfeegcr" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
|
||||
<style>html {
|
||||
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Helvetica Neue', 'Fira Sans', 'Droid Sans', Arial, sans-serif;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_table {
|
||||
#ibvcfeegcr .gt_table {
|
||||
display: table;
|
||||
border-collapse: collapse;
|
||||
margin-left: auto;
|
||||
|
@ -275,7 +274,7 @@ Probability and Likelihood
|
|||
border-left-color: #D3D3D3;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_heading {
|
||||
#ibvcfeegcr .gt_heading {
|
||||
background-color: #FFFFFF;
|
||||
text-align: center;
|
||||
border-bottom-color: #FFFFFF;
|
||||
|
@ -287,7 +286,7 @@ Probability and Likelihood
|
|||
border-right-color: #D3D3D3;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_title {
|
||||
#ibvcfeegcr .gt_title {
|
||||
color: #333333;
|
||||
font-size: 125%;
|
||||
font-weight: initial;
|
||||
|
@ -299,7 +298,7 @@ Probability and Likelihood
|
|||
border-bottom-width: 0;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_subtitle {
|
||||
#ibvcfeegcr .gt_subtitle {
|
||||
color: #333333;
|
||||
font-size: 85%;
|
||||
font-weight: initial;
|
||||
|
@ -311,13 +310,13 @@ Probability and Likelihood
|
|||
border-top-width: 0;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_bottom_border {
|
||||
#ibvcfeegcr .gt_bottom_border {
|
||||
border-bottom-style: solid;
|
||||
border-bottom-width: 2px;
|
||||
border-bottom-color: #D3D3D3;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_col_headings {
|
||||
#ibvcfeegcr .gt_col_headings {
|
||||
border-top-style: solid;
|
||||
border-top-width: 2px;
|
||||
border-top-color: #D3D3D3;
|
||||
|
@ -332,7 +331,7 @@ Probability and Likelihood
|
|||
border-right-color: #D3D3D3;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_col_heading {
|
||||
#ibvcfeegcr .gt_col_heading {
|
||||
color: #333333;
|
||||
background-color: #FFFFFF;
|
||||
font-size: 100%;
|
||||
|
@ -352,7 +351,7 @@ Probability and Likelihood
|
|||
overflow-x: hidden;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_column_spanner_outer {
|
||||
#ibvcfeegcr .gt_column_spanner_outer {
|
||||
color: #333333;
|
||||
background-color: #FFFFFF;
|
||||
font-size: 100%;
|
||||
|
@ -364,15 +363,15 @@ Probability and Likelihood
|
|||
padding-right: 4px;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_column_spanner_outer:first-child {
|
||||
#ibvcfeegcr .gt_column_spanner_outer:first-child {
|
||||
padding-left: 0;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_column_spanner_outer:last-child {
|
||||
#ibvcfeegcr .gt_column_spanner_outer:last-child {
|
||||
padding-right: 0;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_column_spanner {
|
||||
#ibvcfeegcr .gt_column_spanner {
|
||||
border-bottom-style: solid;
|
||||
border-bottom-width: 2px;
|
||||
border-bottom-color: #D3D3D3;
|
||||
|
@ -384,7 +383,7 @@ Probability and Likelihood
|
|||
width: 100%;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_group_heading {
|
||||
#ibvcfeegcr .gt_group_heading {
|
||||
padding-top: 8px;
|
||||
padding-bottom: 8px;
|
||||
padding-left: 5px;
|
||||
|
@ -409,7 +408,7 @@ Probability and Likelihood
|
|||
vertical-align: middle;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_empty_group_heading {
|
||||
#ibvcfeegcr .gt_empty_group_heading {
|
||||
padding: 0.5px;
|
||||
color: #333333;
|
||||
background-color: #FFFFFF;
|
||||
|
@ -424,15 +423,15 @@ Probability and Likelihood
|
|||
vertical-align: middle;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_from_md > :first-child {
|
||||
#ibvcfeegcr .gt_from_md > :first-child {
|
||||
margin-top: 0;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_from_md > :last-child {
|
||||
#ibvcfeegcr .gt_from_md > :last-child {
|
||||
margin-bottom: 0;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_row {
|
||||
#ibvcfeegcr .gt_row {
|
||||
padding-top: 8px;
|
||||
padding-bottom: 8px;
|
||||
padding-left: 5px;
|
||||
|
@ -451,7 +450,7 @@ Probability and Likelihood
|
|||
overflow-x: hidden;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_stub {
|
||||
#ibvcfeegcr .gt_stub {
|
||||
color: #333333;
|
||||
background-color: #FFFFFF;
|
||||
font-size: 100%;
|
||||
|
@ -464,7 +463,7 @@ Probability and Likelihood
|
|||
padding-right: 5px;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_stub_row_group {
|
||||
#ibvcfeegcr .gt_stub_row_group {
|
||||
color: #333333;
|
||||
background-color: #FFFFFF;
|
||||
font-size: 100%;
|
||||
|
@ -478,11 +477,11 @@ Probability and Likelihood
|
|||
vertical-align: top;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_row_group_first td {
|
||||
#ibvcfeegcr .gt_row_group_first td {
|
||||
border-top-width: 2px;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_summary_row {
|
||||
#ibvcfeegcr .gt_summary_row {
|
||||
color: #333333;
|
||||
background-color: #FFFFFF;
|
||||
text-transform: inherit;
|
||||
|
@ -492,16 +491,16 @@ Probability and Likelihood
|
|||
padding-right: 5px;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_first_summary_row {
|
||||
#ibvcfeegcr .gt_first_summary_row {
|
||||
border-top-style: solid;
|
||||
border-top-color: #D3D3D3;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_first_summary_row.thick {
|
||||
#ibvcfeegcr .gt_first_summary_row.thick {
|
||||
border-top-width: 2px;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_last_summary_row {
|
||||
#ibvcfeegcr .gt_last_summary_row {
|
||||
padding-top: 8px;
|
||||
padding-bottom: 8px;
|
||||
padding-left: 5px;
|
||||
|
@ -511,7 +510,7 @@ Probability and Likelihood
|
|||
border-bottom-color: #D3D3D3;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_grand_summary_row {
|
||||
#ibvcfeegcr .gt_grand_summary_row {
|
||||
color: #333333;
|
||||
background-color: #FFFFFF;
|
||||
text-transform: inherit;
|
||||
|
@ -521,7 +520,7 @@ Probability and Likelihood
|
|||
padding-right: 5px;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_first_grand_summary_row {
|
||||
#ibvcfeegcr .gt_first_grand_summary_row {
|
||||
padding-top: 8px;
|
||||
padding-bottom: 8px;
|
||||
padding-left: 5px;
|
||||
|
@ -531,11 +530,11 @@ Probability and Likelihood
|
|||
border-top-color: #D3D3D3;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_striped {
|
||||
#ibvcfeegcr .gt_striped {
|
||||
background-color: rgba(128, 128, 128, 0.05);
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_table_body {
|
||||
#ibvcfeegcr .gt_table_body {
|
||||
border-top-style: solid;
|
||||
border-top-width: 2px;
|
||||
border-top-color: #D3D3D3;
|
||||
|
@ -544,7 +543,7 @@ Probability and Likelihood
|
|||
border-bottom-color: #D3D3D3;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_footnotes {
|
||||
#ibvcfeegcr .gt_footnotes {
|
||||
color: #333333;
|
||||
background-color: #FFFFFF;
|
||||
border-bottom-style: none;
|
||||
|
@ -558,7 +557,7 @@ Probability and Likelihood
|
|||
border-right-color: #D3D3D3;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_footnote {
|
||||
#ibvcfeegcr .gt_footnote {
|
||||
margin: 0px;
|
||||
font-size: 90%;
|
||||
padding-left: 4px;
|
||||
|
@ -567,7 +566,7 @@ Probability and Likelihood
|
|||
padding-right: 5px;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_sourcenotes {
|
||||
#ibvcfeegcr .gt_sourcenotes {
|
||||
color: #333333;
|
||||
background-color: #FFFFFF;
|
||||
border-bottom-style: none;
|
||||
|
@ -581,7 +580,7 @@ Probability and Likelihood
|
|||
border-right-color: #D3D3D3;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_sourcenote {
|
||||
#ibvcfeegcr .gt_sourcenote {
|
||||
font-size: 90%;
|
||||
padding-top: 4px;
|
||||
padding-bottom: 4px;
|
||||
|
@ -589,64 +588,64 @@ Probability and Likelihood
|
|||
padding-right: 5px;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_left {
|
||||
#ibvcfeegcr .gt_left {
|
||||
text-align: left;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_center {
|
||||
#ibvcfeegcr .gt_center {
|
||||
text-align: center;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_right {
|
||||
#ibvcfeegcr .gt_right {
|
||||
text-align: right;
|
||||
font-variant-numeric: tabular-nums;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_font_normal {
|
||||
#ibvcfeegcr .gt_font_normal {
|
||||
font-weight: normal;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_font_bold {
|
||||
#ibvcfeegcr .gt_font_bold {
|
||||
font-weight: bold;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_font_italic {
|
||||
#ibvcfeegcr .gt_font_italic {
|
||||
font-style: italic;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_super {
|
||||
#ibvcfeegcr .gt_super {
|
||||
font-size: 65%;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_footnote_marks {
|
||||
#ibvcfeegcr .gt_footnote_marks {
|
||||
font-style: italic;
|
||||
font-weight: normal;
|
||||
font-size: 75%;
|
||||
vertical-align: 0.4em;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_asterisk {
|
||||
#ibvcfeegcr .gt_asterisk {
|
||||
font-size: 100%;
|
||||
vertical-align: 0;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_indent_1 {
|
||||
#ibvcfeegcr .gt_indent_1 {
|
||||
text-indent: 5px;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_indent_2 {
|
||||
#ibvcfeegcr .gt_indent_2 {
|
||||
text-indent: 10px;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_indent_3 {
|
||||
#ibvcfeegcr .gt_indent_3 {
|
||||
text-indent: 15px;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_indent_4 {
|
||||
#ibvcfeegcr .gt_indent_4 {
|
||||
text-indent: 20px;
|
||||
}
|
||||
|
||||
#cgeetizxio .gt_indent_5 {
|
||||
#ibvcfeegcr .gt_indent_5 {
|
||||
text-indent: 25px;
|
||||
}
|
||||
</style>
|
||||
|
@ -855,12 +854,12 @@ Baye’s Rule
|
|||
<span id="cb10-8"><a href="#cb10-8" aria-hidden="true" tabindex="-1"></a> gt<span class="sc">::</span><span class="fu">cols_width</span>(<span class="fu">everything</span>() <span class="sc">~</span> <span class="fu">px</span>(<span class="dv">100</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
|
||||
<div class="cell-output-display">
|
||||
|
||||
<div id="riybaxjrki" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
|
||||
<div id="dpxebxbyvj" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
|
||||
<style>html {
|
||||
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Helvetica Neue', 'Fira Sans', 'Droid Sans', Arial, sans-serif;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_table {
|
||||
#dpxebxbyvj .gt_table {
|
||||
display: table;
|
||||
border-collapse: collapse;
|
||||
margin-left: auto;
|
||||
|
@ -885,7 +884,7 @@ Baye’s Rule
|
|||
border-left-color: #D3D3D3;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_heading {
|
||||
#dpxebxbyvj .gt_heading {
|
||||
background-color: #FFFFFF;
|
||||
text-align: center;
|
||||
border-bottom-color: #FFFFFF;
|
||||
|
@ -897,7 +896,7 @@ Baye’s Rule
|
|||
border-right-color: #D3D3D3;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_title {
|
||||
#dpxebxbyvj .gt_title {
|
||||
color: #333333;
|
||||
font-size: 125%;
|
||||
font-weight: initial;
|
||||
|
@ -909,7 +908,7 @@ Baye’s Rule
|
|||
border-bottom-width: 0;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_subtitle {
|
||||
#dpxebxbyvj .gt_subtitle {
|
||||
color: #333333;
|
||||
font-size: 85%;
|
||||
font-weight: initial;
|
||||
|
@ -921,13 +920,13 @@ Baye’s Rule
|
|||
border-top-width: 0;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_bottom_border {
|
||||
#dpxebxbyvj .gt_bottom_border {
|
||||
border-bottom-style: solid;
|
||||
border-bottom-width: 2px;
|
||||
border-bottom-color: #D3D3D3;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_col_headings {
|
||||
#dpxebxbyvj .gt_col_headings {
|
||||
border-top-style: solid;
|
||||
border-top-width: 2px;
|
||||
border-top-color: #D3D3D3;
|
||||
|
@ -942,7 +941,7 @@ Baye’s Rule
|
|||
border-right-color: #D3D3D3;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_col_heading {
|
||||
#dpxebxbyvj .gt_col_heading {
|
||||
color: #333333;
|
||||
background-color: #FFFFFF;
|
||||
font-size: 100%;
|
||||
|
@ -962,7 +961,7 @@ Baye’s Rule
|
|||
overflow-x: hidden;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_column_spanner_outer {
|
||||
#dpxebxbyvj .gt_column_spanner_outer {
|
||||
color: #333333;
|
||||
background-color: #FFFFFF;
|
||||
font-size: 100%;
|
||||
|
@ -974,15 +973,15 @@ Baye’s Rule
|
|||
padding-right: 4px;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_column_spanner_outer:first-child {
|
||||
#dpxebxbyvj .gt_column_spanner_outer:first-child {
|
||||
padding-left: 0;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_column_spanner_outer:last-child {
|
||||
#dpxebxbyvj .gt_column_spanner_outer:last-child {
|
||||
padding-right: 0;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_column_spanner {
|
||||
#dpxebxbyvj .gt_column_spanner {
|
||||
border-bottom-style: solid;
|
||||
border-bottom-width: 2px;
|
||||
border-bottom-color: #D3D3D3;
|
||||
|
@ -994,7 +993,7 @@ Baye’s Rule
|
|||
width: 100%;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_group_heading {
|
||||
#dpxebxbyvj .gt_group_heading {
|
||||
padding-top: 8px;
|
||||
padding-bottom: 8px;
|
||||
padding-left: 5px;
|
||||
|
@ -1019,7 +1018,7 @@ Baye’s Rule
|
|||
vertical-align: middle;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_empty_group_heading {
|
||||
#dpxebxbyvj .gt_empty_group_heading {
|
||||
padding: 0.5px;
|
||||
color: #333333;
|
||||
background-color: #FFFFFF;
|
||||
|
@ -1034,15 +1033,15 @@ Baye’s Rule
|
|||
vertical-align: middle;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_from_md > :first-child {
|
||||
#dpxebxbyvj .gt_from_md > :first-child {
|
||||
margin-top: 0;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_from_md > :last-child {
|
||||
#dpxebxbyvj .gt_from_md > :last-child {
|
||||
margin-bottom: 0;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_row {
|
||||
#dpxebxbyvj .gt_row {
|
||||
padding-top: 8px;
|
||||
padding-bottom: 8px;
|
||||
padding-left: 5px;
|
||||
|
@ -1061,7 +1060,7 @@ Baye’s Rule
|
|||
overflow-x: hidden;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_stub {
|
||||
#dpxebxbyvj .gt_stub {
|
||||
color: #333333;
|
||||
background-color: #FFFFFF;
|
||||
font-size: 100%;
|
||||
|
@ -1074,7 +1073,7 @@ Baye’s Rule
|
|||
padding-right: 5px;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_stub_row_group {
|
||||
#dpxebxbyvj .gt_stub_row_group {
|
||||
color: #333333;
|
||||
background-color: #FFFFFF;
|
||||
font-size: 100%;
|
||||
|
@ -1088,11 +1087,11 @@ Baye’s Rule
|
|||
vertical-align: top;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_row_group_first td {
|
||||
#dpxebxbyvj .gt_row_group_first td {
|
||||
border-top-width: 2px;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_summary_row {
|
||||
#dpxebxbyvj .gt_summary_row {
|
||||
color: #333333;
|
||||
background-color: #FFFFFF;
|
||||
text-transform: inherit;
|
||||
|
@ -1102,16 +1101,16 @@ Baye’s Rule
|
|||
padding-right: 5px;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_first_summary_row {
|
||||
#dpxebxbyvj .gt_first_summary_row {
|
||||
border-top-style: solid;
|
||||
border-top-color: #D3D3D3;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_first_summary_row.thick {
|
||||
#dpxebxbyvj .gt_first_summary_row.thick {
|
||||
border-top-width: 2px;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_last_summary_row {
|
||||
#dpxebxbyvj .gt_last_summary_row {
|
||||
padding-top: 8px;
|
||||
padding-bottom: 8px;
|
||||
padding-left: 5px;
|
||||
|
@ -1121,7 +1120,7 @@ Baye’s Rule
|
|||
border-bottom-color: #D3D3D3;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_grand_summary_row {
|
||||
#dpxebxbyvj .gt_grand_summary_row {
|
||||
color: #333333;
|
||||
background-color: #FFFFFF;
|
||||
text-transform: inherit;
|
||||
|
@ -1131,7 +1130,7 @@ Baye’s Rule
|
|||
padding-right: 5px;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_first_grand_summary_row {
|
||||
#dpxebxbyvj .gt_first_grand_summary_row {
|
||||
padding-top: 8px;
|
||||
padding-bottom: 8px;
|
||||
padding-left: 5px;
|
||||
|
@ -1141,11 +1140,11 @@ Baye’s Rule
|
|||
border-top-color: #D3D3D3;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_striped {
|
||||
#dpxebxbyvj .gt_striped {
|
||||
background-color: rgba(128, 128, 128, 0.05);
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_table_body {
|
||||
#dpxebxbyvj .gt_table_body {
|
||||
border-top-style: solid;
|
||||
border-top-width: 2px;
|
||||
border-top-color: #D3D3D3;
|
||||
|
@ -1154,7 +1153,7 @@ Baye’s Rule
|
|||
border-bottom-color: #D3D3D3;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_footnotes {
|
||||
#dpxebxbyvj .gt_footnotes {
|
||||
color: #333333;
|
||||
background-color: #FFFFFF;
|
||||
border-bottom-style: none;
|
||||
|
@ -1168,7 +1167,7 @@ Baye’s Rule
|
|||
border-right-color: #D3D3D3;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_footnote {
|
||||
#dpxebxbyvj .gt_footnote {
|
||||
margin: 0px;
|
||||
font-size: 90%;
|
||||
padding-left: 4px;
|
||||
|
@ -1177,7 +1176,7 @@ Baye’s Rule
|
|||
padding-right: 5px;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_sourcenotes {
|
||||
#dpxebxbyvj .gt_sourcenotes {
|
||||
color: #333333;
|
||||
background-color: #FFFFFF;
|
||||
border-bottom-style: none;
|
||||
|
@ -1191,7 +1190,7 @@ Baye’s Rule
|
|||
border-right-color: #D3D3D3;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_sourcenote {
|
||||
#dpxebxbyvj .gt_sourcenote {
|
||||
font-size: 90%;
|
||||
padding-top: 4px;
|
||||
padding-bottom: 4px;
|
||||
|
@ -1199,64 +1198,64 @@ Baye’s Rule
|
|||
padding-right: 5px;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_left {
|
||||
#dpxebxbyvj .gt_left {
|
||||
text-align: left;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_center {
|
||||
#dpxebxbyvj .gt_center {
|
||||
text-align: center;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_right {
|
||||
#dpxebxbyvj .gt_right {
|
||||
text-align: right;
|
||||
font-variant-numeric: tabular-nums;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_font_normal {
|
||||
#dpxebxbyvj .gt_font_normal {
|
||||
font-weight: normal;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_font_bold {
|
||||
#dpxebxbyvj .gt_font_bold {
|
||||
font-weight: bold;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_font_italic {
|
||||
#dpxebxbyvj .gt_font_italic {
|
||||
font-style: italic;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_super {
|
||||
#dpxebxbyvj .gt_super {
|
||||
font-size: 65%;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_footnote_marks {
|
||||
#dpxebxbyvj .gt_footnote_marks {
|
||||
font-style: italic;
|
||||
font-weight: normal;
|
||||
font-size: 75%;
|
||||
vertical-align: 0.4em;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_asterisk {
|
||||
#dpxebxbyvj .gt_asterisk {
|
||||
font-size: 100%;
|
||||
vertical-align: 0;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_indent_1 {
|
||||
#dpxebxbyvj .gt_indent_1 {
|
||||
text-indent: 5px;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_indent_2 {
|
||||
#dpxebxbyvj .gt_indent_2 {
|
||||
text-indent: 10px;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_indent_3 {
|
||||
#dpxebxbyvj .gt_indent_3 {
|
||||
text-indent: 15px;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_indent_4 {
|
||||
#dpxebxbyvj .gt_indent_4 {
|
||||
text-indent: 20px;
|
||||
}
|
||||
|
||||
#riybaxjrki .gt_indent_5 {
|
||||
#dpxebxbyvj .gt_indent_5 {
|
||||
text-indent: 25px;
|
||||
}
|
||||
</style>
|
||||
|
@ -1276,11 +1275,11 @@ Baye’s Rule
|
|||
</thead>
|
||||
<tbody class="gt_table_body">
|
||||
<tr><td class="gt_row gt_left">fake</td>
|
||||
<td class="gt_row gt_right">3941</td>
|
||||
<td class="gt_row gt_right">0.3941</td></tr>
|
||||
<td class="gt_row gt_right">4031</td>
|
||||
<td class="gt_row gt_right">0.4031</td></tr>
|
||||
<tr><td class="gt_row gt_left">real</td>
|
||||
<td class="gt_row gt_right">6059</td>
|
||||
<td class="gt_row gt_right">0.6059</td></tr>
|
||||
<td class="gt_row gt_right">5969</td>
|
||||
<td class="gt_row gt_right">0.5969</td></tr>
|
||||
</tbody>
|
||||
|
||||
|
||||
|
@ -1317,8 +1316,8 @@ Baye’s Rule
|
|||
# Groups: usage [2]
|
||||
usage fake real
|
||||
<chr> <int> <int>
|
||||
1 no 2936 5932
|
||||
2 yes 1005 127</code></pre>
|
||||
1 no 2955 5845
|
||||
2 yes 1076 124</code></pre>
|
||||
</div>
|
||||
</div>
|
||||
<div class="cell">
|
||||
|
@ -1345,8 +1344,46 @@ Baye’s Rule
|
|||
<pre><code># A tibble: 2 × 3
|
||||
type total prop
|
||||
<chr> <int> <dbl>
|
||||
1 fake 1005 0.888
|
||||
2 real 127 0.112</code></pre>
|
||||
1 fake 1076 0.897
|
||||
2 real 124 0.103</code></pre>
|
||||
</div>
|
||||
</div>
|
||||
</section>
|
||||
<section id="binomial-model-and-the-chess-example" class="level2">
|
||||
<h2 class="anchored" data-anchor-id="binomial-model-and-the-chess-example">Binomial Model and the chess example</h2>
|
||||
<p>The example used here is the case of a chess match between a human and the computer “Deep Blue”. The setup is that the two faced each other in 1996 and the human won. There is a rematch scheduled for 1997. We would like to model the number of games out of 6 that the human can win.</p>
|
||||
<p>Let <span class="math inline">\(\pi\)</span> be the probability that the human wins any one game against the computer. To simplify things greatly we assume that <span class="math inline">\(\pi\)</span> takes on only the values .2, .5, or .8. We also assume the following prior (we are told in the book that we will learn how to build these later on):</p>
|
||||
<table class="table">
|
||||
<thead>
|
||||
<tr class="header">
|
||||
<th><span class="math inline">\(\pi\)</span></th>
|
||||
<th>.2</th>
|
||||
<th>.5</th>
|
||||
<th>.8</th>
|
||||
<th>total</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr class="odd">
|
||||
<td><span class="math inline">\(f(\pi)\)</span></td>
|
||||
<td>.10</td>
|
||||
<td>.25</td>
|
||||
<td>.65</td>
|
||||
<td>1</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
<div class="callout-caution callout callout-style-default no-icon callout-captioned">
|
||||
<div class="callout-header d-flex align-content-center">
|
||||
<div class="callout-icon-container">
|
||||
<i class="callout-icon no-icon"></i>
|
||||
</div>
|
||||
<div class="callout-caption-container flex-fill">
|
||||
Note
|
||||
</div>
|
||||
</div>
|
||||
<div class="callout-body-container callout-body">
|
||||
<p>It’s important to note here that the values of <span class="math inline">\(\pi\)</span> <strong>do not</strong> add up to 1. <span class="math inline">\(\pi\)</span> represents the chances of winning any single game, so in general <span class="math inline">\(\pi\)</span> can take any value in <span class="math inline">\([0, 1]\)</span>. On the other hand, <span class="math inline">\(f\)</span> is a function that maps each value of <span class="math inline">\(\pi\)</span> to a probability; this is defined next.</p>
|
||||
</div>
|
||||
</div>
|
||||
<div class="callout-note callout callout-style-default no-icon callout-captioned">
|
||||
|
@ -1380,6 +1417,41 @@ in emanuel’s words
|
|||
<p>What does this mean? It’s very straightforward: a pmf is a function that takes in some value y and outputs the probability that the random variable <span class="math inline">\(Y\)</span> equals <span class="math inline">\(y\)</span>.</p>
|
||||
</div>
|
||||
</div>
|
||||
<section id="the-binomial-model" class="level3">
|
||||
<h3 class="anchored" data-anchor-id="the-binomial-model">The Binomial Model</h3>
|
||||
<div class="callout-note callout callout-style-default no-icon callout-captioned">
|
||||
<div class="callout-header d-flex align-content-center">
|
||||
<div class="callout-icon-container">
|
||||
<i class="callout-icon no-icon"></i>
|
||||
</div>
|
||||
<div class="callout-caption-container flex-fill">
|
||||
Conditional probability model of data <span class="math inline">\(Y\)</span>
|
||||
</div>
|
||||
</div>
|
||||
<div class="callout-body-container callout-body">
|
||||
<p>Let <span class="math inline">\(Y\)</span> be a discrete random variable that depends on some parameter <span class="math inline">\(\pi\)</span>. We define the conditional probability model of <span class="math inline">\(Y\)</span> as the conditional pmf,</p>
|
||||
<p><span class="math display">\[f(y|\pi) = P(Y = y | \pi)\]</span></p>
|
||||
<p>which has the following properties:</p>
|
||||
<ol type="1">
|
||||
<li><span class="math inline">\(0 \leq f(y|\pi) \leq 1\;\; \forall y\)</span></li>
|
||||
<li><span class="math inline">\(\sum_{\forall y}f(y|\pi) = 1\)</span></li>
|
||||
</ol>
|
||||
</div>
|
||||
</div>
|
||||
<div class="callout-caution callout callout-style-default no-icon callout-captioned">
|
||||
<div class="callout-header d-flex align-content-center">
|
||||
<div class="callout-icon-container">
|
||||
<i class="callout-icon no-icon"></i>
|
||||
</div>
|
||||
<div class="callout-caption-container flex-fill">
|
||||
in emanuel’s words
|
||||
</div>
|
||||
</div>
|
||||
<div class="callout-body-container callout-body">
|
||||
<p>This is essentially the same probability model we defined above, except now we are conditioning the probabilities on some parameter <span class="math inline">\(\pi\)</span>.</p>
|
||||
</div>
|
||||
</div>
|
||||
</section>
|
||||
</section>
|
||||
|
||||
</main>
|
||||
|
|
58
R/ch2.qmd
|
@ -11,12 +11,18 @@ format:
|
|||
css: styles.css
|
||||
callout-icon: false
|
||||
callout-appearance: simple
|
||||
toc: true
|
||||
---
|
||||
|
||||
*Note: these notes are a work in progress*
|
||||
|
||||
In this chapter we step through an example
|
||||
of "fake" vs "real" news to build a framework to determine the probability
|
||||
that a new news article titled "The President has a secret!" is real or fake.
|
||||
|
||||
We then go on to build a probability model known as the Binomial model using the
|
||||
Bayesian framework
|
||||
|
||||
```{r}
|
||||
#| message: false
|
||||
#| warning: false
|
||||
|
@ -34,7 +40,7 @@ fake_news <- tibble::as_tibble(fake_news)
|
|||
What proportion of news articles were labeled fake vs real?
|
||||
|
||||
```{r}
|
||||
fake_news |> glimpse()
|
||||
fake_news |> head()
|
||||
|
||||
fake_news |>
|
||||
group_by(type) |>
|
||||
|
@ -316,6 +322,33 @@ articles_sim |>
|
|||
)
|
||||
```
|
||||
|
||||
## Binomial Model and the chess example
|
||||
|
||||
The example used here is the case of a chess match between a human
|
||||
and the computer "Deep Blue". The setup is that the two
|
||||
faced each other in 1996 and the human won. There is a rematch
|
||||
scheduled for 1997. We would like to model the number of games
|
||||
out of 6 that the human can win.
|
||||
|
||||
Let $\pi$ be the probability that the human wins any one game against
|
||||
the computer. To simplify things greatly we assume that $\pi$ takes on
|
||||
only the values .2, .5, or .8. We also assume the following prior (we are told
|
||||
in the book that we will learn how to build these later on):
|
||||
|
||||
| $\pi$ | .2 | .5 | .8 | total |
|
||||
|--------|----|----|----|-------|
|
||||
|$f(\pi)$|.10 |.25 |.65 | 1 |
|
||||
|
||||
:::{.callout-caution}
|
||||
## Note
|
||||
|
||||
It's important to note here that the values of $\pi$ **do
|
||||
not** add up to 1. $\pi$ represents the chances of winning any single
|
||||
game, so in general $\pi$ can take any value in $[0, 1]$. On
|
||||
the other hand $f$ is a function that maps $\pi$ into a space of
|
||||
probabilities; this is defined next.
|
||||
:::
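
The prior table above can be written down as a rough base R sketch. The vector names `pi_values` and `prior` are illustrative assumptions, not names from the book. The sketch checks the point made in the note: the candidate values of $\pi$ need not sum to 1, while the prior probabilities $f(\pi)$ must.

```r
# Candidate values of pi and their prior probabilities, copied from the table
pi_values <- c(0.2, 0.5, 0.8)
prior     <- c(0.10, 0.25, 0.65)

# The pi values are just possible win probabilities; their sum is not 1
pi_sum <- sum(pi_values)

# The prior f(pi) is a pmf over those values, so it must lie in [0, 1] and sum to 1
prior_sum <- sum(prior)
prior_is_valid <- all(prior >= 0 & prior <= 1) && isTRUE(all.equal(prior_sum, 1))

# Prior expected value of pi (an extra summary, not part of the original notes)
prior_mean <- sum(pi_values * prior)
```

Comparing with `all.equal()` rather than `==` avoids spurious failures from floating-point rounding when summing decimals.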
|
||||
|
||||
|
||||
:::{.callout-note}
|
||||
## Discrete Probability Model
|
||||
|
@ -336,4 +369,27 @@ and has the following properties
|
|||
What does this mean? It's very straightforward: a pmf is a function that takes
|
||||
in some value $y$ and outputs the probability that the random variable
|
||||
$Y$ equals $y$.
|
||||
:::
|
||||
|
||||
### The Binomial Model
|
||||
|
||||
:::{.callout-note}
|
||||
## Conditional probability model of data $Y$
|
||||
|
||||
Let $Y$ be a discrete random variable that depends on some parameter
|
||||
$\pi$. We define the conditional probability model of $Y$ as the
|
||||
conditional pmf,
|
||||
|
||||
$$f(y|\pi) = P(Y = y | \pi)$$
|
||||
|
||||
which has the following properties:
|
||||
|
||||
1. $0 \leq f(y|\pi) \leq 1\;\; \forall y$
|
||||
2. $\sum_{\forall y}f(y|\pi) = 1$
|
||||
:::
|
||||
|
||||
:::{.callout-caution}
|
||||
## in emanuel's words
|
||||
This is essentially the same probability model we defined above, except
|
||||
now we are conditioning the probabilities on some parameter $\pi$.
|
||||
:::
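
As a hedged illustration of the two properties above, the sketch below assumes that $Y$, the number of games out of 6 the human wins, follows a Binomial model with 6 independent games and win probability $\pi$. That assumption comes from the section title rather than from a formula stated so far, and the object names are illustrative. It uses base R's `dbinom()` to tabulate $f(y|\pi)$ for each candidate $\pi$ and checks that every column is a valid pmf.

```r
# Assumption: Y | pi ~ Binomial(n = 6, pi), with the games treated as independent
n_games   <- 6
y         <- 0:n_games
pi_values <- c(0.2, 0.5, 0.8)

# One column per value of pi, one row per possible outcome y
f_y_given_pi <- sapply(pi_values, function(p) dbinom(y, size = n_games, prob = p))
rownames(f_y_given_pi) <- paste0("y=", y)
colnames(f_y_given_pi) <- paste0("pi=", pi_values)

# Property 1: every conditional probability lies in [0, 1]
in_unit_interval <- all(f_y_given_pi >= 0 & f_y_given_pi <= 1)

# Property 2: for each fixed pi, the probabilities over all y sum to 1
column_sums <- colSums(f_y_given_pi)
```

Each column is a separate pmf over $y$, so the sums are taken down the columns; the rows, which fix $y$ and vary $\pi$, are not expected to sum to 1.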
|
Binary file not shown.
Before Width: | Height: | Size: 16 KiB After Width: | Height: | Size: 16 KiB |
Binary file not shown.
Before Width: | Height: | Size: 12 KiB After Width: | Height: | Size: 12 KiB |