more work

Emanuel Rodriguez 2022-09-10 01:27:19 -07:00
parent c01c507087
commit 838f9e05f3
4 changed files with 264 additions and 136 deletions


@ -103,10 +103,23 @@ code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warni
<link rel="stylesheet" href="styles.css">
</head>
<body class="fullcontent">
<body>
<div id="quarto-content" class="page-columns page-rows-contents page-layout-article">
<div id="quarto-margin-sidebar" class="sidebar margin-sidebar">
<nav id="TOC" role="doc-toc" class="toc-active">
<h2 id="toc-title">Table of contents</h2>
<ul>
<li><a href="#likelihood" id="toc-likelihood" class="nav-link active" data-scroll-target="#likelihood">Likelihood</a></li>
<li><a href="#simualation" id="toc-simualation" class="nav-link" data-scroll-target="#simualation">Simulation</a></li>
<li><a href="#binomial-model-and-the-chess-example" id="toc-binomial-model-and-the-chess-example" class="nav-link" data-scroll-target="#binomial-model-and-the-chess-example">Binomial Model and the chess example</a>
<ul class="collapse">
<li><a href="#the-binomial-model" id="toc-the-binomial-model" class="nav-link" data-scroll-target="#the-binomial-model">The Binomial Model</a></li>
</ul></li>
</ul>
</nav>
</div>
<main class="content" id="quarto-document-content">
<header id="title-block-header" class="quarto-title-block default">
@ -131,7 +144,9 @@ code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warni
</header>
<p><em>Note: these notes are a work in progress</em></p>
<p>In this chapter we step through an example of “fake” vs “real” news to build a framework for determining the probability that a new news article titled “The President has a secret!” is real or fake.</p>
<p>We then go on to build a probability model known as the Binomial model using the Bayesian framework.</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb1"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="co"># libraries</span></span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(bayesrules)</span>
@ -145,40 +160,24 @@ code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warni
</div>
<p>What is the proportion of news articles that were labeled fake vs real?</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb2"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>fake_news <span class="sc">|&gt;</span> <span class="fu">glimpse</span>()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="sourceCode cell-code" id="cb2"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>fake_news <span class="sc">|&gt;</span> <span class="fu">head</span>()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stdout">
<pre><code>Rows: 150
Columns: 30
$ title &lt;chr&gt; "Clinton's Exploited Haiti Earthquake to Stea…
$ text &lt;chr&gt; "0 SHARES Facebook Twitter\n\nBernard Sansaric…
$ url &lt;chr&gt; "http://freedomdaily.com/former-haitian-senate…
$ authors &lt;chr&gt; NA, NA, "Sierra Marlee", "Jack Shafer,Nolan D"…
$ type &lt;fct&gt; fake, real, fake, real, fake, real, fake, fake…
$ title_words &lt;int&gt; 17, 18, 16, 11, 9, 12, 11, 18, 10, 13, 10, 11,…
$ text_words &lt;int&gt; 219, 509, 494, 268, 479, 220, 184, 500, 677, 4…
$ title_char &lt;int&gt; 110, 95, 96, 60, 54, 66, 86, 104, 66, 81, 59, …
$ text_char &lt;int&gt; 1444, 3016, 2881, 1674, 2813, 1351, 1128, 3112…
$ title_caps &lt;int&gt; 0, 0, 1, 0, 0, 1, 0, 2, 1, 1, 0, 1, 0, 0, 0, 0…
$ text_caps &lt;int&gt; 1, 1, 3, 3, 0, 0, 0, 12, 12, 1, 2, 5, 1, 1, 6,…
$ title_caps_percent &lt;dbl&gt; 0.000000, 0.000000, 6.250000, 0.000000, 0.0000…
$ text_caps_percent &lt;dbl&gt; 0.4566210, 0.1964637, 0.6072874, 1.1194030, 0.…
$ title_excl &lt;int&gt; 0, 0, 2, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0…
$ text_excl &lt;int&gt; 0, 0, 2, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 2, 0, 0…
$ title_excl_percent &lt;dbl&gt; 0.0000000, 0.0000000, 2.0833333, 0.0000000, 0.…
$ text_excl_percent &lt;dbl&gt; 0.00000000, 0.00000000, 0.06942034, 0.00000000…
$ title_has_excl &lt;lgl&gt; FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE…
$ anger &lt;dbl&gt; 4.24, 2.28, 1.18, 4.66, 0.82, 1.29, 2.56, 3.47…
$ anticipation &lt;dbl&gt; 2.12, 1.71, 2.16, 1.79, 1.23, 0.43, 2.05, 1.74…
$ disgust &lt;dbl&gt; 2.54, 1.90, 0.98, 1.79, 0.41, 1.72, 2.05, 1.35…
$ fear &lt;dbl&gt; 3.81, 1.90, 1.57, 4.30, 0.82, 0.43, 5.13, 4.25…
$ joy &lt;dbl&gt; 1.27, 1.71, 1.96, 0.36, 1.23, 0.86, 1.54, 1.35…
$ sadness &lt;dbl&gt; 4.66, 1.33, 0.78, 1.79, 0.82, 0.86, 2.05, 1.93…
$ surprise &lt;dbl&gt; 2.12, 1.14, 1.18, 1.79, 0.82, 0.86, 1.03, 1.35…
$ trust &lt;dbl&gt; 2.97, 4.17, 3.73, 2.51, 2.46, 2.16, 5.13, 3.86…
$ negative &lt;dbl&gt; 8.47, 4.74, 3.33, 6.09, 2.66, 3.02, 4.10, 4.63…
$ positive &lt;dbl&gt; 3.81, 4.93, 5.49, 2.15, 4.30, 2.16, 4.10, 4.25…
$ text_syllables &lt;int&gt; 395, 845, 806, 461, 761, 376, 326, 891, 1133, …
$ text_syllables_per_word &lt;dbl&gt; 1.803653, 1.660118, 1.631579, 1.720149, 1.5887…</code></pre>
<pre><code># A tibble: 6 × 30
title text url authors type title…¹ text_…² title…³ text_…⁴ title…⁵
&lt;chr&gt; &lt;chr&gt; &lt;chr&gt; &lt;chr&gt; &lt;fct&gt; &lt;int&gt; &lt;int&gt; &lt;int&gt; &lt;int&gt; &lt;int&gt;
1 Clinton's E… "0 S… http… &lt;NA&gt; fake 17 219 110 1444 0
2 Donald Trum… "\n\… http… &lt;NA&gt; real 18 509 95 3016 0
3 Michelle Ob… "Mic… http… Sierra… fake 16 494 96 2881 1
4 Trump hits … "“Cr… http… Jack S… real 11 268 60 1674 0
5 Australia V… "Whe… http… Blair … fake 9 479 54 2813 0
6 Its “Trump… "Lik… http… View A… real 12 220 66 1351 1
# … with 20 more variables: text_caps &lt;int&gt;, title_caps_percent &lt;dbl&gt;,
# text_caps_percent &lt;dbl&gt;, title_excl &lt;int&gt;, text_excl &lt;int&gt;,
# title_excl_percent &lt;dbl&gt;, text_excl_percent &lt;dbl&gt;, title_has_excl &lt;lgl&gt;,
# anger &lt;dbl&gt;, anticipation &lt;dbl&gt;, disgust &lt;dbl&gt;, fear &lt;dbl&gt;, joy &lt;dbl&gt;,
# sadness &lt;dbl&gt;, surprise &lt;dbl&gt;, trust &lt;dbl&gt;, negative &lt;dbl&gt;, positive &lt;dbl&gt;,
# text_syllables &lt;int&gt;, text_syllables_per_word &lt;dbl&gt;, and abbreviated
# variable names ¹title_words, ²text_words, ³title_char, ⁴text_char, …</code></pre>
</div>
<div class="sourceCode cell-code" id="cb4"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>fake_news <span class="sc">|&gt;</span></span>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">group_by</span>(type) <span class="sc">|&gt;</span> </span>
@ -245,12 +244,12 @@ Probability and Likelihood
<span id="cb7-9"><a href="#cb7-9" aria-hidden="true" tabindex="-1"></a> gt<span class="sc">::</span><span class="fu">cols_width</span>(<span class="fu">everything</span>() <span class="sc">~</span> <span class="fu">px</span>(<span class="dv">100</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output-display">
<div id="cgeetizxio" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
<div id="ibvcfeegcr" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
<style>html {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Helvetica Neue', 'Fira Sans', 'Droid Sans', Arial, sans-serif;
}
#cgeetizxio .gt_table {
#ibvcfeegcr .gt_table {
display: table;
border-collapse: collapse;
margin-left: auto;
@ -275,7 +274,7 @@ Probability and Likelihood
border-left-color: #D3D3D3;
}
#cgeetizxio .gt_heading {
#ibvcfeegcr .gt_heading {
background-color: #FFFFFF;
text-align: center;
border-bottom-color: #FFFFFF;
@ -287,7 +286,7 @@ Probability and Likelihood
border-right-color: #D3D3D3;
}
#cgeetizxio .gt_title {
#ibvcfeegcr .gt_title {
color: #333333;
font-size: 125%;
font-weight: initial;
@ -299,7 +298,7 @@ Probability and Likelihood
border-bottom-width: 0;
}
#cgeetizxio .gt_subtitle {
#ibvcfeegcr .gt_subtitle {
color: #333333;
font-size: 85%;
font-weight: initial;
@ -311,13 +310,13 @@ Probability and Likelihood
border-top-width: 0;
}
#cgeetizxio .gt_bottom_border {
#ibvcfeegcr .gt_bottom_border {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
#cgeetizxio .gt_col_headings {
#ibvcfeegcr .gt_col_headings {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
@ -332,7 +331,7 @@ Probability and Likelihood
border-right-color: #D3D3D3;
}
#cgeetizxio .gt_col_heading {
#ibvcfeegcr .gt_col_heading {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
@ -352,7 +351,7 @@ Probability and Likelihood
overflow-x: hidden;
}
#cgeetizxio .gt_column_spanner_outer {
#ibvcfeegcr .gt_column_spanner_outer {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
@ -364,15 +363,15 @@ Probability and Likelihood
padding-right: 4px;
}
#cgeetizxio .gt_column_spanner_outer:first-child {
#ibvcfeegcr .gt_column_spanner_outer:first-child {
padding-left: 0;
}
#cgeetizxio .gt_column_spanner_outer:last-child {
#ibvcfeegcr .gt_column_spanner_outer:last-child {
padding-right: 0;
}
#cgeetizxio .gt_column_spanner {
#ibvcfeegcr .gt_column_spanner {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
@ -384,7 +383,7 @@ Probability and Likelihood
width: 100%;
}
#cgeetizxio .gt_group_heading {
#ibvcfeegcr .gt_group_heading {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
@ -409,7 +408,7 @@ Probability and Likelihood
vertical-align: middle;
}
#cgeetizxio .gt_empty_group_heading {
#ibvcfeegcr .gt_empty_group_heading {
padding: 0.5px;
color: #333333;
background-color: #FFFFFF;
@ -424,15 +423,15 @@ Probability and Likelihood
vertical-align: middle;
}
#cgeetizxio .gt_from_md > :first-child {
#ibvcfeegcr .gt_from_md > :first-child {
margin-top: 0;
}
#cgeetizxio .gt_from_md > :last-child {
#ibvcfeegcr .gt_from_md > :last-child {
margin-bottom: 0;
}
#cgeetizxio .gt_row {
#ibvcfeegcr .gt_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
@ -451,7 +450,7 @@ Probability and Likelihood
overflow-x: hidden;
}
#cgeetizxio .gt_stub {
#ibvcfeegcr .gt_stub {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
@ -464,7 +463,7 @@ Probability and Likelihood
padding-right: 5px;
}
#cgeetizxio .gt_stub_row_group {
#ibvcfeegcr .gt_stub_row_group {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
@ -478,11 +477,11 @@ Probability and Likelihood
vertical-align: top;
}
#cgeetizxio .gt_row_group_first td {
#ibvcfeegcr .gt_row_group_first td {
border-top-width: 2px;
}
#cgeetizxio .gt_summary_row {
#ibvcfeegcr .gt_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
@ -492,16 +491,16 @@ Probability and Likelihood
padding-right: 5px;
}
#cgeetizxio .gt_first_summary_row {
#ibvcfeegcr .gt_first_summary_row {
border-top-style: solid;
border-top-color: #D3D3D3;
}
#cgeetizxio .gt_first_summary_row.thick {
#ibvcfeegcr .gt_first_summary_row.thick {
border-top-width: 2px;
}
#cgeetizxio .gt_last_summary_row {
#ibvcfeegcr .gt_last_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
@ -511,7 +510,7 @@ Probability and Likelihood
border-bottom-color: #D3D3D3;
}
#cgeetizxio .gt_grand_summary_row {
#ibvcfeegcr .gt_grand_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
@ -521,7 +520,7 @@ Probability and Likelihood
padding-right: 5px;
}
#cgeetizxio .gt_first_grand_summary_row {
#ibvcfeegcr .gt_first_grand_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
@ -531,11 +530,11 @@ Probability and Likelihood
border-top-color: #D3D3D3;
}
#cgeetizxio .gt_striped {
#ibvcfeegcr .gt_striped {
background-color: rgba(128, 128, 128, 0.05);
}
#cgeetizxio .gt_table_body {
#ibvcfeegcr .gt_table_body {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
@ -544,7 +543,7 @@ Probability and Likelihood
border-bottom-color: #D3D3D3;
}
#cgeetizxio .gt_footnotes {
#ibvcfeegcr .gt_footnotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
@ -558,7 +557,7 @@ Probability and Likelihood
border-right-color: #D3D3D3;
}
#cgeetizxio .gt_footnote {
#ibvcfeegcr .gt_footnote {
margin: 0px;
font-size: 90%;
padding-left: 4px;
@ -567,7 +566,7 @@ Probability and Likelihood
padding-right: 5px;
}
#cgeetizxio .gt_sourcenotes {
#ibvcfeegcr .gt_sourcenotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
@ -581,7 +580,7 @@ Probability and Likelihood
border-right-color: #D3D3D3;
}
#cgeetizxio .gt_sourcenote {
#ibvcfeegcr .gt_sourcenote {
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
@ -589,64 +588,64 @@ Probability and Likelihood
padding-right: 5px;
}
#cgeetizxio .gt_left {
#ibvcfeegcr .gt_left {
text-align: left;
}
#cgeetizxio .gt_center {
#ibvcfeegcr .gt_center {
text-align: center;
}
#cgeetizxio .gt_right {
#ibvcfeegcr .gt_right {
text-align: right;
font-variant-numeric: tabular-nums;
}
#cgeetizxio .gt_font_normal {
#ibvcfeegcr .gt_font_normal {
font-weight: normal;
}
#cgeetizxio .gt_font_bold {
#ibvcfeegcr .gt_font_bold {
font-weight: bold;
}
#cgeetizxio .gt_font_italic {
#ibvcfeegcr .gt_font_italic {
font-style: italic;
}
#cgeetizxio .gt_super {
#ibvcfeegcr .gt_super {
font-size: 65%;
}
#cgeetizxio .gt_footnote_marks {
#ibvcfeegcr .gt_footnote_marks {
font-style: italic;
font-weight: normal;
font-size: 75%;
vertical-align: 0.4em;
}
#cgeetizxio .gt_asterisk {
#ibvcfeegcr .gt_asterisk {
font-size: 100%;
vertical-align: 0;
}
#cgeetizxio .gt_indent_1 {
#ibvcfeegcr .gt_indent_1 {
text-indent: 5px;
}
#cgeetizxio .gt_indent_2 {
#ibvcfeegcr .gt_indent_2 {
text-indent: 10px;
}
#cgeetizxio .gt_indent_3 {
#ibvcfeegcr .gt_indent_3 {
text-indent: 15px;
}
#cgeetizxio .gt_indent_4 {
#ibvcfeegcr .gt_indent_4 {
text-indent: 20px;
}
#cgeetizxio .gt_indent_5 {
#ibvcfeegcr .gt_indent_5 {
text-indent: 25px;
}
</style>
@ -855,12 +854,12 @@ Bayes Rule
<span id="cb10-8"><a href="#cb10-8" aria-hidden="true" tabindex="-1"></a> gt<span class="sc">::</span><span class="fu">cols_width</span>(<span class="fu">everything</span>() <span class="sc">~</span> <span class="fu">px</span>(<span class="dv">100</span>))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output-display">
<div id="riybaxjrki" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
<div id="dpxebxbyvj" style="overflow-x:auto;overflow-y:auto;width:auto;height:auto;">
<style>html {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Helvetica Neue', 'Fira Sans', 'Droid Sans', Arial, sans-serif;
}
#riybaxjrki .gt_table {
#dpxebxbyvj .gt_table {
display: table;
border-collapse: collapse;
margin-left: auto;
@ -885,7 +884,7 @@ Bayes Rule
border-left-color: #D3D3D3;
}
#riybaxjrki .gt_heading {
#dpxebxbyvj .gt_heading {
background-color: #FFFFFF;
text-align: center;
border-bottom-color: #FFFFFF;
@ -897,7 +896,7 @@ Bayes Rule
border-right-color: #D3D3D3;
}
#riybaxjrki .gt_title {
#dpxebxbyvj .gt_title {
color: #333333;
font-size: 125%;
font-weight: initial;
@ -909,7 +908,7 @@ Bayes Rule
border-bottom-width: 0;
}
#riybaxjrki .gt_subtitle {
#dpxebxbyvj .gt_subtitle {
color: #333333;
font-size: 85%;
font-weight: initial;
@ -921,13 +920,13 @@ Bayes Rule
border-top-width: 0;
}
#riybaxjrki .gt_bottom_border {
#dpxebxbyvj .gt_bottom_border {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
}
#riybaxjrki .gt_col_headings {
#dpxebxbyvj .gt_col_headings {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
@ -942,7 +941,7 @@ Bayes Rule
border-right-color: #D3D3D3;
}
#riybaxjrki .gt_col_heading {
#dpxebxbyvj .gt_col_heading {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
@ -962,7 +961,7 @@ Bayes Rule
overflow-x: hidden;
}
#riybaxjrki .gt_column_spanner_outer {
#dpxebxbyvj .gt_column_spanner_outer {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
@ -974,15 +973,15 @@ Bayes Rule
padding-right: 4px;
}
#riybaxjrki .gt_column_spanner_outer:first-child {
#dpxebxbyvj .gt_column_spanner_outer:first-child {
padding-left: 0;
}
#riybaxjrki .gt_column_spanner_outer:last-child {
#dpxebxbyvj .gt_column_spanner_outer:last-child {
padding-right: 0;
}
#riybaxjrki .gt_column_spanner {
#dpxebxbyvj .gt_column_spanner {
border-bottom-style: solid;
border-bottom-width: 2px;
border-bottom-color: #D3D3D3;
@ -994,7 +993,7 @@ Bayes Rule
width: 100%;
}
#riybaxjrki .gt_group_heading {
#dpxebxbyvj .gt_group_heading {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
@ -1019,7 +1018,7 @@ Bayes Rule
vertical-align: middle;
}
#riybaxjrki .gt_empty_group_heading {
#dpxebxbyvj .gt_empty_group_heading {
padding: 0.5px;
color: #333333;
background-color: #FFFFFF;
@ -1034,15 +1033,15 @@ Bayes Rule
vertical-align: middle;
}
#riybaxjrki .gt_from_md > :first-child {
#dpxebxbyvj .gt_from_md > :first-child {
margin-top: 0;
}
#riybaxjrki .gt_from_md > :last-child {
#dpxebxbyvj .gt_from_md > :last-child {
margin-bottom: 0;
}
#riybaxjrki .gt_row {
#dpxebxbyvj .gt_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
@ -1061,7 +1060,7 @@ Bayes Rule
overflow-x: hidden;
}
#riybaxjrki .gt_stub {
#dpxebxbyvj .gt_stub {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
@ -1074,7 +1073,7 @@ Bayes Rule
padding-right: 5px;
}
#riybaxjrki .gt_stub_row_group {
#dpxebxbyvj .gt_stub_row_group {
color: #333333;
background-color: #FFFFFF;
font-size: 100%;
@ -1088,11 +1087,11 @@ Bayes Rule
vertical-align: top;
}
#riybaxjrki .gt_row_group_first td {
#dpxebxbyvj .gt_row_group_first td {
border-top-width: 2px;
}
#riybaxjrki .gt_summary_row {
#dpxebxbyvj .gt_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
@ -1102,16 +1101,16 @@ Bayes Rule
padding-right: 5px;
}
#riybaxjrki .gt_first_summary_row {
#dpxebxbyvj .gt_first_summary_row {
border-top-style: solid;
border-top-color: #D3D3D3;
}
#riybaxjrki .gt_first_summary_row.thick {
#dpxebxbyvj .gt_first_summary_row.thick {
border-top-width: 2px;
}
#riybaxjrki .gt_last_summary_row {
#dpxebxbyvj .gt_last_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
@ -1121,7 +1120,7 @@ Bayes Rule
border-bottom-color: #D3D3D3;
}
#riybaxjrki .gt_grand_summary_row {
#dpxebxbyvj .gt_grand_summary_row {
color: #333333;
background-color: #FFFFFF;
text-transform: inherit;
@ -1131,7 +1130,7 @@ Bayes Rule
padding-right: 5px;
}
#riybaxjrki .gt_first_grand_summary_row {
#dpxebxbyvj .gt_first_grand_summary_row {
padding-top: 8px;
padding-bottom: 8px;
padding-left: 5px;
@ -1141,11 +1140,11 @@ Bayes Rule
border-top-color: #D3D3D3;
}
#riybaxjrki .gt_striped {
#dpxebxbyvj .gt_striped {
background-color: rgba(128, 128, 128, 0.05);
}
#riybaxjrki .gt_table_body {
#dpxebxbyvj .gt_table_body {
border-top-style: solid;
border-top-width: 2px;
border-top-color: #D3D3D3;
@ -1154,7 +1153,7 @@ Bayes Rule
border-bottom-color: #D3D3D3;
}
#riybaxjrki .gt_footnotes {
#dpxebxbyvj .gt_footnotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
@ -1168,7 +1167,7 @@ Bayes Rule
border-right-color: #D3D3D3;
}
#riybaxjrki .gt_footnote {
#dpxebxbyvj .gt_footnote {
margin: 0px;
font-size: 90%;
padding-left: 4px;
@ -1177,7 +1176,7 @@ Bayes Rule
padding-right: 5px;
}
#riybaxjrki .gt_sourcenotes {
#dpxebxbyvj .gt_sourcenotes {
color: #333333;
background-color: #FFFFFF;
border-bottom-style: none;
@ -1191,7 +1190,7 @@ Bayes Rule
border-right-color: #D3D3D3;
}
#riybaxjrki .gt_sourcenote {
#dpxebxbyvj .gt_sourcenote {
font-size: 90%;
padding-top: 4px;
padding-bottom: 4px;
@ -1199,64 +1198,64 @@ Bayes Rule
padding-right: 5px;
}
#riybaxjrki .gt_left {
#dpxebxbyvj .gt_left {
text-align: left;
}
#riybaxjrki .gt_center {
#dpxebxbyvj .gt_center {
text-align: center;
}
#riybaxjrki .gt_right {
#dpxebxbyvj .gt_right {
text-align: right;
font-variant-numeric: tabular-nums;
}
#riybaxjrki .gt_font_normal {
#dpxebxbyvj .gt_font_normal {
font-weight: normal;
}
#riybaxjrki .gt_font_bold {
#dpxebxbyvj .gt_font_bold {
font-weight: bold;
}
#riybaxjrki .gt_font_italic {
#dpxebxbyvj .gt_font_italic {
font-style: italic;
}
#riybaxjrki .gt_super {
#dpxebxbyvj .gt_super {
font-size: 65%;
}
#riybaxjrki .gt_footnote_marks {
#dpxebxbyvj .gt_footnote_marks {
font-style: italic;
font-weight: normal;
font-size: 75%;
vertical-align: 0.4em;
}
#riybaxjrki .gt_asterisk {
#dpxebxbyvj .gt_asterisk {
font-size: 100%;
vertical-align: 0;
}
#riybaxjrki .gt_indent_1 {
#dpxebxbyvj .gt_indent_1 {
text-indent: 5px;
}
#riybaxjrki .gt_indent_2 {
#dpxebxbyvj .gt_indent_2 {
text-indent: 10px;
}
#riybaxjrki .gt_indent_3 {
#dpxebxbyvj .gt_indent_3 {
text-indent: 15px;
}
#riybaxjrki .gt_indent_4 {
#dpxebxbyvj .gt_indent_4 {
text-indent: 20px;
}
#riybaxjrki .gt_indent_5 {
#dpxebxbyvj .gt_indent_5 {
text-indent: 25px;
}
</style>
@ -1276,11 +1275,11 @@ Bayes Rule
</thead>
<tbody class="gt_table_body">
<tr><td class="gt_row gt_left">fake</td>
<td class="gt_row gt_right">3941</td>
<td class="gt_row gt_right">0.3941</td></tr>
<td class="gt_row gt_right">4031</td>
<td class="gt_row gt_right">0.4031</td></tr>
<tr><td class="gt_row gt_left">real</td>
<td class="gt_row gt_right">6059</td>
<td class="gt_row gt_right">0.6059</td></tr>
<td class="gt_row gt_right">5969</td>
<td class="gt_row gt_right">0.5969</td></tr>
</tbody>
@ -1317,8 +1316,8 @@ Bayes Rule
# Groups: usage [2]
usage fake real
&lt;chr&gt; &lt;int&gt; &lt;int&gt;
1 no 2936 5932
2 yes 1005 127</code></pre>
1 no 2955 5845
2 yes 1076 124</code></pre>
</div>
</div>
<div class="cell">
@ -1345,8 +1344,46 @@ Bayes Rule
<pre><code># A tibble: 2 × 3
type total prop
&lt;chr&gt; &lt;int&gt; &lt;dbl&gt;
1 fake 1005 0.888
2 real 127 0.112</code></pre>
1 fake 1076 0.897
2 real 124 0.103</code></pre>
</div>
</div>
</section>
<section id="binomial-model-and-the-chess-example" class="level2">
<h2 class="anchored" data-anchor-id="binomial-model-and-the-chess-example">Binomial Model and the chess example</h2>
<p>The example used here is a chess match between a human and the computer “Deep Blue”. The setup is that the two faced each other in 1996 and the human won. A rematch is scheduled for the following year, 1997. We would like to model the number of games out of 6 that the human wins.</p>
<p>Let <span class="math inline">\(\pi\)</span> be the probability that the human wins any one game against the computer. To simplify things greatly we assume that <span class="math inline">\(\pi\)</span> can only take on the values .2, .5, or .8. We also assume the following prior (we are told in the book that we will learn how to build these later on):</p>
<table class="table">
<thead>
<tr class="header">
<th><span class="math inline">\(\pi\)</span></th>
<th>.2</th>
<th>.5</th>
<th>.8</th>
<th>total</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><span class="math inline">\(f(\pi)\)</span></td>
<td>.10</td>
<td>.25</td>
<td>.65</td>
<td>1</td>
</tr>
</tbody>
</table>
<div class="callout-caution callout callout-style-default no-icon callout-captioned">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon no-icon"></i>
</div>
<div class="callout-caption-container flex-fill">
Note
</div>
</div>
<div class="callout-body-container callout-body">
<p>It’s important to note here that the values of <span class="math inline">\(\pi\)</span> <strong>do not</strong> need to add up to 1. <span class="math inline">\(\pi\)</span> represents the chance of winning any single game, so in general we would expect <span class="math inline">\(\pi\)</span> to take on any value in <span class="math inline">\([0, 1]\)</span>. On the other hand, <span class="math inline">\(f\)</span> is a function that maps each value of <span class="math inline">\(\pi\)</span> to a probability, and those probabilities do add up to 1. This is defined next.</p>
</div>
</div>
<div class="callout-note callout callout-style-default no-icon callout-captioned">
@ -1380,6 +1417,41 @@ in emanuels words
<p>What does this mean? It’s very straightforward: a pmf is a function that takes in some value <span class="math inline">\(y\)</span> and outputs the probability that the random variable <span class="math inline">\(Y\)</span> equals <span class="math inline">\(y\)</span>.</p>
</div>
</div>
<section id="the-binomial-model" class="level3">
<h3 class="anchored" data-anchor-id="the-binomial-model">The Binomial Model</h3>
<div class="callout-note callout callout-style-default no-icon callout-captioned">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon no-icon"></i>
</div>
<div class="callout-caption-container flex-fill">
Conditional probability model of data <span class="math inline">\(Y\)</span>
</div>
</div>
<div class="callout-body-container callout-body">
<p>Let <span class="math inline">\(Y\)</span> be a discrete random variable that depends on some parameter <span class="math inline">\(\pi\)</span>. We define the conditional probability model of <span class="math inline">\(Y\)</span> as the conditional pmf,</p>
<p><span class="math display">\[f(y|\pi) = P(Y = y | \pi)\]</span></p>
<p>and has the following properties,</p>
<ol type="1">
<li><span class="math inline">\(0 \leq f(y|\pi) \leq 1\;\; \forall y\)</span></li>
<li><span class="math inline">\(\sum_{\forall y}f(y|\pi) = 1\)</span></li>
</ol>
</div>
</div>
<div class="callout-caution callout callout-style-default no-icon callout-captioned">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon no-icon"></i>
</div>
<div class="callout-caption-container flex-fill">
in emanuel’s words
</div>
</div>
<div class="callout-body-container callout-body">
<p>This is essentially the same probability model we defined above, except now we are conditioning the probabilities on some parameter <span class="math inline">\(\pi\)</span>.</p>
</div>
</div>
</section>
</section>
</main>


@ -11,12 +11,18 @@ format:
css: styles.css
callout-icon: false
callout-appearance: simple
toc: true
---
*Note: these notes are a work in progress*
In this chapter we step through an example
of "fake" vs "real" news to build a framework for determining the probability
that a new news article titled "The President has a secret!" is real or fake.
We then go on to build a probability model known as the Binomial model using
the Bayesian framework.
```{r}
#| message: false
#| warning: false
@ -34,7 +40,7 @@ fake_news <- tibble::as_tibble(fake_news)
What is the proportion of news articles that were labeled fake vs real?
```{r}
fake_news |> glimpse()
fake_news |> head()
fake_news |>
group_by(type) |>
@ -316,6 +322,33 @@ articles_sim |>
)
```
## Binomial Model and the chess example
The example used here is a chess match between a human
and the computer "Deep Blue". The setup is that the two
faced each other in 1996 and the human won. A rematch is
scheduled for the following year, 1997. We would like to model the number
of games out of 6 that the human wins.
Let $\pi$ be the probability that the human wins any one game against
the computer. To simplify things greatly we assume that $\pi$ can only
take on the values .2, .5, or .8. We also assume the following prior (we
are told in the book that we will learn how to build these later on):
| $\pi$ | .2 | .5 | .8 | total |
|--------|----|----|----|-------|
|$f(\pi)$|.10 |.25 |.65 | 1 |
:::{.callout-caution}
## Note
It's important to note here that the values of $\pi$ **do not** need to
add up to 1. $\pi$ represents the chance of winning any single game, so
in general we would expect $\pi$ to take on any value in $[0, 1]$. On
the other hand, $f$ is a function that maps each value of $\pi$ to a
probability, and those probabilities do add up to 1. This is defined next.
:::
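The distinction in the note above can be checked numerically. The following is a minimal sketch that copies the prior table into a data frame; the object name `chess_prior` is made up for illustration and does not come from the book.

```{r}
# prior pmf for pi, copied from the table above
# (the name `chess_prior` is illustrative only)
chess_prior <- data.frame(
  pi = c(0.2, 0.5, 0.8),
  f  = c(0.10, 0.25, 0.65)
)

# the candidate values of pi are not required to sum to 1 ...
sum(chess_prior$pi)

# ... but the prior probabilities f(pi) are
sum(chess_prior$f)
```

Only the second sum is constrained, because $f(\pi)$ is the pmf while the values of $\pi$ are just the outcomes it is defined over.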
:::{.callout-note}
## Discrete Probability Model
@ -336,4 +369,27 @@ and has the following properties
What does this mean? It's very straightforward: a pmf is a function that takes
in some value $y$ and outputs the probability that the random variable
$Y$ equals $y$.
:::
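As a small illustration of the definition, the sketch below writes the pmf of a fair six-sided die as an R function and checks that its values lie between 0 and 1 and sum to 1. The die example and the name `die_pmf` are assumptions made purely for illustration; they are not taken from the book.

```{r}
# pmf of a fair six-sided die: f(y) = P(Y = y)
# (hypothetical example, not from the book)
die_pmf <- function(y) {
  ifelse(y %in% 1:6, 1 / 6, 0)
}

# all values that Y can take
support <- 1:6

# every probability lies between 0 and 1
all(die_pmf(support) >= 0 & die_pmf(support) <= 1)

# the probabilities sum to 1 over all possible y
isTRUE(all.equal(sum(die_pmf(support)), 1))
```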
### The Binomial Model
:::{.callout-note}
## Conditional probability model of data $Y$
Let $Y$ be a discrete random variable that depends on some parameter
$\pi$. We define the conditional probability model of $Y$ as the
conditional pmf,
$$f(y|\pi) = P(Y = y | \pi)$$
and has the following properties,
1. $0 \leq f(y|\pi) \leq 1\;\; \forall y$
2. $\sum_{\forall y}f(y|\pi) = 1$
:::
:::{.callout-caution}
## in emanuel's words
This is essentially the same probability model we defined above, except
now we are conditioning the probabilities on some parameter $\pi$.
:::
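To make the conditional pmf concrete, here is a minimal sketch for the chess example. It assumes that the 6 games are independent and share the same win probability $\pi$, so that $Y$ given $\pi$ is Binomial with 6 trials; that assumption is not stated in the notes so far. `dbinom()` is base R's Binomial pmf, and the names `pi_values` and `cond_pmf` are illustrative.

```{r}
# candidate values of pi from the prior table
pi_values <- c(0.2, 0.5, 0.8)

# possible numbers of games the human wins out of 6
y <- 0:6

# f(y | pi) for each pi: rows index y, columns index pi
# (assumes independent games with a common win probability)
cond_pmf <- sapply(pi_values, function(p) dbinom(y, size = 6, prob = p))
dimnames(cond_pmf) <- list(y = y, pi = pi_values)

# property 1: every entry lies between 0 and 1
all(cond_pmf >= 0 & cond_pmf <= 1)

# property 2: for each fixed pi, the probabilities sum to 1 over y
colSums(cond_pmf)
```

Each column is a separate pmf over $y$, one per value of $\pi$. The rows, by contrast, are not pmfs and need not sum to 1.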

Binary file not shown.


Binary file not shown.
