class: center, middle, inverse, title-slide .title[ # Estimations ] .subtitle[ ## EDP 613 ] .author[ ### Week 8 ] --- <script> function resizeIframe(obj) { obj.style.height = obj.contentWindow.document.body.scrollHeight + 'px'; } </script>
# <span style='color:#bff4ee;'>A Note About The Slides</span> Currently the equations may not show up properly in Firefox. Other browsers such as Chrome and Safari do appear to render them correctly. --- # <span style='color:#bff4ee;'>A Note About Probability</span> We're going to introduce some concepts from Chapter 8 here. --- # <span style='color:#bff4ee;'>From</span> To -- <br> <center> <b style="color:#bff4ee;">Descriptive Statistics</b><br><br><i style="color:#bff4ee;">mathematical techniques for organizing and summarizing a set of numerical data</i> </center> <br> -- <br style="line-height: 3px" /> <center>
</center> <br style="line-height: 3px" /> <center>
</center> <br style="line-height: 3px" /> <center>
</center> <br style="line-height: 3px" /> -- <br> <center> <b style="color:#f4eebf;">Inferential Statistics</b><br><br><i style="color:#f4eebf;">generalizing from a sample to a population</i> </center> --- # Terms - **Statistic** - Mathematical expression that describes some aspect of a set of scores for a sample -- - **Parameter** - Describes some aspect of a set of scores for a population --- # First a Brief Intro to Hypothesis Testing -- >- Formally - Testing an assumption about a population parameter -- >- In Better Terms - An assumption about a particular situation of the world that is testable --- # The Null Hypothesis >- Represented as `\(H_0\)` -- >- is basically what you expect to happen before you run an experiment -- >- *You have to know what the Null is!* --- # The Alternative Hypothesis >- Represented as `\(H_1\)` (or `\(H_A\)`) -- >- is basically what else could happen if what you expect doesn't occur -- >- *You don't have to know this!* --- # Tests of Statistical Significance >- *Formally* - Done to determine whether `\(H_0\)` can be rejected in favor of `\(H_1\)` -- >- *Better Explanation* - Test to figure out whether you can reasonably say that your initial assumption doesn't hold -- >- *Results* - If the outcomes of a study don't go against what you expected to happen, then you aren't finding anything new or surprising --- # Term **(Statistical) estimation** is the use of a sample statistic to estimate the value of an unknown population parameter. --- # Idea of Positive and Negative Outcomes - The Null hypothesis `\(H_0\)` typically assumes that nothing is going to happen - If `\(H_0\)` turns out to be right, then it's called a ***negative*** outcome because nothing changed. -- - If `\(H_1\)` turns out to be right, then it's called a ***positive*** outcome because what you expected to happen didn't happen. 
-- >- Experiment: Over the span of one year, a group of people with ADHD gets an experimental pill that may help them focus better than their current medication -- >>- `\(H_0\)`: Group stays the same (expected) >>- `\(H_A\)`: Group is more focused (what we want to happen) -- >- Results: After an assessment >>- if the Group doesn't show greater focus, then we have a ***negative*** outcome because that's what was expected to happen >>- if the Group shows greater focus, then we have a ***positive*** outcome because that's NOT what was expected to happen --- # Notes about `\(H_0\)` and `\(H_A\)` <center> `\(H_A\)` is typically not the only alternative explanation </center> <br> -- - What if the Group was found to be more focused? >- As a rule of thumb, don't say that `\(H_A\)` is correct unless you absolutely know there are only two outcomes (aka *binary outcomes*) >- Instead write that "we reject `\(H_0\)`" because you don't know if `\(H_A\)` is the ONLY alternative hypothesis. >>- It could also be that, in other experiments, groups are found to be less focused! -- <br> - What if nothing happened to the Group? 
>- You can say that the results are consistent with `\(H_0\)` because that's what you expected, but they don't prove it >- So you can write that "we fail to reject `\(H_0\)`" --- # Formal Table of Statistical Error Types -- .center2[ <table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;background-color: #212121 !important;"> Decision </th> <th style="text-align:left;background-color: #212121 !important;"> Null is True </th> <th style="text-align:left;background-color: #212121 !important;"> Null is False </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;width: 10em; "> Reject Null </td> <td style="text-align:left;width: 10em; "> <b style="color:#f4bfc5;">Type I Error</b><br>(aka <b><i style="color:#f4bfc5;">False Positive</i></b>) </td> <td style="text-align:left;width: 10em; "> <span style="color:#bfe0f4;">Correct Outcome</span><br>(aka <i style="color:#bfe0f4;">True Positive</i>) </td> </tr> <tr> <td style="text-align:left;width: 10em; background-color: #212121 !important;"> Fail to Reject Null </td> <td style="text-align:left;width: 10em; background-color: #212121 !important;"> <span style="color:#bfe0f4;">Correct Outcome</span><br>(aka <i style="color:#bfe0f4;">True Negative</i>) </td> <td style="text-align:left;width: 10em; background-color: #212121 !important;"> <b style="color:#f4bfc5;">Type II Error</b><br>(aka <b><i style="color:#f4bfc5;">False Negative</i></b>) </td> </tr> </tbody> </table> ] --- # Nutshell Table of Statistical Error Types -- <table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:center;background-color: #212121 !important;"> Decision </th> <th style="text-align:center;background-color: #212121 !important;"> Your first thought was right </th> <th style="text-align:center;background-color: #212121 !important;"> Your first thought was wrong </th> </tr> </thead> <tbody> <tr> <td style="text-align:center;width: 20em; "> 
</td> <td style="text-align:center;width: 30em; "> </td> <td style="text-align:center;width: 30em; "> </td> </tr> <tr> <td style="text-align:center;width: 20em; background-color: #212121 !important;"> </td> <td style="text-align:center;width: 30em; background-color: #212121 !important;"> </td> <td style="text-align:center;width: 30em; background-color: #212121 !important;"> </td> </tr> <tr> <td style="text-align:center;width: 20em; "> You changed your mind </td> <td style="text-align:center;width: 30em; "> <b style="color:#f4bfc5;">You changed your mind<br>BUT<br>the reality is you shouldn't have</b> </td> <td style="text-align:center;width: 30em; "> <span style="color:#bfe0f4;">You changed your mind<br>AND<br>in reality that was the right decision</span> </td> </tr> <tr> <td style="text-align:center;width: 20em; "> </td> <td style="text-align:center;width: 30em; "> </td> <td style="text-align:center;width: 30em; "> </td> </tr> <tr> <td style="text-align:center;width: 20em; "> <center>Results in a</center> </td> <td style="text-align:center;width: 30em; "> <center><b><i style="color:#f4bfc5;">False Positive / Type I Error</i></b></center> </td> <td style="text-align:center;width: 30em; "> <center><i style="color:#bfe0f4;">True Positive</i></center> </td> </tr> <tr> <td style="text-align:center;width: 20em; "> </td> <td style="text-align:center;width: 30em; "> </td> <td style="text-align:center;width: 30em; "> </td> </tr> <tr> <td style="text-align:center;width: 20em; "> </td> <td style="text-align:center;width: 30em; "> </td> <td style="text-align:center;width: 30em; "> </td> </tr> <tr> <td style="text-align:center;width: 20em; "> You didn't change your mind </td> <td style="text-align:center;width: 30em; "> <span style="color:#bfe0f4;">You didn't change your mind<br>AND<br>in reality that was the right decision</span> </td> <td style="text-align:center;width: 30em; "> <b style="color:#f4bfc5;">You didn't change your mind<br>BUT<br>the reality is that you should 
have</b> </td> </tr> <tr> <td style="text-align:center;width: 20em; "> </td> <td style="text-align:center;width: 30em; "> </td> <td style="text-align:center;width: 30em; "> </td> </tr> <tr> <td style="text-align:center;width: 20em; "> <center>Results in a</center> </td> <td style="text-align:center;width: 30em; "> <center><i style="color:#bfe0f4;">True Negative</i></center> </td> <td style="text-align:center;width: 30em; "> <center><b><i style="color:#f4bfc5;">False Negative / Type II Error</i></b></center> </td> </tr> </tbody> </table> --- ## Example <img src="img/type1-type2-error.svg" style="display: block; margin: auto;" /> --- # Alpha Formally -- >- rejecting `\(H_0\)` when it is true -- >- the probability of making a <b style='color:#f4bfc5;'>Type I Error</b> -- <br> In Better Terms -- >- the chance of making the wrong decision when what was initially expected to happen actually occurs -- >- Given by `\(\alpha\)` >- Ranges from 0 to 1 like all other probabilities -- <br> <center> Typically `\(\alpha = 0.05\)` but it's really context dependent </center> --- # Example <center> For airplanes </center> <br> -- .pull-left[ - if they fly people around, then when **analyzing failures** >- you may want to lower the probability of making a wrong decision >- use a **smaller** `\(\alpha\)` ] -- .pull-right[ - if they're made of paper, then when **analyzing failures** >- you might be willing to accept a higher risk of making the wrong decision >- use a **larger** `\(\alpha\)` ] --- # Beta Formally -- >- not rejecting `\(H_0\)` when `\(H_1\)` is true -- >- the probability of making a <b style='color:#f4bfc5;'>Type II Error</b> -- <br> In Better Terms -- >- the chance of making the wrong decision when something else actually occurs -- >- Given by `\(\beta\)` -- >- Ranges from 0 to 1 like all other probabilities --- # Power - `\(1-\beta\)` is called **statistical power** -- - extremely important! 
-- - Formally - the probability of NOT making a Type II error -- - In Better Terms - the chance that you can tell that an outcome is the result of something real occurring rather than pure luck! --- # Decision Making <table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;border-bottom: solid; border-bottom-width:1px; border-bottom-color: #666666;"> Reality </th> <th style="text-align:center;border-bottom: solid; border-bottom-width:1px; border-bottom-color: #666666;"> Rejected Null </th> <th style="text-align:center;border-bottom: solid; border-bottom-width:1px; border-bottom-color: #666666;"> Did Not Reject Null </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;width: 10em; "> </td> <td style="text-align:center;width: 10em; "> </td> <td style="text-align:center;width: 10em; "> </td> </tr> <tr> <td style="text-align:left;width: 10em; "> </td> <td style="text-align:center;width: 10em; "> </td> <td style="text-align:center;width: 10em; "> </td> </tr> <tr> <td style="text-align:left;width: 10em; "> </td> <td style="text-align:center;width: 10em; "> Type I Error </td> <td style="text-align:center;width: 10em; "> Correct Decision </td> </tr> <tr> <td style="text-align:left;width: 10em; "> `\(H_0\)` is true </td> <td style="text-align:center;width: 10em; "> `\(\alpha\)` </td> <td style="text-align:center;width: 10em; "> `\(1-\alpha\)` </td> </tr> <tr> <td style="text-align:left;width: 10em; "> </td> <td style="text-align:center;width: 10em; "> Chance of rejecting `\(H_0\)` when it is true /<br><span style="color:#f4eebf;"><b><i>Level of Significance</i></b></span> </td> <td style="text-align:center;width: 10em; "> <span style="color:#f4eebf;"><b><i>Level of Confidence</i></b></span> </td> </tr> <tr> <td style="text-align:left;width: 10em; "> </td> <td style="text-align:center;width: 10em; "> </td> <td style="text-align:center;width: 10em; "> </td> </tr> <tr> <td style="text-align:left;width: 10em; "> </td> <td 
style="text-align:center;width: 10em; "> </td> <td style="text-align:center;width: 10em; "> </td> </tr> <tr> <td style="text-align:left;width: 10em; "> </td> <td style="text-align:center;width: 10em; "> </td> <td style="text-align:center;width: 10em; "> </td> </tr> <tr> <td style="text-align:left;width: 10em; "> </td> <td style="text-align:center;width: 10em; "> Correct Decision </td> <td style="text-align:center;width: 10em; "> Type II Error </td> </tr> <tr> <td style="text-align:left;width: 10em; "> `\(H_0\)` is false </td> <td style="text-align:center;width: 10em; "> `\(1-\beta\)` </td> <td style="text-align:center;width: 10em; "> `\(\beta\)` </td> </tr> <tr> <td style="text-align:left;width: 10em; "> </td> <td style="text-align:center;width: 10em; "> <span style="color:#f4eebf;"><b><i>Statistical Power!</i></b></span> </td> <td style="text-align:center;width: 10em; "> <span style="color:#f4eebf;"><b><i>Rate of a Type II Error</i></b></span> /<br>Chance of failing to reject `\(H_0\)` when it is false </td> </tr> <tr> <td style="text-align:left;width: 10em; "> </td> <td style="text-align:center;width: 10em; "> </td> <td style="text-align:center;width: 10em; "> </td> </tr> </tbody> </table> --- # Decision Making | | --------|---------|--------- Null | `\(H_0 =\)` | "Forecast says it's NOT going to rain" Alternative | `\(H_1 =\)` | "Something else will happen" | | <br style="line-height: 3px" /> -- <table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;border-bottom: 0;"> <thead> <tr> <th style="text-align:left;border-bottom: solid; border-bottom-width:1px; border-bottom-color: #666666;"> Reality </th> <th style="text-align:left;border-bottom: solid; border-bottom-width:1px; border-bottom-color: #666666;"> Did not reject the forecast </th> <th style="text-align:left;border-bottom: solid; border-bottom-width:1px; border-bottom-color: #666666;"> Rejected forecast </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;width: 10em; "> Forecast was right 
</td> <td style="text-align:left;width: 10em; "> Did not take an umbrella and you're dry </td> <td style="text-align:left;width: 10em; "> Took an umbrella AND you're dry but may look silly or possibly fancy </td> </tr> <tr> <td style="text-align:left;width: 10em; border-bottom: solid; border-bottom-width:1px; border-bottom-color: #666666;"> Forecast was wrong </td> <td style="text-align:left;width: 10em; border-bottom: solid; border-bottom-width:1px; border-bottom-color: #666666;"> Did not take an umbrella AND you're wet </td> <td style="text-align:left;width: 10em; border-bottom: solid; border-bottom-width:1px; border-bottom-color: #666666;"> Took an umbrella AND you're dry </td> </tr> </tbody> <tfoot><tr><td style="padding: 0; " colspan="100%"> <span style="font-style: italic;"><small>Note: </small></span> <sup></sup> <small><i>You could have also gotten wet from snow, a flood, etc. so again <b>the alternative hypothesis generally does not imply the opposite!</b></i></small> </td></tr></tfoot> </table> <br style="line-height: 3px" /> --- # Estimation - **(Statistical) Estimation** - a sample statistic is used to estimate the value of an unknown population parameter -- - **Point estimation** - use of sample data to calculate a single value -- - **Interval estimation** - use of sample data to calculate a possible range of values -- <br> <center> <i>Selecting a sample mean</i> </center> <br> <table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;border-bottom: solid; border-bottom-width:1px; border-bottom-color: #666666;"> Classification </th> <th style="text-align:left;border-bottom: solid; border-bottom-width:1px; border-bottom-color: #666666;"> Hypothesis Testing </th> <th style="text-align:left;border-bottom: solid; border-bottom-width:1px; border-bottom-color: #666666;"> Point/Interval Estimation </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;width: 10em; "> Process 
</td> <td style="text-align:left;width: 10em; "> Determine the probability of getting that mean if the Null is true </td> <td style="text-align:left;width: 10em; "> Estimate the value of a population mean </td> </tr> <tr> <td style="text-align:left;width: 10em; "> Outcomes </td> <td style="text-align:left;width: 10em; "> Gain information about the population mean </td> <td style="text-align:left;width: 10em; "> Gain information about the population mean </td> </tr> </tbody> </table> --- # Updating Estimation for Sample Means -- - **Point estimation** - use of sample data to calculate a single **mean** value -- - Benefit - the sample mean will equal the population mean on average -- - Drawback - there is no way to tell whether a given sample mean actually equals the population mean -- - **Interval estimation** - use of sample data to calculate a possible range of **mean** values --- # The Characteristics of Hypothesis Testing and Estimation <table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;border-bottom: solid; border-bottom-width:1px; border-bottom-color: #666666;"> Question </th> <th style="text-align:left;border-bottom: solid; border-bottom-width:1px; border-bottom-color: #666666;"> Hypothesis Testing </th> <th style="text-align:left;border-bottom: solid; border-bottom-width:1px; border-bottom-color: #666666;"> Point/Interval Estimation </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;width: 10em; "> Do we know the population mean? </td> <td style="text-align:left;width: 10em; "> Yes, it's the value stated in the Null hypothesis </td> <td style="text-align:left;width: 10em; "> No, we're trying to estimate it </td> </tr> <tr> <td style="text-align:left;width: 10em; "> What is the process used to determine? 
</td> <td style="text-align:left;width: 10em; "> The chance of obtaining a sample mean </td> <td style="text-align:left;width: 10em; "> The value of a population mean </td> </tr> <tr> <td style="text-align:left;width: 10em; "> What is learned? </td> <td style="text-align:left;width: 10em; "> Whether the hypothesized population mean is likely correct </td> <td style="text-align:left;width: 10em; "> The range of values within which the population mean is probably contained </td> </tr> <tr> <td style="text-align:left;width: 10em; "> What is our decision? </td> <td style="text-align:left;width: 10em; "> To retain or reject the null hypothesis </td> <td style="text-align:left;width: 10em; "> No actual decision </td> </tr> </tbody> </table> --- # Confidence -- - **Confidence Interval** - an interval that contains an unknown parameter (e.g. `\(\mu\)`) with a certain degree of confidence -- - **Level of Confidence** - probability or likelihood that an interval estimate will contain an unknown population parameter --- # Determining the Confidence Interval 1. Calculate the standard error of the mean `$$\sigma_{\overline{Y}} =\dfrac{\sigma}{\sqrt{N}}$$` -- 2. 
Decide on a level of confidence <table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:center;border-bottom: solid; border-bottom-width:1px; border-bottom-color: #666666;"> Probability </th> <th style="text-align:center;border-bottom: solid; border-bottom-width:1px; border-bottom-color: #666666;"> `\(z\)`-score </th> </tr> </thead> <tbody> <tr> <td style="text-align:center;"> 0.90 </td> <td style="text-align:center;"> 1.645 </td> </tr> <tr> <td style="text-align:center;"> 0.95 </td> <td style="text-align:center;"> 1.96 </td> </tr> <tr> <td style="text-align:center;"> 0.99 </td> <td style="text-align:center;"> 2.576 </td> </tr> </tbody> </table> <br> <br> <center> Again, it's typical to use a 95% level of confidence, thereby making \[\alpha = 0.05\] </center> --- # Determining the Confidence Interval (continued) <ol start=3> <li> Compute the interval `\(CI = \overline{Y} \pm z\cdot\sigma_{\overline{Y}}\)` </ol> -- <ol start=4> <li> Interpret the results </ol> --- # Example IQ scores in the general healthy population are approximately normally distributed with a mean of 100 and a standard deviation of 15 (`\(100 \pm 15\)`). A sample of 100 students has a mean IQ of 103. Find the 90% confidence interval for this data. -- First, we have `\(N = 100\)`, `\(\mu=100\)`, `\(\sigma = 15\)`, and `\(\overline{Y} = 103\)`. -- 1. `$$\sigma_{\overline{Y}} = \dfrac{\sigma}{\sqrt{N}} =\dfrac{15}{\sqrt{100}} = 1.50$$` -- 2. We want a 90% confidence interval, so we choose a 90% level of confidence, which gives `\(z = 1.645\)`. `$$z\cdot \sigma_{\overline{Y}} = 1.645\cdot 1.50 = 2.47$$` --- <ol start=3> <li> So $$ 90\%\, CI = 103\pm2.47 = (100.53, 105.47) $$ </ol> -- <ol start=4> <li> We are 90% confident that the overall mean IQ is between 100.53 and 105.47. </ol> --- ## That's it. Take a break before our R session!
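---

# Example in Code

The four steps of the IQ example can be checked with a short script. This is only a minimal sketch, written in Python for convenience even though the course sessions use R. The function name `z_confidence_interval` and its parameter names are hypothetical, chosen for illustration. Like the worked example, it assumes the population standard deviation is known, so a `\(z\)`-score applies.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence):
    """Return (lower, upper) for a z-based confidence interval of the mean.

    Hypothetical helper for illustration; assumes sigma is the known
    population standard deviation.
    """
    # Step 1: standard error of the mean, sigma / sqrt(N)
    standard_error = sigma / sqrt(n)
    # Step 2: two-sided critical z-score for the chosen level of confidence
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 3: sample mean plus or minus the margin of error
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin


# Values from the IQ example: mean 103, sigma 15, N = 100, 90% confidence
lower, upper = z_confidence_interval(sample_mean=103, sigma=15, n=100, confidence=0.90)
print(f"90% CI = ({lower:.2f}, {upper:.2f})")  # prints: 90% CI = (100.53, 105.47)
```

Passing `confidence=0.95` or `confidence=0.99` instead uses the other two `\(z\)`-scores from the table (about 1.96 and 2.576), which gives wider intervals for the same data.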