class: center, middle, inverse, title-slide .title[ # The Not so Tiny t-test ] .author[ ### Week 10 ] --- <script> function resizeIframe(obj) { obj.style.height = obj.contentWindow.document.body.scrollHeight + 'px'; } </script>
# Packages needed and a Note about Icons Please load up the following packages. Remember to first install the ones you don't have. <br> <br> You may come across the following icons. The table below lists what each means. <table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:center;background-color: transparent !important;"> Icon </th> <th style="text-align:left;background-color: transparent !important;"> Description </th> </tr> </thead> <tbody> <tr> <td style="text-align:center;width: 10em; background-color: #transparent !important;"> <svg aria-hidden="true" role="img" viewbox="0 0 512 512" style="height:1em;width:1em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill: #4682b4;overflow:visible;position:relative;"><path d="M52.51 440.6l171.5-142.9V214.3L52.51 71.41C31.88 54.28 0 68.66 0 96.03v319.9C0 443.3 31.88 457.7 52.51 440.6zM308.5 440.6l192-159.1c15.25-12.87 15.25-36.37 0-49.24l-192-159.1c-20.63-17.12-52.51-2.749-52.51 24.62v319.9C256 443.3 287.9 457.7 308.5 440.6z"></path></svg> </td> <td style="text-align:left;width: 40em; background-color: #transparent !important;"> Indicates that an example continues on the following slide. </td> </tr> <tr> <td style="text-align:center;width: 10em; background-color: #transparent !important;"> <svg aria-hidden="true" role="img" viewbox="0 0 384 512" style="height:1em;width:0.75em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:#ff6347;overflow:visible;position:relative;"><path d="M384 128v255.1c0 35.35-28.65 64-64 64H64c-35.35 0-64-28.65-64-64V128c0-35.35 28.65-64 64-64H320C355.3 64 384 92.65 384 128z"></path></svg> </td> <td style="text-align:left;width: 40em; background-color: #transparent !important;"> Indicates that a section using common syntax has ended. 
</td> </tr> <tr> <td style="text-align:center;width: 10em; background-color: #transparent !important;"> <svg aria-hidden="true" role="img" viewbox="0 0 640 512" style="height:1em;width:1.25em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:#5cb85c;overflow:visible;position:relative;"><path d="M172.5 131.1C228.1 75.51 320.5 75.51 376.1 131.1C426.1 181.1 433.5 260.8 392.4 318.3L391.3 319.9C381 334.2 361 337.6 346.7 327.3C332.3 317 328.9 297 339.2 282.7L340.3 281.1C363.2 249 359.6 205.1 331.7 177.2C300.3 145.8 249.2 145.8 217.7 177.2L105.5 289.5C73.99 320.1 73.99 372 105.5 403.5C133.3 431.4 177.3 435 209.3 412.1L210.9 410.1C225.3 400.7 245.3 404 255.5 418.4C265.8 432.8 262.5 452.8 248.1 463.1L246.5 464.2C188.1 505.3 110.2 498.7 60.21 448.8C3.741 392.3 3.741 300.7 60.21 244.3L172.5 131.1zM467.5 380C411 436.5 319.5 436.5 263 380C213 330 206.5 251.2 247.6 193.7L248.7 192.1C258.1 177.8 278.1 174.4 293.3 184.7C307.7 194.1 311.1 214.1 300.8 229.3L299.7 230.9C276.8 262.1 280.4 306.9 308.3 334.8C339.7 366.2 390.8 366.2 422.3 334.8L534.5 222.5C566 191 566 139.1 534.5 108.5C506.7 80.63 462.7 76.99 430.7 99.9L429.1 101C414.7 111.3 394.7 107.1 384.5 93.58C374.2 79.2 377.5 59.21 391.9 48.94L393.5 47.82C451 6.731 529.8 13.25 579.8 63.24C636.3 119.7 636.3 211.3 579.8 267.7L467.5 380z"></path></svg> </td> <td style="text-align:left;width: 40em; background-color: #transparent !important;"> Indicates that there is an active hyperlink on the slide. 
</td> </tr> <tr> <td style="text-align:center;width: 10em; background-color: #transparent !important;"> <svg aria-hidden="true" role="img" viewbox="0 0 384 512" style="height:1em;width:0.75em;vertical-align:-0.125em;margin-left:auto;margin-right:auto;font-size:inherit;fill:#faffbd;overflow:visible;position:relative;"><path d="M384 48V512l-192-112L0 512V48C0 21.5 21.5 0 48 0h288C362.5 0 384 21.5 384 48z"></path></svg> </td> <td style="text-align:left;width: 40em; background-color: #transparent !important;"> Indicates that a section covering a concept has ended. </td> </tr> </tbody> </table> --- # Comparing the Means Between Groups of Things The `\(t\)`-test is: -- - One of the most common tests in statistics -- - Used to determine whether the means of two groups are equal --- # Ideas -- > **One-sample *t*-tests**: Compare the sample mean with a known value, when the variance of the population is unknown -- > **Two-sample *t*-tests**: Compare the means of two groups under the assumption that both samples are random, independent, and normally distributed with unknown but equal variances -- > **Paired *t*-tests**: Compare the means of two sets of paired samples, taken from two populations with unknown variance --- ## Packages Please load up the following packages ```r library(tidyverse) library(patchwork) ``` --- # The Base R `t.test` command ```r t.test(x, y = NULL, alternative = c("two.sided", "less", "greater"), mu = 0, paired = FALSE, var.equal = FALSE, conf.level = 0.95) ``` <table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:center;"> Option </th> <th style="text-align:left;"> Function </th> </tr> </thead> <tbody> <tr> <td style="text-align:center;width: 10em; font-family: monospace;color: #b2dfdb !important;"> x </td> <td style="text-align:left;width: 40em; "> a numeric vector from a data set </td> </tr> <tr> <td style="text-align:center;width: 10em; font-family: monospace;color: #b2dfdb 
!important;"> y </td> <td style="text-align:left;width: 40em; "> an optional numeric vector from a data set </td> </tr>
<tr> <td style="text-align:center;width: 10em; font-family: monospace;color: #b2dfdb !important;"> mu </td> <td style="text-align:left;width: 40em; "> the hypothesized value of the mean (or of the difference in means for a two-sample test) </td> </tr>
<tr> <td style="text-align:center;width: 10em; font-family: monospace;color: #b2dfdb !important;"> alternative </td> <td style="text-align:left;width: 40em; "> the alternative hypothesis to test: two-sided (the default), less, or greater </td> </tr>
<tr> <td style="text-align:center;width: 10em; font-family: monospace;color: #b2dfdb !important;"> paired </td> <td style="text-align:left;width: 40em; "> whether to perform a paired <i>t</i>-test </td> </tr>
<tr> <td style="text-align:center;width: 10em; font-family: monospace;color: #b2dfdb !important;"> var.equal </td> <td style="text-align:left;width: 40em; "> indicates whether or not to assume equal variances when performing a two-sample <i>t</i>-test </td> </tr>
<tr> <td style="text-align:center;width: 10em; font-family: monospace;color: #b2dfdb !important;"> conf.level </td> <td style="text-align:left;width: 40em; "> the confidence level of the reported confidence interval </td> </tr>
</tbody> </table>

---

### Notes

- The `var.equal` argument defaults to <span style='color: #64b5f6;'>FALSE</span>, which assumes unequal variances and applies the Welch approximation to the degrees of freedom.
  - To assume equal variances instead, change the setting to <span style='color: #64b5f6;'>TRUE</span>

--

- The `conf.level` argument defaults to 0.95 (95%), which corresponds to `\(\alpha = 0.05\)`.
  - The reported confidence interval is for
      - `\(\mu\)` in the one-sample *t*-test
      - `\(\mu_1-\mu_2\)` in the two-sample *t*-test.

---

### Be Aware!

> The `wilcox.test` function provides the same basic functionality and arguments

--

<br>
<br>

> However, it is used when we ***do not want to assume that the data follow a normal distribution***

--

<br>
<br>

> We're assuming normality

--

<br>
<br>

> So please ignore it for now!

---

# Assumptions

--

.pull-left[**Random sampling**]
.pull-right[*Data is derived from random sampling*]

--

<br>
<br>
<br>
<br>

.pull-left[**Independent observations**]
.pull-right[*Observations are independent of one another*]

--

<br>
<br>
<br>
<br>

.pull-left[**Normality**]
.pull-right[*Observations are from a normally distributed population*]

--

<br>
<br>
<br>
<br>

.pull-left[**Homogeneity**]
.pull-right[*If more than one population is sampled from, then the populations have equal variances (aka **homogeneity of variances**)*]

---

## One- or Two-sample *t*-tests

--

- If `y` is
  - excluded, `t.test` will run as a one-sample *t*-test
  - included, `t.test` will run as a two-sample *t*-test
- By default, the `t.test` command runs a two-sided *t*-test
- You can run a one-sided *t*-test by changing the `alternative` option to `"greater"` or `"less"`

---

#### Example

`t.test(x, alternative = "greater", mu = 47)` performs a one-sample *t*-test on the data contained in `x` where

`$$H_0: \mu = 47$$`

`$$H_1: \mu > 47$$`

---

## Example

```r
midwest %>%
  head()
```

```
## # A tibble: 6 × 28
##     PID county    state  area poptotal popdens…¹ popwh…² popbl…³ popam…⁴ popas…⁵
##   <int> <chr>     <chr> <dbl>    <int>     <dbl>   <int>   <int>   <int>   <int>
## 1   561 ADAMS     IL    0.052    66090     1271.   63917    1702      98     249
## 2   562 ALEXANDER IL    0.014    10626      759     7054    3496      19      48
## 3   563 BOND      IL    0.022    14991      681.   14477     429      35      16
## 4   564 BOONE     IL    0.017    30806     1812.   29344     127      46     150
## 5   565 BROWN     IL    0.018     5836      324.    5264     547      14       5
## 6   566 BUREAU    IL    0.05     35688      714.
35157 50 65 195 ## # … with 18 more variables: popother <int>, percwhite <dbl>, percblack <dbl>, ## # percamerindan <dbl>, percasian <dbl>, percother <dbl>, popadults <int>, ## # perchsd <dbl>, percollege <dbl>, percprof <dbl>, poppovertyknown <int>, ## # percpovertyknown <dbl>, percbelowpoverty <dbl>, percchildbelowpovert <dbl>, ## # percadultpoverty <dbl>, percelderlypoverty <dbl>, inmetro <int>, ## # category <chr>, and abbreviated variable names ¹popdensity, ²popwhite, ## # ³popblack, ⁴popamerindian, ⁵popasian ``` .footnote[Please use `?midwest` for more details on the variables] --- ### Purpose We want to compare the differences between the average percent of college educated adults in Ohio versus Michigan count: false .panel1-sw1-auto[ ```r *midwest ``` ] .panel2-sw1-auto[ ``` ## # A tibble: 437 × 28 ## PID county state area poptotal popden…¹ popwh…² popbl…³ popam…⁴ popas…⁵ ## <int> <chr> <chr> <dbl> <int> <dbl> <int> <int> <int> <int> ## 1 561 ADAMS IL 0.052 66090 1271. 63917 1702 98 249 ## 2 562 ALEXANDER IL 0.014 10626 759 7054 3496 19 48 ## 3 563 BOND IL 0.022 14991 681. 14477 429 35 16 ## 4 564 BOONE IL 0.017 30806 1812. 29344 127 46 150 ## 5 565 BROWN IL 0.018 5836 324. 5264 547 14 5 ## 6 566 BUREAU IL 0.05 35688 714. 35157 50 65 195 ## 7 567 CALHOUN IL 0.017 5322 313. 5298 1 8 15 ## 8 568 CARROLL IL 0.027 16805 622. 16519 111 30 61 ## 9 569 CASS IL 0.024 13437 560. 13384 16 8 23 ## 10 570 CHAMPAIGN IL 0.058 173025 2983. 
146506 16559 331 8033 ## # … with 427 more rows, 18 more variables: popother <int>, percwhite <dbl>, ## # percblack <dbl>, percamerindan <dbl>, percasian <dbl>, percother <dbl>, ## # popadults <int>, perchsd <dbl>, percollege <dbl>, percprof <dbl>, ## # poppovertyknown <int>, percpovertyknown <dbl>, percbelowpoverty <dbl>, ## # percchildbelowpovert <dbl>, percadultpoverty <dbl>, ## # percelderlypoverty <dbl>, inmetro <int>, category <chr>, and abbreviated ## # variable names ¹popdensity, ²popwhite, ³popblack, ⁴popamerindian, … ``` ] --- count: false .panel1-sw1-auto[ ```r midwest %>% * filter(state == "OH" | state == "MI") ``` ] .panel2-sw1-auto[ ``` ## # A tibble: 171 × 28 ## PID county state area poptotal popdensity popwh…¹ popbl…² popam…³ popas…⁴ ## <int> <chr> <chr> <dbl> <int> <dbl> <int> <int> <int> <int> ## 1 1197 ALCONA MI 0.041 10145 247. 10026 27 56 26 ## 2 1198 ALGER MI 0.051 8972 176. 8422 213 304 24 ## 3 1199 ALLEGAN MI 0.049 90509 1847. 86760 1448 543 411 ## 4 1200 ALPENA MI 0.034 30605 900. 30372 35 93 85 ## 5 1201 ANTRIM MI 0.031 18185 587. 17895 23 211 24 ## 6 1202 ARENAC MI 0.021 14931 711 14695 10 139 38 ## 7 1203 BARAGA MI 0.054 7954 147. 6971 49 918 10 ## 8 1204 BARRY MI 0.034 50057 1472. 49429 104 188 144 ## 9 1205 BAY MI 0.026 111723 4297. 
107747 1242 726 428 ## 10 1206 BENZIE MI 0.02 12200 610 11863 30 237 35 ## # … with 161 more rows, 18 more variables: popother <int>, percwhite <dbl>, ## # percblack <dbl>, percamerindan <dbl>, percasian <dbl>, percother <dbl>, ## # popadults <int>, perchsd <dbl>, percollege <dbl>, percprof <dbl>, ## # poppovertyknown <int>, percpovertyknown <dbl>, percbelowpoverty <dbl>, ## # percchildbelowpovert <dbl>, percadultpoverty <dbl>, ## # percelderlypoverty <dbl>, inmetro <int>, category <chr>, and abbreviated ## # variable names ¹popwhite, ²popblack, ³popamerindian, ⁴popasian ``` ] --- count: false .panel1-sw1-auto[ ```r midwest %>% filter(state == "OH" | state == "MI") %>% * select(state, percollege) ``` ] .panel2-sw1-auto[ ``` ## # A tibble: 171 × 2 ## state percollege ## <chr> <dbl> ## 1 MI 14.1 ## 2 MI 16.3 ## 3 MI 18.1 ## 4 MI 18.9 ## 5 MI 19.0 ## 6 MI 11.8 ## 7 MI 14.6 ## 8 MI 17.3 ## 9 MI 18.2 ## 10 MI 21.4 ## # … with 161 more rows ``` ] <style> .panel1-sw1-auto { color: white; width: 49%; hight: 32%; float: top; padding-left: 1%; font-size: 80% } .panel2-sw1-auto { color: white; width: 49%; hight: 32%; float: top; padding-left: 1%; font-size: 80% } .panel3-sw1-auto { color: white; width: NA%; hight: 33%; float: top; padding-left: 1%; font-size: 80% } </style> --- ```r ohio_mi <- midwest %>% filter(state == "OH" | state == "MI") %>% select(state, percollege) ``` --- ### Descriptives -- .pull-left[ <img src="Slides-Week-10R_files/figure-html/unnamed-chunk-11-1.png" style="display: block; margin: auto;" /> ```r ohio_mi %>% filter(state == "OH") %>% summary() ``` ``` ## state percollege ## Length:88 Min. : 7.913 ## Class :character 1st Qu.:13.089 ## Mode :character Median :15.462 ## Mean :16.890 ## 3rd Qu.:18.995 ## Max. :32.205 ``` ] -- .pull-right[ <img src="Slides-Week-10R_files/figure-html/unnamed-chunk-13-1.png" style="display: block; margin: auto;" /> ```r ohio_mi %>% filter(state == "MI") %>% summary() ``` ``` ## state percollege ## Length:83 Min. 
:11.31
##  Class :character   1st Qu.:14.61
##  Mode  :character   Median :17.43
##                     Mean   :19.42
##                     3rd Qu.:21.31
##                     Max.   :48.08
```
]

--

<br>
<br>

<center> On average, Ohio counties appear to have a slightly lower percentage of college-educated adults than Michigan counties (mean 16.9 vs. 19.4), but let's test whether that difference is statistically significant </center>

---

### Boxplot

count: false

.panel1-sw2-auto[
```r
*ggplot(ohio_mi, aes(x = state,
*                    y = percollege,
*                    fill = state))
```
]

.panel2-sw2-auto[
![](Slides-Week-10R_files/figure-html/sw2_auto_01_output-1.png)<!-- -->
]

---
count: false

.panel1-sw2-auto[
```r
ggplot(ohio_mi, aes(x = state,
                    y = percollege,
                    fill = state)) +
* geom_boxplot(alpha = 0.7,
*              outlier.size = 2.5)
```
]

.panel2-sw2-auto[
![](Slides-Week-10R_files/figure-html/sw2_auto_02_output-1.png)<!-- -->
]

---
count: false

.panel1-sw2-auto[
```r
ggplot(ohio_mi, aes(x = state,
                    y = percollege,
                    fill = state)) +
  geom_boxplot(alpha = 0.7,
               outlier.size = 2.5) +
* scale_fill_manual(values = c("#00274C",
*                              "#BB0000"))
```
]

.panel2-sw2-auto[
![](Slides-Week-10R_files/figure-html/sw2_auto_03_output-1.png)<!-- -->
]

---
count: false

.panel1-sw2-auto[
```r
ggplot(ohio_mi, aes(x = state,
                    y = percollege,
                    fill = state)) +
  geom_boxplot(alpha = 0.7,
               outlier.size = 2.5) +
  scale_fill_manual(values = c("#00274C",
                               "#BB0000")) +
* theme_minimal()
```
]

.panel2-sw2-auto[
![](Slides-Week-10R_files/figure-html/sw2_auto_04_output-1.png)<!-- -->
]

---
count: false

.panel1-sw2-auto[
```r
ggplot(ohio_mi, aes(x = state,
                    y = percollege,
                    fill = state)) +
  geom_boxplot(alpha = 0.7,
               outlier.size = 2.5) +
  scale_fill_manual(values = c("#00274C",
                               "#BB0000")) +
  theme_minimal() +
* theme(panel.grid.major.x = element_blank(),
*       panel.grid.minor.x = element_blank())
```
]

.panel2-sw2-auto[
![](Slides-Week-10R_files/figure-html/sw2_auto_05_output-1.png)<!-- -->
]

<style>
.panel1-sw2-auto { color: white; width: 53.9%; hight: 32%; float: left; padding-left: 1%; font-size: 80% }
.panel2-sw2-auto { color: white; width: 44.1%; hight: 32%; float: left; padding-left: 1%; font-size: 80% }
.panel3-sw2-auto { color: white; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% } </style> --- ### Shifting Skewness count: false .panel1-sw3-auto[ ```r *ggplot(ohio_mi, aes(x = percollege)) ``` ] .panel2-sw3-auto[ ![](Slides-Week-10R_files/figure-html/sw3_auto_01_output-1.png)<!-- --> ] --- count: false .panel1-sw3-auto[ ```r ggplot(ohio_mi, aes(x = percollege)) + * geom_histogram(aes(fill = ..count..), * bins = 20) ``` ] .panel2-sw3-auto[ ![](Slides-Week-10R_files/figure-html/sw3_auto_02_output-1.png)<!-- --> ] --- count: false .panel1-sw3-auto[ ```r ggplot(ohio_mi, aes(x = percollege)) + geom_histogram(aes(fill = ..count..), bins = 20) + * scale_fill_viridis_c("Frequency") ``` ] .panel2-sw3-auto[ ![](Slides-Week-10R_files/figure-html/sw3_auto_03_output-1.png)<!-- --> ] --- count: false .panel1-sw3-auto[ ```r ggplot(ohio_mi, aes(x = percollege)) + geom_histogram(aes(fill = ..count..), bins = 20) + scale_fill_viridis_c("Frequency") + * facet_wrap(. ~ state) ``` ] .panel2-sw3-auto[ ![](Slides-Week-10R_files/figure-html/sw3_auto_04_output-1.png)<!-- --> ] --- count: false .panel1-sw3-auto[ ```r ggplot(ohio_mi, aes(x = percollege)) + geom_histogram(aes(fill = ..count..), bins = 20) + scale_fill_viridis_c("Frequency") + facet_wrap(. ~ state) + * theme_minimal() ``` ] .panel2-sw3-auto[ ![](Slides-Week-10R_files/figure-html/sw3_auto_05_output-1.png)<!-- --> ] --- count: false .panel1-sw3-auto[ ```r ggplot(ohio_mi, aes(x = percollege)) + geom_histogram(aes(fill = ..count..), bins = 20) + scale_fill_viridis_c("Frequency") + facet_wrap(. 
~ state) + theme_minimal() + * scale_x_log10() ``` ] .panel2-sw3-auto[ ![](Slides-Week-10R_files/figure-html/sw3_auto_06_output-1.png)<!-- --> ] <style> .panel1-sw3-auto { color: white; width: 53.9%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel2-sw3-auto { color: white; width: 44.1%; hight: 32%; float: left; padding-left: 1%; font-size: 80% } .panel3-sw3-auto { color: white; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% } </style> --- ### More Shifting Skewness count: false .panel1-sw4-auto[ ```r *ggplot(ohio_mi, aes(x = percollege)) ``` ] .panel2-sw4-auto[ ![](Slides-Week-10R_files/figure-html/sw4_auto_01_output-1.png)<!-- --> ] --- count: false .panel1-sw4-auto[ ```r ggplot(ohio_mi, aes(x = percollege)) + * geom_histogram(aes(fill = ..count..), * bins = 20) ``` ] .panel2-sw4-auto[ ![](Slides-Week-10R_files/figure-html/sw4_auto_02_output-1.png)<!-- --> ] --- count: false .panel1-sw4-auto[ ```r ggplot(ohio_mi, aes(x = percollege)) + geom_histogram(aes(fill = ..count..), bins = 20) + * scale_fill_viridis_c("Frequency") ``` ] .panel2-sw4-auto[ ![](Slides-Week-10R_files/figure-html/sw4_auto_03_output-1.png)<!-- --> ] --- count: false .panel1-sw4-auto[ ```r ggplot(ohio_mi, aes(x = percollege)) + geom_histogram(aes(fill = ..count..), bins = 20) + scale_fill_viridis_c("Frequency") + * facet_wrap(. ~ state, ncol = 1) ``` ] .panel2-sw4-auto[ ![](Slides-Week-10R_files/figure-html/sw4_auto_04_output-1.png)<!-- --> ] --- count: false .panel1-sw4-auto[ ```r ggplot(ohio_mi, aes(x = percollege)) + geom_histogram(aes(fill = ..count..), bins = 20) + scale_fill_viridis_c("Frequency") + facet_wrap(. ~ state, ncol = 1) + * theme_minimal() ``` ] .panel2-sw4-auto[ ![](Slides-Week-10R_files/figure-html/sw4_auto_05_output-1.png)<!-- --> ] --- count: false .panel1-sw4-auto[ ```r ggplot(ohio_mi, aes(x = percollege)) + geom_histogram(aes(fill = ..count..), bins = 20) + scale_fill_viridis_c("Frequency") + facet_wrap(. 
~ state, ncol = 1) +
  theme_minimal() +
* scale_x_log10()
```
]

.panel2-sw4-auto[
![](Slides-Week-10R_files/figure-html/sw4_auto_06_output-1.png)<!-- -->
]

<style>
.panel1-sw4-auto { color: white; width: 53.9%; hight: 32%; float: left; padding-left: 1%; font-size: 80% }
.panel2-sw4-auto { color: white; width: 44.1%; hight: 32%; float: left; padding-left: 1%; font-size: 80% }
.panel3-sw4-auto { color: white; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% }
</style>

---

--

.pull-left[
```r
regularplot <-
  ggplot(ohio_mi, aes(x = percollege)) +
  geom_histogram(aes(fill = ..count..),
                 bins = 20) +
  scale_fill_viridis_c("Frequency") +
  facet_wrap(~ state, ncol = 1) +
  theme_minimal() +
  ggtitle("Regular")
```
]

--

.pull-right[
```r
scaledplot <-
  ggplot(ohio_mi, aes(x = percollege)) +
  geom_histogram(aes(fill = ..count..),
                 bins = 20) +
  scale_fill_viridis_c("Frequency") +
  facet_wrap(~ state, ncol = 1) +
  theme_minimal() +
  ggtitle("Scaled") +
  scale_x_log10() # this line was added
```
]

---

```r
regularplot + scaledplot
```

<img src="Slides-Week-10R_files/figure-html/unnamed-chunk-17-1.png" style="display: block; margin: auto;" />

---

### Testing as is

```r
t.test(percollege ~ state, data = ohio_mi)
```

```
## 
##  Welch Two Sample t-test
## 
## data:  percollege by state
## t = 2.5953, df = 161.27, p-value = 0.01032
## alternative hypothesis: true difference in means between group MI and group OH is not equal to 0
## 95 percent confidence interval:
##  0.6051571 4.4568579
## sample estimates:
## mean in group MI mean in group OH 
##         19.42146         16.89045
```

--

<br>

> Results show a *p*-value of .0103, which is below .05, so **there is a statistically significant difference between the two means**

--

<br>

> This supports the alternative hypothesis that there is a difference between the average percentage of college-educated adults in Ohio versus Michigan

---

### Testing using a `log` function

```r
t.test(log(percollege) ~ state, data = ohio_mi)
```

```
## 
##  Welch Two Sample t-test
## 
## data:  log(percollege)
by state
## t = 2.9556, df = 168.98, p-value = 0.003567
## alternative hypothesis: true difference in means between group MI and group OH is not equal to 0
## 95 percent confidence interval:
##  0.04724892 0.23732151
## sample estimates:
## mean in group MI mean in group OH 
##         2.915873         2.773587
```

--

<br>

> Results show a *p*-value < .01 so **there is a statistically significant difference between the two means**

--

<br>

> Note that this test compares the means of `log(percollege)`, so the difference is on the log scale

---

## Paired-samples *t*-test

--

Use the same `t.test` command as in the previous sections, but set `paired =` <span style='color: #64b5f6;'>TRUE</span>

```r
t.test(x, y = NULL,
       alternative = c("two.sided", "less", "greater"),
       mu = 0,
       paired = TRUE,
       var.equal = FALSE,
       conf.level = 0.95)
```

---

## Example

```r
sleep %>%
  head()
```

```
##   extra group ID
## 1   0.7     1  1
## 2  -1.6     1  2
## 3  -0.2     1  3
## 4  -1.2     1  4
## 5  -0.1     1  5
## 6   3.4     1  6
```

.footnote[Please use `?sleep` for more details on the variables]

---

## Purpose

We are assessing whether a particular drug has a statistically significant effect on sleep (increase in hours of sleep compared to control) for 10 patients

count: false

.panel1-sw5-auto[
```r
*sleep
```
]

.panel2-sw5-auto[
```
##    extra group ID
## 1    0.7     1  1
## 2   -1.6     1  2
## 3   -0.2     1  3
## 4   -1.2     1  4
## 5   -0.1     1  5
## 6    3.4     1  6
## 7    3.7     1  7
## 8    0.8     1  8
## 9    0.0     1  9
## 10   2.0     1 10
## 11   1.9     2  1
## 12   0.8     2  2
## 13   1.1     2  3
## 14   0.1     2  4
## 15  -0.1     2  5
## 16   4.4     2  6
## 17   5.5     2  7
## 18   1.6     2  8
## 19   4.6     2  9
## 20   3.4     2 10
```
]

---
count: false

.panel1-sw5-auto[
```r
sleep %>%
* select(-ID)
```
]

.panel2-sw5-auto[
```
##    extra group
## 1    0.7     1
## 2   -1.6     1
## 3   -0.2     1
## 4   -1.2     1
## 5   -0.1     1
## 6    3.4     1
## 7    3.7     1
## 8    0.8     1
## 9    0.0     1
## 10   2.0     1
## 11   1.9     2
## 12   0.8     2
## 13   1.1     2
## 14   0.1     2
## 15  -0.1     2
## 16   4.4     2
## 17   5.5     2
## 18   1.6     2
## 19   4.6     2
## 20   3.4     2
```
]

<style>
.panel1-sw5-auto { color: white; width: 49%; hight: 32%; float: top; padding-left: 1%; font-size: 
80% }
.panel2-sw5-auto { color: white; width: 49%; hight: 32%; float: top; padding-left: 1%; font-size: 80% }
.panel3-sw5-auto { color: white; width: NA%; hight: 33%; float: top; padding-left: 1%; font-size: 80% }
</style>

---

## Descriptives

```r
sleep %>%
  summary()
```

```
##      extra        group        ID
##  Min.   :-1.600   1:10   1      :2
##  1st Qu.:-0.025   2:10   2      :2
##  Median : 0.950          3      :2
##  Mean   : 1.540          4      :2
##  3rd Qu.: 3.400          5      :2
##  Max.   : 5.500          6      :2
##                          (Other):8
```

---

## Boxplot

--

.pull-left[
```r
sleep %>%
  ggplot(aes(group, extra, fill = group)) +
  geom_boxplot(alpha = 0.8) +
  scale_fill_manual(
    values = c("#428bca", "#d9534f")
  ) +
  theme_minimal() +
  theme(
    panel.grid.major.x = element_blank(),
    panel.grid.minor.x = element_blank()
  )
```
]

--

.pull-right[
<img src="Slides-Week-10R_files/figure-html/unnamed-chunk-23-1.png" width="68%" style="display: block; margin: auto;" />
]

--

<br>
<br>

We are assessing whether a particular drug has a statistically significant effect on sleep (increase in hours of sleep compared to control) for 10 patients

---

### Testing

We want to see whether the mean of the `extra` variable differs between group 1 and group 2

```r
t.test(extra ~ group, data = sleep, paired = TRUE)
```

```
## 
##  Paired t-test
## 
## data:  extra by group
## t = -4.0621, df = 9, p-value = 0.002833
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -2.4598858 -0.7001142
## sample estimates:
## mean difference 
##           -1.58
```

--

<br>

> Results show a *p*-value < .01 so **there is a statistically significant difference between the two means**

--

<br>

> This supports the alternative hypothesis and suggests that the drug increases sleep by 1.58 hours on average (the mean difference of -1.58 is group 1 minus group 2)

---

## That's it!
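
---

### Optional: the paired test without the formula interface

The paired comparison on `sleep` can also be written with the default `t.test(x, y, paired = TRUE)` form from the argument table. This may be handy because some recent R releases are reported to reject `paired` in the formula interface; treat that as an assumption to check against your own R version. The sketch below is not part of the original analysis, and the names `group1`, `group2`, `paired_res`, and `diff_res` are made up for illustration. It also shows that a paired *t*-test is simply a one-sample *t*-test on the within-patient differences.

```r
# Illustrative sketch; object names are hypothetical, not from the slides.
# `sleep` ships with base R. Rows are ordered by ID within each group,
# so the two vectors line up patient by patient.
group1 <- sleep$extra[sleep$group == "1"]
group2 <- sleep$extra[sleep$group == "2"]

# Paired t-test through the default (x, y) interface
paired_res <- t.test(group1, group2, paired = TRUE)

# Equivalent one-sample t-test on the within-patient differences
diff_res <- t.test(group1 - group2, mu = 0)

paired_res$statistic  # t statistic from the paired test
diff_res$statistic    # t statistic from the test on the differences
paired_res$estimate   # mean difference (group 1 minus group 2)
```

Both calls should report the same *t* statistic, degrees of freedom, and *p*-value as the paired output shown earlier, because the paired test only ever looks at the ten differences.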