## Oneway ANOVA

**Oneway ANOVA**

• Analysis of variance is used to test for differences among more than two populations. It can be viewed as an extension of the t-test we used for testing two population means.
• The specific analysis of variance test that we will study is often referred to as the oneway ANOVA. ANOVA is an acronym for ANalysis Of VAriance. The adjective oneway means that there is a single variable that defines group membership (called a factor). Comparisons of means using more than one variable are possible with other kinds of ANOVA.

**Why Not Use Multiple T-tests**

• It might seem logical to use multiple t-tests if we wanted to compare a variable for more than two groups. For example, if we had three groups, we might do three t-tests: group 1 versus group 2, group 1 versus group 3, and group 2 versus group 3.
• However, doing three hypothesis tests to compare groups changes the probability that we are making an error (the alpha error rate). When conducting multiple tests of significance, the chance of making at least one alpha error over the series of tests is greater than the selected alpha level for each individual test. Thus, if we do multiple t-tests on the same variables with an alpha level of 0.05, the chance that we are making a mistake in applying our findings to the population is actually greater than 0.05. For example, if the three tests were independent, the chance of at least one alpha error would be 1 - 0.95^3, or about 0.14.

**Logic of Analysis of Variance**

• The logic of the analysis of variance test is the same as the logic for the test of two population means.
• In both tests, we are comparing the differences among group means to a measure of dispersion for the sampling distribution.
• In ANOVA, the difference among group means is computed from the difference between each group mean and the mean for all subjects regardless of group. The measure of dispersion for the sampling distribution is a combination of the dispersion within each of the groups.

**Step 1.
Assumptions for the Test**

• Level of measurement of the group variable can be any level of variable that identifies groups.
• Level of measurement of the test variable is interval.
• The test variable is normally distributed in the population:
  • skewness and kurtosis between –1.0 and +1.0, or
  • number in each group is greater than 10 (central limit theorem)
• The variances (dispersion) of the groups are equal. The Levene test of equality of population variances is used to test this assumption.

**Levene Test of Homogeneity of Variances**

• The Levene test of equality of population variances tests whether or not the variances for the groups are equal. It is a test of the research hypothesis that the variance (variability) of one or more groups is different from the others. The null hypothesis states that the variances of all groups are equal.
• If the probability of the test statistic is greater than 0.05, we do not reject the null hypothesis and conclude that none of the variances are different. This is the desired outcome.
• If the probability of the test statistic is less than or equal to 0.05, we conclude the variances are different. Unlike the t-test, the ANOVA F test used here has no built-in adjustment for unequal variances. Although we violate the assumption, we will add a caution to any true answers instead of deciding that it is an incorrect application of a statistic, because the analysis of variance is robust to violations of the assumption.

**Step 2. Hypotheses and alpha**

• The research hypothesis is that the mean of at least one of the population groups is different from the means of the other groups.
• The null hypothesis is that the means of all of the population groups are equal.
• If we don’t have a specific reason for setting the level of significance to a specific probability, we can use the traditional social science benchmark of 0.05.
This means that we are willing to risk making a mistake in our decision to reject the null hypothesis if it happens only once in every 20 decisions; that is, our decision would be correct 19 out of 20 times. The alpha level to use will be stated in the problems.

**Step 3. Sampling distribution and test statistic**

• In the ANOVA test, the probability is obtained from the “F” distribution instead of the normal distribution.
• The test statistic is also referred to as the F-ratio or F-test because it follows the F distribution.

**Step 4. Computing the Test Statistic**

• Conceptually, the test statistic is computed in a way similar to the independent samples t-test. Both are computed by dividing the differences in means by a measure of the variability within the groups.
• We identify the probability of the test statistic from the SPSS statistical output.

**Step 5. Decision and Interpretation**

• If the probability of the test statistic is less than or equal to the level of significance (alpha error rate), we reject the null hypothesis and conclude that our data support the research hypothesis.
• If the probability of the test statistic is greater than the level of significance (alpha error rate), we fail to reject the null hypothesis and conclude that our data do not support the research hypothesis.

**Interpreting Differences in Population Means**

• If we fail to reject the null hypothesis, we can state that we found no differences among the means for the population groups for this characteristic. We do not say they are equal.
• If we reject the null hypothesis, we can conclude that the mean for at least one population group is different from the others.
• The ANOVA test itself does NOT tell us which group means are different.
To determine this, we use a post hoc test, such as the Tukey HSD (honestly significant difference) post hoc test.

**Post Hoc Test for Difference in Means**

• Just as we used a post hoc test to identify which cells in a frequency table were responsible for the statistically significant result, we use a post hoc test to identify the differences in pairs of means that produce a statistically significant result in an ANOVA table.
• We only look at the post hoc test when the probability of the ANOVA statistic causes us to reject the null hypothesis, i.e. the probability of the test statistic is less than or equal to the level of significance.
• The post hoc test may NOT reveal differences among group means even when we reject the null hypothesis in the ANOVA test.

**Inflation of Type I Error (Alpha)**

• Type I error: the probability of rejecting the null hypothesis when it is true.
• The only time you need to worry about inflation of the Type I error rate is when you look for a lot of effects in your data.
• The more effects you look for, the more likely it is that you will turn up an effect that doesn't really exist (Type I error!).
• Doing all possible pairwise comparisons (t-tests) on a one-way ANOVA would increase the overall Type I error rate.

**The Tukey HSD Post Hoc Test**

• The Tukey HSD post hoc test compares all possible pairs of group means to determine which differences in group means are statistically significant.
• The post hoc test finds the numeric difference between means that is statistically significant at the 0.05 and 0.01 levels of significance. It does this without inflating the alpha error rate, which would happen if we used a series of t-tests to identify differences.
• We use the same level of significance for the ANOVA test and the post hoc test.

**ANOVA post hoc Test Practice Problem – 1**

This question asks you to use ANOVA to answer whether there is a relationship between country [lcouncu] and age [age] and, if there is, to do a Tukey HSD post hoc test to see if respondents from the US were older than respondents from the UK. A one-way analysis of variance requires that the independent variable specify groups or categories and the dependent variable be interval level. The independent variable [lcouncu] is nominal and the dependent variable [age] is interval, satisfying the requirement for the independent and dependent variables.

**ANOVA post hoc Test in SPSS (1)**

The next step is to examine the distribution of the dependent variable. You can check whether the dependent variable is normally distributed in: Analyze > Descriptive Statistics > Descriptives…

**ANOVA post hoc Test in SPSS (2)**

After moving [age] into the “Variable(s):” box, click the “Options…” button to select the distribution statistics.

**ANOVA post hoc Test in SPSS (3)**

Select “Kurtosis” and “Skewness” to examine whether [age] is normally distributed. Then, click the “Continue” and “OK” buttons.

**ANOVA post hoc Test in SPSS (4)**

[Age] satisfied the criteria for a normal distribution. The skewness of the distribution (.590) was between -1.0 and +1.0 and the kurtosis of the distribution (-.150) was between -1.0 and +1.0.

**ANOVA post hoc Test in SPSS (5)**

You can conduct ANOVA by clicking: Analyze > Compare Means > One-Way ANOVA…

**ANOVA post hoc Test in SPSS (6)**

The dependent variable [age] goes to the “Dependent List:” box and the independent variable [lcouncu] goes to the “Factor:” box. Then, click the “Options…” button to select statistics options.

**ANOVA post hoc Test in SPSS (7)**

Select “Descriptive” and “Homogeneity of variance test” in the “Statistics” section of the “One-Way ANOVA: Options” window.
Then, click “Continue”.

**ANOVA post hoc Test in SPSS (8)**

Now, click the “Post Hoc…” button to select the post hoc test option.

**ANOVA post hoc Test in SPSS (9)**

Select “Tukey” in the “Equal Variances Assumed” panel. Enter alpha in the “Significance level:” textbox. It is the same as the alpha level (.01) in the problem. Then, click the “Continue” and “OK” buttons.

**ANOVA post hoc Test in SPSS (10)**

First of all, you have to check the equal variance assumption. The probability associated with Levene's Test for Equality of Variances (p<0.001) is less than or equal to the level of significance (0.01). The assumption of equal variances is not satisfied. However, since analysis of variance is robust to violations of this assumption, we will add a caution to any true findings rather than conclude that this is an incorrect application of a statistic.

**ANOVA post hoc Test in SPSS (11)**

The probability of the F test statistic (F=32.638) was p<0.001, less than or equal to the alpha level of significance of 0.01. The null hypothesis that the mean "age" [age] is the same for all groups defined by the variable "current country of residence" [lcouncu] is rejected. The research hypothesis that the mean "age" [age] is not the same for all groups defined by the variable "current country of residence" [lcouncu] is supported by this analysis.

**ANOVA post hoc Test in SPSS (12)**

Based on the Tukey post hoc test, the difference between the mean for survey respondents from the United States (40.47) and the mean for survey respondents from the United Kingdom (33.95) was 6.53, a statistically significant difference at the 0.01 level of significance. Survey respondents from the United States were older than survey respondents from the United Kingdom.

**ANOVA post hoc Test in SPSS (13)**

If you identify a statistically significant difference, but aren’t sure which group is older than the other, look back at the table of descriptive statistics.
In this table, it is clear that the mean age for respondents in the United States (40.47) is higher than the mean age for respondents in the United Kingdom (33.95). Survey respondents from the United States were older than survey respondents from the United Kingdom. The answer to the question is true with caution, with the caution added for the violation of the assumption of equality of variances.

**ANOVA post hoc Test Practice Problem – 2**

This question asks you to use ANOVA to answer whether there is a relationship between country [lcouncu] and web surfing [intera06] and, if there is, to do a Tukey HSD post hoc test to see if respondents from Canada said they surf the web for recreational purposes less frequently than respondents from the UK. A one-way analysis of variance requires that the independent variable specify groups or categories and the dependent variable be interval level. The independent variable [lcouncu] is nominal and the dependent variable [intera06] is ordinal, which this problem treats as satisfying the requirement for the independent and dependent variables.

**ANOVA post hoc Test in SPSS (13)**

The next step is to examine the distribution of the dependent variable. You can check whether the dependent variable is normally distributed in: Analyze > Descriptive Statistics > Descriptives…

**ANOVA post hoc Test in SPSS (14)**

After moving [intera06] into the “Variable(s):” box, click the “Options…” button to select the distribution statistics.

**ANOVA post hoc Test in SPSS (15)**

Select “Kurtosis” and “Skewness” to examine whether [intera06] is normally distributed. Then, click the “Continue” and “OK” buttons.

**ANOVA post hoc Test in SPSS (16)**

“Surf Web for recreational purposes” [intera06] satisfied the criteria for a normal distribution.
The skewness of the distribution (-0.687) was between -1.0 and +1.0 and the kurtosis of the distribution (-0.296) was between -1.0 and +1.0.

**ANOVA post hoc Test in SPSS (17)**

You can conduct ANOVA by clicking: Analyze > Compare Means > One-Way ANOVA…

**ANOVA post hoc Test in SPSS (18)**

The dependent variable [intera06] goes to the “Dependent List:” box and the independent variable [lcouncu] goes to the “Factor:” box. Then, click the “Options…” button to select statistics options.

**ANOVA post hoc Test in SPSS (19)**

Select “Descriptive” and “Homogeneity of variance test” in the “Statistics” section of the “One-Way ANOVA: Options” window. Then, click “Continue”.

**ANOVA post hoc Test in SPSS (20)**

Now, click the “Post Hoc…” button to select the post hoc test option.

**ANOVA post hoc Test in SPSS (21)**

Select “Tukey” in the “Equal Variances Assumed” section and make sure the “Significance level:” is the same as the alpha level (.01) in the problem. Then, click the “Continue” and “OK” buttons.

**ANOVA post hoc Test in SPSS (22)**

First of all, you have to check the equal variance assumption. The probability associated with Levene's Test for Equality of Variances (p = 0.003) is less than or equal to the level of significance (0.01). The assumption of equal variances is not satisfied. However, since analysis of variance is robust to violations of this assumption, we will add a caution to any true findings rather than conclude that this is an incorrect application of a statistic.

**ANOVA post hoc Test in SPSS (23)**

The probability of the F test statistic (F=0.086) was p=.918, larger than the alpha level of significance of 0.01. The null hypothesis that the mean “surf the web for recreational purposes” [intera06] is the same for all groups defined by the variable "current country of residence" [lcouncu] is not rejected.
The research hypothesis that the mean [intera06] is not the same for all groups defined by the variable [lcouncu] is not supported by this analysis.

**ANOVA post hoc Test in SPSS (24)**

When the F test statistic is not significant, the results of the post hoc tests are not interpreted, even if statistically significant differences between pairs of groups are found. The answer to the question is false.

**ANOVA post hoc Test Practice Problem – 3**

This question asks you to use ANOVA to answer whether there is a relationship between country [lcouncu] and closeness to community [valatt01] and, if there is, to do a Tukey HSD post hoc test to see if respondents from Canada agreed more strongly that they feel close to other people in their community than respondents from the US. A one-way analysis of variance requires that the independent variable specify groups or categories and the dependent variable be interval level. The independent variable [lcouncu] is nominal and the dependent variable [valatt01] is ordinal, which this problem treats as satisfying the requirement for the independent and dependent variables.

**ANOVA post hoc Test in SPSS (25)**

The next step is to examine the distribution of the dependent variable. You can check whether the dependent variable is normally distributed in: Analyze > Descriptive Statistics > Descriptives…

**ANOVA post hoc Test in SPSS (26)**

After moving [valatt01] into the “Variable(s):” box, click the “Options…” button to select the distribution statistics.

**ANOVA post hoc Test in SPSS (27)**

Select “Kurtosis” and “Skewness” to examine whether [valatt01] is normally distributed. Then, click the “Continue” and “OK” buttons.

**ANOVA post hoc Test in SPSS (28)**

“I feel close to other people in my community” [valatt01] satisfied the criteria for a normal distribution.
The skewness of the distribution (-0.655) was between -1.0 and +1.0 and the kurtosis of the distribution (-0.835) was between -1.0 and +1.0.

**ANOVA post hoc Test in SPSS (29)**

You can conduct ANOVA by clicking: Analyze > Compare Means > One-Way ANOVA…

**ANOVA post hoc Test in SPSS (30)**

The dependent variable [valatt01] goes to the “Dependent List:” box and the independent variable [lcouncu] goes to the “Factor:” box. Then, click the “Options…” button to select statistics options.

**ANOVA post hoc Test in SPSS (31)**

Select “Descriptive” and “Homogeneity of variance test” in the “Statistics” section of the “One-Way ANOVA: Options” window. Then, click “Continue”.

**ANOVA post hoc Test in SPSS (32)**

Now, click the “Post Hoc…” button to select the post hoc test option.

**ANOVA post hoc Test in SPSS (33)**

Select “Tukey” in the “Equal Variances Assumed” section and make sure the “Significance level:” is the same as the alpha level (.01) in the problem. Then, click the “Continue” and “OK” buttons.
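
The workflow followed in the practice problems (check skewness and kurtosis, run the Levene test, run the one-way ANOVA F test, then read the Tukey HSD post hoc test only if the null hypothesis is rejected) can also be sketched outside SPSS. The following is a minimal illustration in Python using SciPy, not the procedure used in the slides. The ages in `groups`, the function name `run_oneway_anova`, and the report keys are all made up for illustration; they are not the survey data. `center="mean"` is chosen on the assumption that it corresponds to the mean-based Levene test in the SPSS output, and `bias=False` on the assumption that it corresponds to the sample-adjusted skewness and kurtosis.

```python
import numpy as np
from scipy import stats

# Hypothetical ages for three countries (invented values, NOT the survey data).
groups = {
    "US": np.array([38, 41, 45, 39, 42, 44, 40, 43], dtype=float),
    "UK": np.array([30, 33, 35, 31, 34, 36, 32, 29], dtype=float),
    "Canada": np.array([36, 38, 41, 35, 39, 40, 37, 42], dtype=float),
}
ALPHA = 0.01  # level of significance stated in the problem


def run_oneway_anova(groups, alpha):
    """Follow the steps from the slides and return the results as a dict."""
    names = list(groups)
    samples = [groups[name] for name in names]
    pooled = np.concatenate(samples)

    report = {
        # Step 1a: normality rule of thumb (both values between -1.0 and +1.0).
        "skewness": stats.skew(pooled, bias=False),
        "kurtosis": stats.kurtosis(pooled, bias=False),
    }

    # Step 1b: Levene test; the null hypothesis is that all variances are equal.
    levene = stats.levene(*samples, center="mean")
    report["levene_p"] = levene.pvalue
    report["equal_variances"] = levene.pvalue > alpha

    # Steps 3-5: F test; reject the null hypothesis when p <= alpha.
    anova = stats.f_oneway(*samples)
    report["F"] = anova.statistic
    report["anova_p"] = anova.pvalue
    report["reject_null"] = anova.pvalue <= alpha

    # Post hoc: interpret Tukey HSD only when the F test is significant.
    report["pairs"] = {}
    if report["reject_null"]:
        tukey = stats.tukey_hsd(*samples)
        for i in range(len(names)):
            for j in range(i + 1, len(names)):
                mean_diff = tukey.statistic[i, j]  # mean of group i minus mean of group j
                report["pairs"][(names[i], names[j])] = (mean_diff, tukey.pvalue[i, j])
    return report


report = run_oneway_anova(groups, ALPHA)
for key, value in report.items():
    print(key, value)
```

A positive mean difference for the pair `("US", "UK")` together with a p-value at or below alpha would correspond to the conclusion that the first group has the higher mean. If `equal_variances` is false, the slides' convention is to keep the finding but add a caution.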