How to proceed when normality and sphericity are violated in the repeated measures ANOVA
Supporting Agencies
- This research was supported by grant PID2020-113191GB-I00, awarded through MCIN/AEI/10.13039/501100011033.
Abstract
Adjusted F-tests have typically been proposed as an alternative to the F-statistic in repeated measures ANOVA. Despite considerable research, it remains unclear how these statistics perform under simultaneous violation of normality and sphericity. Accordingly, our aim here was to conduct a detailed examination of Type I error and power of the F-statistic and the Greenhouse-Geisser (F-GG) and Huynh-Feldt (F-HF) adjustments, manipulating the number of repeated measures (3-6), sample size (10-300), sphericity (Greenhouse-Geisser epsilon estimator, from its lower to upper limit), and distribution shape (slight to extreme deviations from normality). The findings show that the behavior of F-GG and F-HF depends on the degree of violation of normality and sphericity and on sample size. Overall, we suggest using F-GG under violation of sphericity and slight or moderate deviations from normality at all sample sizes; with severe deviations from both normality and sphericity, F-GG may be used with a sample size larger than 10; and with extreme deviations from both normality and sphericity, this statistic may be used with a sample size larger than 30. In the event of discrepant results between F-GG and F-HF, the choice depends on the epsilon value.
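As a rough illustration of the statistics named in the abstract, the sketch below computes the unadjusted F-statistic for a one-way repeated measures design, together with the Greenhouse-Geisser and Huynh-Feldt epsilon estimates and the degrees of freedom they imply. It is not the authors' simulation code. The function name, the returned dictionary keys, and the example scores are hypothetical, and the Huynh-Feldt expression used is the standard single-group form, truncated at 1.

```python
from statistics import mean


def rm_anova_with_corrections(data):
    """One-way repeated measures ANOVA with GG and HF epsilon estimates.

    data: list of n rows (subjects), each holding k repeated measures.
    Returns F, both epsilon estimates, and the adjusted degrees of freedom.
    (Hypothetical helper written for illustration only.)
    """
    n, k = len(data), len(data[0])
    grand = mean(x for row in data for x in row)
    col_means = [mean(row[j] for row in data) for j in range(k)]
    row_means = [mean(row) for row in data]

    # Sums of squares: conditions, subjects, and the residual error term.
    ss_cond = n * sum((m - grand) ** 2 for m in col_means)
    ss_subj = k * sum((m - grand) ** 2 for m in row_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_cond - ss_subj

    df1, df2 = k - 1, (k - 1) * (n - 1)
    f_stat = (ss_cond / df1) / (ss_error / df2)

    # Sample covariance matrix of the k repeated measures.
    cov = [
        [
            sum((row[i] - col_means[i]) * (row[j] - col_means[j]) for row in data)
            / (n - 1)
            for j in range(k)
        ]
        for i in range(k)
    ]

    # Double-center the covariance matrix (it is symmetric, so row means
    # also serve as column means).
    cov_row = [mean(r) for r in cov]
    cov_grand = mean(cov_row)
    centered = [
        [cov[i][j] - cov_row[i] - cov_row[j] + cov_grand for j in range(k)]
        for i in range(k)
    ]

    # Greenhouse-Geisser estimate: trace^2 / ((k - 1) * sum of squared entries).
    trace = sum(centered[i][i] for i in range(k))
    sum_sq = sum(v * v for r in centered for v in r)
    eps_gg = trace ** 2 / ((k - 1) * sum_sq)

    # Huynh-Feldt estimate (single-group form), truncated at 1.
    eps_hf = (n * (k - 1) * eps_gg - 2) / ((k - 1) * (n - 1 - (k - 1) * eps_gg))
    eps_hf = min(1.0, eps_hf)

    return {
        "f": f_stat,
        "df": (df1, df2),
        "eps_gg": eps_gg,
        "eps_hf": eps_hf,
        "df_gg": (eps_gg * df1, eps_gg * df2),
        "df_hf": (eps_hf * df1, eps_hf * df2),
    }


if __name__ == "__main__":
    # Hypothetical scores: 5 subjects measured on 3 occasions.
    scores = [[45, 50, 55], [42, 42, 45], [36, 41, 43], [39, 35, 40], [51, 55, 59]]
    for name, value in rm_anova_with_corrections(scores).items():
        print(name, value)
```

The F value is the same for all three tests; only the degrees of freedom change. A p-value for F-GG or F-HF would be read from an F distribution with the corresponding adjusted degrees of freedom, for example with `scipy.stats.f.sf` if SciPy is available.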
Copyright (c) 2024 Servicio de Publicaciones, University of Murcia (Spain)
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
The works published in this journal are subject to the following terms:
1. The Publications Service of the University of Murcia (the publisher) retains the property rights (copyright) of published works, and encourages and enables their reuse under the license specified in paragraph 2.
2. The works are published in the online edition of the journal under a Creative Commons Attribution-ShareAlike 4.0 license (legal text). You may copy, use, distribute, transmit, and publicly display them, provided that: i) you cite the author and the original source of publication (journal, publisher, and URL of the work); ii) they are not used for commercial purposes; iii) you mention the existence and specifications of this license.
3. Conditions of self-archiving. Authors are allowed and encouraged to disseminate electronically the pre-print (version before being evaluated and sent to the journal) and/or post-print (version reviewed and accepted for publication) versions of their works before publication, as this encourages earlier circulation and dissemination and thus a possible increase in citation and reach within the academic community. RoMEO Color: Green.