Robustness of Generalized Linear Mixed Models for Split-Plot Designs with Binary Data

Roser Bono; Rafael Alarcón; Jaume Arnau; F. Javier García-Castro; Maria J. Blanca

doi:10.6018/analesps.527421

Authors

Roser Bono (1) Department of Social Psychology and Quantitative Psychology, Faculty of Psychology, University of Barcelona. (2) Institute of Neurosciences, University of Barcelona. Barcelona (Spain) https://orcid.org/0000-0001-7991-6668
Rafael Alarcón (3) Department of Psychobiology and Behavioral Sciences Methodology, Faculty of Psychology, University of Malaga https://orcid.org/0000-0003-2122-1374
Jaume Arnau (1) Department of Social Psychology and Quantitative Psychology, Faculty of Psychology, University of Barcelona
F. Javier García-Castro (4) Department of Psychology, Universidad Loyola Andalucía. Seville (Spain)
Maria J. Blanca (3) Department of Psychobiology and Behavioral Sciences Methodology, Faculty of Psychology, University of Malaga (Spain) https://orcid.org/0000-0003-4046-9308

Keywords: Generalized linear mixed models, Binary data, Monte Carlo simulation, Type I error rate

Abstract

This paper examined the robustness of the generalized linear mixed model (GLMM). The GLMM estimates fixed and random effects, and it is especially useful when the dependent variable is binary. It is also useful when the dependent variable involves repeated measures, since it can model the correlation between them. The present study used Monte Carlo simulation to analyze the empirical Type I error rates of GLMMs in split-plot designs. The variables manipulated were sample size, group size, number of repeated measures, and correlation between repeated measures. Extreme conditions were also considered, including small samples, unbalanced groups, and a different correlation in each group (pairing between group size and correlation between repeated measures). For balanced groups, the results showed that the test of the group effect was robust under all conditions, while for unbalanced groups it tended to be conservative with positive pairing and liberal with negative pairing. Regarding time and interaction effects, the results showed, for both balanced and unbalanced groups, that (a) the test was robust with low correlation (.2) but conservative for medium values of correlation (.4 and .6), and (b) the test tended to be conservative for positive and negative pairing, especially the latter.
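
As an illustration of how an empirical Type I error rate is obtained by Monte Carlo simulation, the following minimal sketch generates data under a true null hypothesis, counts how often a test rejects at the nominal alpha, and labels the result as conservative, robust, or liberal. It rests on several assumptions that are not taken from the abstract: the test is a simple pooled two-proportion z-test on independent binary observations, used as a stand-in for the GLMM group test; there are no repeated measures or correlation; the robustness bounds follow Bradley's liberal criterion (0.5 to 1.5 times alpha), which the abstract does not state; and all function names are hypothetical.

```python
import math
import random


def two_proportion_p_value(successes_a, n_a, successes_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test.

    Stand-in test for illustration only; this is NOT a GLMM.
    """
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # all outcomes identical: no evidence against the null
    z = (successes_a / n_a - successes_b / n_b) / se
    # 2 * (1 - Phi(|z|)) expressed through the complementary error function
    return math.erfc(abs(z) / math.sqrt(2))


def empirical_type1_rate(n_a, n_b, p, replications, alpha, rng):
    """Proportion of replications in which a true null is rejected."""
    rejections = 0
    for _ in range(replications):
        # Both groups share the same success probability, so the null holds.
        a = sum(rng.random() < p for _ in range(n_a))
        b = sum(rng.random() < p for _ in range(n_b))
        if two_proportion_p_value(a, n_a, b, n_b) < alpha:
            rejections += 1
    return rejections / replications


def classify(rate, alpha=0.05):
    """Label a rate using Bradley's liberal criterion (assumed here)."""
    if rate < 0.5 * alpha:
        return "conservative"
    if rate > 1.5 * alpha:
        return "liberal"
    return "robust"


if __name__ == "__main__":
    rng = random.Random(2023)  # fixed seed so the run is reproducible
    rate = empirical_type1_rate(n_a=30, n_b=30, p=0.5,
                                replications=2000, alpha=0.05, rng=rng)
    print(f"empirical Type I error rate: {rate:.3f} ({classify(rate)})")
```

In a study like the one described, the same loop would be repeated for each combination of the manipulated variables, with correlated binary repeated measures generated for each group and a GLMM fitted in place of the z-test.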
