Is it quality, is it redundancy, or is model inadequacy? Some strategies for judging the appropriateness of high-discrimination items

Authors

Ferrando, P. J., & Morales-Vives, F.

DOI: https://doi.org/10.6018/analesps.535781
Keywords: Item discrimination, Factor analysis, Item analysis, Item Response Theory, Item Redundancy, Clinical Measurement

Supporting Agencies

  • Spanish Ministry of Science and Innovation (PID2020-112894GB-I00) and a grant from the Catalan Ministry of Universities, Research and the Information Society (2021 SGR 00036).

Abstract

When developing new questionnaires, it is traditionally assumed that items should be as discriminating as possible, as if high discrimination were always indicative of quality. In some cases, however, high discrimination estimates may mask problems such as redundancies, shared residuals, skewed distributions, or model limitations, any of which can inflate those estimates. Inspecting these indices alone may therefore lead to erroneous decisions about which items to keep or eliminate. To illustrate this problem, two scenarios with real data are described. The first focuses on a questionnaire containing an item that appears highly discriminating but is redundant. The second focuses on a clinical questionnaire administered to a community sample, which gives rise to highly right-skewed item response distributions and inflated discrimination indices, even though the items do not discriminate well among the majority of participants. We propose some strategies and checks for identifying these situations, so that inappropriate items can be detected and removed. This article thus seeks to promote a critical attitude, which may involve going against routinely applied, established principles when they are not appropriate.
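
The first scenario in the abstract (an item that looks highly discriminating because it is redundant) can be illustrated with a small simulation. The sketch below is not the authors' procedure: it assumes continuous item scores, a linear one-factor model in which every item has the same loading, and the corrected item-total correlation as the discrimination index. All names (`simulate`, `corrected_item_total`, `loading`, `shared`) are hypothetical and chosen only for this illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=5000, n_items=6, loading=0.6, shared=0.7, seed=1):
    """Simulate item scores under an assumed one-factor model.

    Every item has the same loading on the trait. Items 0 and 1 also
    share part of their residual (proportion `shared`), which mimics
    two items with nearly identical wording.
    """
    rng = random.Random(seed)
    resid_sd = math.sqrt(1 - loading ** 2)
    data = []
    for _ in range(n):
        theta = rng.gauss(0, 1)   # common trait level
        s = rng.gauss(0, 1)       # specific content shared by items 0 and 1
        row = []
        for j in range(n_items):
            u = rng.gauss(0, 1)   # item-specific residual
            if j < 2:
                e = math.sqrt(shared) * s + math.sqrt(1 - shared) * u
            else:
                e = u
            row.append(loading * theta + resid_sd * e)
        data.append(row)
    return data


def corrected_item_total(data):
    """Correlation of each item with the sum of the remaining items."""
    n_items = len(data[0])
    result = []
    for j in range(n_items):
        item = [row[j] for row in data]
        rest = [sum(row) - row[j] for row in data]
        result.append(pearson(item, rest))
    return result


data = simulate()
r_it = corrected_item_total(data)
r_01 = pearson([row[0] for row in data], [row[1] for row in data])
r_23 = pearson([row[2] for row in data], [row[3] for row in data])

for j, r in enumerate(r_it):
    print(f"item {j}: corrected item-total r = {r:.2f}")
print(f"r(item 0, item 1) = {r_01:.2f}   r(item 2, item 3) = {r_23:.2f}")
```

Under these assumptions all six items relate equally strongly to the trait, yet items 0 and 1 obtain larger corrected item-total correlations because each one is partly predicting its near-duplicate in the rest score. The pairwise correlations point to the cause: the model-implied value for the redundant pair is 0.36 + 0.64 × 0.7 ≈ 0.81, against 0.36 for any other pair, so a single inter-item correlation far above the rest is one simple warning sign of a shared residual.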

Published

27-08-2023

How to Cite

Ferrando, P. J., & Morales-Vives, F. (2023). Is it quality, is it redundancy, or is model inadequacy? Some strategies for judging the appropriateness of high-discrimination items. Anales de Psicología / Annals of Psychology, 39(3), 517–527. https://doi.org/10.6018/analesps.535781