An investigation of enhancement of ability evaluation by using a nested logit model for multiple-choice items
Multiple-choice items are widely used in psychological and educational tests. The present study investigated whether multiple-choice items have an advantage over dichotomous items in ability evaluation. An item response model, the nested logit model (NLM), was used to fit the multiple-choice data. Both a simulation study and an empirical study indicated that the accuracy and stability of ability estimation were enhanced by using a multiple-choice model rather than a dichotomous model, because the distractors of multiple-choice items carry additional information. However, the accuracy of ability parameter estimation differed little among 4-choice, 5-choice, and 6-choice items. Moreover, the NLM could extract more information from low-level respondents than from high-level ones, because low-level respondents chose distractors more often. Furthermore, in an empirical study of a Chinese Vocabulary Test for Grade 1, the changing traces of distractor probabilities calculated from the NLM showed that respondents at different trait levels were attracted to different distractors. This suggests that the responses of students at different levels might reflect the students' vocabulary development process.
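To make the idea of distractor probability traces concrete, the sketch below computes option probabilities under a two-level nested logit formulation. It is an illustration under assumptions, not the study's estimation procedure: the abstract does not give the model equations, so the sketch assumes a 2PL model for the correct response and a nominal-response (softmax) model for the distractors conditional on an incorrect response. The function name `nlm_probabilities` and all parameter values (`a`, `b`, `zetas`, `lambdas`) are hypothetical.

```python
import math


def nlm_probabilities(theta, a, b, zetas, lambdas):
    """Return [P(correct), P(distractor 1), ...] for one item.

    Assumed form (not taken from the abstract):
      level 1: P(correct) follows a 2PL model with discrimination a, difficulty b;
      level 2: given an incorrect response, distractor v is chosen with
               probability proportional to exp(zetas[v] + lambdas[v] * theta).
    """
    # Level 1: probability of answering correctly.
    p_correct = 1.0 / (1.0 + math.exp(-a * (theta - b)))

    # Level 2: softmax over distractors, shifted by the max logit for stability.
    logits = [z + lam * theta for z, lam in zip(zetas, lambdas)]
    top = max(logits)
    weights = [math.exp(x - top) for x in logits]
    total = sum(weights)

    # Unconditional distractor probabilities: P(incorrect) * P(distractor | incorrect).
    p_distractors = [(1.0 - p_correct) * w / total for w in weights]
    return [p_correct] + p_distractors


# Hypothetical 4-choice item: one correct option and three distractors.
item = dict(a=1.2, b=0.0, zetas=[0.5, 0.0, -0.5], lambdas=[-1.0, 0.0, 1.0])
for theta in (-2.0, 0.0, 2.0):
    probs = nlm_probabilities(theta, **item)
    print(theta, [round(p, 3) for p in probs])
```

With these made-up parameters, the first distractor is the most attractive wrong option at a low trait level and the third at a high trait level, which is the kind of pattern the traces of distractor probabilities are meant to reveal. Because the probability of any incorrect response shrinks as the trait level rises, distractor choices are observed mostly among lower-level respondents.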