Frequently Asked Questions About Licensing Exams
Equating tests / use of scaled scores
CLEAR Exam Review
Eric Werner, M.A.
Question: I would like a reference for a clear, practical application of test equating methodology. I am familiar with Test Equating by Holland and Rubin and with the relevant portions of Educational Measurement by Thorndike. Is there a step-by-step application of equating to a sample data set, including formulas and equations?
Answer: Try Introduction to Classical and Modern Test Theory by Crocker and Algina (Holt, Rinehart and Winston, 1986). Chapter 20 satisfies your reference needs in relation to both linear and equipercentile equating, the first as applied to three basic data collection designs and the second as applied to two. The use of item response theory to equate tests is also covered. The chapter makes use of fundamental equating formulas presented in the references you mention and applies them to hypothetical test data. Exercises at the end of the chapter provide still more practical applications.
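To make the linear equating that chapter covers concrete, here is a minimal sketch of the mean-sigma method under a single-group design (the same candidates take both forms). The score lists are invented for illustration; they are not from any real examination.

```python
import statistics

# Hypothetical single-group design: the same candidates took Form X and
# Form Y, so linear equating treats scores as equivalent when they sit
# equally far from the mean in standard-deviation units. Scores invented.
form_x = [55, 60, 62, 65, 70, 72, 75, 80]
form_y = [60, 64, 66, 70, 74, 76, 78, 84]

mu_x, sigma_x = statistics.mean(form_x), statistics.pstdev(form_x)
mu_y, sigma_y = statistics.mean(form_y), statistics.pstdev(form_y)

def equate_x_to_y(x):
    """Linear equating: l(x) = (sigma_y / sigma_x) * (x - mu_x) + mu_y."""
    return (sigma_y / sigma_x) * (x - mu_x) + mu_y

# A score at the mean of Form X equates to the mean of Form Y.
print(round(equate_x_to_y(mu_x), 2))
```

Equipercentile equating follows the same spirit but matches entire score distributions percentile by percentile rather than just the mean and standard deviation.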
Question: We use a criterion-referenced methodology when establishing the passing score for our examination. The result from applying this methodology is that the actual passing score varies from examination to examination. What is the best way to explain the variation?
Answer: In theory, the variation is easy to explain. Because the difficulty of the questions selected for one administration of an examination differs from that of another, the passing score varies around the fixed standard of minimal competence to account for the differences among the questions. That is, the examination with the more difficult questions overall will have a lower passing score. This variation in the passing score is psychometrically sound and legally defensible. As you note, however, it is difficult to explain to candidates, who may achieve a score that fails them on one examination but would have passed them on a subsequent examination.
A good way to reduce or even eliminate variation in passing scores is to assemble tests that are equal in average difficulty (mean p value). The passing score is not simply the average of the p values; however, if the average difficulty of the examinations is about the same, the passing scores should not vary significantly. The limitation of this method is that it requires stable item statistics, and such stability usually is not obtained until the items have been administered to several hundred candidates. For small examination programs, stable item statistics may not be available.
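As a sketch of what "equal in average difficulty" means in practice, the toy code below draws items from a hypothetical bank so the form's mean p value lands near a target. The bank values, target, and greedy selection rule are all invented for illustration; operational test assembly uses more constraints than difficulty alone.

```python
# Hypothetical item bank: each entry is an item's p value (the proportion
# of candidates answering it correctly) from prior administrations.
bank = [0.55, 0.60, 0.62, 0.68, 0.70, 0.72, 0.75, 0.78, 0.80, 0.85]

target_mean_p = 0.70   # difficulty target shared by all forms (assumed)
form_length = 5

# Greedy sketch: repeatedly pick the item that keeps the form's running
# average p value closest to the target.
form = []
remaining = sorted(bank)
while len(form) < form_length:
    best = min(
        remaining,
        key=lambda p: abs((sum(form) + p) / (len(form) + 1) - target_mean_p),
    )
    form.append(best)
    remaining.remove(best)

mean_p = sum(form) / len(form)
print(f"form items: {form}, mean p = {mean_p:.3f}")
```

Two forms assembled this way should have nearly equal mean p values, so their criterion-referenced passing scores should land close together.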
A second method that works for both large and small programs is to report scaled scores. When a scaled criterion-referenced passing score is used, the score required for passing remains consistent. Scaling the score does not affect the level of performance required for passing the examination. Scaling simply allows the licensing board to report, for example, that the passing score is 70 (not to be confused with 70%) while the actual passing score is free to vary according to the difficulty of the examination.
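One common way to implement such reporting is a linear transformation anchored so that each form's raw passing score maps to the same reported value. The sketch below assumes a 100-item examination, a reported passing score of 70, and a second anchor at the maximum raw score; the cut scores are invented, and boards may choose different anchor points.

```python
def scale_score(raw, raw_cut, raw_max, scaled_cut=70.0, scaled_max=100.0):
    """Linearly map raw scores so the form's raw passing score (raw_cut)
    always reports as scaled_cut, and the maximum raw score reports as
    scaled_max. Illustrative anchoring scheme, not a prescribed one."""
    slope = (scaled_max - scaled_cut) / (raw_max - raw_cut)
    return scaled_cut + slope * (raw - raw_cut)

# Two hypothetical forms of a 100-item exam: a harder form with a raw cut
# of 68 and an easier form with a raw cut of 74.
print(scale_score(68, raw_cut=68, raw_max=100))  # harder form's cut -> 70.0
print(scale_score(74, raw_cut=74, raw_max=100))  # easier form's cut -> 70.0
print(scale_score(71, raw_cut=68, raw_max=100))  # above 70: passes harder form
print(scale_score(71, raw_cut=74, raw_max=100))  # below 70: fails easier form
```

The same raw score of 71 passes the harder form and fails the easier one, yet both forms report 70 as the passing score, which is exactly the consistency the answer describes.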
© 2002 Council on Licensure, Enforcement and Regulation