Frequently Asked Questions About Licensing Exams
Constructed-response test items
CLEAR Exam Review (Summer 1993)
Eric Werner, M.A.
Question: We include several essay and fill-in questions in our state exam, which is scored by board members. In response to a recent candidate appeal, our board is debating the issue of whether to change the kinds of questions we use. What are some pros and cons that we should consider?
Answer: Essay questions (short and long), short-answer questions, and fill-in items are members of the constructed-response family of exam problems. In contrast to multiple-choice questions, constructed-response items challenge the test taker to construct an acceptable answer rather than just to recognize one. The construction might be very brief (e.g., "Name the three methods of ..."), short (e.g., "Briefly explain how ..."), or long (e.g., "Explain five important differences between 'x' and 'y' and illustrate how the two approaches can be used together in order to ..."). Other formats that belong to this family are in-basket simulations, oral exams, and some performance problems. Some persons believe that constructed-response items are the key to more authentic assessment and should be more widely used.
Most, if not all, testing specialists believe that there is no one best test question format and that a testing agency should use the one, or combination of several, that offers the most advantages and the fewest disadvantages in relation to the agency's objectives and resources. For state licensing exams and for certification exams, resource limitations dictate the frequent use of multiple-choice questions even, unfortunately, in situations where constructed-response items would be technically preferable. Here are some suggestions and observations to keep in mind when your board makes its test format decisions:
If your objective is to assess mastery of important factual information, use multiple-choice items. If you want to test candidates on their ability to organize and relate ideas, compare and contrast methods, explain things clearly, or create complex solutions, consider constructed-response questions.
Because many constructed-response questions appear, on the surface, to assess higher-level thinking about important topics, they may be more readily accepted as fair and valid by test takers, board members, and the public. For this reason, constructed-response tests may have considerable face validity, independently of whether they are more or less valid in other respects than multiple-choice tests.
Advocates of constructed-response questions point out that these questions eliminate the guessing factor of multiple-choice tests. Further, candidates cannot select an answer of which they are unsure and then work backward to judge whether or not it is the best option presented. However, verbally fluent and test-wise candidates can bluff their way to an acceptable score on essay questions that are inadequately developed and scored. "If you're smart, you can get the right answer without knowing anything," wrote Charles Schulz in Peanuts. This is as applicable to poorly constructed essays as it is to poorly constructed multiple-choice questions.
If the range of subjects you want to assess is quite broad, keep in mind that constructed-response questions are unlikely to allow you to cover the range thoroughly. The response time required by essays, for example, is such that relatively few of them can be used in the typical amounts of time allotted for state tests.
Although the reliability of constructed-response questions can be high, you should be prepared for it to be lower than the reliability of multiple-choice questions. Lower reliability can result from the limited coverage of subject areas (as mentioned above) and from judgmental and subjective factors involved in scoring constructed-response items.
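As a rough illustration of why a test built from only a few questions tends to be less reliable, the sketch below applies the Spearman-Brown formula, a standard psychometric projection that is not discussed in this column. It assumes that all questions are comparable (parallel) measures; the function name and the reliability figure are hypothetical and chosen only for illustration.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Project reliability when a test is lengthened by `length_factor`.

    `reliability` is the reliability of the current test (0 <= r < 1).
    `length_factor` is new length / current length (e.g., 4 = four times as long).
    Assumes the added questions are parallel to the existing ones.
    """
    if not 0.0 <= reliability < 1.0:
        raise ValueError("reliability must be in [0, 1)")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return length_factor * reliability / (1.0 + (length_factor - 1.0) * reliability)


# Hypothetical figure: a single essay question with reliability 0.30.
single_essay = 0.30
for n_questions in (1, 2, 4, 8):
    projected = spearman_brown(single_essay, n_questions)
    print(f"{n_questions} essay question(s): projected reliability {projected:.2f}")
```

Under these assumptions, projected reliability rises as questions are added, which is one way to see why an exam limited to a handful of essays starts at a disadvantage. The projection says nothing about scoring subjectivity, which must be addressed separately.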
Many persons believe that they can quickly prepare effective short-answer and essay questions. They therefore prefer these formats to multiple-choice questions, the development of which usually requires considerable time and tedium. In fact, developing short-answer and essay questions that work well is at least as demanding as developing multiple-choice questions. Furthermore, it is important to try out constructed-response questions before using them operationally. This can be quite difficult and time-consuming.
Don't take scoring lightly. A variety of potential problems must be solved if short-answer and essay question scoring is going to result in reliable and valid scores. Some of these difficulties involve the responses themselves: illegible handwriting, spelling and grammar errors, and acceptable answers that were not anticipated when grading standards were developed.
Others involve graders who may be experts in the subject tested but who vary in how severely they apply grading standards, misapply the standards, or ignore them altogether.
Before you even start to think about these problems in specific terms, you'll want to decide whether an analytic or a holistic grading strategy is more appropriate for your test. Others have discussed this matter in nontechnical articles in previous issues of CER (Shimberg, 1990; Julian and Orr, 1992). Even though they focus on performance and simulation tests, the approaches they cover are adaptable to other kinds of constructed-response tests as well. A more technical review and additional references are available in Millman and Greene (1989, pp. 343-345).
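The concern about graders who differ in severity can be checked with simple arithmetic once several graders have scored the same set of responses. The sketch below is one minimal way to do so, not a method taken from the articles cited above; the function name, grader labels, and scores are all hypothetical.

```python
from statistics import mean


def severity_report(scores_by_grader: dict[str, list[float]]) -> dict[str, float]:
    """Return each grader's mean score minus the average of all grader means.

    Assumes every grader scored the same responses on the same scale.
    A negative value suggests a grader is more severe than the group;
    a positive value suggests a grader is more lenient.
    """
    grader_means = {grader: mean(scores) for grader, scores in scores_by_grader.items()}
    overall = mean(grader_means.values())
    return {grader: round(m - overall, 2) for grader, m in grader_means.items()}


# Hypothetical scores (0-5 scale) given by three graders to the same four essays.
scores = {
    "grader_a": [4.0, 3.0, 5.0, 4.0],
    "grader_b": [3.0, 2.0, 4.0, 3.0],
    "grader_c": [4.0, 4.0, 5.0, 3.0],
}
for grader, deviation in severity_report(scores).items():
    print(f"{grader}: {deviation:+.2f} relative to the group average")
```

A consistent gap of this kind does not show which grader is right, only that the standards are being applied differently and that grader training or the scoring guide may need attention.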
It's easier for candidates to remember constructed-response questions and the answers they gave than it is to do the same for a multiple-choice test of many items. Therefore, exam security considerations often lead testing agencies that rely on essays, for example, to use them only once. This creates a significant test construction workload for the agency. New questions must be developed frequently, and assuring the equivalence of two or more test forms becomes difficult, if not impossible.
I hope this helps you and your board reach the right decisions concerning the format of your state test. However, keep in mind that there are no absolute right or wrong formats to be chosen, just formats that work best in the context of your particular exam program.
© 2002 Council on Licensure, Enforcement and Regulation