CLEAR News - Summer/Fall 2001


Pass/Fail Decisions: Subject to Tweaking
by Anne Paxton

Computer hackers often refer to a "fudge factor" as a value or parameter that is varied in an ad hoc way to produce the desired result. A fudge factor can often be tweaked in more than one direction, making it useful when results don’t come out the way people like them. Several recent stories indicate that pass/fail decisions in licensing are sometimes subject to post hoc adjustment or tweaking; just like most computer programs, licensing tests are rarely airtight algorithms.

A state audit in Arizona, for instance, uncovered an extreme case when it found that the naturopathic board had had a 100 percent pass rate on its exam since September 1998, but only because the board made substantial adjustments to candidates’ test scores. In February 1999, no one would have passed the test as originally scored, but the board identified "difficult" questions on the exam by looking at the number of candidates who got them wrong, and then gave candidates full credit for those items. Nine still failed, so the board made additional adjustments without any apparent justification, the audit discovered, and all 18 examinees passed.
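To make the mechanics of that kind of adjustment concrete, here is a minimal sketch in Python of rescoring by giving full credit on frequently missed items. It is an illustration only, not the Arizona board's actual procedure: the function name, the data layout, and the `miss_threshold` parameter are all assumptions, since the audit as described here does not say how many wrong answers made a question "difficult."

```python
def rescore_with_full_credit(responses, miss_threshold):
    """Rescore an exam by crediting every candidate on 'difficult' items.

    responses: one list per candidate, one boolean per item (True = correct).
    miss_threshold: hypothetical cutoff; an item counts as 'difficult' when
        at least this fraction of candidates answered it incorrectly.
    Returns (original_scores, adjusted_scores, difficult_items).
    """
    n_candidates = len(responses)
    n_items = len(responses[0])

    # Fraction of candidates who missed each item.
    miss_rate = [
        sum(1 for cand in responses if not cand[i]) / n_candidates
        for i in range(n_items)
    ]
    difficult_items = {
        i for i, rate in enumerate(miss_rate) if rate >= miss_threshold
    }

    # Raw scores: number of items answered correctly.
    original_scores = [sum(1 for ok in cand if ok) for cand in responses]

    # Adjusted scores: a difficult item counts as correct for everyone.
    adjusted_scores = [
        sum(1 for i, ok in enumerate(cand) if ok or i in difficult_items)
        for cand in responses
    ]
    return original_scores, adjusted_scores, difficult_items


# Made-up data: three candidates, three items.
responses = [
    [True, False, False],
    [True, True, False],
    [False, False, False],
]
original, adjusted, difficult = rescore_with_full_credit(responses, 0.6)
```

Because the set of "difficult" items is chosen after the results are in, every score can only rise, and the pass rate rises with it. That is why the same threshold that looks like a quality check on the questions can also work as a fudge factor.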

The California Board of Psychology is hardly as subjective as that, but the state Office of Examination Resources (OER) nevertheless recommended on July 3 that the board eliminate its oral exam. Despite the board’s efforts to improve the psychometric quality of the exam, problems in development, administration, and scoring raise questions, said OER. It suggested that even an exam with acceptable internal reliability does not automatically produce valid pass/fail decisions.

One problem is that the clinical content depends upon psychologists’ varying interpretations of vignettes, structured questions, and criteria responses, OER noted. But another major source of error it found is the inability to calibrate the examiners’ judgments to a shared definition of minimum competence. For example, the pass rates of exam administrations in Los Angeles and San Francisco differ by as little as 1% and as much as 11% in either direction. "The only real differences between the two sites are the examiners," the OER pointed out.

The OER also cited an October 1999 study of examiner reliability, in which 15 experienced examiners listened to the tapes of four candidates who had previously sat for the oral exam. For three of the candidates, the average score was lower than the original one, and for the fourth, the score was higher. At the extremes, one rater would have passed all four candidates and one would have failed all four. "In summary, there was very little convergence in the licensing decisions between the original examiners and the raters," said the OER.

Of course, now and then licensees themselves can contrive to pass by encouraging some helpful subjectivity on the part of the examiners. The going rate was about $1,300 for commercial trucking licenses in Illinois and Florida, federal investigators have found. In June 2000, they indicted two men on charges of selling illegal licenses to truckers. Their modus operandi was to send the applicants, usually Eastern European immigrants with limited English, to Florida, where interpreters are allowed on the exam. One would meet with applicants in advance and discuss in Polish or Russian how he would indicate the correct answers—by touching his nose or modulating his voice.

At least the changes in standards that state boards adopt usually apply across the board. Virginia, for example, decided to backpedal on its adoption of the toughest cutoff score standards in the nation for high school teachers. In April, the state Board of Education agreed to ease a five-year-old requirement that prospective teachers meet or exceed cutoff scores on each of three tests in reading, writing, and math that make up the Praxis teacher exam. Instead, they can now be certified based on a composite score. If the math score isn’t high enough, the teacher can still become certified if the other scores are high enough to offset it.
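The difference between the old and new rules can be sketched in a few lines of Python. This is a hedged illustration, not Virginia's actual scoring: the function names and every number below are hypothetical, and the sketch assumes the composite is a simple sum of the three test scores, which the policy as described here does not specify.

```python
def passes_each_test(scores, cutoffs):
    """Old-style rule: the candidate must meet or exceed every cutoff."""
    return all(scores[test] >= cutoffs[test] for test in cutoffs)


def passes_composite(scores, composite_cutoff):
    """New-style rule (assumed here to be a simple sum): a strong score
    on one test can offset a weak score on another."""
    return sum(scores.values()) >= composite_cutoff


# Hypothetical cutoffs and scores, chosen only to show the contrast.
cutoffs = {"reading": 70, "writing": 70, "math": 70}
composite_cutoff = 210  # equal to the sum of the individual cutoffs

candidate = {"reading": 80, "writing": 75, "math": 65}

old_rule = passes_each_test(candidate, cutoffs)          # math is below 70
new_rule = passes_composite(candidate, composite_cutoff)  # total is 220
```

With these made-up numbers the candidate fails under the per-test rule but passes under the composite rule, even though nothing about the candidate's performance has changed. Anyone who clears every individual cutoff also clears a composite cutoff set at their sum, so a change like this can only add to the pool of passing candidates.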

The reason for the policy shift: worsening teacher shortages. Almost half of the teachers who received new licenses to work in state classrooms last year didn’t meet the Praxis standard and were issued provisional licenses. Many of them will now be able to resubmit their scores under the new guidelines. Pennsylvania is making a similar change in scoring its bar exam starting in July, because too many prospective lawyers have been flunking the test.

No one can say if Pennsylvania or Virginia will be flooded with incompetent licensees as a result of their new scoring methodology. But it could qualify as something programmers call a "kluge," a clever programming trick intended to fix a bug in an expedient, if not clear, manner. Ideally, we’d like a testing process under which candidates pass if, and only if, they’re competent to practice, but keeping discretion and subjectivity out of pass/fail decisions is difficult. Until such a process exists, these examples suggest that a kluge, or just an expedient fudge factor, could sometimes continue to be the answer.
