This month brought news of two studies that question the correlation between student evaluations of their teachers and student learning. One tested whether the “fluency” of a teacher’s delivery would affect student perceptions or student learning. The Chronicle describes these experiments in “Smooth Lectures Foster Only the Illusion of Learning, Study Finds” (login required). Here’s how the experiment was designed:
42 undergraduates in an introductory psychology course at Iowa State were randomly assigned to two groups. Each was told their memory would be tested later. They watched a 65-second video of a lecturer explaining why calico cats are usually female. . . . One group watched the fluent video, in which the speaker stood in front of a desk, spoke without notes, maintained eye contact with the camera, and gestured with her hands for emphasis. The other students watched the disfluent one. The same lecturer stood behind a desk and delivered identical content, but read haltingly from notes as she hunched over a podium, pausing occasionally to glance at the camera.
After watching the videos, the students in both groups were asked how well they thought they had learned the material, how much they predicted they would remember 10 minutes later, and how organized, effective, and knowledgeable the speaker was. The students who had watched the fluent lecture were about twice as likely as those who had watched the disfluent one to predict that they would remember what they had heard and to say they had learned the material. The fluent speaker was also rated as significantly more organized, knowledgeable, and effective than the disfluent speaker.
All pretty much what we’d expect, right? The more clearly delivered lecture should result in better learning. But it didn’t happen. Students in both groups did equally well (actually, equally badly): they got about 25% of the material correct. So while the fluent lecture gave a stronger perception of learning, and that lecture received higher ratings, it did not lead to greater knowledge. Would you get the same result from a longer lecture, or a semester-long course? This experiment can’t answer that question. But it does point to a quickly formed gap between perceived and actual learning.
The second article also examines the tie between student evaluations and student learning. A study of more than 10,000 Air Force Academy cadets randomly assigned them to different sections of a course, with identical syllabi and identical final exams (graded by someone other than the professor). The students also all took mandatory follow-up classes, which again had identical final exams. Students of the less experienced professors did better on the exam in the initial course, and gave their professors higher evaluations. However, the students of the more experienced professors did better on the exams in the follow-up courses.
The authors theorize that the more experienced professors broadened their coverage, while the less experienced ones “taught to the test.” The broader coverage may have led to deeper understanding of the material, which in turn helped the students do better in more advanced classes (while feeling less happy in the intro course).
Does this mean that student evaluations are worthless? Certainly not (and the Air Force experiment may demonstrate a flawed alignment between the coverage of the introductory and advanced courses). But it is further evidence that when we use student evaluations we need to be quite clear about what questions to ask the students, and we need to understand that there is a gap between what people think they have learned and what they actually learn. It’s also another reason not to base decisions about the quality of teaching for tenure and promotion on student evaluations alone.