Want Your Students to Learn More? Test Them in Groups!
Posted January 9, 2017
By Tabitha Kirkland, University of Washington & Deepti Karkhanis, Bellevue College
We teach with the hope that our students are learning, and we test to find out whether or not they have. At least, this is the traditional approach. We certainly give our students active opportunities to learn during class sessions, but we wondered: wouldn’t it be great if we allowed students to continue learning throughout the class? Traditionally, exams assess rather than create learning. And in introductory classes, these exams are better at evaluating recall and recognition than at evaluating, let alone promoting, deeper levels of understanding. We suggest that learning can occur throughout the course, even during a test. And if we really believe that deep learning is more important than memorization and regurgitation, we should be willing to reconceptualize the way we approach testing.
So, we’d like to share our adventures in group testing.
How group testing works
We did a series of little experiments at a large community college in Washington, where all classes are for first- and second-year students. We tried several variations on this basic theme, but what all of them had in common was that psychology students had the opportunity to review and discuss their multiple-choice exams with peers during class before submitting them for a grade.
In the first round, we gave students the same multiple-choice exam twice: once alone, and then immediately afterward in assigned small groups of 3-5 students. We let students complete individual response forms during both portions, so they were still responsible for their own scores and were not required to agree with the group. We then averaged their individual and group scores to calculate their exam score. The results? Almost 20% improvement from individual to group exams -- with some students improving by as much as 40%! (Karkhanis & Kirkland Turowski, 2015)
Importantly, this improvement in learning did not seem to be due to one high-performing student dominating the group conversation. We noticed that almost everyone seemed to be contributing about equally during group exams, and students’ feedback lined up with these observations: in a follow-up survey, they reported that almost everyone contributed equally to discussion.
Scores also improved across the grade distribution: even the highest-performing individual students benefited from the group conversation (improved scores from individual to group exams), suggesting these overall improvements were not simply due to the lower-performing students catching up.
We tried mixing up who was in the groups. It didn’t seem to matter whether the groups were carefully pre-sorted to distribute members across the grade spectrum, or assigned randomly, or even chosen by the students themselves. It seems that what matters for performance is simply the group format, not the specific makeup of the group membership. (Group size probably does matter, however -- smaller groups give more opportunity for individual contribution.)
We also checked to see if there were reliable differences based on class topic (e.g., general psychology vs. lifespan psychology) or instructor or even instructional quarter. Nope. So this suggests that this approach would be appropriate for a variety of psychology classes.
One of us also tried this neat variation: group quizzes followed by individual exams. The rationale was that if students are learning more in the group atmosphere, then that learning should show when being tested on those concepts later. She put students in randomly-assigned pairs for weekly quizzes, then compared their individual exam performance with another class (same instructor) that had taken their weekly quizzes alone. Students in the paired-quiz class performed an average of 10% better on the individual exams!
Why do group exams work so well?
We asked our students what worked for them about these exams.
First, our students enjoyed the opportunity to discuss each answer choice with one another. They found the process of talking, debating, and bouncing ideas off each other to be useful, and they were able to see different ways of reasoning through these conversations. Students reported that they learned more through teaching others and being taught, and sharing these answers boosted their confidence in their own knowledge. Groups used different ways of arriving at a conclusion: some voted and went with the majority, others discussed until a consensus was reached, and others chose their own answers after discussion (remember, they had individual response forms). Regardless, almost everyone enjoyed the group exam format, which can positively affect the entire class experience (Stearns, 1996).
Second, group testing relieved the stress and anxiety that are often caused by evaluative situations. Our class sessions tend to be interactive, experiential, and collaborative, and we try to build community to encourage a positive learning environment. But when it comes time to test them, that environment changes. Tests can create high levels of anxiety for many students, yet we know that the best performance under challenging circumstances comes from moderate levels of arousal (Yerkes & Dodson, 1908; Broadhurst, 1959). How do we manage this traditionally? We just tell our students not to be anxious and hope it works out. Building on notions of context- and state-dependent memory, this means we are actually creating incongruent contexts for our students to perform in, while still hoping that they will be able to succeed. We teach this science to our students, so we should really be implementing practices based on it. Some of our students explicitly discussed the stress relief provided by the group-testing environment, and how that boosted their performance and recall. Group testing, then, not only mitigates test anxiety; it also makes in-class testing a more positive educational experience, just like the regular class sessions (Ioannou & Artino, 2010).
A third reason group exams work: they give students an opportunity to explicitly review their answers and reasoning. If you’re not yet on board with testing in groups, you might just have students take the exam individually, then get them into groups to discuss the exam afterward and figure out why their answers were correct or incorrect together. This would be a more traditional approach -- collaboration during review -- in which one major benefit is simply having students look at the exam again instead of the more common practice of tossing it in their bags and never looking at it again. Of course, we’d like to think that some of the benefits of the group format would carry over to this also, like students learning from one another. But in this format, learning would occur after the fact, so this might work best if you plan to give a comprehensive final or something else cumulative to reward that effort.
Some of you must be wondering about social loafing and individual accountability. We do think it’s important to include an individual component of the grade so as to encourage students to prepare for exams with the same rigor. One way to address this is to implement some minimum cut-off grade that students must earn individually in order to be eligible for group benefits. For example, one of us set a minimum of 70%. Students who scored lower than this on the individual exam did not have their group score count toward their grade. Another strategy is lower-stakes group quizzing followed by higher-stakes individual testing (the paired-quiz variation described above).
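If you keep your gradebook in a script, here is a minimal sketch of how the two grading rules we described could be combined. It rests on assumptions of ours rather than on a fixed recipe: it reads "averaged" as a simple equal-weight mean of the individual and group percentages, it assumes that a student below the cut-off simply keeps the individual score, and it applies the cut-off and the averaging together even though we tried them in different classes. The function name `exam_score` and its parameters are hypothetical.

```python
def exam_score(individual: float, group: float, cutoff: float = 70.0) -> float:
    """Return the recorded exam score as a percentage.

    Assumptions (illustrative, not a fixed policy):
    - scores are percentages between 0 and 100;
    - a student at or above the cutoff gets the equal-weight mean
      of the individual and group scores;
    - a student below the cutoff keeps the individual score only.
    """
    for label, value in (("individual", individual), ("group", group)):
        if not 0.0 <= value <= 100.0:
            raise ValueError(f"{label} score must be between 0 and 100")

    if individual < cutoff:
        # Group score does not count toward the grade.
        return float(individual)

    # Equal-weight average of the two attempts.
    return (individual + group) / 2
```

With these assumptions, a student who scores 80 alone and 90 in the group would record 85, while a student who scores 60 alone would record 60 no matter how the group did. Changing `cutoff` or the weighting is a one-line edit if your policy differs.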
We recognize that our suggestions have some possible limitations:
- First, we tried all this stuff at a community college, where competition is limited and stakes are lower, and class sizes are fairly small (30-40 students). We think the possibility of group dynamics encouraging collaboration over competition would definitely benefit students at more competitive schools, provided the grading scale allowed for this. And we think it would probably be feasible to do in moderately large classes (e.g., 100 students), provided there is additional instructional support staff. One of us has actually just moved to a large university, and plans to try this out in a larger class, so we will keep you posted on how well this works!
- We also did this with multiple-choice exams, on which performance can be more easily quantified. Of course, exams were rigorous -- using a variety of difficulty levels and skewing toward conceptual/application questions -- but we have not tested this out with short-answer or essay exams. We’d imagine that you would have to vary your approach if this is more your exam style, like designating a single group scribe (thus sacrificing a bit of independence) or having meticulous rubrics.
With those limitations in mind, here is what we take away from our experiments:
- Group exams are good for performance. Students performed significantly better on exams taken in small groups relative to exams taken individually (Bloom, 2009; Karkhanis & Kirkland Turowski, 2015). This improvement in performance was due to a number of factors, including the opportunity to think carefully about the concepts being tested and the ability to teach and learn from one’s peers.
- More importantly, group exams are good for learning. Students reported that the group-learning atmosphere helped to relieve stress, increase confidence, and solidify their understanding of course material. And students who completed frequent group quizzes scored higher on midterm and final exams, suggesting that they learned more during those collaborative quizzes than students taking them alone.
We’d love to study this further, including how these effects might change when we take into account question difficulty and level of abstraction. But we’re pretty clear on one thing -- if your class size permits it, you should definitely consider implementing group exams.
Tabitha Kirkland is a lecturer in psychology at the University of Washington. She received her B.A. from the University of California, San Diego, and her M.A. and Ph.D. from The Ohio State University. Tabitha lives in Seattle with her family and enjoys traveling, outdoor adventures, and a strong cup of coffee.
Deepti Karkhanis is an Associate Professor and Department Chair of Psychology at Bellevue College. She received her Bachelor’s and Master’s degrees from Delhi University in India, and her Ph.D. in Applied Developmental Psychology from George Mason University in Fairfax, VA. She is a developmentalist whose teaching interests include lifespan psychology, adolescent and youth psychology, cross-cultural psychology, and positive psychology. Dr. Karkhanis explores a variety of pedagogical topics such as collaborative testing, student-teacher rapport, positive psychology in classroom curriculum, and teacher training on social justice and educational equity.
1. Bloom, D. (2009). Collaborative test taking: Benefits for learning and retention. College Teaching, 57(4), 216-220.
2. Broadhurst, P. L. (1959). The interaction of task difficulty and motivation: The Yerkes-Dodson Law revived. Acta Psychologica, 16, 321-338.
3. Ioannou, A., & Artino, A. R., Jr. (2010). Learn more, stress less: Exploring the benefits of collaborative assessment. College Student Journal, 44(1), 189-199.
4. Karkhanis, D. G., & Kirkland Turowski, T. (2015, May). Group exams improve student learning. Psychology Teacher Network, 25(2), 8-10. http://www.apa.org/ed/precollege/ptn/2015/05/may-ptn.pdf
5. Stearns, S. A. (1996). Collaborative exams as learning tools. College Teaching, 44(3), 111-112.
6. Yerkes, R. M., & Dodson, J. D. (1908). The relation of strength of stimulus to rapidity of habit-formation. Journal of Comparative Neurology and Psychology, 18(5), 459-482.