The effects of financial incentives on standardised testing

Can financial incentives improve outcomes on standardised tests? In an unpublished paper, John List, Jeffrey Livingston and Susanne Neckermann examined whether motivating pupils to gain the knowledge needed to succeed on the Illinois Standards Achievement Test (ISAT) would result in higher test scores.

Subjects were 3rd–8th grade (Year 4–Year 9) pupils (n = 226 in the incentive group and 226 in the control group) who were judged to be at risk of not passing the ISAT and who were receiving tutoring to improve their scores, across nine elementary and middle schools in a suburb outside Chicago. Using a system designed by the ISAT’s developers, the authors created short benchmark tests intended to measure the knowledge needed to succeed on the ISAT. Pupils, their parents and their tutors were informed that pupils would receive up to $90 if their performance improved on these benchmark tests and if they met other academic and behavioural goals.

Results showed that incentivised pupils made significant gains on the benchmark tests compared with control pupils (effect size = +0.29), indicating that they had the knowledge needed to pass the ISAT. However, they did not make significant gains relative to controls on the ISAT itself (effect size = +0.05), for which they received no financial incentive. This was true regardless of whether incentives were provided immediately or after a delay. The authors conclude that pupils may not be motivated to show what they know on standardised tests that hold no personal stake for them.
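For readers interpreting the figures above: an effect size in this literature is a standardised mean difference. A minimal sketch, assuming the standard Cohen’s-d form (the summary does not state the exact estimator the authors used):

\[
d = \frac{\bar{x}_T - \bar{x}_C}{s_p}, \qquad
s_p = \sqrt{\frac{(n_T - 1)\,s_T^2 + (n_C - 1)\,s_C^2}{n_T + n_C - 2}}
\]

where \(\bar{x}_T\) and \(\bar{x}_C\) are the treatment and control group means, \(s_T\) and \(s_C\) their standard deviations, and \(n_T\), \(n_C\) the group sizes. On this scale, +0.29 means the incentivised group scored roughly 0.29 pooled standard deviations above the controls.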

Similar results were found in a review of international experiments evaluating financial incentives in education by Robert Slavin, posted on The Best Evidence Encyclopedia. The emerging consensus is that financial incentives can affect easily counted outcomes to which the incentives are directly tied (such as attendance), but not the broader outcomes that should flow from them (such as achievement).

Source: Do students show what they know on standardized tests? Working paper (2016), from the Selected Works of Jeffrey A Livingston, Bentley University.

Pupils may do better on tests if they can go back and check their work

Joseph Hardcastle and colleagues conducted a study comparing pupil performance on computer-based tests (CBT) and traditional paper-and-pencil tests (PPT). More than 30,000 pupils in grades 4–12 (Years 5–13) were assessed on their understanding of energy using three testing systems: a paper-and-pencil test; a computer-based test that allowed pupils to skip items and move freely through the test; and a CBT that did not allow pupils to return to previous questions.

Overall, the results showed that being able to skip questions, and to review and change previous answers, could benefit younger pupils. Elementary (Years 5 and 6) and middle school (Years 7–9) pupils scored lower on a CBT that did not allow them to return to previous items than on a comparable CBT that allowed them to skip, review, and change previous responses. Elementary pupils also scored slightly higher on the CBT that allowed them to go back to previous answers than on the PPT, but there was no significant difference for middle school pupils between those two types of test. High school pupils (Years 10–13) showed no difference in their performance across the three types of test.

Gender was found to have little influence on pupils’ performance on either the PPT or the CBTs; however, pupils whose primary language was not English performed worse on both CBTs than on the PPT.

Source: Comparing student performance on paper-and-pencil and computer-based tests. Paper presented at the 2017 AERA Annual Meeting, 30 April 2017. American Association for the Advancement of Science.

Rethinking the use of tests

Olusola O Adesope and colleagues conducted a meta-analysis to summarise the learning benefits of taking a practice test versus non-testing learning conditions, such as re-studying, practice, filler activities, or no presentation of the material.

Analysis of 272 independent effect sizes from 188 separate experiments showed that practice testing is associated with a moderate, statistically significant weighted mean effect size when compared with re-studying (+0.51), and a much larger weighted mean effect size when compared with filler or no activities (+0.93).
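A note on the “weighted mean effect size”: meta-analyses typically pool the individual standardised mean differences using inverse-variance weights, so that larger, more precise experiments count more. A minimal sketch of the usual pooled estimator (the review’s exact weighting scheme is an assumption here):

\[
\bar{g} = \frac{\sum_{i=1}^{k} w_i\, g_i}{\sum_{i=1}^{k} w_i}, \qquad w_i = \frac{1}{v_i}
\]

where \(g_i\) is the effect size from experiment \(i\), \(v_i\) its sampling variance, and \(k\) the number of independent effect sizes (here, 272).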

In addition, the format, number and frequency of practice tests make a difference to the learning benefits on a final test. Multiple-choice practice tests have a larger weighted mean effect size (+0.70) than short-answer practice tests (+0.48). A single practice test before the final test is more effective than several practice tests. Timing also matters: a gap of less than a day between the practice and final tests showed a smaller weighted mean effect size than a gap of one to six days (+0.56 and +0.82, respectively).

Source: Rethinking the use of tests: A meta-analysis of practice testing (February 2017), Review of Educational Research, DOI: 10.3102/0034654316689306.