An increasing number of courts around the U.S. rely on the Public Safety Assessment (PSA), an algorithmic risk-gauging tool that judges can opt to use when deciding whether a defendant should be released before a trial. The PSA draws on administrative data to predict the likelihood a person will commit a crime (particularly a violent crime) or fail to return for a future court hearing if released pending trial. But while advocates argue that the PSA isn’t biased in its decision-making, a study from researchers at Harvard and the University of Massachusetts finds evidence that the algorithm encourages prejudice against men while recommending pretrial release conditions that are potentially too severe.
The U.S. court system has a history of adopting AI tools that are later found to exhibit bias against defendants belonging to certain demographic groups. Perhaps the most infamous of these is Northpointe’s Correctional Offender Management Profiling for Alternative Sanctions (COMPAS), which is designed to predict a person’s likelihood of becoming a recidivist, a term used to describe reoffending criminals. A ProPublica report found that COMPAS was far more likely to incorrectly judge black defendants to be at higher risk of recidivism than white defendants, while at the same time flagging white defendants as low risk more often than black defendants.
To determine whether the PSA exhibits bias, the Harvard and UMass researchers conducted a 24-month randomized controlled trial involving a judge in Dane County, Wisconsin. They analyzed the outcomes of 1,890 court cases in total, of which 40.1% involved white male arrestees, 38.8% non-white male arrestees, 13.0% white female arrestees, and 8.1% non-white female arrestees. Based on the distribution of the bail amount and expert opinions, the researchers sorted the judge’s decisions into three categories: signature bond, small cash bond (less than $1,000), and large cash bond (greater than or equal to $1,000).
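For readers who want to see the three-way split concretely, here is a minimal sketch of how such a categorization could be expressed. It is an illustration, not the researchers' code: the function name `categorize_decision` is hypothetical, and it assumes a signature bond is recorded as a cash amount of zero because it requires no upfront payment. Only the $1,000 threshold comes from the study as described above.

```python
def categorize_decision(cash_bail_usd: float) -> str:
    """Map a cash bail amount to one of the three decision categories.

    Assumption (not from the study): a signature bond is represented
    as a cash amount of 0, since no money is paid up front.
    """
    if cash_bail_usd < 0:
        raise ValueError("bail amount cannot be negative")
    if cash_bail_usd == 0:
        return "signature bond"
    if cash_bail_usd < 1000:
        # Strictly below the $1,000 threshold described in the article.
        return "small cash bond"
    # $1,000 or more.
    return "large cash bond"


if __name__ == "__main__":
    for amount in (0, 500, 1000, 2500):
        print(amount, "->", categorize_decision(amount))
```

Note that a bond of exactly $1,000 lands in the large category, matching the "greater than or equal to" wording.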
As the researchers note, the PSA considers nine variables, primarily prior convictions, failure to appear, and age, but not gender or race, in making its predictions. Even so, according to the results of the randomized trial, the PSA’s recommendations were often more stringent than the judge’s decisions. Moreover, the judge was more likely to impose a cash bond on male arrestees than on female arrestees within each risk category, suggesting that the PSA motivated gender bias.
“The PSA … might make the judge’s decision more lenient for female arrestees while it leads to a harsher decision for male arrestees among preventable and easily preventable cases,” the researchers wrote. “Thus, the PSA provision appears to reduce gender fairness.”
However, on the subject of racial bias, the researchers say they found no “statistically significant” impact regarding the PSA — at least among male arrestees. While the judge was more likely to impose a cash bond on non-whites than on whites even when they belonged to the same risk category, these decisions were made in the absence of PSA predictions, suggesting the judge was implicitly biased.
“In today’s data-rich society, many human decisions are guided by algorithmic-recommendations. While some of these algorithmic-assisted human decisions may be trivial and routine (e.g., online shopping and movie suggestions), others that are much more consequential include judicial and medical decision-making,” the researchers wrote. “As algorithmic recommendation systems play increasingly important roles in our lives, we believe that a policy-relevant question is how such systems influence human decisions and how the biases of algorithmic recommendations interact with those of human decisions … These results might bring into question the utilities of using PSA in judicial decision-making.”
Arnold Ventures, the company behind the PSA, has repeatedly chosen to stand behind its product. But several reports have questioned its efficacy and that of comparable predictive tools in active use. According to a report by The Appeal, only nine jurisdictions using pretrial assessment tools reported that their pretrial jail populations had decreased after adopting the tools. Problematically, most jurisdictions don’t track the impact of risk assessment tools on their jail populations at all.
Last year, the Partnership on AI released a document declaring the algorithms now in use unfit to automate the pretrial bail process or label some people as high risk and detain them. Validity, data sampling bias, and bias in statistical predictions were called out as issues in currently available risk assessment tools. Human-computer interface issues and unclear definitions of high risk and low risk were also considered important shortcomings in those tools.
California recently put forward, and voters rejected, a ballot measure that would have eliminated cash bail and required judges to use predictive algorithms in their decisions. The measure wouldn’t have standardized the algorithms, meaning that each county’s tool might process slightly different data about a defendant (as they do now). An estimated one in three U.S. counties employs algorithms in the pretrial space, and many of those tools are privately owned.
“We do not condone the use [of tools like PSA],” said Ben Winters, the creator of a report from the Electronic Privacy Information Center that called pretrial risk assessment tools a strike against individual liberties. “But we would absolutely say that where they are being used, they should be regulated pretty heavily.”