The training data meant to make predictive policing less biased is still racist


In their defense, many developers of predictive policing tools say they have started using victim reports to get a more accurate picture of crime rates in different neighborhoods. In theory, victim reports should be less biased because they are unaffected by police bias or feedback loops.

But Nil-Jana Akpinar and Alexandra Chouldechova of Carnegie Mellon University show that the picture provided by victim reports is also skewed. The pair built their own predictive algorithm using the same model found in several popular tools, including PredPol, the most widely used system in the United States. They trained it on victim report data from Bogotá, Colombia, one of the very few cities for which independent, district-level crime reporting data is available.

When they compared their tool’s predictions with the actual crime data for each district, they found significant errors. In a district where few crimes were reported, the tool predicted only about 20% of the actual hotspots – places with high crime rates. In a district with a high number of reports, by contrast, it predicted 20% more hotspots than there actually were.
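The distortion can be illustrated with a toy simulation (this is not the authors’ model, and all the numbers are hypothetical): two districts with the same true crime rate but different victim-reporting rates. A predictor trained only on the reports that come in will rank the high-reporting district as the hotter spot.

```python
import random

random.seed(0)

# Two districts with the SAME underlying crime rate, but different
# shares of crimes that actually get reported (illustrative values).
TRUE_CRIMES_PER_WEEK = 10
REPORT_RATE = {"district_A": 0.9, "district_B": 0.4}

def observed_counts(weeks=52):
    """Simulate the report counts a model trained on victim data would see."""
    counts = {}
    for district, rate in REPORT_RATE.items():
        counts[district] = sum(
            1
            for _ in range(weeks * TRUE_CRIMES_PER_WEEK)
            if random.random() < rate
        )
    return counts

counts = observed_counts()

# A naive "predictor" that ranks districts by observed reports flags
# district_A as the hotspot, even though true crime is identical in both.
ranked = sorted(counts, key=counts.get, reverse=True)
print(counts, ranked)
```

In this sketch the low-reporting district’s crime is undercounted and the high-reporting district’s is overcounted, mirroring the under- and over-prediction of hotspots the researchers observed.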

For Rashida Richardson, a lawyer and researcher who studies algorithmic bias at the AI Now Institute in New York, these findings reinforce existing work highlighting problems with the datasets used in predictive policing. “They lead to biased results that don’t improve public safety,” she says. “I think many predictive policing vendors like PredPol fundamentally don’t understand how structural and social conditions skew or distort many forms of crime data.”

So why did the algorithm get it so wrong? The problem with victim reports is that Black people are more likely to be reported for a crime than white people. Wealthier white people are more likely to report a poorer Black person than the other way around. And Black people are also more likely to report other Black people. As with arrest data, the result is that Black neighborhoods are flagged as crime hotspots more often than they should be.

Other factors distort the picture too. “Victim reporting is also linked to community trust or distrust of the police,” says Richardson. “So if you’re in a community with a historically corrupt or notoriously racist police department, that will affect how and whether people report crime.” In that case, a predictive tool can underestimate the level of crime in an area, keeping it from getting the police attention it needs.

No easy fix

Worse still, there is no obvious technical fix. Akpinar and Chouldechova tried adjusting their Bogotá model to account for the biases they observed, but did not have enough data for the adjustments to make much difference – even though more district-level data is available for Bogotá than for any US city. “Ultimately, it’s unclear whether mitigating the bias in this setting is any easier than in previous efforts that looked at arrest-data-based systems,” says Akpinar.

What can be done? Richardson believes public pressure to dismantle racist tools and the policies that underpin them is the only answer. “It’s just a matter of political will,” she says. She notes that early adopters of predictive policing tools, such as Santa Cruz, have announced they will no longer use them, and that there have been scathing official reports on the LAPD’s and the Chicago Police Department’s use of predictive policing. “But the responses in each city have been different,” she says.

Chicago has suspended its use of predictive policing but has reinvested in a police gang database, which Richardson says has many of the same problems.

“It’s worrying that even when government investigations and reports uncover significant problems with these technologies, that isn’t enough to get politicians and police to say they shouldn’t be used,” she says.
