To test this possibility, the researchers trained a deep learning model to predict the patient’s self-reported pain level from the knee x-ray. If the resulting model had terrible accuracy, it would suggest that self-reported pain is rather arbitrary. But if the model had very good accuracy, it would provide evidence that self-reported pain actually correlates with radiographic markers in the x-ray.
After running several experiments, including some to rule out confounding factors, the researchers found that the model was much more accurate than KLG at predicting self-reported pain levels for both white and black patients, but especially for black patients. It reduced the racial disparity at each pain level by nearly half.
The goal is not necessarily to start using this algorithm in a clinical setting. But by outperforming the KLG methodology, it revealed that the standard way of measuring pain is flawed, at a much greater cost to black patients. This should prompt the medical community to investigate which x-ray markers the algorithm might be seeing, and to update its scoring methodology.
“This highlights a really exciting part of where these types of algorithms can fit into the medical discovery process,” says Obermeyer. “It tells us if there is something here that is worth looking into that we don’t understand. This sets the stage for humans to step in next and, using these algorithms as tools, try to figure out what’s going on.”
“What’s cool about this paper is that it thinks about things from a completely different perspective,” says Irene Chen, a researcher at MIT who studies how to reduce healthcare inequities in machine learning and was not involved in the paper. Instead of training the algorithm on well-established expert knowledge, she says, the researchers chose to treat the patient’s self-report as truth. As a result, the study revealed significant gaps in what the medical field generally considers to be the most “objective” measure of pain.
“That was exactly the secret,” Obermeyer agrees. If algorithms are only ever trained to match the performance of experts, he says, they will only perpetuate existing gaps and inequalities. “This study is a snapshot of a more general pipeline that we are increasingly able to use in medicine to generate new knowledge.”