AI Can Reduce Overdiagnosis in Ultrasound Screening for Breast Cancer


By Pawel Slabiak • September 7, 2021

Center for Advanced Imaging Innovation and Research

Breast cancer is the most common and deadliest malignant disease affecting women worldwide.

Screening is fundamental to early detection, found to cut mortality from the disease by about half.

Most routine screening in developed countries is performed with mammography.

But "dense" breasts, which are at higher risk of cancer, appear opaque on mammograms, making interpretation more difficult.

Dense and extremely dense breasts contain a high proportion of "radiopaque" parenchyma.

Breasts that have low mammographic density comprise mostly "radiolucent" fat.

When mammography is hard to decipher, doctors often turn to ultrasound for more information.

But secondary screening with ultrasound results in up to 8 percent more biopsies, less than 10 percent of which confirm cancer.

To investigate whether AI can help doctors make fewer false-positive findings, a team of researchers from NYU Langone Health, New York University, and NYU Abu Dhabi created a deep learning model to detect breast cancer in ultrasounds.

Scientists prepared a set of deidentified images and anonymized medical reports from more than 143,000 patients who underwent screening at NYU Langone Health.


(Graphic label: test set, 30%.)



To compare the model's performance with that of human experts, the researchers had 10 radiologists read a subset of the exams.




In most cases, the AI model agreed with human readers.

In this ultrasound exam, all 10 radiologists found cancer.

And so did the AI.

But in many cases in which the radiologists erroneously suspected cancer, the AI correctly found none.

In both exams below, all 10 radiologists suspected malignancy and ordered biopsies.

But the AI correctly classified the lesions as benign.

Overall, the AI was as accurate as the experts in identifying malignant lesions (it matched radiologists' sensitivity).

The model had a lower tendency to overdiagnose (had higher specificity).

And the AI's positive findings more often correlated with actual cancer cases (had greater positive predictive value).
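The three measures named in parentheses above all follow from four counts: true positives, false positives, true negatives, and false negatives. Here is a minimal sketch of the arithmetic; the function name and the example counts are hypothetical and are not results from the study.

```python
def screening_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, positive predictive value).

    tp, fp, tn, fn are counts of true positives, false positives,
    true negatives, and false negatives over a set of exams.
    """
    sensitivity = tp / (tp + fn)  # share of actual cancers that were flagged
    specificity = tn / (tn + fp)  # share of cancer-free exams correctly cleared
    ppv = tp / (tp + fp)          # share of positive findings that were cancer
    return sensitivity, specificity, ppv


# Illustrative counts only; these are NOT figures from the study.
sens, spec, ppv = screening_metrics(tp=9, fp=30, tn=160, fn=1)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} ppv={ppv:.2f}")
```

With sensitivity held fixed, cutting false positives raises both specificity and positive predictive value, which is the pattern the article describes for the AI.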

But with their powers combined, the human experts and the machine learning model achieved even higher specificity and even lower biopsy rates...








Biopsy rate

...while also identifying cancer cases more accurately than either the AI or the radiologists did on their own.




Positive predictive value







To accept AI's assistance in the reading room, radiologists need better insight into how deep learning algorithms arrive at recommendations.

For now, the "hybrid model" of human readers and the algorithm is a statistical construct.
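One way to picture such a statistical construct is to blend a reader's suspicion score with the AI's predicted probability for the same exam, then apply a single decision threshold. The sketch below assumes equal weights, scores on a 0-to-1 scale, and a threshold of 0.5. All names and values are illustrative assumptions, not the study's actual procedure.

```python
def hybrid_score(reader_score, ai_score, ai_weight=0.5):
    """Blend a reader's suspicion score with the AI's probability (both in [0, 1])."""
    if not 0.0 <= ai_weight <= 1.0:
        raise ValueError("ai_weight must be between 0 and 1")
    return (1.0 - ai_weight) * reader_score + ai_weight * ai_score


def recommend_biopsy(reader_score, ai_score, threshold=0.5):
    """Flag an exam for biopsy when the equally weighted blend reaches the threshold."""
    return hybrid_score(reader_score, ai_score) >= threshold


# Hypothetical exams, given as (reader score, AI probability).
print(recommend_biopsy(0.8, 0.9))  # reader and AI are both suspicious
print(recommend_biopsy(0.7, 0.1))  # reader is suspicious, AI is not
```

In this toy setup, a confident "benign" from the AI can pull a borderline reader score below the threshold, which is how a blend of this kind could lower the biopsy rate.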

"We're looking for more intuitive ways to show how AI is making predictions," said Yiqiu "Artie" Shen, who co-led the research.

Shen, a PhD candidate at the NYU Center for Data Science, developed weakly supervised machine learning methods for mammography before investigating ultrasound.

"Our suggestion is to use saliency maps," said Shen, referring to the red- and green-tinted heat maps that indicate where the model bases its findings of malignant or benign lesions.

"This is one of the first ultrasound models to do that with breast cancer," said Farah Shamout, DPhil, who co-led the study.

Shamout, an assistant professor at NYU Abu Dhabi, investigates AI technologies and their potential to support decision-making in healthcare.

"Maybe the next step would be to give reasons," Shen said. "Maybe we could develop an AI able to describe its reasoning."

"There's a lot of potential here," Shamout said. "We can show that a hybrid approach improves performance."

Shen stressed that machine learning research needs to focus more on collaboration and explainability.
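For readers curious about what a saliency map measures, here is a rough sketch of one generic way to build one. It uses occlusion sensitivity: blank out one pixel at a time and record how much the model's score drops. This technique is swapped in for illustration and is not claimed to be how the study's model produces its maps. The tiny "image" and the stand-in scoring function are hypothetical.

```python
def occlusion_saliency(image, score_fn, baseline=0.0):
    """Return a map of score drops when each pixel is replaced by `baseline`.

    image: list of rows of floats; score_fn: callable that returns a float.
    Larger values mean the pixel mattered more to the score.
    """
    base_score = score_fn(image)
    saliency = []
    for r, row in enumerate(image):
        saliency_row = []
        for c in range(len(row)):
            occluded = [list(x) for x in image]  # copy so the input stays untouched
            occluded[r][c] = baseline
            saliency_row.append(base_score - score_fn(occluded))
        saliency.append(saliency_row)
    return saliency


def toy_model(image):
    """Stand-in 'model' (an assumption): mean brightness of the middle row."""
    middle = image[len(image) // 2]
    return sum(middle) / len(middle)


image = [[0.1, 0.2, 0.1],
         [0.3, 0.9, 0.3],
         [0.1, 0.2, 0.1]]
heat = occlusion_saliency(image, toy_model)
for row in heat:
    print([round(v, 2) for v in row])
```

Pixels the stand-in model ignores get a value of zero, while the bright central pixel gets the largest value. A heat map tinted by such values shows a reader where the score comes from.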

Research images, data, and photo of Yiqiu "Artie" Shen courtesy of Yiqiu "Artie" Shen. Photo of Farah Shamout by Sam Hollenshead/NYU Photo Bureau. Body photos by Ivan Stern/Unsplash, Annie Spratt/Unsplash. Abstract illustrations by Oleksii Lishchyshyn/Shutterstock. Text, media editing, and production by Pawel Slabiak.
