“The limitations of radiology models stem from deeper problems with building medical AI. Training datasets come with strict inclusion criteria: the diagnosis must be unambiguous (typically confirmed by a consensus of two to three experts or a pathology result), and images that are shot at an odd angle, look too dark, or are blurry are excluded. This skews performance towards the easiest cases, which doctors are already best at diagnosing, and away from real-world images. In one 2022 study, an algorithm meant to spot pneumonia on chest X-rays faltered when the disease presented in subtle or mild forms, or when other lung conditions resembled pneumonia, such as pleural effusion, where fluid builds up around the lungs, or atelectasis (collapsed lung).”