The news: OpenAI’s o1 series can outperform attending physicians on some clinical diagnostic reasoning cases, according to a study recently published in the journal Science.
Digging into the details: Researchers at Harvard and Beth Israel Deaconess compared the AI model with two physicians on ER and other case diagnoses, then had two additional physicians evaluate the AI’s diagnoses against the doctors’ results.
The LLM outperformed the clinicians on several ER decisions, such as identifying likely diagnoses and choosing next steps in patient management. For example, in one experiment, the AI gave the correct or a very nearly correct diagnosis in 67% of cases at initial ER triage, compared with 55% and 50% for the two physicians. For patients being admitted to the medical floor or ICU, the AI identified the exact or near-exact diagnosis 82% of the time, compared with 70% and 79% for the physicians.
Why it matters: Numerous studies have shown that AI can match or even outperform doctors in specific diagnostic tasks. This study drew particular attention because, unlike most prior research, it used ER cases exactly as they appeared in electronic health records rather than cleaned-up data. The authors noted that pre-cleaning patient data for studies can inflate AI performance compared with the “messy” data encountered in real clinical workflows.
Implications for healthcare AI companies: It’s hard to draw broad conclusions from a study comparing AI to just two doctors. Yet headlines that frame the study as proof that AI can out-diagnose clinicians feed a narrative that undercuts physicians’ value in patient care and gives AI companies fodder to promote their technology. Notably, OpenAI has not used the study this way.
Ironically, one of the study’s authors even admitted that his research “is going to get used by companies that are heavily financed and are looking to skip some of these essential safe pieces of medicine.” Responsible marketing of healthcare AI should avoid overhyping benefits, clearly communicate the nuances and limitations of the specific studies underpinning the evidence, and prioritize building physician trust when engaging health systems and providers.
This content is part of EMARKETER’s subscription Briefings, where we pair daily updates with data and analysis from forecasts and research reports.