Eye on Ai by Denise N. Fyffe, M.Ed

Eye on AI: OpenAI Model Did Better Than ER Doctors at Diagnosing Patients

Source: NPR

A patient shows up at the hospital with a pulmonary embolism — a blood clot that has traveled to the lungs. After initially improving, their symptoms start to worsen. The medical team suspects the medication isn’t working.

In steps artificial intelligence — with its own theory.

It has scanned the medical records and suspects a history of lupus, an autoimmune condition which can lead to heart inflammation, could explain what was really ailing the patient.

Turns out, the AI model is correct.

This type of scenario could become a reality in the not-too-distant future, according to a study published Thursday in the journal Science.

Researchers based at Harvard Medical School and Beth Israel Deaconess Medical Center found that an AI reasoning model, developed by OpenAI, excelled at diagnosing patients and making decisions about managing their care. It matched and often outperformed doctors and the earlier AI model, GPT-4.

The researchers ran a series of experiments on the AI model to test its clinical acumen — including actual cases like the lupus patient who’d been previously treated at the emergency department at Beth Israel in Boston.

The team graded how well the AI model could provide an accurate diagnosis at three moments in time, from the triage stage in the ER, up to being admitted into the hospital.

Overall, AI outperformed two experienced physicians — and did so with only the electronic health records and the limited information that had been available to the physicians at the time.

“This is the big conclusion for me — it works with the messy real-world data of the emergency department,” said Dr. Adam Rodman, a clinical researcher at Beth Israel and one of the study authors. “It works for making diagnoses in the real world.”

Other parts of the study focused on case reports published in the New England Journal of Medicine and clinical vignettes to suss out whether the AI model could meet well-established “benchmarks” and game out thorny diagnostic questions.

“The model outperformed our very large physician baseline,” said Raj Manrai, assistant professor of Biomedical Informatics at Harvard Medical School who was also part of the study.

The authors emphasize the AI relied on text alone, while in real life, clinicians need to attend to many other inputs like images, sounds and nonverbal cues when diagnosing and treating a patient.

Still, the work showcases just how far the technology has advanced in the last few years. Prior versions of large language models faltered when dealing with uncertainty and when generating a differential diagnosis: a list of possible conditions that could explain a patient's symptoms.

“This paper is a beautiful summary of just how much things have improved,” says Dr. David Reich, chief clinical officer for Mount Sinai Health System in New York, who was not involved in the work.

“You have something which is quite accurate, possibly ready for prime time,” he says. “Now the open question is how the heck do you introduce it into clinical workflows in ways that actually improve care?”

After all, arriving at some tricky, final diagnosis — which the AI model shines at — isn’t necessarily reflective of how things play out “in real clinical medicine,” says Reich, where the “outcomes are much more subtle and perhaps more diverse.”

And the emergency department is only a small portion of the patient’s total medical care. Rodman acknowledges it’s unlikely AI would have done such an “impressive” job had the team provided it with the records of someone who’d spent a month in the hospital.

None of those involved in the new study believe the findings support supplanting doctors with AI, “despite what some companies are likely to say and how they’re likely to use these results,” says Manrai.

“I think it does mean that we’re witnessing a really profound change in technology that will reshape medicine,” he adds.

But the results do make the case that AI models need to be tested in a rigorous fashion, ideally through forward-looking trials that can give more certainty about how the technology ultimately impacts clinical practice.

“It’s a very challenging process to design these trials,” says Reich, “but this study is a perfect call to action.”

Read more of this story at NPR.

*****

About curator: Denise N. Fyffe is a published author of more than 100 books over a career spanning more than fifteen years, and enjoys gardening and volunteering. She is a trainer, publisher, author, and writing mentor, helping others to achieve their dreams.

FEATURED BOOKS

My Life in LMS

Genesis of LMS

The Impact of LMS

Developing People in Learning Organizations

Career Development for National Growth

How to Keep Writing Guide

Write the Book Already!

The American Family

The Global Family

The Modern Family

The Blended Family

The Caribbean Family

The Expert Teacher’s Guide

School Counselling in Jamaica

The Guidance Counsellors Handbook

Philosophy of Education & Work

Sophie’s Place

Understanding the Human Element

Empowering the 21st Century Worker

The Impact of Trade Unions in Jamaica

Thieves in the Workplace

The Psychology of Workplace Theft

What did you think about this? Please leave a reply.