New Delhi: A new artificial intelligence tool built by Microsoft is showing signs that it could one day assist in medical diagnoses that stump even experienced doctors. In a recent study, the company said its system outperformed human physicians on some of the toughest real-life diagnostic cases.
The AI, known as MAI-DxO (Microsoft AI Diagnostic Orchestrator), was tested using detailed case reports published in the New England Journal of Medicine. These are not textbook examples, but actual complicated patient cases that usually need multiple rounds of testing and discussion. Microsoft claims its tool got the diagnosis right in 85.5 percent of these cases. In comparison, a group of 21 practicing doctors from the US and UK managed to solve only 20 percent.
What makes MAI-DxO stand out is how it approaches each case. It doesn’t just give one final answer. Instead, it asks questions, considers test results, and revises its guesses, much as a doctor might think through a tricky case. Microsoft built the tool to behave like a panel of doctors with different perspectives working together to find the most likely diagnosis.
“We’re taking a big step towards medical superintelligence,” wrote Mustafa Suleyman, CEO of Microsoft AI, in a LinkedIn post.
I found this bit especially interesting: the AI didn’t rely on just one model. It was tested with several top language models, including OpenAI’s GPT, Google’s Gemini, Meta’s Llama, and Anthropic’s Claude. The highest success rate, though, came from pairing MAI-DxO with OpenAI’s o3 model.
A major challenge in real-world medicine is cost. Tests aren’t free, and unnecessary ones can delay treatment or stress out patients. Microsoft said MAI-DxO can be told to stay within a certain budget, which helps it avoid overtesting.
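To make the budget idea concrete, here is a rough Python sketch of my own. It is not Microsoft's code and does not reflect how MAI-DxO actually works; the `Test` class, the `diagnose` function, the cheapest-first ordering, and every name and number are hypothetical, chosen only to show how a spending cap can limit which tests get ordered while the running guesses are updated.

```python
from dataclasses import dataclass


@dataclass
class Test:
    name: str
    cost: float                 # hypothetical price of running the test
    evidence: dict[str, float]  # hypothetical multiplier per candidate diagnosis


def diagnose(candidates, tests, budget):
    """Order tests cheapest-first until the budget runs out, updating scores."""
    scores = {c: 1.0 for c in candidates}
    spent = 0.0
    ordered = []
    for test in sorted(tests, key=lambda t: t.cost):
        if spent + test.cost > budget:
            break  # tests are sorted by cost, so nothing later fits either
        spent += test.cost
        ordered.append(test.name)
        # Revise the running guess for each diagnosis this test speaks to.
        for diagnosis, weight in test.evidence.items():
            if diagnosis in scores:
                scores[diagnosis] *= weight
    best = max(scores, key=scores.get)
    return best, spent, ordered


# Made-up example data, for illustration only.
tests = [
    Test("test 1", 50.0, {"condition A": 3.0, "condition B": 0.5}),
    Test("test 2", 30.0, {"condition A": 1.5}),
    Test("test 3", 500.0, {"condition B": 4.0}),
]
best, spent, ordered = diagnose(["condition A", "condition B"], tests, budget=100.0)
print(best, spent, ordered)
```

With a budget of 100, the expensive third test is never ordered, so the sketch settles on a diagnosis using only the two cheaper tests. A real system would weigh how informative a test is against its price rather than simply going cheapest-first.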
The AI was not only more accurate. According to Microsoft, it also kept testing costs lower than both the doctors and the individual AI models did. That matters in a field where reducing waste is as important as improving care.
Bay Gross, vice president of health at Microsoft AI, called the project “a proof-of-concept showing that large language model systems can master medicine’s most intricate diagnostic challenges by following the same step-by-step reasoning and debate process that expert physicians use every day.”
Before anyone thinks this will replace doctors tomorrow, there’s a catch. MAI-DxO is still in the research phase. It hasn’t been cleared by regulators or tested in everyday clinics yet. The cases used were unusually hard, and the doctors didn’t have access to the tools they’d normally rely on: no books, no peers, no online lookups.
Microsoft’s team has shared a preprint of the study and said they’re working with healthcare partners to study the tool further. There’s talk of turning the benchmark into a public dataset so others can test their own models in the same way.
From where I stand, this feels like the start of something new. AI won’t replace your local doctor anytime soon, but tools like MAI-DxO might one day help them make faster, safer, and smarter decisions, especially when the case is anything but simple.