Most clinical AI is trained to pass licensing exams, not to hold a consultation. That distinction sounds small. It is the whole game.
The exam-shaped model
Look at how medical models are usually built. They learn from single-turn question-and-answer data: a symptom goes in, a diagnosis comes out. The result is a model that scores well on board-style questions and falls apart the moment a real patient sits down.
A real doctor does not hear "chest pain" and fire back "heart attack". They consult. They ask when it started, what makes it worse, whether it radiates, how frightened the patient is. They rule out the cheap explanations before reaching for the expensive ones. The diagnosis is the last step of a conversation, not the first.
Designing for the conversation, not the answer
We build the data around how consultations actually flow. Every dialogue in meddies-consultant is generated under three clinical-interview frameworks:
- Calgary-Cambridge sets the arc of the conversation: opening the session, gathering information, building rapport, closing.
- OPQRST forces the AI doctor to drill into the symptom systematically: onset, provocation or palliation, quality, region and radiation, severity, timing.
- FIFE programs the patient's inner state: feelings, ideas, function, and expectations, so the synthetic patient behaves like a person rather than a clean data point.
The frameworks are not decoration. They are constraints on the generator. A model trained on this data learns the shape of a good interview, not just the shortest path to a label.
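To make "constraints on the generator" concrete, here is a minimal sketch of what checking a generated dialogue against the three frameworks could look like. It is an illustration under stated assumptions, not the meddies-consultant pipeline: the names `DoctorTurn`, `Dialogue`, and `violations`, the stage labels, and the idea of tagging each doctor turn with the OPQRST dimensions it probes are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical labels for illustration; not the actual meddies-consultant schema.
CALGARY_CAMBRIDGE = ("opening", "gathering_information", "building_rapport", "closing")
OPQRST = ("onset", "provocation", "quality", "region", "severity", "timing")
FIFE = ("feelings", "ideas", "function", "expectations")


@dataclass
class DoctorTurn:
    stage: str          # which Calgary-Cambridge stage this turn belongs to
    text: str           # what the AI doctor says
    probes: tuple = ()  # OPQRST dimensions this turn asks about


@dataclass
class Dialogue:
    patient_state: dict                        # FIFE key -> short description
    turns: list = field(default_factory=list)  # DoctorTurn objects, in order


def violations(dialogue):
    """Return human-readable constraint violations; an empty list means it passes."""
    problems = []

    # FIFE: the synthetic patient needs all four parts of an inner state.
    missing_state = [k for k in FIFE if not dialogue.patient_state.get(k)]
    if missing_state:
        problems.append("patient state missing: " + ", ".join(missing_state))

    # Calgary-Cambridge: known stages only, opening first, closing last.
    stages = [t.stage for t in dialogue.turns]
    unknown = sorted({s for s in stages if s not in CALGARY_CAMBRIDGE})
    if unknown:
        problems.append("unknown stage: " + ", ".join(unknown))
    if not stages or stages[0] != "opening":
        problems.append("does not start with opening")
    if not stages or stages[-1] != "closing":
        problems.append("does not end with closing")

    # OPQRST: every dimension must be probed somewhere in the interview.
    covered = {p for t in dialogue.turns for p in t.probes}
    unprobed = [p for p in OPQRST if p not in covered]
    if unprobed:
        problems.append("symptom not probed for: " + ", ".join(unprobed))

    return problems


# An exam-shaped dialogue: greets, then jumps straight to a label.
rushed = Dialogue(
    patient_state={k: "placeholder" for k in FIFE},
    turns=[
        DoctorTurn("opening", "What brings you in today?"),
        DoctorTurn("closing", "This sounds like a heart attack."),
    ],
)
print(violations(rushed))
```

A check like this could reject or regenerate any sample with a non-empty violation list, which is one way the frameworks would act as constraints rather than style hints. Whether rapport is tracked as its own stage or as a thread running through every turn is a design choice this sketch does not settle.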
Why it matters for safety
A model that answers too fast is a model that skips the question that would have changed the answer. Teaching a system to ask first is teaching it to be careful. For an ambient scribe or a triage assistant working beside a clinician, that habit of asking is the difference between a useful colleague and a confident liability.
Train models to listen. The answers get better on their own.
