Evidence-Graded Decision Authorization for Safe Clinical AI: A Constrained Reasoning Framework

Clinical AI systems have achieved strong predictive performance; however, prediction accuracy is not sufficient for clinical safety. Retrieval-augmented generation (RAG) improves factual accuracy, and general-purpose LLM guardrails constrain surface-level output safety, but neither mechanism governs the inferential gap between the available clinical evidence and the clinical claims that evidence permits. We propose Evidence-Graded Decision Authorization (EGDA), a framework that separates evidence extraction, sufficiency assessment, and claim-level authorization through domain-specific rules. In a controlled experiment using 60 breast cancer decision-snapshot cases (1,260 system outputs across three arms, evaluated by LLM-as-Judge with expert calibration), EGDA reduced the unjustified inference rate to 8.0% (vs. 48.7% for an unconstrained LLM and 47.7% for RAG; risk difference vs. unconstrained -40.7 percentage points, 95% CI -46.9 to -34.0, p < 0.001) and raised the appropriate refusal rate to 95.0% (vs. 56.9% and 56.9%;
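The three stages named in the abstract (evidence extraction, sufficiency assessment, claim-level authorization) can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Claim` type, the `REQUIRED_EVIDENCE` rule table, the field names such as `er_status`, and the three-level `Sufficiency` grade are all hypothetical stand-ins for whatever domain-specific rules EGDA actually uses.

```python
from dataclasses import dataclass
from enum import Enum


class Sufficiency(Enum):
    """Hypothetical three-level evidence grade (not the paper's scale)."""
    INSUFFICIENT = 0
    PARTIAL = 1
    SUFFICIENT = 2


@dataclass(frozen=True)
class Claim:
    kind: str  # hypothetical claim type, used as the key into the rule table
    text: str  # the statement the system would like to emit


# Hypothetical domain rules: the evidence fields each claim type requires.
# These are illustrative placeholders, not clinical guidance.
REQUIRED_EVIDENCE = {
    "recommend_endocrine_therapy": {"er_status", "pr_status"},
    "recommend_anti_her2_therapy": {"her2_status"},
}


def extract_evidence(snapshot: dict) -> dict:
    """Stage 1: keep only the fields that are actually documented."""
    return {key: value for key, value in snapshot.items() if value is not None}


def assess_sufficiency(claim_kind: str, evidence: dict) -> Sufficiency:
    """Stage 2: grade the evidence against the rule for this claim type."""
    required = REQUIRED_EVIDENCE.get(claim_kind)
    if required is None:
        # A claim type with no rule is never treated as supported.
        return Sufficiency.INSUFFICIENT
    present = required.intersection(evidence)
    if present == required:
        return Sufficiency.SUFFICIENT
    return Sufficiency.PARTIAL if present else Sufficiency.INSUFFICIENT


def authorize(claim: Claim, evidence: dict) -> str:
    """Stage 3: emit the claim only when its evidence grade is SUFFICIENT."""
    grade = assess_sufficiency(claim.kind, evidence)
    if grade is Sufficiency.SUFFICIENT:
        return claim.text
    return f"REFUSED: evidence for '{claim.kind}' is {grade.name}"


if __name__ == "__main__":
    # Hypothetical decision snapshot in which the HER2 result is missing.
    snapshot = {"er_status": "positive", "pr_status": "positive", "her2_status": None}
    evidence = extract_evidence(snapshot)
    print(authorize(Claim("recommend_endocrine_therapy", "Endocrine therapy is supported."), evidence))
    print(authorize(Claim("recommend_anti_her2_therapy", "Anti-HER2 therapy is supported."), evidence))
```

The design point the sketch tries to capture is that authorization is decided per claim rather than per response: the same snapshot can authorize one claim and refuse another, which is the kind of behavior the refusal-rate metric in the abstract appears to measure.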

Source

https://www.medrxiv.org/content/10.64898/2026.05.19.26353565v1?rss=1