BackMR: Rethinking Chart Review with Multi-Agent AI

Case Study

The Problem: Chart Review as a Bottleneck

Medical professionals spend significant portions of their workday manually reviewing patient charts to extract relevant information. A typical pre-visit chart review might require a physician to:

Navigate to the patient's record in the EHR
Check the problem list and recent diagnoses
Review vital signs trends across multiple encounters
Examine recent lab results and flag abnormalities
Cross-reference current medications with conditions
Verify insurance coverage and prior authorizations
Scan clinical notes from specialists
Identify any care gaps or pending orders

This process often takes 10-15 minutes per patient and involves clicking through numerous screens, each presenting data in siloed, structured formats that don't align with the way clinicians naturally think about patient care. The cognitive load is high, the risk of missing important information is real, and the time could be better spent on direct patient interaction.

Traditional EHR systems excel at storing and organizing data but fall short at synthesizing information across domains to answer specific clinical questions. The ideal solution would allow natural language access to patient data, provide comprehensive yet focused answers, maintain auditability of information sources, adapt to different clinical contexts, and integrate seamlessly with existing FHIR-enabled systems.

The Solution: A Multi-Agent AI Architecture

BackMR addresses these challenges through a hierarchical multi-agent system that acts as an intelligent layer between clinicians and their EHR data. Instead of navigating multiple screens, healthcare professionals can ask questions in plain English: "Has the patient's blood pressure improved over the last six months?" or "What insurance coverage did the patient have at the time of their last ER visit?"

The system leverages two foundational capabilities of modern large language models:

Advanced Reasoning and Planning: Today's LLMs, such as GPT-4o, can understand complex questions and break them down into step-by-step sub-tasks. This enables the system to form coherent chart review strategies without requiring pre-programmed logic for every possible question type.

Native Understanding of FHIR: Modern LLMs have seen healthcare standards, including the FHIR R4 API specification, in their training data. As a result, they can generate FHIR queries on demand and retrieve the right type of structured data (labs, medications, notes, procedures) without pre-built templates or rigid code paths.

Together, these capabilities enable BackMR to combine natural language understanding, structured data retrieval, modular agent-based reasoning, and human-auditable decision logic.

Developed by Vicert as a research initiative, BackMR serves as an exploration of what these AI capabilities can achieve in healthcare settings. The project focuses on demonstrating technical feasibility and validating architectural approaches for applying multi-agent systems to clinical workflows, providing insights into how such systems can be designed and implemented.

How BackMR Works: The Architecture

BackMR's architecture is built on two key principles: agent specialization and iterative reasoning.

The Workflow

When a clinician submits a question, it flows through four main stages:

1. Planning: A Planner agent receives the natural language question and decomposes it into logical reasoning steps. For example, the question "Has the patient's blood pressure improved over the last six months?" becomes:

Step 1: Retrieve blood pressure observations from the past 6 months
Step 2: Analyze trend direction (increasing/decreasing)
Step 3: Summarize results

2. Orchestration: A central Chart Analyzer receives the plan and the original question, then identifies which specialized domain agents are needed to fulfill each step. It delegates specific sub-questions to the appropriate agents.

3. Execution: Specialized agents execute their tasks. BackMR includes nine domain agents:

Clinical Agent: Conditions, procedures, clinical notes
Diagnostics Agent: Labs, imaging, pathology reports
Medications Agent: Prescriptions, administrations, immunizations
Administrative Agent: Encounters, scheduling, referrals
Financial Agent: Billing, claims, coverage
Social Determinants Agent: Socio-economic factors, environment
Patient Agent: Demographics and identifiers
Entities Agent: Providers, organizations, care teams
System Info Agent: Internal metrics and usage

Each agent uses specialized tools to query FHIR APIs, retrieve structured data, and return focused insights.

4. Reflection and Self-Correction: The synthesized answer passes through a Reflection module that evaluates its completeness and accuracy, scoring it from 0 to 1. If the score falls below a threshold (e.g., 0.6), the module generates feedback and the system returns to the Planner to refine the approach and try again.

This iterative process is designed to make answers not only fluent but also accurate, complete, and auditable.
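The four stages above can be sketched as a simple loop. This is a minimal illustration, not BackMR's actual code: the function and class names are hypothetical, the planner, analyzer, and reflection module are passed in as plain callables, and the `max_rounds` cap is an added assumption (the case study does not say how many retries are allowed). Only the 0-to-1 score and the example threshold of 0.6 come from the description.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Review:
    score: float    # completeness/accuracy score between 0 and 1
    feedback: str   # guidance handed back to the planner on a retry


def answer_with_reflection(
    question: str,
    plan: Callable[[str, Optional[str]], list[str]],
    execute: Callable[[str, list[str]], str],
    reflect: Callable[[str, str], Review],
    threshold: float = 0.6,   # example threshold from the text
    max_rounds: int = 3,      # assumption: cap retries so the loop terminates
) -> tuple[str, float, int]:
    """Return (answer, score, rounds_used), keeping the best-scoring answer."""
    feedback: Optional[str] = None
    best_answer, best_score = "", -1.0
    for round_no in range(1, max_rounds + 1):
        steps = plan(question, feedback)      # 1. Planning
        answer = execute(question, steps)     # 2-3. Orchestration and execution
        review = reflect(question, answer)    # 4. Reflection
        if review.score > best_score:
            best_answer, best_score = answer, review.score
        if review.score >= threshold:
            return answer, review.score, round_no
        feedback = review.feedback            # refine the plan and try again
    return best_answer, best_score, max_rounds


# Toy stand-ins for the LLM-backed components, for illustration only.
def toy_plan(question: str, feedback: Optional[str]) -> list[str]:
    if feedback is None:
        return ["retrieve", "summarize"]
    return ["retrieve", "analyze trend", "summarize"]


def toy_execute(question: str, steps: list[str]) -> str:
    return f"{len(steps)} steps"


def toy_reflect(question: str, answer: str) -> Review:
    if answer.startswith("3"):
        return Review(0.9, "")
    return Review(0.4, "add trend analysis")


result = answer_with_reflection(
    "Has the patient's blood pressure improved?", toy_plan, toy_execute, toy_reflect
)
```

Passing the three components in as callables keeps the control flow visible and makes the loop easy to exercise without any model calls.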

BackMR vs. Alternative Approaches

Several methods exist for AI-powered chart review, each with distinct trade-offs:

Approach	Strengths	Limitations	When to Use
BackMR (Agent-based FHIR)	Auditable data paths, adaptable to complex queries, modular & extensible, supports iterative reasoning	More complex implementation, requires FHIR API access	For comprehensive clinical reasoning and complex queries requiring structured data access
Full-context queries	Conceptually simple, no data pre-processing	Most records exceed context windows, inefficient token usage, non-auditable	For very small records or limited scope reviews
Embedding + vector search	Works with unstructured notes, simple implementation	Weak on structured data, cannot execute parametrized queries, limited reasoning capabilities	For summarization and information retrieval from clinical notes
RAG pipelines	Good document retrieval, works with existing text	Same limitations as vector search, challenges with numerical data	For QA over semi-structured documents
Knowledge Graph-based RAG	Rich semantic context, good for relationship queries	Complex preprocessing, schema development challenges	For relationship-heavy queries and inferences across data types

BackMR's agent-based approach excels when questions require combining structured data retrieval (labs, medications, encounters) with clinical reasoning. Vector search and RAG approaches work well for document summarization but struggle with queries like "Find the most recent lab with a value above X" or "What was the patient's insurance at the time of encounter Y?"

Smart FHIR Tooling: The Technical Edge

A key innovation in BackMR is its "smart FHIR tooling", which combines deterministic querying with intelligent fallbacks. When an agent needs data, it first attempts a precise FHIR query using standard search parameters (resource type, filters, sorting). The LLM generates these queries from its knowledge of the FHIR standard.
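To make "a precise FHIR query" concrete, the sketch below builds the kind of FHIR R4 search URL an agent might produce for the blood pressure question used earlier. The function name, base URL, and patient ID are hypothetical, and the case study does not show how BackMR actually assembles its requests. The search parameters (`patient`, `code`, `date`, `_sort`, `_count`) are standard FHIR search parameters, and 85354-9 is the LOINC code for the blood pressure panel.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

# LOINC code for the blood pressure panel, in FHIR "system|code" token form.
LOINC_BP_PANEL = "http://loinc.org|85354-9"


def build_bp_search_url(
    base_url: str, patient_id: str, today: date, days_back: int = 183
) -> str:
    """Build an Observation search for recent blood pressure readings."""
    since = today - timedelta(days=days_back)
    params = {
        "patient": patient_id,             # resource filter: which patient
        "code": LOINC_BP_PANEL,            # resource filter: which observation
        "date": f"ge{since.isoformat()}",  # "ge" prefix = on or after this date
        "_sort": "-date",                  # newest first
        "_count": "50",                    # page size
    }
    return f"{base_url.rstrip('/')}/Observation?{urlencode(params)}"


url = build_bp_search_url(
    "https://fhir.example.org/r4", "123", date(2025, 7, 1), days_back=30
)
```

Because the query is plain text, it can be logged alongside the answer, which is one way the data path stays auditable.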

If the query succeeds but returns no results—often due to inconsistent coding practices or missing fields—the system engages a fallback mechanism: it retrieves all resources of the requested type and invokes a separate LLM-based filter agent to interpret the filter semantically and return relevant entries.

For example, if a user asks "Has the patient ever had abnormal liver function tests?" and the primary FHIR query for liver panel codes returns nothing, the fallback retrieves all lab data and lets the LLM determine which entries constitute liver function tests and which values are abnormal. The result is fast deterministic access when the coded query works, and adaptive LLM-based reasoning when it does not.
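The query-then-fallback pattern can be sketched as follows. Everything here is illustrative: `search_with_fallback`, `run_query`, and `keyword_filter` are hypothetical names, the lab codes are made up, and a simple keyword match stands in for the LLM-based filter agent so the example runs without a model.

```python
from typing import Callable

Resource = dict


def search_with_fallback(
    resource_type: str,
    params: dict[str, str],
    intent: str,
    run_query: Callable[[str, dict[str, str]], list[Resource]],
    semantic_filter: Callable[[list[Resource], str], list[Resource]],
) -> tuple[list[Resource], str]:
    """Return matching resources plus the path that produced them."""
    results = run_query(resource_type, params)
    if results:
        return results, "deterministic"
    # The query succeeded but matched nothing: fetch every resource of this
    # type (run_query is assumed to scope results to the current patient)
    # and let the filter agent decide which entries match the intent.
    everything = run_query(resource_type, {})
    return semantic_filter(everything, intent), "fallback"


# In-memory stand-ins with made-up local codes, for illustration only.
LABS = [
    {"code": "LOCAL-17", "name": "ALT", "value": 88},
    {"code": "LOCAL-18", "name": "AST", "value": 35},
    {"code": "LOCAL-40", "name": "Glucose", "value": 99},
]


def run_query(resource_type: str, params: dict[str, str]) -> list[Resource]:
    wanted = params.get("code")
    return [lab for lab in LABS if wanted is None or lab["code"] == wanted]


def keyword_filter(resources: list[Resource], intent: str) -> list[Resource]:
    # Stand-in for the LLM filter agent: a real one would interpret `intent`.
    liver_tests = {"ALT", "AST"}
    return [r for r in resources if r["name"] in liver_tests]


results, path = search_with_fallback(
    "Observation", {"code": "HEPATIC-PANEL"}, "liver function tests",
    run_query, keyword_filter,
)
```

Returning the path label ("deterministic" or "fallback") next to the results lets a reviewer see which answers depended on LLM interpretation rather than on coded data.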

Real-World Applications

BackMR integrates with FHIR-enabled EHR systems as a flexible chart review layer supporting diverse workflows:

Nurses and Care Managers: Quickly gather patient information for care coordination, answer patient questions during phone calls, identify barriers to care such as transportation or cost issues.

Physicians: Prepare for encounters without extensive pre-charting, perform deep dives into patient history for complex cases, assess clinical risk scores on demand.

Clinical Documentation Improvement (CDI) Teams: Validate documentation against structured and unstructured data, identify gaps between coded diagnoses and supporting evidence.

Patient Navigation Systems: Power intelligent chatbots and virtual assistants that can answer patient questions by querying their own records.

Administrative Staff: Verify insurance coverage for procedures, check prior authorization status, review billing and claims information.

Example Scenarios

Scenario 1: A care manager asks, "Which social factors may be impacting this patient's medication adherence?"

BackMR retrieves patient-reported SDOH data, examines insurance coverage and cost-sharing, reviews notes mentioning financial concerns or transportation barriers, and synthesizes a comprehensive answer highlighting specific factors like pharmacy access limitations, high copays, and housing instability.

Scenario 2: A physician asks, "Does the patient meet criteria for sepsis based on the most recent vitals and labs?"

BackMR retrieves the latest vital signs (temperature, heart rate, respiratory rate, blood pressure), pulls recent lab results (WBC, lactate), applies clinical criteria for sepsis diagnosis, and provides a structured answer indicating whether criteria are met with supporting values.
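The case study does not say which sepsis criteria BackMR applies. As an illustration of "a structured answer with supporting values", the sketch below checks the classic SIRS screening thresholds, chosen here only because they are widely known. The function name and output shape are hypothetical, the PaCO2 and band-count alternatives are omitted for brevity, and a positive SIRS screen is not by itself a sepsis diagnosis.

```python
def sirs_summary(
    temp_c: float, heart_rate: float, resp_rate: float, wbc_k_per_ul: float
) -> dict:
    """Evaluate the classic SIRS thresholds and report which ones are met."""
    criteria = {
        "temperature": temp_c > 38.0 or temp_c < 36.0,       # degrees Celsius
        "heart_rate": heart_rate > 90,                       # beats per minute
        "respiratory_rate": resp_rate > 20,                  # breaths per minute
        "wbc": wbc_k_per_ul > 12.0 or wbc_k_per_ul < 4.0,    # x10^3 cells/uL
    }
    count = sum(criteria.values())
    return {
        "criteria": criteria,
        "count": count,
        "sirs_positive": count >= 2,   # two or more criteria met
        "values": {
            "temp_c": temp_c,
            "heart_rate": heart_rate,
            "resp_rate": resp_rate,
            "wbc_k_per_ul": wbc_k_per_ul,
        },
    }


summary = sirs_summary(temp_c=38.6, heart_rate=104, resp_rate=18, wbc_k_per_ul=9.5)
```

Keeping the threshold check deterministic and returning the raw values alongside each flag leaves the LLM to handle retrieval and explanation while the numeric comparison stays easy to audit.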

Scenario 3: An administrator asks, "Show me a summary of claims from the past 12 months. Who covered these costs?"

BackMR queries ExplanationOfBenefit resources, aggregates claims by payer, summarizes total costs and coverage, and presents a breakdown of what was paid by insurance versus patient responsibility.
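The aggregation step in this scenario might look like the sketch below. It assumes FHIR R4 ExplanationOfBenefit resources already parsed into dictionaries and reads only the `insurer.display` and `total` elements, using the standard adjudication category codes `submitted` and `benefit`. The function name and the payer in the sample data are hypothetical, and treating "submitted minus benefit" as the remaining amount is a simplification; real patient-responsibility figures depend on how each payer reports adjudication.

```python
from collections import defaultdict
from decimal import Decimal


def summarize_by_payer(eobs: list[dict]) -> dict[str, dict[str, Decimal]]:
    """Sum submitted and benefit totals per payer across EOB resources."""
    sums: dict = defaultdict(
        lambda: {"submitted": Decimal("0"), "benefit": Decimal("0")}
    )
    for eob in eobs:
        payer = eob.get("insurer", {}).get("display", "Unknown payer")
        for total in eob.get("total", []):
            codings = total.get("category", {}).get("coding", [])
            codes = {c.get("code") for c in codings}
            # Convert via str so float JSON values do not carry binary noise.
            amount = Decimal(str(total.get("amount", {}).get("value", 0)))
            for key in ("submitted", "benefit"):
                if key in codes:
                    sums[payer][key] += amount
    return {
        payer: {**s, "remaining": s["submitted"] - s["benefit"]}
        for payer, s in sums.items()
    }


def make_total(code: str, value: float) -> dict:
    """Build one EOB.total entry for the sample data."""
    return {"category": {"coding": [{"code": code}]}, "amount": {"value": value}}


# Sample data with a made-up payer, for illustration only.
sample_eobs = [
    {"insurer": {"display": "Acme Health"},
     "total": [make_total("submitted", 200.00), make_total("benefit", 150.00)]},
    {"insurer": {"display": "Acme Health"},
     "total": [make_total("submitted", 100.50), make_total("benefit", 80.50)]},
]

report = summarize_by_payer(sample_eobs)
```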

The Vision: AI-Native Healthcare Infrastructure

Most EHR systems rely on brittle, hardcoded business logic. BackMR explores an alternative vision:

Replace static logic with a dynamic, intelligent reasoning layer—FHIR in the back, generic UI on top, and LLM-powered agents in between.

With a system like BackMR, healthcare organizations can tailor chart review workflows to their specific needs. A cardiologist might define standard questions that BackMR runs against every patient record to produce the reports needed before appointments. An oncology practice might create question sets for tumor staging and treatment planning.

In essence, practitioners can program the business logic of chart review in plain English instead of waiting for custom EHR modules or third-party add-ons. This represents a fundamental shift toward more adaptable, intelligent healthcare IT infrastructure that can evolve with clinical practice rather than constraining it.

Extensibility and Future Growth

BackMR's modular architecture supports incremental expansion:

New agents can be added for specialized domains (Oncology, Radiology, Behavioral Health)
Agents can be refined into more specialized sub-agents as use cases grow
New tools can be integrated beyond FHIR (medical literature search, clinical calculators, guideline databases)
Custom workflows can be defined by clinical teams without software development

This extensibility makes BackMR a platform rather than a fixed application—capable of adapting to diverse clinical contexts while maintaining the auditability and transparency required for healthcare.

Conclusion

BackMR demonstrates how multi-agent AI architectures can transform clinical workflows by bridging the gap between natural language understanding and structured healthcare data. By combining intelligent planning, specialized domain agents, smart FHIR tooling, and self-correction mechanisms, the system is designed to deliver fast, accurate, and auditable answers to clinical questions, enhancing rather than replacing human clinical judgment.

As healthcare organizations seek to leverage AI effectively, approaches like BackMR offer a path forward: building on established standards (FHIR), maintaining transparency and auditability, and creating systems that adapt to clinical workflows rather than forcing workflows to adapt to rigid software.
