Standardizing Clinical Performance Evaluation with Multimodal GenAI for Supply Medium

About the Client

Supply Medium is a technology-driven organization specializing in Artificial Intelligence, cloud computing, data engineering, and enterprise digital transformation. As demand for intelligent evaluation systems grew across training and assessment programs, the organization sought to leverage Generative AI to automate performance evaluations while maintaining consistency, transparency, and human oversight.

Background

Supply Medium manages large-scale training and assessment programs where participants are evaluated across both technical competencies and interpersonal skills. Evaluators traditionally reviewed recorded sessions, assessed participants against detailed scoring rubrics, and prepared individualized feedback reports.

As the number of assessments increased, manual evaluation became increasingly time-consuming and difficult to standardize. Reviewing lengthy recordings, applying complex evaluation criteria, and delivering timely feedback placed significant demands on evaluators while introducing variations in scoring consistency.

To improve scalability and maintain evaluation quality, Supply Medium required an AI-powered evaluation platform capable of analyzing video, audio, and textual data while supporting human decision-making.

Challenge

Several challenges limited the efficiency and consistency of the existing evaluation process.

Time-Intensive Manual Evaluation

Each assessment required extensive manual review, significantly limiting the number of sessions evaluators could process.

Lack of Standardization

Scoring varied between evaluators and different assessment formats, creating inconsistencies despite standardized evaluation rubrics.

Subjective Soft-Skill Assessment

Evaluating communication, confidence, collaboration, empathy, and participant engagement relied heavily on human interpretation, making consistency difficult to maintain.

The primary objectives were to reduce evaluator workload, improve scoring consistency, accelerate feedback delivery, and preserve expert oversight throughout the assessment process.


Solution

Supply Medium partnered with our team to develop a secure, cloud-native Multimodal Generative AI Evaluation Platform built on AWS. The solution combines video, audio, speech, sentiment, and language analysis with Retrieval-Augmented Generation (RAG) to produce consistent, explainable, and actionable performance evaluations.

The platform was deployed on Amazon EKS to support scalable containerized workloads, while Amazon RDS served as the central repository for application data, evaluation history, and the RAG knowledge base.

Phase 1: Data Integration & Processing

  • Securely ingested synchronized video and audio recordings into the cloud platform.
  • Applied speech recognition and sentiment analysis to evaluate communication quality, confidence, tone, and conversational context.
  • Processed video streams using multimodal AI models to analyze:
    • Body posture
    • Eye contact
    • Hand gestures
    • Non-verbal communication
    • Procedural adherence
  • Stored transcripts, extracted observations, and structured metadata within Amazon RDS.
  • Implemented a Retrieval-Augmented Generation (RAG) pipeline that continuously referenced evaluation rubrics, historical assessments, and organizational guidelines to enrich AI-generated analysis.

Phase 2: Intelligent Evaluation & Scoring

  • Combined outputs from video, audio, speech, and sentiment models using advanced Large Language Models to generate comprehensive evaluation reports.
  • Applied organizational scoring rubrics through AI system prompts to ensure standardized grading.
  • Leveraged the RAG framework to retrieve the latest evaluation guidelines, institutional updates, and historical feedback before generating results.

Each evaluation included:

  • Quantitative rubric-based scoring across multiple competency areas.
  • Detailed narrative feedback highlighting participant strengths.
  • Actionable recommendations for improvement.
  • Context-aware explanations supporting every assigned score.

Phase 3: Scalable Assessment Framework

The solution was designed to support multiple assessment formats, including:

  • Individual evaluations.
  • Team-based assessments.
  • Multi-participant sessions.
  • Multi-room distributed evaluations.

Additional capabilities included:

  • Individual performance tracking during collaborative sessions.
  • Timeline synchronization across multiple video and audio sources.
  • Automatic workload scaling through Amazon EKS during peak assessment periods.
  • Continuous AI improvement through evaluator feedback incorporated into the RAG knowledge base.

Outcome

The Multimodal Generative AI platform delivered substantial operational improvements for Supply Medium:

Metric Before After Improvement
Evaluation Time 45–60 minutes 10–15 minutes 67% faster
Annual Evaluation Hours High manual effort Significantly reduced 7,000+ hours saved
Evaluation Consistency Manual variation Highly standardized 37% higher consistency
Feedback Delivery Several days Under one hour 95% faster
Cost per Evaluation Manual processing AI-assisted workflow 87% cost reduction
Scoring Variability Higher inconsistency Minimal variation Near-perfect alignment

Lasting Impact

The AI-powered evaluation platform has become a standardized assessment framework across Supply Medium, enabling faster, fairer, and more consistent evaluations while preserving expert oversight.

By combining Multimodal AI, Large Language Models, Retrieval-Augmented Generation (RAG), and scalable AWS cloud infrastructure, Supply Medium established a future-ready evaluation system that improves operational efficiency, enhances participant feedback, strengthens decision-making, and supports continued organizational growth through intelligent automation.

Leave a Reply

Your email address will not be published. Required fields are marked *