About the Client
Supply Medium is a technology-driven organization specializing in Artificial Intelligence, cloud computing, data engineering, and enterprise digital transformation. As demand for intelligent evaluation systems grew across training and assessment programs, the organization sought to leverage Generative AI to automate performance evaluations while maintaining consistency, transparency, and human oversight.
Background
Supply Medium manages large-scale training and assessment programs where participants are evaluated across both technical competencies and interpersonal skills. Evaluators traditionally reviewed recorded sessions, assessed participants against detailed scoring rubrics, and prepared individualized feedback reports.
As the number of assessments increased, manual evaluation became increasingly time-consuming and difficult to standardize. Reviewing lengthy recordings, applying complex evaluation criteria, and delivering timely feedback placed significant demands on evaluators and introduced inconsistencies in scoring.
To improve scalability and maintain evaluation quality, Supply Medium required an AI-powered evaluation platform capable of analyzing video, audio, and textual data while supporting human decision-making.
Challenge
Several challenges limited the efficiency and consistency of the existing evaluation process.
Time-Intensive Manual Evaluation
Each assessment required extensive manual review, significantly limiting the number of sessions evaluators could process.
Lack of Standardization
Scoring varied between evaluators and across assessment formats, creating inconsistencies even though standardized evaluation rubrics were in place.
Subjective Soft-Skill Assessment
Evaluating communication, confidence, collaboration, empathy, and participant engagement relied heavily on human interpretation, making consistency difficult to maintain.
The primary objectives were to reduce evaluator workload, improve scoring consistency, accelerate feedback delivery, and preserve expert oversight throughout the assessment process.
Solution
Supply Medium partnered with our team to develop a secure, cloud-native Multimodal Generative AI Evaluation Platform built on AWS. The solution combines video, audio, speech, sentiment, and language analysis with Retrieval-Augmented Generation (RAG) to produce consistent, explainable, and actionable performance evaluations.
The platform was deployed on Amazon EKS to support scalable containerized workloads, while Amazon RDS served as the central repository for application data, evaluation history, and the RAG knowledge base.
Phase 1: Data Integration & Processing
- Securely ingested synchronized video and audio recordings into the cloud platform.
- Applied speech recognition and sentiment analysis to evaluate communication quality, confidence, tone, and conversational context.
- Processed video streams using multimodal AI models to analyze:
  - Body posture
  - Eye contact
  - Hand gestures
  - Non-verbal communication
  - Procedural adherence
- Stored transcripts, extracted observations, and structured metadata within Amazon RDS.
- Implemented a Retrieval-Augmented Generation (RAG) pipeline that continuously referenced evaluation rubrics, historical assessments, and organizational guidelines to enrich AI-generated analysis.
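The retrieval step of a RAG pipeline like the one described above can be illustrated with a minimal sketch. It is not the platform's implementation: the rubric snippets, the `retrieve` function, and the bag-of-words cosine similarity are all assumptions chosen to keep the example self-contained. A production system would more likely use embedding vectors stored alongside the knowledge base.

```python
import math
import re
from collections import Counter
from typing import List

# Hypothetical rubric snippets (assumption); the case study keeps the real
# knowledge base in Amazon RDS.
KNOWLEDGE_BASE = [
    "Communication: speaks clearly, maintains an even tone, and confirms understanding.",
    "Eye contact: maintains eye contact with the other participant during key moments.",
    "Procedural adherence: completes each required step in the documented order.",
]

def tokenize(text: str) -> Counter:
    """Lowercase the text and count word occurrences."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(observation: str, k: int = 1) -> List[str]:
    """Return the k knowledge-base snippets most similar to the observation."""
    query = tokenize(observation)
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: cosine(query, tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]

# An observation extracted from a session is matched to the relevant rubric text.
top = retrieve("Participant skipped a required step and broke the documented order.")
```

The retrieved snippet would then be passed to the language model together with the observation, so the generated analysis is grounded in the current rubric wording rather than in the model's general knowledge.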
Phase 2: Intelligent Evaluation & Scoring
- Combined outputs from video, audio, speech, and sentiment models using advanced Large Language Models to generate comprehensive evaluation reports.
- Applied organizational scoring rubrics through AI system prompts to ensure standardized grading.
- Leveraged the RAG framework to retrieve the latest evaluation guidelines, institutional updates, and historical feedback before generating results.
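One way to picture how a rubric is applied through a system prompt is the sketch below. The rubric entries, the 1 to 5 scale, the guideline text, and both function names are illustrative assumptions, not details taken from the deployed platform. The sketch only assembles the prompt text and does not call any model.

```python
from typing import Dict, List

# Hypothetical rubric entries (assumption); the real rubrics are not shown here.
RUBRIC: Dict[str, str] = {
    "communication": "Clear speech, appropriate tone, confirms understanding.",
    "procedural_adherence": "Completes required steps in the documented order.",
}

def build_system_prompt(rubric: Dict[str, str], guidelines: List[str]) -> str:
    """Combine fixed grading instructions, the rubric, and retrieved guidelines."""
    lines = [
        "You are an assessment assistant.",
        # The 1-5 scale is an assumption made for this sketch.
        "Score each competency from 1 to 5 and justify every score with evidence.",
        "",
        "Rubric:",
    ]
    lines += [f"- {name}: {description}" for name, description in rubric.items()]
    lines += ["", "Current guidelines:"]
    lines += [f"- {g}" for g in guidelines]
    return "\n".join(lines)

def build_user_message(observations: Dict[str, str]) -> str:
    """Label each modality's output so the model can cite its evidence."""
    return "\n".join(f"[{modality}] {text}" for modality, text in observations.items())

system_prompt = build_system_prompt(
    RUBRIC,
    ["Feedback must include at least one actionable recommendation."],
)
user_message = build_user_message({
    "speech": "Transcript: 'Let me confirm what you need before we start.'",
    "sentiment": "Tone calm and steady throughout.",
    "video": "Maintained eye contact; skipped step 3 of the procedure.",
})
```

Keeping the rubric in the system prompt means every evaluation is graded against the same wording, while the retrieved guidelines can change between runs without touching the grading instructions.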
Each evaluation included:
- Quantitative rubric-based scoring across multiple competency areas.
- Detailed narrative feedback highlighting participant strengths.
- Actionable recommendations for improvement.
- Context-aware explanations supporting every assigned score.
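The four elements listed above suggest a simple report structure. The sketch below is one possible shape, with hypothetical class and field names and an assumed 1 to 5 scale. It also shows a check that flags any score lacking an explanation, which is one way a report could be routed to a human evaluator before release.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class CompetencyScore:
    competency: str
    score: int          # assumed 1-5 scale
    explanation: str    # evidence supporting the assigned score

@dataclass
class EvaluationReport:
    participant_id: str
    scores: List[CompetencyScore]
    strengths: List[str] = field(default_factory=list)
    recommendations: List[str] = field(default_factory=list)

    def missing_explanations(self) -> List[str]:
        """Names of competencies whose score has no supporting explanation."""
        return [s.competency for s in self.scores if not s.explanation.strip()]

    def average_score(self) -> float:
        """Unweighted mean across competencies (weighting is not specified in the source)."""
        return sum(s.score for s in self.scores) / len(self.scores)

report = EvaluationReport(
    participant_id="P-001",  # hypothetical identifier
    scores=[
        CompetencyScore("communication", 4, "Confirmed the request before starting."),
        CompetencyScore("collaboration", 3, ""),
    ],
    strengths=["Calm, steady tone throughout the session."],
    recommendations=["Invite quieter team members to contribute."],
)

needs_review = report.missing_explanations()
```

Here `needs_review` contains `"collaboration"`, so this report would be held back until an explanation is supplied.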
Phase 3: Scalable Assessment Framework
The solution was designed to support multiple assessment formats, including:
- Individual evaluations.
- Team-based assessments.
- Multi-participant sessions.
- Multi-room distributed evaluations.
Additional capabilities included:
- Individual performance tracking during collaborative sessions.
- Timeline synchronization across multiple video and audio sources.
- Automatic workload scaling through Amazon EKS during peak assessment periods.
- Continuous AI improvement through evaluator feedback incorporated into the RAG knowledge base.
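Timeline synchronization across several recordings can be sketched as follows, assuming each source's start time relative to a shared session clock is already known. The `Event` type, the source names, and the offsets are hypothetical. Estimating the offsets themselves (for example from audio alignment) is outside this sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass(frozen=True)
class Event:
    source: str            # which camera or microphone produced the event
    local_seconds: float   # timestamp measured from that source's own start
    label: str

def to_shared_timeline(
    events: List[Event], start_offsets: Dict[str, float]
) -> List[Tuple[float, str, str]]:
    """Shift each event by its source's start offset, then sort by shared time."""
    aligned = [
        (e.local_seconds + start_offsets[e.source], e.source, e.label)
        for e in events
    ]
    return sorted(aligned)

# Assumed offsets: room B started recording 12.5 seconds after room A.
offsets = {"room_a_camera": 0.0, "room_b_mic": 12.5}
events = [
    Event("room_a_camera", 20.0, "participant 1 speaks"),
    Event("room_b_mic", 5.0, "participant 2 responds"),
]
timeline = to_shared_timeline(events, offsets)
```

On the shared clock the room B event falls at 17.5 seconds and therefore precedes the room A event at 20.0 seconds, even though its local timestamp is smaller for a different reason (the later recording start).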
Outcome
The Multimodal Generative AI platform delivered substantial operational improvements for Supply Medium:
| Metric | Before | After | Improvement |
|---|---|---|---|
| Evaluation Time | 45–60 minutes | 10–15 minutes | 67% faster |
| Annual Evaluation Hours | High manual effort | Significantly reduced | 7,000+ hours saved |
| Evaluation Consistency | Manual variation | Highly standardized | 37% higher consistency |
| Feedback Delivery | Several days | Under one hour | 95% faster |
| Cost per Evaluation | Manual processing | AI-assisted workflow | 87% cost reduction |
| Scoring Variability | Higher inconsistency | Minimal variation | Near-perfect alignment |
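As a quick check on the evaluation time row, the reduction can be computed from the reported ranges. The helper name below is an assumption; the minute values come from the table. The result depends on which ends of the two ranges are paired.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is smaller than `before`."""
    return (before - after) / before * 100

# Most conservative pairing: shortest manual time against longest AI-assisted time.
low = percent_reduction(45, 15)
# Most favorable pairing: longest manual time against shortest AI-assisted time.
high = percent_reduction(60, 10)

print(f"{low:.0f}% to {high:.0f}% less time per evaluation")
```

The reported 67% figure corresponds to the conservative pairing of 45 minutes before and 15 minutes after; pairing 60 minutes with 10 minutes gives roughly 83%.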
Lasting Impact
The AI-powered evaluation platform has become a standardized assessment framework across Supply Medium, enabling faster, fairer, and more consistent evaluations while preserving expert oversight.
By combining Multimodal AI, Large Language Models, Retrieval-Augmented Generation (RAG), and scalable AWS cloud infrastructure, Supply Medium established a future-ready evaluation system that improves operational efficiency, enhances participant feedback, strengthens decision-making, and supports continued organizational growth through intelligent automation.