Multimodal Generative AI Assistants for Real Time Pedagogical Feedback in Large Scale Computer Science Classrooms
Rayxona Tillayeva, Researcher, Namangan State Pedagogical Institute, Namangan, Uzbekistan. tillayevarayhona@gmail.com. ORCID: 0000-0002-0018-5530
Kamola Raimova, Researcher, Tashkent International University (TIU), Tashkent, Uzbekistan. kamolaraimova7@gmail.com. ORCID: 0009-0002-9807-4018
Gafur Namazov, Department of Information Technology and Exact Sciences, Termez University of Economics and Service, Termez, Uzbekistan. gafur_namazov@tues.uz. ORCID: 0009-0009-9738-1463
Farkhad Usmanov, Associate Professor, Andijan State University, Andijan, Uzbekistan. ufarhod001@gmail.com. ORCID: 0000-0003-4652-9084
Nozima Nasriddinova, Senior Teacher, Termez State University, Termez, Uzbekistan. nasriddinovanozima3@gmail.com. ORCID: 0000-0001-5932-2264
Gulchekhra Khazratova, Senior Teacher, Department of Medical Informatics and Digital Technologies, Tashkent State Medical University, Tashkent, Uzbekistan. gulchekhrakhazratova@gmail.com. ORCID: 0009-0007-3943-9589
Feruza Baymanova, Researcher, Jizzakh State Pedagogical University, Jizzakh, Uzbekistan. baymanovaferuza6@gmail.com. ORCID: 0009-0003-0744-5758
This study aims to design and evaluate a Multimodal Generative AI Assistant capable of delivering real-time pedagogical feedback in large-scale computer science classrooms. The objective is to investigate whether multimodal AI, combining natural language processing, code analysis, speech interaction, and learning analytics, can improve student learning outcomes, debugging efficiency, and engagement compared to traditional instructional support systems. The AI-assisted learning framework was developed and evaluated using the Design Science Research (DSR) methodology. The system combines multimodal information from programming environments, natural language queries, and student interaction logs to produce contextual feedback with generative AI models. A semester-long deployment with 1,200 students, 600 each in the control and AI-assisted groups, followed an A/B experimental design. Quantitative measures included normalized learning gain, Time-to-Resolution (TTR), Hint Efficacy Score (HES), system latency, and multimodal synchronization accuracy. Qualitative evaluation was carried out using the Technology Acceptance Model (TAM) and Subject Matter Expert (SME) audits. The normalized learning gain of the AI-assisted group was 0.73 versus 0.40 in the control group, a statistically significant difference (p < 0.001) with a Cohen's d effect size of 0.91. Debugging efficiency improved substantially: Time-to-Resolution dropped from an average of 134.5 minutes to 11.2 minutes (a 91.6% decrease). The system achieved a multimodal synchronization accuracy of 97.2%, a response latency of 1.65 seconds, and a hallucination rate of 0.85%. Acceptance was high, with Perceived Usefulness rated 6.6/7 and Perceived Ease of Use rated 6.3/7.
The results indicate that multimodal generative AI assistants can substantially improve learning outcomes, reduce debugging time, and provide real-time learning support at scale in large computer science classrooms.
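The headline metrics above follow standard definitions. As an illustrative sketch only (the function names are assumptions, and the pre/post scores and standard deviations used in the test values are hypothetical, not data from the study), normalized learning gain, Cohen's d, and the TTR percentage decrease are typically computed as follows:

```python
import math

def normalized_gain(pre_mean: float, post_mean: float, max_score: float = 100.0) -> float:
    """Hake's normalized gain: fraction of the available improvement achieved."""
    return (post_mean - pre_mean) / (max_score - pre_mean)

def cohens_d(mean_a: float, mean_b: float,
             sd_a: float, sd_b: float,
             n_a: int, n_b: int) -> float:
    """Cohen's d effect size using a pooled standard deviation."""
    pooled_sd = math.sqrt(
        ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    )
    return (mean_a - mean_b) / pooled_sd

def percent_decrease(before: float, after: float) -> float:
    """Relative reduction expressed as a percentage."""
    return 100.0 * (before - after) / before

# TTR reduction reported in the study: 134.5 min -> 11.2 min (~91.7%,
# which matches the reported ~91.6% decrease up to rounding).
print(round(percent_decrease(134.5, 11.2), 1))
```

A gain of 0.73, for instance, means the AI-assisted group closed 73% of the gap between its pre-test average and the maximum attainable score.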