Key Frame Extraction Based on Real-Time Person Availability Using YOLO
Bharathi S, Assistant Professor, Department of Electronics and Communication Engineering, Dr. Mahalingam College of Engineering and Technology, bharathi_mani@yahoo.com, ORCID: 0000-0001-9638-3779
Senthilarasi M, Assistant Professor, Department of Electronics and Communication Engineering, Thiagarajar College of Engineering, arasiece@gmail.com, ORCID: 0000-0001-9053-8485
Hari K, Junior Research Fellow, Department of Electronics and Communication Engineering, Dr. Mahalingam College of Engineering and Technology, harikrishnasamy@gmail.com, ORCID: 0009-0003-7750-7707
Keyframe extraction plays a crucial role in summarizing lengthy videos, particularly surveillance footage recorded with a fixed field of view over extended periods. Manually reviewing such videos is time-consuming, and it is difficult to extract the essential information effectively. To address this issue, this study evaluates four keyframe extraction methods to determine the most suitable approach for creating a people-based dataset: absolute difference, entropy, optical flow, and object detection-based video summarization using the YOLO (You Only Look Once) algorithm. Each method takes a different approach to identifying keyframes that capture critical instances in the footage. Of the four methods, object detection-based video summarization proved the most promising. It uses the YOLO algorithm to identify and track people within the video frames, and it extracted the fewest frames while still capturing all relevant instances featuring people. These results suggest that object detection-based video summarization using YOLO is a highly effective method for keyframe extraction in surveillance videos. By significantly reducing the number of frames while preserving all relevant instances, it offers a time-efficient way to review and analyze extensive video footage and facilitates the creation of a people-based dataset for further research and applications in various domains.
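To make the contrast between two of the evaluated methods concrete, the following minimal sketch compares absolute-difference selection with person-based selection. It is an illustration under stated assumptions, not the implementation used in the study: frames are modeled as small grayscale grids of integers, `count_people` is a hypothetical callable standing in for a YOLO person detector, the rule "keep a frame when the person count changes to a non-zero value" is an assumed selection criterion, and the threshold value is arbitrary.

```python
from typing import Callable, List, Sequence

# Assumption: a frame is a grayscale grid of pixel values in 0..255.
Frame = Sequence[Sequence[int]]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute pixel difference between two equally sized frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count if count else 0.0


def keyframes_by_abs_diff(frames: Sequence[Frame], threshold: float) -> List[int]:
    """Keep frame 0, then every frame that differs from the last kept
    frame by more than `threshold` (an assumed, tunable value)."""
    keys: List[int] = [0] if frames else []
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[keys[-1]], frames[i]) > threshold:
            keys.append(i)
    return keys


def keyframes_by_person(
    frames: Sequence[Frame],
    count_people: Callable[[Frame], int],
) -> List[int]:
    """Keep a frame only when the number of detected people changes to a
    non-zero value. `count_people` is a hypothetical stand-in for a YOLO
    person detector; this selection rule is an assumption."""
    keys: List[int] = []
    previous = 0
    for i, frame in enumerate(frames):
        current = count_people(frame)
        if current > 0 and current != previous:
            keys.append(i)
        previous = current
    return keys


def toy_person_counter(frame: Frame) -> int:
    """Toy detector for the demo only: each white pixel counts as a person."""
    return sum(1 for row in frame for p in row if p == 255)


if __name__ == "__main__":
    blank = [[0, 0], [0, 0]]
    one = [[255, 0], [0, 0]]
    two = [[255, 255], [0, 0]]
    video = [blank, blank, one, one, two, blank]
    print(keyframes_by_abs_diff(video, threshold=10.0))
    print(keyframes_by_person(video, toy_person_counter))
```

On this toy sequence the absolute-difference rule also keeps the initial empty frame and the frame where the scene becomes empty again, while the person-based rule keeps only frames in which people appear. This matches the qualitative behavior described above, though real footage and a real detector would behave less cleanly.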