Video Stabilization by a Hybrid Structure of CNN: RNN for a Video Surveillance System
C.K. Siva Ranjani, Department of Computer Science and Engineering, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Saveetha University, Chennai, Tamil Nadu, India. Email: sivaranjani9017.sse@saveetha.com. ORCID: 0000-0002-6782-179X
Dr. V. Vallinayagam, Professor, Department of Mathematics, St. Joseph’s College of Engineering, Chennai, Tamil Nadu, India. Email: vngam19@gmail.com. ORCID: 0000-0001-6715-6227
Dr. P. Gururama Senthilvel, Professor, Department of Computer Science and Engineering, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Saveetha University, Chennai, Tamil Nadu, India. Email: gurupandian.cse@gmail.com. ORCID: 0000-0001-8666-1544
Dr. M. Shakila, Department of Computer Science Engineering, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Saveetha University, Chennai, Tamil Nadu, India. Email: shakilam.sse@saveetha.com. ORCID: 0009-0001-4051-5768
Dr. B. Abirami, Assistant Professor, Department of Computer Science and Engineering, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Saveetha University, Chennai, Tamil Nadu, India. Email: abivsb.sanrak@gmail.com. ORCID: 0009-0001-9187-9238
Dr. J. Nithisha, Associate Professor, Department of Computer Science and Engineering, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Chennai, Tamil Nadu, India. Email: nithisha.j@gmail.com. ORCID: 0009-0008-1055-0686
Keywords: Video Stabilization, CNN, RNN, Video Surveillance, Video Processing.
Abstract
Video stabilization improves the visual quality of surveillance and consumer videos by reducing unwanted jitter and camera shake. Earlier methods rely on pixel-space optimization, motion heuristics, or large training sets; they tend to face non-convex optimization problems and depend on accurate optical flow estimates. This study proposes a hybrid CNN-RNN model in which stabilization is achieved by optimizing in the CNN parameter space and refining the intra-frame hidden states with an RNN block. Unlike conventional learning-based approaches, the model is trained directly on the specific input video and deliberately overfits its parameters to that video, yielding a video-specific optimization. The CNN acts as a differentiable optimizer in a high-dimensional parameter space, while the ConvLSTM-based RNN exploits both inter-frame and intra-frame recurrence to improve temporal consistency without additional architectural components. Experimental results on the DeepStab and NUS datasets show that the proposed algorithm achieves higher PSNR, SSIM, and ITF, and a much shorter execution time, than recent state-of-the-art methods. The proposed method is efficient, robust to parallax and dynamic scenes, and suitable for real-time use in video surveillance systems.
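To make the reported metrics concrete, the following is a minimal sketch of how PSNR and ITF could be computed for a sequence of frames. It assumes the commonly used definition of ITF (inter-frame transformation fidelity) as the mean PSNR over consecutive frame pairs; the exact evaluation protocol of this study is not stated in the abstract, so the function names `psnr` and `itf`, the 8-bit peak value, and the synthetic frames are illustrative assumptions only.

```python
import numpy as np


def psnr(frame_a, frame_b, peak=255.0):
    """Peak signal-to-noise ratio (in dB) between two same-shaped frames."""
    a = np.asarray(frame_a, dtype=np.float64)
    b = np.asarray(frame_b, dtype=np.float64)
    mse = np.mean((a - b) ** 2)
    if mse == 0:
        # Identical frames: PSNR is unbounded.
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / mse))


def itf(frames, peak=255.0):
    """Assumed ITF definition: mean PSNR over consecutive frame pairs."""
    if len(frames) < 2:
        raise ValueError("ITF needs at least two frames")
    scores = [psnr(frames[i], frames[i + 1], peak)
              for i in range(len(frames) - 1)]
    return float(np.mean(scores))


# Synthetic example (not data from the study): flat gray frames whose
# brightness changes slowly (steady) or quickly (shaky) between frames.
steady = [np.full((4, 4), v) for v in (10, 11, 12)]
shaky = [np.full((4, 4), v) for v in (10, 20, 30)]
print(itf(steady) > itf(shaky))
```

Under this definition, a sequence whose consecutive frames differ less yields a higher ITF, which is why the metric is used as a proxy for temporal smoothness after stabilization.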