APSIPA Transactions on Signal and Information Processing > Vol 12 > Issue 1

Optical Flow Regularization of Implicit Neural Representations for Video Frame Interpolation

Weihao Zhuang, Kobe University, Japan, zhuangweihao@stu.kobe-u.ac.jp, Tristan Hascoet, Kobe University, Japan, Xunquan Chen, Kobe University, Japan, Ryoichi Takashima, Kobe University, Japan, Tetsuya Takiguchi, Kobe University, Japan
 
Suggested Citation
Weihao Zhuang, Tristan Hascoet, Xunquan Chen, Ryoichi Takashima and Tetsuya Takiguchi (2023), "Optical Flow Regularization of Implicit Neural Representations for Video Frame Interpolation", APSIPA Transactions on Signal and Information Processing: Vol. 12: No. 1, e39. http://dx.doi.org/10.1561/116.00000218

Publication Date: 14 Sep 2023
© 2023 W. Zhuang, T. Hascoet, X. Chen, R. Takashima and T. Takiguchi
 

Open Access

This is published under the terms of CC BY-NC.

Abstract

Recent work has shown that Implicit Neural Representations (INRs) carry meaningful representations of signal derivatives. We leverage this property to perform Video Frame Interpolation (VFI) by explicitly constraining the derivatives of the INR to satisfy the optical flow constraint equation. Using only a target video and its optical flow, and without learning the interpolation operator from additional training data, we achieve state-of-the-art VFI on the Adobe-240FPS, X4K and UCF101 datasets. We also find that constraining the INR derivatives not only enhances the interpolation of intermediate frames but also improves the ability of narrow networks to fit the observed frames, which could lead to more advanced video compression techniques and optimized INRs. Our work highlights the potential of INRs in video processing tasks and provides insights into their use for representing signal derivatives.
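
To make the idea of constraining INR derivatives concrete, the following is a minimal JAX sketch, not the authors' implementation. It treats the INR as a function f(x, y, t) returning a grayscale intensity, obtains its partial derivatives by automatic differentiation, and penalizes the residual of the classical optical flow constraint equation, I_x u + I_y v + I_t = 0. The network architecture (a small sine-activated MLP), the L1 penalty, the weight `lam`, and all function names are assumptions made for illustration and do not come from the article.

```python
import jax
import jax.numpy as jnp

def init_params(key, sizes=(3, 32, 32, 1)):
    """Random weights for a small MLP (hypothetical architecture)."""
    params = []
    for d_in, d_out in zip(sizes[:-1], sizes[1:]):
        key, sub = jax.random.split(key)
        w = jax.random.normal(sub, (d_in, d_out)) / jnp.sqrt(d_in)
        params.append((w, jnp.zeros(d_out)))
    return params

def inr(params, coord):
    """Map one (x, y, t) coordinate to a scalar intensity."""
    h = coord
    for w, b in params[:-1]:
        h = jnp.sin(h @ w + b)  # sine activation, an assumed choice
    w, b = params[-1]
    return (h @ w + b)[0]

def flow_residual(f, coord, flow):
    """Residual I_x*u + I_y*v + I_t of the optical flow constraint.

    f maps an (x, y, t) coordinate to a scalar; flow is (u, v).
    """
    g = jax.grad(f)(coord)  # (I_x, I_y, I_t) via autodiff
    return g[0] * flow[0] + g[1] * flow[1] + g[2]

def loss(params, coords, pixels, flows, lam=0.1):
    """Reconstruction error plus an assumed L1 penalty on the flow residual."""
    f = lambda c: inr(params, c)
    pred = jax.vmap(f)(coords)
    res = jax.vmap(lambda c, w: flow_residual(f, c, w))(coords, flows)
    return jnp.mean((pred - pixels) ** 2) + lam * jnp.mean(jnp.abs(res))
```

In this sketch, `coords` has shape (N, 3), `pixels` has shape (N,), and `flows` has shape (N, 2), where the flow would come from an external optical flow estimator. Calling `jax.grad(loss)` gives gradients with respect to the network weights, so the residual term acts as a regularizer during fitting. For a pattern that translates rigidly with velocity (u, v), the residual is zero, which is the property the regularizer encourages the fitted network to share.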

DOI:10.1561/116.00000218