Video Synthesis Method Based on Cascade Refinement Network

Journal of Shenyang University of Chemical Technology, 2024, Vol. 38, Issue (2): 161-166. doi: 10.3969/j.issn.2095-2198.2024.02.011

1. College of Information Engineering, Shenyang University of Chemical Technology, Shenyang, Liaoning 110142, China; 2. Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, Liaoning 110016, China

Online: 2024-04-30  Published: 2025-01-02
Corresponding author: WANG Guogang
About the author: HAO Jionghui (born 1995), male, from Zhangjiakou, Hebei Province; master's student whose research focuses on image synthesis.
Funding:
National Key Research and Development Program of China (2018YFB1700200)

Abstract:

Video-to-video generation often produces low-quality frames, and the attributes of generated objects fail to persist in subsequent frames, which degrades the visual quality of the simulated video. To address this problem, a high-resolution video-to-video generation method is proposed that builds on an image-to-image synthesis algorithm. Residual blocks are added to the cascaded refinement network to improve the network structure and thereby the quality of the generated video frames. To resolve the inconsistency of generated object attributes in subsequent frames, optical flow is computed from two images predicted by the improved cascaded refinement network, a further frame is then predicted from that optical flow, and the two predicted images are fused to obtain the simulated video sequence. In experimental comparisons with other video and image generation methods on the Cityscapes dataset, the proposed algorithm produces more realistic videos, and the generated video sequences are rated higher.

Key words: deep learning, video-to-video synthesis, image style transfer, optical flow estimation
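The abstract names flow-based prediction and fusion of two predicted images but does not say how either step is computed. The following is only a minimal sketch of one common way to do it, under stated assumptions: the flow is used for backward warping with nearest-neighbour sampling, and fusion is a per-pixel weighted blend controlled by a mask. The function names `warp_with_flow` and `fuse`, the mask, and the toy values are all hypothetical and are not taken from the paper.

```python
def warp_with_flow(prev_frame, flow):
    """Backward-warp a grayscale frame (list of rows) with a dense flow field.

    flow[y][x] is a (dx, dy) pixel displacement; the output pixel (y, x) is read
    from prev_frame at (y + dy, x + dx), rounded to the nearest pixel and
    clamped to the image border. (Assumed convention, not the paper's.)
    """
    h, w = len(prev_frame), len(prev_frame[0])
    warped = []
    for y in range(h):
        row = []
        for x in range(w):
            dx, dy = flow[y][x]
            src_x = min(max(round(x + dx), 0), w - 1)
            src_y = min(max(round(y + dy), 0), h - 1)
            row.append(prev_frame[src_y][src_x])
        warped.append(row)
    return warped


def fuse(warped, synthesized, mask):
    """Blend per pixel: mask 1.0 keeps the warped value, 0.0 the synthesized one."""
    return [
        [m * a + (1.0 - m) * b for a, b, m in zip(row_w, row_s, row_m)]
        for row_w, row_s, row_m in zip(warped, synthesized, mask)
    ]


# Toy 2x3 example with constant flow (dx=1, dy=0): each output pixel is read
# from one column to the right, clamped at the right border.
prev_frame = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]
flow = [[(1.0, 0.0)] * 3 for _ in range(2)]
synthesized = [[10.0] * 3 for _ in range(2)]  # stand-in for a network-predicted frame
mask = [[1.0, 1.0, 0.0], [0.0, 0.5, 1.0]]     # hypothetical fusion weights

warped = warp_with_flow(prev_frame, flow)
fused = fuse(warped, synthesized, mask)
print(warped)
print(fused)
```

In this sketch the mask decides, pixel by pixel, whether the flow-warped frame or the directly synthesized frame dominates. A blend of this kind is one way to carry object appearance over from earlier frames while still letting the generator fill in regions the flow cannot explain.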

HAO Jionghui1, WANG Guogang1, WANG Ying1, ZHAO Huaici2.

Video Synthesis Method Based on Cascade Refinement Network [J]. Journal of Shenyang University of Chemical Technology, 2024, 38(2): 161-166.
