Semantic Image Synthesis Based on Improved Cascaded Refinement Network

doi:10.3969/j.issn.2095-2198.2024.02.010

Journal of Shenyang University of Chemical Technology, 2024, Vol. 38, Issue (2): 155-160. doi: 10.3969/j.issn.2095-2198.2024.02.010


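1. School of Information Engineering, Shenyang University of Chemical Technology, Shenyang 110142, Liaoning, China; 2. Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, Liaoning, China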

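Publication date: 2024-04-30  Release date: 2025-01-02
Corresponding author: WANG Guogang
About the author: WANG Jiaqi (born 1995), male, from Shijiazhuang, Hebei; master's student; main research interest: image generation.
Funding:
National Key R&D Program of China (2018YFB1700200)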

Semantic Image Synthesis Based on Improved Cascaded Refinement Network

1. Shenyang University of Chemical Technology, Shenyang 110142, China; 2. Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China

Online:2024-04-30 Published:2025-01-02

Abstract

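Abstract (Chinese version):

To address the incomplete synthesized images, loss of semantic information, and large color deviations produced by cascaded refinement networks (CRNs), a semantic image synthesis method based on an improved cascaded refinement network is proposed. In the cascaded refinement network, spatially-adaptive (de)normalization (SPADE) replaces layer normalization; the activations in the normalization layers are modulated through spatially adaptive learning, so that semantic information is preserved more completely. A smooth L1 loss function is introduced to reduce the color difference between the output image and the reference image. The learnable spatially-adaptive normalization also increases the capacity of the network parameters, allowing the network to learn more semantic information and improving the quality of the synthesized images. Experimental results on the Cityscapes and GTA5 datasets show that, compared with CRNs, the method improves mean intersection over union and pixel accuracy by 31.4% and 7.4%, respectively, and lowers the Fréchet inception distance by 16.3%.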

Keywords: image synthesis, cascaded refinement networks, spatially-adaptive normalization, smooth L1 loss
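The spatially-adaptive normalization named in the abstract can be illustrated with a minimal NumPy sketch. It follows the general SPADE formulation (normalize the activations without learned parameters, then scale and shift each pixel with values derived from the semantic label map). The per-class lookup tables `gamma_table` and `beta_table` are hypothetical stand-ins for the convolutional branch that normally produces the modulation maps, and the choice of per-channel statistics is an assumption; the abstract does not specify how the authors wire SPADE into the CRN.

```python
import numpy as np

def spade(h, seg, gamma_table, beta_table, eps=1e-5):
    """Sketch of spatially-adaptive normalization (not the authors' code).

    h:           activations, shape (N, C, H, W)
    seg:         integer label map, shape (N, H, W), values in [0, K)
    gamma_table: per-class scales, shape (K, C)  (hypothetical simplification)
    beta_table:  per-class shifts, shape (K, C)  (hypothetical simplification)
    """
    # Parameter-free normalization: per-channel statistics over batch and space.
    mu = h.mean(axis=(0, 2, 3), keepdims=True)
    sigma = h.std(axis=(0, 2, 3), keepdims=True)
    h_norm = (h - mu) / (sigma + eps)

    # Look up a scale and a shift for every pixel from its class label.
    # gamma_table[seg] has shape (N, H, W, C); move channels to axis 1.
    gamma = np.transpose(gamma_table[seg], (0, 3, 1, 2))
    beta = np.transpose(beta_table[seg], (0, 3, 1, 2))

    # Modulation varies with spatial position because seg does.
    return gamma * h_norm + beta

# Toy usage with random data: 2 images, 3 channels, 4x4 pixels, 5 classes.
rng = np.random.default_rng(0)
h = rng.normal(size=(2, 3, 4, 4))
seg = rng.integers(0, 5, size=(2, 4, 4))
gamma_table = rng.normal(size=(5, 3))
beta_table = rng.normal(size=(5, 3))
out = spade(h, seg, gamma_table, beta_table)
print(out.shape)  # (2, 3, 4, 4)
```

Unlike layer normalization, whose scale and shift are the same at every pixel, the scale and shift here depend on the class label at each location, which is how the label map keeps influencing the features.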

Abstract:

In order to solve the problems of cascaded refinement networks (CRNs), such as incomplete synthesized images, loss of semantic information, and large color differences in synthesized images, a semantic image synthesis method based on an improved cascaded refinement network is proposed. In the cascaded refinement network, spatially-adaptive (de)normalization (SPADE) is used instead of layer normalization, and the activations in the normalization layers are adjusted by spatially adaptive learning to make the semantic information more complete. A smooth L1 loss function is introduced to reduce the color difference between the output image and the reference image. In addition, learnable spatially-adaptive normalization is introduced to increase the capacity of the network parameters, so that more semantic information can be learned and the quality of the synthesized image improves. Experiments on the Cityscapes and GTA5 datasets show that the mean intersection over union and pixel accuracy are 31.4% and 17.4% higher than those of CRNs, respectively, and the Fréchet inception distance is 16.3% lower than that of CRNs.
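For reference, the smooth L1 loss mentioned in the abstract is commonly defined piecewise: quadratic for small errors and linear for large ones. The sketch below uses that standard definition with a threshold `beta` of 1.0 and mean reduction. Both choices are assumptions, since the abstract does not give the authors' exact settings or say which image features the loss is applied to.

```python
import numpy as np

def smooth_l1(pred, target, beta=1.0):
    """Standard smooth L1 loss, averaged over all elements.

    |d| <  beta : 0.5 * d**2 / beta   (quadratic near zero, smooth gradient)
    |d| >= beta : |d| - 0.5 * beta    (linear, less sensitive to outliers)
    """
    diff = np.abs(pred - target)
    loss = np.where(diff < beta, 0.5 * diff**2 / beta, diff - 0.5 * beta)
    return loss.mean()

# Worked example: per-element errors 0, 0.5 and 3 give losses 0, 0.125 and 2.5.
pred = np.array([0.0, 0.5, 3.0])
target = np.zeros(3)
print(smooth_l1(pred, target))  # 0.875
```

The linear branch keeps a few badly mismatched pixels from dominating the gradient, while the quadratic branch avoids the kink of a plain L1 loss at zero.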

Key words: image synthesis, cascaded refinement networks, spatially-adaptive (de)normalization, smooth L1 loss
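The two segmentation-based metrics reported in the abstract, mean intersection over union and pixel accuracy, can both be computed from a confusion matrix between a predicted label map and a ground-truth label map. The sketch below is a generic illustration with hypothetical function names; the abstract does not state how the predicted label maps are obtained from the synthesized images.

```python
import numpy as np

def confusion_matrix(pred, gt, num_classes):
    """Rows index ground-truth classes, columns index predicted classes."""
    idx = gt.ravel() * num_classes + pred.ravel()
    counts = np.bincount(idx, minlength=num_classes**2)
    return counts.reshape(num_classes, num_classes)

def pixel_accuracy(cm):
    """Fraction of pixels whose predicted label matches the ground truth."""
    return np.trace(cm) / cm.sum()

def mean_iou(cm):
    """Average of TP / (TP + FP + FN) over classes that actually occur."""
    tp = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    valid = union > 0
    return (tp[valid] / union[valid]).mean()

# Toy 2x2 label maps with two classes; one of four pixels is wrong.
gt = np.array([[0, 0], [1, 1]])
pred = np.array([[0, 1], [1, 1]])
cm = confusion_matrix(pred, gt, num_classes=2)
print(pixel_accuracy(cm))  # 0.75
print(mean_iou(cm))        # (1/2 + 2/3) / 2, about 0.583
```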

Cite this article:

WANG Jiaqi1,2, WANG Guogang1, WANG Ying1, ZHAO Huaici2. Semantic Image Synthesis Based on Improved Cascaded Refinement Network [J]. Journal of Shenyang University of Chemical Technology, 2024, 38(2): 155-160.

