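Cite this article: KU Tao, XIONG Yan-bin, YANG Nan, et al. Image semantic understanding method based on global interaction[J]. Control and Decision, 2020, 35(9): 2103-2111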
Image semantic understanding method based on global interaction
KU Tao1,2, XIONG Yan-bin1,2,3, YANG Nan1,2,3, LIN Yue-xin1,2, ZHU Zhu4
(1. Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China; 2. Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110169, China; 3. University of Chinese Academy of Sciences, Beijing 100049, China; 4. School of Information, Liaoning University, Shenyang 110000, China)
Abstract:
To address the problem that image information is easily blurred during image semantic generation, an image semantic generation model is proposed that combines a bidirectional gated recurrent unit (GRU) with global interaction of image information. The image and text data are regularized and the text is mapped to vectors, which achieves model-driven image semantic generation. The experimental results show that the proposed model handles data sparseness and skewness well. Using GRU units further reduces the number of model parameters, speeds up convergence of the algorithm, effectively suppresses model overfitting, and improves the richness, accuracy and logical coherence of the described image content.
Key words:  convolutional neural network  recurrent neural network  image semantic understanding  global interaction mechanism  data regularization  gated recurrent unit
DOI: 10.13195/j.kzyjc.2018.1699
CLC number: TP273
Funding: National Key R&D Program of China (2017YFB0306401); National Natural Science Foundation of China (61803367).
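
The abstract states that GRU units reduce the number of model parameters, but it does not name the baseline. As a rough illustration only, the sketch below assumes the baseline is an LSTM layer of the same size and counts parameters for the textbook formulations of both cells (one weight matrix and one bias vector per gate-like block). The layer sizes and function names are hypothetical and are not taken from the paper.

```python
def recurrent_params(input_dim: int, hidden_dim: int, gate_blocks: int) -> int:
    """Parameter count of one recurrent layer in one direction.

    Each gate-like block has a weight matrix over the concatenated
    [input, hidden] vector plus one bias vector (textbook convention;
    some libraries keep two bias vectors per block).
    """
    per_block = hidden_dim * (input_dim + hidden_dim) + hidden_dim
    return gate_blocks * per_block


def gru_params(input_dim: int, hidden_dim: int, bidirectional: bool = False) -> int:
    # GRU: update gate, reset gate, candidate state -> 3 blocks.
    n = recurrent_params(input_dim, hidden_dim, gate_blocks=3)
    return 2 * n if bidirectional else n


def lstm_params(input_dim: int, hidden_dim: int, bidirectional: bool = False) -> int:
    # LSTM: input, forget, output gates and candidate cell -> 4 blocks.
    n = recurrent_params(input_dim, hidden_dim, gate_blocks=4)
    return 2 * n if bidirectional else n


if __name__ == "__main__":
    # Hypothetical sizes, chosen only for illustration.
    INPUT_DIM, HIDDEN_DIM = 256, 512
    gru = gru_params(INPUT_DIM, HIDDEN_DIM, bidirectional=True)
    lstm = lstm_params(INPUT_DIM, HIDDEN_DIM, bidirectional=True)
    print(f"bidirectional GRU:  {gru} parameters")
    print(f"bidirectional LSTM: {lstm} parameters")
    print(f"GRU / LSTM ratio:   {gru / lstm:.2f}")
```

Under these assumptions a GRU layer carries three quarters of the parameters of an equally sized LSTM layer, whatever the dimensions, because it has three gate-like blocks where the LSTM has four.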
