A Scene Text Detection Method Based on Dual-Path Feature Fusion
Author:
Affiliation:

School of Computer Science and Technology, Anhui University

Author biography:

Corresponding author:

CLC number:

TP391.4

Fund project:

National Natural Science Foundation of China (General Program, Key Program, Major Program)


A Novel Scene Text Detection Method Based on Dual-Path Feature Fusion
Author:
Affiliation:

(School of Computer Science and Technology, Anhui University, Hefei, China)

Fund Project:

Abstract:

Existing deep-learning-based scene text detection methods generally adopt a large deep neural network as the backbone for feature extraction. Although the detection results are impressive, the resulting models are very large and detection efficiency is low. Simply swapping the backbone for a lightweight network fails to extract sufficient feature information and sharply degrades detection accuracy. To reduce the size of the text detection model and detect text more efficiently, this paper proposes a scene text detection method based on dual-path feature fusion. On top of the relatively lightweight backbone EfficientNet-b3, two branches are used for feature fusion before scene text detection: one branch uses a feature pyramid network to fuse features from different levels, while the other uses an atrous spatial pyramid pooling structure to enlarge the receptive field. The features of the two branches are then fused, obtaining richer features at only a small increase in computation and compensating for the limited features extracted by the small network. Experimental results on three public datasets show that the proposed method greatly reduces the number of model parameters and substantially improves detection speed while maintaining a high level of detection accuracy.

    Abstract:

Existing scene text detection methods based on deep learning generally use a large deep neural network as the backbone for feature extraction. Although this achieves striking detection results, the resulting model is very large, which leads to poor detection efficiency. If the large backbone is replaced by a small one directly, the network often fails to extract enough semantic features and cannot achieve ideal detection results. To reduce the size of the scene text detection model and improve detection efficiency, a Dual-Path Feature Fusion based scene text detection method (DPFF) is proposed in this paper. Built on the relatively lightweight backbone EfficientNet-b3, DPFF uses two branches for feature fusion to detect scene text. One branch uses a feature pyramid network to fuse features from different levels; the other uses an atrous spatial pyramid pooling (ASPP) structure to enlarge the receptive field and obtain features at different scales. The features from the two branches are then fused to produce richer features with only a small increase in computation, which makes up for the shortage of features caused by the small backbone. Experimental results on three benchmark datasets show that the proposed method significantly reduces the number of model parameters and greatly improves detection efficiency while maintaining high detection accuracy.
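The full DPFF architecture is not specified in this abstract, but the dual-branch idea it describes (an FPN branch fusing multi-level features, an ASPP branch enlarging the receptive field, then fusing the two) can be sketched in PyTorch. This is a minimal illustration, not the authors' implementation: `TinyBackbone` is a toy stand-in for EfficientNet-b3, and all channel counts, dilation rates, and module names here are hypothetical choices made for a self-contained example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyBackbone(nn.Module):
    """Toy stand-in for EfficientNet-b3 (hypothetical): emits feature maps
    at strides 4, 8, 16 and 32 with the channel counts in `chs`."""
    def __init__(self, chs=(16, 32, 64, 128)):
        super().__init__()
        prev, blocks = 3, []
        for i, c in enumerate(chs):
            stride, layers = (4 if i == 0 else 2), []
            while stride > 1:  # one or two stride-2 convs per stage
                layers += [nn.Conv2d(prev, c, 3, stride=2, padding=1), nn.ReLU()]
                prev, stride = c, stride // 2
            blocks.append(nn.Sequential(*layers))
        self.blocks = nn.ModuleList(blocks)

    def forward(self, x):
        feats = []
        for b in self.blocks:
            x = b(x)
            feats.append(x)
        return feats  # [stride-4, stride-8, stride-16, stride-32]


class FPNBranch(nn.Module):
    """Branch 1: top-down feature pyramid fusing features from all levels."""
    def __init__(self, in_chs, out_ch=64):
        super().__init__()
        self.lateral = nn.ModuleList(nn.Conv2d(c, out_ch, 1) for c in in_chs)
        self.smooth = nn.Conv2d(out_ch, out_ch, 3, padding=1)

    def forward(self, feats):
        laterals = [l(f) for l, f in zip(self.lateral, feats)]
        x = laterals[-1]
        for lat in reversed(laterals[:-1]):  # upsample and add, top-down
            x = lat + F.interpolate(x, size=lat.shape[-2:], mode="nearest")
        return self.smooth(x)  # finest (stride-4) fused map


class ASPPBranch(nn.Module):
    """Branch 2: atrous spatial pyramid pooling on the deepest feature map
    to enlarge the receptive field (rates here are illustrative)."""
    def __init__(self, in_ch, out_ch=64, rates=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r) for r in rates)
        self.project = nn.Conv2d(out_ch * len(rates), out_ch, 1)

    def forward(self, x):
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))


class DualPathFusion(nn.Module):
    """Fuse both branches and predict a per-pixel text score map."""
    def __init__(self, chs=(16, 32, 64, 128), out_ch=64):
        super().__init__()
        self.backbone = TinyBackbone(chs)
        self.fpn = FPNBranch(chs, out_ch)
        self.aspp = ASPPBranch(chs[-1], out_ch)
        self.fuse = nn.Conv2d(out_ch * 2, out_ch, 1)
        self.head = nn.Conv2d(out_ch, 1, 1)

    def forward(self, x):
        feats = self.backbone(x)
        p = self.fpn(feats)                      # stride 4
        a = F.interpolate(self.aspp(feats[-1]),  # stride 32 -> stride 4
                          size=p.shape[-2:], mode="bilinear",
                          align_corners=False)
        return torch.sigmoid(self.head(self.fuse(torch.cat([p, a], dim=1))))


model = DualPathFusion()
score_map = model(torch.randn(2, 3, 224, 224))  # shape (2, 1, 56, 56)
```

Concatenating the two branch outputs before a 1x1 fusion conv is one common way to combine such branches; element-wise addition is an equally plausible alternative, and the abstract does not say which the authors use.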

History
  • Received: 2020-01-01
  • Revised: 2021-02-21
  • Accepted: 2020-04-03
  • Published online:
  • Publication date: