Energy-Saving Control for Subway Station Air Conditioning Systems Based on Reinforcement Learning
Affiliation:

1. Beijing University of Civil Engineering and Architecture; 2. Beijing Xingchuang Land Real Estate Development Co., Ltd.

CLC number:

TP273

Fund Project:

The High Level Innovation Team Construction Project of Beijing Municipal Universities (No. IDHT20190506), the Key Science and Technology Plan Project of Beijing Municipal Education Commission of China (No. KZ201810016019), and the Fundamental Research Funds for Beijing University of Civil Engineering and Architecture (No. X20068)

    Abstract:

    Subway station air conditioning systems consume a large amount of energy. Traditional control methods cannot balance comfort against energy saving, so control performance is poor. Moreover, current subway station air conditioning control systems control the air-side and water-side systems separately, which cannot guarantee energy savings for the system as a whole. This paper therefore proposes an energy-saving control strategy for the air conditioning system based on reinforcement learning. First, a neural network model of the air conditioning system is built to serve as the simulation environment for offline training of the agent, which addresses the long convergence time of online training with model-free reinforcement learning. Then, to improve algorithm efficiency and to handle the multidimensional continuous action space of the subway station air conditioning system, a deep deterministic policy gradient algorithm based on multi-step prediction is proposed, and an agent framework is designed for interactive training with the environment model. In addition, a training termination condition is set for the agent to determine the optimal number of training iterations, which further improves algorithm efficiency. Finally, simulation experiments are conducted on measured operating data from a subway station in Wuhan. The results show that the proposed control strategy achieves good temperature tracking performance, maintains platform comfort, and saves about 17.908% energy compared with the current actual system.
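The abstract does not say how multi-step prediction enters the deep deterministic policy gradient update. One common reading, offered here only as an assumption and not as the authors' method, is an n-step return target for the critic: sum the discounted rewards over several steps, then bootstrap from a critic estimate. The function name `n_step_target` and its parameters are hypothetical.

```python
from typing import Sequence


def n_step_target(rewards: Sequence[float], bootstrap_q: float, gamma: float = 0.99) -> float:
    """Discounted sum of the given rewards plus a discounted bootstrap value.

    Assumption: `rewards` are the rewards collected over n consecutive steps
    and `bootstrap_q` is a critic estimate for the state reached afterwards.
    """
    target = bootstrap_q
    # Work backwards so each earlier step applies one more discount factor.
    for r in reversed(rewards):
        target = r + gamma * target
    return target
```

With a single reward this reduces to the usual one-step target `r + gamma * Q` used in deep deterministic policy gradient; longer reward sequences rely more on observed (or model-predicted) rewards and less on the critic's own estimate.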

History
  • Received: 2021-05-04
  • Last revised: 2021-08-26
  • Accepted: 2021-08-27
  • Published online: 2021-09-17