Department of Automation, Tsinghua University
The National Science Fund for Distinguished Young Scholars,
Flow-shop scheduling is among the most widely applied scheduling problems, so research on intelligent algorithms for it has both academic significance and practical value. With the objective of minimizing the maximum completion time (makespan), a framework based on deep reinforcement learning and an iterative greedy algorithm is proposed for solving permutation flow-shop scheduling. First, a new encoding network is designed to model the problem, which overcomes the difficulty that traditional models have in scaling across problem sizes, and reinforcement learning is used to train the model so that it produces high-quality outputs. Second, an iterative greedy algorithm with a feedback mechanism is proposed: it takes the output of the trained network as the initial solution, applies multiple local search operators collaboratively to strengthen the search, and adjusts how often each operator is used according to performance feedback, yielding the final schedule. Simulation results and statistical comparisons show that the proposed algorithm, which fuses deep reinforcement learning with the iterative greedy method, achieves better performance.
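To make the second stage concrete, the following is a minimal sketch of an iterative greedy search whose operator usage is adjusted by performance feedback. It is an illustration under stated assumptions, not the authors' implementation: the function names, the three operators, the weight-update rule, and the parameter values are all hypothetical, and an arbitrary starting permutation stands in for the output of the trained network.

```python
import random

def makespan(perm, p):
    """Maximum completion time of a permutation; p[j][k] is the time of job j on machine k."""
    m = len(p[0])
    finish = [0] * m  # finish[k]: completion time of the last scheduled job on machine k
    for j in perm:
        finish[0] += p[j][0]
        for k in range(1, m):
            finish[k] = max(finish[k], finish[k - 1]) + p[j][k]
    return finish[-1]

def best_insertion(seq, job, p):
    """Insert job at the position of seq that gives the smallest makespan."""
    candidates = [seq[:i] + [job] + seq[i:] for i in range(len(seq) + 1)]
    return min(candidates, key=lambda s: makespan(s, p))

def destroy_construct(perm, p, rng, d=2):
    """Iterative greedy step: remove d random jobs, then greedily reinsert them."""
    removed = rng.sample(perm, d)
    partial = [j for j in perm if j not in removed]
    for job in removed:
        partial = best_insertion(partial, job, p)
    return partial

def swap_move(perm, p, rng):
    """Exchange the jobs at two random positions."""
    a, b = rng.sample(range(len(perm)), 2)
    new = perm[:]
    new[a], new[b] = new[b], new[a]
    return new

def insert_move(perm, p, rng):
    """Remove one random job and reinsert it at its best position."""
    new = perm[:]
    job = new.pop(rng.randrange(len(new)))
    return best_insertion(new, job, p)

def iterated_greedy_with_feedback(p, init, iters=200, seed=0):
    """Assumed feedback rule: operators that improve the incumbent are chosen more often."""
    rng = random.Random(seed)
    ops = [destroy_construct, swap_move, insert_move]
    weights = [1.0] * len(ops)
    best, best_cost = list(init), makespan(init, p)
    for _ in range(iters):
        i = rng.choices(range(len(ops)), weights=weights)[0]
        cand = ops[i](best, p, rng)
        cost = makespan(cand, p)
        if cost < best_cost:
            best, best_cost = cand, cost
            weights[i] += 1.0                         # reward a successful operator
        else:
            weights[i] = max(0.1, weights[i] * 0.95)  # decay an unsuccessful one
    return best, best_cost

# Hypothetical instance: 3 jobs on 2 machines.
p = [[3, 2], [1, 4], [2, 1]]
schedule, cost = iterated_greedy_with_feedback(p, init=[0, 1, 2])
```

In this sketch only strictly improving candidates are accepted, so the returned makespan is never worse than that of the initial solution; a full method could use a different acceptance criterion and a different set of operators.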