Deep Reinforcement Learning for the Dynamic Vehicle Routing Problem with Dynamic Requests

      Abstract: For the dynamic vehicle routing problem in which customer requests arrive randomly during delivery, a dynamic-requests vehicle routing optimization model is established with the objective of minimizing total travel distance, and the problem is formulated as a Markov decision process for solution. An attention-guided iterative encoding deep reinforcement learning method (AGIE-DRL) is proposed. The encoder is improved by introducing a gating layer and multi-head attention to strengthen the representation and aggregation of dynamic state features. A situation-aware decoder for delivery scenarios is then constructed to dynamically generate feasible solutions based on visited nodes, temporal information, and remaining vehicle capacity. In addition, a training strategy combining proximal policy optimization (PPO) and rollout is adopted to improve the convergence speed and training stability of the algorithm. Simulation results show that, under degrees of dynamism ranging from 15% to 75%, the proposed method achieves shorter average travel distances and higher computational efficiency than the Attention model, hybrid particle swarm optimization (PSO), and adaptive large neighborhood search (ALNS), while maintaining small performance deviations in cross-dynamism tests, thereby demonstrating good solution quality, generalization ability, and adaptability to dynamic environments.
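The decoder described above restricts the policy to feasible next actions using visited-node status and remaining vehicle capacity. A minimal sketch of such feasibility masking, under assumed names (`Node`, `feasible_mask`) that are illustrative rather than taken from the paper:

```python
# Sketch of feasibility masking for a capacity-constrained routing
# decoder: a customer node is a valid next action only if it has not
# yet been visited and its demand fits the vehicle's remaining capacity.
# All identifiers here are illustrative assumptions, not the paper's API.
from dataclasses import dataclass

@dataclass
class Node:
    demand: float   # quantity requested by the customer
    visited: bool   # whether the current route already serves this node

def feasible_mask(nodes, remaining_capacity):
    """Return a boolean mask over nodes: True = selectable next action."""
    return [
        (not n.visited) and n.demand <= remaining_capacity
        for n in nodes
    ]

# Example: capacity 10; one node already visited, one demand too large.
nodes = [Node(4.0, False), Node(3.0, True), Node(12.0, False)]
print(feasible_mask(nodes, 10.0))  # [True, False, False]
```

In an attention-based decoder this mask would typically be applied by setting the attention logits of infeasible nodes to negative infinity before the softmax, so the policy never assigns them probability mass.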
