Multi-AGV Route Planning for Unmanned Warehouses Based on Improved DQN Algorithm

Industrial Engineering Journal, 2024, Vol. 27, Issue (1): 36-44,53. doi: 10.3969/j.issn.1007-7375.220247

XIE Yong¹, ZHENG Suijun¹, CHENG Niansheng², ZHU Hongjun¹

1. School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China;
2. Aerospace Information Co., Ltd, Beijing 100195, China

Received: 2022-12-08 Published: 2024-03-05
Corresponding author: ZHENG Suijun (born 1998), female, from Hunan Province, master's student; main research interests: scheduling optimization and path planning for multi-agent systems. E-mail: zheng893724451@163.com
About the author: XIE Yong (born 1974), male, from Hubei Province, associate professor; main research interests: smart logistics, optimal scheduling, and intelligent manufacturing.
Funding: National Natural Science Foundation of China, General Program (71771096); National Natural Science Foundation of China, Innovative Research Groups Program (71821001)

Abstract

This paper addresses route planning and conflicts among multiple AGVs in unmanned warehouses. A multi-AGV route planning model is established with the objective of minimizing total travel time, and an improved DQN algorithm based on dynamic decision-making is proposed. The algorithm includes an empirical knowledge model, built on static route planning for a single AGV, that guides the direction of AGV learning and exploration, helps AGVs avoid conflicts and obstacles in advance, and accelerates convergence. A conflict resolution strategy based on the shortest total travel time is also proposed to resolve multi-AGV route conflicts and deadlocks at their root. Finally, a grid map of an unmanned warehouse is built for simulation experiments. The results show that, compared with other DQN algorithms, the proposed model and algorithm improve convergence speed by 13.3% and reduce the average loss by 26.3%. This indicates that the model and algorithm help avoid and resolve multi-AGV route conflicts in unmanned warehouses and reduce the total travel time of multiple AGVs, offering useful guidance for improving the efficiency of unmanned warehouse operations.

Key words: multiple AGVs, route planning, DQN algorithm, empirical knowledge, conflict resolution
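
The abstract does not say how the empirical knowledge model is built or how it enters the DQN update, so the following is only a minimal sketch of one possible reading: a static single-AGV plan (here a breadth-first distance map on a grid) biases the exploration branch of an epsilon-greedy policy toward moves that approach the goal. The grid encoding, the five-action set, and the names `bfs_distances` and `guided_action` are all assumptions made for illustration and are not taken from the paper.

```python
import random
from collections import deque

# Hypothetical action set (row, col offsets): up, down, left, right, wait.
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1), (0, 0)]


def bfs_distances(grid, goal):
    """Static single-AGV planning: step count to `goal` from each reachable cell.

    `grid` is a list of lists in which 1 marks an obstacle and 0 a free cell
    (an assumed encoding). Unreachable cells are absent from the result.
    """
    rows, cols = len(grid), len(grid[0])
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ACTIONS[:4]:  # the four moving actions only
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return dist


def guided_action(state, q_values, dist, epsilon, rng=random):
    """Epsilon-greedy selection with knowledge-guided exploration.

    With probability 1 - epsilon the greedy action (highest Q-value) is taken.
    Otherwise the agent explores, but only among moves that reduce the static
    distance to the goal; if no such move exists, any action may be chosen.
    """
    if rng.random() >= epsilon:
        return max(range(len(ACTIONS)), key=lambda a: q_values[a])
    r, c = state
    here = dist.get(state, float("inf"))
    closer = [
        a for a, (dr, dc) in enumerate(ACTIONS)
        if dist.get((r + dr, c + dc), float("inf")) < here
    ]
    return rng.choice(closer or list(range(len(ACTIONS))))
```

In a full training loop, `q_values` would come from the Q-network for the current state, and `epsilon` would typically be decayed over episodes. Conflict resolution between AGVs, which the abstract treats as a separate strategy, is not modeled in this sketch.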

CLC Number: F406.2, TP24

Cite this article:

XIE Yong, ZHENG Suijun, CHENG Niansheng, ZHU Hongjun. Multi-AGV Route Planning for Unmanned Warehouses Based on Improved DQN Algorithm[J]. Industrial Engineering Journal, 2024, 27(1): 36-44,53.

