Storage Location Optimization for Automated Power Warehouses Based on Multi-agent Reinforcement Learning

Abstract: Optimizing storage locations is one of the important ways to improve the efficiency of an automated warehouse. This paper addresses the storage location optimization problem for power warehouses with a method based on multi-agent reinforcement learning. First, the deficiencies of DDPG, MADDPG, and related algorithms are analyzed. On this basis, an improved algorithm, ECS-MADDPG, and its model are proposed; the algorithm takes into account both the immediate reward at the current time step and future rewards. Finally, the storage location optimization model is trained with the reinforcement learning algorithm on historical inbound and outbound records of electric power materials. Experiments show that ECS-MADDPG achieves higher stability and higher returns than algorithms such as MADDPG and DDPG.