Abstract:
To ensure the on-time delivery of machine tools in mixed-flow assembly shops, a scheduling optimization method based on improved deep multi-agent reinforcement learning is proposed, addressing the low solution quality and slow training speed encountered when minimizing production delays. A scheduling optimization model for mixed-flow assembly lines is constructed with the objective of minimizing delay time, and double deep Q-network (DDQN) agents with decentralized execution are applied to learn the relationship between production information and scheduling objectives. The framework adopts a centralized-training, decentralized-execution strategy and uses parameter sharing to alleviate the non-stationarity problem in multi-agent reinforcement learning. On this basis, a recurrent neural network is used to handle variable-length state and action representations, enabling the agents to solve problems of arbitrary scale, and a global/local reward function is introduced to mitigate reward sparsity during training. The optimal parameter combinations are identified through ablation experiments. Numerical experiments show that, on standard benchmarks, the proposed algorithm reduces the average total workpiece delay by 24.1% to 32.3% relative to the unimproved algorithm and improves the training speed toward reaching the scheduling objective by 8.3%.
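For orientation only, the following is a minimal sketch of the double-Q target that the DDQN agents mentioned above rely on: the online network selects the greedy next action and the target network evaluates it. This is an illustrative example, not the authors' implementation; the function name, Q-values, and reward are hypothetical.

```python
import numpy as np

def ddqn_target(q_online_next, q_target_next, reward, gamma, done):
    """Double DQN target: action selection by the online net,
    action evaluation by the target net (reduces overestimation)."""
    a_star = np.argmax(q_online_next)                 # greedy action from online net
    return reward + gamma * q_target_next[a_star] * (1.0 - done)

# Toy usage with hypothetical Q-values for three candidate dispatching actions
y = ddqn_target(np.array([0.2, 0.9, 0.1]),            # online net Q(s', .)
                np.array([0.3, 0.7, 0.4]),            # target net Q(s', .)
                reward=-1.0, gamma=0.99, done=0.0)
print(y)
```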