Abstract:
Scheduling in the wafer photolithography area is dynamic, must run in real time, and is subject to multiple constraints and multiple objectives. To address this problem, a real-time scheduling method based on gated recurrent unit (GRU) reinforcement learning is proposed. The method uses a GRU to learn temporal information from historical scheduling decisions and states in the photolithography area, and supplies this as auxiliary decision-making information to a double deep reinforcement learning (DDRL) model. The input state space and output action set of the DDRL model are designed, and a multi-objective reward function is established that aims to minimize the maximum completion time of wafers (makespan) and maximize the on-time delivery rate, guiding the agent toward better scheduling decisions. In addition, constraint relaxation rules and scheduling methods that account for both equipment-specific constraints and mask constraints are proposed to make the resulting scheduling strategies more practical. The method is evaluated on real-world cases from a wafer manufacturing enterprise and compared with a conventional double deep reinforcement learning model and with heuristic rule methods for the photolithography area; the results indicate that it outperforms both, supporting its effectiveness for this problem.
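As an illustration of the multi-objective reward described above, the following minimal sketch combines a makespan term and an on-time delivery term in a single per-step reward. The weighted-sum form, the weights, the `time_scale` normalizer, and all names (`StepOutcome`, `multi_objective_reward`) are assumptions made for illustration; the abstract does not specify the actual reward formulation.

```python
from dataclasses import dataclass


@dataclass
class StepOutcome:
    """Hypothetical summary of one scheduling step (names are illustrative)."""
    makespan_before: float  # estimated maximum completion time before the action
    makespan_after: float   # estimated maximum completion time after the action
    lots_completed: int     # lots finished during this step
    lots_on_time: int       # of those, lots finished by their due date


def multi_objective_reward(outcome: StepOutcome,
                           w_makespan: float = 0.5,
                           w_on_time: float = 0.5,
                           time_scale: float = 1.0) -> float:
    """Weighted-sum reward (an assumed form, not the paper's definition).

    Penalizes growth of the maximum completion time and rewards the
    fraction of lots completed on time in this step.
    """
    # Normalized increase in makespan caused by the action.
    makespan_penalty = (outcome.makespan_after - outcome.makespan_before) / time_scale
    # On-time rate for this step; defined as 0 when no lot was completed.
    if outcome.lots_completed > 0:
        on_time_rate = outcome.lots_on_time / outcome.lots_completed
    else:
        on_time_rate = 0.0
    return -w_makespan * makespan_penalty + w_on_time * on_time_rate


# Example: makespan grows by 10 time units, 3 of 4 completed lots are on time.
example = StepOutcome(makespan_before=100.0, makespan_after=110.0,
                      lots_completed=4, lots_on_time=3)
example_reward = multi_objective_reward(example, time_scale=10.0)
```

A weighted sum is only one common way to scalarize two objectives; the relative weights would need tuning against the fab's priorities.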