A Two-Stage GAIL-PPO Optimization Framework for High-Speed Vehicle Lane-changing Decision-Making

GUO Hongjiang; Zhu Zhengze; Gong Jiayuan; Yin Zheng; Lyu Chengzhi

doi:10.3969/j.issn.1007-7375.250150

GUO Hongjiang, Zhu Zhengze, Gong Jiayuan, Yin Zheng, Lyu Chengzhi. A Two-Stage GAIL-PPO Optimization Framework for High-Speed Vehicle Lane-changing Decision-Making[J]. Industrial Engineering Journal. DOI: 10.3969/j.issn.1007-7375.250150

Abstract

This paper addresses two difficulties in highway lane-changing decision-making for intelligent vehicles: imitation learning depends heavily on data quality and generalizes poorly, while reinforcement learning trains inefficiently and struggles to balance multiple objectives. It proposes a two-stage collaborative optimization framework based on generative adversarial imitation learning (GAIL) and proximal policy optimization (PPO). In the imitation stage, an adversarial mechanism based on the Wasserstein distance with a gradient penalty is introduced into the GAIL discriminator to improve training stability, and PPO is integrated into the GAIL generator update, using an Actor-Critic architecture to make policy learning more robust. In the fine-tuning stage, PPO is applied to the pre-trained policy with a multi-objective reward function that balances traffic efficiency against safety constraints. Together, the two stages move the policy from expert imitation to policy optimization under complex scenario constraints. Experiments in the highway-env simulation environment show that, compared with the PPO baseline and DQN, the proposed approach improves average travel speed by approximately 4% and 8%, respectively, while reducing unnecessary lane changes. Longitudinal acceleration time-series analysis and robustness tests further support the method's stability and generalization under varying driving durations, traffic flow densities, lane counts, and vehicle dynamics constraints.
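PPO appears in both stages of the framework, so a minimal sketch of its standard clipped surrogate objective may help readers unfamiliar with it. This is the textbook PPO-Clip form, not the authors' implementation; the abstract does not give the paper's exact loss. The function name `ppo_clipped_objective` and the default `clip_eps=0.2` are assumptions made for illustration.

```python
def ppo_clipped_objective(ratios, advantages, clip_eps=0.2):
    """Mean PPO-Clip surrogate objective over a batch (to be maximized).

    ratios     -- probability ratios pi_new(a|s) / pi_old(a|s), one per sample
    advantages -- advantage estimates for the same samples
    clip_eps   -- clipping range; 0.2 is an assumed, commonly used default
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    if not ratios:
        raise ValueError("batch must not be empty")

    total = 0.0
    for r, a in zip(ratios, advantages):
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped_r = max(1.0 - clip_eps, min(1.0 + clip_eps, r))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(r * a, clipped_r * a)
    return total / len(ratios)


if __name__ == "__main__":
    # Hypothetical batch: one sample with a large ratio, one unchanged.
    print(ppo_clipped_objective([1.5, 1.0], [1.0, 2.0]))
```

Taking the minimum of the two terms removes the incentive to push the probability ratio far outside the clipping range in a single update, which is the property usually credited for PPO's stable policy updates.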
