Abstract:
Existing taxi scheduling models typically focus on optimizing real-time cost while ignoring the potential impact of currently planned routes on future operating value, which is detrimental to continuous scheduling in an autonomous driving environment. To this end, this paper proposes a route planning model focused on long-term benefits, in which the estimated future operating value is incorporated into the real-time scheduling problem via reinforcement learning. Specifically, a neural network is first trained to fit the state-value function over the vehicles' temporal and spatial states, after which a double neural network and experience replay are used to accelerate the convergence of the algorithm. Simulation experiments on the road network of Shenzhen demonstrate that the model can accurately schedule the fleet in advance, serving more passengers and achieving greater operating profit. Additionally, the cost of fleet energy consumption can be reduced, since the model exploits the peak and off-peak structure of time-of-use electricity pricing and vehicle-to-grid (V2G) technology for charging and discharging. Compared with other scheduling models, the proposed model increases the passenger response rate by 4% and total profit by 25% in long-term operation, while reducing energy consumption by 50% and passenger waiting time by 20%.
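The abstract's core mechanism, fitting a state-value function over spatio-temporal vehicle states with a double network and experience replay, can be illustrated by a minimal sketch. This is not the paper's actual implementation: the zone/time-slot discretization, reward values, and transition dynamics below are invented for illustration, and the neural networks are reduced to two value tables (an online table and a periodically synced target table, capturing the "double network" idea).

```python
import random
from collections import deque

# Hypothetical discretization of space and time (not from the paper).
N_ZONES, N_SLOTS = 4, 6
GAMMA, LR, SYNC_EVERY = 0.95, 0.1, 50

states = [(z, s) for z in range(N_ZONES) for s in range(N_SLOTS)]
V_online = {st: 0.0 for st in states}   # updated every step
V_target = dict(V_online)               # frozen copy used in TD targets
replay = deque(maxlen=1000)             # experience replay buffer

rng = random.Random(0)
for step in range(2000):
    # Simulated transition: zone 0 acts as a demand hotspot with
    # higher operating reward (an assumed toy environment).
    z, s = rng.randrange(N_ZONES), rng.randrange(N_SLOTS)
    reward = 1.0 if z == 0 else 0.1
    nxt = (rng.randrange(N_ZONES), (s + 1) % N_SLOTS)
    replay.append(((z, s), reward, nxt))

    # Replay a small batch of past transitions: TD(0) update of the
    # online table toward targets computed from the frozen target table.
    for st, r, nx in rng.sample(list(replay), min(8, len(replay))):
        V_online[st] += LR * (r + GAMMA * V_target[nx] - V_online[st])

    if step % SYNC_EVERY == 0:          # periodic target sync
        V_target = dict(V_online)

# After training, hotspot states carry higher estimated future value,
# which is what lets a scheduler position vehicles in advance.
print(V_online[(0, 0)] > V_online[(1, 0)])
```

Decoupling the TD target from the table being updated, together with replaying past transitions, is what stabilizes and speeds up convergence relative to plain online updates, which is the role the abstract assigns to the double network and experience replay.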