Charging and Discharging Coordinated Routing for Autonomous Electric Taxis Considering Future Operating Value

doi:10.3969/j.issn.1007-7375.230095

Abstract

Abstract: Existing taxi scheduling models typically optimize only real-time cost and ignore the impact of current route planning on future operating value, which is detrimental to continuous scheduling in an autonomous driving environment. To address this, this paper proposes a route planning model that focuses on long-term benefit and uses reinforcement learning to incorporate the estimated future operating value into the real-time scheduling problem. Specifically, a neural network is first used to fit the state-value function over the different spatio-temporal states of vehicles, and a double neural network and an experience replay pool are then used to accelerate the convergence of the algorithm. Simulation experiments on the Shenzhen road network show that the proposed model can accurately schedule the fleet in advance, serve more passengers, and achieve greater operating profit. The model can also exploit the peak and off-peak characteristics of time-of-use electricity pricing together with vehicle-to-grid (V2G) technology for charging and discharging, thereby reducing the fleet's energy cost. Compared with other scheduling models, the proposed model increases the passenger matching service rate by 4% and the total profit by 25% in long-term operation, while reducing the energy cost by 50% and the passenger waiting time by 20%.

Key words: future operating value, reinforcement learning, time-of-use electricity pricing, vehicle-to-grid technology, state-value function
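The solution approach summarized in the abstract (fit a state-value function over spatio-temporal vehicle states, then stabilize training with a second network and an experience pool) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the "double neural network" is read here as an online network plus a periodically synced target network, a linear one-hot approximator stands in for the neural network, and the toy environment, the constants `GAMMA`, `LR`, `SYNC_EVERY`, and the function names are all hypothetical.

```python
import random
from collections import deque

GAMMA = 0.9       # discount factor (assumed value)
LR = 0.05         # learning rate (assumed value)
SYNC_EVERY = 20   # steps between target-network syncs (assumed value)
HORIZON = 4       # number of time slots in one toy episode
ZONES = 3         # number of zones in the toy road network


def features(state):
    """One-hot encoding of a (time_slot, zone) spatio-temporal state."""
    t, z = state
    vec = [0.0] * (HORIZON * ZONES)
    vec[t * ZONES + z] = 1.0
    return vec


def value(weights, state):
    """Linear stand-in for the value network: V(s) = w . phi(s)."""
    return sum(w * x for w, x in zip(weights, features(state)))


def step(state, rng):
    """Toy environment: move to a random zone; arriving in zone 0 earns a fare of 1."""
    t, _ = state
    next_zone = rng.randrange(ZONES)
    reward = 1.0 if next_zone == 0 else 0.0
    done = (t + 1 == HORIZON)  # the episode ends after the last time slot
    return reward, (t + 1, next_zone), done


def train(steps=3000, batch_size=16, seed=0):
    """TD(0) value learning with an experience pool and a target copy of the weights."""
    rng = random.Random(seed)
    online = [0.0] * (HORIZON * ZONES)
    target = list(online)
    replay = deque(maxlen=2000)  # experience pool of past transitions
    state = (0, rng.randrange(ZONES))
    for i in range(1, steps + 1):
        reward, next_state, done = step(state, rng)
        replay.append((state, reward, next_state, done))
        state = (0, rng.randrange(ZONES)) if done else next_state
        if len(replay) >= batch_size:
            for s, r, s2, d in rng.sample(list(replay), batch_size):
                # Bootstrap from the frozen target weights, not the online ones.
                td_target = r if d else r + GAMMA * value(target, s2)
                td_error = td_target - value(online, s)
                for k, x in enumerate(features(s)):
                    online[k] += LR * td_error * x
        if i % SYNC_EVERY == 0:
            target = list(online)  # periodic sync of the target copy
    return online, target
```

Calling `train()` returns the online and target weight vectors, and `value(online, (t, z))` then gives the learned estimate for time slot `t` and zone `z`. In this toy setting, earlier time slots should carry larger values than later ones because more fare opportunities remain. A scheduler could add such an estimate to the immediate reward of each candidate assignment, which is the sense in which future operating value enters a real-time decision.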

ZENG Weiliang, HAN Yu, FU Hui. Charging and Discharging Coordinated Routing for Autonomous Electric Taxis Considering Future Operating Value[J]. Industrial Engineering Journal, 2024, 27(4): 132-140, 149.
