Prediction of Ship Operation Time in Dry Bulk Terminals Based on Adaptive Stacking Ensemble Learning

QI Liang; ZHANG Wei; ZHAO Weili; XUE Song

doi:10.3969/j.issn.1007-7375.240277

Prediction of Ship Operation Time in Dry Bulk Terminals Based on Adaptive Stacking Ensemble Learning

Abstract

Ship operation time is a key input to berth planning. Most existing research on ship operation time focuses on container terminals; dry bulk terminals have received little in-depth study because of the particular nature of their cargo and the complexity of their operating processes. Considering the similarities and differences among berths at a dry bulk terminal, as well as the limitations of any single machine learning model, this paper first proposes a K-means-based berth clustering method that groups berths with similar operating characteristics. It then proposes an adaptive Stacking ensemble learning model combined with the simulated annealing algorithm (SA-Stacking) to predict ship operation time for each berth category. The experiments use real historical operation data from the dry bulk terminal of Qingdao Port. The results show that SA-Stacking adaptively selects the most suitable combination of base models according to the data distribution of each berth category, and that it achieves better prediction performance and generalization ability than single machine learning models. In addition, predicting by berth category captures the operating characteristics of different berths more effectively: compared with prediction without berth clustering, the mean absolute error decreases by about 2 h, the root mean square error and the mean absolute percentage error both decrease significantly, and the coefficient of determination increases, indicating stronger explanatory power and higher prediction accuracy.
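The base-model selection step described in the abstract can be illustrated with a minimal simulated annealing sketch. This is not the authors' implementation: the function `anneal_base_models`, the candidate names `model_a` to `model_e`, the cooling schedule, and the stand-in error function `toy_cv_error` (with invented numbers) are all assumptions made for illustration. In practice the error function would return something like the cross-validated MAE of a Stacking ensemble built from the chosen subset on one berth cluster's data.

```python
import math
import random


def anneal_base_models(candidates, cv_error, n_iter=200, t0=1.0, cooling=0.95, seed=0):
    """Simulated annealing over subsets of base models, minimising cv_error(subset)."""
    rng = random.Random(seed)
    current = frozenset(candidates)      # start from the full candidate set
    current_err = cv_error(current)
    best, best_err = current, current_err
    temp = t0
    for _ in range(n_iter):
        # Neighbour move: add or drop one randomly chosen base model.
        neighbour = current ^ {rng.choice(candidates)}
        if neighbour:                    # an ensemble needs at least one base model
            err = cv_error(neighbour)
            delta = err - current_err
            # Always accept improvements; accept worse moves with Boltzmann probability.
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                current, current_err = neighbour, err
                if current_err < best_err:
                    best, best_err = current, current_err
        temp *= cooling                  # geometric cooling (an assumed schedule)
    return best, best_err


# Hypothetical candidate pool; the names are placeholders, not the paper's models.
CANDIDATES = ["model_a", "model_b", "model_c", "model_d", "model_e"]


def toy_cv_error(subset):
    """Invented stand-in for a cross-validated error; the numbers carry no meaning."""
    gains = {"model_a": 3.0, "model_b": 2.5, "model_c": 0.2, "model_d": 0.1, "model_e": 1.0}
    # Each model lowers the error by its gain but adds a fixed complexity penalty.
    return 10.0 - sum(gains[m] for m in subset) + 0.6 * len(subset)


best, best_err = anneal_base_models(CANDIDATES, toy_cv_error)
print(sorted(best), round(best_err, 3))
```

Running the search once per berth cluster, each with its own error function, would give every cluster its own base-model combination, which is the sense in which the abstract calls the model adaptive.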
