SHORT-TERM BUS PASSENGER DEMAND FORECASTING USING MACHINE LEARNING: A CASE STUDY OF ROUTE №50 IN ASTANA
DOI:
https://doi.org/10.26577/jpcsit20253205
Keywords:
Public transport demand forecasting, Bus passenger flow, CatBoostRegressor, Machine learning, Traffic congestion, Weather impact, Feature engineering, Urban transit operations
Abstract
This study introduces a machine learning-based approach for short-term forecasting of passenger traffic in the Astana city bus system, focusing on Route №50. Given rapid urban growth and mounting transport problems, accurate forecasting of bus demand is an important step towards optimizing resource allocation and improving service quality and passenger satisfaction. The research combines data from several sources (passenger traffic records from a transport company, 15-minute traffic figures, and weather conditions) and proposes a predictive model built with CatBoostRegressor. The data were collected over one week in December 2024 and comprise 9,819 passenger traffic records with a total of 22,111 boardings. The model achieved RMSE values of 2.920 and 2.516 for the A-to-B and B-to-A directions, respectively, reflecting the structure of demand at different times and at different stops. A feature importance analysis showed that stop location, time of day, and traffic congestion are the most significant factors affecting bus demand. The results serve as a foundation for dynamic bus allocation and timetable optimization in urban settings characterized by harsh winter conditions and traffic congestion. The research addresses gaps in the literature by developing a resource-efficient forecasting solution adaptable to evolving urban environments with limited historical data sets, such as Astana.
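To make the reported metric concrete, the following is a minimal sketch of how a per-direction RMSE could be computed from observed and predicted boardings. The function name `rmse_by_direction`, the record layout, and the toy values are assumptions for illustration only; they are not the study's code or data.

```python
import math
from collections import defaultdict


def rmse_by_direction(records):
    """Return {direction: RMSE} for (direction, observed, predicted) tuples.

    The record layout is a hypothetical simplification, not the study's schema.
    """
    squared_errors = defaultdict(list)
    for direction, observed, predicted in records:
        squared_errors[direction].append((observed - predicted) ** 2)
    # RMSE = square root of the mean squared error within each direction.
    return {
        direction: math.sqrt(sum(errors) / len(errors))
        for direction, errors in squared_errors.items()
    }


# Hypothetical toy values, not data from the study.
sample = [
    ("A->B", 3, 1),
    ("A->B", 0, 2),
    ("B->A", 5, 2),
    ("B->A", 1, 1),
]
scores = rmse_by_direction(sample)
```

Because RMSE is expressed in the same unit as the target, a value near 2.9 would mean a typical error of roughly three boardings per record, under the assumption that each record corresponds to one stop-level observation.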