Optimizing Machine Learning Models for Predicting and Mitigating Hotel Booking Cancellations
DOI:
https://doi.org/10.55606/jupti.v4i2.4055
Keywords:
Feature Importance, Hotel Booking Cancellations, Machine Learning, Predictive Models, XGBoost
Abstract
Hotel booking cancellations pose substantial challenges to the hospitality industry, significantly impacting revenue management and operational planning. This study explores the application of machine learning models to predict cancellations, emphasizing model selection, feature importance, and resampling techniques. Among the six classification models evaluated, the combination of XGBoost and SMOTE demonstrated the highest predictive accuracy and consistency. Feature importance analysis and SHAP interpretation identified key predictors, including deposit type (non-refundable), required parking spaces, previous cancellations, and market segment (OTA). Additionally, threshold tuning was examined to balance the trade-off between false positives and false negatives based on business priorities. The results underscore the critical role of resampling methods in addressing class imbalance and the necessity of optimizing classification thresholds for practical deployment. Future research will focus on advanced hyperparameter tuning, alternative resampling strategies, feature selection methods, and ensemble learning approaches to enhance model robustness and interpretability. These findings provide a data-driven foundation for improving cancellation prediction and guiding strategic decision-making in hotel management.
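The threshold tuning mentioned in the abstract can be illustrated with a minimal sketch. It is not the study's implementation: the labels, predicted probabilities, and cost values below are toy assumptions, and the helper names `confusion_counts` and `best_threshold` are hypothetical. The idea is to score each candidate threshold by the total cost of its false positives (bookings wrongly flagged as cancellations) and false negatives (missed cancellations), then keep the cheapest one.

```python
from typing import List, Optional, Sequence, Tuple


def confusion_counts(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn), predicting 'cancelled' (1) when prob >= threshold."""
    tp = fp = fn = tn = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def best_threshold(
    y_true: Sequence[int],
    y_prob: Sequence[float],
    cost_fp: float,
    cost_fn: float,
    candidates: Optional[List[float]] = None,
) -> Tuple[float, float]:
    """Return (threshold, total_cost) minimizing cost_fp * FP + cost_fn * FN.

    Ties are broken in favor of the lowest candidate threshold.
    """
    if candidates is None:
        # Assumed search grid: 0.01, 0.02, ..., 0.99.
        candidates = [i / 100 for i in range(1, 100)]
    best_t, best_cost = candidates[0], float("inf")
    for t in candidates:
        _, fp, fn, _ = confusion_counts(y_true, y_prob, t)
        cost = cost_fp * fp + cost_fn * fn
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t, best_cost


# Toy data (assumption): 1 = cancelled, 0 = not cancelled.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_prob = [0.10, 0.35, 0.40, 0.55, 0.60, 0.80, 0.20, 0.90]

# Hypothetical business priorities, expressed as relative costs.
t_miss_costly, _ = best_threshold(y_true, y_prob, cost_fp=1, cost_fn=5)
t_alarm_costly, _ = best_threshold(y_true, y_prob, cost_fp=5, cost_fn=1)
print("missed cancellations costly:", t_miss_costly)
print("false alarms costly:", t_alarm_costly)
```

When missed cancellations are assumed to cost more than false alarms, the selected threshold moves lower, so more bookings are flagged; reversing the costs moves it higher. In practice the threshold would be chosen on held-out validation data rather than on the data used to fit the model.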
License
Copyright (c) 2025 Jurnal Publikasi Teknik Informatika

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.