Limitations of Machine Learning in Early-Stage Lung Cancer Detection and a Comparative Analysis of Survival Models for Prognostic Prediction

Authors

  • Inam Ur Rahman, Department of Statistics, PMAS-Arid Agriculture University, Rawalpindi, Pakistan
  • Nasir Ali, Department of Statistics, PMAS-Arid Agriculture University, Rawalpindi, Pakistan. ORCID: https://orcid.org/0000-0002-0477-6549

DOI:

https://doi.org/10.54536/ajarai.v1i2.6687

Keywords:

Cox Proportional Hazards Model, Lung Cancer Prediction, Machine Learning, Prognostic Factors, Survival Analysis, Weibull Distribution

Abstract

Lung cancer's high mortality rate is primarily due to late-stage diagnosis. This study investigates two related problems: early detection using machine learning (ML) and prognostic prediction using survival analysis. We evaluated six ML classifiers (Logistic Regression, Random Forest, SVM, KNN, ANN, Decision Tree) and a Voting Ensemble on a clinical dataset (N=228) for their ability to detect lung cancer. All models exhibited critically low sensitivity (0–25%), missing the majority of true positive cases. This makes them inadequate for early detection in their current form, a shortfall we attribute to severe class imbalance (26.67% prevalence) and likely insufficient discriminative power in the available features. Subsequently, we performed a comprehensive survival analysis. The Kaplan-Meier estimator gave a median survival of 310 days (95% CI: 285–363). A multivariable Cox proportional hazards model identified female sex (adjusted Hazard Ratio [aHR] = 0.55, p < 0.001) and poorer ECOG performance status (aHR = 1.90, p < 0.001) as significant independent prognostic factors. Finally, a comparison of parametric survival models indicated that the Weibull distribution provided the best fit to the data (Akaike Information Criterion [AIC] = 2296.55). Our findings highlight a significant gap between the potential and the current reality of ML in early lung cancer diagnosis, while validating established clinical prognostic factors and identifying the best-fitting parametric model for survival prediction.
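
To illustrate why sensitivity, rather than accuracy, is the relevant metric under class imbalance, the following is a minimal sketch in plain Python. The labels are hypothetical and chosen only to match the reported 26.67% prevalence (16 positives out of 60); they are not the study's data, and the function names are illustrative assumptions.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fn, tn, fp) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp, fn, tn, fp


def sensitivity(y_true, y_pred):
    """Fraction of true positive cases that the classifier detects."""
    tp, fn, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0


def accuracy(y_true, y_pred):
    """Fraction of all cases that the classifier labels correctly."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fn + tn + fp)


# Hypothetical labels (assumption): 16 positives among 60 cases, 26.67% prevalence.
y_true = [1] * 16 + [0] * 44

# A classifier that always predicts the majority (negative) class.
y_majority = [0] * 60

# A weak classifier that detects only 4 of the 16 positives, with no false alarms.
y_weak = [1] * 4 + [0] * 12 + [0] * 44

for name, y_pred in [("majority", y_majority), ("weak", y_weak)]:
    print(name,
          "accuracy=%.3f" % accuracy(y_true, y_pred),
          "sensitivity=%.3f" % sensitivity(y_true, y_pred))
```

In this sketch the majority-class predictor is correct on 44 of 60 cases while detecting none of the positives, which is the kind of failure that a sensitivity in the 0–25% range signals.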

Published

2026-07-01

How to Cite

Rahman, I. U., & Ali, N. (2026). Limitations of Machine Learning in Early-Stage Lung Cancer Detection and a Comparative Analysis of Survival Models for Prognostic Prediction. American Journal of Applied Research and AI, 1(2), 1-11. https://doi.org/10.54536/ajarai.v1i2.6687