Maximizing Predictive Regression and Dimensionality Reduction Techniques: Evidence from Monte Carlo’s Simulation Study
DOI: https://doi.org/10.54536/ajase.v4i1.5938

Keywords: Elastic Net, High-Dimensional Data, Lasso, Multicollinearity, Regularization, SCAD, SPCR

Abstract
This study proposes a novel two-step sparse learning framework that combines Sparse Principal Component Regression (SPCR) with four regularization methods, Lasso, Elastic Net, Ridge, and the Smoothly Clipped Absolute Deviation (SCAD) penalty, to improve prediction accuracy and interpretability in high-dimensional settings. Monte Carlo simulation experiments were conducted under varying sample sizes, dimensionality levels, sparsity conditions, and predictor correlations to compare the hybrid estimators against traditional penalization approaches. Results show that SPCR-Lasso and SPCR-Enet consistently deliver superior accuracy and stability in high-dimensional, multicollinear settings, with SPCR-Enet performing particularly well under extreme dimensionality. SPCR-SCAD demonstrated advantages in sparse, low-correlation scenarios, while Ridge regression contributed only modest improvements. These findings underscore that estimator performance is strongly data-dependent and highlight the value of SPCR hybridization for mitigating multicollinearity while enhancing interpretability. The study offers practical guidance for applied researchers in fields such as genomics, finance, and climate science, and contributes methodologically by demonstrating the robustness of SPCR-based regularization for complex high-dimensional data structures.
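The two-step framework described above can be sketched in code: a sparse principal component stage first reduces the correlated, high-dimensional predictors to a small set of sparse component scores, and a penalized regression is then fitted on those scores. The sketch below is illustrative only, using scikit-learn's SparsePCA, Lasso, and ElasticNet as stand-ins; the paper's exact SPCR formulation, tuning-parameter choices, and the SCAD penalty (which scikit-learn does not provide) may differ, and the simulated data here is a hypothetical design, not the study's simulation settings.

```python
import numpy as np
from sklearn.decomposition import SparsePCA
from sklearn.linear_model import Lasso, ElasticNet
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(0)

# Illustrative high-dimensional, multicollinear design (p > n):
# predictors share 5 latent factors, and the true coefficient
# vector is sparse (only the first 10 entries are nonzero).
n, p = 100, 200
latent = rng.standard_normal((n, 5))
X = latent @ rng.standard_normal((5, p)) + 0.1 * rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:10] = 2.0
y = X @ beta + rng.standard_normal(n)

# Step 1: sparse principal components (interpretable loadings).
# Step 2: penalized regression on the component scores, giving the
# SPCR-Lasso and SPCR-Enet hybrids discussed in the abstract.
spcr_lasso = Pipeline([
    ("spca", SparsePCA(n_components=10, alpha=1.0, random_state=0)),
    ("lasso", Lasso(alpha=0.1)),
])
spcr_enet = Pipeline([
    ("spca", SparsePCA(n_components=10, alpha=1.0, random_state=0)),
    ("enet", ElasticNet(alpha=0.1, l1_ratio=0.5)),
])

spcr_lasso.fit(X, y)
spcr_enet.fit(X, y)
print("SPCR-Lasso R^2:", round(spcr_lasso.score(X, y), 3))
print("SPCR-Enet  R^2:", round(spcr_enet.score(X, y), 3))
```

Because the second-stage penalty acts on a handful of component scores rather than all p correlated predictors, the hybrid sidesteps the instability that multicollinearity induces in a direct Lasso or Elastic Net fit, which is the mechanism the simulation study evaluates.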
References
Ali, H., Shahzad, M., Sarfraz, S., Sewell, K. B., Alqalyoobi, S., & Mohan, B. P. (2023). Application and impact of Lasso regression in gastroenterology: a systematic review. Indian Journal of Gastroenterology, 42(6), 780-790.
Chatterjee, I., & Baumgärtner, L. (2024). Unveiling Functional Biomarkers in Schizophrenia: Insights from Region of Interest Analysis Using Machine Learning. Journal of Integrative Neuroscience, 23(9).
Chen, J., Yang, S., Wang, Z., & Mao, H. (2021). Efficient sparse representation for learning with high-dimensional data. IEEE Transactions on Neural Networks and Learning Systems, 34(8), 4208-4222.
Cleophas, T. J., & Zwinderman, A. H. (2024). Application of Regularized Regressions to Identify Novel Predictors in Clinical Research. Springer Nature.
Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456), 1348-1360.
Gupta, V., Chen, Y., & Wan, M. (2024). Predictability of weakly turbulent systems from spatially sparse observations using data assimilation and machine learning. arXiv preprint arXiv:2407.10088.
Hoerl, A. E., & Kennard, R. W. (1970). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1), 55-67.
Jolliffe, I. T. (2002). Principal component analysis (2nd ed.). Springer.
Kawano, S., Fujisawa, H., Takada, T., & Shiroishi, T. (2018). Sparse principal component regression for generalized linear models. Computational Statistics & Data Analysis, 124, 180-196.
Kitano, T., & Noma, H. (2024). Ridge, lasso, and elastic-net estimations of the modified Poisson and least-squares regressions for binary outcome data. arXiv preprint arXiv:2408.13474.
Manzhos, S., & Ihara, M. (2022). Advanced machine learning methods for learning from sparse data in high-dimensional spaces: A perspective on uses in the upstream of development of novel energy technologies. Physchem, 2(2), 72-95.
Meinshausen, N. (2007). Relaxed Lasso. Computational Statistics & Data Analysis, 52(1), 374-393.
Nwosu, A., Aimufua, G. I. O., Ajayi, B. A., & Olalere, M. (2024). The Impact of Regularization on Linear Regression Based Model. Journal of Artificial Intelligence and Computer Science, 1(1).
Priya, A. K., Gnanasekaran, L., Rajendran, S., Qin, J., & Vasseghian, Y. (2022). Occurrences and removal of pharmaceutical and personal care products from aquatic systems using advanced treatment-A review. Environmental Research, 204, 112298.
Song, J., Xu, L., & Wang, X. (2024, July). A Regularization Method for Enhancing the Robustness of Regression Networks. In 2024 43rd Chinese Control Conference (CCC) (pp. 8524-8529). IEEE.
Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society: Series B, 58(1), 267-288.
Witten, D. M., Tibshirani, R., & Hastie, T. (2009). A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatistics, 10(3), 515-534.
Zhang, X., Sun, Q., & Kong, D. (2024). Supervised Principal Component Regression for Functional Responses with High Dimensional Predictors. Journal of Computational and Graphical Statistics, 33(1), 242-249.
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B, 67(2), 301-320.
Zou, H., Hastie, T., & Tibshirani, R. (2006). Sparse principal component analysis. Journal of Computational and Graphical Statistics, 15(2), 265-286.
License
Copyright (c) 2025 Oluwafemi Clement Onifade, Samuel Olayemi Olanrewaju, Emmanuel Segun Oguntade

This work is licensed under a Creative Commons Attribution 4.0 International License.

