Maximizing Predictive Regression and Dimensionality Reduction Techniques: Evidence from Monte Carlo’s Simulation Study
DOI: https://doi.org/10.54536/ajase.v4i1.5938

Keywords: Elastic Net, High-Dimensional Data, Lasso, Multicollinearity, Regularization, SCAD, SPCR

Abstract
This study proposes a novel two-step sparse learning framework that combines Sparse Principal Component Regression (SPCR) with four regularization methods, Lasso, Elastic Net, Ridge, and the Smoothly Clipped Absolute Deviation (SCAD) penalty, to improve prediction accuracy and interpretability in high-dimensional settings. Monte Carlo simulation experiments were conducted under varying sample sizes, dimensionality levels, sparsity conditions, and predictor correlations to compare the hybrid estimators against traditional penalization approaches. Results show that SPCR-Lasso and SPCR-Enet consistently deliver superior accuracy and stability in high-dimensional, multicollinear settings, with SPCR-Enet performing particularly well under extreme dimensionality. SPCR-SCAD demonstrated advantages in sparse, low-correlation scenarios, while Ridge regression contributed only modest improvements. These findings underscore that estimator performance is strongly data-dependent and highlight the value of SPCR hybridization for mitigating multicollinearity while enhancing interpretability. The study offers practical guidance for applied researchers in fields such as genomics, finance, and climate science, and contributes methodologically by demonstrating the robustness of SPCR-based regularization for complex high-dimensional data structures.
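The two-step framework described above can be sketched in code: a sparse principal component stage first reduces the correlated, high-dimensional predictors to a small set of sparse component scores, and a penalized regression is then fitted on those scores. The sketch below is illustrative only, using scikit-learn's SparsePCA, Lasso, and ElasticNet as stand-ins; the paper's exact SPCR formulation, tuning-parameter choices, and the SCAD penalty (which scikit-learn does not provide) may differ, and the simulated data here is a hypothetical design, not the study's simulation settings.

```python
import numpy as np
from sklearn.decomposition import SparsePCA
from sklearn.linear_model import Lasso, ElasticNet
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(0)

# Illustrative high-dimensional, multicollinear design (p > n):
# predictors share 5 latent factors, and the true coefficient
# vector is sparse (only the first 10 entries are nonzero).
n, p = 100, 200
latent = rng.standard_normal((n, 5))
X = latent @ rng.standard_normal((5, p)) + 0.1 * rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:10] = 2.0
y = X @ beta + rng.standard_normal(n)

# Step 1: sparse principal components (interpretable loadings).
# Step 2: penalized regression on the component scores, giving the
# SPCR-Lasso and SPCR-Enet hybrids discussed in the abstract.
spcr_lasso = Pipeline([
    ("spca", SparsePCA(n_components=10, alpha=1.0, random_state=0)),
    ("lasso", Lasso(alpha=0.1)),
])
spcr_enet = Pipeline([
    ("spca", SparsePCA(n_components=10, alpha=1.0, random_state=0)),
    ("enet", ElasticNet(alpha=0.1, l1_ratio=0.5)),
])

spcr_lasso.fit(X, y)
spcr_enet.fit(X, y)
print("SPCR-Lasso R^2:", round(spcr_lasso.score(X, y), 3))
print("SPCR-Enet  R^2:", round(spcr_enet.score(X, y), 3))
```

Because the second-stage penalty acts on a handful of component scores rather than all p correlated predictors, the hybrid sidesteps the instability that multicollinearity induces in a direct Lasso or Elastic Net fit, which is the mechanism the simulation study evaluates.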
References
Ali, H., Shahzad, M., Sarfraz, S., Sewell, K. B., Alqalyoobi, S., & Mohan, B. P. (2023). Application and impact of Lasso regression in gastroenterology: a systematic review. Indian Journal of Gastroenterology, 42(6), 780-790.
Chatterjee, I., & Baumgärtner, L. (2024). Unveiling Functional Biomarkers in Schizophrenia: Insights from Region of Interest Analysis Using Machine Learning. Journal of Integrative Neuroscience, 23(9).
Chen, J., Yang, S., Wang, Z., & Mao, H. (2021). Efficient sparse representation for learning with high-dimensional data. IEEE Transactions on Neural Networks and Learning Systems, 34(8), 4208-4222.
Cleophas, T. J., & Zwinderman, A. H. (2024). Application of Regularized Regressions to Identify Novel Predictors in Clinical Research. Springer Nature.
Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456), 1348-1360.
Gupta, V., Chen, Y., & Wan, M. (2024). Predictability of weakly turbulent systems from spatially sparse observations using data assimilation and machine learning. arXiv preprint arXiv:2407.10088.
Hoerl, A. E., & Kennard, R. W. (1970). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1), 55-67.
Jolliffe, I. T. (2002). Principal component analysis (2nd ed.). Springer.
Kawano, S., Fujisawa, H., Takada, T., & Shiroishi, T. (2018). Sparse principal component regression for generalized linear models. Computational Statistics & Data Analysis, 124, 180-196.
Kitano, T., & Noma, H. (2024). Ridge, lasso, and elastic-net estimations of the modified Poisson and least-squares regressions for binary outcome data. arXiv preprint arXiv:2408.13474.
Manzhos, S., & Ihara, M. (2022). Advanced machine learning methods for learning from sparse data in high-dimensional spaces: A perspective on uses in the upstream of development of novel energy technologies. Physchem, 2(2), 72-95.
Meinshausen, N. (2007). Relaxed Lasso. Computational Statistics & Data Analysis, 52(1), 374-393.
Nwosu, A., Aimufua, G. I. O., Ajayi, B. A., & Olalere, M. (2024). The Impact of Regularization on Linear Regression Based Model. Journal of Artificial Intelligence and Computer Science, 1(1).
Priya, A. K., Gnanasekaran, L., Rajendran, S., Qin, J., & Vasseghian, Y. (2022). Occurrences and removal of pharmaceutical and personal care products from aquatic systems using advanced treatment-A review. Environmental Research, 204, 112298.
Song, J., Xu, L., & Wang, X. (2024, July). A Regularization Method for Enhancing the Robustness of Regression Networks. In 2024 43rd Chinese Control Conference (CCC) (pp. 8524-8529). IEEE.
Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society: Series B, 58(1), 267-288.
Witten, D. M., Tibshirani, R., & Hastie, T. (2009). A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatistics, 10(3), 515-534.
Zhang, X., Sun, Q., & Kong, D. (2024). Supervised Principal Component Regression for Functional Responses with High Dimensional Predictors. Journal of Computational and Graphical Statistics, 33(1), 242-249.
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B, 67(2), 301-320.
Zou, H., Hastie, T., & Tibshirani, R. (2006). Sparse principal component analysis. Journal of Computational and Graphical Statistics, 15(2), 265-286.
License
Copyright (c) 2025 Oluwafemi Clement Onifade, Samuel Olayemi Olanrewaju, Emmanuel Segun Oguntade

This work is licensed under a Creative Commons Attribution 4.0 International License.

