Maximizing Predictive Regression and Dimensionality Reduction Techniques: Evidence from a Monte Carlo Simulation Study

Authors

  • Oluwafemi Clement Onifade Department of Statistics, Faculty of Science, University of Abuja, Abuja, Nigeria
  • Samuel Olayemi Olanrewaju Department of Statistics, Faculty of Science, University of Abuja, Abuja, Nigeria
  • Emmanuel Segun Oguntade Department of Statistics, Faculty of Science, University of Abuja, Abuja, Nigeria

DOI:

https://doi.org/10.54536/ajase.v4i1.5938

Keywords:

Elastic Net, High-Dimensional Data, Lasso, Multicollinearity, Regularization, SCAD, SPCR

Abstract

This study proposes a novel two-step sparse learning framework that combines Sparse Principal Component Regression (SPCR) with four regularization methods (Lasso, Elastic Net, Ridge, and the Smoothly Clipped Absolute Deviation, SCAD) to improve prediction and interpretability in high-dimensional settings. Simulation experiments were conducted under varying sample sizes, dimensionality levels, sparsity conditions, and predictor correlations to evaluate the performance of the hybrid estimators against traditional penalization approaches. Results show that SPCR-Lasso and SPCR-Enet consistently deliver superior accuracy and stability in high-dimensional, multicollinear contexts, with SPCR-Enet performing particularly well under extreme dimensionality. SPCR-SCAD demonstrated advantages in sparse, low-correlation scenarios, while Ridge regression contributed modest improvements. These findings underscore that estimator performance is strongly data-dependent and highlight the value of SPCR hybridization for mitigating multicollinearity while enhancing interpretability. The study offers practical guidance for applied researchers in fields such as genomics, finance, and climate science, and contributes methodologically by demonstrating the robustness of SPCR-based regularization in handling complex high-dimensional data structures.
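The two-step idea described in the abstract (first extract sparse principal components, then fit a penalized regression on the component scores) can be sketched in a few lines. The code below is a minimal illustrative proxy, not the authors' estimator: sparsity in the loadings is imposed by crude soft-thresholding of SVD loadings rather than a full sparse-PCA solver, and the second stage is a plain coordinate-descent Lasso. All function names (`sparse_pca_loadings`, `lasso_cd`, `spcr_lasso_predict`) and tuning values are illustrative assumptions.

```python
import numpy as np

def soft_threshold(z, t):
    """Elementwise soft-thresholding operator S(z, t)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def sparse_pca_loadings(X, k, thresh):
    """Step 1 (crude SPCA proxy): SVD loadings, soft-thresholded
    so that small entries are zeroed, then renormalized."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = soft_threshold(Vt[:k].T, thresh)          # p x k sparse loadings
    norms = np.linalg.norm(V, axis=0)
    norms[norms == 0] = 1.0                        # guard fully-zeroed columns
    return V / norms

def lasso_cd(Z, y, lam, n_iter=200):
    """Step 2: Lasso via coordinate descent on the component scores,
    minimizing (1/2n)||y - Z b||^2 + lam ||b||_1."""
    n, p = Z.shape
    beta = np.zeros(p)
    col_ss = (Z ** 2).sum(axis=0)
    col_ss[col_ss == 0] = 1.0
    for _ in range(n_iter):
        for j in range(p):
            r = y - Z @ beta + Z[:, j] * beta[j]   # partial residual
            beta[j] = soft_threshold(Z[:, j] @ r, n * lam) / col_ss[j]
    return beta

def spcr_lasso_predict(X_train, y_train, X_test, k=3, thresh=0.1, lam=0.05):
    """Hybrid SPCR-Lasso sketch: sparse components, then Lasso on scores."""
    V = sparse_pca_loadings(X_train, k, thresh)
    mu = X_train.mean(axis=0)
    Z = (X_train - mu) @ V                         # training scores
    beta = lasso_cd(Z, y_train - y_train.mean(), lam)
    return y_train.mean() + (X_test - mu) @ V @ beta
```

Swapping `lasso_cd` for a Ridge, Elastic Net, or SCAD solver in the second stage yields the other hybrids compared in the study; the simulation design there varies sample size, dimensionality, sparsity, and predictor correlation around exactly this pipeline.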

References

Ali, H., Shahzad, M., Sarfraz, S., Sewell, K. B., Alqalyoobi, S., & Mohan, B. P. (2023). Application and impact of Lasso regression in gastroenterology: a systematic review. Indian Journal of Gastroenterology, 42(6), 780-790.

Chatterjee, I., & Baumgärtner, L. (2024). Unveiling Functional Biomarkers in Schizophrenia: Insights from Region of Interest Analysis Using Machine Learning. Journal of Integrative Neuroscience, 23(9).

Chen, J., Yang, S., Wang, Z., & Mao, H. (2021). Efficient sparse representation for learning with high-dimensional data. IEEE Transactions on Neural Networks and Learning Systems, 34(8), 4208-4222.

Cleophas, T. J., & Zwinderman, A. H. (2024). Application of Regularized Regressions to Identify Novel Predictors in Clinical Research. Springer Nature.

Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456), 1348-1360.

Gupta, V., Chen, Y., & Wan, M. (2024). Predictability of weakly turbulent systems from spatially sparse observations using data assimilation and machine learning. arXiv preprint arXiv:2407.10088.

Hoerl, A. E., & Kennard, R. W. (1970). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1), 55-67.

Jolliffe, I. T. (2002). Principal Component Analysis (2nd ed.). Springer.

Kawano, S. (2018). Sparse Principal Component Regression (one-stage SPCR literature).

Kitano, T., & Noma, H. (2024). Ridge, lasso, and elastic-net estimations of the modified Poisson and least-squares regressions for binary outcome data. arXiv preprint arXiv:2408.13474.

Manzhos, S., & Ihara, M. (2022). Advanced machine learning methods for learning from sparse data in high-dimensional spaces: A perspective on uses in the upstream of development of novel energy technologies. Physchem, 2(2), 72-95.

Meinshausen, N. (2007). Relaxed Lasso. Computational Statistics & Data Analysis, 52(1), 374-393.

Nwosu, A., Aimufua, G. I. O., Ajayi, B. A., & Olalere, M. (2024). The Impact of Regularization on Linear Regression Based Model. Journal of Artificial Intelligence and Computer Science, 1(1).

Priya, A. K., Gnanasekaran, L., Rajendran, S., Qin, J., & Vasseghian, Y. (2022). Occurrences and removal of pharmaceutical and personal care products from aquatic systems using advanced treatment-A review. Environmental Research, 204, 112298.

Song, J., Xu, L., & Wang, X. (2024, July). A Regularization Method for Enhancing the Robustness of Regression Networks. In 2024 43rd Chinese Control Conference (CCC) (pp. 8524-8529). IEEE.

Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society: Series B, 58(1), 267-288.

Witten, D. M., Tibshirani, R., & Hastie, T. (2009). A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatistics, 10(3), 515-534.

Zhang, X., Sun, Q., & Kong, D. (2024). Supervised Principal Component Regression for Functional Responses with High Dimensional Predictors. Journal of Computational and Graphical Statistics, 33(1), 242-249.

Zou, H., & Hastie, T. (2005). Regularization and variable selection via the Elastic Net. Journal of the Royal Statistical Society: Series B, 67(2), 301-320.

Zou, H., Hastie, T., & Tibshirani, R. (2006). Sparse principal component analysis. Journal of Computational and Graphical Statistics, 15(2), 265-286.

Published

2025-10-18

How to Cite

Onifade, O. C., Olanrewaju, S. O., & Oguntade, E. S. (2025). Maximizing Predictive Regression and Dimensionality Reduction Techniques: Evidence from a Monte Carlo Simulation Study. American Journal of Applied Statistics and Economics, 4(1), 127-140. https://doi.org/10.54536/ajase.v4i1.5938