Sparse Dynamic Factor Modeling of Some Selected Climatic Variables
DOI:
https://doi.org/10.54536/ajase.v4i1.5500
Keywords:
Annual Average Surface Air Temperature, Annual Precipitation, Maximum Number of Consecutive Wet Days, Number of Days with Heat Index > 35°C
Abstract
Classical forecasting models struggle to handle missing data, a common issue in climate records caused by irregular reporting intervals or sensor failures. Incomplete datasets can lead to biased or unreliable forecasts, further complicating efforts to predict climatic variables accurately. This study examines the performance of sparse dynamic factor models (DFMs) on climate data. The comparison covers ARIMA, PCA, Two-Stage DFM, EM-Based DFM, Sparse DFM, Lasso, and Group Lasso. The study combines traditional statistical approaches with penalized likelihood optimization, which yields sparse, interpretable models. The data were drawn from the Nigerian Meteorological Agency (NiMet) and the National Bureau of Statistics (NBS) 2023 statistical bulletin. They comprise Annual Average Surface Air Temperature, Annual Precipitation, Number of Days with Heat Index > 35°C, and Maximum Number of Consecutive Wet Days. Group Lasso consistently yielded the lowest mean squared error (MSE) across the key variables, namely Air Temperature (0.3854), Precipitation (921.27), Heat Days (296.85), and Wet Days (748.90), outperforming all benchmark models. ARIMA, PCA, and Two-Stage DFM recorded substantially higher errors, which suggests that they fail to capture the intricate, nonlinear dependencies present in climate processes.
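As a rough illustration of the ideas named in the abstract, the sketch below contrasts the shrinkage step behind a Lasso penalty (entries are zeroed one at a time) with the one behind a Group Lasso penalty (a whole group is zeroed together), and shows the MSE criterion used to rank the models. It is not the authors' implementation: the function names, the example loading row, and the penalty values are all hypothetical and chosen only for illustration.

```python
import math


def soft_threshold(values, lam):
    """Lasso shrinkage: pull each entry toward zero by lam, independently."""
    return [math.copysign(max(abs(v) - lam, 0.0), v) for v in values]


def group_soft_threshold(values, lam):
    """Group Lasso shrinkage: scale the whole group by its Euclidean norm.

    If the norm of the group is at most lam, every entry becomes zero.
    """
    norm = math.sqrt(sum(v * v for v in values))
    if norm <= lam:
        return [0.0] * len(values)
    scale = 1.0 - lam / norm
    return [scale * v for v in values]


def mse(actual, forecast):
    """Mean squared error between observed and forecast values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical loadings of one climate variable on three latent factors.
loading_row = [0.9, -0.05, 0.2]

# Lasso removes only the entry that is small in absolute value.
lasso_row = soft_threshold(loading_row, 0.1)

# With a penalty above the row's norm, Group Lasso removes the whole row.
group_row = group_soft_threshold(loading_row, 1.0)
```

The element-wise step leaves a row with some loadings set to zero, while the group step either keeps or drops a block of loadings as a unit. Which grouping (by variable, by factor, or otherwise) the study used is not stated in the abstract.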
Copyright (c) 2025 Daramola Azeez Mustapha, Samuel Olorunfemi Adams, Mary Unekwu Adehi

This work is licensed under a Creative Commons Attribution 4.0 International License.

