In this study, SPI values for the 6-, 9-, and 12-month timescales were calculated using monthly total rainfall data measured at the Cape Town International Airport meteorological station in the Western Cape region of South Africa (see Figure 6).
Figure 9 illustrates the SPI time series computed for the 6-month (SPI6), 9-month (SPI9), and 12-month (SPI12) timescales. In general, all three SPIs show several episodes of moderate to extreme drought in the study area. A prolonged drought period begins in late 2009 and persists until 2012. SPI12 also shows a persistent drought that started in 2014 and led to the "Day Zero" water supply conditions in Cape Town [43].
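As a rough illustration of how an SPI series of this kind can be obtained from monthly rainfall totals, the sketch below follows the common gamma-fit approach: accumulate rainfall over k months, fit a gamma distribution separately for each calendar month, and map the cumulative probability to a standard normal deviate. It is a simplified sketch, not the procedure used in this study. The function names (`rolling_sums`, `fit_gamma`, `gamma_cdf`, `spi`), the use of Thom's approximation for the gamma fit, and the probability clamping are all assumptions made for illustration.

```python
import math
from statistics import NormalDist


def rolling_sums(monthly_rain, k):
    """Return k-month accumulated rainfall, one value per complete window."""
    return [sum(monthly_rain[i - k + 1:i + 1]) for i in range(k - 1, len(monthly_rain))]


def fit_gamma(values):
    """Approximate maximum-likelihood gamma fit (Thom's approximation).

    Assumes at least two distinct, strictly positive values.
    """
    mean = sum(values) / len(values)
    a = math.log(mean) - sum(math.log(v) for v in values) / len(values)
    shape = (1 + math.sqrt(1 + 4 * a / 3)) / (4 * a)
    scale = mean / shape
    return shape, scale


def gamma_cdf(x, shape, scale, terms=500):
    """Gamma CDF via the series for the regularized lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    z = x / scale
    term = 1.0 / shape
    total = term
    for n in range(1, terms):
        term *= z / (shape + n)
        total += term
        if term < total * 1e-12:
            break
    log_prefactor = shape * math.log(z) - z - math.lgamma(shape)
    return min(1.0, total * math.exp(log_prefactor))


def spi(monthly_rain, k):
    """Simplified SPI for a k-month timescale (illustrative assumption, see lead-in)."""
    sums = rolling_sums(monthly_rain, k)
    normal = NormalDist()
    result = [0.0] * len(sums)
    for phase in range(12):  # windows 12 steps apart end in the same calendar month
        idx = list(range(phase, len(sums), 12))
        if not idx:
            continue
        group = [sums[i] for i in idx]
        positive = [s for s in group if s > 0]
        q = 1 - len(positive) / len(group)  # share of zero-rainfall windows
        shape, scale = fit_gamma(positive)
        for i in idx:
            h = q + (1 - q) * gamma_cdf(sums[i], shape, scale)
            h = min(max(h, 1e-6), 1 - 1e-6)  # keep inv_cdf away from 0 and 1
            result[i] = normal.inv_cdf(h)
    return result
```

Calling `spi(rain, 6)`, `spi(rain, 9)`, and `spi(rain, 12)` on a list of monthly totals would give series analogous to SPI6, SPI9, and SPI12. The sketch needs several years of data so that each calendar month has enough positive totals for the fit.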
The complexity and fluctuation of the original SPI series can hinder model fitting and convergence and thus limit prediction accuracy. To address this, the nonlinear and nonstationary SPI series is preprocessed with a decomposition method, namely the CEEMDAN algorithm [44]. Time series decomposition is a fundamental component of the hybrid forecasting model proposed in this research. CEEMDAN decomposed the training dataset into five IMF components and one trend series. As an example, CEEMDAN was applied to the SPI6 time series, and the results are shown in Figure 10. Of the resulting components, IMF6 (which is associated with the trend) appears to indicate a downward trend in the SPI6 time series, which could point to a dominance of dry spells over the study area in recent years.
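The decomposition described above can be summarized by the reconstruction identity of CEEMDAN: the modes and the final residual sum back to the original series. The notation below is introduced here for illustration only. The second line states an assumption, namely that the hybrid model forecasts each component separately and sums the component forecasts, as is common in decomposition-ensemble models; the text does not spell out this step.

```latex
% x(t): SPI series; IMF_k(t): k-th mode from CEEMDAN; r(t): residual trend
% (the component the text labels IMF6). K = 5 follows the text above.
x(t) = \sum_{k=1}^{K} \mathrm{IMF}_k(t) + r(t), \qquad K = 5

% Assumed recombination step: each component is forecast separately and the
% component forecasts are summed.
\hat{x}(t+1) = \sum_{k=1}^{K} \widehat{\mathrm{IMF}}_k(t+1) + \hat{r}(t+1)
```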
This study compared the prediction performance of the models listed in Table 4 before and after time series decomposition to determine whether decomposition improves prediction performance.
Figure 11 presents a comparison of the prediction results of different models, along with the original time series for SPI6, SPI9, and SPI12, as well as a Taylor diagram. Overall, all models appear to closely mimic the original SPI time series across all time scales (see
Figure 11). The Taylor diagram seems to indicate an improvement in prediction accuracy after applying the CEEMDAN signal decomposition method, with the CEEMDAN-ARIMA-LSTM model outperforming the other models across all time scales.
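For readers unfamiliar with Taylor diagrams, the sketch below computes the three statistics such a diagram summarizes for one model: the standard deviations of the reference and predicted series, their correlation coefficient, and the centered root-mean-square difference. The function name `taylor_statistics` is hypothetical and the code is an illustration only, not the plotting code used for Figure 11.

```python
import math


def taylor_statistics(reference, forecast):
    """Return (std_ref, std_fc, correlation, centered_rms_difference).

    Uses population (divide-by-n) statistics; both series must have the
    same length and non-zero variance.
    """
    n = len(reference)
    mean_r = sum(reference) / n
    mean_f = sum(forecast) / n
    dev_r = [r - mean_r for r in reference]
    dev_f = [f - mean_f for f in forecast]
    std_r = math.sqrt(sum(d * d for d in dev_r) / n)
    std_f = math.sqrt(sum(d * d for d in dev_f) / n)
    corr = sum(a * b for a, b in zip(dev_r, dev_f)) / (n * std_r * std_f)
    crmsd = math.sqrt(sum((f - r) ** 2 for f, r in zip(dev_f, dev_r)) / n)
    return std_r, std_f, corr, crmsd
```

The three quantities are linked by the law-of-cosines relation crmsd^2 = std_fc^2 + std_ref^2 - 2 * std_fc * std_ref * correlation, which is what allows them to be shown on a single two-dimensional diagram. A model whose point lies closer to the reference point has a higher correlation and a smaller centered difference.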
Table 4 compares the prediction performance of the different models using RMSE,
and DS. As the time scale increases, the RMSE values of the models decrease, while the DS values generally increase (see
Table 4). This indicates that the prediction accuracy of the models gradually improves with increasing time scale, peaking at the 12-month time scale. For instance, the LSTM model implemented at SPI6 had an RMSE of 0.234 and a
of 0.897, while at SPI12, it had an RMSE of 0.058 and a
of 0.984. In
Table 4, it can be observed that at all time scales, the RMSE values of the CEEMDAN-ARIMA and CEEMDAN-LSTM models were lower than those of the ARIMA and LSTM models, respectively, while the
values were higher than those of the single models. This indicates that the combined models predict more accurately and are therefore better suited to multiscale SPI prediction. At each time scale (SPI6, SPI9, and SPI12), the prediction accuracy of the combined ARIMA-LSTM model was also higher than that of the single models. For example, in SPI6, the model had an RMSE of 0.186 and a
of 0.931, while in SPI9, it had an RMSE of 0.077 and a
of 0.983. In SPI12, the RMSE was 0.057 and the
was 0.985. It is evident that the prediction performance after the CEEMDAN decomposition is superior to that of the undecomposed models, suggesting that the SPI time series is better predicted after decomposition. Among these models, the CEEMDAN-ARIMA-LSTM model achieves the highest prediction accuracy with RMSE values ranging from 0.120 to 0.042, DS values ranging from 0.915 to 0.950, and values ranging from 0.970 to 0.995, significantly outperforming other models.
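To make two of the metrics in Table 4 concrete, the sketch below computes RMSE and DS for a pair of observed and predicted series. DS is assumed here to be directional symmetry, expressed as the fraction of time steps at which the predicted change has the same sign as the observed change. That definition and the function names are assumptions for illustration; the study's exact formulation may differ.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between two equally long series."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def directional_symmetry(observed, predicted):
    """Fraction of steps where observed and predicted changes share a sign.

    Assumed definition: a step counts as a hit when the product of the
    observed and predicted changes is non-negative.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("series must have equal length of at least 2")
    hits = 0
    for t in range(1, len(observed)):
        d_obs = observed[t] - observed[t - 1]
        d_pred = predicted[t] - predicted[t - 1]
        if d_obs * d_pred >= 0:
            hits += 1
    return hits / (len(observed) - 1)
```

Under these definitions, a lower RMSE means smaller errors in magnitude, while a DS closer to 1 means the model more often predicts the correct direction of change in the SPI series.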