3.1. Experience Gained from Forecasting Previous Major El Niño Events as a Guide for Forecasting the 2023-2024 El Niño Using the ‘Natural Time Analysis”
Our first objective is to study the evolution of the current 2023-2024 El Niño based on the conclusions drawn from the forecast analysis proposed by Varotsos et al. [
19] regarding the 2015–2016 El Niño. In this context, we apply the method developed in [
19,
20] on the SOI dataset for the period from January 1876 to July 2023. All SOI values are considered as small, medium, or large SOI events.
Thus, we create a new time series
where
(
) is the
i-th event (minimum value) of the original SOI dataset and
is the total number of SOI events, over the entire period (January 1876 to July 2023). We then use the technique of the “natural time analysis” (NTA), matching each event
with the quantity
denoting the order of the occurrence
against the total number of events within a window of
k events, i.e.,
=
j /
k, with
. Thus, we introduce a new sequence of pairs (
,
) where
> 0, using the order of events as a measure of time instead of the conventional clock time (
t) [
22,
23,
24,
25,
26].
However, the quantity
,
, could be considered as a probability, since
and
= 1 [
22,
27], so we try to calculate the entropy of SOI events in the natural time domain as follows:
The entropy
is calculated for a sliding window of
k-length, each time by 1 month, running the entire SOI time series of the
N-events. Then, we again use Eq. (2) to calculate the entropy, but, this time, considering the time reversal in each window, i.e.
=
, with
, (see [
19,
20,
23]). According to [
23] the obtained entropy (
) is different from
and the quantity Δ
indicates the time symmetry breaking.
Positive values of Δ
correspond to a decreasing time series in natural time and when Δ
exceeds a certain threshold, extremely small SOI events occur, revealing El Niño (see details in [
24,
25,
26,
27]).
Furthermore, Varotsos et al. [
19,
20] suggested that the most useful window size for the above-described analysis is
k = 20 events (months). Thus, we herewith calculate Δ
for the past 20 months and this window is sliding, each time by 1 month, and ran the entire SOI time series of
N-events.
We also use the well-known Receiver Operating Characteristics (ROC, see [
28]) method to estimate the most appropriate threshold (for Δ
) that could be used as a forecasting tool for small SOI events. More specifically, when Δ
is equal to or exceeds a certain threshold (Δ
) during the
i-th month of the total period under study, an event low
(less than or equal to a specific value
T) is predicted for the next month. If this is verified, then we have a “true positive prediction”. Conversely in case Δ
< Δ
and
, then we have a “true negative prediction”, while all other combinations lead to errors (see details in [
19].
Following the classification of the past El Niño events given by the BOM (
http://www.bom. gov.au/climate/enso/enlist/) we now consider a
value of -14 (see
Figure 1 in [
19]). We herewith present in
Figure 2 the true positive rate (TPR, i.e., the number of true positive predictions in all cases with
≤
T = −14) versus the False Positive rate (FPR, i.e., the number of false positive predictions for all cases with
> −14) for various Δ
-values.
It is worth noting that the best possible prediction method yields a point in the upper left corner or coordinate (0,1) of the ROC space, representing 100% sensitivity (no false negatives) and 100% specificity (no false positives). Points along the diagonal line y = x indicate a random guess (which for a finite number of trials ranges around this diagonal (see, e.g.,[
29]) while points above the diagonal indicate good prediction results (better than random) and points below the diagonal indicate poor results (worse than random).
The obtained ROC curve (for the studied period January 1876 to July 2023) best fits a function of the form
+
. So, we find that the point
of the
-curve with the maximum distance
d(
x) from the diagonal line
y =
x (i.e.,
d(
x) =
) is associated with Δ
= 0.0035. This value is exactly equal to the Δ
-value suggested by Varotsos et al [
19] (for their studied period January 1878-October 2015). We note that the slope of the tangent line through the point
of the
-curve (i.e., the derivative
) is unity (
Figure 2).
In
Figure 3, we now plot the time progression of monthly SOI events as well as the entropy change Δ
in natural time under time reversal, during the period January 2010 - July 2023. In order to further determine whether 2015–2016 El Niño could be classified as “very strong” or even more “one of the strongest on record”, we again use the classification of past El Niño events provided by the BOM (
http://www.bom.gov.au/climate/history/enso/).
In
Figure 3 the coloured areas represent the mean minimum negative SOI value together with the 1σ standard deviation bands for the two cases of “weak, weak to moderate, moderate, moderate to strong” (green band) and “strong, very strong” (yellow band) El Niño events. We notice in
Figure 3 that the monthly SOI events, for the period 2015–2016, remain in the green zone and on the yellow border.
Furthermore, looking at
Figure 1 in Varotsos et al. [
19], it appears that the SOI time series during the period 2015–2017 presents a less pronounced downward trend compared to the corresponding ones during El Niño events of previous periods such as 1982–1983 and 1997–1998 (see
Figure 4).
These observations lead to the conclusion that the forecasting analysis proposed by Varotsos et al. [
19], is fully validated, and the 2015-2016 El Niño was rather characterized as a “moderate to strong” event rather than “one of the strongest on record”.
3.3. Forecasting El Niño Events Using the “Modified Natural Time Analysis” Applied to ONI
We now investigate the distribution of ONI values, labelled by
x, over the period from January 1950 to April 2023, using the Modified Natural Time Analysis (M-NTA). By applying the non-parametric Kolmogorov Smirnov (KS) test, we test the hypothesis
Ho: the sample values follow a normal distribution against
H1: the sample values do not follow the normal distribution (
Figure 6a). The calculated KS-statistic
Dn and the corresponding
p-values are 0.043 and 0.076, respectively. From 0.076 > 0.05 we conclude that the dataset fits the normal distribution well, at the 95% confidence level.
However, this is inconsistent with the quantile-quantile (Q-Q) plot shown in
Figure 6b, which is a graphical method for assessing whether two data sets come from populations with a common distribution. The vertical axis depicts the sample quantiles, while the horizontal axis represents the theoretical quantiles. According to
Figure 6b, the ONI values seem to diverge from the normal distribution, and the inconsistency is mainly attributed to the extreme values of the data set. To clarify this point, we apply a newly developed nowcasting method, suggested by Varotsos et al. [
30,
31,
32]. This method is a modified NTA (M-NTA) and its steps are described below:
We plot the logarithm of the cumulative number (
CN) of the ONI observations equal to or above a certain x-value versus the x magnitude (
Figure 7). For high ONI values, regression analysis shows a statistically significant linear fit between log
CN and
x. The best linear fit is achieved for the range (−0.11, 1.93):
The F-test (t-test) suggests that the calculated values of R2 = 0.99 (a0 = −0.69 and a1 = 2.67) are statistically significant (at 95% confidence level) thus giving indications that high ONI values may follow a semi-logarithmic distribution resembling the Gutenberg-Richter (GR) law (Varotsos et al. 2020b, 2004, 2021).
To fit ONI values above rollover (i.e.,
x ≥ 1.93), we use an upper-truncated GR model developed by Burroughs and Tebbens (2002):
where the values
a0,
a1 are derived from Eq. (4) and
xmax = 2.72 is chosen to obtain the most accurate approximation.
Next, we use the NTA in order to study the exceptional events in a time series. As we said above "natural time" ignores the clock-time that an event occurs and instead provides an index that is the order of occurrence of that event divided by the total number of events in the time-series [
24,
26,
33,
34]. In this sense, we detect the high ONI values (with
x ≥
x2) in the studied time series, and every time a high ONI value occurs, we calculate the cumulative numbers
CN1,
CN2 of the ONI values with magnitude
x ≥
x1,
x ≥
x2 respectively (where
x1<
x2) that occur after this high ONI value until the end of the time series. The
x1 value is chosen as the average of the ONI dataset (i.e.−0.003), while the value
x2 = 0.826 is set and corresponded to the mean increased by the standard deviation of the dataset.
The above-mentioned technique allows us to precisely test the accuracy of the GR fit by examining whether two values with constant difference x2 − x1 have a constant ratio:
as predicted by the GR relation:
where
a0 is estimated from Eq. (4) and C
N1, C
N2 is the cumulative number of ONI values with magnitude
x ≥
x1,
x ≥
x2 respectively.
Indeed, we plot the pairs
in
Figure 8a, and an almost perfect linear fit
(with
A =
and
R² = 0.97), thus confirming the accuracy of the GR- fit.
The NTA is also used to forecast the occurrence rate of future extreme ONI events, based on the estimated mean occurrence rate of the lowest (and most frequent) ONI values.
To this end, we first plot
CN1 versus the clock time
t (during January 1950 – April 2023) where a nearly perfect linear fit
on the pairs
emerges (
Figure 8b). The constant
is estimated by the average of the ratio
and its confidence interval is:
where
n is the count of the pairs
and
is the critical value of the standard normal distribution at
a significance level.
The extracted
R2 = 0.98 indicates high statistical significance at the 95% confidence level. Then, utilizing all the above, we attempt to predict the occurrence rate
and consequently the average time interval between two successive ONI values with
x ≥
x0 for some selected high magnitudes i.e.,
, using the following formula which derives from Eq. (6):
where
b is seen in
Figure 8b (
Table 1).
It is worth noting that, in case of extremely high ONI values (with x ≥ 1.93) we use the truncated GR scaling given in Eq. (5) (i.e., ).
The last step of our survey is to repeat the NTA, this time, applied to a new time series, which is generated by the initial ONI time series having reversed its time evolution. The analysis gives similar results (i.e.,
with
R² = 0.99 and
b = 0.50 ± 0.04 with
R2 = 0.995), and the estimated nowcasted mean inter-event time for the selected ONI values are presented in
Table 1 (with grey colour).
According to Thompson [
13] the phenomenon “El Niño” has arrived causing major changes to the weather all over the world. This year or next is going to be the hottest year on record, as happened in 2016. She also suggests that the 2016 El Niño was comparable in strength to the one in 1998 (see also [
16]. The ONI values from October 1997 to January 1998 (i.e., 2.33, 2.4, 2.39, 2.24, respectively), as well as the ONI value from October 2015 to January 2016 (i.e., 2.42, 2.57, 2.64, 2.48, respectively), are seen to be the maxima of the total time-series (see
Figure 9).
Using the M-NTA the estimation of the mean inter-event time for the ONI value occurred in November 1997 (i.e.,
x0 = 2.4) is between 17.4 years and 26.1 years, a forecasting that sufficiently describes the empirical inter-event time from 1997 to 2016 (see
Table 1). Furthermore, in December 1982 (i.e., 15 years before December 1997) the ONI value is
x0 = 2.23 which corresponds to the estimated inter-event time (9.8, 14.7) years, revealing a good agreement between the nowcasted and the empirical time. Finally, using
Table 1 we can predict that the mean time for the re-appearance of the ONI values observed in November 2015 and December 2015, i.e., 2.57 and 2.64 is expected to be (42.8, 64.3) years and (85.1, 127.6) years respectively.
Regarding the prediction of the new El Nino event, this is linked to the extreme values of many previous years. For example, in November 1997 we had an extreme ONI-value = 2.40 which according to the model is expected to reappear in (17.4, 26.1) years, possibly reaching 2023. On the other hand, from October 2015 to December 2015 extreme ONI values i.e., 2.42, 2.57, 2.64 arose which according to the model may be repeated in [18.9, 28.3] years, (42.8, 64.3) years and (85.1, 127.6) years, respectively. Therefore, extreme value is difficult to reappear in 2023 (originating from 2015-2016 with ONI-value ≥ 2.42) but can appear originating from 1997 (with ONI-value ≈ 2.40).
However, it is very likely that for the winter of 2023-2024, a large ONI-value associated with 2015-2016 will appear since in September 2015 and February 2016 we had a large ONI-value = 2.16 and ONI-value = 2.14 respectively, which according with the model they will repeat in (8.0, 12.0) years and (7.6, 11.4) years, possibly reaching 2023-2024.
3.4. Forecasting El Niño Events Using the “Modified Natural Time Analysis” Applied to SOI
To confirm the above-mentioned results, we apply the same M-NTA method on the SOI dataset during the period January 1950-April 2023, focusing on the low SOI-values that appear to dominate during the three strong ENSO events (1982–1983, 1997–1998, 2015-2016).
Thus, we plot the logarithm of the cumulative number (cN) of the SOI observations equal to or below a certain x-value against x. For low SOI values, the regression analysis gives a statistically significant linear fit between logcN and x (i.e., logcN = a3 + a2x) with optimal features in the range of values (-∞, -3.1] where the calculated values of R2 = 0.99 (a2 = 0.07 and a3 = 2.86) are statistically significant (at 95% confidence level).
Then, we detect the low SOI values (with x ≤ x2) in the studied time series, and every time a low SOI value occurs, we calculate the cumulative numbers cN1, cN2 of the SOI values with magnitude x ≤ x1, x ≤ x2 respectively (where x1<x2) that occur after this low SOI value until the end of the time series. The x1 value is chosen as the threshold of the above-mentioned value range (-∞, -3.1], while the value x2 = -8.2 is the 20% percentile of the entire dataset.
Plotting the pairs (cN1, cN2), a perfect linear fit (with A = and R² = 0.99) confirms the accuracy of the GR- fit. We then plot cN1 against clock time t (during January 1950 – April 2023) and a statistically significant (at 95% confidence level) linear fit (with b = 0.36 ± 0.04 and R2 = 0.98) is revealed.
Finally, using the above results, we forecast the occurrence rate
and consequently the average time interval between two successive SOI values with
x ≤
x0. In this regard, we select the low SOI values with -35.7 ≤
x0 ≤ -19.0, i.e., values like those observed during the three strong ENSO events (
Table 2).
Using
Table 2, we see that the average time for the SOI values observed in October 2015 and January 2016 (i.e., -21.30 and -21.7) to reappear is (4.5, 6.8) years and (4.9, 7.2) years respectively, while the SOI magnitudes in November 1982 and March 1998 (i.e., -30 and -26.1) correspond to (20.6, 30.6) years and (10.4, 15.6) years, respectively.
However, all these extremely low values seem to not affect the current ENSO event (2023-2024). On the other hand, the January 1983 SOI value (i.e., -31.4) corresponds to (26.2, 39.1) years and it could be related to the 2015-2016 ENSO, while the February 1983 SOI value (i.e., -35.7) corresponds to (55.4, 82.5) years and is expected to affect the years after 2038. As for the SOI value of May 2023 (i.e., -15.26), it may be related to the minimum SOI of 2019 (i.e., -14.6).