1. Motivation
Ocean variability arises from external forcing (for example, solar radiation) and from internal processes associated with the nonlinearity and multi-scale interactions within the system [1]. Variability caused by internal processes is called internal variability; it is intrinsic, unprovoked, and chaotic, and cannot be linked directly to external forcing or to the details of the initial state [2]. Studying internal, unprovoked variability in observational data of atmospheric and oceanic weather is difficult: baroclinic disturbances in the atmosphere are a manifestation of such variability, but that they really form spontaneously became clear only when atmospheric models became available in the 1970s for extended simulations allowing the analysis of unprovoked variability; the challenge of how to determine the "noise level" and how to perform statistical tests was introduced by [3,4]. Thus, the recognition of internal variability dates back some 50 years. A nice illustration of the formation of such unprovoked variability is provided by Fischer's experiment [5], an aqua-planet simulation initiated at a state of rest, i.e., with zero wind everywhere (see Figure 1). During the first, say, 10 days, the atmosphere remains at a state of rest, but then the general circulation begins to emerge, in particular the baroclinic storms at Northern and Southern Hemisphere mid-latitudes.
Such a recognition took much longer for limited-area models of atmospheric dynamics, because of a tacit understanding that any such variability would quickly be wiped out by the continuously updated lateral boundary conditions. However, when ensembles of simulations with regional models were eventually integrated [6,7], it quickly became clear that relevant intra-ensemble variability forms [8]. The tendency to build internal variability depends on the size of the integration area: in small areas the noise level remains very small (e.g., [9]); in large areas substantial noise forms (e.g., [10]).
In the ocean, the recognition of internal variability had to wait much longer, until the availability of satellites, which revealed the ubiquity of meso-scale eddies and other meso-scale features (Figure 2). Since the early 2000s, studies evaluating ensembles of ocean simulations began to be published, and with them the recognition that internal variability forms in the ocean itself and is not just inherited from the atmosphere. Leading examples were Jochum and Murtugudde [11,12], Arbic [13], and Sérazin [14]; a first systematic analysis of the formation and characteristics of internally generated variability in the global ocean was done by Penduff et al. [15].
The question of internal variability in marginal seas was taken up even later, even though the observation of divergent trajectories of tracers had long been known (Figure 3). Pioneers were Büchmann and Söderkvist [16], Waldman et al. [17], Tang et al. [18,19], Lin et al. [20,21], and Benincasa et al. [22], all using the ensemble framework (see Section 3). Tang et al. demonstrated for the South China Sea that increasing the horizontal spatial resolution leads to the formation of more intrinsic variability, and that the external forcing is dominant at large scales, while most of the small-scale variability is internally generated. Lin et al. [21], on the other hand, found the tendency to generate internal variability to depend on the presence of tides and of a seasonal thermocline, and demonstrated the significance of baroclinic instability in forming noise [23], consistent with the results of Waldman et al. [24]. More on this in Section 3.1.
The concept that climate is not static but rather undergoes variations across all time scales has been recognized for a considerable period, as evidenced by the works of Brückner [26] in 1890 and Mitchell [27] in 1976 (cf. Figure 4). The debate surrounding whether these variations stem from anthropogenic sources, particularly deforestation, or are intrinsic to Earth's own processes (referred to at the time as "cosmic" processes), has been ongoing since at least the mid-19th century, if not earlier, extending beyond academic circles to encompass political discourse.
In 1976, Klaus Hasselmann [1] introduced an alternative mechanism, proposing that short-term, small-scale weather fluctuations (such as atmospheric storms and oceanic eddies) are integrated by components of the climate system with significant "memory." This proposition, articulated in his "Stochastic Climate Model," was among the achievements that ultimately earned him the Nobel Prize in Physics in 2021.
The challenge to "detect" and characterize such "unprovoked variability" and to "attribute" plausible causes and processes for its emergence has persisted to some extent. In the realm of global atmospheric dynamics, detection became straightforward once global general circulation models of the atmosphere could be integrated over extended periods or in ensembles of simulations. Similarly, attribution was relatively uncomplicated, given the recognition of baroclinic instability since the time of Vilhelm Bjerknes [
28], .
In contrast, detection and attribution for the global ocean occurred much later, coinciding with the availability of eddy-permitting models and high-resolution satellite imagery of the ocean surface. Similar delays were observed for regional atmospheric and oceanic systems, where detection efforts were initially limited. However, pioneers in ocean noise research like Penduff [15] shed light on macro-scale internal variability, and Waldman [17] and Lin [23] determined a dominant role of baroclinicity in amplifying small-scale variations to large-scale variability. Nonetheless, other processes may also contribute to the formation of macro-scale noise.
The emergence of macro-scale noise carries three significant implications. The motivation of this article is to illustrate these challenges in Section 5:
Firstly, this noise is expected to arise in systems influenced by short-term weather variations which lack strong damping but possess a robust memory. Such systems are foremost all atmospheric and oceanic hydrodynamical systems with short-term variations related to eddies, internal tides, fronts, and other phenomena. A very different case of such systems encompasses regional morphodynamics, as highlighted by a CNN report following a US submarine incident in the South China Sea, which emphasized the ongoing, albeit gradual, changes in the environment and seafloor. This underscores the necessity for continuous bottom-contour mapping in the region. Additionally, ecosystem dynamics may also be affected by such noise.
Another aspect pertains to the realm of numerical experimentation, where alterations in factors such as parameterizations, boundary conditions, and atmospheric composition are introduced in simulations. In such experiments, appropriately designed ensembles are crucial for estimating the extent of inherent variability and for determining whether changes between ensembles can be attributed solely to internal variability or whether external factors play a role (an issue akin to detection; e.g., [29,30]).
The third is the conventional "detection and attribution" challenge [31]: whether observed variations may be understood in the framework of internal variability, or whether an external factor needs to be determined to explain the observed change, which brings us back to the need, mentioned in the beginning, to separate forced and unforced climate variations.
2. The Stochastic Climate Model
The "Stochastic Climate Model" was originally introduced by Klaus Hasselmann [
1] in [
1], and its applicability was demonstrated by [
32,
33,
34]. Originally, it was quite a challenge to understand the concept, but as people got more used to it, it became really simple. When considering stationary solutions of the dynamics
x of a not too non-linear system
it may be approximated by a linear stochastic system
or
with a "memory-term"
and a stochastic "forcing" component
. This formulation is based on the assumption that the slow part of
y, named
x, is exposed to the action of the fast components of
y which may be summarized as white noise
.
More complex approximations than (2) are of course possible, but the forms (2) and (3), known as the "Ornstein-Uhlenbeck process" and the "autoregressive process of 1st order" (AR-1), respectively, have turned out quite successful. For further details, refer to Hasselmann [1] or the textbook [35].
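For illustration, the AR-1 process (3) can be simulated directly; the following minimal Python sketch (our own illustration, with freely chosen parameter values, not code from [1] or [35]) generates one realization:

```python
import numpy as np

def simulate_ar1(alpha, n_steps, sigma=1.0, x0=0.0, seed=0):
    """One realization of the AR-1 process (3): x_{t+1} = alpha * x_t + n_t."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, n_steps)  # white-noise forcing n_t
    x = np.empty(n_steps)
    x[0] = x0
    for t in range(n_steps - 1):
        x[t + 1] = alpha * x[t] + noise[t]   # the memory term alpha carries the past
    return x

x = simulate_ar1(alpha=0.9, n_steps=10_000)
```

With $\alpha = 0.9$, the series exhibits the slow, red-noise excursions discussed below.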
The key parameter in (3) is the memory term $\alpha$, which controls the time within which a random excursion, excited by $n_t$, decays. A useful estimate of $\alpha$ is the lag-1 auto-correlation of $x$. A larger $\alpha$ goes with a slower decay of the auto-correlation function and with more variance at low frequencies of the auto-spectrum [35]. For two values of $\alpha$, namely 0.3 and 0.9, the auto-correlation functions as well as the auto-spectra are displayed in Figure 5.
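Both quantities are easy to compute; the following sketch reproduces the qualitative behaviour of Figure 5 (the spectral formula is the standard AR-1 spectrum, see [35]; the setup itself is our illustration):

```python
import numpy as np

def acf(x, max_lag):
    """Sample auto-correlation function up to lag max_lag."""
    x = x - x.mean()
    var = np.dot(x, x)
    return np.array([np.dot(x[:-k or None], x[k:]) / var for k in range(max_lag + 1)])

rng = np.random.default_rng(0)
for alpha in (0.3, 0.9):
    # simulate a long AR-1 series x_{t+1} = alpha * x_t + n_t
    noise = rng.normal(size=200_000)
    x = np.zeros_like(noise)
    for t in range(len(noise) - 1):
        x[t + 1] = alpha * x[t] + noise[t]
    rho = acf(x, max_lag=5)
    # theoretical AR-1 spectrum: S(omega) ~ 1 / (1 - 2 alpha cos(omega) + alpha**2)
    s_low = 1.0 / (1.0 - alpha) ** 2    # limit omega -> 0
    s_high = 1.0 / (1.0 + alpha) ** 2   # omega = pi
    print(f"alpha={alpha}: lag-1 auto-correlation {rho[1]:.2f}, "
          f"low-to-high frequency variance ratio {s_low / s_high:.0f}")
```

The printed ratio rises from about 3 for $\alpha = 0.3$ to about 360 for $\alpha = 0.9$, i.e., a larger memory concentrates variance at low frequencies.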
The main message of the Stochastic Climate Model is, first, that short-term variability matters for the overall variability, which turns out to be a mixture of more or less sharp spectral peaks related to specific periodic forcing, such as the daily or annual cycles, and the smooth spectrum of randomly forced variations (as depicted in Mitchell's sketch, cf. Figure 4); and, second, that given the same noise seeding, a system with a larger memory generates more energetic long-term variability.
4. Seeding Noise
Ensembles of extended simulations with regional atmospheric and oceanic models were initially generated by varying the initialization time, so that the members had spin-up periods of different lengths before the period of interest. During the spin-up period, which lasted a few months, the same forcing was applied: for atmospheric models, sea surface temperature and lateral atmospheric boundary conditions; for ocean models, vertical fluxes at the surface and lateral oceanic boundary conditions. Consequently, the forcing during the entire simulation period, including both the spin-up and experimental phases, was identical across the different ensemble members during overlapping times [8,19].
By coincidence, identical simulations using the same limited-area model code, initial conditions, and forcing, but performed on different computer platforms, became available, first in atmospheric sciences and later in oceanography. The intensity and timing of deviations across different ensemble members were statistically identical [38,40]. In atmospheric models, the periods of divergence were shorter and appeared to be influenced by the strength of the flow at the lateral boundaries, whereas in ocean models divergence was very likely affected by the seasonal strength of the thermocline.
For each day on which all members of both ensembles, those generated by using different times of initialization and those generated by using different computing platforms, were available, daily mean values across each ensemble, as well as the standard deviations of the deviations from these means, were calculated. Figure 9 shows these time series for the atmospheric case of a Northern European limited-area model (top) and the oceanic case of a limited-area model of the Bo Hai and Yellow Sea (bottom).
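The intra-ensemble statistics underlying Figure 9 reduce to a simple computation across the member dimension; a minimal sketch, assuming the daily model output has been collected into a single array (shape and names are our assumptions):

```python
import numpy as np

def daily_ensemble_stats(fields):
    """fields: array of shape (n_members, n_days, ny, nx) with daily mean output.
    Returns, per day, the ensemble mean and the standard deviation of the
    members' deviations from that mean, both averaged over the model domain."""
    ens_mean = fields.mean(axis=0)            # (n_days, ny, nx)
    spread = fields.std(axis=0, ddof=1)       # intra-ensemble spread per grid point
    return ens_mean.mean(axis=(1, 2)), spread.mean(axis=(1, 2))

# synthetic stand-in: 5 members, 100 days, 20 x 30 grid points
fields = np.random.default_rng(1).normal(size=(5, 100, 20, 30))
mean_series, spread_series = daily_ensemble_stats(fields)
```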
Minuscule changes, whether in the initial state or caused by different compilers, can trigger significant and persistent deviations; the way these small changes are introduced is irrelevant. This phenomenon, where unavoidable small disturbances grow to substantial spatial scales and intensities, aligns with the Stochastic Climate Model theory. Such sensitivity to minor disturbances is not necessarily indicative of chaotic processes and can even occur in linear systems. What is required is a memory term $\alpha$, which may vary over time; these variations can be due to factors like the strength of the constraining boundary values [40] or seasonal variations in baroclinic stability [23].
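That no chaos in the slow dynamics is needed can be illustrated with the linear system (3) itself: members that start from the same state but feel independent realizations of the white-noise forcing develop a persistent, seasonally modulated spread. A hypothetical sketch (the seasonal cycle of $\alpha_t$ and all numbers are our own choices):

```python
import numpy as np

n_days, n_members = 360 * 5, 30
rng = np.random.default_rng(2)
# seasonally varying but always stable memory term (e.g., a seasonal thermocline)
alpha_t = 0.7 + 0.25 * np.sin(2 * np.pi * np.arange(n_days) / 360.0)

x = np.zeros((n_members, n_days))   # all members share the same initial state
for t in range(n_days - 1):
    # each member feels its own realization of the fast "weather" components
    x[:, t + 1] = alpha_t[t] * x[:, t] + rng.normal(size=n_members)

spread = x.std(axis=0, ddof=1)
day = np.arange(n_days) % 360
print("spread in high-memory season:", spread[(day > 60) & (day < 120)].mean())
print("spread in low-memory season: ", spread[(day > 240) & (day < 300)].mean())
```

The spread is largest when the memory term is large, mimicking the seasonally varying divergence seen in the model ensembles.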
5. Challenges Arising from the Presence of Noise
In the "motivation"-
Section 1, some implications of the presence of hydrodynamic noise have already been touched. Here, we deepen this discussion further.
5.1. Stochastic Climate Model: Not Just Atmosphere and Ocean
A main conclusion of Section 3 was that the Stochastic Climate Model concept (Section 2) applies not only to the classical oceanographic and atmospheric cases documented in the literature, but also to the hydrodynamics of marginal seas: whenever a system is not too nonlinear and has limited damping, significant internal variability should emerge. This observation led to the hypothesis that such variability should occur in other areas as well, such as morphodynamics.
To test this, a simple model of the morphodynamics of a small bay with a narrow entry to the open sea is considered [41]. The ocean influences the bay through a mono-frequency tide. Initially, the morphology is uniform, but in the course of time a variety of channels forms. Without any noise seeding, the distribution of the channels is symmetric, as no Coriolis force is active in this setup.
Hydrodynamical noise is excited by adding small disturbances to the tidal variation for the first 20 tidal cycles, i.e., during the first 10 days (a sketch of such a seeding is given below). Figure 11 shows how these small initial disturbances make the simulations diverge within 240 years. While the general pattern is the same, with mostly two major channels through the entry of the bay, the details are all different. Small initial disturbances lead to an ensemble with significant variations of all channels, not only the small but also the large ones. Thus, the development of internal variability also in the morphodynamics is consistent with the Stochastic Climate Model, namely that short-term random disturbances lead to slow, red-noise variations.
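The noise seeding can be expressed compactly; a sketch of what such slightly disturbed mono-frequency tidal forcing could look like (the M2-like period, amplitude, and disturbance size are our assumptions, not values from [41]):

```python
import numpy as np

def perturbed_tide(n_hours, n_members, amp=1.0, period=12.42, eps=1e-3, seed=5):
    """Mono-frequency (M2-like) tidal elevation, with small random disturbances
    added only during the first 20 tidal cycles (about 10 days)."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_hours, dtype=float)
    tide = amp * np.sin(2.0 * np.pi * t / period)   # undisturbed tide
    members = np.tile(tide, (n_members, 1))
    seeding = t < 20.0 * period                     # noise-seeding window
    members[:, seeding] += eps * rng.normal(size=(n_members, int(seeding.sum())))
    return members

forcing = perturbed_tide(n_hours=24 * 365, n_members=10)
```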
Figure 11. Morphology in a small bay after 240 years of integration, initialized at rest and exposed to slightly disturbed tides during the first 10 days [41].
5.2. Assessing the Outcome of Numerical Experiments
One aspect relates to numerical experimentation, where changes in factors such as parameterizations, boundary conditions, and atmospheric composition are introduced in simulations [3]. In these experiments, appropriately designed ensembles are crucial for estimating the extent of inherent variability. This helps to determine whether changes between ensembles can be attributed solely to internal variability or whether external factors play a role.
If "something" is expected to produce a certain effect, whether in the real world represented by observational data or in numerical simulations, it is simplistic to assume that the difference between the state when the "something" is active and the state when it is not active would accurately represent this effect, or "signal".
To recognize the presence of variations unprovoked by the "something", the technique of statistical hypothesis testing with the null hypothesis H0: <the "something" has no effect> is appropriate. If this null hypothesis is rejected with a sufficiently small risk, then a valid conclusion is that an external factor is active. In a numerical experiment with an active and a passive "something", the external factor is indeed the "something". In observational data, however, there may be a variety of factors changing the observational record.
The determination whether a "something" is active is called "detection": the presence of a "signal" is detected, but it is not necessarily clear which factor causes this signal. If a variety of candidates for the change exists, then the most plausible mix of factors must be determined, a process called "attribution".
The term "(statistical) significance" in science refers to a statistical assessment of the probability that a given event is drawn from a certain population (even if "significance" is also used in its ordinary English sense, also in this article). If this probability is small, then the null hypothesis "the event is consistent with standard conditions" is rejected, and the alternative "the event is evidence for the presence of non-standard conditions" is accepted. If it is not small, then the null hypothesis is not rejected, and the alternative hypothesis is also not accepted. The choice of "small" is subjective; in our field, it is usually 5%.
Obviously, the "local" tests need to be done at grid points but could be done in any other representation of the fields, say EOFs, and then only with components of interest, such as large scales. Hasselmann [
31,
42] suggested different strategies (see also [
43]).
A problem arises when such a test is done at many grid points. This is well known among global atmospheric modellers, but possibly less so among regional modellers. If the null hypothesis is valid at all of these grid points, then one would expect the null hypothesis to be falsely rejected at, on average, 5% of them. The number of false rejections is itself a random variable, and the percentage of rejections may be much larger, say 20% or so. If, however, the rate is larger than, say, 20%, then it is unlikely that all rejections are false; some are valid. For further details on this aspect, refer to the textbook [35].
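Operationally, this amounts to a local test at every grid point followed by a check of the overall rejection rate; a sketch with a two-sample t-test per grid point (the ensembles and all numbers are an assumed, illustrative setup):

```python
import numpy as np
from scipy import stats

def local_t_tests(ens_a, ens_b, alpha_local=0.05):
    """Two-sample t-test at every grid point.
    ens_a, ens_b: arrays of shape (n_members, ny, nx), one field per member.
    Returns the boolean rejection map and the overall rejection rate."""
    _, p = stats.ttest_ind(ens_a, ens_b, axis=0)
    rejected = p < alpha_local
    return rejected, rejected.mean()

rng = np.random.default_rng(3)
ens_a = rng.normal(0.0, 1.0, size=(10, 40, 60))   # e.g., control ensemble
ens_b = rng.normal(0.3, 1.0, size=(10, 40, 60))   # e.g., perturbed ensemble
rejected, rate = local_t_tests(ens_a, ens_b)
# under a valid null hypothesis about 5% false rejections are expected;
# a rate far above that (the rough guideline above: >20%) points to a real signal
print(f"null hypothesis rejected at {rate:.0%} of grid points")
```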
An example of such testing is a numerical experiment on the formation of the seasonal cold water pool off Qingdao [30]. Here, the effect of tides as well as of variable winds is examined: three ensembles were built, one with both tides and realistic wind variability, one without tides, and one with zero wind. Those points where the null hypothesis is rejected are marked by crosses in Figure 12; they amount to 51% and 90% of all grid points, indicating global significance (i.e., not merely the effect of doing multiple local tests). Obviously, the null hypothesis that the tides and wind forcing have no effect is untenable.
Figure 12. Sensitivity of the formation of the spring cold water circulation in the western Yellow Sea to the presence of tides (left) and the presence of seasonal variations of wind stress (right). Three ensembles of simulations were run: with full forcing (tides and wind), with tidal forcing disregarded (top), and with the annual march of wind stress disregarded. Points at which a local t-test indicates the presence of a signal are marked with a cross. The diagram is taken from [30].
Among regional oceanographers, it seems uncommon to examine the outcome of numerical sensitivity experiments in this way.
5.3. Detection and Attribution
Another challenge is the "detection and attribution" problem in climate change studies, which involves determining whether observed variations can be explained by internal variability alone or if an external factor is needed to explain the observed change. This underscores the importance of distinguishing between forced and unforced climate variations, as mentioned earlier.
In this case, only one case, the observed record, needs to be evaluated, and the estimation of the noise must be done separately by considering the output of model "control" runs with unchanging atmospheric conditions. After detecting that the thus-estimated internal variability alone cannot explain the ongoing change ("detection"; see above), the attribution step compares the ongoing change with the responses suggested by climate models for given drivers. The combination of drivers which best explains the ongoing change is then chosen as the plausible cause; thus, attribution is a plausibility argument [31,44].
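A minimal sketch of the detection step under these assumptions: the observed trend is compared with the distribution of trends in equally long, non-overlapping segments of a control run (white noise stands in for the control run here; in practice it would be model output):

```python
import numpy as np

def detection_p_value(observed, control):
    """Compare the observed linear trend with the distribution of trends in
    equally long, non-overlapping segments of an unforced control run."""
    n = len(observed)
    t = np.arange(n)
    obs_trend = np.polyfit(t, observed, 1)[0]
    n_seg = len(control) // n
    segments = control[: n_seg * n].reshape(n_seg, n)
    ctrl_trends = np.array([np.polyfit(t, seg, 1)[0] for seg in segments])
    # fraction of unforced segments with a trend at least as large in magnitude
    return obs_trend, (np.abs(ctrl_trends) >= abs(obs_trend)).mean()

rng = np.random.default_rng(4)
control = rng.normal(size=6000)                          # stand-in "control" run
observed = rng.normal(size=60) + 0.04 * np.arange(60)    # record with imposed trend
trend, p = detection_p_value(observed, control)
print(f"observed trend {trend:.3f} per step, p-value against noise {p:.3f}")
```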
This procedure is nowadays a standard approach in the toolbox of climate change scientists. The analysis of the effect of baroclinic instability on the level of internal variability (Section 3.1) may be considered an example [17,23].
5.4. Multiple Equilibria
Another issue which may be significantly affected by the presence of hydrodynamical noise is tipping points. Low-dimensional systems such as those proposed by [45] and [46] point to the possibility that the Earth system may have two, or more, stable states, and which state the system eventually ends up in depends on the initial state.
However, when noise is added, for instance by describing cloudiness as partly random, the system varies considerably and travels between the two preferred states, as displayed in Figure 13. While this questions the concept of irreversibility of tipping points, it illustrates the need to consider the presence of noise when examining such irreversibility.
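A minimal illustration in the spirit of such low-dimensional systems (a generic double-well model with additive noise, not the actual models of [45,46]):

```python
import numpy as np

def double_well(x0, n_steps=200_000, dt=0.01, sigma=0.35, seed=6):
    """Euler-Maruyama integration of dx = (x - x**3) dt + sigma dW.
    The drift has two stable equilibria, x = -1 and x = +1."""
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps)
    x[0] = x0
    for t in range(n_steps - 1):
        drift = x[t] - x[t] ** 3
        x[t + 1] = x[t] + drift * dt + sigma * np.sqrt(dt) * rng.normal()
    return x

x = double_well(x0=1.0)
# with sigma = 0 the trajectory stays at +1 forever; with noise it
# occasionally crosses the barrier at x = 0 and visits the other state
print("time steps spent near each state:", np.sum(x > 0), np.sum(x < 0))
```

Without noise, the system simply relaxes into the well selected by its initial state; with noise, it occupies both preferred states for extended periods, the qualitative behaviour displayed in Figure 13.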