2.1. Mass Conservation
From the Global Carbon Project [8] we get the emission and concentration data. A global mass balance representation of the yearly atmospheric CO2 flow is created, with
$C_i$ as the CO2 concentration of the atmosphere at the end of year $i$,
$E_i$ as the global emissions of human origin during year $i$,
$L_i$ as the global land use net emissions during year $i$,
$N_i$ as the global natural net emissions during year $i$,
$S_i$ as other special causes of emissions such as El Niño, volcanoes, etc.,
$A_i$ as the global net absorption of CO2 during year $i$ into the oceans and biosphere:
$$C_i - C_{i-1} = E_i + L_i + N_i + S_i - A_i \tag{1}$$
Without explicit external information, $S_i$ cannot be discriminated from $N_i$ or $A_i$. Therefore we set $S_i$ to 0 and include all inferred special causes in the unknown $N_i$ in this investigation. With $G_i = C_i - C_{i-1}$ the equation becomes
$$E_i + L_i - G_i = A_i - N_i \tag{2}$$
This is not a model, but the formulation of the necessary mass conservation as seen from the atmosphere. It works like a bank account with cumulative yearly deposits and withdrawals, dealing neither with the daily ups and downs nor with the exact nature of the earning and spending processes. As a matter of fact, Equation (2) must be fulfilled at all time scales.
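The bookkeeping in Equation (2) can be sketched in a few lines of Python. The function name and the three-year series below are illustrative assumptions, not Global Carbon Budget values. All quantities are taken to be in ppm.

```python
def effective_absorption(conc, emissions, land_use):
    """Return A_i - N_i = E_i + L_i - G_i for every year i >= 1.

    conc[i] is the concentration at the end of year i (ppm);
    emissions[i] and land_use[i] are the yearly fluxes (ppm per year).
    """
    result = []
    for i in range(1, len(conc)):
        growth = conc[i] - conc[i - 1]  # G_i = C_i - C_{i-1}
        result.append(emissions[i] + land_use[i] - growth)
    return result


# Hypothetical numbers for illustration only (not measured data).
conc = [400.0, 402.0, 404.5]
emissions = [4.5, 4.5, 4.6]
land_use = [0.4, 0.4, 0.4]

absorbed = effective_absorption(conc, emissions, land_use)
print(absorbed)  # one value per year after the first
```

The first year only provides the starting concentration, so the result is one element shorter than the input series.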
For consistency all quantities have to be converted to the same unit (1 ppm = 2.124 GtC, 1 GtC = 3.664 Gt CO2 [8]). All calculations here are done in the unit “ppm”. All CO2 related data (emissions, land use change, and CO2 concentration growth) are from the Global Carbon Budget 2021, covering the years 1850-2020. We have to be aware that the land use change data are subject to considerable uncertainty and a correspondingly wide error range.
For consistency reasons, discussed together with the model estimation, we assume an estimate for land use change emissions that is 0.2 ppm lower than the published mean value. This is well within the error range and therefore justifiable.
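The unit conversion stated above can be written down as a small helper. This is a minimal sketch. The constant and function names are assumptions for illustration, and only the two conversion factors come from the text.

```python
GTC_PER_PPM = 2.124     # 1 ppm = 2.124 GtC
GTCO2_PER_GTC = 3.664   # 1 GtC = 3.664 Gt CO2


def gtc_to_ppm(gtc):
    """Convert a carbon mass in GtC to an atmospheric concentration change in ppm."""
    return gtc / GTC_PER_PPM


def gtco2_to_ppm(gtco2):
    """Convert a CO2 mass in Gt CO2 to ppm, going through GtC."""
    return gtc_to_ppm(gtco2 / GTCO2_PER_GTC)


def ppm_to_gtco2(ppm):
    """Convert a concentration change in ppm back to Gt CO2."""
    return ppm * GTC_PER_PPM * GTCO2_PER_GTC


# Example: yearly emissions of 37 Gt CO2 expressed in ppm.
yearly_ppm = gtco2_to_ppm(37.0)
print(round(yearly_ppm, 2))
```

With these factors, 1 ppm corresponds to roughly 7.8 Gt CO2.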
Figure 1 shows that the total emissions ($E_i + L_i$, blue) increasingly exceed the yearly CO2 concentration growth ($G_i$, green), indicating that the effective absorption ($A_i - N_i$, red) increases with growing CO2 concentration.
One immediate result of displaying these raw data is that before 1900 the anthropogenic emissions were considerably smaller than other variations such as land use change. There are roughly 4 historical phases:
The phase before 1900, where explicit emissions are smaller than implicit ones from land use change, although there is a small but increasing CO2 concentration growth,
the phase between 1900 and 1950, with growing emissions but approximately constant CO2 concentration growth and slightly increasing land use change,
the phase from 1950 to 2010, with growing emissions and growing concentration growth.
Recent publications indicate that emissions have remained approximately constant since 2010 [9] and are expected to remain approximately constant for the foreseeable future [10] (Figure 2, Stated Policies Scenario). The challenge is to estimate reasonable projections of CO2 concentration based on these emission assumptions.
2.2. Exploratory Analysis
As a first exploratory analysis of these data, the scatter plot in Figure 2 relates the effective CO2 absorption to the CO2 concentration.
Qualitatively we see a long term linear dependence of the effective absorption on the atmospheric CO2 concentration with significant short term deviations, where the effective zero-absorption line is crossed at approximately 280 ppm. This is considered to be the pre-industrial equilibrium CO2 concentration, where natural yearly emissions are balanced by the yearly absorptions. The average yearly absorption is approximately 2.5% of the CO2 concentration exceeding 280 ppm.
While the correlation coefficient of 0.97 is very high, there are clear non-random deviations from an ideal linear behaviour. With the large uncertainty of the land use change and contingent effects such as volcanic eruptions, influences like ENSO, etc., it is not surprising that there are systematic deviations from a perfect line [6].
Regarding predictions of future CO2 concentration, the key question is whether the deviations average out or whether there is a systematic saturation trend limiting the absorption of CO2. Some climate researchers claim that the absorption will decline [11], but there are other papers providing evidence of increasing absorption [12].
We can see from these data that a reliable estimate of the historical equilibrium concentration requires the whole data range. An estimation based only on the data above 310 ppm (from the year 1950 onwards) would result in a smaller equilibrium value of 245 ppm.
Starting from the mass conservation equation above, and on the basis of the preceding considerations, we state two hypotheses from which the actual model is derived.
2.3. Hypothesis 1: The Absorption Is Proportional to Previous CO2 Concentration
The physical justification for this assumption is the fact that the partial pressure of CO2, which is relevant for absorption processes, increases in proportion to concentration. It is also known that C3 plants, representing the majority of all plants, have a linear absorption property. The absorption property of C4 plants is nearly flat, but also linear in the range of 280-560 ppm, resulting in a linear behaviour when averaging over all plants. Ari Halperin analysed the different processes of gas transport into the ocean, with the conclusion that all relevant processes can be linearized ([5], Equation 16).
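Assuming that there are different absorption constants for oceanic ($a_{\text{ocean}}$) and biospheric absorption ($a_{\text{bio}}$), under the linearity assumption they can be added to a single constant $a$:
$$A_i = a_{\text{ocean}}\,C_{i-1} + a_{\text{bio}}\,C_{i-1} = a\,C_{i-1} \tag{3}$$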
Both oceanic and biospheric processes may consist of multiple sub-processes. For example, the photosynthesis of C3 plants has a much larger proportionality constant than that of C4 plants in the relevant CO2 concentration range of 280-560 ppm. As long as linearity holds, the net absorption constant is the sum of all elementary absorption constants.
This is a radical simplification of the box-diffusion model [13] referred to in the article “Predicting Future Atmospheric Carbon Dioxide Levels” [1]. Instead of assuming separate boxes for the mixed layer and for the biosphere, we assume a one-dimensional diffusion process between the atmosphere and the ocean or biosphere with a single diffusion constant, making no explicit assumptions about the properties of the mixed layer or the mechanism of the absorption in the biosphere. As stated above, this is justified when all relevant absorption processes are approximately linear w.r.t. CO2 concentration. The advantage of this model is that we do not have to make any speculative assumptions about potentially many model parameters, some of which are quite arbitrary (e.g. thickness of the mixed layer), but restrict the whole model to a single absorption parameter, which can be estimated from measurable data.
The authors of the Bern model claim that the “net primary production of the land biosphere and the surface ocean carbon uptake depend on atmospheric CO2 and surface temperature in a nonlinear way” [4], assuming a superposition of 4 linear processes. This statement contradicts our assumption of a single linear process. We will show in the appendix that the impulse response functions (IRFs) of the Bern model ([4], Equation (21)) can be mapped into the form of our model with a time varying absorption parameter $a$. Statistical tests will tell whether there is a need to actually introduce this time dependence or whether it is more appropriate to assume constant relative absorption over time. Any deviations from the validity of a linear model will show up in the residual error. This gives our model a method of intrinsic validation, and the model can be extended when required.
2.3.1. Temperature Dependence of the Absorption Parameter
We have to consider the possibility that the absorption parameter $a$ depends on temperature. Investigations of ice-core data clearly indicate a temperature dependency of CO2 concentration. The open question is whether there is a measurable dependency on the time scale and the temperature scale of our investigation. We will test the hypothesis of a temperature dependent relative absorption $a$ with a linear dependency on sea surface temperature anomalies $T$, taken from the HadSST2 data set [14]. A linear dependence on the temperature anomaly is assumed, as in the Bern model [4]:
$$a_i = a_0 + a_1\,T_i \tag{5}$$
2.3.2. CO2 Concentration as a Hypothetical Proxy for Temperature
When we make predictions with hypothetical future CO2 emissions, we do not know future temperatures. Without diving into the contentious discussion of how strongly CO2 concentration influences temperature, we assume the “worst case” of full predictability of temperature effects by CO2 concentration (Figure 3).
Without making any assumptions about $C_{i-1} \rightarrow T_i$ causality, the estimated functional dependence of the temperature proxy on CO2 concentration is obtained from the regression with CO2 concentration and is given by Equation (6).
We are aware that this is a very incomplete model, because it ignores the obvious, significant, trend-reversing deviations between 1900 and 1975, and it also ignores the dominant contribution of cloud albedo reduction to global warming [15], whereby 80% of recent warming is caused by albedo reduction and only 20% by the increase of CO2 concentration. The proxy is still suitable as a tool for estimating an upper bound of temperature dependence on CO2 concentration. Being based on actually measured data, it avoids speculations and discussions about hypothetical feedback factors of CO2 sensitivity.
2.3.3. Corollary: Carbon Sinks Are Not Expected to Be Saturated in the Near Future
This is related to, and is a consequence of, the linearity assumption. Much of the debate about carbon sinks is concerned with numerous details about possible saturation of various “boxes” in the models. There are strong reasons to consider the atmosphere and the mixed layer, i.e. the top 75 m of the ocean, as a single “box”, which exchanges gases with the deep ocean and the biosphere [5]. The deep ocean contains more than 50 times the CO2 of the atmosphere, or 4000 times the yearly global emissions, so the whole atmospheric content is just about 2% of the ocean content. Therefore we do not expect this huge “box” to be saturated any time soon.
Having said that, we do not rely on our own assumptions alone. There are 4 indications supporting the assumption that the uptaking reservoirs are not saturated:
We can test the past 70 years for linearity. If there were any sign of saturation, it would show up as a deviation from the linearity assumption. We will see this in the residual deviations from the model: if the relative absorption decreased with time, the real CO2 content at the end would be larger than estimated by the linear model.
The global carbon budget [8] clearly shows an increasing trend in both the ocean sink and the (biosphere) land sink.
A recent article revised the estimates of the ocean-atmosphere CO2 flux [12], making it consistent with the increasing ocean sink found in the global carbon budget.
We can make a rough estimation of the expected ocean uptake. The ocean has a total carbon inventory of 38000 GtC ≈ 140000 Gt CO2. If we assume the realistic scenario of constant future emissions at today’s level (37 Gt CO2 per year) and we assume that they are all absorbed by the ocean, by 2100 that would be approximately 3000 Gt CO2, just about 2% of the current inventory.
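The rough ocean uptake estimate in the last item can be reproduced with a few lines of arithmetic. This is only a sketch: the variable names are illustrative, and the 80-year accumulation period (2020 to 2100) is an assumption made here to match “by 2100”.

```python
GTCO2_PER_GTC = 3.664            # conversion factor used in this paper

ocean_inventory_gtc = 38000.0    # total ocean carbon inventory in GtC
yearly_emissions_gtco2 = 37.0    # assumed constant emissions in Gt CO2 per year
years = 2100 - 2020              # assumption: accumulation over 80 years

ocean_inventory_gtco2 = ocean_inventory_gtc * GTCO2_PER_GTC
cumulative_gtco2 = yearly_emissions_gtco2 * years
share = cumulative_gtco2 / ocean_inventory_gtco2

print(f"inventory: {ocean_inventory_gtco2:.0f} Gt CO2")
print(f"cumulative emissions: {cumulative_gtco2:.0f} Gt CO2")
print(f"share of inventory: {share:.1%}")
```

Even under the deliberately extreme assumption that the ocean takes up every emitted tonne, the added amount stays near 2% of the present inventory.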
Whatever the subjective opinion is regarding the future absorption, the measured data provide us with the current trend and its potential changes in the near future.
2.4. Hypothesis 2: Natural Emissions and Absorptions Are Balanced
This implies that without anthropogenic emissions ($E_i = L_i = 0$) the concentration growth vanishes ($G_i = 0$), resulting in a constant equilibrium concentration $C_0$. Equations (1)–(3) then imply that global natural emissions are constant:
$$N_i = a\,C_0 \tag{7}$$
This relation makes a falsifiable statement about the magnitude of those natural emissions which are not compensated by absorptions within the time unit of measurement, which is a year in this investigation. The statement of assumed constant equilibrium concentration requires further clarification. We know, e.g. from ice-core investigations, that historical CO2 concentration is not constant and most likely depends on temperature. A linear dependence on temperature can be mapped onto a linear dependence of relative absorption $a$ on temperature, which is covered by Equation (5).
As we know, there are causes for systematic changes in the natural emissions, e.g. volcanic eruptions, ocean cycles, or changes of land use. We will see from the residual deviations of the measured data how significant these influences are, and whether the model needs to be adapted. For the time being, we assume no changes of natural emissions within the investigated time range 1850-2020. As with the previous assumption, the residual error of the model will guide possible further fine-tuning of the model. Three kinds of deviation are possible:
a systematic “trend” in the natural emissions. This would either increase or decrease the estimated absorption factor and the equilibrium concentration, i.e. the constant model of natural emissions,
short term zero centered variations within a year. These variations do not show up in our model due to the one year sampling interval,
long term variations of more than a year, which are not averaged out. They are visible in the residual error of the predicted CO2 content.
2.5. The Modelling Equations
From Equations (2), (3), and (7) we get the final model equation for an assumed constant absorption parameter $a$:
$$E_i + L_i - G_i = a\,(C_{i-1} - C_0) \tag{8}$$
Equation (8) emphasises the fact that the effective absorption depends on the difference between the actual and the equilibrium CO2 concentration, $C_{i-1} - C_0$, implicitly including the natural carbon cycle, described by the equilibrium concentration $C_0$. Initially we estimate $N = a\,C_0$ as the constant natural emissions simultaneously with $a$ in this regression equation. For estimating temperature dependent models we assume a fixed known value for the equilibrium CO2 concentration $C_0$ and estimate the absorption function $a$.
Starting with the available data, i.e. emissions $E_i$, land use change $L_i$, and CO2 growth $G_i$, the absorptions are modelled according to Equation (8) for the time interval 1850-2014.
For $L_i$, the emission change per year due to land use, the uncertainty is considerable [8]. Therefore we have a certain amount of freedom to adapt its value in order to satisfy other given constraints.
The absorption constant $a$ and the natural equilibrium concentration $C_0$ in a given time interval are obtained by least-squares estimation, where the dependent variable is the left hand side of Equation (8) and the independent variable is $C_{i-1}$, using the OLS class of the Python package statsmodels (version 0.13.5). The result table of the Python linear regression model displays for each regression variable its estimated value (Coef.), its estimated standard error, the t-statistic for the variable being different from 0, the probability for the variable to be 0 (strictly speaking the error probability when accepting the hypothesis that the variable is different from 0), and in the last 2 columns the error bounds, i.e. the 95% interval of the variable’s value:
| | Coef. | Std.Err. | t | P>\|t\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| $-a\,C_0$ (intercept) | -6.8952 | 0.2640 | -26.1142 | 0.0000 | -7.4166 | -6.3738 |
| $a$ | 0.0247 | 0.0008 | 29.3485 | 0.0000 | 0.0230 | 0.0264 |
This results in $C_0 \approx 279$ ppm, with error bounds [277, 282] ppm, and a half life time of an emission pulse of about 28 years, with error bounds [26, 30] years.
$C_0$ is very close to the widely accepted pre-industrial equilibrium CO2 concentration of 280 ppm, which lies within its error bounds. As can be seen from model Equation (8), global variations of $L_i$ and $C_0$ can be substituted for each other: a constant shift of $L_i$ is absorbed into the estimate of $C_0$. When using the center value of the land use change error band, i.e. on average 0.55 ppm, the calculated $C_0$ would have been too small, in our view, to be compatible with the accepted value of 280 ppm. Therefore, in the face of the large uncertainty of land use change estimates, we prefer to assume slightly lower average land use change emissions (average 0.35 ppm) over an inconsistent equilibrium concentration. This substitution, however, only changes the equilibrium concentration.
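The quantities derived from the regression table can be checked with a short calculation. The sketch below assumes that the regression constant equals $-a\,C_0$, as Equation (8) implies, and uses the continuous-decay approximation $\ln 2 / a$ for the half life. The last step, which holds $a$ fixed while shifting the land use term, is a simplification introduced here for illustration and not a re-fit of the data.

```python
import math

# Values taken from the regression table above.
intercept = -6.8952              # regression constant, interpreted as -a * C_0
a = 0.0247                       # absorption constant per year
a_low, a_high = 0.0230, 0.0264   # 95% bounds of a

c0 = -intercept / a                        # equilibrium concentration in ppm
half_life = math.log(2) / a                # half life of an emission pulse in years
half_life_bounds = (math.log(2) / a_high, math.log(2) / a_low)

# Illustration only: a constant land use shift moves C_0 by shift / a
# if the slope a is held fixed.
luc_shift = 0.55 - 0.35                    # ppm per year
c0_shift = luc_shift / a                   # ppm

print(f"C_0 = {c0:.1f} ppm, half life = {half_life:.1f} years")
print(f"half life bounds = {half_life_bounds[0]:.1f} to {half_life_bounds[1]:.1f} years")
print(f"C_0 shift for a 0.2 ppm land use change = {c0_shift:.1f} ppm")
```

Under these assumptions, a land use term that is 0.2 ppm larger lowers the estimated equilibrium concentration by about 8 ppm, which illustrates how strongly $C_0$ reacts to the land use estimate.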
Modelling the raw, noisy differential absorption data with this simplest possible model shows a fairly good approximation over the whole time span. The residual shows variations but no systematic trend over the time span 1850-2014 (Figure 4).
The blue differential effective absorption data are approximated by the orange model curve; the green curve shows the residual deviations. The standard deviation of the residual is 0.2 ppm, the same order of magnitude as the uncertainty of emission and concentration measurements, in particular land use change (LUC).
We notice that before 1900 the absorptions are smaller than this error, which means that the time before 1900 is not meaningful for analysing absorption values. We also observe that after 1950, when the quality of measurements improved dramatically, there is much more variability in the differential measurements of CO2 concentration. This justifies doubts about the data quality before 1950, and justifies a separate analysis of the much more reliable data after 1950.
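The estimation and residual analysis described above can be illustrated with a minimal least-squares sketch. The paper uses statsmodels OLS. The standard-library version below is a stand-in, chosen so that the example has no dependencies. The data are synthetic, generated from assumed parameters with a deterministic ±0.1 ppm perturbation, and are not measurements.

```python
import statistics


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]
    return slope, intercept, residuals


# Synthetic data from assumed parameters (hypothetical, for illustration).
a_true, c0_true = 0.025, 280.0
prev_conc = [290.0 + 2.0 * k for k in range(60)]        # C_{i-1} in ppm
eff_abs = [a_true * (c - c0_true) for c in prev_conc]   # E_i + L_i - G_i
eff_abs = [v + (0.1 if k % 2 == 0 else -0.1) for k, v in enumerate(eff_abs)]

slope, intercept, residuals = fit_line(prev_conc, eff_abs)
c0_est = -intercept / slope                 # zero crossing of the fitted line
resid_sd = statistics.pstdev(residuals)     # spread of the residual

print(f"a = {slope:.4f}, C_0 = {c0_est:.1f} ppm, residual sd = {resid_sd:.2f} ppm")
```

The slope plays the role of $a$, the zero crossing of the fitted line gives $C_0$, and the residual standard deviation is the quantity compared with the measurement uncertainty in the text.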
Next we investigate the possible dependence of the absorption parameter $a$ on sea surface temperature from the data set HadSST2 [14]. Using Equation (8) with the temperature dependent variable $a = a_0 + a_1 T_i$ from Equation (5) leads to a more complex 3 parameter estimation problem:
$$E_i + L_i - G_i = (a_0 + a_1\,T_i)\,(C_{i-1} - C_0)$$
This cannot be easily solved in closed form when $C_0$ is variable. After showing that the data are consistent with $C_0 = 280$ ppm in the case of constant absorption $a$, we assume $C_0$ to be constant as an a priori condition and simplify the model equation by fixing $C_0$ to this value and only estimating the absorption parameters $a_0$ (constant part) and $a_1$ (temperature dependent part) from the data. Implicitly this means that we make use of the assumption that the equilibrium concentration is constant. In the estimation result of this temperature dependent 2 parameter absorption problem, the temperature dependent parameter $a_1$ is statistically significant, and we get a slightly negative trend of the absorption parameter with the increasing sea surface temperature since 1900.
The negative temperature dependence tells us that before 1900 the absorption has no identifiable trend and that between 1900 and 2000 absorption appears to have decreased. The relative absorption of 3% in 1900 corresponds to a half life time of 23 years for an emission pulse. In 2014 the relative absorption is still larger than 2.3% of the CO2 concentration, which corresponds to a half life time of 30 years for an emission pulse. The fact that the relative absorption changes with time in this model variant prohibits the use of a time-invariant convolution kernel for computing the CO2 concentration from emissions. Before we draw conclusions from this result, we need to validate the estimation.
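The half life times quoted above can be retraced from Equation (8). This is a sketch under the simplifying assumption that $a$ stays constant while the pulse decays and that no further emissions occur ($E_i = L_i = 0$). The excess concentration $x_i = C_i - C_0$ then shrinks by the factor $(1 - a)$ each year:

$$
\begin{aligned}
x_i &= (1 - a)\,x_{i-1} \quad\Rightarrow\quad x_n = (1 - a)^n\,x_0 \approx e^{-a n}\,x_0,\\
t_{1/2} &= \frac{\ln 2}{-\ln(1 - a)} \approx \frac{\ln 2}{a},\\
a = 0.03 &:\quad t_{1/2} \approx \frac{0.693}{0.03} \approx 23 \text{ years},\\
a = 0.023 &:\quad t_{1/2} \approx \frac{0.693}{0.023} \approx 30 \text{ years}.
\end{aligned}
$$

The exact discrete expression and the approximation $\ln 2 / a$ differ by less than half a year for absorption values of this size.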
2.5.1. Model Validation
In order to validate the model, we are in the comfortable situation of having a long time series, so that we can perform an ex-post prognosis by restricting the training data of the model to the years up to 2000 and making predictions of the CO2 concentration for the years 2001-2020, which are available for comparison.
In the validation process we compare all 3 discussed model variants:
assumed constant relative absorption
assumed temperature dependent relative absorption
assumed relative absorption dependent on temperature modelled by CO2 concentration.
2.5.2. Estimation with Limited Data Range and Model Validation
The estimation results based on historical data may depend to a certain extent on the selected time interval, in particular when we let the value of the equilibrium concentration of CO2 be determined from the data. This explains why previous authors arrive at such different results for the equilibrium concentration [5, 6]. There are several reasons for constraining the data range:
As stated above, there is no large variability of either CO2 emissions or CO2 concentration before the year 1900. Moreover, the measurements at that time were not really reliable. Therefore the signal-to-noise ratio is so low that, for the determination of concentration changes as a function of CO2 emissions, it is better to dispense with these data.
We want to evaluate the predictive quality of the data model. Therefore we limit the training data to 1999 and compare the predicted CO2 concentration of the years 2000 to 2020 with the actual measurements.
We further argue that the data of the first part of the 20th century are also not really reliable, as indicated e.g. by the nearly constant yearly change of CO2 concentration despite growing emissions, as well as by the extreme uncertainty of land use change data. We will therefore make an evaluation with training data from 1950 to 1999 and build the model based on these data.
We allow the investigated data interval to have its own equilibrium concentration, according to the available data. The equilibrium concentration is determined by the initial model with constant relative absorption variable a.
2.5.3. Estimation Based on Data from 1950 to 2000
The much better CO2 concentration measurements after 1950, in conjunction with the fact that the overwhelming bulk of anthropogenic emissions has been released after 1950, justify investigating the second half of the 20th century separately.
This result implies an equilibrium CO2 concentration of $C_0 = 242$ ppm with the error bounds [232, 251] ppm. The half life time of an emission pulse is 44 years with the 95% error bounds [39, 48] years.
In the temperature dependent estimation according to Equation (11), with $C_0$ fixed at the value estimated for this interval, the temperature dependent part of the absorption is clearly not significantly different from 0. When using the CO2 temperature proxy from Equation (6) instead, there is also no significant temperature dependence. Therefore we are forced to take the constant relative absorption as the best possible absorption model of the 50 years from 1950 to 2000 (Figure 5).
The diagram in Figure 5 confirms that there is no deviation from the constant relative absorption when taking into account the hypothetical temperature dependence.
2.5.4. Validation Based on Data from 1950 to 2000
Based on the model parameters from the 1950-2000 data, recursive evaluation of Equation (8) with future emission and land use data allows the prediction of future CO2 concentrations $C_i$ from $C_{i-1}$.
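The recursive evaluation can be sketched as follows. Solving Equation (8) for the new concentration gives $C_i = C_{i-1} + E_i + L_i - a\,(C_{i-1} - C_0)$. The function name and all numbers below are hypothetical round values for illustration and not the fitted parameters or measured data of this study.

```python
def project_concentration(c_start, emissions, land_use, a, c0):
    """Apply C_i = C_{i-1} + E_i + L_i - a * (C_{i-1} - C_0) year by year.

    All quantities are in ppm (fluxes in ppm per year); returns the list
    of projected concentrations, one per year of input.
    """
    projected = []
    previous = c_start
    for e_i, l_i in zip(emissions, land_use):
        current = previous + e_i + l_i - a * (previous - c0)
        projected.append(current)
        previous = current
    return projected


# Hypothetical scenario: constant emissions and land use change.
a, c0 = 0.016, 242.0
n_years = 2000
path = project_concentration(
    c_start=370.0,
    emissions=[4.5] * n_years,
    land_use=[0.5] * n_years,
    a=a,
    c0=c0,
)

# With constant fluxes the recursion approaches C_0 + (E + L) / a.
equilibrium = c0 + (4.5 + 0.5) / a
print(round(path[0], 3), round(path[-1], 3), round(equilibrium, 3))
```

Under constant emissions the recursion does not grow without bound but settles where absorption balances the total emissions, which is the behaviour the single-parameter model implies.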
Figure 6 shows an excellent prediction in the center of the gray 95% error band. There are small, apparently periodic deviations between the predictions and the actual data. Roy Spencer [6] has explained these systematic deviations of up to 1 ppm with the Multivariate ENSO Index (https://psl.noaa.gov/enso/mei.old/mei.html), further improving the already excellent fit. For projections of future emission scenarios, these small deviations, which are symmetric w.r.t. 0, do not play a significant role. Roy Spencer also identifies volcanic activity, e.g. the Pinatubo eruption, but these small deviations likewise do not change the functional dependency between anthropogenic emissions and CO2 concentration in a significant way.