Persistence
Barro (1979) [2] was the first to claim that there are no underlying economic forces that would cause the public debt/GDP ratio to converge to some steady-state target value. In other words, in Barro’s (1979) [2] tax smoothing model, the U.S. public debt behaves like a random walk after World War I. The public debt/GDP ratio exhibits unpredictable movements governed only by transitory government spending (mostly during wars) and countercyclical output shocks (mostly during recessions). Neither unanticipated nor anticipated inflation has an effect on the public debt/GDP ratio. These results hold regardless of whether one measures public debt at nominal (par) or market values.
Hamilton and Flavin (1986) [9] challenge Barro’s (1979) [2] conclusion that the U.S. public debt/GDP ratio exhibits random walk-type behavior, although for a much shorter period spanning from 1960 to 1984. By applying the standard Dickey-Fuller unit root test of Dickey and Fuller (1981) [10], Hamilton and Flavin (1986) [9] reject the unit root non-stationarity hypothesis for the U.S. public debt/GDP ratio at the 10% significance level.
Kremers (1988) [11], however, shows that the non-stationarity of the U.S. public debt/GDP ratio cannot be rejected in the post-World War II data. Unlike Hamilton and Flavin (1986) [9], Kremers (1988) [11] implements an augmented Dickey-Fuller unit root test to appropriately model the autocorrelation present in the residuals for the U.S. public debt/GDP ratio, and consequently overturns the results of Hamilton and Flavin (1986) [9], since the non-stationarity hypothesis cannot be rejected at any critical level of up to 90%. In addition, Kremers (1989) [12] further shows that even for the combined inter- and post-war period one cannot firmly reject the non-stationarity hypothesis for the U.S. public debt/GDP ratio.
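The difference between the two testing approaches can be made concrete with a minimal sketch of the plain Dickey-Fuller regression. The function names `df_tau` and `simulate_ar1`, the simulated data and the parameter values below are illustrative assumptions, not the data or code of the cited studies. The augmented version used by Kremers would add lagged differences to the same regression, which this sketch deliberately omits.

```python
import math
import random


def df_tau(y):
    """t-statistic on rho in: dy_t = a + rho * y_{t-1} + e_t (no lag augmentation)."""
    x = y[:-1]
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    n = len(dy)
    x_bar = sum(x) / n
    dy_bar = sum(dy) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (di - dy_bar) for xi, di in zip(x, dy))
    rho = sxy / sxx
    a = dy_bar - rho * x_bar
    ssr = sum((di - a - rho * xi) ** 2 for xi, di in zip(x, dy))
    se = math.sqrt(ssr / (n - 2) / sxx)
    return rho / se


def simulate_ar1(phi, n, seed):
    """Hypothetical AR(1) series y_t = phi * y_{t-1} + e_t with standard normal shocks."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y


# A clearly stationary series versus a short, highly persistent one.
# The asymptotic 5% critical value with an intercept is roughly -2.86.
tau_stationary = df_tau(simulate_ar1(phi=0.5, n=500, seed=1))
tau_persistent = df_tau(simulate_ar1(phi=0.98, n=80, seed=1))
print(tau_stationary, tau_persistent)
```

The statistic is compared with Dickey-Fuller, not Student-t, critical values. With a coefficient close to one and a short sample, the statistic typically sits much closer to the critical value than in the clearly stationary case.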
Wilcox (1989) [13] argues that the measure of U.S. public indebtedness that Hamilton and Flavin (1986) [9] use is inappropriate since it rests on the undiscounted public debt. Unlike Hamilton and Flavin (1986) [9], Wilcox (1989) [13] calculates the discounted value of the U.S. public debt, where the present value of the public debt at a particular point in time is calculated using stochastic real interest rates. Wilcox (1989) [13] uses the discounted value of the U.S. government debt to define a public debt sustainability criterion which states that overall fiscal policy is sustainable if the projected discounted value of the public debt/GDP ratio approaches zero, i.e., if the expected present value of the sum of future primary surpluses equals the current market value of the U.S. public debt. As in Hamilton and Flavin (1986) [9], Wilcox (1989) [13] operationalizes his sustainability criterion by comparing the current market value of the public debt with the sum of expected discounted primary surpluses and denotes the difference between the two as
where
is the market value of public debt and
is the stochastic interest rate measured in real terms. Wilcox (1989) [13] further argues that the behavior of
is influenced by the behavior of
-if
is non-stationary, then
is stochastic, and if
is stationary, then
is constant. The conclusion of Wilcox (1989) [13] is that for the period after 1974, the discounted market value of the U.S. public debt is non-stationary.
Given the inconclusive evidence of previous unit root studies in assessing the sustainability of the U.S. public debt, Bohn (1998, 2007) [14,15] criticizes unit root-type regressions on two grounds.
First, Bohn (1998) [14] argues that unit root test regressions suffer from an omitted variable bias since they do not account for cyclical output changes and transitory government spending. By aiming to explain the variations in the primary fiscal balance as a function of movements in the public debt, the output gap and transitory government spending, Bohn (1998) [14] proposes a fiscal reaction function (FRF) regression approach to test for mean reversion in the stochastic process for the U.S. public debt. Using data for the United States between 1916 and 1995, Bohn (1998) [14] concludes that the U.S. public debt/GDP ratio behaves as a highly persistent, but overall mean-reverting, stationary stochastic process. Regardless of how interest rates and growth rates compare, a positive response of the primary fiscal balance to public debt movements represents a sufficient condition for public debt sustainability, since any upward movement in the public debt/GDP ratio would be reversed by a positive response of the primary fiscal balance.
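The mean-reversion mechanism can be illustrated with a deliberately stylized, deterministic sketch. It assumes a budget identity of the form d(t+1) = x [d(t) - s(t)] with x = (1 + r)/(1 + g), and a linear reaction function s(t) = rho d(t) + mu. The function names and all numerical values are hypothetical choices for illustration, not estimates from Bohn (1998) [14].

```python
def debt_path(d0, rho, mu, r, g, periods):
    """Iterate d_{t+1} = x * (d_t - s_t) with s_t = rho * d_t + mu (assumed forms)."""
    x = (1.0 + r) / (1.0 + g)
    path = [d0]
    for _ in range(periods):
        surplus = rho * path[-1] + mu  # primary balance reacts to lagged debt
        path.append(x * (path[-1] - surplus))
    return path


def steady_state(rho, mu, r, g):
    """Fixed point of the recursion; meaningful only when x * (1 - rho) < 1."""
    x = (1.0 + r) / (1.0 + g)
    return -x * mu / (1.0 - x * (1.0 - rho))


# Hypothetical values with r > g: a 5% response versus no response at all.
with_response = debt_path(d0=1.0, rho=0.05, mu=-0.02, r=0.03, g=0.02, periods=500)
no_response = debt_path(d0=1.0, rho=0.0, mu=-0.02, r=0.03, g=0.02, periods=50)
print(with_response[-1], steady_state(0.05, -0.02, 0.03, 0.02))
print(no_response[-1])
```

In this parameterization the interest rate exceeds the growth rate, yet the debt ratio returns to a fixed level because x (1 - rho) is below one. Without a response of the primary balance the same recursion drifts upward.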
Second, the sustainability notion of Wilcox (1989) [13],
, is always satisfied, since the exponential growth in the denominator
asymptotically dominates the polynomial growth in the numerator
irrespective of the order of integration for
(see Proposition 1 on page 1840 of Bohn (2007) [15] for a detailed proof).
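The dominance argument can be checked numerically with a small sketch. It assumes, purely for illustration, a numerator that grows like the horizon raised to the order of integration and a constant discount rate; the function name and the values are hypothetical.

```python
def discounted_polynomial(horizon, order, r):
    """Ratio of polynomial growth horizon**order to exponential discounting (1 + r)**horizon."""
    return horizon ** order / (1.0 + r) ** horizon


# Even a cubic numerator is eventually swamped by a 3% discount rate.
for n in (100, 500, 1000, 2000):
    print(n, discounted_polynomial(n, order=3, r=0.03))
```

The ratio first rises and then falls toward zero for any finite order, which is why a criterion based on the discounted value alone places no restriction on the order of integration.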
Unlike Bohn (1998) [14], who estimates a single-equation ordinary least squares (OLS) FRF, Cochrane (2020, 2022) [16,17] estimates a vector autoregressive (VAR) model with public debt and primary fiscal surplus and finds a value of 0.98 for the coefficient on the first lag of debt. In other words, Cochrane (2020, 2022) [16,17] reaffirms the findings of Bohn (1998) [14] that the public debt/GDP ratio is a stationary, but highly persistent, near-unit root stochastic process.
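To give a sense of what a coefficient of 0.98 means, the sketch below computes the half-life of a shock under the simplifying assumption of a univariate AR(1) process. This is only an approximation to the VAR setting, and the function name is hypothetical.

```python
import math


def half_life(phi):
    """Periods needed for half of a shock to die out in y_t = phi * y_{t-1} + e_t."""
    if not 0.0 < phi < 1.0:
        raise ValueError("half-life is defined here only for 0 < phi < 1")
    return math.log(0.5) / math.log(phi)


print(round(half_life(0.98), 2))  # about 34.31 periods
print(half_life(0.5))             # exactly one period
```

Under this assumption, half of a shock to the debt ratio is still present after roughly 34 periods, which illustrates why such a process is hard to tell apart from a unit root in short samples.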
On the other hand, Campbell et al. (2023) [18] argue that the U.S. public debt/GDP ratio after World War II must be non-stationary since it has little ability to predict its own dynamics, as well as future fiscal developments in taxes and spending. Campbell et al. (2023) [18] instead propose a stationary government surplus/debt ratio as a useful predictor of future fiscal outcomes.
Finally, although Jiang et al. (2024) [19] find that the U.S. public debt/GDP ratio is a persistent, close to unit root, stochastic process, the authors exclude the possibility that there is an actual unit root in the autoregressive representation for the public debt/GDP ratio on several grounds.
First, a non-stationary public debt/GDP would breach any upper bound given an arbitrarily long forecast horizon.
Second, a unit root stochastic process would also imply an ever-increasing variance of the public debt/GDP ratio with the passage of time.
Third, large increases in the public debt/GDP ratio in the U.S. fiscal history were usually followed by
i) discretionary fiscal adjustments;
ii) high inflation;
iii) financial repression in the form of interest rate caps on government borrowing; or
iv) corrections in market prices of government bonds.
In sum, Jiang et al. (2024) [19] conclude that the U.S. public debt/GDP ratio exhibits highly persistent, near-unit root behavior, but more importantly, the authors attribute this autocorrelation profile to the 2007 structural break due to the Global Financial Crisis (GFC). However, as Jiang et al. (2024) [19] themselves acknowledge, the timing of the structural break is exogenous to the dynamics of the public debt/GDP ratio.
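The first two grounds listed above follow from the forecast error variance of an AR(1) process, which the sketch below evaluates in closed form. The function name, the unit shock variance and the horizons are assumptions chosen only for illustration.

```python
def forecast_variance(phi, horizon, sigma2=1.0):
    """Variance of the horizon-step-ahead forecast error of y_t = phi * y_{t-1} + e_t."""
    if phi == 1.0:
        return sigma2 * horizon  # random walk: variance grows linearly without bound
    return sigma2 * (1.0 - phi ** (2 * horizon)) / (1.0 - phi ** 2)


for h in (10, 100, 1000):
    print(h, forecast_variance(1.0, h), round(forecast_variance(0.98, h), 4))
```

With a unit root the variance keeps growing with the horizon, so any upper bound is eventually breached with positive probability. With a coefficient of 0.98 the variance levels off at 1/(1 - 0.98^2), about 25.25 times the shock variance.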
The reader should note at this point that even if the timing of the 2007 structural break had been endogenous, i.e., explained by the underlying forces that govern the dynamics of the public debt/GDP ratio, Carrasco (2002) [20] warns that endogenous structural change tests have no power if the data are generated by a non-linear threshold-type model. Put differently, non-linear threshold-type tests for parameter stability have greater power than tests that deal with structural change in parameters. Consequently, Carrasco (2002) [20] advises that testing the null hypothesis of linearity against a threshold alternative is the most robust approach to detecting parameter instability in macroeconomic and financial time series.
The recommendations of Carrasco (2002) [20] regarding the use of non-linear threshold-type models in economics are crucial from the standpoint of this paper, even more so given the results by Gonzáles and Gonzalo (1997) [7] and Lanne and Saikkonen (2002) [21], who caution about the observational equivalence between actual unit root stochastic processes and the respective non-linear alternatives, especially in small samples. The question is, however, which non-linear threshold alternative is the most suitable one for describing the dynamics of highly persistent near-unit root stochastic processes such as the one governing the dynamics of the U.S. public debt/GDP ratio after the Bretton Woods collapse.
Non-Linearities
One of the first contributions to model the non-linearities in the dynamics of the U.S. public debt/GDP ratio is Sarno (2001) [22]. In particular, Sarno (2001) [22] estimates an ESTAR model of the following form
in which
represents the first difference operator,
is the ergodic and globally stationary public debt/GDP ratio,
and
are regime-dependent level shifts, the residuals are
, while
is the delay parameter. The transition function between the two regimes takes the form
where
measures the speed of transition between the two regimes and
denotes the threshold public debt/GDP ratio. The sum of autoregressive coefficients,
, determines the order of autoregression (
), while
and
represent respective regime-dependent autoregressive slope coefficients. Although it is admissible for
, the global stationarity condition for the described ESTAR model of Sarno (2001) [22] demands that
and
.
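For orientation, the sketch below evaluates an exponential transition function in its common textbook form, G = 1 - exp(-gamma (z - c)^2). The exact parameterization used by Sarno (2001) [22] may differ, for example in how the speed parameter is scaled, so the function name, its form and the numerical values should be read as assumptions.

```python
import math


def exponential_transition(z, gamma, c):
    """Textbook ESTAR transition: 0 at the threshold c, approaching 1 far away from it."""
    return 1.0 - math.exp(-gamma * (z - c) ** 2)


# Hypothetical threshold of 0.50 (debt at 50% of GDP) and speed parameter 20.
for z in (0.25, 0.40, 0.50, 0.60, 0.75):
    print(z, round(exponential_transition(z, gamma=20.0, c=0.50), 4))
```

The function is bounded between 0 and 1 and takes the same value for equal deviations above and below the threshold, so the regime weight depends only on the size of the deviation and not on its sign.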
Sarno (2001) [22] estimates Equation (1) on a sample spanning from 1916 to 1995 and finds that the U.S. public debt/GDP ratio behaves as a nonlinear mean-reverting ESTAR stochastic process. There are, however, several potential problems with the underlying ESTAR econometric estimates from Sarno (2001) [22].
First, since Equation (1) of Sarno (2001) [22] is parameterized and estimated in first differences and not in levels of the public debt/GDP ratio, the estimates from (1) might be prone to an omitted variable bias. Equation (1), in essence, represents a non-linear reaction function of
on
in which the response of
to
is regime specific and determined by the estimated values of
,
and
, as well as on the shape of the transition function
which, in the case of Sarno (2001) [22], is an exponential transition function. Since
is approximately equal to the overall fiscal balance corrected for the potential stock-flow discrepancies, the ESTAR Equation (1) is essentially a non-linear FRF of the overall fiscal balance to regime specific
values. To the extent that
approximates the dynamics of the U.S. primary fiscal balance, Equation (1), similarly to the unit root test regressions, also omits transitory government spending and cyclical output shocks from its right-hand side. More importantly, Bohn (1998) [14] explicitly states that
is a function of both lagged public debt/GDP and non-debt components, most notably the output gap and transitory government spending. In particular, Equation (4) from Bohn (1998) [14] reads as follows
in which
for the real interest rate
and the real growth rate
, and where
represents lagged output gap and transitory government spending, under the realistic assumption that both variables are strictly bounded stochastic processes. In Table 2 (page 956), Bohn (1998) [14] provides estimates of Equation (2) above. In addition, when evaluating a nonlinear response of the primary fiscal balance to changes in the public debt/GDP ratio in Table 3 (page 958), Bohn (1998) [14] explicitly controls for the variations in the output gap and transitory government spending. Similar to Bohn (1998) [14], Mendoza and Ostry (2008) [23] and Mauro et al. (2015) [24] also quantify the size of the omitted variable bias for the primary balance FRF coefficient in a wider international and historical context.
Second, the claim of Sarno (2001) [22] on page 120 that “there is growing evidence that governments respond more to primary deficits (surpluses) when public debt is particularly high (low)” is a valid empirical fact in the case of the U.S. for the sample period from 1916 to 1995, which both Bohn (1998) [14] and Sarno (2001) [22] use in their respective studies. However, there is a statistically significant structural shift in the primary balance FRF coefficient after the GFC, as D’Erasmo et al. (2015) [25] document in the case of the U.S. for the period 1791-2014. Using the extended sample period that ends in 2014, D’Erasmo et al. (2015) [25] manage to overturn most of the results originally reported by Bohn (1998) [14]. In particular, due to an unprecedented public debt build-up after the 2008 GFC, D’Erasmo et al. (2015) [25] estimate a much lower response of the primary balance to upward movements in public debt. This finding of D’Erasmo et al. (2015) [25] contradicts the statement of Sarno (2001) [22] from page 121 “…that governments react more strongly to primary deficits when the deviation of the debt/GDP ratio from equilibrium is large in absolute size suggests that the larger the deviation from the long-run equilibrium of the debt/GDP ratio, the stronger will be the tendency to move back to equilibrium.”
The reader should also note that the emphasized assertions of Sarno (2001) [22], and the original estimates by Bohn (1998) [14], are inconsistent with the sovereign borrower rational expectations equilibrium model of Ghosh et al. (2013) [26], in which the fiscal behaviour of the sovereign borrower tracks a reduced-form FRF with fiscal fatigue characteristics. The FRF with fiscal fatigue characteristics of Ghosh et al. (2013) [26] implies a cubic relationship between the primary fiscal balance and public debt such that at low levels of debt there is no, or even a negative, relationship between the primary balance and public debt. As public debt increases, the response of the primary balance also increases, but the magnitude of the response eventually weakens and finally decreases at very high levels of debt. In sum, it is unlikely that governments can respond more aggressively to increased primary deficits when the public debt/GDP ratio is particularly high, if only because the primary surplus/GDP ratio cannot exceed 100% while interest payments and public debt as % of GDP can.
Third, some novel econometric findings of Heinen et al. (2012) [27] and Buncic (2019) [28] are in contrast with the claims of Sarno (2001) [22] about the desirable properties of the exponential transition function, most notably its boundedness between 0 and 1 and its symmetric inverse-bell shape around 0. On page 120, below Equation (1), Sarno (2001) [22] claims that “these properties are attractive in the present context because they allow symmetric adjustment of d(t) for deviations above and below the equilibrium level.” However, Heinen et al. (2012) [27] argue that one cannot identify the transition function under extreme parameter combinations, which is especially the case for very small or very large values of the error term variance, or when certain model parameters tend to their limiting values. The consequence of this identification problem is strongly biased estimators in the case of the ESTAR model specification.
Similar to Heinen et al. (2012) [27], Buncic (2019) [28] emphasizes an additional identification problem in the case of the ESTAR model, which implies observational equivalence between the exponential transition function and the quadratic transition function in cases when the speed of transition parameter takes on relatively small values. On the other hand, for relatively large values of the speed of transition parameter, there is an observational equivalence between the exponential transition function and an indicator outlier fitting function. In other words, the exponential transition function acts like a dummy variable which removes the influence of outlier observations. As the simplest possible alternative to the ESTAR model specification, Buncic (2019) [28] recommends the use of (SE)TAR-type threshold models.
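Both limiting cases can be reproduced numerically. The sketch below again assumes the textbook exponential form 1 - exp(-gamma (z - c)^2) and compares it with a quadratic function for a small speed parameter and with an indicator function for a large one. The function names and values are hypothetical.

```python
import math


def exp_transition(z, gamma, c=0.0):
    """Assumed textbook exponential transition function."""
    return 1.0 - math.exp(-gamma * (z - c) ** 2)


def quadratic_approx(z, gamma, c=0.0):
    """First-order expansion of the exponential transition around gamma = 0."""
    return gamma * (z - c) ** 2


def indicator_approx(z, c=0.0):
    """Limit for very large gamma: 0 exactly at the threshold, 1 everywhere else."""
    return 0.0 if z == c else 1.0


z = 0.3
print(exp_transition(z, 0.01), quadratic_approx(z, 0.01))  # nearly identical
print(exp_transition(z, 1000.0), indicator_approx(z))      # nearly identical
```

For a small speed parameter the data cannot separate the exponential from the quadratic form, and for a large one the function only switches off observations sitting on the threshold, which is the dummy-variable behaviour described above.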
Fourth, as Sarno (2001) [22] notes in footnote 3 on page 120 of his article, an alternative smooth transition function to the exponential one of the ESTAR process is the logistic transition function of the LSTAR model specification. Sarno (2001) [22] opts for an exponential transition function on statistical grounds and further argues that the LSTAR model “seems relatively less appropriate for modeling the dynamics of the public debt/GDP ratio”, since it implies asymmetric behaviour of public debt/GDP with respect to the endogenously estimated threshold. Cochrane (2022) [17], however, claims (page 31) that “the s-shaped surplus/GDP process is a crucial lesson” for the post-World War II fiscal dynamics. In other words, today’s deficits are followed by future surpluses, since the surplus/GDP ratio follows an s-shaped process in a two-variable VAR setting with public debt/GDP and surplus/GDP ratios. But even if the statements of Cochrane (2022) [17] about the s-shaped surplus/GDP process are correct, which Campbell et al. (2023) [18] and Jiang et al. (2024) [19] question on the basis of the highly persistent near-unit root process for public debt/GDP, the problem with the LSTAR model specification, as Ekner and Nejstgaard (2013) [29] claim on page 17, is that “a large and imprecise estimate of the speed of transition parameter implies that the LSTAR model is effectively a TAR model.”
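The quoted point can be seen directly from the logistic transition function in its common textbook form, 1 / (1 + exp(-gamma (z - c))). The sketch below is an illustration under that assumed form, with a hypothetical function name, threshold and speed values.

```python
import math


def logistic_transition(z, gamma, c):
    """Assumed textbook LSTAR transition: rises monotonically from 0 to 1 around c."""
    return 1.0 / (1.0 + math.exp(-gamma * (z - c)))


c = 0.5  # hypothetical threshold
for gamma in (5.0, 50.0, 500.0, 5000.0):
    below = logistic_transition(0.4, gamma, c)
    above = logistic_transition(0.6, gamma, c)
    print(gamma, round(below, 6), round(above, 6))
```

Once the speed parameter is large, raising it further leaves the regime weights at essentially 0 below the threshold and 1 above it. The fit is then almost insensitive to the exact value of the parameter, which is consistent with a large and imprecise estimate, and the model behaves like an abrupt threshold model.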
The recommendations of both Buncic (2019) [28] and Ekner and Nejstgaard (2013) [29] suggest that the (SE)TAR process has more desirable statistical properties than the ESTAR and LSTAR processes, respectively. Gnegne and Jawadi (2013) [30] estimate a two-regime SETAR process for the public debt/GDP ratio in the case of the United States between 1970 and 2009. However, similarly to Sarno (2001) [22], Gnegne and Jawadi (2013) [30] model the non-linear behaviour in the changes, not levels, of the public debt/GDP ratio, which effectively implies investigating asymmetries in the stock-flow adjusted overall fiscal balance. The choice of Gnegne and Jawadi (2013) [30] to focus on changes, instead of levels, of the public debt/GDP ratio is governed by the potentially inappropriate choice of respective unit root tests. In particular, Gnegne and Jawadi (2013) [30] assert (Table 1, page 158) that “According to Table 1, the great majority of unit root tests indicate that public debt/GDP ratio in the case of US is an I(1) stochastic processes. To check the robustness of our findings for the presence of structural breaks, we further apply a ZA unit root test, but the main conclusion about I(1) behaviour remains unchanged.”
Gnegne and Jawadi (2013) [30], hence, use the Zivot-Andrews (ZA) unit root test with a single endogenous structural break to strengthen their findings about the I(1) nature of the stochastic process for the U.S. public debt/GDP ratio between 1970 and 2009. Chortareas et al. (2008) [31], however, caution that the results of unit root tests with structural breaks often do not agree with the results of unit root tests that posit non-linear stationarity under the alternative hypothesis. In other words, since unit root tests with structural breaks essentially capture different time series characteristics of the stochastic process in question, one should use them only as complementary tests to the non-linear unit root tests, as Chortareas et al. (2008) [31] recommend.
Since the choice of a particular alternative hypothesis in unit root tests affects their ability to reject the null hypothesis, one testing strategy for attaining the desirable power of unit root testing procedures would be to use the F-test of Enders and Granger (1998) [32] for the null hypothesis of a unit root against the alternative of a stationary two-regime SETAR process. The reader should note, however, that the Monte Carlo simulations of Enders (1998) [33] show that the F-test of Enders and Granger (1998) [32] has lower power than the traditional Dickey-Fuller unit root test of Dickey and Fuller (1981) [10], which ignores the threshold break under the alternative. The problem with the Dickey-Fuller unit root test, on the other hand, is that it has very low power in the case of highly persistent near-unit root AR(1) processes, which is precisely the case for the U.S. public debt/GDP ratio. Since both the F-test of Enders and Granger (1998) [32] and the Dickey-Fuller test of Dickey and Fuller (1981) [10] have low power in the case of the U.S. public debt/GDP ratio, one potential solution is to use the efficient unit root tests of Elliott et al. (1996) [34], since Bec et al. (2022) [35] find that these unit root tests have higher power than traditional unit root tests, the single threshold-type unit root tests of Enders and Granger (1998) [32] and the two threshold-type unit root tests of Kapetanios and Shin (2006) [36] when the AR(1) coefficient is larger than 0.95.
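The idea behind the efficient tests can be sketched for the intercept-only case: the series is first demeaned by a GLS regression on quasi-differenced data with the local-to-unity parameter set to -7, the value proposed by Elliott et al. (1996) [34] for this case, and a Dickey-Fuller regression without deterministic terms is then run on the demeaned series. The function names are hypothetical, the simulated data are illustrative, and lag augmentation and the linear trend case are omitted for brevity.

```python
import math
import random


def gls_demean(y, c_bar=-7.0):
    """Remove the mean estimated on quasi-differenced data with alpha = 1 + c_bar / T."""
    n = len(y)
    alpha = 1.0 + c_bar / n
    y_q = [y[0]] + [y[t] - alpha * y[t - 1] for t in range(1, n)]
    z_q = [1.0] + [1.0 - alpha] * (n - 1)  # quasi-differenced constant
    beta = sum(z * v for z, v in zip(z_q, y_q)) / sum(z * z for z in z_q)
    return [v - beta for v in y]


def df_gls_tau(y, c_bar=-7.0):
    """Dickey-Fuller t-statistic, no deterministic terms, on the GLS-demeaned series."""
    yd = gls_demean(y, c_bar)
    x = yd[:-1]
    dy = [yd[t] - yd[t - 1] for t in range(1, len(yd))]
    sxx = sum(v * v for v in x)
    rho = sum(a * b for a, b in zip(x, dy)) / sxx
    ssr = sum((b - rho * a) ** 2 for a, b in zip(x, dy))
    se = math.sqrt(ssr / (len(dy) - 1) / sxx)
    return rho / se


# Hypothetical near-unit-root series around a mean of 0.6.
rng = random.Random(7)
series = [0.6]
for _ in range(199):
    series.append(0.6 + 0.97 * (series[-1] - 0.6) + rng.gauss(0.0, 0.02))
print(df_gls_tau(series))
```

Because the mean is estimated under a local alternative close to the unit root rather than by ordinary least squares, the resulting statistic is invariant to the level of the series and, according to the literature cited above, has better power against highly persistent alternatives.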
Although Gnegne and Jawadi (2013) [30] do not report the results of the efficient unit root tests of Elliott et al. (1996) [34], they present, in line with the recommendations from Bohn (2007) [15], the results of the KPSS stationarity test of Kwiatkowski et al. (1992) [37]. In particular, Bohn (2007) [15] asserts that testing the null hypothesis of stationarity against the alternative of a unit root can be of economic interest, since one can, after concluding that the null hypothesis of stationarity cannot be rejected, proceed to test for potential non-linearities in the stochastic process for public debt. However, Gnegne and Jawadi (2013) [30] present only the results of stationarity testing with an intercept term and no trend, even though Figure 2 on page 156 of their paper clearly depicts the upward trending behaviour of the U.S. public debt/GDP ratio between 1970 and 2009. The realized value of the KPSS test statistic of 1.44 from Table 1 (page 158) of Gnegne and Jawadi (2013) [30] rejects the null hypothesis of stationarity at the 5% significance level, but the results have to be interpreted with caution since the choice of an intercept term as the only deterministic component can influence the power of the stationarity test of Kwiatkowski et al. (1992) [37].
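The sensitivity of the stationarity test to the deterministic component can be illustrated on simulated data. The sketch below computes the KPSS statistic with a Bartlett-kernel long-run variance, once with an intercept only and once with an intercept and a linear trend. The function name, the bandwidth rule int(4 (T/100)^(1/4)) and the simulated trending series are assumptions made for illustration and are not the data of Gnegne and Jawadi (2013) [30].

```python
import random


def kpss_stat(y, trend=False, lags=None):
    """KPSS statistic: sum of squared partial sums of residuals over T^2 times the long-run variance."""
    n = len(y)
    y_bar = sum(y) / n
    if trend:
        t = list(range(1, n + 1))
        t_bar = sum(t) / n
        slope = (sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
                 / sum((ti - t_bar) ** 2 for ti in t))
        resid = [yi - y_bar - slope * (ti - t_bar) for ti, yi in zip(t, y)]
    else:
        resid = [yi - y_bar for yi in y]
    if lags is None:
        lags = int(4 * (n / 100.0) ** 0.25)  # assumed bandwidth rule
    partial, sum_sq = 0.0, 0.0
    for e in resid:
        partial += e
        sum_sq += partial * partial
    lrv = sum(e * e for e in resid) / n
    for j in range(1, lags + 1):
        weight = 1.0 - j / (lags + 1.0)  # Bartlett kernel
        gamma_j = sum(resid[i] * resid[i - j] for i in range(j, n)) / n
        lrv += 2.0 * weight * gamma_j
    return sum_sq / (n * n * lrv)


# Hypothetical trend-stationary series with 40 annual observations.
rng = random.Random(3)
y = [0.30 + 0.01 * t + rng.gauss(0.0, 0.02) for t in range(40)]
print(kpss_stat(y, trend=False), kpss_stat(y, trend=True))
```

The asymptotic 5% critical values tabulated by Kwiatkowski et al. (1992) [37] are 0.463 for the intercept-only case and 0.146 for the case with a linear trend. A series that is stationary around a trend produces a large intercept-only statistic simply because the trend is left in the residuals, which is why a rejection from the intercept-only version alone is hard to interpret.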
Before presenting the methodological econometric framework in the next section of the paper, it is useful to summarize the main points regarding the time series properties of the U.S. public debt/GDP ratio after the Bretton Woods collapse.
First, the U.S. public debt/GDP ratio can be best characterized as a near-unit root stochastic process with a first lag autocorrelation coefficient higher than 0.95.
Second, in determining the order of integration of the U.S. public debt/GDP ratio, one should place emphasis on the efficient unit root tests of Elliott et al. (1996) [34] and the stationarity test of Kwiatkowski et al. (1992) [37], using both the intercept and a linear time trend as deterministic components in the testing regressions.
Third, to model threshold non-linearities in the dynamics of the U.S. public debt/GDP ratio one should opt for the SETAR model specification instead of ESTAR or LSTAR model specifications.
Fourth, the SETAR model should be estimated in levels, not first differences, of the U.S. public debt/GDP ratio since
i) the first differenced public debt/GDP approximates the overall fiscal balance corrected for the stock-flow adjustments and consequently has an alternative economic interpretation with respect to the public debt/GDP ratio measured in levels; and
ii) bond investors, credit rating agencies, policy makers and international financial institutions are primarily interested in monitoring and forecasting public debt/GDP ratio in levels, not first differences.
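A two-regime SETAR model in levels, as recommended in the third and fourth points, can be sketched with a conditional least squares grid search over candidate thresholds. The code below is a minimal illustration on simulated data. The function names, the 15% trimming, the single lag and all parameter values are hypothetical and do not represent the specification estimated in this paper.

```python
import random


def ols_ssr(x, y):
    """Sum of squared residuals, intercept and slope from OLS of y on a constant and x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((v - x_bar) ** 2 for v in x)
    b = sum((v - x_bar) * (w - y_bar) for v, w in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    ssr = sum((w - a - b * v) ** 2 for v, w in zip(x, y))
    return ssr, a, b


def fit_setar(y, trim=0.15):
    """Grid search: pick the threshold on y_{t-1} that minimizes the pooled SSR."""
    lag, cur = y[:-1], y[1:]
    candidates = sorted(lag)
    lo, hi = int(trim * len(candidates)), int((1.0 - trim) * len(candidates))
    best = None
    for c in candidates[lo:hi]:
        low = [(x, v) for x, v in zip(lag, cur) if x <= c]
        high = [(x, v) for x, v in zip(lag, cur) if x > c]
        ssr_low, a1, b1 = ols_ssr(*zip(*low))
        ssr_high, a2, b2 = ols_ssr(*zip(*high))
        total = ssr_low + ssr_high
        if best is None or total < best["ssr"]:
            best = {"ssr": total, "threshold": c, "low": (a1, b1), "high": (a2, b2)}
    return best


def simulate_setar(n, c, low, high, sigma, seed):
    """Hypothetical SETAR(1) in levels; low and high are (intercept, slope) pairs."""
    rng = random.Random(seed)
    y = [c]
    for _ in range(n - 1):
        a, b = low if y[-1] <= c else high
        y.append(a + b * y[-1] + rng.gauss(0.0, sigma))
    return y


# Near-unit-root dynamics below a 60% threshold, faster mean reversion above it.
y = simulate_setar(n=300, c=0.60, low=(0.012, 0.99), high=(0.09, 0.85), sigma=0.02, seed=11)
fit = fit_setar(y)
print(fit["threshold"], fit["low"], fit["high"])
```

Working with the level of the ratio keeps the threshold and the regime-specific persistence directly interpretable as statements about the debt ratio itself, rather than about the stock-flow adjusted fiscal balance.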