1. Introduction
Artificial intelligence (AI) involves the creation of machines that are able to think and act like humans. An important goal is to ensure that these machines are safe for humans and consistent with human values (Gabriel & Ghazavi, 2021). As a result, the AI alignment problem (Chaturvedi, 2023) is a central area of research in the science of AI. Getting machines to think like human brains requires modelling machine learning in line with the way human brains make decisions in the real world (Sejnowski, 2018). The highly formalized human decision-making models of economics, based on utility functions, lend themselves well to this purpose. In the science of AI, the counterparts to economic utility functions are known by various terms, including value functions, objective functions, loss functions, reward functions and preference orderings (Naudé, 2023).
Explicitly grounding AI in economic theory has been shown to yield improved results (Jenkins et al., 2021). Many AI scientists strive to make their AI systems fully rational by utilizing rational choice theory from economics (Green, 2022), specifically in the form of expected utility theory (Naudé, 2023). However, there is considerable evidence that human decision makers are not fully rational and calculating in practice (Dillon, 1998). It has also been well documented that expected utility is not effective as an applied model of individual choice (Tversky, 1975). Comprehensive reviews of experimental research have demonstrated violations of each of the axioms of expected utility (Yaqub et al., 2009). Behavioral economics (Dhami, 2016) emerged as a powerful approach to decision making, heavily grounded in psychology and incorporating the evidence that humans are limited to bounded rationality in practice (Simon, 1990). Real-world decision making faces many challenges, including limits on computation and information as well as cognitive biases. Behavioral economics has been most robustly structured within the framework of cumulative prospect theory (Tversky & Kahneman, 1992), a major advance in modelling. Today, much of AI modelling falls back on some form of bounded rationality (Şimşek, 2020).
These are significant enhancements to the theoretical foundations, yet when it comes to real-world application in decision making under uncertainty, economics and AI both face an environment with multiple individuals, human agent collectives (Naudé, 2023) and decisions that span multiple, sequential, causally-linked time periods where information is costly (Naudé, 2023). AI is a time-based process, with causality running from percepts to outcomes (Russell & Norvig, 2021). Given this context, the theoretical frameworks of expected utility theory (EUT) and behavioral economics (BE) typically have to be put aside in applied economics in favour of dynamic stochastic programming (Bellman, 1957) based on a Markov decision process (Howard, 1960). In similar fashion, much of applied AI is focused on reinforcement learning (Kaelbling et al., 1996), which is also sequential and based on a Markov decision process (Agrawal, 2019). The sequential nature of applied decision making has also led economists and AI scientists to model intelligent agents with Bayesian probability theory to inform their beliefs (priors) and utility theory to inform their preferences (Naudé, 2023).
This complex environment renders expected utility and behavioral economics unable to serve as core models in real-world applications, primarily because of their reliance on single-period lotteries with intertemporal choice augmentations such as exponential discounting (Samuelson, 1937) and quasi-hyperbolic discounting (O’Donoghue & Rabin, 1999). These approaches do not explicitly define the causal relationship in which uncertain benefits follow upfront certain and uncertain costs over time in significant decisions (Horton, 2019). In contrast, the structure of causal economics (Horton, 2019) does align with this real-world environment, as this paper will explore. Causal economics provides a consistent theoretical underpinning for applied modelling in economics and AI that utilizes dynamic stochastic programming and machine learning algorithms (Naudé, 2023).
The theoretical foundations of causal economics align strongly with the current surge of interest in causal inference throughout econometrics, statistics and computer science, represented in research on natural experiments (Imbens & Angrist, 1994) based on the potential outcomes approach (Rubin, 2005) of the Rubin causal model (Imbens & Rubin, 2010) and the recent development of causal machine learning (Huber, 2023) with its expanding literature (Arti et al., 2020). This invigorated, cross-disciplinary interest in causality presents an opportunity for causal economic theory (CE), causal machine learning (CML) and the natural experiments approach to be used together in a research agenda on causal economic machine learning (CEML).
2. Theory
2.1. Causal Economics Theory
There have been many significant advances in decision theory across academic disciplines, but more work needs to be done to provide a consistent, unified foundational framework at the theoretical and empirical levels (Mishra, 2014). Causal economics attempts to address this gap by providing a robust framework well suited to testing a wide range of algorithms that address issues in AI research. The theory is grounded in behavioral economics, psychology, biology and neuroscience (Horton, 2019). In the sections that follow, it will be demonstrated that expected utility (Mongin, 1997) and cumulative prospect theory (Tversky & Kahneman, 1992) represent constrained mathematical reductions of causal economics that limit applications. The richness of causal economics is possible due to the definition of outcomes. Outcomes in causal economics must always contain both cost and benefit, each of which contains certain (deliberate) and uncertain elements. Causation must also run in at least one direction of the relationship (Horton, 2019). The approach to preferences used in causal economics is built upon the structural model of cumulative prospect theory (Fennema & Wakker, 1997). With this foundation in place, preferences are expanded to include cost broadly (incorporating certain and uncertain components), with risk entering through the uncertain component. Psychological trade-off constraints are then defined as the primary umbrella constraint for optimization, which includes budget constraints (Horton, 2019). The following summary briefly introduces the equations of the causal economic framework for those not familiar with it (Horton, 2019), beginning with key definitions:
i = an outcome where P incrementally results from Q and/or Q incrementally results from P (Q is required for P and/or vice versa)
t = time period
+,- = positive overall outcome and negative overall outcome respectively
(+) , (-) = personal total benefit and personal total cost outcome values respectively
w = probability weighting function
v = value function
π = capacity function of individual events
Ψ = rank dependent cumulative weighting of overall weighted event values
Ω = rank dependent cumulative probability weighting
Pi = perceived magnitude of positive element in outcome i
Qi = perceived magnitude of negative element in outcome i
pi = perceived probability of positive element in outcome i
qi = perceived probability of negative element in outcome i
PCi = acceptable relative magnitude of positive element in outcome i
QCi = acceptable relative magnitude of negative element in outcome i
pCi = acceptable relative probability of positive element in outcome i
qCi = acceptable relative probability of negative element in outcome i
ΨC = acceptable rank dependent cumulative weighting of overall weighted event values
ZU = perceived magnitude of outcome i (used when positive and negative outcomes aren’t specified)
zU = perceived probability of outcome i (used when positive and negative outcomes aren’t specified)
Optimization in causal economics involves maximization of the value function, v, shown in equation (1). This maximization is subject to a pair of personal psychological trade-off constraints, defined by equations (2) and (3). Optimization in causal economics differs fundamentally from the approach utilized in traditional economics, mainly through the addition of these two psychological trade-off constraints. Decision makers optimize by making trade-off decisions in light of their personal, internal psychological constraint of acceptable cost relative to causally coupled benefit (Horton, 2019). External constraints such as the decision maker’s financial budget impact utility indirectly through the personal interpretation of the decision maker, based on that agent's personal psychological constraint, which is a matter of personal choice based on their individual experiences (Vahabi, 2001).
In causal economic optimization, the value function, v, and the psychological trade-off constraint, c, play separate but complementary roles. The value function captures a decision maker’s personal judgements regarding the particular set of prospects they believe they may face when making a decision and the specific costs and benefits they associate as realistically possible with each event. It incorporates preferences over both risk and time, but also represents perceived realities that the decision maker feels they must take as given (Horton, 2019). This is contrasted with the psychological trade-off constraint, c, which directly conveys a decision maker's preferences across costs (certain and uncertain) as well as risk (uncertain). It conveys the marginal amount of benefit required for a decision maker to take on one additional unit of cost, or, equivalently, the marginal amount of cost they will bear in order to obtain an additional unit of benefit.
The resulting optimization dynamic between v and c is therefore a process in which anticipated prospects are evaluated, via v, and compared to acceptable scenarios, via c, subject to L as a limit, at least in the short-term (Horton, 2019).
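To make this optimization dynamic concrete, the following is a minimal numerical sketch in Python. The functional forms, parameters and constraint threshold are assumptions chosen for exposition only; they do not reproduce the actual equations (1)–(3) of causal economics (Horton, 2019).

```python
# Illustrative sketch only: toy functional forms and an invented trade-off
# threshold, not the equations (1)-(3) of causal economics (Horton, 2019).
import numpy as np
from scipy.optimize import minimize

def expected_benefit(effort):
    return np.sqrt(effort)            # uncertain benefit, concave in effort

def total_cost(effort):
    return 1.2 * effort ** 1.5        # upfront, largely certain cost, convex in effort

def value(effort):
    """Toy value function v over a causally coupled outcome."""
    return expected_benefit(effort) - total_cost(effort)

def tradeoff_constraint(effort, acceptable_cost_per_benefit=2.0):
    """Toy psychological trade-off constraint c: cost borne per unit of causally
    coupled benefit must not exceed the agent's acceptable threshold."""
    return acceptable_cost_per_benefit * expected_benefit(effort) - total_cost(effort)

res = minimize(lambda e: -value(e[0]), x0=[0.5], bounds=[(0.0, 10.0)],
               constraints=[{"type": "ineq", "fun": lambda e: tradeoff_constraint(e[0])}])
print(f"optimal effort: {res.x[0]:.3f}, value at optimum: {-res.fun:.3f}")
```

With these particular numbers the constraint happens to be slack at the optimum; lowering the acceptable cost-per-benefit threshold would make it bind and pull the chosen effort below the unconstrained maximum, which is the mechanism the psychological trade-off constraint is intended to capture.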
2.2. A Real World Example
A powerful way to illustrate how causal economics can effectively model the complex decisions agents face in the real world, in a way that traditional economic models cannot, is through a common example that most people can relate to: the pursuit of weight loss. A goal to lose weight requires the decision maker to weigh known costs and risks in advance of anticipated future benefits (Horton, 2019). It is certainly more complex than going to the gym and eating a healthy diet for a day and then immediately losing a few pounds. Losing weight actually requires motivating goals (Brink & Ferguson, 1998) as well as deliberate and sustained commitment to incurring challenging personal costs, in the form of dietary sacrifices and physical exertion, that extend over months (Horton, 2019). Neither the effort required nor the results obtained are primarily random lotteries, as would be assumed in traditional economic optimization (Horton, 2019).
Causal economics is able to effectively model this weight loss scenario because its framework builds in a multi-period structure, containing both certain (deliberate) and uncertain cost and benefit, and requires costs to be causally matched to resulting benefits, both contemporaneously and in future time periods. This causal coupling means that costs (some certain and some uncertain) are explicitly and directly tied to the expected resulting benefits that occur over the entire decision time horizon (Horton, 2019). Costs are generally greater and relatively more certain in earlier periods, in contrast with potential benefits, which tend to occur in later periods with relatively less certainty (Horton, 2019). In the weight loss example, costs that can be controlled to a large degree include exercise and diet. The associated causal gains involve weight loss, improved health and positive self-esteem.
The complexity of modelling an individual’s decision to pursue a weight loss goal doesn’t end with this matching of costs and benefits over the decision time frame. Decision makers are individuals with their own unique subjective preferences and subjective probability assessments, as captured by subjective expected utility theory (Savage, 1954). For example, an individual may pursue an exercise program in order to obtain their own perceived personal benefits, such as improved health and attractiveness (Brink & Ferguson, 1998), with full knowledge that exercising and eating well will allow them to achieve their goal with certainty, but only if they commit to the effort over time. There is uncertainty concerning the actual outcome quantities of individual parameters, but the decision maker’s ability to deliberately decide and control the overall outcomes of their decisions (versus facing a lottery) is largely missing from expected utility and behavioral economics models (Horton, 2019). In the real world, decision makers do not face a series of random lotteries. They impact and are impacted by outcomes (Horton, 2019). Many popular expressions in the English language capture this observed reality, such as ‘getting out what you put in’, ‘no pain, no gain’ and ‘bearing the fruit of one's labour’ (Horton, 2019).
2.3. Reducing Causal Economics to Expected Utility Theory
Expected utility (Schoemaker, 1982) can be directly derived from the model of causal economics with a number of restrictions. The first element of constraining causal economics to expected utility is restricting outcomes to single-value, net positive or negative lottery outcomes. This is accomplished by applying the following conditions to equations (1), (2) and (3):
w+(+), w+(-), w-(+), w-(-) = 0
v+(+), v+(-), v-(+), v-(-) = 0
pA, qA, PA, QA = 0
pU, qU, PU, QU = 0
w, v, Z, z ≠ 0
Ψ, ΨC = 1
Given that outcomes in causal economics typically span a significant number of time periods, the second element of reducing causal economics to expected utility is the requirement that outcomes collapse to a single time period.
Based further on the assumption of linear independence as required for expected utility (Fishburn, 2013), the expected utility value function is defined by the familiar equation (4);
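For reference, the standard expected-utility form, written here in generic notation that may differ from the paper's equation (4), is

V = \sum_{i} z_i \, v(Z_i),

where z_i is the perceived probability and Z_i the single-valued perceived magnitude of lottery outcome i.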
In expected utility theory the value function is optimized in the context of standard consumer choice theory, based on indifference curves and budget constraints (Isaac, 1998). In order to reduce causal economic optimization to expected utility optimization, the internal psychological trade-off constraint function (2) must also be limited to a budget constraint and the personal total cost threshold constraint (3) must be allowed to take on an infinite value.
Time preferences are often added to the risk preferences of expected utility through the delta model (O'Donoghue & Rabin, 1999), which incorporates exponential discounting (Samuelson, 1937). Exponential discounting is time-separable, also known as time-consistent or dynamically consistent, which implies that a decision maker’s trade-offs between utility today and delayed utility in the future are independent of when that delay occurs. In the delta model, each subsequent period utility value is multiplied by δ^t, as shown in equation (5);
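For reference, the standard delta-model form, in generic notation that may not match the paper's equation (5), is

U_0 = \sum_{t=0}^{T} \delta^{t} u_t,

where u_t is the utility realized in period t and 0 < \delta \le 1 is the per-period discount factor.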
This approach to intertemporal time discounting applies the delta discount factor to the overall utility function, which is a reduction from the framework of causal economics, where time discounting can be captured within any or all of the weighting functions, w+(+), w+(-), w-(+), w-(-), v+(+), v+(-), v-(+), v-(-) with a functional form that fits the data appropriately (Horton, 2019).
2.4. Reducing Causal Economics to Behavioral Economics
Causal economics can also be mathematically reduced to cumulative prospect theory, which is the most robust formalized framework within behavioral economics. The first step in doing so is to restrict outcomes to single-value, net positive or negative lottery outcomes, as was done with expected utility. This is accomplished by applying the following conditions to equations (1), (2) and (3):
w+(+), w+(-), w-(+), w-(-) = 0
v+(+), v+(-), v-(+), v-(-) = 0
pA, qA, PA, QA = 0
pU, qU, PU, QU = 0
w, v, Z, z ≠ 0
Ψ, ΨC = 1
Outcomes in causal economics typically span a significant number of time periods, which results in the second requirement for reducing causal economics to cumulative prospect theory: that outcomes again collapse to a single time period.
In cumulative prospect theory, Z and z are relative to status quo (instead of zero), Ω is a rank dependent cumulative probability weighting, and sign comonotonic trade-off consistency (Wakker & Tversky, 1993) holds. Under all of these conditions, the causal economic value function (1) reduces to the utility value function for cumulative prospect theory represented by equation (6);
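For reference, a standard statement of the cumulative prospect theory value function, in generic notation that may differ from the paper's equation (6), is

V = \sum_{i} \pi_i^{+} \, v(Z_i^{+}) + \sum_{j} \pi_j^{-} \, v(Z_j^{-}),

where gains Z^{+} and losses Z^{-} are measured relative to the status quo, v is a reference-dependent value function, and the decision weights \pi are derived from the rank-dependent cumulative probability weighting \Omega applied separately to gains and losses.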
In cumulative prospect theory, the value function is optimized in the context of standard consumer choice theory, based on indifference curves and budget constraints (Isaac, 1998). As a result, in order to reduce causal economic optimization to cumulative prospect theory optimization, the internal psychological trade-off constraint function (2) must also be limited to a budget constraint and the personal total cost threshold constraint (3) must be allowed to take on an infinite value.
This reduction to cumulative prospect theory addresses the risk preference element of behavioral economics. However, intertemporal choice is an important component of behavioral economics (Dhami, 2016) and is typically incorporated through quasi-hyperbolic discounting via the beta-delta model (O'Donoghue & Rabin, 1999). Quasi-hyperbolic discounting captures time inconsistent preferences, such as present bias and changing preferences as time elapses. These are important additions, as extensive research demonstrates that decision makers generally do not have time consistent preferences (Thaler, 1981). They tend to be more patient for long-term gains and relatively impatient for short-term gains, which means that they often plan to do something in the future, but subsequently change their mind. These characteristics align closely with the concept of causal coupling, especially as illustrated in the real world example in section 2.2. Causal economics weighting functions often employ quasi-hyperbolic discounting (Horton, 2019).
In the beta-delta model, each subsequent period utility value is multiplied by β and δ^t, as shown in equation (7).
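For reference, the standard beta-delta form, in generic notation that may not match the paper's equation (7), is

U_0 = u_0 + \beta \sum_{t=1}^{T} \delta^{t} u_t,

where 0 < \beta \le 1 captures present bias and \delta is the per-period discount factor.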
This approach to intertemporal time discounting applies the beta and delta discount factors to the overall utility function, which is a reduction from the framework of causal economics, where time discounting can be captured within any or all of the weighting functions, w+(+), w+(-), w-(+), w-(-), v+(+), v+(-), v-(+), v-(-), with a functional form that fits the data appropriately (Horton, 2019).
2.5. Causal Economics and Artificial Intelligence
2.5.1. The Principle of Causal Coupling
2.5.1.1. Microeconomics
The previous sections have mapped out causal economics and demonstrated that it can be used as a single unified decision model that captures the breadth of scenarios applicable in the real world, the scenarios tackled by AI. This provides increased clarity for AI modelling at the individual level. AI is a broad discipline that attacks many problems with a range of techniques (Pannu, 2015). However, the richness of causal economics creates complexity, and there is a need to be aware of the risk of combinatorial explosion slowing down computation and impacting solvability (Russell & Norvig, 2021). Causal economics is most relevant to AI problems in the areas of planning, decision making and learning. These areas deal with intelligent reasoning in the face of uncertainty and incomplete information over time. As a result, they incorporate probabilities, feedback and learning, utilizing various tools from probability and economics (Russell & Norvig, 2021). Research in adaptive economics has mapped out the importance of this feedback and learning in a complex and evolving environment (Day, 1975). The fundamental anchor of causal economics that aligns it closely with this context is the concept of causal coupling (CC). Causal coupling builds in costs and benefits that reflect subjectivity, causality, uncertainty and information availability over time. Causal coupling is formally defined in equation (8);
Equation (8) requires that each outcome must contain both benefits and costs (contemporaneous and subsequent causation). It reflects the notion that decision makers in general have more immediate control over and face more immediate impact on their costs in terms of upfront effort than they do over the benefits that typically follow with some time delay. Equations (1), (2) and (8) directly build causal coupling into the definitions of percepts, outcomes, preferences and constraints. This mathematical grounding provides a robust foundation for AI algorithm development and application in individual agent decision making.
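As an illustration of how these definitions might be carried into AI code, the sketch below represents a causally coupled outcome as a simple data structure. The field names and the coupling check are assumptions for exposition, not a transcription of equation (8).

```python
# Illustrative data structure only; field names are assumptions and do not
# reproduce equation (8) from Horton (2019).
from dataclasses import dataclass

@dataclass
class CausallyCoupledOutcome:
    period: int
    certain_cost: float       # deliberate, upfront cost (e.g., effort committed)
    uncertain_cost: float     # expected value of risky cost
    certain_benefit: float    # deliberate benefit secured in this period
    uncertain_benefit: float  # expected value of risky, typically delayed benefit

    def is_causally_coupled(self) -> bool:
        """Toy check: the outcome must contain both a cost and a benefit component."""
        total_cost = self.certain_cost + self.uncertain_cost
        total_benefit = self.certain_benefit + self.uncertain_benefit
        return total_cost > 0 and total_benefit > 0

# Example: early periods are cost-heavy, later periods benefit-heavy.
plan = [CausallyCoupledOutcome(0, 5.0, 1.0, 0.0, 0.5),
        CausallyCoupledOutcome(3, 1.0, 0.5, 2.0, 6.0)]
print(all(o.is_causally_coupled() for o in plan))
```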
2.5.1.2. Macroeconomics
Whereas causal coupling connects cost and benefit for individual decision makers as a causal relationship over time, it can also be applied at a macro policy level in pursuit of societal optimization (Horton, 2019). This possibility centers on a bold conclusion of causal economics—that economic social coordination failures (outcomes that are not Pareto optimal) within societies are at their root the result of persistent causal decoupling of benefits and costs within populations of involved and/or impacted agents, and that effective solutions are the result of ensuring that causal coupling is achieved (Horton, 2019). In essence, social coordination failure is a generalization of the well-entrenched concept of market failure (Horton, 2019).
The pursuit of self-interest through the power of the free market, captured by Adam Smith’s invisible hand metaphor, has been demonstrated in practice (Baumol, 2002), but there are significant market failures in society that reduce welfare (Stiglitz, 1991). The application of causal coupling at a macro level attempts to close this gap through policies that couple cost and benefit for individuals, building in the notion of freedom with accountability (Horton, 2019). A loose metaphor for causal coupling at the macro level is the generalization of Adam Smith’s invisible hand into invisible hands. One hand represents the freedom to pursue self-interest for personal benefit, while the other represents the obligation to contribute a prorated share to societal costs as determined through democracy. This perspective gives AI researchers a fresh opportunity to apply causal coupling as a macro policy condition when testing for policy optimality in populations.
Market failure occurs when the voluntary actions of agents fail to take into account broader externalities to the current transaction (Ledyard, 1989). Social coordination failure typically results when an agent is able to obtain personal enrichment without bearing the causally preceding costs, essentially gaining benefit through means other than a voluntary interaction/exchange, and/or avoiding their prorated share of the cost of public programs (Horton, 2019). It is essentially a causal decoupling of cost and benefit over the intermediate to long-term across individuals and society at large. In causal economic theory market failure is not due to exogenous externalities, but is instead the result of endogenous causal decoupling between cost and benefit across individuals and the population (Horton, 2019).
In practice, the largest source of causal decoupling in societies occurs when intermediaries receive or distribute compensation and benefits that are not tied to performance (Horton, 2019). Social policies are Pareto optimal when they couple benefits and costs for agents and do not allow intermediaries to accrue extra benefit from, and/or offload extra cost onto, others that have no choice in the matter, taking into account realistic preferences, usage and risk (Horton, 2019). When involuntary allocations are applied through taxation to fund democratically determined public goods, services and risk-sharing programs, optimal allocations of benefit and cost are achieved by prorating costs across individuals, either through a flat tax or a user fee, with relief for those facing significant challenges to paying a proportionate share, subsidized by taxpayers with higher ability to pay (Horton, 2019). Causal economics asserts the optimality of taxation based on voter-approved specific usage with committed metrics (use-based taxes), as opposed to source-based taxation that is automatically collected on activities associated with economic value creation, such as income and consumption, and placed into general government coffers (Horton, 2019). This doesn’t necessitate higher or lower taxes on individuals or overall. It simply places the focus on a causally coupled, efficient means of raising funds for directed purposes desired by a democratic majority of voters.
2.6. The Neuroscience Underpinning Causal Economics
The concepts and structural setup of causal economics are closely aligned to findings in neuroscience (Kurniawan et al., 2010). Alongside psychology, neuroscience adds a fundamental perspective to our understanding of human decision making. This convergence in research provides a strong grounding for causal economics and its use as a foundation in AI. Studies in neuroscience have established the integration of reward and pain during choice (Talmi et al., 2009). It has been demonstrated that the cost required in action is a vital determinant of choice behavior (Kennerley et al., 2009). Multiple human and animal experiments have shown how the effort one must exert impacts choice and how the brain calculates and integrates effort into action value (Croxson et al., 2009; Floresco & Ghods-Sharifi, 2007; Kennerley et al., 2009; Rudebeck et al., 2008). Total effort is a combination of upfront and ongoing efforts (Horton, 2019) and can be expended as either mental work (Botvinick et al., 2009), or physical work (Kurniawan et al., 2010). Additional research in neuroscience reinforces the discounting of prospects where outcomes contain potential pain or loss (Seymour et al., 2007; Talmi et al., 2009) and are separated over time (Kable & Glimcher, 2008; McClure et al., 2007; Pine et al., 2009). Overall, these observations on effort, costs and outcome value over time from research in neuroscience ground the upfront and ongoing cost component of causal coupling at the centre of causal economics.
2.7. The Psychology Underpinning Causal Economics
The science of large and complex choices that take upfront and continued effort over time is comprehensively covered in the field of psychology (Newell et al., 2022). The causal economics framework essentially consolidates and provides a common formalized structure to many proven concepts in psychology (Horton, 2019). From a theoretical perspective, the structural design of causal economics relies heavily on many principles of psychology (Horton, 2019), as does that of behavioral economics (Camerer, 1999). This is an important foundation to a robust framework that can be most useful when used in AI. However, this alignment to psychology doesn’t end with theory. In applied practice psychologists actively work to assist in the management of a decision maker’s perceived costs and benefits to obtain desired outcomes. These therapies can focus on thoughts and behaviors to various degrees, through methods such as cognitive behavioral therapy, behavioral therapy or behavioral choice theory (Hollon, 2013).
2.8. Decision Making in Artificial Intelligence
AI seeks to address a broad range of challenges, spanning reasoning, planning, learning and perception, with applications in areas such as software, robotics and the Internet of Things (Ertel, 2018). AI researchers have devised many tools to tackle these problems, and since AI agents generally operate with incomplete or uncertain information in the real world, methodologies from probability theory and economics are for the most part leveraged (Russell & Norvig, 2021). Modelling of decision-making agents in AI typically parallels the approach of microeconomic decision theory, based on boundedly rational utility maximization in the face of constraints and perceptions of the environment (Naudé, 2023). Causal economics and its use of causal coupling provide the best structural fit as an underlying model of choice in the context of AI research.
AI takes the bold step of attempting to automate the decision process for efficiency and results that would parallel a human decision maker. Automated planning (Cimatti et al., 2008) involves agents pursuing a specific goal, in the face of complexity, where solutions need to be discovered and optimized in multidimensional space (Russell & Norvig, 2021). Decision makers choose an action with some known costs and make probabilistic guesses about future additional costs and benefits. In multi-agent situations, the decision maker’s environment and even preferences can be dynamically unknown, requiring ongoing iterative revision of the model and associated strategies and policies (Russell & Norvig, 2021). Agents are often forced to reassess their situation once results are experienced (Russell & Norvig, 2021). As the field of artificial intelligence is fundamentally concerned with solving applied problems, not just theory, a range of computational techniques are deployed, including dynamic programming, reinforcement learning and combinatorial optimization (Russell & Norvig, 2021). Game theory is utilized where a researcher is modelling rational behavior across multiple interacting agents (Russell & Norvig, 2021).
Like applied economics, AI research typically utilizes dynamic stochastic programming (Bellman, 1957), based on a Markov decision process (Garcia et al., 2013). In this approach, a transition model is defined that describes the probability that a specific action will change the resulting state in a certain way as well as a reward function that defines the utility benefit value of each state and the cost of each action. Within this context, a policy is calculated or learned and associates a decision with each possible state of the world (Russell & Norvig, 2021).
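A minimal value-iteration sketch over an invented two-state Markov decision process illustrates the transition model, reward function and derived policy described above; all of the numbers below are assumptions for illustration.

```python
# Minimal value-iteration sketch for a finite MDP; the two-state transition
# model and rewards are invented for illustration (Bellman, 1957; Howard, 1960).
import numpy as np

states, actions = 2, 2
# P[s, a, s'] = probability of moving to s' when taking action a in state s
P = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.5, 0.5], [0.0, 1.0]]])
# R[s, a] = immediate reward (benefit net of action cost), also assumed
R = np.array([[1.0, 0.0],
              [2.0, -1.0]])
gamma = 0.95

V = np.zeros(states)
for _ in range(1000):
    Q = R + gamma * P @ V          # Q[s, a] = R[s, a] + gamma * sum_s' P[s, a, s'] V[s']
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

policy = Q.argmax(axis=1)          # the decision associated with each possible state
print("state values:", V, "policy:", policy)
```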
There has been increasing momentum among AI researchers toward the use of subjective probability, in order to model utility functions based on actual consequences and the decision taken, a priori probability distributions and other probability distributions impacting the decision (Suppes, 1969). Currently, AI researchers often model intelligent agents using Bayesian probability theory to inform beliefs (priors) and utility theory to inform preferences (Naudé, 2023). A broad range of practical problems are addressed with Bayesian tools, such as Bayesian networks (Domingos, 2015). Particular algorithms are often applied to particular classes of problems, such as the Bayesian inference algorithm for reasoning, the expectation-maximization algorithm for learning (Russell & Norvig, 2021), decision networks for planning (Russell & Norvig, 2021), and dynamic Bayesian networks for perception (Poole et al., 1998).
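The following small sketch illustrates the Bayesian belief update underlying such agents; the prior and likelihood over two hypothetical world states are assumptions.

```python
# Small sketch of a Bayesian belief update: an agent revises its prior over two
# hypothetical world states after observing a noisy percept. Numbers are assumptions.
import numpy as np

prior = np.array([0.5, 0.5])          # belief over states {good, bad}
likelihood = np.array([0.9, 0.2])     # P(observed percept | state)
posterior = likelihood * prior
posterior /= posterior.sum()          # Bayes' rule: normalize the joint
print(posterior)                      # updated belief that informs preferences and actions
```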
It is clear that the environment in which AI research is conducted is one of complex decisions involving causally-linked cost and benefit trade-offs over extended periods of time, in the face of uncertainty, and across many agents. This environment is naturally modelled with causal economics and causal coupling, and does not correspond highly to the underlying frameworks of expected utility or behavioral economics (Horton, 2019).
2.9. Machine Learning
The previous section illustrated the underlying context in which AI agents must make decisions: one of subjective, causally-linked cost-benefit trade-offs over time, in the face of uncertainty, with multiple interacting individuals and feedback from actions taken. But stopping at this point would forgo the opportunity to incorporate learning insights that reinforce effective strategies while simultaneously avoiding ineffective ones (Barto et al., 1989). In real-world decision-making situations, humans iteratively discover information, without a priori knowledge of eventual results. They regularly experience outcomes that are shown a posteriori to be suboptimal; however, these situations provide learning opportunities, and as a result decision makers are able to modify future decisions in pursuit of improved results (Chialvo & Bak, 1999). The fundamental driver of the study of machine learning is to understand and apply programs that are able to improve their own performance on a particular task automatically (Russell & Norvig, 2021). The notion of computers being able to learn from their own decisions and outcomes has been fundamental to AI from the outset (Turing, 1950). Today, machine learning has become central to the resurgence of interest in AI (Jayabharathi & Ilango, 2021). Fields such as healthcare are making very strong use of machine learning and seeing positive results (Bera et al., 2023). The study of machine learning is commonly divided into two major categories: unsupervised and supervised. In the unsupervised approach, algorithms analyse data in order to find patterns and make predictions from that analysis automatically, without any human guidance. In contrast, the supervised approach involves a human in the role of labelling and categorizing the input data (Russell & Norvig, 2021).
Reinforcement learning is one of the most utilized forms of machine learning in practice, and the approach aligns closely with the principle of causal coupling. Reinforcement learning applies rewards to the decision-making agent for actions that the model defines as good and applies penalties for actions that the model defines as bad (Luger & Stubblefield, 1990). Researchers have gone further, deriving the additional concept of inverse reinforcement learning. In such models, decision-making agents are able to seek out new information and then apply this experience-based knowledge to improve their preference functions. Inverse reinforcement learning does not utilize an explicit specification of a reward function; instead, the reward function is inferred from the observed behavior of the decision maker (Ng & Russell, 2000). The field of machine learning continues to evolve, bringing with it powerful new approaches that can improve the predictive power of models. An example is transfer learning, whereby knowledge obtained from a particular problem can be intelligently applied to a new problem to improve outcomes (Russell & Norvig, 2021). Building upon the previous discussion regarding the importance of grounding theory in multiple related disciplines, deep learning is another core branch of machine learning that deserves attention. It leverages artificial neural networks that are inspired by biological systems, with the objective of closely resembling the biological processes of human decision making (LeCun et al., 2015).
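A tabular Q-learning sketch illustrates this reward-and-penalty mechanism; the toy environment, dynamics and parameters below are assumptions for illustration, not drawn from the cited sources.

```python
# Tabular Q-learning sketch: rewards reinforce "good" actions and penalties
# discourage "bad" ones. The toy environment below is invented for illustration.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 3, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.9, 0.1

def step(state, action):
    """Invented dynamics: action 1 moves toward the goal state, which pays off."""
    next_state = min(state + action, n_states - 1)
    reward = 1.0 if next_state == n_states - 1 else -0.1   # benefit vs. ongoing cost
    return next_state, reward

for _ in range(2000):                     # episodes
    s = 0
    for _ in range(10):                   # steps per episode
        a = rng.integers(n_actions) if rng.random() < epsilon else int(Q[s].argmax())
        s_next, r = step(s, a)
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])   # TD update
        s = s_next

print(Q)
```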
The time-based, information-constrained and uncertain process of decision, action, feedback and learning captured in causal economics aligns directly with the context of machine learning. However, one component of causal economics that is missing in traditional machine learning methodology is causation, which is at the heart of causal machine learning and causal economics. Causal inference is addressed in the following section as a result.
2.10. Causal Inference
The previous section sought to demonstrate that applied economics can benefit from the use of machine learning in practice and that machine learning can benefit from explicit use of causal economics in its modelling. But the opportunity for these disciplines to work in tandem for improved results has become greater than ever with the surge in research interest in two areas fundamentally focused on causality: natural experiments in econometrics and causal machine learning in computer science (Huber, 2023; Arti et al., 2020). Causality is at the very heart of causal economic theory through the concept of causal coupling (Horton, 2019), so it is particularly exciting that causality is seeing a surge of interest in the field of economics. This momentum has had an impact, as it is now considered a best practice in empirical work to identify all assumptions necessary to estimate causal effects (Lechner, 2023). Causal inference is vital in modern scientific research, because without it researchers only have correlations to rely on. In today’s highly connected world, data almost always contain variables that impact each other, producing spurious correlations that appear to be causal relationships but are not. It is clear that correlation alone cannot be relied upon for intervention decisions in areas such as healthcare (Sanchez et al., 2022).
2.10.1. Natural Experiments
A recent revolution in empirical research has brought the importance of causality front and center. The current prominence of causality in empirical work is highlighted clearly in the econometrics and statistics literature through the work of 2021 Nobel laureates David Card, Joshua Angrist and Guido Imbens on natural experiments (Hull et al., 2022), and independently in the biostatistics literature through the work of Baker and Lindeman (Baker & Lindeman, 1994). These insights build upon the potential outcomes framework (VanderWeele, 2016) first proposed by Jerzy Neyman and the subsequent Neyman–Rubin causal model (Imbens, 2015) devised by Donald Rubin. This invigorated, cross-disciplinary interest in causality presents an opportunity for causal economic theory, causal machine learning and the natural experiments approach to be used together in future research.
Natural experiments are powerful because they allow researchers to extract causal relationships from the real world, which is much more challenging to work with than data from a controlled clinical trial environment. In the latter, exposure of test and control groups can be controlled to ensure randomization. In natural experiments, exposure of individuals and/or groups to test and control conditions is determined by nature rather than controlled by the experimenter. In nature there is often a confounding variable, one that is related to both the input variable and the output variable, which can distort the causal relationship of interest and even give rise to an apparent causal relationship that is actually spurious. Natural experiments therefore attempt to approximate random assignment. If a researcher suspects that changes in variable A cause changes in variable B, they compare the level of B across systems that vary in their level of A. Instead of manipulating A directly, the researcher analyzes the variation in the level of A observed in nature, not in a lab.
It is much more difficult to interpret cause and effect in natural experiments because individuals have themselves chosen whether to participate in the program or policy of interest. Even in this challenging context, robust conclusions can be drawn from natural experiments in which individuals cannot be forced or forbidden to participate in a policy or program (Imbens & Angrist, 1994), such that changes in outcomes may be plausibly attributed to exposure to the intervention (DiNardo, 2008). Since most randomised experiments in practice do not allow complete control over who actually participates in a policy intervention, these findings are broadly relevant and widely adopted by researchers in practice. Natural experiments provide the most insight when there is a clearly defined exposure to a policy or program intervention for a well-defined subpopulation, and the absence of such exposure in a similar subpopulation. Though extremely powerful, the methods described can only provide an estimate of the effect of a policy intervention on the people who actually changed their behavior as a result of the natural experiment.
The Rubin causal model (RCM) measures the causal effect of a policy by comparing the two alternative potential outcomes that an individual could experience. Since only one can actually be observed, one is always missing, an issue known as the fundamental problem of causal inference (Holland, 1986). The fundamental problem of causal inference means that causal effects cannot be directly measured at an individual level. A researcher must instead turn to population-level estimates of causal effects drawn from randomized experiments (Rubin, 1974). To the extent that an experiment can be randomized, individuals can be assigned to one of two groups, and the difference in mean outcomes between the groups can be used to estimate the causal effect (Angrist & Imbens, 1995), known as the average treatment effect (ATE).
In natural experiments randomization is not always possible, as individuals may select particular outcomes based on other factors. In these cases, techniques such as propensity score matching (Kane et al., 2020) are utilized in an attempt to reduce treatment assignment bias and mimic randomization. This is accomplished by creating a sample of individuals that received the treatment which is comparable on all observed covariates to a sample of units that did not receive the treatment. More precise estimates can be achieved by focusing only on individuals that are compliant (Hull, 2022), which leads to the use of a measure such as the local average treatment effect (LATE), also known as the complier average causal effect (CACE). The LATE is typically calculated either as the ratio of the estimated intent-to-treat effect to the estimated proportion of compliers, or through an instrumental variable estimator. However, this increase in precision comes with the trade-off of ignoring the effect of the non-compliance that is typical in real-world environments.
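The sketch below illustrates these two estimands on simulated data, computing the ATE analogue as a difference in mean outcomes and the LATE via the intent-to-treat/complier-share ratio (the Wald estimator). The data-generating process is an assumption for illustration.

```python
# Sketch of the estimands discussed above on simulated data: difference in mean
# outcomes, intent-to-treat effect, and the LATE via the Wald ratio.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
z = rng.integers(0, 2, n)                 # encouragement / natural assignment
complier = rng.random(n) < 0.6            # 60% comply with assignment
d = np.where(complier, z, 0)              # actual treatment take-up
y = 1.0 * d + rng.normal(0, 1, n)         # true effect of treatment = 1.0

diff_by_treatment = y[d == 1].mean() - y[d == 0].mean()   # difference in mean outcomes
itt = y[z == 1].mean() - y[z == 0].mean()                  # intent-to-treat effect
first_stage = d[z == 1].mean() - d[z == 0].mean()          # estimated share of compliers
late = itt / first_stage                                    # Wald estimator of the LATE
print(f"diff in means: {diff_by_treatment:.2f}, ITT: {itt:.2f}, LATE: {late:.2f}")
```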
An alternative to the RCM is the structural causal model (SCM) (Pearl, 2010), which introduces a graphical approach that can be very helpful in modelling assumptions and in identifying whether an intervention is even possible. The potential outcomes literature is more focused on quantifying the impact of policy and program interventions. In addition, there is ongoing work to potentially unify these methods, such as single world intervention graphs (Sanchez et al., 2022). Continuing in the spirit of unification, there is further opportunity to explore whether the predictive power of experiments can be increased by building the causal definitions of outcomes, preferences and constraints from causal economic theory into research models across broad areas such as computer science, economics and healthcare.
2.10.2. Causal Machine Learning (CML)
Causal machine learning (Huber, 2023) is currently an emerging area of computer science with an expanding literature (Arti & Kusumawardani, 2020). The increased focus on causality in empirical work spans many disciplines, such as computer science, econometrics, epidemiology, marketing, and statistics, as well as business (Lechner, 2023). Some of the most significant work underlying this area is the Neyman–Rubin causal model (Sekhon, 2007) based on the potential outcome approach of statistics (Imbens & Rubin, 2015) discussed in the previous section. Causal machine learning builds upon this underlying work and others in causal inference (Pearl, 2018) and applies machine learning methods (Lechner, 2023) to answer well identified causal questions using large and informative data (Athey, 2017). The continual pursuit of better treatment/policy outcomes by practitioners provides reason to believe that causal inference and machine learning (Cui et al., 2020) will come to play a dominant role in empirical analysis over time.
Pioneering economists are driving the increased use of causal machine learning in econometrics (Athey & Imbens, 2019) and its application to economic policy evaluation (Lechner, 2023). Prominent causal machine learning methods have been developed by economists directly, an example being causal forests, as proposed by Susan Athey and Stefan Wager (Athey & Wager, 2019). The causal forest approach identifies neighbourhoods in the covariate space through recursive partitioning. Each causal tree within the causal forest learns a low-dimensional representation of the heterogeneity of the treatment effect. The causal forest is an average of a large number of individual causal trees, where each tree differs based on subsampling (Athey & Imbens, 2019). Studies find a value-added contribution from the use of causal machine learning in welfare economics applications (Strittmatter, 2023). With the continued expansion of the big data trend, there will most likely be further increased interest in causal machine learning to deal with data sets that have a large number of covariates relative to the number of observations, posing a challenge for non-causal techniques.
Causal machine learning has experienced increasing popularity in healthcare (Sanchez et al., 2022), in tandem with a continued focus on real world evidence (RWE) and the improvement of intervention outcomes (Crown, 2019). Obtaining valid statistical inference when using machine learning in causal research is currently a critical topic in public health and beyond (Naimi & Whitcomb, 2023). Medical researchers and practitioners are making some of the most powerful advances in the use of causal machine learning, asserting that machine learning promises to revolutionize clinical decision making and diagnosis (Richens et al., 2020). For example, a 2020 study conducted by Richens et al. showed that a causal, counterfactual algorithm delivered predictions that ranked within the top 25% of doctors, achieving expert clinical accuracy, compared to a standard associative algorithm that only placed within the top 48% of doctors. These findings demonstrate that causal reasoning is a vital missing ingredient when applying machine learning to medical diagnosis (Richens et al., 2020). Recent research applying causal machine learning to past expansions of health insurance programs has provided tangible and robust insights into potential future programme expansions (Kreif et al., 2021).
A range of powerful meta-algorithms have been developed for causal machine learning, including the S-learner, the T-learner, the X-learner, the R-learner and the doubly robust (DR) learner. Each of these uses slightly different ways of estimating the average output and defining the conditional average treatment effect (Künzel, 2019). For example, in epidemiology, expert researchers suggest priority use of doubly robust estimators, the super learner and sample splitting to reduce bias and improve inference (Balzer & Westling, 2023). Work is ongoing to further categorize and apply causal machine learning (Kaddour et al., 2022).
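As one concrete example of these meta-algorithms, the following is a minimal T-learner sketch on simulated data, assuming random forest outcome models; it is illustrative rather than a reference implementation of any of the cited estimators.

```python
# Minimal T-learner sketch (one of the meta-learners listed above): fit separate
# outcome models for treated and control units, then take the difference of
# their predictions as the estimated conditional treatment effect. Data simulated.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)
n = 5_000
X = rng.normal(size=(n, 3))
T = rng.integers(0, 2, n)
tau = 1.0 + X[:, 0]                            # heterogeneous true effect
Y = X[:, 1] + T * tau + rng.normal(0, 1, n)

m1 = RandomForestRegressor(n_estimators=200).fit(X[T == 1], Y[T == 1])
m0 = RandomForestRegressor(n_estimators=200).fit(X[T == 0], Y[T == 0])
cate_hat = m1.predict(X) - m0.predict(X)       # estimated treatment effect per unit
print("mean estimated effect:", cate_hat.mean().round(2))
```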
The increased interest in causal machine learning is driven by the desire to use knowledge of a causal relationship to better predict outcomes of policies and treatments and to support confident and informed decision making in important situations where reliance on spurious correlations could be dangerous or costly. Causal machine learning is where the theoretical and statistical approaches discussed previously see power in action in a world of big data. It is where the principles of causal economics connect directly with artificial intelligence, where theoretical and applied come together, unified in theory and practice.
2.10.3. Causal Economic Machine Learning (CEML)
Causal economic machine learning (CEML) represents the intersection of causal machine learning and causal economics. It essentially involves replacing traditional economic decision modelling with causal economic decision modelling in the practice of machine learning, particularly in reinforcement learning and deep learning contexts. At the level of individual decision makers, this primarily means modelling outcomes that contain certain and uncertain upfront costs which cause future uncertain benefits and costs. It can also mean applying causal coupling at the macroeconomic policy level to gauge the effectiveness of policies that do a better or worse job of coupling cost and benefit across individuals.
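As a purely speculative illustration of what a CEML-style reward signal could look like inside a reinforcement learning loop, the sketch below charges a largely certain upfront cost each period and pays a causally coupled, uncertain benefit only after a delay. All functional forms and parameters are assumptions, not a published CEML specification.

```python
# Speculative sketch of a CEML-style reward: certain upfront cost now, uncertain
# causally coupled benefit later. Forms and parameters are assumptions only.
import numpy as np

rng = np.random.default_rng(3)

def ceml_reward(effort, periods_elapsed, delay=5, p_success=0.7):
    """Certain cost each period; uncertain benefit arrives only after a delay."""
    upfront_cost = 0.5 * effort
    delayed_benefit = 0.0
    if periods_elapsed >= delay and rng.random() < p_success:
        delayed_benefit = 2.0 * effort         # benefit scales with effort borne earlier
    return delayed_benefit - upfront_cost

# A constant-effort policy evaluated over one ten-period episode:
episode_return = sum(ceml_reward(effort=1.0, periods_elapsed=t) for t in range(10))
print(f"episode return: {episode_return:.2f}")
```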
There is currently tremendous potential for a research agenda in CEML, a causal machine learning framework that employs the model of causal economics, in particular for studies that employ big data and the increasingly powerful software in use today.