1. Introduction
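Statistical mechanics (SM), in the formulation developed by E.T. Jaynes [1, 2], is founded on an entropy optimization principle. Specifically, the Boltzmann entropy is maximized under the constraint of a fixed average energy $\overline{E}$:
$$\overline{E} = \sum_i \rho_i E_i \qquad (1)$$
The Lagrange multiplier equation defining the optimization problem is:
$$\mathcal{L} = -\sum_i \rho_i \ln \rho_i + \lambda \left(1 - \sum_i \rho_i\right) + \beta \left(\overline{E} - \sum_i \rho_i E_i\right)$$
where $\lambda$ and $\beta$ are Lagrange multipliers enforcing the normalization and average energy constraints. Solving this optimization problem yields the Gibbs measure:
$$\rho_i = \frac{1}{Z} \exp(-\beta E_i)$$
where $Z = \sum_j \exp(-\beta E_j)$ is the partition function.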
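As a numerical illustration of the entropy-maximization statement above, the following minimal sketch builds a Gibbs distribution and checks that moving away from it, along a direction that preserves both the normalization and the average energy, lowers the entropy. The energy levels, the multiplier value, and the names `gibbs`, `entropy`, and `direction` are assumptions chosen for the example; they do not come from the paper.

```python
import math

def gibbs(energies, beta):
    """Gibbs probabilities exp(-beta * E_i) / Z for a finite list of energies."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)  # partition function
    return [w / z for w in weights]

def entropy(rho):
    """Entropy -sum_i rho_i ln rho_i, in units of the Boltzmann constant."""
    return -sum(r * math.log(r) for r in rho if r > 0)

energies = [0.0, 1.0, 2.0, 3.0]  # hypothetical energy levels
beta = 0.7                       # hypothetical multiplier value
rho = gibbs(energies, beta)

# A direction d with sum(d) = 0 and sum(d * E) = 0, so that rho + eps * d
# keeps both the normalization and the average energy unchanged.
direction = [1.0, -2.0, 1.0, 0.0]

def perturbed(eps):
    return [r + eps * d for r, d in zip(rho, direction)]

base = entropy(rho)
nearby = [entropy(perturbed(eps)) for eps in (-0.01, 0.01)]
print(base, nearby)
```

Both nearby entropies come out smaller than `base`, as expected for a constrained maximum of a strictly concave function.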
For comparison, quantum mechanics (QM) is not formulated as the solution to an optimization problem, but rather consists of a collection of axioms [3, 4]:
- QM Axiom 1 of 5
State Space: Every physical system is associated with a complex Hilbert space, and its state is represented by a ray (an equivalence class of vectors differing by a non-zero scalar multiple) in this space.
- QM Axiom 2 of 5
Observables: Physical observables correspond to Hermitian (self-adjoint) operators acting on the Hilbert space.
- QM Axiom 3 of 5
Dynamics: The time evolution of a quantum system is governed by the Schrödinger equation, where the Hamiltonian operator represents the system’s total energy.
- QM Axiom 4 of 5
Measurement: Measuring an observable projects the system into an eigenstate of the corresponding operator, yielding one of its eigenvalues as the measurement result.
- QM Axiom 5 of 5
Probability Interpretation: The probability of obtaining a specific measurement outcome is given by the squared magnitude of the projection of the state vector onto the relevant eigenstate (Born rule).
Physical theories have traditionally been constructed in two distinct ways. Some, like quantum mechanics, are defined through a set of mathematical axioms that are first postulated and then verified against experiments. Others, like statistical mechanics, emerge as solutions to optimization problems with experimentally-verified constraints.
We propose to generalize the optimization methodology of E.T. Jaynes to encompass all of physics, aiming to derive a unified theory from a single optimization problem.
To that end, we introduce the following constraint:
Axiom 1 (Natural Constraint).
where are matrices, and is their average.
This constraint, as it replaces the scalar with the matrix , extends E.T. Jaynes’ optimization method to encompass non-commutative observables and symmetry group generators required for fundamental physics.
We then construct an optimization problem:
Definition 1 (Physics).
Physics is the solution to:
where λ and τ are Lagrange multipliers enforcing the normalization and natural constraints, respectively.
This single definition constitutes our complete proposal for reformulating fundamental physics—no additional principles will be introduced. By replacing the Boltzmann entropy with the relative Shannon entropy, the optimization problem extends beyond thermodynamic variables to encompass any type of experiment. This generalization occurs because relative entropy captures the essence of any experiment: the relationship between a final measurement state and its initial preparation state.
Two key constraints shape our framework. The normalization constraint ensures we are working with a proper predictive theory, while the natural constraint spawns the domain of applicability of the theory. Together, they capture the complete evolution—from initial to final states—that defines any experiment. The crucial insight is that because our formulation maintains complete generality in the structure of experiments while optimizing over all possible predictive theories, the resulting solution holds true, by construction, for all realizable experiments within the domain.
The solution provides a complete set of axioms that automatically satisfy the requirements of a physical theory valid over the domain spawned by the constraint: mathematical rigour, internal consistency, optimal predictive power, and automatic applicability to all realizable experiments in its domain. This approach reduces our reliance on postulating axioms through trial and error, and simplifies the foundations of physics. Specifically, when we employ the natural constraint (the most permissive constraint for this problem), the solution points toward a unified physics where statistical mechanics, quantum mechanics, general relativity, and the Standard Model gauge symmetries emerge naturally. Importantly, this emergence occurs without additional assumptions and without generating unwanted artifacts like extra dimensions or unobserved gauge symmetries.
Theorem 1.
The general solution of the optimization problem is:
Proof. We solve the maximization problem by setting the derivative of the Lagrangian with respect to
to zero:
Normalizing the probabilities using
, we find:
Substituting back, we obtain:
Finally, using the identity
for square matrices
, we get:
□
where .
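The proof above relies on an identity for square matrices. Assuming it is the standard relation between the determinant of a matrix exponential and the exponential of the trace, det(exp(M)) = exp(tr(M)), it can be checked numerically with the small sketch below. The helper `expm` is a plain truncated Taylor series written for this illustration, and the test matrix `M` is arbitrary.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(m, terms=40):
    """Matrix exponential by truncated Taylor series (fine for small norms)."""
    n = len(m)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term becomes m^k / k!
        term = [[x / k for x in row] for row in matmul(term, m)]
        result = [[r + t for r, t in zip(rr, tt)] for rr, tt in zip(result, term)]
    return result

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

M = [[0.3, -1.2], [0.8, -0.5]]  # arbitrary 2x2 example
lhs = det2(expm(M))
rhs = math.exp(trace(M))
print(lhs, rhs)
```

The two printed values agree to within floating-point error.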
This solution encapsulates three distinct special cases:
-
Statistical Mechanics:
To recover statistical mechanics from Equation (10), we consider the case where the matrices are $1 \times 1$, i.e., scalars. Specifically, we set:
and take
to be a uniform distribution. Then, Equation (10) reduces to the Gibbs distribution:
where
corresponds to
in traditional statistical mechanics. This demonstrates that our solution generalizes SM, as it recovers it when
are scalars.
-
Quantum Mechanics:
By choosing
to generate the U(1) group, we derive the axioms of quantum mechanics from entropy maximization. Specifically, we set:
where
are energy levels. In the results section, we will detail how this choice leads to a probability measure that includes a unitarily invariant ensemble and the Born rule, satisfying all five axioms of QM.
-
Fundamental Physics:
Extending our approach, we choose
to be
matrices representing the generators of the $\mathrm{Spin}^c(3,1)$ group. Specifically, we consider multivectors of the form
, where
is a bivector and
is a pseudoscalar of the 3+1D geometric algebra
. The matrix representation of
is:
where
, and
b correspond to the generators of the $\mathrm{Spin}^c(3,1)$ group, which includes both Lorentz transformations and U(1) phase rotations. This choice leads to a relativistic quantum probability measure:
where
emerges as a parameter generating boosts, rotations, and phase transformations.
In the results section, we show that the associated Dirac current is automatically invariant under the gauge symmetries of the Standard Model, specifically SU(3), SU(2) and U(1). We then show that it further suggests a quantum theory of gravity naturally incorporating the U(1) and SO(3,1) symmetries.
-
Dimensional Obstructions:
Axiom 1 yields valid probability measures only in specific geometric cases. Beyond the instances of statistical mechanics and quantum mechanics, Axiom 1 yields a consistent solution only in 3+1 dimensions. In other dimensional configurations, various obstructions arise, each violating the axioms of probability theory. The following table summarizes the geometric cases and their obstructions:
where
means the geometric algebra of
dimensions.
We will first investigate the unobstructed cases in Section 2.1, Section 2.2 and Section 2.3, and then demonstrate the obstructions in Section 2.4. These obstructions are desirable because they automatically limit the theory to 3+1D, thus providing a built-in mechanism for the observed dimensionality of our universe.
2. Results
2.1. Quantum Mechanics
In statistical mechanics (SM), the central observation is that energy measurements of a thermally equilibrated system tend to cluster around a fixed average value (Equation 1). In contrast, quantum mechanics (QM) is characterized by the presence of interference effects in measurement outcomes. To capture these features within an entropy maximization framework, we introduce the following special case of Axiom 1:
Definition 2 (U(1) Generating Constraint). We reduce the generality of Axiom 1 to the generator of the U(1) group. Specifically, we replace
where are scalar values (e.g., energy levels), are the probabilities of outcomes, and the matrices generate the U(1) group.
The general solution of the optimization problem reduces as follows:
Though initially unfamiliar, this form effectively establishes a comprehensive formulation of quantum mechanics, as we will demonstrate.
To align our results with conventional quantum mechanical notation, we translate the matrices to complex numbers. Specifically, we consider that:
Then, we note the following equivalence with the complex norm:
Finally, substituting
analogously to
, and applying the complex-norm representation to both the numerator and the denominator, consolidates the Born rule, normalization, and initial preparation into a single expression:
The wavefunction emerges by decomposing the complex norm into a complex number and its conjugate. It is then visualized as a vector within a complex n-dimensional Hilbert space. The partition function acts as the inner product. This relationship is articulated as follows:
where
We clarify that represents the probability associated with the initial preparation of the wavefunction, where .
We also note that Z is invariant under unitary transformations.
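The translation from matrices to complex numbers can be illustrated with the usual real 2x2 representation of a complex number. This is a sketch under the assumption that the U(1) generator is represented by the antisymmetric matrix J = [[0, -1], [1, 0]], which plays the role of the imaginary unit; the paper's exact matrix is not reproduced here. The names `as_matrix` and `exp_generator`, and the values of `tau` and `energy`, are hypothetical.

```python
import cmath
import math

def as_matrix(z):
    """Real 2x2 matrix standing in for the complex number z = a + ib."""
    return [[z.real, -z.imag], [z.imag, z.real]]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def exp_generator(theta):
    """Closed form of exp(theta * J), J = [[0, -1], [1, 0]]: a rotation by theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

tau, energy = 0.4, 2.5  # hypothetical multiplier and energy level
phase = cmath.exp(-1j * tau * energy)

# exp(-tau * energy * J) and the matrix of exp(-i * tau * energy) coincide.
matrix_from_phase = as_matrix(phase)
matrix_from_generator = exp_generator(-tau * energy)

# The determinant of the matrix of z equals the squared modulus of z.
z = 3.0 - 4.0j
det_of_z = det2(as_matrix(z))
print(matrix_from_phase, matrix_from_generator, det_of_z)
```

In this representation the determinant is the complex norm, which is the kind of equivalence the passage above appeals to.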
Let us now investigate how the axioms of quantum mechanics are recovered from this result:
The entropy maximization procedure inherently normalizes the vectors, which links each of them to a unit vector in Hilbert space. Furthermore, as physical states are associated with the probability measure, and the probability is defined up to a phase, we conclude that physical states map to rays within Hilbert space. This demonstrates QM Axiom 1 of 5.
-
In
Z, an observable must satisfy:
Since , then any self-adjoint operator satisfying the condition will satisfy the above equation, simply because . This demonstrates QM Axiom 2 of 5.
-
Upon transforming Equation (43) out of its eigenbasis through unitary operations, we find that the energy,
, typically transforms in the manner of a Hamiltonian operator:
The system’s dynamics emerge from differentiating the solution with respect to the Lagrange multiplier. This is manifested as:
which is the Schrödinger equation. This demonstrates QM Axiom 3 of 5.
-
From Equation (43) it follows that the possible microstates
of the system correspond to specific eigenvalues of
. An observation can thus be conceptualized as sampling from
, with the measured state being the occupied microstate
i. Consequently, when a measurement occurs, the system invariably emerges in one of these microstates, which directly corresponds to an eigenstate of
. Measured in the eigenbasis, the probability measure is:
In scenarios where the probability measure
is expressed in a basis other than its eigenbasis, the probability
of obtaining the eigenvalue
is given as a projection onto an eigenstate:
Here, signifies the squared magnitude of the amplitude of the state when projected onto the eigenstate . As this argument holds for any observable, this demonstrates QM Axiom 4 of 5.
Finally, since the probability measure (Equation 41) replicates the Born rule, QM Axiom 5 of 5 is also demonstrated.
Revisiting quantum mechanics from this perspective offers a coherent and unified narrative. Specifically, the U(1) generating constraint is sufficient to entail the foundations of quantum mechanics (QM Axioms 1 through 5) through the principle of entropy maximization: the axioms are shown to be the solution to an optimization problem.
2.2. RQM in 2D
In this section, we investigate a model, isomorphic to quantum mechanics, that lives in 2D and provides a valuable starting point before addressing the more complex 3+1D case. In RQM 2D, the fundamental Lagrange multiplier equation is:
where
and
are the Lagrange multipliers, and where
is the
matrix representation of the multivectors of
.
In general a multivector
of
, where
a is a scalar,
is a vector and
a pseudo-scalar, is represented as follows:
This holds for any matrix and any multivectors of .
The basis elements are defined as:
To investigate this case in more detail, we introduce the multivector conjugate, also known as the Clifford conjugate, which generalizes the concept of complex conjugation to multivectors.
Definition 3 (Multivector Conjugate).
Let be a multivector of the geometric algebra over the reals in two dimensions . The multivector conjugate is defined as:
The determinant of the matrix representation of a multivector can be expressed as a multivector self-product:
Theorem 2 (The Determinant in Multivector Self-Product Form).
Proof. Let
, and let
be its matrix representation
. Then:
□
This theorem establishes a connection between the determinant and a multivector self-product form. By expressing the determinant in a self-product form, we can later restrict our attention to the even subalgebra of GA(2), where this self-product form becomes positive-definite, thus yielding a proper inner product. This construction will be essential for developing quantum mechanical structures such as probability measures and observables within the geometric algebra framework.
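The relation between the determinant and the conjugate self-product can be checked on a concrete case. The sketch below assumes one common 2x2 real representation of GA(2), with e1 = diag(1, -1), e2 = [[0, 1], [1, 0]] and e12 = e1 e2, and takes the Clifford conjugate to flip the sign of the vector and bivector parts. The representation and the names `mat` and `conjugate_mat` are assumptions for this example and may differ from the basis used in the paper.

```python
def mat(a, x, y, b):
    """Matrix of u = a + x e1 + y e2 + b e12 in the assumed representation."""
    return [[a + x, y + b], [y - b, a - x]]

def conjugate_mat(a, x, y, b):
    """Matrix of the Clifford conjugate a - x e1 - y e2 - b e12."""
    return mat(a, -x, -y, -b)

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

a, x, y, b = 2, 3, -1, 5  # arbitrary integer coefficients
self_product = matmul(conjugate_mat(a, x, y, b), mat(a, x, y, b))
determinant = det2(mat(a, x, y, b))

# On the even subalgebra (x = y = 0) the self-product is a*a + b*b >= 0.
even_value = det2(mat(a, 0, 0, b))
print(self_product, determinant, even_value)
```

The self-product is the determinant times the identity matrix, here 19, and the even-subalgebra value a² + b² = 29 is non-negative, which is the positive-definiteness used later in the section.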
Building upon the concept of the multivector conjugate, we introduce the multivector conjugate transpose, which serves as an extension of the Hermitian conjugate to the domain of multivectors.
Definition 4 (Multivector Conjugate Transpose).
Let :
The multivector conjugate transpose of is defined as first taking the transpose and then the element-wise multivector conjugate:
Definition 5 (Bilinear Form).
Let and be two vectors valued in . We introduce the following bilinear form:
Theorem 3 (Inner Product). Restricted to the even sub-algebra of , the bilinear form is an inner product.
Proof.
This is isomorphic to the inner product of a complex Hilbert space, with the identification
. □
Let us now solve the optimization problem for the even multivectors of , whose inner product is positive-definite.
We take
then
reduces as follows:
The Lagrange multiplier equation can be solved as follows:
The partition function
, serving as a normalization constant, is determined as follows:
Consequently, the optimal probability measure that connects an initial preparation
to a final measurement
, in 2D is:
Definition 6 (Spin(2)-valued Wavefunction).
where represents the square root of the probability and represents a rotor in 2D.
The partition function of the probability measure can be expressed using the bilinear form applied to the Spin(2)-valued Wavefunction:
Theorem 4 (Partition Function).
Definition 7 (Spin(2)-valued Evolution Operator).
Theorem 5. The partition function is invariant with respect to the Spin(2)-valued evolution operator.
Proof. We note that:
then, since
, the relation
is satisfied. □
We note that the even sub-algebra of , being closed under addition and multiplication and constituting an inner product through its bilinear form, allows for the construction of a Hilbert space. In this context, the Hilbert space is Spin(2)-valued. The primary distinction between a wavefunction in a complex Hilbert space and one in a Spin(2)-valued Hilbert space lies in the subject matter of the theory. Specifically, in the latter, the construction governs the change in orientation experienced by an observer (versus change in time), which in turn dictates the measurement basis used in the experiment, consistently with the rotational symmetry and freedom of the system.
The dynamics of observer orientation transformations are described by a variant of the Schrödinger equation, which is derived by taking the derivative of the wavefunction with respect to the Lagrange multiplier, :
Definition 8 (Spin(2)-valued Schrödinger Equation).
where θ is a global one-parameter evolution parameter akin to time, which transforms the wavefunction under the Spin(2) group, locally across the states of the Hilbert space. This general equation captures all transformations consistent with the symmetries of the wavefunction for the Spin(2) group.
Definition 9 (David Hestenes’ Formulation).
In 3+1D, David Hestenes’ formulation [5] of the wavefunction is , where is a Lorentz boost or rotation and where is a phase. In 2D, as the algebra only admits a bivector, his formulation would reduce to , which is the form we have recovered.
The definition of the Dirac current applicable to our wavefunction follows the formulation of David Hestenes:
Definition 10 (Dirac Current).
Given the basis and , the Dirac current for the 2D theory is defined as:
where and are SO(2)-rotated basis vectors.
2.2.1. 1+1D Obstruction
As stated in the introduction, of the dimensional cases, only 2D and 3+1D are free of obstructions. For instance, the 1+1D theory results in a split-complex quantum theory due to the bilinear form , which yields negative probabilities: for certain wavefunction states, in contrast to the non-negative probabilities obtained in the Euclidean 2D case. This is why we had to use 2D instead of 1+1D in this two-dimensional introduction. In the following section, we will investigate the 3+1D case, then we will show why all other dimensional cases are obstructed.
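The contrast between the Euclidean 2D case and the 1+1D case can be made concrete by comparing the complex and split-complex norms. The sketch assumes that the even subalgebra in 1+1D behaves like the split-complex numbers, with a unit j satisfying j*j = +1, as the paragraph above indicates. The function names and sample values are illustrative only.

```python
def complex_norm_sq(a, b):
    """(a + b i)(a - b i) with i*i = -1."""
    return a * a + b * b

def split_complex_norm_sq(a, b):
    """(a + b j)(a - b j) with j*j = +1."""
    return a * a - b * b

samples = [(1, 0), (1, 2), (3, 1), (0, 1)]  # hypothetical amplitudes (a, b)
euclidean = [complex_norm_sq(a, b) for a, b in samples]
lorentzian = [split_complex_norm_sq(a, b) for a, b in samples]
print(euclidean)
print(lorentzian)
```

The complex norm is never negative, whereas the split-complex norm is negative whenever |b| > |a|, so it cannot serve as a probability weight for those states.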
2.3. RQM in 3+1D
Extending the framework to relativistic quantum mechanics (RQM) begins by considering measurements relative to $\mathrm{Spin}^c(3,1)$ symmetries. This allows for transformations that include boosts, rotations, and phases, enabling relativistic consistency within the same entropy-based framework.
Here, we will develop the framework for the continuum case
which can be obtained by solving the following Lagrangian equation:
where
is a counter-term ensuring the integral remains invariant under coordinate transformations,
is a "twisted-phase" rapidity, and
Here,
, and
b correspond to the generators of the $\mathrm{Spin}^c(3,1)$ group, which includes both Lorentz transformations and U(1) phase rotations.
The solution (proof in Appendix B) is obtained using the same step-by-step process as the 2D case, and yields a probability density:
2.3.1. Probability Measure
As we did in the 2D case, our goal here will be to express the partition function as a self-product of elements of the vector space. As such, we begin by defining a general multivector in the geometric algebra .
Definition 11 (Multivector).
Let be a multivector of . Its general form is:
where are the basis vectors in the real Majorana representation.
A more compact notation for is
where a is a scalar, a vector, a bivector, is a pseudo-vector and a pseudo-scalar.
This general multivector can be represented by a real matrix using the real Majorana representation:
Definition 12 (Matrix Representation of
).
To manipulate and analyze multivectors in , we introduce several important operations, such as the multivector conjugate, the 3,4 blade conjugate, and the multivector self-product.
Definition 13 (Multivector Conjugate (in 4D)).
Definition 14 (3,4 Blade Conjugate).
The 3,4 blade conjugate of is
Lundholm [6] proposes a number of multivector norms, and shows that they are the unique forms which carry the properties of the determinant, such as
to the domain of multivectors:
Definition 15.
The self-products associated with low-dimensional geometric algebras are:
We can now express the determinant of the matrix representation of a multivector via the self-product . This choice is not arbitrary, but the unique choice which allows us to represent the determinant of the matrix representation of a multivector within :
Theorem 6 (Determinant as a Multivector Self-Product).
Proof. A computer-assisted proof of this equality is given in Annex C. □
As can be seen from this theorem, the relationship between determinants and multivector products becomes more sophisticated in 3+1D. Unlike the 2D case where the determinant could be expressed using a product of two terms, in GA(3,1) the determinant requires two products involving four copies of the multivector. This is reflected in the structure , which cannot be reduced to a simpler self-product of two terms. However, when we restrict to invertible even-multivectors, this double-product becomes positive-definite as it did in GA(2), making it suitable as a probability measure despite having four terms.
The solution to the optimization problem is a subset of the even sub-algebra of GA(3,1), which includes the scalar a, the bivector , and the pseudoscalar . Specifically, since solving the optimization problem involves an exponentiation step that guarantees invertibility, the wavefunction is a vector whose elements are valued in the invertible even-multivectors of GA(3,1):
Definition 16 (
-valued Wavefunction).
Theorem 7 (Positive-Definite Probability). The double-product is positive-definite on ψ.
Proof.
Since
is in
, then it is positive-definite, yielding a value of zero only for the zero-element and positive otherwise. □
The set of all transformations which maps to another valid wavefunction, and leaves the partition function invariant, is a local gauge valued in (3,1):
Definition 17 (
Evolution Operator).
In turn, this leads to a variant of the Schrödinger equation obtained by taking the derivative of the wavefunction with respect to the Lagrange multiplier :
Definition 18 (
-valued Schrödinger equation).
In this case represents a one-parameter evolution parameter akin to time, which is able to transform the measurement basis of the wavefunction under action of the group.
Theorem 8 (Local
(3,1) invariance).
Let be a general element of (3,1). Then, the equality:
is always satisfied.
2.3.2. RQM
Definition 19 (David Hestenes’ Wavefunction).
The -valued wavefunction we have recovered is formulated identically to David Hestenes’ [5] formulation of the wavefunction within GA(3,1).
where , and . Here, is a probability density, is a rotor and is a complex phase.
Before we continue the RQM investigation, let us note that the double-product contains two copies of a bilinear form
:
In the present and upcoming section, we will investigate the properties of each product individually, leaving the properties specific to the double-product for the section on quantum gravity.
Taking a single copy, the Dirac current is obtained directly from the gamma matrices, as follows:
Definition 20 (Dirac Current).
The definition of the Dirac current is the same as Hestenes’:
where is a SO(3,1) rotated basis vector.
2.3.3. Standard Model Gauge Symmetries
Based on some results of Hestenes and Lasenby [7, 8], we will now demonstrate that the double-product is automatically invariant under transformations corresponding to the U(1), SU(2), and SU(3) symmetries, which play fundamental roles in the Standard Model of particle physics. These symmetries constitute the set of all transformations that leave the Dirac current invariant, i.e.,
with
T valued in the even subalgebra of
.
Theorem 9 (U(1) Invariance). Let be a general element of U(1). Then, the equality
is satisfied, yielding a U(1) symmetry for each copied bilinear form.
Proof. Equation 119 is invariant if this expression is satisfied:
This is always satisfied simply because □
Theorem 10 (SU(2) Invariance). Let be a general element of Spin(3,1). Then, the equality:
is satisfied for if (which generates SU(2)), yielding a SU(2) symmetry for each copied bilinear form.
Proof. Equation 121 is invariant if this expression is satisfied [7]:
We now note that moving the left-most term to the right of the gamma matrix yields:
Therefore, the product
reduces to
if and only if
, leaving
:
Finally, we note that generates . □
Theorem 11 (SU(3)). The equality
identifies the bivectors that parametrize a subspace of the wavefunction that admits SU(3) symmetry.
Proof. First, we note the following action:
which we can rewrite as follows:
The first three terms anticommute with
, while the last three commute with
:
This can be written as:
where
and
.
Thus, for
, we require: 1)
and 2)
. The first requirement expands as follows:
which, as the norm of three complex numbers, is the defining condition for the SU(3) symmetry group.
We can show that
is invertible by showing that the norm is never 0:
We note that for
, it is necessary but not sufficient that
, which is a dot product. However, the second condition for SU(3),
is a cross product. Since the dot and cross products of two nonzero vectors cannot both be zero,
. Therefore
is invertible. Since
is invertible, there exists an exponential form where
. As such,
can be understood as a wavefunction state, which means that the wavefunction contains a structure that evolves according to SU(3) under adjoint action on
. Consequently, there exists a representation in which the Gell-Mann matrices that generate the SU(3) group would act on the three complex numbers comprising
:
□
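The step in the proof stating that the dot and cross products of two vectors cannot both vanish follows, for nonzero real 3-vectors, from the Lagrange identity (a·b)² + |a×b|² = |a|²|b|². The sketch below checks that identity on one example. Treating the objects in the proof as ordinary real 3-vectors is an assumption made only for this illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

a = (1, 2, 3)   # arbitrary nonzero vectors
b = (4, -5, 6)

c = cross(a, b)
lhs = dot(a, b) ** 2 + dot(c, c)
rhs = dot(a, a) * dot(b, b)
print(lhs, rhs)
```

Since the right-hand side is strictly positive for nonzero vectors, the two terms on the left cannot both be zero.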
In conventional QM, the Born rule naturally leads to a U(1)-valued gauge theory due to the following symmetry:
However, the SU(2) and SU(3) symmetries do not emerge from the probability measure in the same straightforward manner and are typically introduced by hand, justified by experimental observations. This raises the question: why these specific symmetries and not others? In contrast, within our framework, all three symmetry groups (U(1), SU(2), and SU(3)), as well as the local gauge symmetry, follow naturally from the invariance of the probability measure, in the same way that the U(1) symmetry follows from the Born rule. This suggests a deeper underlying principle governing the symmetries in fundamental physics.
Can we go further than SU(3)—for instance, is SU(4) also supported? Let us now investigate what happens if we demand invariance under the full even multivector:
The constraints that need to be satisfied for this equality are:
These conditions look like they enforce an SU(4)-invariant structure:
However, unlike the SU(3) case, here the remaining constraints are not sufficient to guarantee that
is invertible. As such, they do not describe a valid wavefunction state. Thus, the sequence of permissible local gauges does not include SU(4)—it ends at SU(3).
2.3.4. Gravity via the Double-Product
We recall the definition of the metric tensor in terms of basis vectors of geometric algebra, as follows:
Then, we note that the double-product acts on a pair of basis elements
and
, as follows:
where
and
are SO(3,1) rotated basis vectors, and where
is a probability measure, then dividing by
, we get
.
As one can swap and and obtain the same metric tensor, the double-product guarantees that is symmetric.
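The symmetry of the metric under exchange of the two basis vectors can be seen directly in a matrix representation. The sketch below assumes the standard geometric-algebra relation in which the metric component is the scalar part of half the anticommutator of two basis vectors, and it uses one possible set of real 4x4 matrices with signature (-, +, +, +) built from Kronecker products. These matrices are an assumption for the example and are not claimed to be the paper's Majorana representation.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
J = [[0, 1], [-1, 0]]  # real matrix with J*J = -identity

# Assumed basis: the first element squares to -1, the other three to +1.
gammas = [kron(J, X), kron(X, I2), kron(Z, I2), kron(J, J)]

def symmetrized(a, b):
    """Half of the anticommutator a*b + b*a (entries are even integers here)."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[(p + q) // 2 for p, q in zip(r1, r2)] for r1, r2 in zip(ab, ba)]

def scalar_part(m):
    """Return c if m equals c times the identity, otherwise raise."""
    c = m[0][0]
    n = len(m)
    if any(m[i][j] != (c if i == j else 0) for i in range(n) for j in range(n)):
        raise ValueError("not a multiple of the identity")
    return c

metric = [[scalar_part(symmetrized(gammas[mu], gammas[nu])) for nu in range(4)]
          for mu in range(4)]
print(metric)
```

The resulting array is diagonal and symmetric by construction, because the anticommutator is unchanged when its two arguments are swapped.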
Furthermore, since
, we get:
which allows us to conclude that
and
are self-adjoint within the double-product, entailing the interpretation of
as an observable.
This yields a definition of the metric tensor as follows:
Dynamics: In the double-product, the metric tensor emerges as a product of Dirac currents. This formulation suggests that the metric tensor encodes the probabilistic structure of the theory in the form of a symmetric rank-2 tensor, analogous to how the Dirac current encodes the probabilistic structure of a special relativistic quantum theory in the form of a 4-vector.
Let us now investigate the dynamics. We recall that the evolution operator (Definition 17) is:
Acting on the wavefunction, the effect of this operator cascades down to the basis vectors via the double-product in the form of a local
(3,1) gauge:
which realizes an
invariant transformation of the metric tensor via action of the exponential of a bivector, and a U(1) invariant transformation via action of the exponential of a pseudo-scalar:
In summary, this investigation has identified a scenario in which the metric tensor is measured using basis vectors. The evolution operator, governed by a (3,1)-valued Schrödinger equation, dynamically realizes SO(3,1) transformations on the metric tensor. Furthermore, the amplitudes associated with possible metric tensors are derived from a double-product acting on the basis vectors. This formulation simultaneously preserves the SO(3,1) symmetry, essential for describing spacetime structure, and the unitary symmetry, fundamental to quantum mechanics. It describes all changes of basis transformations that an observer in 3+1D spacetime can perform prior to measuring (in the quantum sense) a basis system in spacetime, and attributes a probability to the outcome (the outcome being the metric tensor).
Equivalence to frame fields: Adding a SO(3,1) symmetry to a geometric algebra basis vector and via adjoint action of the wavefunction’s rotors (i.e., ) causes these basis elements to acquire the same degrees of freedom as a frame field and , which includes the basis vector and a SO(3,1) orientation for that vector.
2.3.5. Local Gauges
The double-product probability measure automatically admits the following invariant gauges:
SO(3,1) (Equation 142), via adjoint action of Spin(3,1) elements of the basis elements within the double-product, leaving the metric tensor invariant:
SU(2)xU(1) (Theorems 9 and 10), via adjoint action of
(3,1) identifying the SU(2)xU(1) as the stabilizing subgroup of SO(3,1) acting on
:
SU(3) (Theorem 11), as the group that acts on a substructure of
that remains invariant to the adjoint action of
stabilizing
:
Gauging the SO(3,1) group yields a gravitational theory [9], whereas gauging the SU(2)xU(1) and SU(3) groups yields the electroweak and strong forces, respectively.
2.3.6. Gravity, Quantum Gravity and Future Work
One possible path of exploration would be to use our double-product structure, the wavefunction and the basis elements of GA to construct the Einstein-Hilbert action:
and its quantum counterpart:
However, this approach may not be entirely satisfying, because it does not derive the Einstein-Hilbert action from first principles—it merely re-expresses it using elements from our framework.
Ricci Scalar: To remedy this situation, let us consider the commutator. We have already seen that the anti-commutator double-product leads to
:
but what about the commutator? To define it, we will replace the basis vectors
and
with an observable of the dynamics (i.e., involving
and
). The result is a measurement of the dynamics of the system:
which if we contract with
and
, yields
R, the Ricci scalar. To obtain this result we have assumed that
is an equiprobable measure not dependent on
x, thus it is a constant. As we will see in Equation (168), this assumption is acceptable for the Einstein-Hilbert action.
Lagrangian: Now that we have the Ricci scalar, we must incorporate it in an integral over spacetime so that it becomes a Lagrangian. To achieve this, we aim to derive the very concept of the Lagrangian entirely from entropy maximization. Consider the following Lagrange multiplier equation:
This equation is similar to Equation (87), the Lagrange multiplier equation of the 3+1D case, except that we have made explicit the integration variables, and we considered the possibility of curvature via
. Solving this optimization problem gives the following partition function:
The evolution term
is normally utilized to produce a Schrödinger-like equation acting on the wavefunction, but in the integral it offers no contribution because the determinant reduces it to unity, simplifying to:
Since we normalized over spacetime, not just space, we obtain a statistical theory of events in spacetime. In this construction, a measurement collapses the wavefunction to a point in spacetime at with probability , thus signalling the end of the experiment via a measurement event.
We now introduce an observable for this probability measure: Our choice is
(i.e., the Lagrangian). Specifically, we use our previous definition of a commutator over the dynamical observables
and
(Equation 159):
where
is a dimensional constant. The equation reduces to:
Finally, we consider the initial preparation
to attribute equal probabilities to all states. As such, it becomes a constant that no longer depends on
x. This assumption was also utilized in Equation (159). This prevents
from dynamically contributing to the equation of motion. As it is a constant, we may absorb it along with
into the correct dimensional factor
. Consequently, the Einstein-Hilbert action is obtained:
which yields the Einstein field equations (EFE), derived from first principles as the natural dynamical theory in curved 3+1D spacetime.
A more detailed analysis of these equations and their quantization, and the resulting interpretation tying the Lagrangian to a concluding measurement on an experiment conducted from to , will be done in a future paper.
2.4. Dimensional Obstructions
In this section, we explore the dimensional obstructions that arise when attempting to resolve the entropy maximization problem for other dimensional configurations. We found that all geometric configurations except the previously explored cases are obstructed. By obstructed, we mean that the solution to the entropy maximization problem,
, does not satisfy all axioms of probability theory.
Let us now demonstrate the obstructions mentioned above.
Theorem 12 (Non-real probabilities). The determinant of the matrix representation of the geometric algebras in this category is either complex-valued or quaternion-valued, making it unsuitable as a probability.
Proof. These geometric algebras are classified as follows:
The determinant of these objects is valued in $\mathbb{C}$ or in $\mathbb{H}$, where $\mathbb{C}$ are the complex numbers and $\mathbb{H}$ are the quaternions. □
Theorem 13 (Negative probabilities). For these dimensional configurations, the even sub-algebra, which is associated with the RQM part of the theory, allows for negative probabilities, making them unsuitable.
Proof. This category contains three dimensional configurations:
-
:
Let
, then:
which is valued in
.
-
:
Let
, then:
which is valued in
.
-
:
Let
, where
, then:
We note that
, therefore:
which is valued in
.
In all of these cases the probability can be negative. □
Conjecture 1 (No observables (6D)). The multivector representation of the norm in 6D cannot satisfy any observables.
Argument. In six dimensions and above, the self-product patterns found in Definition 15 collapse. The research by Acus et al. [
10] in 6D geometric algebra concludes that the determinant, so far defined through self-products of the multivector, fails to extend into 6D. The crux of the difficulty is evident in the reduced case of a 6D multivector containing only scalar and grade-4 elements:
This equation is not a multivector self-product but a linear sum of two multivector self-products [
10].
The full expression is given in the form of a system of 4 equations, which is too long to list in its entirety. A small characteristic part is shown:
From Equation (
204), it is possible to see that no observable
can satisfy this equation because the linear combination does not allow one to factor it out of the equation.
Any equality of the above type between
and
is frustrated by the factors
and
, forcing
as the only satisfying observable. Since the obstruction occurs within grade-4, which is part of the even sub-algebra, it is questionable whether a satisfactory theory (with non-trivial observables) can be constructed in 6D using our method. □
This conjecture proposes that the multivector representation of the determinant in 6D does not allow for the construction of non-trivial observables, which is a crucial requirement for a relevant quantum formalism. The linear combination of multivector self-products in the 6D expression prevents the factorization of observables, limiting their role to the identity operator.
Conjecture 2 (No observables (above 6D)). The norms beyond 6D are progressively more complex than the 6D case, which is already obstructed.
These theorems and conjectures provide additional insights into the unique role of the unobstructed 3+1D signature in our proposal.
It is also interesting that our proposal is able to rule out GA(1,3), even though in relativity the signature of the metric, (+,−,−,−) versus (−,+,+,+), does not influence the physics. However, in geometric algebra, GA(1,3) represents 1 space dimension and 3 time dimensions. Therefore, it is not the signature itself that is ruled out but rather the specific arrangement of 3 time and 1 space dimensions, as this configuration yields quaternion-valued "probabilities" (i.e., and ).
3. Discussion
When asked to define what a physical theory is, an informal answer may be that it is a predictive framework of measurements that applies to all possible experiments realizable within a domain, with nature as a whole being the most general domain. While physicists have expressed these theories through sets of axioms, we propose a more direct approach: mathematically realizing this fundamental definition itself. This definition is realized as an optimization problem (Definition 1), which can then be solved. The solution to this optimization problem yields precisely those axioms that realize the physical theory over said domain. Succinctly, physics is the solution to:
The relative Shannon entropy represents the basic structure of any experiment, quantifying the informational difference between its initial preparation and its final measurement.
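As a small numerical illustration of this quantity, the sketch below evaluates a discrete relative Shannon entropy between a preparation `p` and a measurement distribution `rho`. The sign convention $S = -\sum_i \rho_i \ln(\rho_i / p_i)$, the function name `relative_entropy`, and the example distributions are assumptions made here for illustration; they are not taken from the paper's definitions.

```python
import numpy as np

def relative_entropy(rho, p):
    """Relative Shannon entropy S = -sum_i rho_i * ln(rho_i / p_i).

    rho: final (measurement) distribution; p: initial (preparation) distribution.
    Terms with rho_i == 0 contribute zero (usual 0 * ln 0 = 0 convention).
    """
    rho = np.asarray(rho, dtype=float)
    p = np.asarray(p, dtype=float)
    mask = rho > 0
    return -np.sum(rho[mask] * np.log(rho[mask] / p[mask]))

# Hypothetical example: an equiprobable preparation over four outcomes.
p = np.array([0.25, 0.25, 0.25, 0.25])
rho_same = p.copy()                          # measurement identical to preparation
rho_peaked = np.array([0.7, 0.1, 0.1, 0.1])  # measurement that differs from it

s_same = relative_entropy(rho_same, p)
s_peaked = relative_entropy(rho_peaked, p)
print(s_same, s_peaked)
```

Under this convention the value is zero when the measurement distribution equals the preparation and negative otherwise, so maximizing it selects the distribution closest to the preparation that still satisfies the constraint.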
The natural constraint is chosen to be the most general structure that admits a solution to this optimization problem. This generality follows from key mathematical requirements. The constraint must involve quantities that form an algebra, as the solution requires taking exponentials:
which involves addition, powers, and scalar multiplication of X. The use of the trace operation requires that X be represented by square $n \times n$ matrices. Thus Axiom 1 involves $n \times n$ matrices:
The trace operation is utilized because the constraint must be converted back to a scalar for use in the Lagrange multiplier equation; while any function that maps an algebra to a scalar would achieve that, picking the trace recovers quantum mechanics in the case.
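The algebraic requirements above can be made concrete with a short numerical sketch. It builds a matrix exponential from the power series $\sum_k X^k / k!$, using only addition, powers, and scalar multiplication, and then checks the standard identity $\det \exp(A) = \exp(\operatorname{tr} A)$, which shows one way a trace yields a scalar. The helper name `exp_series`, the truncation at 30 terms, and the test matrices are illustrative assumptions, not part of the paper.

```python
import numpy as np

def exp_series(X, terms=30):
    """Matrix exponential via the truncated power series sum_k X^k / k!."""
    X = np.asarray(X, dtype=float)
    result = np.eye(X.shape[0])
    term = np.eye(X.shape[0])
    for k in range(1, terms):
        term = term @ X / k       # X^k / k! built from the previous term
        result = result + term
    return result

# A rotation generator exponentiates to a rotation matrix.
theta = 0.3
X = np.array([[0.0, -theta], [theta, 0.0]])
R = exp_series(X)
expected_R = np.array([[np.cos(theta), -np.sin(theta)],
                       [np.sin(theta),  np.cos(theta)]])

# Standard identity: det(exp(A)) = exp(tr(A)).
A = np.array([[0.2, 0.5], [-0.1, 0.4]])
det_exp_A = np.linalg.det(exp_series(A))
exp_tr_A = np.exp(np.trace(A))
print(np.allclose(R, expected_R), det_exp_A, exp_tr_A)
```

The series only converges quickly for matrices of modest norm; a production implementation would use a scaling-and-squaring routine instead.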
These mathematical requirements demonstrate that the natural constraint, as formulated in Axiom 1, represents the most general structure for this optimization problem. More precisely, Axiom 1, admitting the minimal mathematical structure required to solve an arbitrary entropy maximization problem, can be understood as the most general extension of the statistical mechanics average energy constraint which contains QM (as induced by the trace) as a specific solution.
Thus, having established both the mathematical structure and its generality, we can understand how this minimal ontology operates. Since our formulation keeps the structure of experiments completely general, our optimization considers all possible predictive theories for that structure, and the constraint is the most general constraint possible for that structure, the resulting optimal physical theory applies, by construction, to all realizable experiments within its domain.
This ontology is both operational, being grounded in the basic structure of experiments rather than abstract entities, and constructive, showing how physical laws emerge from optimization over all possible predictive theories subject to the natural constraint. Physics is encapsulated not as a pre-defined collection of fundamental axioms but as the optimal solution to a well-defined optimization problem over all experiments realizable within the domain. This represents a significant philosophical shift from traditional physical ontologies where laws are typically taken as primitive.
The next step in our derivation is to represent the determinant of the matrices through a self-product of multivectors involving various conjugate structures. By examining the various dimensional configurations of geometric algebras, we find that GA(3,1), representing real matrices, admits a sub-algebra whose determinant is positive-definite for its invertible members. All other dimensional configurations fail to admit such a positive-definite structure, with three exceptions: GA(0) yielding statistical mechanics, GA(0,1) and a sub-algebra of GA(2,0) yielding quantum mechanics.
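The non-negativity claim for GA(3,1) can be spot-checked numerically. The sketch below reuses the real Majorana matrices listed in Appendix C, assembles random elements of the even sub-algebra (scalar, bivector, and pseudoscalar parts), and records the smallest determinant found. The sampling scheme, the seed, and the helper name `even_multivector_matrix` are assumptions made for illustration; a random sample is evidence, not a proof.

```python
import itertools
import numpy as np

# Majorana matrices copied from Appendix C (signature -,+,+,+).
y0 = np.array([[0, 0, 0, 1], [0, 0, -1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]], dtype=float)
y1 = np.array([[0, -1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, -1], [0, 0, -1, 0]], dtype=float)
y2 = np.array([[0, 0, 0, 1], [0, 0, -1, 0], [0, -1, 0, 0], [1, 0, 0, 0]], dtype=float)
y3 = np.diag([-1.0, 1.0, -1.0, 1.0])
gammas = [y0, y1, y2, y3]

def even_multivector_matrix(a, f, b):
    """Matrix of a + sum_{i<j} f_ij y_i y_j + b y0 y1 y2 y3."""
    m = a * np.eye(4)
    for coeff, (i, j) in zip(f, itertools.combinations(range(4), 2)):
        m = m + coeff * gammas[i] @ gammas[j]
    return m + b * (y0 @ y1 @ y2 @ y3)

rng = np.random.default_rng(0)
dets = []
for _ in range(1000):
    a, b = rng.normal(size=2)   # scalar and pseudoscalar coefficients
    f = rng.normal(size=6)      # six bivector coefficients
    dets.append(np.linalg.det(even_multivector_matrix(a, f, b)))

min_det = min(dets)
print(min_det)
```

The result is expected because the pseudoscalar squares to minus the identity and commutes with every even element, so each such matrix is a complex 2×2 matrix written in real form, whose real determinant is the squared modulus of the complex one.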
The solution reveals that the 3+1D case harbours a new type of probability amplitude structure analogous to complex amplitudes, one that exhibits the characteristic elements of a quantum mechanical theory. Instead of complex-valued amplitudes, we have amplitudes valued in the invertible subset of the even sub-algebra of GA(3,1). This probability amplitude is identical to David Hestenes’ wavefunction, but comes with an extended Born rule represented by the determinant, and rather than a complex Hilbert space, it lives in a "double-product structure". This double-product structure suggests an automatic incorporation of a quantum theory of gravity via local SO(3,1) and U(1) gauges, together with local gauge theories with the SU(3), SU(2) and U(1) groups.
Interpretation: This framework presents a novel interpretation of quantum mechanics where quantum states and their evolution emerge purely from optimizing the relative entropy between initial and final experimental states, under the natural constraint. Unlike traditional interpretations that begin by postulating entities like the wavefunction or multiple worlds, this approach starts only with measurement outcomes and derives quantum mechanics as the optimal predictive framework applicable to all experiments realizable within this domain.
In this interpretation, the wavefunction is not a fundamental physical entity but rather emerges as a mathematical tool: it is the solution to an optimization problem. The Born rule and even unitary evolution emerge naturally.
This interpretation aligns quantum mechanics more closely with statistical mechanics, where probability distributions aren’t physical entities but rather optimal descriptions of our knowledge given certain constraints. Just as the Gibbs distribution emerges from maximizing entropy subject to energy constraints, the quantum formalism emerges from maximizing relative entropy subject to the natural constraint.
This provides a more economical interpretation that avoids additional ontological commitments beyond what’s directly supported by measurement, while still recovering the full predictive power of quantum mechanics.
Measurements: In this interpretation, measurements are the primary empirical foundation: they are what is actually observed, not derived. The quantum formalism is inferred from these measurement outcomes through entropy maximization. This reverses the usual interpretational direction: rather than starting with a wavefunction that "collapses" during measurement (creating the measurement problem), we start with measurements and derive the quantum formalism as the optimal way to predict future measurements from past ones, subject to constraints. The wavefunction and its evolution are mathematical tools derived from entropy optimization over possible measurement outcomes, rather than physical entities that somehow "collapse" during measurement.
4. Conclusions
This work presents a simple reformulation of fundamental physics inspired by E.T. Jaynes’ formulation of statistical mechanics. An optimization problem is defined that considers the space of all possible experiments and the space of all predictive theories, while being constrained by the space of all possible measurements. This mathematically realizes the definition of a physical theory as a solvable optimization problem. Its resolution then automatically selects the relevant physics (encompassing quantum mechanics, general relativity, and the Standard Model gauge symmetries) as the optimal solution.
The power of this reformulation lies in its explanatory reach: it suggests that these particular theories are in fact the optimal predictive frameworks for our universe, and further provides a mechanism for why spacetime exhibits 3+1 dimensionality. By framing physics as the solution to a single entropy optimization problem, constrained only by the measurements nature allows, this work presents a significant philosophical shift. Physical laws are no longer taken as primitive, but rather arise from the interplay between the natural constraint and the drive to maximize predictive capacity over all realizable experiments.
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
Data Availability Statement
No datasets were generated or analyzed during the current study.
Use of Artificial Intelligence
During the preparation of this manuscript, we utilized a Large Language Model (LLM) for assistance with spelling and grammar corrections, as well as for minor improvements to the text to enhance clarity and readability. This AI tool did not contribute to the conceptual development of the work, data analysis, interpretation of results, or the decision-making process in the research. Its use was limited to language editing and minor textual enhancements to ensure the manuscript met the required linguistic standards.
Conflicts of Interest
The author declares that he has no competing financial or non-financial interests that are directly or indirectly related to the work submitted for publication.
Appendix A
Here, we solve the Lagrange multiplier equation of SM.
We solve the maximization problem as follows:
The partition function is obtained as follows:
Finally, the probability measure is:
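The resulting Gibbs measure, written here as $\rho_i = e^{-\beta E_i}/Z(\beta)$ in an assumed discrete notation with $k_B = 1$, can be checked numerically. The sketch below fixes $\beta$ by bisection so that the average-energy constraint holds, then compares the entropy of the Gibbs measure with that of another normalized distribution having the same average energy. The energy levels, the target mean, the comparison distribution, and the helper names are hypothetical choices for illustration.

```python
import numpy as np

def gibbs(energies, beta):
    """Gibbs measure rho_i = exp(-beta * E_i) / Z(beta)."""
    weights = np.exp(-beta * (energies - energies.min()))  # shift for numerical stability
    return weights / weights.sum()

def solve_beta(energies, target_mean, lo=-50.0, hi=50.0, iters=200):
    """Bisection on beta; the mean energy decreases monotonically with beta."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gibbs(energies, mid) @ energies > target_mean:
            lo = mid      # mean too high, so beta must increase
        else:
            hi = mid
    return 0.5 * (lo + hi)

def entropy(p):
    """Shannon entropy -sum_i p_i ln p_i (all p_i assumed positive)."""
    return -np.sum(p * np.log(p))

energies = np.array([0.0, 1.0, 2.0, 3.0])   # hypothetical energy levels
target_mean = 1.2                            # hypothetical average-energy constraint
beta = solve_beta(energies, target_mean)
rho = gibbs(energies, beta)

# Another normalized distribution with the same mean energy, for comparison.
other = np.array([0.4, 0.2, 0.2, 0.2])
print(beta, rho, entropy(rho), entropy(other))
```

Both distributions satisfy the two constraints, and the Gibbs measure has the larger entropy, as the Lagrange multiplier solution requires.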
Appendix B. RQM in 3+1D
The solution is obtained using the same step-by-step process as the 2D case, and yields:
Proof. The Lagrange multiplier equation can be solved as follows:
The partition function
, serving as a normalization constant, is determined as follows:
□
Appendix C. SageMath program showing $\lfloor u^\ddagger u \rfloor_{3,4}\, u^\ddagger u = \det M_u$
from sage.algebras.clifford_algebra import CliffordAlgebra
from sage.quadratic_forms.quadratic_form import QuadraticForm
from sage.symbolic.ring import SR
from sage.calculus.var import var
from sage.matrix.constructor import Matrix
# Define the quadratic form for GA(3,1) over the Symbolic Ring
Q = QuadraticForm(SR, 4, [-1, 0, 0, 0, 1, 0, 0, 1, 0, 1])
# Initialize the GA(3,1) algebra over the Symbolic Ring
algebra = CliffordAlgebra(Q)
# Define the basis vectors
e0, e1, e2, e3 = algebra.gens()
# Define the scalar variables for each basis element
a = var('a')
t, x, y, z = var('t x y z')
f01, f02, f03, f12, f23, f13 = var('f01 f02 f03 f12 f23 f13')
v, w, q, p = var('v w q p')
b = var('b')
# Create a general multivector
udegree0=a
udegree1=t*e0+x*e1+y*e2+z*e3
udegree2=f01*e0*e1+f02*e0*e2+f03*e0*e3+f12*e1*e2+f13*e1*e3+f23*e2*e3
udegree3=v*e0*e1*e2+w*e0*e1*e3+q*e0*e2*e3+p*e1*e2*e3
udegree4=b*e0*e1*e2*e3
u=udegree0+udegree1+udegree2+udegree3+udegree4
# First product: Clifford conjugate of u times u
u2 = u.clifford_conjugate()*u
# Split the product by grade
u2degree0 = sum(x for x in u2.terms() if x.degree() == 0)
u2degree1 = sum(x for x in u2.terms() if x.degree() == 1)
u2degree2 = sum(x for x in u2.terms() if x.degree() == 2)
u2degree3 = sum(x for x in u2.terms() if x.degree() == 3)
u2degree4 = sum(x for x in u2.terms() if x.degree() == 4)
# Blade conjugate of degrees 3 and 4: flip the sign of the grade-3 and grade-4 parts
u2conj34 = u2degree0+u2degree1+u2degree2-u2degree3-u2degree4
I = Matrix(SR, [[1, 0, 0, 0],
                [0, 1, 0, 0],
                [0, 0, 1, 0],
                [0, 0, 0, 1]])
# Majorana matrices
y0 = Matrix(SR, [[0, 0, 0, 1],
                 [0, 0, -1, 0],
                 [0, 1, 0, 0],
                 [-1, 0, 0, 0]])
y1 = Matrix(SR, [[0, -1, 0, 0],
                 [-1, 0, 0, 0],
                 [0, 0, 0, -1],
                 [0, 0, -1, 0]])
y2 = Matrix(SR, [[0, 0, 0, 1],
                 [0, 0, -1, 0],
                 [0, -1, 0, 0],
                 [1, 0, 0, 0]])
y3 = Matrix(SR, [[-1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, -1, 0],
                 [0, 0, 0, 1]])
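# Matrix representation of the same general multivector, grade by grade
mdegree0 = a
mdegree1 = t*y0+x*y1+y*y2+z*y3
mdegree2 = f01*y0*y1+f02*y0*y2+f03*y0*y3+f12*y1*y2+f13*y1*y3+f23*y2*y3
mdegree3 = v*y0*y1*y2+w*y0*y1*y3+q*y0*y2*y3+p*y1*y2*y3
mdegree4 = b*y0*y1*y2*y3
m=mdegree0+mdegree1+mdegree2+mdegree3+mdegree4
# Compare the double product with the determinant of the matrix representation
print(u2conj34*u2 == m.det())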
The program outputs
True
showing, by computer-assisted symbolic manipulation, that the determinant of the real Majorana representation of a multivector u is equal to the double-product: $\lfloor u^\ddagger u \rfloor_{3,4}\, u^\ddagger u$.
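As a complementary check that does not require SageMath, the following sketch verifies with NumPy that the four Majorana matrices in the listing satisfy the Clifford relations $y_i y_j + y_j y_i = 2\eta_{ij}$ with $\eta = \mathrm{diag}(-1, 1, 1, 1)$, matching the quadratic form passed to `CliffordAlgebra`. The function name `satisfies_clifford_relations` is a hypothetical helper, and this check covers only the generators, not the determinant identity itself.

```python
import numpy as np

# Majorana matrices copied from the SageMath listing above.
y0 = np.array([[0, 0, 0, 1], [0, 0, -1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]])
y1 = np.array([[0, -1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, -1], [0, 0, -1, 0]])
y2 = np.array([[0, 0, 0, 1], [0, 0, -1, 0], [0, -1, 0, 0], [1, 0, 0, 0]])
y3 = np.diag([-1, 1, -1, 1])
gammas = [y0, y1, y2, y3]

# Metric implied by the quadratic form Q in the listing: e0^2 = -1, e1^2 = e2^2 = e3^2 = +1.
eta = np.diag([-1, 1, 1, 1])

def satisfies_clifford_relations(mats, metric):
    """Return True if mats[i] mats[j] + mats[j] mats[i] == 2 * metric[i, j] * identity."""
    identity = np.eye(mats[0].shape[0], dtype=int)
    for i, gi in enumerate(mats):
        for j, gj in enumerate(mats):
            anticommutator = gi @ gj + gj @ gi
            if not np.array_equal(anticommutator, 2 * metric[i, j] * identity):
                return False
    return True

ok = satisfies_clifford_relations(gammas, eta)
print(ok)
```

Because the entries are integers, the comparison is exact and no floating-point tolerance is needed.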
References
- Jaynes, E.T. Information theory and statistical mechanics. Physical Review 1957, 106, 620.
- Jaynes, E.T. Information theory and statistical mechanics. II. Physical Review 1957, 108, 171.
- Dirac, P.A.M. The Principles of Quantum Mechanics; Number 27, Oxford University Press, 1981.
- Von Neumann, J. Mathematical Foundations of Quantum Mechanics: New Edition; Vol. 53, Princeton University Press, 2018.
- Hestenes, D. Spacetime physics with geometric algebra (Page 6). American Journal of Physics 2003, 71, 691–714.
- Lundholm, D. Geometric (Clifford) algebra and its applications. arXiv preprint math/0605280 2006.
- Hestenes, D. Space-time structure of weak and electromagnetic interactions. Foundations of Physics 1982, 12, 153–168.
- Lasenby, A. Some recent results for SU(3) and Octonions within the Geometric Algebra approach to the fundamental forces of nature. arXiv preprint arXiv:2202.06733 2022.
- Kibble, T.W. Lorentz invariance and the gravitational field. Journal of Mathematical Physics 1961, 2, 212–221.
- Acus, A.; Dargys, A. Inverse of multivector: Beyond p + q = 5 threshold. arXiv preprint arXiv:1712.05204 2017.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).