1. Introduction
The canonical formalism of quantum mechanics (QM) is based on five principal axioms [1,2]:
- (QM) State Space: Each physical system corresponds to a complex Hilbert space, with the system’s state represented by a ray in this space.
- (QM) Observables: Physical observables correspond to Hermitian operators within the Hilbert space.
- (QM) Dynamics: The time evolution of a quantum system is dictated by the Schrödinger equation, where the Hamiltonian operator signifies the system’s total energy.
- (QM) Measurement: The act of measuring an observable results in the system’s transition to an eigenstate of the associated operator, with the measurement value being one of the eigenvalues.
- (QM) Probability Interpretation: The likelihood of a specific measurement outcome is determined by the squared magnitude of the state vector’s projection onto the relevant eigenstate.
In contrast, statistical mechanics (SM), the other statistical pillar of physics, derives its probability measures through entropy maximization, constrained by the following expression:
- (SM) Average Energy Constraint: The average of energy measurements of a system at thermodynamic equilibrium converges to a specific value:
To maximize entropy while satisfying this constraint, the theory uses a Lagrange multiplier approach.
Definition 1 (Fundamental Lagrange Multiplier Equation of SM).
where λ and β are the Lagrange multipliers.
Theorem 1 (Gibbs Measure).
The solution to the Lagrange multiplier equation of SM is the well-known Gibbs measure.
Proof. This is a well-known result by E. T. Jaynes [3,4]. For convenience, we replicate the proof in Annex a. □
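As a minimal numerical sketch of Theorem 1 (an illustration, not a replacement for the proof), the snippet below builds a Gibbs measure for a small set of energy levels, solves for the multiplier β that reproduces a prescribed average energy, and checks that a small perturbation preserving both constraints lowers the Shannon entropy. The energy levels, the target average, the helper names, and the choice of units with the Boltzmann constant set to 1 are all assumptions made for illustration.

```python
import math

def gibbs(energies, beta):
    """Gibbs measure rho(q) = exp(-beta * E(q)) / Z on a finite set of levels."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)  # partition function
    return [w / z for w in weights]

def mean_energy(energies, beta):
    return sum(r * e for r, e in zip(gibbs(energies, beta), energies))

def solve_beta(energies, target, lo=-50.0, hi=50.0):
    """Bisection on beta; the mean energy decreases monotonically with beta."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean_energy(energies, mid) > target:
            lo = mid  # mean energy too high: a larger beta is needed
        else:
            hi = mid
    return 0.5 * (lo + hi)

def entropy(rho):
    """Shannon entropy in nats (k_B = 1, an assumption of this sketch)."""
    return -sum(r * math.log(r) for r in rho if r > 0.0)

energies = [0.0, 1.0, 2.0]  # hypothetical energy levels
target = 0.8                # hypothetical average-energy constraint
beta = solve_beta(energies, target)
rho = gibbs(energies, beta)

# This direction sums to zero and has zero energy moment, so it preserves
# both the normalization and the average-energy constraint.
direction = [1.0, -2.0, 1.0]
perturbed = [r + 1e-3 * d for r, d in zip(rho, direction)]

print("beta:", beta)
print("entropy at the Gibbs measure:", entropy(rho))
print("entropy after perturbation:  ", entropy(perturbed))
```

Because the entropy is concave and both constraints are linear, any admissible perturbation of the Gibbs measure can only lower the entropy, which is what the two printed values show.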
As evident from E. T. Jaynes’ methodological innovation, SM relies on a single constraint related to the nature of the measurements under consideration, which allows the formulation of an optimization problem sufficient to derive the relevant probability measure. This is an exceptionally parsimonious formulation of a physical theory.
We propose a generalization of E. T. Jaynes’ approach to the realms of Quantum Mechanics (QM), Relativistic Quantum Mechanics (RQM), and Quantum Gravity (QG). For each of these three domains, we will introduce a single constraint related to measurements, formulate a corresponding entropy maximization problem, and present a main theorem that fully encapsulates the theory within each realm. This formulation reduces fundamental physics to its simplest and most parsimonious expression, deriving the core theories as optimal solutions to a well-defined entropy maximization problem.
1.1. Quantum Mechanics
To reformulate QM as the solution to an entropy maximization problem, we propose the following constraint:
- QM Vanishing Complex-Phase: Quantum measurements admit a vanishing complex phase. The constraint is:
which is associated with the following equation:
Definition 2 (Fundamental Lagrange Multiplier Equation of QM).
where λ and τ are the Lagrange multipliers.
The relative Shannon entropy [5,6] is used because we are solving for the least biased theory that connects an initial preparation to its final measurement.
Theorem 2.
The least biased theory that connects an initial preparation to its final measurement, under the constraint of the vanishing complex-phase, is:
where we have defined (analogous to in SM).
The proof of this theorem will be presented in the results section. We will show that this solution entails the five axioms of QM, which are now promoted to theorems, establishing it as our most parsimonious formulation of QM to date.
1.2. Relativistic Quantum Mechanics
Before we can discuss RQM, we first need to introduce some notation. Let , where a is a scalar, is a vector, is a bivector, is a pseudo-vector and is a pseudo-scalar, be a multivector of the geometric algebra , and let be its matrix representation. Then, the fundamental constraint of RQM is:
- RQM Vanishing Relativistic Phase: Our formulation of RQM is based on a vanishing phase spanning the Spin^c(3,1) group. The constraint is:
where
is the matrix representation of the multivector
of
, using the real Majorana representation of the gamma matrices.
The Lagrange multiplier equation is as follows:
Definition 3 (Fundamental Lagrange Multiplier Equation of RQM).
where λ and ζ are the Lagrange multipliers.
Theorem 3.
The least biased theory that connects an initial preparation to its final measurement, under the constraint of the vanishing relativistic phase, is:
In the results section, we aim to demonstrate that this solution represents a quantum mechanical theory of inertial reference frames, where is a one parameter generator of boosts, rotations and phase transformations. This theory allows for measurements, superpositions, and interference between inertial reference frames, providing the arena in which relativistic quantum mechanics (RQM) operates. While incorporating David Hestenes’ results regarding the geometric algebra formulation of RQM, the Dirac current, and the Dirac equation, our approach completes his formulation by introducing missing elements which allow the promotion of the spacetime interval to an observable constructing the metric tensor. The formulation thus lays the foundation for the forthcoming development of quantum gravity through the introduction of quantum frame fields and metric measurements.
1.3. Quantum Gravity
Our formulation of QG is based on a quantum theory of accelerated reference frames, which, via the equivalence principle, is locally equivalent to gravity. To formulate the maximization problem whose solution automatically yields the theory, we use the same vanishing phase constraint as in the RQM case, but we modify the normalization constraint as follows:
Definition 4 (Fundamental Lagrange Multiplier Equation of QG).
where λ and ζ are the Lagrange multipliers.
Theorem 4.
The least biased theory which connects an initial preparation to its final measurement, under the constraint of the vanishing linear phase, is:
In the results section, we aim to demonstrate that the solution entails a quantum theory of accelerated reference frames. This theory defines the arena in which QG operates. In the solution, is not a probability measure but, as revealed by dimensional analysis, an area measure. The size of this area serves as the dilation constraint, remains invariant with respect to all transformations, and its entropy is associated with the entropy of the mixture of all enclosed quantum systems. The Lagrange multiplier serves as the generator of a -valued flow which preserves the area-size associated with the measure. The spacetime interval is an observable, enabling the construction of the metric tensor, here valid for metrics of any curvature. Finally, we derive the quantized Einstein field equations, and argue that the theory is finite.
1.4. Dimensional Obstructions
We end the results section with a number of theorems showing that the formalism, except for the scalar case of SM and the case of QM, is consistent only with 3+1-dimensional spacetime, encountering various obstructions in all other dimensional configurations, and we discuss the implications.
2. Results
2.1. Quantum Mechanics
In statistical mechanics, the founding observation is that energy measurements of a thermally equilibrated system tend towards an average value. By comparison, in QM, the founding observation involves the interplay between the systematic elimination of complex phases in measurement outcomes and the presence of interference effects in repeated measurement outcomes. To represent this observation, we introduce the Vanishing Complex-Phase Anti-Constraint:
where
are scalar-valued functions of
. The usage of the matrix generates a
phase, and the trace causes it to vanish under specific circumstances (which will correspond to measurements).
At first glance, this expression may seem to reduce to a tautology equating zero with zero, suggesting it imposes no restriction on energy measurements. However, this appearance is deceptive. Unlike a conventional constraint that limits the solution space, this expression serves as a formal device to expand it, allowing for the incorporation of complex phases into the probability measure. The expression’s role in broadening, rather than restricting, the solution space leads to its designation as an "anti-constraint."
In general, the use of anti-constraints expands classical probability measures into larger domains, such as quantum probabilities.
The significance of this expression will become evident upon the completion of the optimization problem. For the moment, it can be conceptualized as the expression that, when incorporated as an anti-constraint within an entropy-maximization problem, resolves into the axioms of quantum mechanics.
Our next procedural step involves solving the corresponding Lagrange multiplier equation, mirroring the methodology employed in statistical mechanics by E. T. Jaynes. We use the relative Shannon entropy because we wish to solve for the least biased measure that connects an initial preparation to its final measurement. For that, we deploy the following Lagrange multiplier equation:
where λ and τ are the Lagrange multipliers.
We solve the maximization problem as follows:
The partition function Z is obtained as follows:
Finally, the least biased theory that connects an initial preparation to its final measurement, under the constraint of the vanishing complex phase, is:
Though initially unfamiliar, this form effectively establishes a comprehensive formulation of quantum mechanics, as we will demonstrate.
Upon examination, we find that phase elimination is manifestly evident in the probability measure: since the trace evaluates to zero, the probability measure simplifies to classical probabilities, aligning precisely with the Born rule’s exclusion of complex phases:
However, the significance of this phase elimination extends beyond this mere simplicity. As we will soon see, the partition function Z gains unitary invariance, allowing for the emergence of interference patterns and other quantum characteristics under appropriate basis changes.
We will begin by aligning our results with the conventional quantum mechanical notation. As such, we transform the representation of complex numbers from
to
. For instance, the exponential of a complex matrix is:
Then, we associate the exponential trace to the complex norm using
:
Finally, substituting
analogously to
, and applying the complex-norm representation to both the numerator and the denominator, consolidates the Born rule, normalization, and initial preparation into :
We are now in a position to explore the solution space.
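The link between the exponential trace and the complex norm used above can be checked numerically. The sketch below assumes the standard 2×2 real matrix representation of a complex number, a + ib ↦ [[a, −b], [b, a]] (an assumption of this illustration), and verifies the identity det exp(M) = exp(tr M) together with its agreement with |exp(a + ib)|². It also confirms that a pure phase generator, which is traceless, has an exponential of determinant 1. The helper names and sample values are hypothetical.

```python
import cmath
import math

def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_exp(m, terms=40):
    """Matrix exponential of a 2x2 matrix by a truncated Taylor series."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = [[x / n for x in row] for row in mat_mul(term, m)]  # m^n / n!
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def as_matrix(z):
    """Assumed real 2x2 representation of the complex number z = a + ib."""
    return [[z.real, -z.imag], [z.imag, z.real]]

z = complex(0.3, -1.2)                    # hypothetical sample value
m = as_matrix(z)
det_of_exp = det2(mat_exp(m))             # det exp(M)
exp_of_trace = math.exp(m[0][0] + m[1][1])  # exp tr(M)
complex_norm = abs(cmath.exp(z)) ** 2     # |exp(a + ib)|^2

# A pure phase (a = 0) has a traceless generator, so its norm is 1.
phase_generator = as_matrix(complex(0.0, -1.4))
phase_norm = det2(mat_exp(phase_generator))

print(det_of_exp, exp_of_trace, complex_norm, phase_norm)
```

The three quantities agree because the determinant of the assumed representation is a² + b², the squared modulus of the complex number it represents.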
The wavefunction is delineated by decomposing the complex norm into a complex number and its conjugate. It is then visualized as a vector within a complex n-dimensional Hilbert space. The partition function acts as the inner product. This relationship is articulated as follows:
where
We clarify that represents the probability associated with the initial preparation of the wavefunction, where .
We also note that Z is invariant under unitary transformations.
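The unitary invariance of Z, and the way interference appears after a change of basis, can be illustrated with a two-state example. The sketch below assumes a wavefunction whose components are the square root of the preparation probability times a phase exp(−iτE) (with ħ = 1), and applies a Hadamard-type unitary; the probabilities, energies, and helper names are hypothetical choices for illustration.

```python
import cmath
import math

def inner(u, v):
    """Standard complex inner product, conjugate-linear in the first slot."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def apply(matrix, vec):
    """Matrix-vector product for nested-list matrices."""
    return [sum(row[j] * vec[j] for j in range(len(vec))) for row in matrix]

p = [0.5, 0.5]              # hypothetical initial preparation
energies = [0.0, math.pi]   # hypothetical energies (hbar = 1)
tau = 1.0

# Assumed form of the wavefunction: sqrt(p(q)) * exp(-i * tau * E(q)).
psi = [math.sqrt(pq) * cmath.exp(-1j * tau * e) for pq, e in zip(p, energies)]

s = 1.0 / math.sqrt(2.0)
hadamard = [[s, s], [s, -s]]  # a real unitary (orthogonal) change of basis
phi = apply(hadamard, psi)

z_before = inner(psi, psi).real
z_after = inner(phi, phi).real
probs_before = [abs(c) ** 2 for c in psi]
probs_after = [abs(c) ** 2 for c in phi]

print("Z before:", z_before, "Z after:", z_after)
print("probabilities before:", probs_before)
print("probabilities after: ", probs_after)
```

In the eigenbasis the phases drop out of the probabilities, while in the rotated basis the relative phase of π produces fully destructive interference in the first component. The total Z is unchanged in both cases.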
Let us now investigate how the axioms of quantum mechanics are recovered from this result:
The entropy maximization procedure inherently normalizes the vectors with . This normalization links to a unit vector in Hilbert space. Furthermore, as the POP formulation of QM associates physical states with its probability measure, and the probability is defined up to a phase, we conclude that physical states map to rays within Hilbert space. This demonstrates Axiom 1 (State Space).
In Z, an observable must satisfy:
Since , then any self-adjoint operator satisfying the condition will satisfy the above equation, simply because . This demonstrates Axiom 2 (Observables).
Upon transforming Equation 31 out of its eigenbasis through unitary operations, we find that the energy typically transforms in the manner of a Hamiltonian operator:
The system’s dynamics emerge from differentiating the solution with respect to the Lagrange multiplier. This is manifested as:
which is the Schrödinger equation. This demonstrates Axiom 3 (Dynamics).
From Equation 31 it follows that the possible microstates
of the system correspond to specific eigenvalues of
. An observation can thus be conceptualized as sampling from
, with the measured state being the occupied microstate
q of
. Consequently, when a measurement occurs, the system invariably emerges in one of these microstates, which directly corresponds to an eigenstate of
. Measured in the eigenbasis, the probability measure is:
In scenarios where the probability measure
is expressed in a basis other than its eigenbasis, the probability
of obtaining the eigenvalue
is given as a projection onto an eigenstate:
Here, signifies the squared magnitude of the amplitude of the state when projected onto the eigenstate . As this argument holds for any observable, this demonstrates Axiom 4 (Measurement).
Finally, since the probability measure (Equation 29) replicates the Born rule, Axiom 5 (Probability Interpretation) is also demonstrated.
Revisiting quantum mechanics with this perspective offers a coherent and unified narrative. Specifically, the vanishing complex phase constraint (Equation 12) is sufficient to entail the foundations of quantum mechanics (Axioms 1, 2, 3, 4, and 5) through the principle of entropy maximization. Equation 12 becomes the formulation’s new singular foundation, and Axioms 1, 2, 3, 4, and 5 are now theorems.
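The step in which the dynamics are obtained by differentiating the solution with respect to the Lagrange multiplier can be sketched numerically. The example below assumes, in the energy eigenbasis and with ħ = 1, a wavefunction of the form ψ_q(τ) = √p(q)·exp(−iτE(q)), and checks by central finite differences that i·dψ/dτ = Eψ, while the Born probabilities |ψ_q|² stay equal to the initial preparation. The probabilities, energies, and function names are hypothetical.

```python
import cmath
import math

def wavefunction(tau, p, energies):
    """Assumed eigenbasis form: psi_q(tau) = sqrt(p_q) * exp(-i * tau * E_q)."""
    return [math.sqrt(pq) * cmath.exp(-1j * tau * e)
            for pq, e in zip(p, energies)]

p = [0.2, 0.3, 0.5]          # hypothetical initial preparation
energies = [0.5, 1.0, 2.5]   # hypothetical energy eigenvalues (hbar = 1)
tau = 0.7
h = 1e-6                     # finite-difference step

plus = wavefunction(tau + h, p, energies)
minus = wavefunction(tau - h, p, energies)
center = wavefunction(tau, p, energies)

# Left-hand side: i * d(psi)/d(tau), estimated by a central difference.
lhs = [1j * (a - b) / (2.0 * h) for a, b in zip(plus, minus)]
# Right-hand side: the Hamiltonian, diagonal in its eigenbasis, acting on psi.
rhs = [e * c for e, c in zip(energies, center)]

residual = max(abs(l - r) for l, r in zip(lhs, rhs))
born = [abs(c) ** 2 for c in center]

print("max residual of i dpsi/dtau - E psi:", residual)
print("Born probabilities:", born)
```

Only the diagonal (eigenbasis) case is checked here. In another basis the same equation holds with the diagonal energies replaced by the unitarily transformed operator.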
2.2. RQM in 2D
In this section, we investigate RQM in 2D. Although all dimensional configurations except 3+1D contain obstructions, which will be discussed later in this section, the 2D case provides a valuable starting point before addressing the more complex 3+1D case. In RQM 2D, the fundamental Lagrange Multiplier Equation is:
where
and
are the Lagrange multipliers, and where
is the matrix representation of a multivector
of
, where
a is a scalar,
is a vector and
is a bivector:
where the basis elements are defined as:
If we take
then
reduces as follows:
The Lagrange multiplier equation can be solved as follows:
The partition function, serving as a normalization constant, is determined as follows:
Consequently, the least biased theory that connects an initial preparation to a final measurement, under the constraint of the vanishing relativistic phase in 2D, is:
where
.
In 2D, the Lagrange multiplier
corresponds to an angle of rotation, and in 1+1D it would correspond to the rapidity
:
The 2D solution may appear equivalent to the QM case because they are related by an isomorphism and under the replacement . However, an isomorphism is not an equality, and in Spin(2) we gain extra structures related to a relativistic description, which are not available in the QM case.
To investigate the solution in more detail, we introduce the multivector conjugate, also known as the Clifford conjugate, which generalizes the concept of complex conjugation to multivectors.
Definition 5 (Multivector conjugate (a.k.a. Clifford conjugate)). Let be a multivector of the geometric algebra over the reals in two dimensions . The multivector conjugate is defined as:
The determinant of the matrix representation of a multivector can be expressed as a self-product:
Theorem 5 (Determinant as a Multivector Self-Product).
Proof. Let
, and let
be its matrix representation
. Then:
□
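Theorem 5 can be checked numerically for a concrete choice of basis. The sketch below assumes the real 2×2 representation e1 = [[1, 0], [0, −1]], e2 = [[0, 1], [1, 0]], and e12 = e1·e2 = [[0, 1], [−1, 0]] (one common choice, which may differ from the basis used in the text), and takes the Clifford conjugate to negate the vector and bivector parts. It verifies that the product of the conjugate with the multivector is the determinant times the identity. The component values and function names are hypothetical.

```python
def matrix_rep(a, x, y, b):
    """Matrix of u = a + x*e1 + y*e2 + b*e12 in the assumed basis."""
    return [[a + x, y + b],
            [y - b, a - x]]

def clifford_conjugate(a, x, y, b):
    """Negate the vector (x, y) and bivector (b) parts; keep the scalar."""
    return (a, -x, -y, -b)

def mat_mul(m, n):
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

u = (1.5, 0.5, -2.0, 0.25)   # hypothetical components (a, x, y, b)
m_u = matrix_rep(*u)
m_conj = matrix_rep(*clifford_conjugate(*u))

self_product = mat_mul(m_conj, m_u)   # matrix of (conjugate of u) * u
determinant = det2(m_u)               # a^2 - x^2 - y^2 + b^2

print("self-product:", self_product)
print("determinant: ", determinant)
```

In this representation the conjugate's matrix is the adjugate of the multivector's matrix, which is why the product is proportional to the identity.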
Building upon the concept of the multivector conjugate, we introduce the multivector conjugate transpose, which serves as an extension of the Hermitian conjugate to the domain of multivectors.
Definition 6 (Multivector Conjugate Transpose).
Let :
The multivector conjugate transpose of is defined as first taking the transpose and then the element-wise multivector conjugate:
Definition 7 (Bilinear Form).
Let and be two vectors valued in . We introduce the following bilinear form:
Theorem 6 (Inner Product). Restricted to the even sub-algebra of , the bilinear form is an inner product.
Proof.
This is isomorphic to the inner product of a complex Hilbert space, with the identification . □
Definition 8 (Spin(2)-valued Wavefunction).
where represents the square root of the probability and represents a rotor in 2D (or boost in 1+1D).
The partition function of the probability measure can be expressed using the bilinear form applied to the Spin(2)-valued Wavefunction:
Theorem 7 (Partition Function).
Thus, the Spin(2)-valued wavefunction is a linear object whose inner product reduces to the partition function.
Definition 9 (Spin(2)-valued Evolution Operator).
Theorem 8. The partition function is invariant with respect to the Spin(2)-valued evolution operator.
Proof.
where
, because
is traceless. □
We note that since the even sub-algebra of is closed under addition and multiplication, and the bilinear form constitutes an inner product, it follows that it can be employed to construct a Hilbert space, in this case a Spin(2)-valued Hilbert space. The primary difference between a wavefunction living in a complex Hilbert space and one living in a Spin(2) Hilbert space relates to the subject matter of the theory. In the present case, the subject matter is a quantum theory of inertial reference frames in 2D.
The dynamics of reference frame transformations follow from the Schrödinger equation, which is obtained by taking the derivative of the wavefunction with respect to the Lagrange multiplier . Each element of the wavefunction represents an inertial reference frame, whose transformation is generated by the angle (for instance, the change of angle experienced by an inertial observer).
Definition 10 (Spin(2)-valued Schrödinger Equation).
The Spin(2)-valued Schrödinger equation can be parametrized in space:
In this case represents a global evolution parameter akin to time, which is able to transform the wavefunction under Spin(2) locally across 2D space. This is an extremely general equation that captures all transformations that can be done consistently with the evolution group of the wavefunction.
Definition 11 (Reference Frame Measurement). Axiom 5, regarding the measurement postulates, is derived as a theorem in the RQM case as well (for the same reason as in the QM case). This allows us to measure the wavefunction ψ into one of its states q according to probability . Here the post-measurement state q corresponds to picking a specific inertial reference frame q from .
We note that, as a linear system, linear combinations of the wavefunction (such as ) will also be solutions. This can introduce interference patterns between inertial reference frames:
Theorem 9 (Reference Frame Superpositions and Interference).
Proof. Let
, and
, then:
Then the probability can be computed as follows:
Since Spin(2)≅U(1), then Spin(2)-valued interference is isomorphic to complex interference. □
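The isomorphism invoked in this proof can be made concrete with a small sketch. Elements of the even sub-algebra in 2D are written here as pairs (scalar, bivector coefficient) with the bivector squaring to −1, and their superposition is compared with the corresponding complex amplitudes. The amplitudes, angles, and function names are hypothetical, and the states are left unnormalized for simplicity.

```python
import cmath
import math

def rotor_state(r, theta):
    """Even element r * (cos(theta) + e12 * sin(theta)) as (scalar, bivector)."""
    return (r * math.cos(theta), r * math.sin(theta))

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def norm_sq(u):
    """(a - b*e12)(a + b*e12) = a^2 + b^2, since e12 squares to -1."""
    return u[0] ** 2 + u[1] ** 2

r1, theta1 = 0.6, 0.4   # hypothetical amplitude and angle of the first frame
r2, theta2 = 0.8, 1.9   # hypothetical amplitude and angle of the second frame

spin2_total = norm_sq(add(rotor_state(r1, theta1), rotor_state(r2, theta2)))

# The same superposition computed with ordinary complex amplitudes.
complex_total = abs(r1 * cmath.exp(1j * theta1) + r2 * cmath.exp(1j * theta2)) ** 2

# Classical sum of the two weights plus the interference cross term.
expected = r1 ** 2 + r2 ** 2 + 2.0 * r1 * r2 * math.cos(theta1 - theta2)

print(spin2_total, complex_total, expected)
```

The cross term depends only on the relative angle between the two frames, exactly as the relative phase does in complex interference.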
Definition 12 (David Hestenes’ Formulation).
In 3+1D, David Hestenes’ formulation [7] of the wavefunction is , where is a Lorentz boost or rotation and where is a phase. In 2D, as the algebra only admits a bivector, his formulation would reduce to , which is identical to what we recovered.
The definition of the Dirac current applicable to our wavefunction follows the formulation of David Hestenes:
Definition 13 (Dirac Current).
Given the basis and , the Dirac current is defined as:
where and form a Spin(2)-rotated frame field.
2.2.1. Obstructions
We identify two obstructions:
In 1+1D: The 1+1D theory results in a split-complex quantum theory due to the bilinear form , which yields negative probabilities: for certain wavefunction states, in contrast to the non-negative probabilities obtained in the Euclidean 2D case. (This is why we use 2D instead of 1+1D in this two-dimensional introduction.)
In 1+1D and in 2D: The basis vectors (
and
in 2D, and
and
in 1+1D) are not self-adjoint. Although they are used in defining the Dirac current, their non-self-adjointness prevents the construction of the spacetime interval (or in 2D, the Euclidean distance) as a quantum observable. The benefits of having self-adjoint basis vectors will become obvious in the 3+1D case, where we will be able to construct the metric tensor from spacetime interval measurements. Specifically, in 2D:
because
.
In the following section, we will explore the obstruction-free 3+1D case.
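The first obstruction can be illustrated with a short sketch comparing the even sub-algebras of the two signatures. An even element is written as a pair (scalar, bivector coefficient). The only difference between the two cases is the square of the unit bivector: −1 in Euclidean 2D (complex numbers) and +1 in 1+1D (split-complex numbers). The sample element and function names are hypothetical.

```python
import math

def mul(u, v, bivector_square):
    """Product of even elements (a, b) ~ a + b*B, where B*B = bivector_square."""
    return (u[0] * v[0] + bivector_square * u[1] * v[1],
            u[0] * v[1] + u[1] * v[0])

def conjugate(u):
    """Conjugation flips the sign of the bivector part."""
    return (u[0], -u[1])

def self_product(u, bivector_square):
    """Scalar part of conj(u) * u; the bivector part cancels identically."""
    return mul(conjugate(u), u, bivector_square)[0]

state = (0.5, 1.0)  # hypothetical even element with a dominant bivector part

euclidean = self_product(state, -1.0)      # 2D: a^2 + b^2, never negative
split_complex = self_product(state, +1.0)  # 1+1D: a^2 - b^2, can be negative

# A boost cosh(eta) + B*sinh(eta) still has unit self-product in 1+1D.
eta = 0.9
boost = (math.cosh(eta), math.sinh(eta))
boost_norm = self_product(boost, +1.0)

print("2D self-product:  ", euclidean)
print("1+1D self-product:", split_complex)
print("1+1D boost norm:  ", boost_norm)
```

The negative value in the 1+1D case is the kind of quantity that cannot be read as a probability, whereas the Euclidean self-product is a sum of squares.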
2.3. RQM in 3+1D
In this section, we extend the concepts and techniques developed for multivector amplitudes in 2D to the more physically relevant case of 3+1 dimensions. The Lagrange multiplier equation is as follows:
The solution (proof in Annex a) is obtained using the same step-by-step process as the 2D case, and yields:
where
is a "twisted-phase" rapidity. (If the invariance group were Spin(3,1) instead of Spin^c(3,1), obtainable by posing
, then it would simply be the rapidity.)
Our initial goal will be to express the partition function as a self-product of elements of the vector space. As such, we begin by defining a general multivector in the geometric algebra .
Definition 14 (Multivector).
Let be a multivector of . Its general form is:
where are the basis vectors in the real Majorana representation.
A more compact notation for is
where a is a scalar, a vector, a bivector, is pseudo-vector and a pseudo-scalar.
This general multivector can be represented by a real matrix using the real Majorana representation:
Definition 15 (Matrix Representation
of
).
To manipulate and analyze multivectors in , we introduce several important operations, such as the multivector conjugate, the 3,4 blade conjugate, and the multivector self-product.
Definition 16 (Multivector Conjugate (in 4D)).
Definition 17 (3,4 Blade Conjugate).
The 3,4 blade conjugate of is
The results of Lundholm [8] demonstrate that the multivector norms in the following definition are the unique forms which carry the properties of the determinant, such as
to the domain of multivectors:
Definition 18.
The self-products associated with low-dimensional geometric algebras are:
We can now express the determinant of the matrix representation of a multivector via the self-product . This choice is not arbitrary: it is the unique choice which allows us to represent the determinant of the matrix representation of a multivector within :
Theorem 10 (Determinant as a Multivector Self-Product).
Proof. A computer-assisted symbolic proof of this equality is given in Annex a. □
Definition 19 (
-valued Vector).
These constructions allow us to express the measure in terms of the multivector self-product.
Theorem 11 (Partition Function).
Theorem 12 (Non-negative inner product). The multilinear form, applied to the even sub-algebra of is always non-negative.
Proof. Let
. Then,
We note 1)
and 2)
We note that the terms are now complex numbers, which we rewrite as
and
which is always non-negative. □
We now define the -valued wavefunction, which is valued in the even sub-algebra of :
Definition 21 (
-valued Wavefunction).
where is a rotor, is a phase, and .
The evolution operator, leaving the partition function invariant, becomes:
Definition 22 (
Evolution Operator).
In turn, this leads to a Schrödinger equation obtained by taking the derivative of the wavefunction with respect to the Lagrange multiplier :
Definition 23 (
-valued Schrödinger equation).
The
-valued Schrödinger Equation can be parametrized in spacetime
In this case represents a global one-parameter evolution parameter akin to time, which is able to transform the wavefunction under the , locally across spacetime. This is an extremely general equation that captures all transformations that can be done consistently with the evolution group of the wavefunction.
Definition 24 (David Hestenes’ Formulation).
Our -valued wavefunction is identical to David Hestenes’ [7] formulation of the wavefunction within GA(3,1). Both contain a rotor , a phase and the probability term .
Definition 25 (Dirac Current).
The definition employed in the 2D case (same as Hestenes’) applies here as well:
We will now demonstrate that the multilinear form is invariant with respect to the , , and gauge symmetries, which play a fundamental role in the standard model of particle physics. Using the basis to enforce the invariance means that we are interested in a transformation that preserves a charge density in time, rather than that of a charge current in space ().
Theorem 14
implies , which generates .
Proof.
We can now identify that the condition to preserve the equality reduces to this expression:
We further note that moving the left most term to the right yields:
Therefore, the product reduces to if and only if , leaving :
Finally, we note that generates . □
Proof. From the above relation, we identify that the following expression must remain invariant:
. Now, let
. Then:
The first three terms anticommute with
, while the last three commute with
:
This can be written as:
where
and
.
Thus, for
, we require: 1)
and 2)
. The second requirement means that
and
must commute (and thus be isomorphic to three complex numbers), and the first implies:
which are the defining conditions for the
symmetry group. □
We have now demonstrated that the solution to the entropy maximization problem offers a powerful framework that naturally incorporates gauge symmetries, retains invariance with respect to the group, includes the Dirac current and equation, and introduces the notion of the metric tensor via spacetime interval measurements. The specificity of these gauges is attributable to the set of all time-invariant gauges supported by the multilinear form in , and cannot be different.
2.4. Quantum Gravity in 3+1D
In the previous section, we developed a quantum theory of inertial reference frames valued in Spin^c(3,1), in which RQM lives. Our goal in this section is to extend the methodology to accelerated frame fields, in which General Relativity (GR) lives. We understand accelerated reference frames, via the principle of equivalence, as locally equivalent to gravity. To formulate the theory, we will exploit the double-copy feature of the multilinear form, which will allow us to formulate the spacetime interval as an observable.
2.4.1. Initial Investigation
In addition to linearity, the multilinear form supports a double-copy wavefunction. Specifically, we note that in the multilinear form, the term will be multiplied four times, leading to as the probability. This is acceptable because the multiplication of two probabilities yields a probability. In fact, the multilinear form supports a "double-copy" wavefunction:
Definition 26 (Double-Copy).
Let ψ and φ be two Spin^c(3,1)-valued wavefunctions. Then, the double copy
yields a transition amplitude that satisfies the probability measure.
This double-copy feature will be crucial to formulate the spacetime interval as an observable.
Let us now explore in more detail how the adjoint action of the wavefunction acts on a basis element.
From this, we note that the wavefunction contains all the multivectorial components required to map a vector such as to any other vector , allowing for rotations/boosts and dilations of the vector, but leaving the origin unchanged.
Comparatively, we previously defined the Dirac current as . The difference here is that we absorbed into a dilation of . In general, the wavefunction is able to transform the frame field arbitrarily.
Since a measurement must yield a real value, the construction of the metric tensor requires the multiplication of two basis elements. This will require separate joint action on and , during a single measurement. This is where the double-copy feature becomes crucial:
Theorem 16 (Metric Measurement).
The metric measurement is the expectation value of the and vectors, applied to a double-copy -valued wavefunction, with ψ, φ, ϕ and ξ:
One needs four wavefunctions, one for each basis element.
We can write this more compactly as follows:
Proof. Without loss of generality, let us do
. Let
and
:
As one can swap with and obtain the same metric tensor, the multilinear form guarantees that is symmetric. Finally, since , then and are self-adjoint within the multilinear form, entailing the interpretation of as an observable. □
In general, we can formulate the spacetime interval as an observable:
Definition 27 (Spacetime Interval Measurement).
The spacetime interval measurement is the expectation value of the and vectors, with wavefunctions ψ, φ, ϕ and ξ:
2.4.2. The Lagrange Multiplier Equation
Following this initial heuristic investigation, we now define the problem formally via a Lagrange multiplier equation. First, we raise an interpretational observation regarding the scalar term of . In the previous sections on QM and RQM, this term was associated with the square root of the probability . However, as we noted in Theorem 16, it now associates with a dilation factor. Specifically, the frame field absorbs this term within its curvilinear transformation. Furthermore, a probability varies between , but dilations can vary between in case of an orientable manifold (or if non-orientable). So if not a probability, then what is ?
The breakthrough in understanding the precise role of came from dimensional analysis. Specifically, to construct the entries of the metric tensor from the wavefunction, the scalar term ends up being multiplied four times (twice per gamma matrix). The 4-volume density of the metric, given by the square root of the metric determinant , thus scales as . Significantly, is the square root of the 4-volume , indicating that the measure grows with the area associated with the metric it defines.
Consequently, the solution will not be a probability measure; rather, it will be a measure of entropy-bearing areas. The sum over these areas will be used to normalize the wavefunction. As such, a sum over areas will replace the typical normalization constraint of a probability measure. As the measure is non-negative, then all areas will have the same orientation, and the metric tensor will always describe an orientable manifold such as a world manifold.
In line with this identification, the Lagrange multiplier equation is as follows:
Definition 28 (The Fundamental Lagrange Multiplier Equation of QG).
where is the measure, is the initial preparation, is the total area, maps q to a matrix, and where ζ is the Lagrange multiplier. We note that the equation is the same as the RQM case, with the exception that we normalize the measure to instead of 1.
The solution to this optimization problem is obtained as follows:
Theorem 17.
The least biased theory which connects an initial preparation to its final measurement, under the constraint of the vanishing relativistic phase, is:
Proof.
The partition function, serving as a normalization constant, is determined as follows:
□
2.4.3. Entropy Area Law
Theorem 18 (Entropy Area Law). The Shannon entropy leads to a thermodynamic law relating the entropy to the area.
Proof.
Since
, then
This intermediate result is not surprising, because the evolution operator preserves the probability. Continuing:
Since
, then
Since
, then
Since
is the definition of the average, it yields
. Furthermore,
is the definition
. Then:
□
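To give some intuition for how an entropy can come to scale with an area once the measure is normalized to a total area rather than to 1, the sketch below checks an elementary identity. If a measure is a probability distribution p scaled by a total A, then its Shannon entropy equals A times the entropy of p, minus A·ln A. This uses the plain Shannon entropy on a finite set and is only an illustration of the normalization change; it is not claimed to reproduce the derivation or the exact form of the area law in the text. The distribution, the value of A, and the function names are hypothetical.

```python
import math

def shannon(measure):
    """-sum m*ln(m) over a finite, non-negative measure (in nats)."""
    return -sum(m * math.log(m) for m in measure if m > 0.0)

p = [0.1, 0.2, 0.3, 0.4]   # hypothetical normalized distribution (sums to 1)
area = 5.0                 # hypothetical total to which the measure is normalized

rho = [area * x for x in p]  # measure that sums to `area` instead of 1

lhs = shannon(rho)
rhs = area * shannon(p) - area * math.log(area)

print("entropy of the area-normalized measure:", lhs)
print("A * H(p) - A * ln(A):                  ", rhs)
```

The first term grows linearly with the total A, while the second is a logarithmic correction that depends on A alone.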
2.4.4. Interpretation of the Entropy
To help us understand what the distribution
means, let us investigate how it behaves in spacetime. As such, we will parametrize
in
:
The term is a coefficient that turns the entries of the matrix into densities, allowing for integration.
- 4D: Normalizing the wavefunction by integrating over the 4-volume constitutes the base normalization constraint. On any orientable pseudo-Riemannian manifold (which includes the cases relevant for general relativity), there exists a naturally defined volume form, where this sum is observer independent.
- 3D: In comparison, integrating the wavefunction over a 3D region by fixing one coordinate (such as time) will also produce an area representing the total entropy within this region. But as the volume of a 3D region can vary based on the observer, this area, and the entropy it is associated with, is observer dependent.
- 2D: Integrating the wavefunction over a 2D surface, fixing two coordinates, also yields an area to represent the entropy of the surface. In GR, some 2D regions, such as null surfaces, are observer invariant; consequently, the entropy associated with these areas is also observer invariant.
As an example, an area scaling factor with value of
leads to an area law (Equation 179) with the same form as the Bekenstein-Hawking entropy [11].
where the additional logarithmic term is there to satisfy the third law of thermodynamics.
- 1D
Integrating the wavefunction over a 1D curve, by fixing three coordinates, also associates an area with an entropy. The proper length of all curves in GR is invariant, and therefore these curves also bear an observer-invariant entropy associated with the sum of the area elements defined by the metric at each point of the curve.
Finally, we note that if we do take to be normalized (i.e. ), then satisfies all axioms of probability theory and can be interpreted as a quantum system.
In light of these observations, the most natural interpretation we can identify for the total area expressed by is that represents the capacity of the spacetime region to host independent quantum systems (i.e. pure states), each described by a normalized probability measure , with the total area corresponding to the total number of such systems. As such, each pure state within the mixture would stretch the area by some amount according to its entropy contribution within the mixture.
2.4.5. Observables
We recall that in a complex Hilbert space an observable is given as: .
Here, we investigate the general self-adjoint relation for the multilinear form.
Theorem 19
where the elements of and ξ are valued in .
This relation implies the eigenvalues of are real-valued and that its eigenvectors are orthogonal, allowing for proper treatment of observables in 3+1D.
Proof. Let us show the theorem for a two-state system. The observable
is represented by a
matrix:
where
,
,
, and
are multivectors, encapsulating the components of the observable.
Let us calculate each part of the equality:
For the equality to be realized, it must be the case that the elements of
commute with the elements of
and
, because we must move them between the elements of the self-products; for instance, the observable elements in 3) and 4) must be moved to the left by 2 places to realize the equality. The relations are then:
which reduces to
implying simply that
and that the elements of
are valued in the reals (so that they commute with all grades of a multivector). The eigenvalues of a real symmetric matrix are real-valued, and its eigenvectors are orthogonal, allowing the consistent description of observables within the theory. □
A general observable for a two-state system would therefore be expressed as follows:
for a three-state system, as follows:
and so on.
We can notice that such matrices span the set of all possible inner products for an n-dimensional quantum system (i.e. O defines an inner product as ). Thus, observables in our theory are associated with the set of all possible inner products on the vector space. The multilinear form ensures that all possible observables have real eigenvalues and orthogonal eigenvectors.
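As a numerical illustration of the property stated in the proof above, the following minimal sketch checks that a real symmetric matrix has real eigenvalues and an orthonormal eigenbasis. The 3×3 matrix is a hypothetical, randomly generated three-state observable and is not taken from the paper; NumPy is used purely for convenience.

```python
import numpy as np

# Hypothetical three-state observable: random real entries, then symmetrized.
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))
O = (A + A.T) / 2  # enforce O == O.T

# General-purpose solver: makes no symmetry assumption, so the absence of
# imaginary parts is a genuine check rather than a built-in guarantee.
general_eigenvalues = np.linalg.eigvals(O)
eigenvalues_are_real = np.allclose(np.imag(general_eigenvalues), 0.0)

# Symmetric solver: returns eigenvalues and an orthonormal eigenvector matrix.
eigenvalues, eigenvectors = np.linalg.eigh(O)
eigenvectors_are_orthonormal = np.allclose(
    eigenvectors.T @ eigenvectors, np.eye(3)
)

# The observable is recovered from its spectral decomposition.
is_reconstructed = np.allclose(
    eigenvectors @ np.diag(eigenvalues) @ eigenvectors.T, O
)

print(eigenvalues_are_real, eigenvectors_are_orthonormal, is_reconstructed)
```

The same check applies unchanged to any number of states, since only the symmetry of the matrix is used.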
Finally, since we utilize a multilinear form (and not just a bilinear form), we repeat that we also have access to another kind of observable, relying on the double copy structure, already mentioned for the spacetime interval as an observable in Theorem 27.
Proof.
This permits the measurement of arbitrary geometric objects constructed from multivectors. The plus/minus signs follow from the double copy which eliminates . □
In their eigenbasis, metric observables are expressed as follows:
where
are multivector valued, and where
. For instance, a metric measurement involves these observables:
which meets the requirement
.
In general, all observables A and B whose eigenvalues are vector-valued, will yield the value of the inner product between the eigenvalues of A and of B, within the metric measurement equation:
2.4.6. Quantized Einstein Field Equations
To study the EFE within the present framework, we must express the Einstein tensor in terms of the metric observable (Definition 16), yielding , instead of the classical metric tensor .
Definition 29 (Quantum EFE).
The quantum version of the Einstein Field Equation becomes:
With this in hand, we can now demonstrate that the quantized Einstein tensor is, in this framework, finite.
Theorem 21. , a metric observable, is finite for all possible ψ.
Proof. The proof is in two parts.
-
First, we show that the elements of the metric tensor are real-valued. As such, they contain no singularities.
This is because the metric is engendered by the joint action of the wavefunction on the basis vectors (), which is valued in . Any element of , applied to the gamma basis to yield the metric tensor, will yield a real number. Thus, metric tensors that contain, say, a term in , yielding a singularity at , cannot be constructed from the wavefunction, as we would need to pick an element from that contains ∞ at , and no such element exists in .
Second, the finiteness of implies the finiteness of if is twice differentiable. Since , as a metric tensor, is smooth, it is at least twice differentiable.
□
While we concede that this proof does not automatically provide the most efficient algorithm for perturbatively calculating graviton amplitudes, it nonetheless constitutes a valid proof of the claim. That is, is finite for all possible .
Furthermore, since all observables (including world observables and metric observables) are finite and the theory contains the Einstein tensor as an observable, it follows that the theory is a finite theory of quantum gravity.
2.4.7. A Geometric Twist on Einstein’s Dice
Einstein famously remarked, "God does not play dice." It appears that Einstein may have been right: God plays with disks, not dice.
The entropy in 4D spacetime is associated with oriented area elements, or "disks." This arises from the fact that the determinant of the metric tensor, as produced by the general linear wavefunction, in 4D contains 16 products of , yielding . The square root of the determinant of the metric tensor, which gives the 4-volume density, scales as . The square root of this 4-volume density scaling, , corresponds to the scaling of an area element and matches the factor found in the multilinear form. Thus, entropy-bearing oriented disks are the geometric objects that solve the problem of maximizing the entropy of all possible measurements in 4D spacetime.
But the game changes in different dimensions. In 2D space, God trades disks for sticks. The determinant of the metric tensor in 2D contains 4 products of , yielding . The square root of this expression, , corresponds to the scaling of a line element, matching the factor in the theory’s bilinear form in 2D. Therefore, in 2D space, entropy-bearing oriented line elements, or "sticks," solve the problem of maximizing the entropy of all possible geometric measurements.
Moving up to 6D space, God finally picks up the dice. The determinant of the metric tensor in 6D contains 24 products of , yielding . The 6D hyper-volume scaling is given by the square root of this expression, . The square root of this 6D hyper-volume scaling, , corresponds to the scaling of a 3D volume element, matching the factor in the determinant of a 6x6 matrix in the theory. Thus, in 6D space, entropy-bearing oriented 3D volume elements, or "dice," are the geometric objects that solve the problem of maximizing the entropy of all possible geometric measurements.
In summary, while Einstein was right that God does not play dice in 4D spacetime, the multivector-valued quantum mechanics theory suggests that the divine game varies across dimensions. God flips the sticks in 2D, spins the disks in 4D, and finally rolls the dice in 6D.
2.5. Dimensional Obstructions
In this section, we explore the dimensional obstructions that arise when attempting to extend the multivector amplitude formalism to other dimensional configurations. We found that all dimensional configurations except those explored in this paper (e.g.
,
and
) are obstructed:
Let us now demonstrate the obstructions mentioned above.
Theorem 22 (Not isomorphic to a real matrix algebra). The determinant of the matrix representation of the geometric algebras in this category is either complex-valued or quaternion-valued, making them unsuitable as a probability.
Proof. These geometric algebras are classified as follows:
The determinant of these objects, when such a thing exists, is valued in or in , where are the complex numbers, and where are the quaternions. □
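The classification invoked in this proof can be sketched with the standard periodicity-8 table for real Clifford algebras. The function name `clifford_base_ring` and the string labels are hypothetical, and the sketch assumes the convention in which p generators square to +1 and q generators square to −1; the paper's own classification table is not reproduced here.

```python
def clifford_base_ring(p: int, q: int) -> str:
    """Return the ring over which Cl(p, q) is a matrix algebra.

    Assumed convention: p generators square to +1, q generators square to -1.
    "R+R" and "H+H" denote a direct sum of two matrix algebras over R or H.
    """
    table = {
        0: "R",
        1: "R+R",
        2: "R",
        3: "C",
        4: "H",
        5: "H+H",
        6: "H",
        7: "C",
    }
    # Python's % always returns a value in 0..7, even when p - q is negative.
    return table[(p - q) % 8]


for signature in [(2, 0), (3, 1), (1, 3), (3, 0), (0, 1), (0, 2)]:
    print(signature, clifford_base_ring(*signature))
```

Under this convention the sketch returns a real matrix algebra for (3, 1) and a quaternionic one for (1, 3), which is the asymmetry between the two signatures that the obstruction relies on.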
Theorem 23 (Negative Probabilities in the RQM). The even sub-algebra of these dimensional configurations, which is associated with the RQM part of the theory, allows for negative probabilities, making them unsuitable as an RQM.
Proof. We note three cases:
-
:
Let
, then:
which is valued in
.
-
:
Let
, then:
which is valued in
.
-
:
Let
, where
, then:
We note that
, therefore:
which is valued in
.
In all of these cases the RQM probability can be negative. □
We repeat the following self-products [8] (Definition 18), which will help us demonstrate the next theorem:
Theorem 24 (No Metric Measurements). This obstruction applies to . Multilinear forms of at least four self-products are required for the theory to be observationally complete with respect to the geometry.
Proof. A metric measurement requires a multilinear form of 4 self products because the metric tensor is defined using 2 self-products of the gamma matrices:
Each pair of wavefunction products fixes one basis element. Thus, two pairs of wavefunction products are required to fix the geometry from the wavefunction. Since multilinear forms of four self-products begin to appear in 3D, the cannot produce a metric measurement as a quantum observable; thus its geometry is not observationally complete. □
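The statement that the metric tensor is built from two products of the gamma matrices can be illustrated numerically. The sketch below assumes the standard Clifford relation, in which the metric component is half the anticommutator of two gamma matrices, and uses one hypothetical choice of real 4×4 matrices with signature (−, +, +, +). Neither the representation nor the helper name `metric_from_gammas` comes from the paper.

```python
import numpy as np

# Real 2x2 building blocks.
I2 = np.eye(2)
s1 = np.array([[0.0, 1.0], [1.0, 0.0]])    # squares to +I
s3 = np.array([[1.0, 0.0], [0.0, -1.0]])   # squares to +I
eps = np.array([[0.0, 1.0], [-1.0, 0.0]])  # real, squares to -I

# One possible real representation (an assumption of this sketch).
gammas = [
    np.kron(eps, s1),   # gamma_0: squares to -I (time-like)
    np.kron(s1, I2),    # gamma_1: squares to +I
    np.kron(s3, I2),    # gamma_2: squares to +I
    np.kron(eps, eps),  # gamma_3: squares to +I
]


def metric_from_gammas(gammas):
    """Recover g[mu, nu] from (gamma_mu gamma_nu + gamma_nu gamma_mu) / 2."""
    n = len(gammas)
    dim = gammas[0].shape[0]
    g = np.zeros((n, n))
    for mu in range(n):
        for nu in range(n):
            anti = gammas[mu] @ gammas[nu] + gammas[nu] @ gammas[mu]
            # The anticommutator must be a multiple of the identity matrix.
            if not np.allclose(anti, anti[0, 0] * np.eye(dim)):
                raise ValueError("matrices do not satisfy a Clifford relation")
            g[mu, nu] = anti[0, 0] / 2
    return g


g = metric_from_gammas(gammas)
print(g)
```

Because every matrix in this representation is real, the sketch also shows concretely that this signature admits a real matrix representation.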
Conjecture (No multilinear form as a self-product in 6D). The multivector representation of the norm in 6D cannot satisfy any observables.
Proof (Argument). In six dimensions and above, the self-product patterns found in Definition 18 collapse. The research by Acus et al. [12] in 6D geometric algebra demonstrates that the determinant, so far defined through self-products of the multivector, fails to extend into 6D. The crux of the difficulty is evident in the reduced case of a 6D multivector containing only scalar and grade-4 elements:
This equation is not a multivector self-product but a linear sum of two multivector self-products [12].
The full expression is given in the form of a system of 4 equations, which is too long to list in its entirety. A small characteristic part is shown:
From Equation 262, it is possible to see that no observable
can satisfy this equation because the linear combination does not allow one to factor it out of the equation.
Any equality of the above type between and is frustrated by the factors and , forcing as the only satisfying observable. Since the obstruction occurs within grade-4, which is part of the even sub-algebra, it is questionable whether a satisfactory quantum theory (with observables) can be constructed in 6D. □
This conjecture proposes that the multivector representation of the determinant in 6D does not allow for the construction of non-trivial observables, which is a crucial requirement for a consistent quantum formalism. The linear combination of multivector self-products in the 6D expression prevents the factorization of observables, limiting their role to the identity operator.
Conjecture (No multilinear form as a self-product above 6D). The norms beyond 6D are progressively more complex than the 6D case, which is already obstructed.
These theorems and conjectures provide additional insights into the unique role of the unobstructed 3+1D signature in our proposal.
It is also interesting that our proposal is able to rule out even though, in relativity, the signature of the metric versus does not influence the physics. However, in geometric algebra, represents 1 space dimension and 3 time dimensions. Therefore, it is not the signature itself that is ruled out but rather the specific arrangement of 3 time and 1 space dimensions, as this configuration yields quaternion-valued "probabilities" (i.e. and ).
Consequently, 3+1D is the only dimensional configuration (other than the "non-geometric" configurations of and ) in which a ’least biased’ solution to the problem of maximizing the Shannon entropy of quantum measurements relative to an initial preparation exists. This is an extremely strong claim regarding the possible spacetime configurations of the universe, and our ability (or inability) to construct an objective theory to explain it.
3. Discussion
3.1. Maximizing The Relative Shannon Entropy
The principle of maximum entropy [3] states that the probability measure that best represents the current state of knowledge about a system is the one with the largest entropy, constrained by prior data.
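To make the principle concrete, the following minimal sketch works through the statistical-mechanics case recalled in the Introduction: maximizing the Shannon entropy under an average-energy constraint yields the Gibbs measure. The four energy levels, the target average, and the function names are hypothetical choices made for illustration. The multiplier β is found by bisection, which is a numerical convenience and not the analytic derivation.

```python
import numpy as np


def gibbs_distribution(energies, beta):
    """Gibbs measure p_i = exp(-beta * E_i) / Z (shifted for numerical stability)."""
    weights = np.exp(-beta * (energies - energies.min()))
    return weights / weights.sum()


def solve_beta(energies, target_mean, lo=-50.0, hi=50.0, tol=1e-12):
    """Find beta by bisection; the mean energy decreases monotonically in beta."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        mean = gibbs_distribution(energies, mid) @ energies
        if mean > target_mean:
            lo = mid  # mean too high: a larger beta is needed
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def shannon_entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log(p))


energies = np.array([0.0, 1.0, 2.0, 3.0])  # hypothetical energy levels
target_mean = 1.2                          # hypothetical average-energy constraint

beta = solve_beta(energies, target_mean)
p = gibbs_distribution(energies, beta)

# A competing distribution with the same normalization and the same mean energy:
# the direction (1, -2, 1, 0) changes neither the sum nor the average energy.
q = p + 0.01 * np.array([1.0, -2.0, 1.0, 0.0])

print(beta, p, shannon_entropy(p), shannon_entropy(q))
```

The competing distribution `q` satisfies the same constraints as `p` but has lower entropy, which is the sense in which the Gibbs measure is the least biased choice for this constraint.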
In QM, an experiment begins with an initial preparation, followed by some transformations, and concludes with a final measurement of the system, yielding the result of the experiment. Consistent with the maximum entropy principle, our aim is to derive the ’least biased’ theory that connects the initial preparation to its final measurement , thereby formulating the theory as a solution to a maximization problem, rather than merely by axiomatic stipulation.
Using this methodology, fundamental physics can be formulated as the general solution to a maximization problem involving the Shannon entropy of all possible measurements of an arbitrary system relative to its initial preparation, under the constraint of a vanishing phase. As such, the structure of the inferred theory is determined by the nature and generality of the employed constraint. In this paper, we have investigated these four entropy maximization problems:
Despite the differences in constraints, all four theories so formulated share a common logical genesis, adhere to the same principle of maximum entropy, and qualify as the least biased theory for their given constraint.
3.2. The Multilinear Form
David Hestenes’ work on the representation of the relativistic wavefunction within was instrumental in the development of this research. His results served as a milestone, confirming the validity of our approach at various stages. Hestenes’ wavefunction, , contains the same geometric structures as the wavefunction in our theory.
However, it is noteworthy that Hestenes’ work does not include a fully satisfactory probability measure. He proposes multiplying the wavefunction with its reverse:
The result does contain , but it also includes a phase factor . As such, it is not a proper probability measure.
Subsequently, Hestenes proposes sandwiching the
basis to obtain the Dirac current:
This approach eliminates the phase contribution because . Likewise, the Dirac current is not a proper probability measure (nor is it designed to be) as it contains a basis .
To construct an adapted Born rule that directly yields the probability when applied to the wavefunction, one might be tempted to apply the conjugate to
in addition to the reverse:
In this case one indeed maps to ; however, this approach disrupts the definition of the Dirac current: .
Finally, a last proposal is to define it as
However, such a definition is not the solution to an entropy maximization problem, and therefore does not represent the least biased probability measure for the situation. It erases some of the features required to fully describe the system.
To correctly incorporate all the necessary features, including both the Dirac current and a probability measure yielding the probability density, the multilinear form must be employed. Transitioning from bilinear forms to multilinear forms involving four self-products of represents a significant conceptual leap. The strength of the entropy maximization problem lies in its ability to automatically reveal the appropriate form to use. Specifically:
3.3. The Double-Copy Gauge Theory
In recent years, a remarkable connection between gauge theories and gravity has been discovered, known as the "double-copy" relationship. This relationship, first proposed by Bern, Carrasco, and Johansson (BCJ) [13], states that the scattering amplitudes of certain gravity theories can be expressed as a "double-copy" of the scattering amplitudes of gauge theories, such as Yang-Mills theory.
The BCJ double-copy is based on the observation that the scattering amplitudes of gauge theories can be written in a form where the kinematic numerators obey the same algebraic relations as the color factors. This is known as the "color-kinematics duality." By replacing the color factors with another copy of the kinematic numerators, one obtains the scattering amplitudes of a related gravity theory.
Our multilinear form is able to engender its own version of a double-copy of gauge theories. It would be interesting to establish if this relates to the BCJ double copy, or if it is a different double-copy effect.
Theorem 25 (Double-Copy Gauge Theory).
Let and be two -valued wavefunctions, and let and be two bivectors of . Then:
implies two copies of a gauge theory, satisfying the invariance of the multilinear form.
Proof. The relation
remains invariant if
where, according to Theorem 15, each copy yields a realization of the
gauge; in the present case, yielding two distinct copies. Any perturbative expansion of the metric operator will be formulated in terms of these wavefunction double-copies which can be related to gauge theory. We are not sure if this can be connected to the BCJ double-copy conjecture, but we think it may be an interesting avenue for future research. □
4. Conclusion
In conclusion, this paper presents a novel approach to physical theory construction by solving a maximization problem on the Shannon entropy of all possible measurements of a system relative to its initial preparation, under the constraint of a vanishing phase. By appropriately selecting the group of the vanishing phase, the solution resolves to quantum mechanics, relativistic quantum mechanics, or a theory of quantum gravity. Our findings reveal the exceptional ability of this approach to generate a mathematically well-behaved theory that generalizes quantum probabilities through the introduction of vanishing phases. The resulting measure is invariant under a wide range of geometric transformations, including those generated by the gauge groups of the Standard Model and those associated with general relativity, and it leads to the metric tensor as a quantum mechanical observable, without the need for additional assumptions beyond the vanishing phase. This finding aligns with the observed dimensionality and gauge symmetries of the universe and suggests a possible explanation for its specificity.
This research represents a significant step in reconciling quantum mechanics with general relativity, challenging and expanding conventional methodologies in theoretical physics, and potentially paving the way for new insights in the field. By reducing fundamental physics to its simplest and most parsimonious expression, deriving the core theories as optimal solutions to a well-defined entropy maximization problem, we offer a unified framework that integrates statistical mechanics, quantum mechanics, relativistic quantum mechanics, and quantum gravity, while also accounting for the dimensionality of spacetime and the gauge symmetries of particle physics.