Preprint
Article

This version is not peer-reviewed.

Introducing the Second-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integral Equations of Volterra-Type: Mathematical Methodology and Illustrative Application to Nuclear Engineering

A peer-reviewed article of this preprint also exists.

Submitted:

24 January 2025

Posted:

05 February 2025


Abstract
This work presents the general mathematical frameworks of the “First and Second-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integral Equations of Volterra Type” designated as the 1st-FASAM-NIE-V and the 2nd-FASAM-NIE-V methodologies, respectively. Using a single large-scale (adjoint) computation, the 1st-FASAM-NIE-V enables the most efficient computation of the exact expressions of all first-order sensitivities of the decoder response to the feature functions and also with respect to the optimal values of the NIE-net’s parameters/weights after the respective NIE-Volterra-net was optimized to represent the underlying physical system. The 2nd-FASAM-NIE-V requires as many large-scale computations as there are first-order sensitivities of the decoder response with respect to the feature functions. Subsequently, the second-order sensitivities of the decoder response with respect to the primary model parameters are obtained trivially by applying the “chain-rule of differentiation” to the second-order sensitivities with respect to the feature functions. The application of the 1st-FASAM-NIE-V and the 2nd-FASAM-NIE-V methodologies is illustrated by using a well-known model for neutron slowing down in a homogeneous hydrogenous medium, which yields tractable closed-form exact explicit expressions for all quantities of interest, including the various adjoint sensitivity functions and first- and second-order sensitivities of the decoder response with respect to all feature functions and also primary model parameters.

1. Introduction

It is well known that Neural Ordinary Differential Equations (NODEs) have enabled the use of deep learning for modeling discretely sampled dynamical systems. NODEs [1,2,3] provide a flexible trade-off between efficiency, memory costs and accuracy while bridging traditional numerical modeling with modern deep learning, as demonstrated by various applications, including time-series, dynamics and control [1,2,3,4,5,6,7,8,9]. However, since each time-step is determined locally in time, NODEs are limited to describing systems whose dynamics are instantaneous (local in time). On the other hand, integral equations (IE) model global “long-distance” spatiotemporal relations, and IE solvers often possess stability properties that are superior to solvers for ordinary and/or partial differential equations. Therefore, differential equations are occasionally recast in integral-equation form so that they can be solved more efficiently using IE solvers, as exemplified by the applications described in [10,11,12].
Due to their non-local behavior, IE solvers are suitable for modeling complex dynamics, learning the operator underlying the system under consideration by using data sampled from the respective system. As discussed in [13], the operator learning problem is formulated on finite grids, using finite-difference methods that approximate the domain of the functions under investigation; the learning is performed by using an IE solver which samples the domain of integration continuously. As shown in [14], Neural Integral Equations (NIEs) and the Attentional Neural Integral Equations (ANIEs) can be used to generate dynamics and infer the spatiotemporal relations that initially generated the data, thus enabling the continuous learning of non-local dynamics with arbitrary time resolution. The ANIE interprets the self-attention mechanism as the Nystrom method for approximating integrals [15], which enables efficient integration over higher dimensions, as discussed in [10,11,12,13,14,15] and references therein.
Neural nets are trained by minimizing a “loss functional” chosen by the user to represent the discrepancy between the output produced by the neural net’s decoder and some user-chosen “reference solution.” However, the physical system modeled by a neural net inevitably comprises imperfectly known parameters that stem from measurements and/or computations and are therefore afflicted by uncertainties that stem from the respective experiments and/or computations. Hence, even if the neural net reproduces perfectly a given state of a physical system, the neural net’s “optimized weights” are subject to the uncertainties inherent in the parameters that characterize the underlying physical system, and these uncertainties inevitably propagate to the decoder’s output response. It is hence important to quantify the impact of parameters/weights uncertainties on the uncertainties induced in the decoder’s output response. This impact is quantified by the sensitivities of the decoder’s response with respect to the optimized weights/parameters comprised within the neural net.
Neural nets comprise not only scalar-valued weights/parameters but also functions (e.g., correlations) of such scalar model parameters, which can be conveniently called “features of primary model parameters”. Cacuci [16] has developed the “nth-Order Features Adjoint Sensitivity Analysis Methodology for Nonlinear Systems (nth-FASAM-N)”, which enables the most efficient computation of the exact expressions of arbitrarily high-order sensitivities of model responses with respect to the model’s “features”. In turn, the sensitivities of the responses with respect to the primary model parameters are determined, analytically and trivially, by applying the “chain rule” to the expressions obtained for the response sensitivities with respect to the model’s “features.” The nth-FASAM-N [16] has been applied to develop general first- and second-order sensitivity analysis methodologies for NODEs [17] and for Neural Integral Equations of Fredholm-type [18], which enable the computation, with unsurpassed efficacy, of the exact expressions of first and second-order sensitivities of decoder responses with respect to the underlying neural net’s optimized weights.
This work continues the application of the nth-FASAM-N [16] methodology to develop the “First- and Second-Order Methodologies for Neural Integral Equations of Volterra Type” (acronyms “1st-FASAM-NIE-V” and, respectively, “2nd-FASAM-NIE-V”). The 1st-FASAM-NIE-V methodology, which is presented in Section 2, enables the most efficient computation of exact expressions of all of the first-order sensitivities of NIE decoder responses with respect to all of the optimal values of the NIE-net’s parameters/weights, after the respective NIE-Volterra-net was optimized to represent the underlying physical system. The efficiency of the 1st-FASAM-NIE-V is illustrated in Section 3 by applying it to perform a comprehensive first-order sensitivity analysis of the well-known model [19,20,21] of neutron slowing down in a homogeneous medium containing fissionable material.
The general mathematical framework of the 2nd-FASAM-NIE-Volterra methodology, which is presented in Section 4, enables the most efficient computation of the exact expressions of the second-order sensitivities of NIE decoder responses with respect to all of the optimal values of the NIE-net’s parameters/weights. The efficiency of the 2nd-FASAM-NIE-V is illustrated in Section 5 by applying it to perform a comprehensive second-order sensitivity analysis of the neutron slowing down model [19,20,21] considered in Section 3. Section 6 concludes this work by presenting a discussion that highlights the unparalleled efficiency of the 2nd-FASAM-NIE-V methodology for performing sensitivity analysis of Volterra-type neural integral equations.

2. First-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integral Equations of Volterra-Type (1st-FASAM-NIE-V)

Following [14], a network of nonlinear “Neural Integral Equations of Volterra-type (NIE-Volterra)” can be represented by the system of coupled equations shown below:
$$h_i(t)=g_i\big(\mathbf{F}(\boldsymbol{\theta});t\big)+\varphi_i\big(\mathbf{F}(\boldsymbol{\theta});t\big)\int_{t_0}^{t}\psi_i\big(\mathbf{h}(\tau);\mathbf{F}(\boldsymbol{\theta});\tau\big)\,d\tau;\qquad t_0\le t\le t_f;\qquad i=1,\ldots,T_H.\tag{1}$$
The quantities appearing in Eq. (1) are defined as follows:
(i)
The real-valued scalar quantities $t$, $t_0\le t\le t_f$, and $\tau$, $t_0\le\tau\le t_f$, are time-like independent variables which parameterize the dynamics of the hidden/latent neuron units. Customarily, the variable $t$ is called the “global time” while the variable $\tau$ is called the “local time.” The initial time-value is denoted as $t_0$ while the stopping time-value is denoted as $t_f$.
(ii)
The components of the vector $\boldsymbol{\theta}\equiv(\theta_1,\ldots,\theta_{T_W})^\dagger$ represent scalar learnable adjustable weights, where $T_W$ denotes the total number of adjustable weights in all of the latent neural nets. The components of the column-vector $\boldsymbol{\theta}$ are considered to be “primary parameters” while the components of the vector-valued function $\mathbf{F}(\boldsymbol{\theta})\equiv[F_1(\boldsymbol{\theta}),\ldots,F_{T_F}(\boldsymbol{\theta})]^\dagger$ represent the “feature” functions of the respective weights. The quantity $T_F$ denotes the “total number of feature functions of the primary model parameters” comprised in the NIE-Volterra. In general, $\mathbf{F}(\boldsymbol{\theta})$ is a nonlinear function of $\boldsymbol{\theta}$. The total number of feature functions must necessarily be smaller than the total number of primary parameters (weights), i.e., $T_F<T_W$. In the extreme case when there are no feature functions, it follows that $F_i(\boldsymbol{\theta})\equiv\theta_i$ for all $i=1,\ldots,T_F\equiv T_W$. In this work, all vectors are considered to be column vectors, and the dagger “$\dagger$” symbol will be used to denote “transposition.” The symbol “$\equiv$” will be used to denote “is defined as” or, equivalently, “is by definition equal to.”
(iii)
The $T_H$-dimensional vector-valued function $\mathbf{h}(t)\equiv[h_1(t),\ldots,h_{T_H}(t)]^\dagger$ represents the hidden/latent neural networks. The quantity $T_H$ denotes the total number of components of $\mathbf{h}(t)$. At the initial time-value $t_0$, the functions $h_i(t)$ take on the known values $h_i(t_0)=g_i(\mathbf{F}(\boldsymbol{\theta});t_0)$.
(iv)
The functions $g_i(\mathbf{F}(\boldsymbol{\theta});t_0)=h_i(t_0)$, $i=1,\ldots,T_H$, model the initial state (“encoder”) of the network. The functions $\varphi_i(\mathbf{F}(\boldsymbol{\theta});t)$ and $\psi_i(\mathbf{h}(\tau);\mathbf{F}(\boldsymbol{\theta});\tau)$, $i=1,\ldots,T_H$, depend nonlinearly on $\mathbf{F}(\boldsymbol{\theta})$ and, respectively, on both $\mathbf{h}(\tau)$ and $\mathbf{F}(\boldsymbol{\theta})$, and model the dynamics of the latent neurons.
The “training” of the NIE-Volterra net is accomplished by using the “adjoint” or other methods to minimize the user-chosen “loss functional” intended to represent the discrepancy between the output produced by the NIE-decoder and a “reference solution” chosen by the user. After the training is completed, the primary parameters (“weights”) $\boldsymbol{\theta}\equiv(\theta_1,\ldots,\theta_{T_W})^\dagger$ will have been assigned “optimal” values which are obtained as a result of having minimized the chosen loss functional. These optimal values for the primary parameters (“weights”) will be denoted using a superscript “zero,” as follows: $\boldsymbol{\theta}^0\equiv(\theta_1^0,\ldots,\theta_{T_W}^0)^\dagger$. Using these optimal/nominal parameter values to solve the NIE-system will yield the optimal/nominal solution $\mathbf{h}^0(t)\equiv[h_1^0(t),\ldots,h_{T_H}^0(t)]^\dagger$, which will satisfy the following form of Eq. (1):
$$h_i^0(t)=g_i^0\big(\mathbf{F}(\boldsymbol{\theta}^0);t\big)+\varphi_i^0\big(\mathbf{F}(\boldsymbol{\theta}^0);t\big)\int_{t_0}^{t}\psi_i^0\big(\mathbf{h}^0(\tau);\mathbf{F}(\boldsymbol{\theta}^0);\tau\big)\,d\tau;\qquad t_0\le t\le t_f;\qquad i=1,\ldots,T_H.\tag{2}$$
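Systems of the form of Eq. (1) can be solved by simple fixed-point (Picard) iteration combined with quadrature. The following minimal sketch (not from the paper) illustrates this for an assumed scalar instance with $g\equiv 1$, $\varphi\equiv 1$ and $\psi(h)=h$, whose exact solution is $h(t)=e^t$; the grid size, iteration count and tolerance are illustrative choices.

```python
import numpy as np

def solve_volterra_picard(g, phi, psi, t0, tf, n=400, iters=60, tol=1e-12):
    """Fixed-point (Picard) iteration for a scalar Volterra-type NIE,
    h(t) = g(t) + phi(t) * integral_{t0}^{t} psi(h(tau), tau) d tau,
    using composite trapezoidal quadrature on a uniform grid."""
    t = np.linspace(t0, tf, n + 1)
    dt = t[1] - t[0]
    h = g(t)                      # initial guess: the encoder term alone
    for _ in range(iters):
        f = psi(h, t)
        # cumulative trapezoid: integral from t0 up to each grid point
        integral = np.concatenate(([0.0], np.cumsum(0.5 * dt * (f[1:] + f[:-1]))))
        h_new = g(t) + phi(t) * integral
        if np.max(np.abs(h_new - h)) < tol:
            h = h_new
            break
        h = h_new
    return t, h

# Assumed toy instance: g = 1, phi = 1, psi(h) = h, so that h(t) = exp(t).
t, h = solve_volterra_picard(lambda t: np.ones_like(t, dtype=float),
                             lambda t: np.ones_like(t, dtype=float),
                             lambda h, t: h, 0.0, 1.0)
err = np.max(np.abs(h - np.exp(t)))
```

Picard iteration is a natural choice here because the Volterra kernel integrates only over $[t_0,t]$, so the mapping is contractive on sufficiently short intervals.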
After the NIE-net is optimized to reproduce the underlying physical system as closely as possible, the subsequent responses of interest are no longer “loss functions” but become specific functionals of the NIE’s “decoder” response/output. Such a decoder-response, which will be denoted as $R(\mathbf{h};\mathbf{F}(\boldsymbol{\theta}))$, can generally be represented as a scalar-valued functional of $\mathbf{h}(t)$ and $\mathbf{F}(\boldsymbol{\theta})$, defined as follows:
$$R\big(\mathbf{h};\mathbf{F}(\boldsymbol{\theta})\big)=\int_{t_0}^{t_f}D\big(\mathbf{h}(t);\mathbf{F}(\boldsymbol{\theta});t\big)\,dt.\tag{3}$$
The function $D(\mathbf{h}(t);\mathbf{F}(\boldsymbol{\theta});t)$ models the decoder and may contain distributions (e.g., Dirac-delta and/or Heaviside functionals, etc.), if the decoder-response is to be evaluated at some particular point in time or over a subinterval within the interval $[t_0,t_f]$.
The optimal value of the decoder-response, denoted as $R(\mathbf{h}^0;\mathbf{F}(\boldsymbol{\theta}^0))$, is obtained by evaluating Eq. (3) at the optimal/nominal parameter values $\boldsymbol{\theta}^0\equiv(\theta_1^0,\ldots,\theta_{T_W}^0)^\dagger$ and optimal/nominal solution $\mathbf{h}^0(t)$, as follows:
$$R\big(\mathbf{h}^0;\mathbf{F}(\boldsymbol{\theta}^0)\big)=\int_{t_0}^{t_f}D\big(\mathbf{h}^0(t);\mathbf{F}(\boldsymbol{\theta}^0);t\big)\,dt.\tag{4}$$
The true values $\boldsymbol{\theta}$ of the primary parameters (“weights”) that characterize the physical system modeled by the NIE-V net are afflicted by uncertainties inherent to the experimental and/or computational methodologies employed to model the original physical system. Therefore, the true values $\boldsymbol{\theta}$ of the primary parameters (“weights”) will differ from the known nominal values $\boldsymbol{\theta}^0$ (which are obtained after training the NIE-net to represent the model of the physical system) by variations denoted as $\delta\boldsymbol{\theta}\equiv\boldsymbol{\theta}-\boldsymbol{\theta}^0$. The variations $\delta\boldsymbol{\theta}$ will induce corresponding variations $\delta\mathbf{F}\equiv\mathbf{F}(\boldsymbol{\theta})-\mathbf{F}^0$, $\mathbf{F}^0\equiv\mathbf{F}(\boldsymbol{\theta}^0)$, in the feature functions, which in turn will induce variations $\mathbf{v}^{(1)}(t)\equiv[v_1^{(1)}(t),\ldots,v_{T_H}^{(1)}(t)]^\dagger$, $v_i^{(1)}(t)\equiv h_i(t)-h_i^0(t)$, $i=1,\ldots,T_H$, around the nominal/optimal functions $\mathbf{h}^0(t)$. Subsequently, the variations $\delta\mathbf{F}$ and $\mathbf{v}^{(1)}(t)$ will induce variations in the NIE decoder’s response.
The 1st-FASAM-NIE-V methodology for computing the first-order sensitivities of the decoder’s response with respect to the NIE’s weights will be established by applying the same principles as those underlying the 1st-FASAM-N [16] methodology. These first-order sensitivities are embodied in the first-order G-variation $\delta R(\mathbf{h}^0;\mathbf{F}^0;\mathbf{v}^{(1)};\delta\mathbf{F})$ of the response $R(\mathbf{h};\mathbf{F}(\boldsymbol{\theta}))$, for variations $\mathbf{v}^{(1)}(t)$ and $\delta\mathbf{F}$ around the nominal values $\mathbf{h}^0(t)$ and $\mathbf{F}^0$, which is by definition obtained as follows:
$$\delta R\big(\mathbf{h}^0;\mathbf{F}^0;\mathbf{v}^{(1)};\delta\mathbf{F}\big)=\left\{\frac{d}{d\varepsilon}\int_{t_0}^{t_f}D\big(\mathbf{h}^0(t)+\varepsilon\mathbf{v}^{(1)}(t);\mathbf{F}^0+\varepsilon\,\delta\mathbf{F};t\big)\,dt\right\}_{\varepsilon=0}=\big\{\delta R\big(\mathbf{h}^0;\mathbf{F}^0;\delta\mathbf{F}\big)\big\}_{dir}+\big\{\delta R\big(\mathbf{h}^0;\mathbf{F}^0;\mathbf{v}^{(1)}\big)\big\}_{ind}.\tag{5}$$
In Eq. (5), the “direct-effect term” $\{\delta R(\mathbf{h}^0;\mathbf{F}^0;\delta\mathbf{F})\}_{dir}$ arises directly from variations $\delta\mathbf{F}$ (which in turn stem from parameter variations $\delta\boldsymbol{\theta}$) and is defined as follows:
$$\big\{\delta R\big(\mathbf{h}^0;\mathbf{F}^0;\delta\mathbf{F}\big)\big\}_{dir}\equiv\int_{t_0}^{t_f}\left\{\frac{\partial D\big(\mathbf{h}(t);\mathbf{F}(\boldsymbol{\theta});t\big)}{\partial\mathbf{F}}\,\delta\mathbf{F}\right\}_{(\mathbf{h}^0;\mathbf{F}^0)}dt=\sum_{j=1}^{T_F}\int_{t_0}^{t_f}\left\{\frac{\partial D\big(\mathbf{h}(t);\mathbf{F}(\boldsymbol{\theta});t\big)}{\partial F_j}\,\delta F_j\right\}_{(\mathbf{h}^0;\mathbf{F}^0)}dt,\tag{6}$$
while the “indirect-effect term” $\{\delta R(\mathbf{h}^0;\mathbf{F}^0;\mathbf{v}^{(1)})\}_{ind}$ arises through the variations $\mathbf{v}^{(1)}(t)$ in the hidden state functions $\mathbf{h}(t)$, and is defined as follows:
$$\big\{\delta R\big(\mathbf{h}^0;\mathbf{F}^0;\mathbf{v}^{(1)}\big)\big\}_{ind}\equiv\int_{t_0}^{t_f}\left\{\frac{\partial D\big(\mathbf{h}(t);\mathbf{F}(\boldsymbol{\theta});t\big)}{\partial\mathbf{h}}\,\mathbf{v}^{(1)}(t)\right\}_{(\mathbf{h}^0;\mathbf{F}^0)}dt=\sum_{j=1}^{T_H}\int_{t_0}^{t_f}\left\{\frac{\partial D\big(\mathbf{h}(t);\mathbf{F}(\boldsymbol{\theta});t\big)}{\partial h_j}\,v_j^{(1)}(t)\right\}_{(\mathbf{h}^0;\mathbf{F}^0)}dt.\tag{7}$$
The first-order relationship between the variations $\mathbf{v}^{(1)}(t)$ and $\delta\mathbf{F}$ is obtained from the first-order G-variation of Eq. (1), for $i=1,\ldots,T_H$, as follows:
$$\left\{\frac{d}{d\varepsilon}\big[h_i^0(t)+\varepsilon v_i^{(1)}(t)\big]\right\}_{\varepsilon=0}=\left\{\frac{d}{d\varepsilon}\,g_i\big(\mathbf{F}^0+\varepsilon\,\delta\mathbf{F};t\big)\right\}_{\varepsilon=0}+\left\{\frac{d}{d\varepsilon}\left[\varphi_i\big(\mathbf{F}^0+\varepsilon\,\delta\mathbf{F};t\big)\int_{t_0}^{t}\psi_i\big(h_1^0(\tau)+\varepsilon v_1^{(1)}(\tau),\ldots,h_{T_H}^0(\tau)+\varepsilon v_{T_H}^{(1)}(\tau);\mathbf{F}^0+\varepsilon\,\delta\mathbf{F};\tau\big)\,d\tau\right]\right\}_{\varepsilon=0};\quad t_0\le t\le t_f.\tag{8}$$
Performing the operations indicated in Eq. (8) yields the following NIE-V net, which will be called the “1st-Level Variational Sensitivity System” (1st-LVSS), for the components $v_i^{(1)}(t)$, $i=1,\ldots,T_H$, of the “1st-level variational function” $\mathbf{v}^{(1)}(t)$:
$$v_i^{(1)}(t)=\left\{\varphi_i(\mathbf{F};t)\sum_{j=1}^{T_H}\int_{t_0}^{t}\frac{\partial\psi_i\big(\mathbf{h}(\tau);\mathbf{F};\tau\big)}{\partial h_j(\tau)}\,v_j^{(1)}(\tau)\,d\tau\right\}_{(\mathbf{h}^0,\mathbf{F}^0)}+\sum_{k=1}^{T_F}\big\{q_{ik}(\mathbf{F};t)\,\delta F_k\big\}_{(\mathbf{h}^0;\mathbf{F}^0)},\tag{9}$$
where:
$$q_{ik}(\mathbf{F};t)\equiv\frac{\partial g_i(\mathbf{F};t)}{\partial F_k}+\frac{\partial\varphi_i(\mathbf{F};t)}{\partial F_k}\int_{t_0}^{t}\psi_i\big(\mathbf{h}(\tau);\mathbf{F};\tau\big)\,d\tau+\varphi_i(\mathbf{F};t)\int_{t_0}^{t}\frac{\partial\psi_i\big(\mathbf{h}(\tau);\mathbf{F};\tau\big)}{\partial F_k}\,d\tau.\tag{10}$$
As indicated in Eq. (9), the 1st-LVSS is to be computed at the nominal/optimal values for the respective model parameters. It is important to note that the 1st-LVSS is linear in the variational function $\mathbf{v}^{(1)}(t)$, although it generally remains nonlinear in $\mathbf{h}(t)$.
The 1st-LVSS would need to be solved anew to obtain the function $\mathbf{v}^{(1)}(t)$ that would correspond to each variation $\delta F_j$, $j=1,\ldots,T_F$; this procedure would become prohibitively expensive computationally if $T_F$ is a large number. The need for repeatedly solving the 1st-LVSS can be avoided by recasting the indirect-effect term $\{\delta R(\mathbf{h}^0;\mathbf{F}^0;\mathbf{v}^{(1)})\}_{ind}$ in terms of an expression that does not involve the function $\mathbf{v}^{(1)}(t)$. This goal can be achieved by expressing $\{\delta R(\mathbf{h}^0;\mathbf{F}^0;\mathbf{v}^{(1)})\}_{ind}$ in terms of another function, which will be called the “1st-level adjoint function,” and which will be the solution of the “1st-Level Adjoint Sensitivity System (1st-LASS)” to be constructed next.
The 1st-LASS will be constructed in a Hilbert space, denoted as $\mathsf{H}_1(\Omega_t)$, where $\Omega_t\equiv(t_0,t_f)$, comprising elements of the same form as $\mathbf{v}^{(1)}(t)\in\mathsf{H}_1(\Omega_t)$. The inner product of two elements $\boldsymbol{\chi}^{(1)}(t)\equiv[\chi_1^{(1)}(t),\ldots,\chi_{T_H}^{(1)}(t)]^\dagger\in\mathsf{H}_1(\Omega_t)$ and $\boldsymbol{\eta}^{(1)}(t)\equiv[\eta_1^{(1)}(t),\ldots,\eta_{T_H}^{(1)}(t)]^\dagger\in\mathsf{H}_1(\Omega_t)$ will be denoted as $\langle\boldsymbol{\chi}^{(1)}(t),\boldsymbol{\eta}^{(1)}(t)\rangle_1$ and is defined as follows:
$$\big\langle\boldsymbol{\chi}^{(1)}(t),\boldsymbol{\eta}^{(1)}(t)\big\rangle_1\equiv\int_{t_0}^{t_f}\big[\boldsymbol{\chi}^{(1)}(t)\big]^\dagger\boldsymbol{\eta}^{(1)}(t)\,dt=\sum_{i=1}^{T_H}\int_{t_0}^{t_f}\chi_i^{(1)}(t)\,\eta_i^{(1)}(t)\,dt.\tag{11}$$
The inner product $\langle\boldsymbol{\chi}^{(1)}(t),\boldsymbol{\eta}^{(1)}(t)\rangle_1$ is required to hold in a neighborhood of the nominal values $(\mathbf{h}^0;\mathbf{F}^0)$.
The next step is to form the inner product of Eq. (9) with a vector $\mathbf{a}^{(1)}(t)\equiv[a_1^{(1)}(t),\ldots,a_{T_H}^{(1)}(t)]^\dagger\in\mathsf{H}_1(\Omega_t)$, where the superscript “(1)” indicates “1st-level,” to obtain the following relationship:
$$\sum_{i=1}^{T_H}\int_{t_0}^{t_f}a_i^{(1)}(t)\,v_i^{(1)}(t)\,dt-\sum_{i=1}^{T_H}\int_{t_0}^{t_f}a_i^{(1)}(t)\,\varphi_i(\mathbf{F};t)\left\{\sum_{j=1}^{T_H}\int_{t_0}^{t}\frac{\partial\psi_i\big(\mathbf{h}(\tau);\mathbf{F};\tau\big)}{\partial h_j(\tau)}\,v_j^{(1)}(\tau)\,d\tau\right\}_{(\mathbf{h}^0,\mathbf{F}^0)}dt=\sum_{i=1}^{T_H}\sum_{k=1}^{T_F}\left[\int_{t_0}^{t_f}a_i^{(1)}(t)\,q_{ik}(\mathbf{F};t)\,dt\right]\delta F_k.\tag{12}$$
The second term on the left-side of Eq. (12) is transformed using “integration by parts” as follows:
$$\begin{aligned}
&\sum_{i=1}^{T_H}\int_{t_0}^{t_f}a_i^{(1)}(t)\,\varphi_i(\mathbf{F};t)\left[\sum_{j=1}^{T_H}\int_{t_0}^{t}\frac{\partial\psi_i\big(\mathbf{h}(\tau);\mathbf{F};\tau\big)}{\partial h_j(\tau)}\,v_j^{(1)}(\tau)\,d\tau\right]dt\\
&\quad=\left\{\sum_{i=1}^{T_H}\sum_{j=1}^{T_H}\int_{t_0}^{t}\frac{\partial\psi_i\big(\mathbf{h}(\tau);\mathbf{F};\tau\big)}{\partial h_j(\tau)}\,v_j^{(1)}(\tau)\,d\tau\int_{t_0}^{t}a_i^{(1)}(\tau)\,\varphi_i(\mathbf{F};\tau)\,d\tau\right\}_{t=t_f}\\
&\qquad-\left\{\sum_{i=1}^{T_H}\sum_{j=1}^{T_H}\int_{t_0}^{t}\frac{\partial\psi_i\big(\mathbf{h}(\tau);\mathbf{F};\tau\big)}{\partial h_j(\tau)}\,v_j^{(1)}(\tau)\,d\tau\int_{t_0}^{t}a_i^{(1)}(\tau)\,\varphi_i(\mathbf{F};\tau)\,d\tau\right\}_{t=t_0}\\
&\qquad-\sum_{i=1}^{T_H}\sum_{j=1}^{T_H}\int_{t_0}^{t_f}\frac{\partial\psi_i\big(\mathbf{h}(t);\mathbf{F};t\big)}{\partial h_j(t)}\,v_j^{(1)}(t)\left[\int_{t_0}^{t}a_i^{(1)}(\tau)\,\varphi_i(\mathbf{F};\tau)\,d\tau\right]dt\\
&\quad=\sum_{i=1}^{T_H}\sum_{j=1}^{T_H}\int_{t_0}^{t_f}v_j^{(1)}(t)\,\frac{\partial\psi_i\big(\mathbf{h}(t);\mathbf{F};t\big)}{\partial h_j(t)}\left[\int_{t}^{t_f}a_i^{(1)}(\tau)\,\varphi_i(\mathbf{F};\tau)\,d\tau\right]dt.
\end{aligned}\tag{13}$$
Replacing the result obtained in Eq. (13) into the left-side of Eq. (12) yields the following relation:
$$\sum_{i=1}^{T_H}\int_{t_0}^{t_f}a_i^{(1)}(t)\,v_i^{(1)}(t)\,dt-\sum_{i=1}^{T_H}\int_{t_0}^{t_f}a_i^{(1)}(t)\,\varphi_i(\mathbf{F};t)\left\{\sum_{j=1}^{T_H}\int_{t_0}^{t}\frac{\partial\psi_i\big(\mathbf{h}(\tau);\mathbf{F};\tau\big)}{\partial h_j(\tau)}\,v_j^{(1)}(\tau)\,d\tau\right\}_{(\mathbf{h}^0,\mathbf{F}^0)}dt=\sum_{i=1}^{T_H}\int_{t_0}^{t_f}v_i^{(1)}(t)\left\{a_i^{(1)}(t)-\sum_{k=1}^{T_H}\frac{\partial\psi_k\big(\mathbf{h}(t);\mathbf{F};t\big)}{\partial h_i(t)}\int_{t}^{t_f}a_k^{(1)}(\tau)\,\varphi_k(\mathbf{F};\tau)\,d\tau\right\}_{(\mathbf{h}^0,\mathbf{F}^0)}dt.\tag{14}$$
The term on the right-side of Eq. (14) is now required to represent the “indirect-effect” term defined in Eq. (7), which is achieved by requiring that the components of the function $\mathbf{a}^{(1)}(t)$ satisfy the following system of equations for $i=1,\ldots,T_H$:
$$a_i^{(1)}(t)-\sum_{k=1}^{T_H}\frac{\partial\psi_k\big(\mathbf{h}(t);\mathbf{F};t\big)}{\partial h_i(t)}\int_{t}^{t_f}a_k^{(1)}(\tau)\,\varphi_k(\mathbf{F};\tau)\,d\tau=\frac{\partial D\big(\mathbf{h}(t);\mathbf{F}(\boldsymbol{\theta});t\big)}{\partial h_i(t)}.\tag{15}$$
The Volterra-like neural system obtained in Eq. (15) will be called the “1st-Level Adjoint Sensitivity System” and its solution, $\mathbf{a}^{(1)}(t)$, will be called the “1st-level adjoint sensitivity function.” The 1st-LASS is to be solved using the nominal/optimal values for the parameters and for the function $\mathbf{h}(t)$, but this fact has not been explicitly indicated in order to simplify the notation. The 1st-LASS is linear in $\mathbf{a}^{(1)}(t)$ but is, in general, nonlinear in $\mathbf{h}(t)$. Notably, the 1st-LASS is independent of any parameter variations and needs to be solved only once to determine the 1st-level adjoint sensitivity function $\mathbf{a}^{(1)}(t)$. The 1st-LASS is a “final-value problem,” since the computation of the adjoint function $\mathbf{a}^{(1)}(t)$ will commence at $t=t_f$, with the known values $a_i^{(1)}(t_f)=\big[\partial D\big(\mathbf{h}(t);\mathbf{F}(\boldsymbol{\theta});t\big)/\partial h_i(t)\big]_{t=t_f}$.
It follows from Eqs. (12)–(15) that the indirect-effect term defined in Eq. (7) can be expressed in terms of the 1st-level adjoint sensitivity function $\mathbf{a}^{(1)}(t)$ as follows:
$$\big\{\delta R\big(\mathbf{h}^0;\mathbf{F}^0;\mathbf{a}^{(1)}\big)\big\}_{ind}=\sum_{k=1}^{T_F}\left\{\sum_{i=1}^{T_H}\int_{t_0}^{t_f}a_i^{(1)}(t)\,q_{ik}(\mathbf{F};t)\,dt\right\}_{(\mathbf{h}^0;\mathbf{F}^0)}\delta F_k.\tag{16}$$
Using the results obtained in Eqs. (16) and (6) in Eq. (5) yields the following expression for the G-variation $\delta R(\mathbf{h}^0;\mathbf{F}^0;\mathbf{v}^{(1)};\delta\mathbf{F})$, which is seen to be linear in $\delta\mathbf{F}$:
$$\delta R\big(\mathbf{h}^0;\mathbf{F}^0;\mathbf{v}^{(1)};\delta\mathbf{F}\big)=\sum_{j=1}^{T_F}\int_{t_0}^{t_f}\left\{\frac{\partial D\big(\mathbf{h}(t);\mathbf{F}(\boldsymbol{\theta});t\big)}{\partial F_j}\,\delta F_j\right\}_{(\mathbf{h}^0;\mathbf{F}^0)}dt+\sum_{i=1}^{T_H}\sum_{j=1}^{T_F}\left\{\int_{t_0}^{t_f}a_i^{(1)}(t)\,q_{ij}(\mathbf{F};t)\,dt\,\delta F_j\right\}_{(\mathbf{h}^0;\mathbf{F}^0)}\equiv\sum_{j=1}^{T_F}\left\{\frac{\partial R}{\partial F_j}\right\}_{(\mathbf{h}^0;\mathbf{F}^0)}\delta F_j.\tag{17}$$
Identifying in Eq. (17) the expressions that multiply the variations $\delta F_j$ yields the following expressions for the sensitivities $\partial R/\partial F_j$ of the response $R(\mathbf{h};\mathbf{F}(\boldsymbol{\theta}))$ with respect to the components $F_j(\boldsymbol{\theta})$ of the feature function $\mathbf{F}(\boldsymbol{\theta})$, for $j=1,\ldots,T_F$:
$$\frac{\partial R\big(\mathbf{h};\mathbf{F}(\boldsymbol{\theta})\big)}{\partial F_j}=\int_{t_0}^{t_f}\frac{\partial D\big(\mathbf{h}(t);\mathbf{F}(\boldsymbol{\theta});t\big)}{\partial F_j}\,dt+\sum_{i=1}^{T_H}\int_{t_0}^{t_f}a_i^{(1)}(t)\,\frac{\partial g_i(\mathbf{F};t)}{\partial F_j}\,dt+\sum_{i=1}^{T_H}\int_{t_0}^{t_f}a_i^{(1)}(t)\left[\frac{\partial\varphi_i(\mathbf{F};t)}{\partial F_j}\int_{t_0}^{t}\psi_i\big(\mathbf{h}(\tau);\mathbf{F};\tau\big)\,d\tau+\varphi_i(\mathbf{F};t)\int_{t_0}^{t}\frac{\partial\psi_i\big(\mathbf{h}(\tau);\mathbf{F};\tau\big)}{\partial F_j}\,d\tau\right]dt.\tag{18}$$
The expression on the right-side of Eq. (18) is to be evaluated at the nominal/optimal values for the respective model parameters, but this fact has not been indicated explicitly in order to simplify the notation.
The sensitivities with respect to the primary model parameters can be obtained by using the result obtained in Eq. (18) together with the “chain rule” of differentiating compound functions, as follows:
$$\frac{\partial R}{\partial\theta_j}=\sum_{i=1}^{T_F}\frac{\partial R}{\partial F_i}\,\frac{\partial F_i(\boldsymbol{\theta})}{\partial\theta_j},\qquad j=1,\ldots,T_W.\tag{19}$$
The sensitivities $\partial R/\partial F_j$ are obtained from Eq. (18), while the derivatives $\partial F_i/\partial\theta_j$ are obtained analytically, exactly, from the known expressions of the feature functions $F_i(\boldsymbol{\theta})$.
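The chain rule of Eq. (19) amounts to a vector-Jacobian product. The following sketch (with hypothetical feature functions and a hypothetical stand-in response, none of which come from the paper) illustrates how adjoint-computed feature-sensitivities $\partial R/\partial F_i$ would be propagated to parameter-sensitivities $\partial R/\partial\theta_j$ and checked against finite differences.

```python
import numpy as np

# Hypothetical example: T_F = 2 feature functions of T_W = 3 weights,
#   F1(theta) = theta1 * theta2,   F2(theta) = theta2 + theta3**2.
def features(theta):
    t1, t2, t3 = theta
    return np.array([t1 * t2, t2 + t3 ** 2])

def feature_jacobian(theta):
    """Analytic Jacobian dF_i/dtheta_j used in the chain rule of Eq. (19)."""
    t1, t2, t3 = theta
    return np.array([[t2, t1, 0.0],
                     [0.0, 1.0, 2.0 * t3]])

theta0 = np.array([1.5, -0.5, 2.0])
dR_dF = np.array([3.0, -1.0])      # stand-in for adjoint-computed dR/dF_i
dR_dtheta = dR_dF @ feature_jacobian(theta0)    # chain rule, Eq. (19)

# Finite-difference check, using the linear stand-in response R = dR_dF . F(theta)
eps = 1e-6
fd = np.array([(dR_dF @ features(theta0 + eps * np.eye(3)[j])
                - dR_dF @ features(theta0 - eps * np.eye(3)[j])) / (2.0 * eps)
               for j in range(3)])
```

Only the $T_F$ feature-sensitivities require (adjoint-based) quadrature; the remaining work is this analytic, essentially free, matrix-vector product.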

Particular Case: The First-Order Comprehensive Adjoint Sensitivity Analysis Methodology for Neural Integral Equations of Volterra-Type (1st-CASAM-NIE-V)

When no feature functions can be constructed from the model parameters/weights, the feature functions become identical to the parameters, i.e., $F_i(\boldsymbol{\theta})\equiv\theta_i$ for all $i=1,\ldots,T_F\equiv T_W$. In this case, the expression obtained in Eq. (18) yields directly the first-order sensitivities $\partial R/\partial\theta_j$ of the decoder response with respect to the model weights/parameters, for all $j=1,\ldots,T_W$, taking on the following specific form:
$$\frac{\partial R\big(\mathbf{h};\mathbf{F}(\boldsymbol{\theta})\big)}{\partial\theta_j}=\int_{t_0}^{t_f}\frac{\partial D\big(\mathbf{h}(t);\mathbf{F}(\boldsymbol{\theta});t\big)}{\partial\theta_j}\,dt+\sum_{i=1}^{T_H}\int_{t_0}^{t_f}a_i^{(1)}(t)\,\frac{\partial g_i(\mathbf{F};t)}{\partial\theta_j}\,dt+\sum_{i=1}^{T_H}\int_{t_0}^{t_f}a_i^{(1)}(t)\left[\frac{\partial\varphi_i(\mathbf{F};t)}{\partial\theta_j}\int_{t_0}^{t}\psi_i\big(\mathbf{h}(\tau);\mathbf{F};\tau\big)\,d\tau+\varphi_i(\mathbf{F};t)\int_{t_0}^{t}\frac{\partial\psi_i\big(\mathbf{h}(\tau);\mathbf{F};\tau\big)}{\partial\theta_j}\,d\tau\right]dt.\tag{20}$$
Since the 1st-LASS is independent of any parameter variations, the 1st-level adjoint sensitivity function $\mathbf{a}^{(1)}(t)\equiv[a_1^{(1)}(t),\ldots,a_{T_H}^{(1)}(t)]^\dagger$ which appears in Eq. (20) remains the solution of the 1st-LASS defined by Eq. (15). In this case, however, all of the sensitivities $\partial R/\partial\theta_j$, for all $j=1,\ldots,T_W$, would be obtained by computing integrals using quadrature formulas. Thus, when there are no feature functions of parameters, the 1st-FASAM-NIE-V reduces to the “First-Order Comprehensive Adjoint Sensitivity Analysis Methodology [16] applied to Neural Integral Equations of Volterra-Type” (1st-CASAM-NIE-V). On the other hand, when features of parameters can be constructed, only $T_F$ (where $T_F<T_W$) numerical computations of integrals using quadrature formulas are required, using Eq. (18) to obtain the sensitivities $\partial R/\partial F_j$, $j=1,\ldots,T_F$. Subsequently, the sensitivities with respect to the model’s weights/parameters are obtained analytically using the chain-rule provided in Eq. (19).

3. Illustrative Application of the 1st-CASAM-NIE-V and 1st-FASAM-NIE-V Methodologies to Neutron Slowing Down in an Infinite Homogeneous Hydrogenous Medium

The illustrative model considered in this Section is a Volterra-type integral equation that describes the energy distribution of neutrons in a homogeneous hydrogenous medium (such as a water-moderated/cooled reactor system) containing 238U (among other materials), which is a heavy element that strongly absorbs neutrons. The distribution of collided neutrons in such a medium is described [19,20,21] by the following linear integral equation of Volterra-type, customarily called the “neutron slowing down equation” for the neutron collision density denoted as $C(E)$:
$$C(E)=\frac{\Sigma_s(E_s)\,S}{\Sigma_t(E_s)\,E_s}+\int_{E}^{E_s}C(e)\,\frac{\Sigma_s(e)}{\Sigma_t(e)}\,\frac{de}{e}.\tag{21}$$
The various quantities that appear in Eq. (21) are defined as follows:
(i)
The quantity $S$ denotes the rate at which the source neutrons, considered to be monoenergetic, are emitted at the “source energy” $E_s$. Neutron upscattering is considered to be negligible; therefore, $E_s$ is the highest energy in the medium.
(ii)
The quantity $E$, $0<E_l\le E\le E_s$, denotes the instantaneous energy of the collided neutrons; $E_l$ denotes the lowest neutron energy in the model.
(iii)
The quantity $\Sigma_s(E)$ denotes the medium’s macroscopic scattering cross section, which is defined as follows:
$$\Sigma_s(E)\equiv\sum_{i=1}^{M}w_i\,N_i\,\sigma_s^i(E),\tag{22}$$
where $M$ denotes the number of materials in the medium, $w_i$ denotes the relative weighting of the ith-material in the medium, $N_i$ denotes the number density of the ith-material, while $\sigma_s^i(E)$ denotes the energy-dependent scattering microscopic cross section of the ith-material.
(iv)
The quantity $\Sigma_t(E)$ denotes the medium’s macroscopic total cross section, which is defined as follows:
$$\Sigma_t(E)\equiv\sum_{i=1}^{M}w_i\,N_i\,\sigma_t^i(E),\tag{23}$$
where $\sigma_t^i(E)\ge\sigma_s^i(E)$ denotes the energy-dependent total microscopic cross section of the ith-material. The quantities $w_i$, $N_i$, $\sigma_s^i(E)$, $\sigma_t^i(E)$ are subject to uncertainties since they are determined from experimentally obtained data.
Notably, the Volterra-type Eq. (21) is a “final-value problem,” since the computation is started at the highest-energy value, $C(E_s)=\Sigma_s(E_s)\,S/[\Sigma_t(E_s)\,E_s]$, and progresses towards the lowest energy value $E_l$. Customarily, the solution of Eq. (21) is written in the following form:
$$C(E)=\frac{\Sigma_s(E_s)\,S}{\Sigma_t(E_s)\,E}\,\exp\left[-\int_{E}^{E_s}\frac{\Sigma_a(e)}{\Sigma_t(e)}\,\frac{de}{e}\right],\tag{24}$$
where $\Sigma_a(E)\equiv\Sigma_t(E)-\Sigma_s(E)$ denotes the medium’s macroscopic absorption cross section. The expression provided in Eq. (24) is amenable to computations of the loss of neutrons due to absorbing materials, particularly in the so-called “resonance” energy region.
A typical “decoder response” for the NIE-Volterra network modeled by Eq. (21) is the energy-averaged collision density, denoted below as $R(C(E))$, which would be measured by a detector having an interaction cross-section $\Sigma_d$. Mathematically, this detector-response can be expressed as follows:
$$R\big(C(E)\big)\equiv\int_{E_l}^{E_s}\Sigma_d(E)\,C(E)\,dE;\qquad\Sigma_d(E)\equiv N_d\,\sigma_d(E),\tag{25}$$
where $N_d$ and $\sigma_d(E)$ denote, respectively, the detector material’s atomic number density and the microscopic cross section describing the interaction (e.g., absorption) of neutrons with the detector’s material; $N_d$ and $\sigma_d(E)$ can be considered as the “weights” that characterize the neural net’s “decoder.”
Since the energy-dependence of the cross sections does not play a significant role in the sensitivity analysis of the NIE-Volterra modeled by Eq. (21), the respective microscopic cross-sections will henceforth be considered to be energy-independent for the purpose of illustrating the application of the 1st-FASAM-NIE-V, in order to simplify the ensuing derivations. For energy-independent cross sections, Eqs. (21) and (25) take on the following forms, respectively:
$$C(E)=F(\boldsymbol{\theta})\left[\frac{S}{E_s}+\int_{E}^{E_s}C(e)\,\frac{de}{e}\right],\tag{26}$$
$$R\big(C(E)\big)=\Sigma_d(\boldsymbol{\theta})\int_{E_l}^{E_s}C(E)\,dE.\tag{27}$$
In Eqs. (26) and (27), the source strength $S$ is an imprecisely-known “weight” that characterizes the neural net’s “encoder.” Furthermore, the (column) vector of parameters denoted as $\boldsymbol{\theta}\equiv(\theta_1,\ldots,\theta_{T_W})^\dagger$ comprises as components the “imprecisely known primary model parameters” (or “weights,” as customarily called when referring to neural nets) and is defined as follows:
$$\boldsymbol{\theta}\equiv(\theta_1,\ldots,\theta_{T_W})^\dagger\equiv\big(w_1,\ldots,w_M;\,N_1,\ldots,N_M;\,\sigma_t^1,\ldots,\sigma_t^M;\,\sigma_s^1,\ldots,\sigma_s^M;\,N_d,\sigma_d\big)^\dagger,\tag{28}$$
where $T_W\equiv 4M+2$ denotes the “total number of imprecisely-known weights/parameters.” These primary model parameters/weights are not known exactly but are affected by uncertainties since they stem from experimental procedures, which determine the nominal/mean/optimal values and the second-order moments of their unknown joint distributions; their third- and higher-order moments are rarely known. It is convenient to denote the nominal values of these primary model parameters/weights by using the superscript “zero,” as follows:
$$\boldsymbol{\theta}^0\equiv(\theta_1^0,\ldots,\theta_{T_W}^0)^\dagger\equiv\big(w_1^0,\ldots,w_M^0;\,N_1^0,\ldots,N_M^0;\,\sigma_t^{1,0},\ldots,\sigma_t^{M,0};\,\sigma_s^{1,0},\ldots,\sigma_s^{M,0};\,N_d^0,\sigma_d^0\big)^\dagger.\tag{29}$$
The “feature function of primary parameters,” $F(\boldsymbol{\theta})$, is defined as follows:
$$F(\boldsymbol{\theta})\equiv\Sigma_s(\boldsymbol{\theta})/\Sigma_t(\boldsymbol{\theta})<1.\tag{30}$$
The closed-form solution of Eq. (26) has the following expression in terms of the feature function $F(\boldsymbol{\theta})$:
$$C(E)=\frac{S}{E_s}\,F(\boldsymbol{\theta})\left(\frac{E_s}{E}\right)^{F(\boldsymbol{\theta})}.\tag{31}$$
The closed-form expression of the decoder response can be readily obtained by replacing the result obtained in Eq. (31) into Eq. (27) and performing the integration over the energy-variable to obtain:
$$R\big(C(E)\big)=\frac{S\,\Sigma_d(\boldsymbol{\theta})\,F(\boldsymbol{\theta})}{1-F(\boldsymbol{\theta})}\left[1-\left(\frac{E_s}{E_l}\right)^{F(\boldsymbol{\theta})-1}\right].\tag{32}$$
The expression obtained in Eq. (32) reveals that the imprecisely known quantities that affect the decoder-response $R(C(E))$ are as follows:
(i)
the source strength $S$;
(ii)
the detector interaction macroscopic cross section $\Sigma_d(\boldsymbol{\theta})$, which can be considered to be a “feature function” of the model parameters $\boldsymbol{\theta}$;
(iii)
the feature function $F(\boldsymbol{\theta})\equiv\Sigma_s(\boldsymbol{\theta})/\Sigma_t(\boldsymbol{\theta})$.
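The closed-form expressions in Eqs. (31) and (32) can be checked numerically. The sketch below (with assumed illustrative values for $F$, $S$, $\Sigma_d$, $E_l$ and $E_s$; none of these numbers come from the paper) verifies that Eq. (31) satisfies Eq. (26) to quadrature accuracy and that the quadrature of Eq. (27) reproduces Eq. (32).

```python
import numpy as np

# Illustrative (assumed) values: F = Sigma_s/Sigma_t = 0.9, S = 1 neutron/s,
# Sigma_d = 0.05, and a slowing-down range [E_l, E_s] = [1 eV, 1 MeV].
F, S, Sd = 0.9, 1.0, 0.05
E_l, E_s = 1.0, 1.0e6

E = np.geomspace(E_l, E_s, 20001)       # logarithmic grid suits the de/e kernel
C = (S / E_s) * F * (E_s / E) ** F       # closed-form collision density, Eq. (31)

def cumtrapz(y, x):
    """Cumulative trapezoidal integral of y(x) from x[0] to each grid point."""
    return np.concatenate(([0.0], np.cumsum(0.5 * np.diff(x) * (y[1:] + y[:-1]))))

# Residual of Eq. (26): C(E) - F * [ S/E_s + int_E^{E_s} C(e) de/e ]
cum = cumtrapz(C / E, E)
tail = cum[-1] - cum                     # int_E^{E_s} C(e) de/e
resid = np.max(np.abs(C - F * (S / E_s + tail)))

# Decoder response: quadrature of Eq. (27) versus the closed form of Eq. (32)
R_quad = Sd * cumtrapz(C, E)[-1]
R_closed = S * Sd * F / (1.0 - F) * (1.0 - (E_s / E_l) ** (F - 1.0))
```

Both checks should agree to within the trapezoidal quadrature error of the logarithmic grid.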

3.1. Application of 1st-CASAM-NIE-V to Directly Compute the First-Order Sensitivities of the Decoder Response with Respect to the Primary Model Parameters

The first-order sensitivities of the decoder response $R(C(E))$ with respect to the model parameters are obtained by applying the definition of the G-differential to Eq. (26), for arbitrary parameter variations $\delta\boldsymbol{\theta}\equiv(\delta\theta_1,\ldots,\delta\theta_{T_W})^\dagger\equiv\boldsymbol{\theta}-\boldsymbol{\theta}^0\equiv(\theta_1-\theta_1^0,\ldots,\theta_{T_W}-\theta_{T_W}^0)^\dagger$ around the parameters’ nominal values. These parameter variations will induce variations $\delta C(E)\equiv C(E)-C^0(E)$ in the neutron collision density, around the nominal value $C^0(E)$ of the neutron collision density. The variations $\delta\boldsymbol{\theta}$ and $\delta C(E)$ will induce variations $\delta R(C^0(E);\boldsymbol{\theta}^0;\delta C(E);\delta\boldsymbol{\theta})$ in the decoder’s response.
The first-order Gateaux (G-)variation $\delta R(C^0(E);\boldsymbol{\theta}^0;\delta C(E);\delta\boldsymbol{\theta})$ is obtained, by definition, from Eq. (27) as follows:
$$\delta R\big(C^0;\boldsymbol{\theta}^0;\delta C;\delta\boldsymbol{\theta}\big)\equiv\left\{\frac{d}{d\varepsilon}\,\Sigma_d\big(\boldsymbol{\theta}^0+\varepsilon\,\delta\boldsymbol{\theta}\big)\int_{E_l}^{E_s}\big[C^0(E)+\varepsilon\,\delta C(E)\big]dE\right\}_{\varepsilon=0}=\big\{\delta R\big(C^0;\boldsymbol{\theta}^0;\delta\boldsymbol{\theta}\big)\big\}_{dir}+\big\{\delta R\big(C^0;\boldsymbol{\theta}^0;\delta C\big)\big\}_{ind},\tag{33}$$
where the “direct effect” term $\{\delta R(C^0;\boldsymbol{\theta}^0;\delta\boldsymbol{\theta})\}_{dir}$ arises directly from parameter variations $\delta\boldsymbol{\theta}$ and is defined as follows:
$$\big\{\delta R\big(C^0;\boldsymbol{\theta}^0;\delta\boldsymbol{\theta}\big)\big\}_{dir}\equiv\int_{E_l}^{E_s}C^0(E)\,dE\sum_{i=1}^{T_W}\left\{\frac{\partial\Sigma_d(\boldsymbol{\theta})}{\partial\theta_i}\right\}_{\boldsymbol{\theta}=\boldsymbol{\theta}^0}\delta\theta_i=\big\{\delta N_d\,\sigma_d+N_d\,\delta\sigma_d\big\}_{\boldsymbol{\theta}=\boldsymbol{\theta}^0}\int_{E_l}^{E_s}C^0(E)\,dE,\tag{34}$$
while the indirect effect term arises from the variations $\delta C(E)$ and is defined as follows:
$$\big\{\delta R\big(C^0;\boldsymbol{\theta}^0;\delta C\big)\big\}_{ind}\equiv\Sigma_d\big(\boldsymbol{\theta}^0\big)\int_{E_l}^{E_s}\delta C(E)\,dE.\tag{35}$$
As indicated in Eqs. (34) and (35), both the direct-effect and the indirect-effect terms are to be evaluated at the nominal parameter values.
The first-order relation between the variation $\delta C(E)$ and the parameter variations $\delta\theta_i$ is obtained by evaluating the G-variation of Eq. (26) for variations $\delta\boldsymbol{\theta}$ around the nominal parameter values $\boldsymbol{\theta}^0$, which yields, by definition, the following NIE-Volterra equation for $\delta C(E)$:
$$\delta C(E)\equiv\left\{\frac{d}{d\varepsilon}\,F\big(\boldsymbol{\theta}^0+\varepsilon\,\delta\boldsymbol{\theta}\big)\left[\frac{S^0+\varepsilon\,\delta S}{E_s}+\int_{E}^{E_s}\big[C^0(e)+\varepsilon\,\delta C(e)\big]\frac{de}{e}\right]\right\}_{\varepsilon=0}=F\big(\boldsymbol{\theta}^0\big)\int_{E}^{E_s}\delta C(e)\,\frac{de}{e}+Q(E),\tag{36}$$
where:
$$Q(E)\equiv F\big(\boldsymbol{\theta}^0\big)\frac{\delta S}{E_s}+\left[\frac{S^0}{E_s}+\int_{E}^{E_s}C^0(e)\,\frac{de}{e}\right]\sum_{i=1}^{T_W}\left\{\frac{\partial F(\boldsymbol{\theta})}{\partial\theta_i}\,\delta\theta_i\right\}_{\boldsymbol{\theta}=\boldsymbol{\theta}^0}=F\big(\boldsymbol{\theta}^0\big)\frac{\delta S}{E_s}+\frac{S^0}{E_s}\left(\frac{E_s}{E}\right)^{F(\boldsymbol{\theta}^0)}\sum_{i=1}^{T_W}\left\{\frac{\partial F(\boldsymbol{\theta})}{\partial\theta_i}\,\delta\theta_i\right\}_{\boldsymbol{\theta}=\boldsymbol{\theta}^0}.\tag{37}$$
The second equality in Eq. (37) has been obtained by using Eqs. (26) and (31) to eliminate the integral term involving C 0 e .
The particular form of the first-order derivative $\partial F(\boldsymbol{\theta})/\partial\theta_i$, which appears in Eq. (37), is obtained by using the definition of $F(\boldsymbol{\theta})$ provided in Eq. (30), which yields the following expression:
$$\frac{\partial F(\boldsymbol{\theta})}{\partial\theta_i}=\frac{1}{\Sigma_t(\boldsymbol{\theta})}\,\frac{\partial\Sigma_s(\boldsymbol{\theta})}{\partial\theta_i}-\frac{\Sigma_s(\boldsymbol{\theta})}{\Sigma_t^2(\boldsymbol{\theta})}\,\frac{\partial\Sigma_t(\boldsymbol{\theta})}{\partial\theta_i}.\tag{38}$$
In view of the definition provided in Eq. (22), the derivatives $\partial\Sigma_s(\boldsymbol{\theta})/\partial\theta_i$ have the following particular expressions:
$$\text{for }i=1,\ldots,M:\quad\theta_i\equiv w_i;\qquad\frac{\partial\Sigma_s(\boldsymbol{\theta})}{\partial\theta_i}=N_i\,\sigma_s^i;\tag{39}$$
$$\text{for }i=M+1,\ldots,2M:\quad\theta_i\equiv N_i;\qquad\frac{\partial\Sigma_s(\boldsymbol{\theta})}{\partial\theta_i}=w_i\,\sigma_s^i;\tag{40}$$
$$\text{for }i=3M+1,\ldots,4M:\quad\theta_i\equiv\sigma_s^i;\qquad\frac{\partial\Sigma_s(\boldsymbol{\theta})}{\partial\theta_i}=w_i\,N_i.\tag{41}$$
In view of the definition provided in Eq. (23), the derivatives $\partial\Sigma_t(\boldsymbol{\theta})/\partial\theta_i$ have the following particular expressions:
$$\text{for }i=1,\ldots,M:\quad\theta_i\equiv w_i;\qquad\frac{\partial\Sigma_t(\boldsymbol{\theta})}{\partial\theta_i}=N_i\,\sigma_t^i;\tag{42}$$
$$\text{for }i=M+1,\ldots,2M:\quad\theta_i\equiv N_i;\qquad\frac{\partial\Sigma_t(\boldsymbol{\theta})}{\partial\theta_i}=w_i\,\sigma_t^i;\tag{43}$$
$$\text{for }i=2M+1,\ldots,3M:\quad\theta_i\equiv\sigma_t^i;\qquad\frac{\partial\Sigma_t(\boldsymbol{\theta})}{\partial\theta_i}=w_i\,N_i.\tag{44}$$
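The derivative expressions in Eqs. (38)-(44) can be spot-checked against finite differences. The sketch below uses randomly generated (assumed) material data and verifies the analytic $\partial F/\partial w_i$, assembled from Eqs. (38), (39) and (42), against central differences.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 3
w  = rng.uniform(0.1, 1.0, M)           # relative weightings (assumed values)
N  = rng.uniform(0.5, 2.0, M)           # number densities (assumed values)
ss = rng.uniform(0.1, 1.0, M)           # sigma_s^i (assumed values)
st = ss + rng.uniform(0.1, 1.0, M)      # sigma_t^i >= sigma_s^i

Sigma_s = lambda w, N, ss: np.sum(w * N * ss)   # Eq. (22)
Sigma_t = lambda w, N, st: np.sum(w * N * st)   # Eq. (23)
F = lambda w, N, ss, st: Sigma_s(w, N, ss) / Sigma_t(w, N, st)   # Eq. (30)

# Analytic dF/dw_i from Eq. (38) with Eqs. (39) and (42):
#   dSigma_s/dw_i = N_i sigma_s^i,  dSigma_t/dw_i = N_i sigma_t^i
St, Ss = Sigma_t(w, N, st), Sigma_s(w, N, ss)
dF_dw = (N * ss) / St - Ss / St**2 * (N * st)

# Central finite-difference check
eps = 1e-7
fd = np.empty(M)
for i in range(M):
    wp, wm = w.copy(), w.copy()
    wp[i] += eps
    wm[i] -= eps
    fd[i] = (F(wp, N, ss, st) - F(wm, N, ss, st)) / (2.0 * eps)
```

The analogous checks for the $N_i$, $\sigma_s^i$ and $\sigma_t^i$ derivatives follow the same pattern.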
The NIE-Volterra net represented by Eq. (36) will be called the “1st-Level Variational Sensitivity System (1st-LVSS)” and its solution, $\delta C(E)$, will be called the “1st-level variational sensitivity function.” It is evident that Eq. (36) would need to be solved $(T_W+1)$ times in order to obtain the variation $\delta C(E)$ for the source variation $\delta S$ and for every parameter variation $\delta\theta_i$. This need for repeatedly solving Eq. (36) can be circumvented by applying the principles of the 1st-CASAM-NIE-V, generally outlined in Section 2, to eliminate the appearance of the variation $\delta C(E)$ in the indirect-effect term defined in Eq. (35), while expressing this indirect-effect term as a functional of a first-level adjoint function that does not depend on any parameter variation, as follows:
  • Consider that the variational function $\delta C(E)$ belongs to a Hilbert space denoted as $H_1(\Omega_E)$, which is defined on the domain $\Omega_E \equiv (E_l, E_s)$. The inner product in $H_1(\Omega_E)$ of two functions $u(E) \in H_1(\Omega_E)$ and $v(E) \in H_1(\Omega_E)$ will be denoted as $\langle u(E), v(E)\rangle_1$ and is defined as follows:
$$\left\langle u(E), v(E)\right\rangle_1 \equiv \int_{E_l}^{E_s} u(E)\,v(E)\,dE. \tag{45}$$
  • Form the inner product of Eq. (36) with a vector $a^{(1)}(E) \in H_1(\Omega_E)$, where the superscript “(1)” indicates “1st-level,” to obtain the following relationship:
$$\left\langle a^{(1)}(E),\ \delta C(E) - F(\theta^0)\int_E^{E_s}\delta C(e)\,\frac{de}{e}\right\rangle_1 = \left\langle a^{(1)}(E),\ Q(E)\right\rangle_1. \tag{46}$$
  • Transform the left-side of Eq. (46) as follows:
$$\left\langle a^{(1)}(E),\ \delta C(E) - F(\theta^0)\int_E^{E_s}\delta C(e)\,\frac{de}{e}\right\rangle_1 = \int_{E_l}^{E_s} a^{(1)}(E)\,\delta C(E)\,dE - F(\theta^0)\int_{E_l}^{E_s} a^{(1)}(E)\,dE\int_E^{E_s}\delta C(e)\,\frac{de}{e}$$
$$= \int_{E_l}^{E_s}\delta C(E)\,a^{(1)}(E)\,dE - F(\theta^0)\int_{E_l}^{E_s}\frac{\delta C(E)}{E}\,dE\int_{E_l}^{E} a^{(1)}(e)\,de = \int_{E_l}^{E_s}\delta C(E)\left[a^{(1)}(E) - \frac{F(\theta^0)}{E}\int_{E_l}^{E} a^{(1)}(e)\,de\right]dE. \tag{47}$$
  • Require the last term in Eq. (47) to represent the indirect-effect term defined in Eq. (35), which yields the following “1st-Level Adjoint Sensitivity System (1st-LASS)” for the first-level adjoint sensitivity function $a^{(1)}(E)$:
$$a^{(1)}(E) - \frac{F(\theta^0)}{E}\int_{E_l}^{E} a^{(1)}(e)\,de = \Sigma_d(\theta^0). \tag{50}$$
    The 1st-LASS represented by Eq. (50) is a linear NIE-Volterra net, which is independent of any parameter variation and needs to be solved just once to obtain the first-level adjoint sensitivity function $a^{(1)}(E)$. Notably, the 1st-LASS is an “initial-value problem,” in that the computation of $a^{(1)}(E)$ commences at the lowest-energy value, where $a^{(1)}(E_l) = \Sigma_d(\theta^0)$, and progresses towards the highest-energy value, $E_s$. For further reference, the closed-form solution of Eq. (50) can be obtained by differentiating this equation with respect to $E$ and subsequently integrating the resulting first-order linear differential equation, to obtain the following exact expression:
$$a^{(1)}(E) = \frac{\Sigma_d(\theta)}{1-F(\theta)}\left[1 - F(\theta)\left(\frac{E}{E_l}\right)^{F(\theta)-1}\right]. \tag{51}$$
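The closed form above can be confirmed numerically. The following sketch (with assumed illustrative values for $F(\theta)$, $\Sigma_d$, $E_l$, and $E_s$) substitutes Eq. (51) into the 1st-LASS, Eq. (50), and checks that the residual vanishes to quadrature accuracy:

```python
import numpy as np

# Assumed illustrative values: F in (0,1) for a hydrogenous medium (not from the paper).
F, Sig_d, El, Es = 0.9, 0.1, 1.0, 100.0

def a1(E):
    """Closed-form 1st-level adjoint sensitivity function, Eq. (51)."""
    return Sig_d / (1.0 - F) * (1.0 - F * (E / El) ** (F - 1.0))

# Check that a1 satisfies the 1st-LASS, Eq. (50):
#   a1(E) - (F/E) * int_{El}^{E} a1(e) de = Sig_d  for all E in [El, Es].
for E in [2.0, 10.0, 50.0, Es]:
    e = np.linspace(El, E, 200001)
    vals = a1(e)
    integral = 0.5 * (e[1] - e[0]) * np.sum(vals[:-1] + vals[1:])  # trapezoid rule
    residual = a1(E) - (F / E) * integral - Sig_d
    assert abs(residual) < 1e-6

# Initial value of the "initial-value problem": a1(El) = Sig_d, as stated in the text.
assert abs(a1(El) - Sig_d) < 1e-12
```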
    The expression on the right-side of Eq. (51) is to be evaluated at the nominal parameter values $\theta^0$, but the superscript “zero” has been omitted for notational simplicity.
  • Using Eqs. (46), (47) and (50) yields the following expression for the indirect-effect term defined in Eq. (35):
$$\left\{\delta R(C;\theta;\delta C)\right\}_{ind} = \int_{E_l}^{E_s} a^{(1)}(E)\,Q(E)\,dE = \frac{F(\theta)\,\delta S}{E_s}\int_{E_l}^{E_s} a^{(1)}(E)\,dE + \frac{S}{E_s}\sum_{i=1}^{TW}\frac{\partial F(\theta)}{\partial\theta_i}\,\delta\theta_i\int_{E_l}^{E_s} a^{(1)}(E)\left(\frac{E_s}{E}\right)^{F(\theta)}dE. \tag{52}$$
    The expression on the right-side of Eq. (52) is to be evaluated at the nominal parameter values $\theta^0$, but the superscript “zero” has been omitted for notational simplicity.
  • Adding the expression obtained in Eq. (52) to the expression of the direct-effect term represented by Eq. (34) yields the following expression for the first-order G-variation $\delta R(C^0;\theta^0;\delta C;\delta\theta)$:
$$\delta R(C^0;\theta^0;\delta C;\delta\theta) = \left(\delta N_d\,\sigma_d + N_d\,\delta\sigma_d\right)\int_{E_l}^{E_s} C(E)\,dE + \frac{F(\theta)\,\delta S}{E_s}\int_{E_l}^{E_s} a^{(1)}(E)\,dE + \frac{S}{E_s}\sum_{i=1}^{TW}\frac{\partial F(\theta)}{\partial\theta_i}\,\delta\theta_i\int_{E_l}^{E_s} a^{(1)}(E)\left(\frac{E_s}{E}\right)^{F(\theta)}dE. \tag{53}$$
  • It follows from Eq. (53) that the first-order sensitivities of the decoder response with respect to the (encoder’s) source strength and the optimal weights/parameters have the following expressions:
$$\frac{\partial R}{\partial S} = \frac{F(\theta)}{E_s}\int_{E_l}^{E_s} a^{(1)}(E)\,dE\,; \tag{54}$$
$$\frac{\partial R}{\partial N_d} = \sigma_d\int_{E_l}^{E_s} C(E)\,dE\,; \tag{55}$$
$$\frac{\partial R}{\partial\sigma_d} = N_d\int_{E_l}^{E_s} C(E)\,dE\,; \tag{56}$$
$$\frac{\partial R}{\partial\theta_i} = \frac{S}{E_s}\,\frac{\partial F(\theta)}{\partial\theta_i}\int_{E_l}^{E_s} a^{(1)}(E)\left(\frac{E_s}{E}\right)^{F(\theta)}dE\,;\qquad i=1,\ldots,TW. \tag{57}$$
Inserting into Eqs. (54)‒(57) the closed-form expression for the neutron collision density obtained in Eq. (31) yields the following closed-form explicit expressions for the first-order sensitivities of the decoder response with respect to the (encoder’s) source strength and the optimal weights/parameters:
$$\frac{\partial R}{\partial S} = \Sigma_d(\theta)\,\frac{F(\theta)}{1-F(\theta)}\left[1-\left(\frac{E_s}{E_l}\right)^{F(\theta)-1}\right]; \tag{58}$$
$$\frac{\partial R}{\partial N_d} = \sigma_d\,S\,\frac{F(\theta)}{1-F(\theta)}\left[1-\left(\frac{E_s}{E_l}\right)^{F(\theta)-1}\right]; \tag{59}$$
$$\frac{\partial R}{\partial\sigma_d} = N_d\,S\,\frac{F(\theta)}{1-F(\theta)}\left[1-\left(\frac{E_s}{E_l}\right)^{F(\theta)-1}\right]; \tag{60}$$
$$\frac{\partial R}{\partial\theta_i} = \frac{\partial F(\theta)}{\partial\theta_i}\,\frac{S\,\Sigma_d(\theta)}{1-F(\theta)}\left\{\frac{1-\left(E_s/E_l\right)^{F(\theta)-1}}{1-F(\theta)} - F(\theta)\left(\frac{E_s}{E_l}\right)^{F(\theta)-1}\ln\frac{E_s}{E_l}\right\};\qquad i=1,\ldots,TW. \tag{61}$$
The correctness of the expressions obtained in Eqs. (58)‒(61) can be readily verified by differentiating the expressions of the decoder’s response obtained in Eq. (32).
In practice, only the exact mathematical expression of the 1st-LASS, namely Eq. (50), and the exact mathematical expressions of the first-order sensitivities obtained in Eqs. (54)‒(57) are available. The solution of the 1st-LASS, which is a linear NIE-Volterra net for the first-level adjoint sensitivity function $a^{(1)}(E)$, would need to be obtained numerically in practice. The numerical solution for $a^{(1)}(E)$ would be used to determine the first-order sensitivities stemming from the “indirect-effect” term by using quadrature formulas to evaluate the integrals appearing in Eqs. (54)‒(57). It is very important to note that a single “large-scale” computation, for determining numerically the adjoint function $a^{(1)}(E)$ by solving the 1st-LASS (a NIE-Volterra-type equation), would be needed for evaluating all of the first-order sensitivities. The numerical computations using quadrature formulas for evaluating the integrals in Eqs. (54)‒(57) are considered to be “small-scale” computations.
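As a concrete illustration of such a “large-scale” computation, the sketch below marches the Volterra initial-value problem of Eq. (50) upward from $E_l$ on a uniform grid, using the trapezoid rule for the running integral, and then evaluates the “small-scale” quadrature of Eq. (54); the parameter values and the discretization scheme are assumptions for illustration, and the closed forms of Eqs. (51) and (58) are used only for verification:

```python
import numpy as np

# Assumed illustrative values (not from the paper).
F, Sig_d, S, El, Es, n = 0.9, 0.1, 1.0, 1.0, 100.0, 20000
E = np.linspace(El, Es, n + 1)
h = E[1] - E[0]

# March the Volterra "initial-value" problem, Eq. (50), from El upward:
#   a(E) - (F/E) * int_{El}^{E} a(e) de = Sig_d,
# treating the newest grid point implicitly in the trapezoid rule.
a = np.empty(n + 1)
a[0] = Sig_d                 # initial value a(El) = Sig_d
A = 0.0                      # running integral int_{El}^{E_k} a de
for k in range(1, n + 1):
    rhs = Sig_d + (F / E[k]) * (A + 0.5 * h * a[k - 1])
    a[k] = rhs / (1.0 - 0.5 * h * F / E[k])
    A += 0.5 * h * (a[k - 1] + a[k])

# Verify against the closed form, Eq. (51):
exact = Sig_d / (1.0 - F) * (1.0 - F * (E / El) ** (F - 1.0))
assert np.max(np.abs(a - exact)) < 1e-4

# "Small-scale" quadrature, Eq. (54): dR/dS = (F/Es) * int a1 dE,
# checked against the closed form of Eq. (58).
dR_dS = (F / Es) * 0.5 * h * np.sum(a[:-1] + a[1:])
exact_dR_dS = Sig_d * F / (1.0 - F) * (1.0 - (Es / El) ** (F - 1.0))
assert abs(dR_dS - exact_dR_dS) < 1e-4
```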
As has already been observed in the brief remarks following Eq. (37), the first-order sensitivities of the decoder response with respect to the encoder source strength $S$ and the model weights/parameters could also have been computed by repeatedly solving, numerically, the NIE-Volterra net (1st-LVSS) represented by Eq. (36). This procedure would be very expensive computationally, since it would require $TW+1$ large-scale computations to solve the 1st-LVSS defined by Eq. (36) in order to obtain the variation $\delta C(E)$ for every parameter variation $\delta\theta_i$ and for the source variation $\delta S$. In addition, the same amount of “quadrature” computations would need to be performed using Eq. (35) as would be needed for evaluating the first-order sensitivities using Eqs. (54)‒(57).

3.2. Efficient Indirect Computation Using the 1st-FASAM-NIE-V of the First-Order Sensitivities of the Decoder Response with Respect to Primary Model Parameters

When feature functions of model parameters, such as $\Sigma_d(\theta)$ and $F(\theta)$, can be identified, as is the case with the NIE-Volterra net and decoder response represented by Eqs. (26) and (27), respectively, it is considerably more efficient to determine the first-order sensitivities of the decoder response with respect to the feature functions and subsequently derive analytically the sensitivities with respect to the primary model parameters by using the “chain rule of differentiation,” as will be shown in this Section. Thus, considering arbitrary variations $\delta\Sigma_d(\theta) \equiv \Sigma_d(\theta) - \Sigma_d(\theta^0)$ and $\delta F(\theta) \equiv F(\theta) - F(\theta^0)$ around the nominal values $\Sigma_d(\theta^0)$ and $F(\theta^0)$, respectively, the first-order G-variation of the decoder response has the following expression:
$$\delta R(C^0;\theta^0;\delta C;\delta\Sigma_d) \equiv \frac{d}{d\varepsilon}\left\{\left[\Sigma_d(\theta^0)+\varepsilon\,\delta\Sigma_d(\theta^0)\right]\int_{E_l}^{E_s}\left[C^0(E)+\varepsilon\,\delta C(E)\right]dE\right\}_{\varepsilon=0} = \delta\Sigma_d\left[\int_{E_l}^{E_s} C(E)\,dE\right]_{\theta=\theta^0} + \left\{\delta R(C^0;\theta^0;\delta C)\right\}_{ind}, \tag{62}$$
where the expression of the indirect effect term is defined in Eq. (35). The first-order relation between the variation δ C E and the variations δ Σ d θ and δ F θ is obtained, by definition, from Eq. (26) as follows:
$$\delta C(E) \equiv \frac{d}{d\varepsilon}\left\{\frac{\left[F(\theta^0)+\varepsilon\,\delta F(\theta)\right]\left[S^0+\varepsilon\,\delta S\right]}{E_s} + \left[F(\theta^0)+\varepsilon\,\delta F(\theta)\right]\int_E^{E_s}\left[C^0(e)+\varepsilon\,\delta C(e)\right]\frac{de}{e}\right\}_{\varepsilon=0} = F(\theta^0)\int_E^{E_s}\delta C(e)\,\frac{de}{e} + Q(E), \tag{63}$$
where:
$$Q(E) \equiv \frac{F(\theta^0)\,\delta S}{E_s} + \frac{S^0}{E_s}\left(\frac{E_s}{E}\right)^{F(\theta)}\left[\delta F(\theta)\right]_{\theta=\theta^0}. \tag{64}$$
Comparing Eq. (63) to Eq. (36) indicates that the only difference between these equations is the expression of the term Q E , which is expressed in terms of δ F θ in Eq. (64). Consequently, the first-level adjoint sensitivity function that corresponds to the variational function δ C E is determined by following the same procedure as outlined in Eqs. (46)‒(50), ultimately obtaining the same 1st-LASS as was obtained in Eq. (50), having as solution the same expression for a 1 E as was obtained in Eq. (51). It further follows that the expression of the indirect-effect term will have the following expression:
$$\left\{\delta R(C;\theta;\delta C)\right\}_{ind} = \delta S\,\frac{F(\theta)}{E_s}\int_{E_l}^{E_s} a^{(1)}(E)\,dE + \delta F(\theta)\,\frac{S}{E_s}\int_{E_l}^{E_s} a^{(1)}(E)\left(\frac{E_s}{E}\right)^{F(\theta)}dE. \tag{65}$$
It follows from Eqs. (62) and (65) that the first-order G-variation δ R C 0 ; θ 0 ; δ C ; δ Σ d has the following expression:
$$\delta R(C^0;\theta^0;\delta C;\delta\Sigma_d) = \delta\Sigma_d\left[\int_{E_l}^{E_s} C(E)\,dE\right]_{\theta=\theta^0} + \delta S\,\frac{F(\theta)}{E_s}\int_{E_l}^{E_s} a^{(1)}(E)\,dE + \delta F(\theta)\,\frac{S}{E_s}\int_{E_l}^{E_s} a^{(1)}(E)\left(\frac{E_s}{E}\right)^{F(\theta)}dE. \tag{66}$$
As indicated by the expression obtained in Eq. (66), the first-order sensitivities of the decoder response with respect to the feature functions and the encoder’s source strength are as follows:
$$\frac{\partial R}{\partial F(\theta)} = \frac{S}{E_s}\int_{E_l}^{E_s} a^{(1)}(E)\left(\frac{E_s}{E}\right)^{F(\theta)}dE\,; \tag{67}$$
$$\frac{\partial R}{\partial\Sigma_d(\theta)} = \int_{E_l}^{E_s} C(E)\,dE\,; \tag{68}$$
$$\frac{\partial R}{\partial S} = \frac{F(\theta)}{E_s}\int_{E_l}^{E_s} a^{(1)}(E)\,dE\,. \tag{69}$$
The closed-form expressions of the above sensitivities are readily determined by using in Eqs. (67)‒(69) the expressions obtained in Eqs. (51) and (31), and by performing the respective integrations, to obtain:
$$\frac{\partial R}{\partial F(\theta)} = \frac{S\,\Sigma_d(\theta)}{1-F(\theta)}\left\{\frac{1-\left(E_s/E_l\right)^{F(\theta)-1}}{1-F(\theta)} - F(\theta)\left(\frac{E_s}{E_l}\right)^{F(\theta)-1}\ln\frac{E_s}{E_l}\right\}; \tag{70}$$
$$\frac{\partial R}{\partial\Sigma_d(\theta)} = S\,\frac{F(\theta)}{1-F(\theta)}\left[1-\left(\frac{E_s}{E_l}\right)^{F(\theta)-1}\right]; \tag{71}$$
$$\frac{\partial R}{\partial S} = \Sigma_d(\theta)\,\frac{F(\theta)}{1-F(\theta)}\left[1-\left(\frac{E_s}{E_l}\right)^{F(\theta)-1}\right]. \tag{72}$$
The first-order sensitivities with respect to the primary parameters are obtained analytically from Eqs. (67) and (68), respectively, by using the following “chain rule” of differentiation:
$$\frac{\partial R}{\partial N_d} = \frac{\partial R}{\partial\Sigma_d(\theta)}\,\frac{\partial\Sigma_d(\theta)}{\partial N_d} = \sigma_d\,\frac{\partial R}{\partial\Sigma_d(\theta)}\,; \tag{73}$$
$$\frac{\partial R}{\partial\sigma_d} = \frac{\partial R}{\partial\Sigma_d(\theta)}\,\frac{\partial\Sigma_d(\theta)}{\partial\sigma_d} = N_d\,\frac{\partial R}{\partial\Sigma_d(\theta)}\,; \tag{74}$$
$$\frac{\partial R}{\partial\theta_i} = \frac{\partial R}{\partial F(\theta)}\,\frac{\partial F(\theta)}{\partial\theta_i}\,;\qquad i=1,\ldots,4M. \tag{75}$$
The specific expressions of the first-order sensitivities $\partial R/\partial\theta_i$, $i=1,\ldots,4M$, are obtained by using Eq. (75) in conjunction with Eq. (70) and Eqs. (38)‒(44).
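A minimal sketch of this chain-rule step, with hypothetical parameter values ($M=2$; all numbers made up for illustration), is shown below; the feature-level sensitivities are taken from the closed forms of Eqs. (70)–(72), and one primary-parameter sensitivity is verified by a finite difference of the closed-form response:

```python
import numpy as np

# Hypothetical illustrative data (M = 2): theta = (w, N, sigma_t, sigma_s), plus N_d, sigma_d, S.
w, N = np.array([1.0, 1.0]), np.array([0.05, 0.02])
ss, st = np.array([20.0, 4.0]), np.array([22.0, 5.0])
Nd, sd, S, El, Es = 0.2, 0.5, 1.0, 1.0, 100.0

Sig_s, Sig_t = np.sum(w * N * ss), np.sum(w * N * st)
F, Sig_d = Sig_s / Sig_t, Nd * sd

# Feature-level sensitivities: closed forms of Eqs. (70)-(72).
rho = (Es / El) ** (F - 1.0)
dR_dSigd = S * F / (1.0 - F) * (1.0 - rho)
dR_dF = S * Sig_d / (1.0 - F) * ((1.0 - rho) / (1.0 - F) - F * rho * np.log(Es / El))

# Chain rule, Eqs. (73)-(75): sensitivities with respect to primary parameters.
dF = {'w': N * ss / Sig_t - Sig_s / Sig_t**2 * N * st,
      'N': w * ss / Sig_t - Sig_s / Sig_t**2 * w * st,
      'st': -Sig_s / Sig_t**2 * w * N,
      'ss': w * N / Sig_t}
dR_dtheta = {k: dR_dF * v for k, v in dF.items()}
dR_dNd, dR_dsd = sd * dR_dSigd, Nd * dR_dSigd

# Finite-difference verification for sigma_{s,1}, using the closed-form response
# R = Sig_d * S * F/(1-F) * [1 - (Es/El)**(F-1)] obtained from Eqs. (31)/(32).
def R(ss1):
    Fp = (w[0] * N[0] * ss1 + w[1] * N[1] * ss[1]) / Sig_t
    return Sig_d * S * Fp / (1.0 - Fp) * (1.0 - (Es / El) ** (Fp - 1.0))

eps = 1e-6
fd = (R(ss[0] + eps) - R(ss[0] - eps)) / (2.0 * eps)
assert abs(fd - dR_dtheta['ss'][0]) < 1e-6
```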

3.3. Discussion: Direct Versus Indirect Computation of the First-Order Sensitivities of the Decoder Response with Respect to the Primary Model Parameters

The principles of the 1st-CASAM-NIE-V were applied in Section 3.1 to determine the first-order sensitivities of the decoder response directly with respect to the model’s primary parameters/weights. It has been shown that this procedure requires a single “large-scale” computation to solve a NIE-Volterra equation in order to determine the (single) 1st-level adjoint sensitivity function a 1 E , which is subsequently used in 4 M + 1 integrals that are computed using quadrature formulas. The two additional first-order sensitivities with respect to the components of Σ d θ require a single quadrature involving the forward function C E .
The principles of the 1st-FASAM-NIE-V were applied in Section 3.2 to determine the first-order sensitivities of the decoder response with respect to the feature functions. This path required just two (as opposed to $4M+1$) numerical evaluations of integrals using quadrature formulas involving the 1st-level adjoint sensitivity function $a^{(1)}(E)$. The sensitivities of the decoder response with respect to the primary parameters/weights were subsequently determined analytically, using the “chain rule of differentiation” applied to the explicitly-known expression of the feature function $F(\theta)$. Evaluating the two additional first-order sensitivities with respect to the components of $\Sigma_d(\theta)$ requires a single quadrature involving the forward function $C(E)$, as in Section 3.1. Evidently, the indirect path presented in Section 3.2 is computationally more efficient, since it requires substantially fewer numerical quadratures than the path presented in Section 3.1. The superiority of the indirect path, via “feature functions,” over the direct computation of sensitivities with respect to the model parameters will be considerably more evident for the computation of second-order sensitivities, as will be shown in the forthcoming Section 4 and Section 5, below.
Of course, when no feature functions can be identified, the 1st-FASAM-NIE-V methodology becomes identical to the 1st-CASAM-NIE-V methodology.

4. The Second-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integral Equations of Volterra-Type (2nd-FASAM-NIE-V)

The second-order sensitivities of the response $R[h;F(\theta)]$ defined in Eq. (3) will be computed by conceptually using their basic definition as the “first-order sensitivities of the first-order sensitivities.” Thus, the second-order sensitivities stemming from the first-order sensitivities $\partial R[h;F(\theta)]/\partial F_j$ are obtained from the first-order G-differential of Eq. (18), for $j=1,\ldots,TF$, as follows:
$$\delta\left[\frac{\partial R}{\partial F_j}\right] \equiv \frac{d}{d\varepsilon}\left\{\int_{t_0}^{t_f}\frac{\partial D\left[h^0(t)+\varepsilon v^{(1)}(t);F(\theta^0)+\varepsilon\,\delta F;t\right]}{\partial F_j}\,dt\right\}_{\varepsilon=0} + \sum_{i=1}^{TH}\frac{d}{d\varepsilon}\left\{\int_{t_0}^{t_f}\left[a_i^{(1),0}(t)+\varepsilon\,\delta a_i^{(1)}(t)\right]\frac{\partial g_i\left(F^0+\varepsilon\,\delta F;t\right)}{\partial F_j}\,dt\right\}_{\varepsilon=0}$$
$$+ \sum_{i=1}^{TH}\frac{d}{d\varepsilon}\left\{\int_{t_0}^{t_f}\left[a_i^{(1),0}(t)+\varepsilon\,\delta a_i^{(1)}(t)\right]\frac{\partial\varphi_i\left(F^0+\varepsilon\,\delta F;t\right)}{\partial F_j}\int_{t_0}^{t}\psi_i\left[h^0(\tau)+\varepsilon v^{(1)}(\tau);F^0+\varepsilon\,\delta F;\tau\right]d\tau\,dt\right\}_{\varepsilon=0}$$
$$+ \sum_{i=1}^{TH}\frac{d}{d\varepsilon}\left\{\int_{t_0}^{t_f}\left[a_i^{(1),0}(t)+\varepsilon\,\delta a_i^{(1)}(t)\right]\varphi_i\left(F^0+\varepsilon\,\delta F;t\right)\int_{t_0}^{t}\frac{\partial\psi_i\left[h^0(\tau)+\varepsilon v^{(1)}(\tau);F^0+\varepsilon\,\delta F;\tau\right]}{\partial F_j}\,d\tau\,dt\right\}_{\varepsilon=0}$$
$$\equiv \left\{\delta\left[\partial R/\partial F_j\right]\right\}_{dir} + \left\{\delta\left[\partial R/\partial F_j\right]\right\}_{ind}\,;\qquad j=1,\ldots,TF. \tag{76}$$
In Eq. (76), the expression of the direct-effect term δ R / F j d i r is obtained after performing the operations with respect to the scalar ε and comprises the variations δ F (stemming from variations in the model parameters), being defined as follows:
$$\left\{\delta\left[\frac{\partial R}{\partial F_j}\right]\right\}_{dir} \equiv \sum_{k=1}^{TF}\int_{t_0}^{t_f}\frac{\partial^2 D\left[h(t);F(\theta);t\right]}{\partial F_k\,\partial F_j}\,\delta F_k\,dt + \sum_{i=1}^{TH}\int_{t_0}^{t_f} a_i^{(1)}(t)\sum_{k=1}^{TF}\frac{\partial^2 g_i(F;t)}{\partial F_k\,\partial F_j}\,\delta F_k\,dt + \sum_{i=1}^{TH}\int_{t_0}^{t_f} a_i^{(1)}(t)\sum_{k=1}^{TF}\frac{\partial^2\varphi_i(F;t)}{\partial F_k\,\partial F_j}\,\delta F_k\int_{t_0}^{t}\psi_i\left[h(\tau);F;\tau\right]d\tau\,dt$$
$$+ \sum_{i=1}^{TH}\int_{t_0}^{t_f} a_i^{(1)}(t)\,\frac{\partial\varphi_i(F;t)}{\partial F_j}\int_{t_0}^{t}\sum_{k=1}^{TF}\frac{\partial\psi_i\left[h(\tau);F;\tau\right]}{\partial F_k}\,\delta F_k\,d\tau\,dt + \sum_{i=1}^{TH}\int_{t_0}^{t_f} a_i^{(1)}(t)\sum_{k=1}^{TF}\frac{\partial\varphi_i(F;t)}{\partial F_k}\,\delta F_k\int_{t_0}^{t}\frac{\partial\psi_i\left[h(\tau);F;\tau\right]}{\partial F_j}\,d\tau\,dt$$
$$+ \sum_{i=1}^{TH}\int_{t_0}^{t_f} a_i^{(1)}(t)\,\varphi_i(F;t)\int_{t_0}^{t}\sum_{k=1}^{TF}\frac{\partial^2\psi_i\left[h(\tau);F;\tau\right]}{\partial F_k\,\partial F_j}\,\delta F_k\,d\tau\,dt\,;\qquad j=1,\ldots,TF. \tag{77}$$
The expression on the right-side of Eq. (77) is to be evaluated at the nominal/optimal values for the respective model parameters, but this fact has not been indicated explicitly in order to simplify the notation.
The expression of the indirect-effect term $\{\delta[\partial R/\partial F_j]\}_{ind}$ defined in Eq. (76) is obtained after performing the operations with respect to the scalar $\varepsilon$; it comprises the variations $v^{(1)}(t)$ and $\delta a^{(1)}(t) \equiv \left[\delta a_1^{(1)}(t),\ldots,\delta a_{TH}^{(1)}(t)\right]$, as follows:
$$\left\{\delta\left[\frac{\partial R}{\partial F_j}\right]\right\}_{ind} \equiv \sum_{k=1}^{TH}\int_{t_0}^{t_f}\frac{\partial^2 D\left[h(t);F(\theta);t\right]}{\partial h_k(t)\,\partial F_j}\,v_k^{(1)}(t)\,dt + \sum_{i=1}^{TH}\int_{t_0}^{t_f}\delta a_i^{(1)}(t)\,\frac{\partial g_i(F;t)}{\partial F_j}\,dt + \sum_{i=1}^{TH}\int_{t_0}^{t_f}\delta a_i^{(1)}(t)\,\frac{\partial\varphi_i(F;t)}{\partial F_j}\int_{t_0}^{t}\psi_i\left[h(\tau);F;\tau\right]d\tau\,dt$$
$$+ \sum_{i=1}^{TH}\int_{t_0}^{t_f} a_i^{(1)}(t)\,\frac{\partial\varphi_i(F;t)}{\partial F_j}\int_{t_0}^{t}\sum_{k=1}^{TH}\frac{\partial\psi_i\left[h(\tau);F;\tau\right]}{\partial h_k(\tau)}\,v_k^{(1)}(\tau)\,d\tau\,dt + \sum_{i=1}^{TH}\int_{t_0}^{t_f}\delta a_i^{(1)}(t)\,\varphi_i(F;t)\int_{t_0}^{t}\frac{\partial\psi_i\left[h(\tau);F;\tau\right]}{\partial F_j}\,d\tau\,dt$$
$$+ \sum_{i=1}^{TH}\int_{t_0}^{t_f} a_i^{(1)}(t)\,\varphi_i(F;t)\int_{t_0}^{t}\sum_{k=1}^{TH}\frac{\partial^2\psi_i\left[h(\tau);F;\tau\right]}{\partial h_k(\tau)\,\partial F_j}\,v_k^{(1)}(\tau)\,d\tau\,dt\,;\qquad j=1,\ldots,TF. \tag{78}$$
The expressions in Eq. (78) are to be evaluated at the nominal values of the respective functions and parameters, but the respective indication (i.e., the superscript “zero”) has been omitted in order to simplify the notation.
The direct-effect term δ R / F j d i r can be evaluated at this time for all variations δ F , but the indirect-effect term δ R / F j i n d can be evaluated only after having determined the variations v 1 t and δ a ( 1 ) t . The variation v 1 t is the solution of the 1st-LVSS defined by Eq. (9). On the other hand, the variational function δ a ( 1 ) t is the solution of the system of equations obtained by G-differentiating the 1st-LASS. By definition, the G-differential of Eq. (15) is obtained as follows, for i = 1 , ... , T H :
$$\frac{d}{d\varepsilon}\left[a_i^{(1),0}(t)+\varepsilon\,\delta a_i^{(1)}(t)\right]_{\varepsilon=0} - \frac{d}{d\varepsilon}\left\{\sum_{k=1}^{TH}\frac{\partial\psi_k\left[h^0(t)+\varepsilon v^{(1)}(t);F^0+\varepsilon\,\delta F;t\right]}{\partial h_i(t)}\int_t^{t_f}\left[a_k^{(1),0}(\tau)+\varepsilon\,\delta a_k^{(1)}(\tau)\right]\varphi_k\left(F^0+\varepsilon\,\delta F;\tau\right)d\tau\right\}_{\varepsilon=0}$$
$$= \frac{d}{d\varepsilon}\left\{\frac{\partial D\left[h^0(t)+\varepsilon v^{(1)}(t);F^0+\varepsilon\,\delta F;t\right]}{\partial h_i(t)}\right\}_{\varepsilon=0}. \tag{79}$$
Performing the operations indicated in Eq. (79) and rearranging the various terms yields the following relations, for i = 1 , ... , T H :
$$\delta a_i^{(1)}(t) - \sum_{k=1}^{TH}\frac{\partial\psi_k\left[h(t);F;t\right]}{\partial h_i(t)}\int_t^{t_f}\delta a_k^{(1)}(\tau)\,\varphi_k(F;\tau)\,d\tau - \sum_{m=1}^{TH}\frac{\partial^2 D\left[h(t);F;t\right]}{\partial h_m(t)\,\partial h_i(t)}\,v_m^{(1)}(t)$$
$$- \sum_{k=1}^{TH}\sum_{m=1}^{TH}\frac{\partial^2\psi_k\left[h(t);F;t\right]}{\partial h_m(t)\,\partial h_i(t)}\,v_m^{(1)}(t)\int_t^{t_f} a_k^{(1)}(\tau)\,\varphi_k(F;\tau)\,d\tau = \sum_{n=1}^{TF} S_{in}(F;t)\,\delta F_n\,, \tag{80}$$
where:
$$\sum_{n=1}^{TF} S_{in}(F;t)\,\delta F_n \equiv \sum_{k=1}^{TH}\sum_{n=1}^{TF}\frac{\partial^2\psi_k\left[h(t);F;t\right]}{\partial F_n\,\partial h_i(t)}\,\delta F_n\int_t^{t_f} a_k^{(1)}(\tau)\,\varphi_k(F;\tau)\,d\tau + \sum_{k=1}^{TH}\frac{\partial\psi_k\left[h(t);F;t\right]}{\partial h_i(t)}\sum_{n=1}^{TF}\int_t^{t_f} a_k^{(1)}(\tau)\,\frac{\partial\varphi_k(F;\tau)}{\partial F_n}\,\delta F_n\,d\tau + \sum_{n=1}^{TF}\frac{\partial^2 D\left[h(t);F;t\right]}{\partial F_n\,\partial h_i(t)}\,\delta F_n\,. \tag{81}$$
As indicated by the result obtained in Eq. (80), the variations $\delta a^{(1)}(t)$ are coupled to the variations $v^{(1)}(t)$. Therefore, they can be obtained by simultaneously solving Eqs. (80) and (9), which together will be called the “2nd-Level Variational Sensitivity System (2nd-LVSS).” The solution of the 2nd-LVSS, namely the vector $v^{(2)}(t) \equiv \left[v^{(1)}(t), \delta a^{(1)}(t)\right]$, will be called the “2nd-level variational sensitivity function.” Since the 2nd-LVSS depends on the variations $\delta F$ (stemming from variations in the model parameters), it would need to be solved anew for each such variation. The repeated solving of the 2nd-LVSS can be avoided by following the general principles underlying the 2nd-FASAM [16], which considers the function $v^{(2)}(t)$ to be an element in a Hilbert space denoted as $H_2(\Omega_t)$. The Hilbert space $H_2(\Omega_t)$ is endowed with an inner product, denoted as $\langle\chi^{(2)},\eta^{(2)}\rangle_2$, between two vectors $\chi^{(2)}(t) \equiv \left[\chi_1^{(2)}(t), \chi_2^{(2)}(t)\right] \in H_2(\Omega_t)$ and $\eta^{(2)}(t) \equiv \left[\eta_1^{(2)}(t), \eta_2^{(2)}(t)\right] \in H_2(\Omega_t)$, with components $\chi_m^{(2)}(t) \equiv \left[\chi_{m,1}^{(2)}(t),\ldots,\chi_{m,TH}^{(2)}(t)\right]$ and $\eta_m^{(2)}(t) \equiv \left[\eta_{m,1}^{(2)}(t),\ldots,\eta_{m,TH}^{(2)}(t)\right]$, $m=1,2$, which is defined as follows:
$$\left\langle\chi^{(2)},\eta^{(2)}\right\rangle_2 \equiv \int_{t_0}^{t_f}\chi^{(2)}(t)\cdot\eta^{(2)}(t)\,dt = \left\langle\chi_1^{(2)},\eta_1^{(2)}\right\rangle_1 + \left\langle\chi_2^{(2)},\eta_2^{(2)}\right\rangle_1 = \sum_{j=1}^{TH}\int_{t_0}^{t_f}\chi_{1,j}^{(2)}(t)\,\eta_{1,j}^{(2)}(t)\,dt + \sum_{j=1}^{TH}\int_{t_0}^{t_f}\chi_{2,j}^{(2)}(t)\,\eta_{2,j}^{(2)}(t)\,dt. \tag{82}$$
Following the general principles underlying the 2nd-FASAM [16], the function $v^{(2)}(t) \equiv \left[v^{(1)}(t), \delta a^{(1)}(t)\right]$ will be eliminated from the expression of each indirect-effect term $\{\delta[\partial R/\partial F_j]\}_{ind}$, $j=1,\ldots,TF$, defined in Eq. (78). This elimination is achieved by considering, for each index $j=1,\ldots,TF$, a vector-valued function denoted as $a^{(2)}(t;j) \equiv \left[a_1^{(2)}(t;j), a_2^{(2)}(t;j)\right] \in H_2(\Omega_t)$, with components $a_m^{(2)}(t;j) \equiv \left[a_{m,1}^{(2)}(t;j),\ldots,a_{m,TH}^{(2)}(t;j)\right]$, $m=1,2$. Using the definition provided in Eq. (82), we construct the inner products of Eqs. (9) and (80) with the components of $a^{(2)}(t;j)$, to obtain the following relation:
$$\sum_{i=1}^{TH}\int_{t_0}^{t_f} a_{1,i}^{(2)}(t;j)\,v_i^{(1)}(t)\,dt - \sum_{i=1}^{TH}\int_{t_0}^{t_f} a_{1,i}^{(2)}(t;j)\,\varphi_i(F;t)\int_{t_0}^{t}\sum_{k=1}^{TH}\frac{\partial\psi_i\left[h(\tau);F;\tau\right]}{\partial h_k(\tau)}\,v_k^{(1)}(\tau)\,d\tau\,dt$$
$$+ \sum_{i=1}^{TH}\int_{t_0}^{t_f} a_{2,i}^{(2)}(t;j)\,\delta a_i^{(1)}(t)\,dt - \sum_{i=1}^{TH}\int_{t_0}^{t_f} a_{2,i}^{(2)}(t;j)\sum_{k=1}^{TH}\frac{\partial\psi_k\left[h(t);F;t\right]}{\partial h_i(t)}\int_t^{t_f}\delta a_k^{(1)}(\tau)\,\varphi_k(F;\tau)\,d\tau\,dt$$
$$- \sum_{i=1}^{TH}\int_{t_0}^{t_f} a_{2,i}^{(2)}(t;j)\sum_{m=1}^{TH}\frac{\partial^2 D\left[h(t);F;t\right]}{\partial h_m(t)\,\partial h_i(t)}\,v_m^{(1)}(t)\,dt - \sum_{i=1}^{TH}\int_{t_0}^{t_f} a_{2,i}^{(2)}(t;j)\sum_{k=1}^{TH}\sum_{m=1}^{TH}\frac{\partial^2\psi_k\left[h(t);F;t\right]}{\partial h_m(t)\,\partial h_i(t)}\,v_m^{(1)}(t)\int_t^{t_f} a_k^{(1)}(\tau)\,\varphi_k(F;\tau)\,d\tau\,dt = Q^{(2)}, \tag{83}$$
where:
$$Q^{(2)} \equiv \sum_{i=1}^{TH}\int_{t_0}^{t_f} a_{1,i}^{(2)}(t;j)\sum_{k=1}^{TF} q_{ik}(F;t)\,\delta F_k\,dt + \sum_{i=1}^{TH}\int_{t_0}^{t_f} a_{2,i}^{(2)}(t;j)\sum_{k=1}^{TF} S_{ik}(F;t)\,\delta F_k\,dt. \tag{84}$$
Following the principles of the 2nd-CASAM [16], the left-side of Eq. (83) will be identified with the indirect-effect term defined in Eq. (78), thereby determining the (as yet undetermined) functions $a^{(2)}(t;j) = \left[a_1^{(2)}(t;j), a_2^{(2)}(t;j)\right]$. For this purpose, the right-side of Eq. (78) is cast in the form of the inner product $\langle v^{(2)}(t),\cdot\rangle_2 = \langle v^{(1)}(t),\cdot\rangle_1 + \langle\delta a^{(1)}(t),\cdot\rangle_1$. The terms on the right-side of Eq. (78) involving the components of the function $\delta a^{(1)}(t)$ are already in the desired format, but the terms involving the components of the function $v^{(1)}(t)$ must be rearranged, as follows:
(i)
The fourth term on the right-side of Eq. (78) is recast by using “integration by parts” as follows:
$$\sum_{i=1}^{TH}\int_{t_0}^{t_f} a_i^{(1)}(t)\,\frac{\partial\varphi_i(F;t)}{\partial F_j}\int_{t_0}^{t}\sum_{k=1}^{TH}\frac{\partial\psi_i\left[h(\tau);F;\tau\right]}{\partial h_k(\tau)}\,v_k^{(1)}(\tau)\,d\tau\,dt = \sum_{i=1}^{TH}\sum_{k=1}^{TH}\left\{\int_{t_0}^{t} a_i^{(1)}(\tau)\,\frac{\partial\varphi_i(F;\tau)}{\partial F_j}\,d\tau\int_{t_0}^{t}\frac{\partial\psi_i\left[h(\tau);F;\tau\right]}{\partial h_k(\tau)}\,v_k^{(1)}(\tau)\,d\tau\right\}_{t=t_0}^{t=t_f}$$
$$- \sum_{i=1}^{TH}\sum_{k=1}^{TH}\int_{t_0}^{t_f}\frac{\partial\psi_i\left[h(t);F;t\right]}{\partial h_k(t)}\,v_k^{(1)}(t)\int_{t_0}^{t} a_i^{(1)}(\tau)\,\frac{\partial\varphi_i(F;\tau)}{\partial F_j}\,d\tau\,dt = \sum_{i=1}^{TH}\sum_{k=1}^{TH}\int_{t_0}^{t_f}\frac{\partial\psi_i\left[h(t);F;t\right]}{\partial h_k(t)}\,v_k^{(1)}(t)\int_t^{t_f} a_i^{(1)}(\tau)\,\frac{\partial\varphi_i(F;\tau)}{\partial F_j}\,d\tau\,dt. \tag{85}$$
(ii)
The sixth (last) term on the right-side of Eq. (78) is recast by using “integration by parts,” as above, to obtain the following relation:
$$\sum_{i=1}^{TH}\int_{t_0}^{t_f} a_i^{(1)}(t)\,\varphi_i(F;t)\int_{t_0}^{t}\sum_{k=1}^{TH}\frac{\partial^2\psi_i\left[h(\tau);F;\tau\right]}{\partial h_k(\tau)\,\partial F_j}\,v_k^{(1)}(\tau)\,d\tau\,dt = \sum_{i=1}^{TH}\sum_{k=1}^{TH}\int_{t_0}^{t_f}\frac{\partial^2\psi_i\left[h(t);F;t\right]}{\partial h_k(t)\,\partial F_j}\,v_k^{(1)}(t)\int_t^{t_f} a_i^{(1)}(\tau)\,\varphi_i(F;\tau)\,d\tau\,dt. \tag{86}$$
Using in Eq. (78) the results obtained in Eqs. (85) and (86) yields the following expression for the indirect-effect term, for j = 1 , ... , T F :
$$\left\{\delta\left[\frac{\partial R}{\partial F_j}\right]\right\}_{ind} = \sum_{k=1}^{TH}\int_{t_0}^{t_f} v_k^{(1)}(t)\left\{\frac{\partial^2 D\left[h(t);F(\theta);t\right]}{\partial h_k(t)\,\partial F_j} + \sum_{i=1}^{TH}\frac{\partial\psi_i\left[h(t);F;t\right]}{\partial h_k(t)}\int_t^{t_f} a_i^{(1)}(\tau)\,\frac{\partial\varphi_i(F;\tau)}{\partial F_j}\,d\tau + \sum_{i=1}^{TH}\frac{\partial^2\psi_i\left[h(t);F;t\right]}{\partial h_k(t)\,\partial F_j}\int_t^{t_f} a_i^{(1)}(\tau)\,\varphi_i(F;\tau)\,d\tau\right\}dt$$
$$+ \sum_{k=1}^{TH}\int_{t_0}^{t_f}\delta a_k^{(1)}(t)\left\{\frac{\partial g_k(F;t)}{\partial F_j} + \frac{\partial\varphi_k(F;t)}{\partial F_j}\int_{t_0}^{t}\psi_k\left[h(\tau);F;\tau\right]d\tau + \varphi_k(F;t)\int_{t_0}^{t}\frac{\partial\psi_k\left[h(\tau);F;\tau\right]}{\partial F_j}\,d\tau\right\}dt. \tag{87}$$
The left-side of Eq. (83) is now recast in the form of the inner product $\langle v^{(2)}(t),\cdot\rangle_2$ by performing the following operations:
(i)
The second term on the left-side of Eq. (83) is rearranged by using “integration by parts” as follows:
$$\sum_{i=1}^{TH}\int_{t_0}^{t_f} a_{1,i}^{(2)}(t;j)\,\varphi_i(F;t)\int_{t_0}^{t}\sum_{k=1}^{TH}\frac{\partial\psi_i\left[h(\tau);F;\tau\right]}{\partial h_k(\tau)}\,v_k^{(1)}(\tau)\,d\tau\,dt = \sum_{k=1}^{TH}\sum_{i=1}^{TH}\int_{t_0}^{t_f} v_k^{(1)}(t)\,\frac{\partial\psi_i\left[h(t);F;t\right]}{\partial h_k(t)}\int_t^{t_f} a_{1,i}^{(2)}(\tau;j)\,\varphi_i(F;\tau)\,d\tau\,dt. \tag{88}$$
(ii)
The fourth term on the left-side of Eq. (83) is rearranged by using “integration by parts” as follows:
$$\sum_{i=1}^{TH}\int_{t_0}^{t_f} a_{2,i}^{(2)}(t;j)\sum_{k=1}^{TH}\frac{\partial\psi_k\left[h(t);F;t\right]}{\partial h_i(t)}\int_t^{t_f}\delta a_k^{(1)}(\tau)\,\varphi_k(F;\tau)\,d\tau\,dt = \sum_{k=1}^{TH}\sum_{i=1}^{TH}\int_{t_0}^{t_f}\delta a_k^{(1)}(t)\,\varphi_k(F;t)\int_{t_0}^{t} a_{2,i}^{(2)}(\tau;j)\,\frac{\partial\psi_k\left[h(\tau);F;\tau\right]}{\partial h_i(\tau)}\,d\tau\,dt. \tag{89}$$
(iii)
The fifth term on the left-side of Eq. (83) is rearranged as follows:
$$\sum_{i=1}^{TH}\int_{t_0}^{t_f} a_{2,i}^{(2)}(t;j)\sum_{k=1}^{TH}\frac{\partial^2 D\left[h(t);F;t\right]}{\partial h_k(t)\,\partial h_i(t)}\,v_k^{(1)}(t)\,dt = \sum_{k=1}^{TH}\int_{t_0}^{t_f} v_k^{(1)}(t)\sum_{i=1}^{TH}\frac{\partial^2 D\left[h(t);F;t\right]}{\partial h_k(t)\,\partial h_i(t)}\,a_{2,i}^{(2)}(t;j)\,dt. \tag{90}$$
(iv)
The sixth term on the left-side of Eq. (83) is rearranged as follows:
$$\sum_{i=1}^{TH}\int_{t_0}^{t_f} a_{2,i}^{(2)}(t;j)\sum_{k=1}^{TH}\sum_{m=1}^{TH}\frac{\partial^2\psi_k\left[h(t);F;t\right]}{\partial h_m(t)\,\partial h_i(t)}\,v_m^{(1)}(t)\int_t^{t_f} a_k^{(1)}(\tau)\,\varphi_k(F;\tau)\,d\tau\,dt = \sum_{k=1}^{TH}\int_{t_0}^{t_f} v_k^{(1)}(t)\sum_{i=1}^{TH}\sum_{n=1}^{TH}\frac{\partial^2\psi_n\left[h(t);F;t\right]}{\partial h_k(t)\,\partial h_i(t)}\,a_{2,i}^{(2)}(t;j)\int_t^{t_f} a_n^{(1)}(\tau)\,\varphi_n(F;\tau)\,d\tau\,dt. \tag{91}$$
Inserting the results obtained in Eqs. (88)‒(91) into the left-side of Eq. (83) yields the following relation:
$$Q^{(2)} = \sum_{k=1}^{TH}\int_{t_0}^{t_f} v_k^{(1)}(t)\left\{a_{1,k}^{(2)}(t;j) - \sum_{i=1}^{TH}\frac{\partial\psi_i\left[h(t);F;t\right]}{\partial h_k(t)}\int_t^{t_f} a_{1,i}^{(2)}(\tau;j)\,\varphi_i(F;\tau)\,d\tau - \sum_{i=1}^{TH}\frac{\partial^2 D\left[h(t);F;t\right]}{\partial h_k(t)\,\partial h_i(t)}\,a_{2,i}^{(2)}(t;j) - \sum_{i=1}^{TH}\sum_{n=1}^{TH}\frac{\partial^2\psi_n\left[h(t);F;t\right]}{\partial h_k(t)\,\partial h_i(t)}\,a_{2,i}^{(2)}(t;j)\int_t^{t_f} a_n^{(1)}(\tau)\,\varphi_n(F;\tau)\,d\tau\right\}dt$$
$$+ \sum_{k=1}^{TH}\int_{t_0}^{t_f}\delta a_k^{(1)}(t)\left\{a_{2,k}^{(2)}(t;j) - \varphi_k(F;t)\sum_{i=1}^{TH}\int_{t_0}^{t} a_{2,i}^{(2)}(\tau;j)\,\frac{\partial\psi_k\left[h(\tau);F;\tau\right]}{\partial h_i(\tau)}\,d\tau\right\}dt. \tag{92}$$
The right-side of Eq. (92) can now be required to represent the indirect-effect term defined in Eq. (87), by imposing the requirement that the hitherto arbitrary function $a^{(2)}(t;j) = \left[a_1^{(2)}(t;j), a_2^{(2)}(t;j)\right]$ be the solution of the following NIE-Volterra equations, for $k=1,\ldots,TH$; $j=1,\ldots,TF$:
$$a_{1,k}^{(2)}(t;j) - \sum_{i=1}^{TH}\frac{\partial\psi_i\left[h(t);F;t\right]}{\partial h_k(t)}\int_t^{t_f} a_{1,i}^{(2)}(\tau;j)\,\varphi_i(F;\tau)\,d\tau - \sum_{i=1}^{TH}\frac{\partial^2 D\left[h(t);F;t\right]}{\partial h_k(t)\,\partial h_i(t)}\,a_{2,i}^{(2)}(t;j) - \sum_{i=1}^{TH}\sum_{n=1}^{TH}\frac{\partial^2\psi_n\left[h(t);F;t\right]}{\partial h_k(t)\,\partial h_i(t)}\,a_{2,i}^{(2)}(t;j)\int_t^{t_f} a_n^{(1)}(\tau)\,\varphi_n(F;\tau)\,d\tau$$
$$= \frac{\partial^2 D\left[h(t);F(\theta);t\right]}{\partial h_k(t)\,\partial F_j} + \sum_{i=1}^{TH}\frac{\partial\psi_i\left[h(t);F;t\right]}{\partial h_k(t)}\int_t^{t_f} a_i^{(1)}(\tau)\,\frac{\partial\varphi_i(F;\tau)}{\partial F_j}\,d\tau + \sum_{i=1}^{TH}\frac{\partial^2\psi_i\left[h(t);F;t\right]}{\partial h_k(t)\,\partial F_j}\int_t^{t_f} a_i^{(1)}(\tau)\,\varphi_i(F;\tau)\,d\tau\,; \tag{93}$$
$$a_{2,k}^{(2)}(t;j) - \varphi_k(F;t)\sum_{i=1}^{TH}\int_{t_0}^{t} a_{2,i}^{(2)}(\tau;j)\,\frac{\partial\psi_k\left[h(\tau);F;\tau\right]}{\partial h_i(\tau)}\,d\tau = \frac{\partial g_k(F;t)}{\partial F_j} + \frac{\partial\varphi_k(F;t)}{\partial F_j}\int_{t_0}^{t}\psi_k\left[h(\tau);F;\tau\right]d\tau + \varphi_k(F;t)\int_{t_0}^{t}\frac{\partial\psi_k\left[h(\tau);F;\tau\right]}{\partial F_j}\,d\tau. \tag{94}$$
It follows from Eqs. (92)‒(94) that the indirect-effect term $\{\delta[\partial R/\partial F_j]\}_{ind}$ defined by Eq. (78) or, equivalently, Eq. (87), can be expressed in terms of the function $a^{(2)}(t;j) = \left[a_1^{(2)}(t;j), a_2^{(2)}(t;j)\right]$ as follows, for $j=1,\ldots,TF$:
$$\left\{\delta\left[\frac{\partial R}{\partial F_j}\right]\right\}_{ind} = \sum_{i=1}^{TH}\int_{t_0}^{t_f} a_{1,i}^{(2)}(t;j)\sum_{n=1}^{TF} q_{in}(F;t)\,\delta F_n\,dt + \sum_{i=1}^{TH}\int_{t_0}^{t_f} a_{2,i}^{(2)}(t;j)\sum_{n=1}^{TF} S_{in}(F;t)\,\delta F_n\,dt$$
$$= \sum_{i=1}^{TH}\int_{t_0}^{t_f} a_{1,i}^{(2)}(t;j)\sum_{n=1}^{TF}\delta F_n\left\{\frac{\partial g_i(F;t)}{\partial F_n} + \frac{\partial\varphi_i(F;t)}{\partial F_n}\int_{t_0}^{t}\psi_i\left[h(\tau);F;\tau\right]d\tau + \varphi_i(F;t)\int_{t_0}^{t}\frac{\partial\psi_i\left[h(\tau);F;\tau\right]}{\partial F_n}\,d\tau\right\}dt$$
$$+ \sum_{i=1}^{TH}\int_{t_0}^{t_f} a_{2,i}^{(2)}(t;j)\left\{\sum_{n=1}^{TF}\frac{\partial^2 D\left[h(t);F;t\right]}{\partial F_n\,\partial h_i(t)}\,\delta F_n + \sum_{k=1}^{TH}\sum_{n=1}^{TF}\frac{\partial^2\psi_k\left[h(t);F;t\right]}{\partial F_n\,\partial h_i(t)}\,\delta F_n\int_t^{t_f} a_k^{(1)}(\tau)\,\varphi_k(F;\tau)\,d\tau + \sum_{k=1}^{TH}\frac{\partial\psi_k\left[h(t);F;t\right]}{\partial h_i(t)}\sum_{n=1}^{TF}\int_t^{t_f} a_k^{(1)}(\tau)\,\frac{\partial\varphi_k(F;\tau)}{\partial F_n}\,\delta F_n\,d\tau\right\}dt. \tag{95}$$
The second-order sensitivities $\partial^2 R[h;F(\theta)]/\partial F_j\partial F_n$ of the decoder response with respect to the components of the feature function are obtained by adding the expression of the indirect-effect term obtained in Eq. (95) to the expression for the direct-effect term obtained in Eq. (77) and subsequently identifying the expressions that multiply the variations $\delta F_n$. The expressions thus obtained for $\partial^2 R[h;F(\theta)]/\partial F_j\partial F_n$, for $j,n=1,\ldots,TF$, are as follows:
$$\frac{\partial^2 R\left[h;F(\theta)\right]}{\partial F_j\,\partial F_n} \equiv \int_{t_0}^{t_f}\frac{\partial^2 D\left[h(t);F(\theta);t\right]}{\partial F_n\,\partial F_j}\,dt + \sum_{i=1}^{TH}\int_{t_0}^{t_f} a_i^{(1)}(t)\,\frac{\partial^2 g_i(F;t)}{\partial F_n\,\partial F_j}\,dt + \sum_{i=1}^{TH}\int_{t_0}^{t_f} a_i^{(1)}(t)\,\frac{\partial^2\varphi_i(F;t)}{\partial F_n\,\partial F_j}\int_{t_0}^{t}\psi_i\left[h(\tau);F;\tau\right]d\tau\,dt$$
$$+ \sum_{i=1}^{TH}\int_{t_0}^{t_f} a_i^{(1)}(t)\,\frac{\partial\varphi_i(F;t)}{\partial F_j}\int_{t_0}^{t}\frac{\partial\psi_i\left[h(\tau);F;\tau\right]}{\partial F_n}\,d\tau\,dt + \sum_{i=1}^{TH}\int_{t_0}^{t_f} a_i^{(1)}(t)\,\frac{\partial\varphi_i(F;t)}{\partial F_n}\int_{t_0}^{t}\frac{\partial\psi_i\left[h(\tau);F;\tau\right]}{\partial F_j}\,d\tau\,dt + \sum_{i=1}^{TH}\int_{t_0}^{t_f} a_i^{(1)}(t)\,\varphi_i(F;t)\int_{t_0}^{t}\frac{\partial^2\psi_i\left[h(\tau);F;\tau\right]}{\partial F_n\,\partial F_j}\,d\tau\,dt$$
$$+ \sum_{i=1}^{TH}\int_{t_0}^{t_f} a_{1,i}^{(2)}(t;j)\left\{\frac{\partial g_i(F;t)}{\partial F_n} + \frac{\partial\varphi_i(F;t)}{\partial F_n}\int_{t_0}^{t}\psi_i\left[h(\tau);F;\tau\right]d\tau + \varphi_i(F;t)\int_{t_0}^{t}\frac{\partial\psi_i\left[h(\tau);F;\tau\right]}{\partial F_n}\,d\tau\right\}dt$$
$$+ \sum_{i=1}^{TH}\int_{t_0}^{t_f} a_{2,i}^{(2)}(t;j)\left\{\frac{\partial^2 D\left[h(t);F;t\right]}{\partial F_n\,\partial h_i(t)} + \sum_{k=1}^{TH}\frac{\partial^2\psi_k\left[h(t);F;t\right]}{\partial F_n\,\partial h_i(t)}\int_t^{t_f} a_k^{(1)}(\tau)\,\varphi_k(F;\tau)\,d\tau + \sum_{k=1}^{TH}\frac{\partial\psi_k\left[h(t);F;t\right]}{\partial h_i(t)}\int_t^{t_f} a_k^{(1)}(\tau)\,\frac{\partial\varphi_k(F;\tau)}{\partial F_n}\,d\tau\right\}dt\,;\qquad j,n=1,\ldots,TF. \tag{96}$$
The NIE-Volterra system comprising Eqs. (93) and (94) is called the “2nd-Level Adjoint Sensitivity System (2nd-LASS)” and its solution, $a^{(2)}(t;j) = \left[a_1^{(2)}(t;j), a_2^{(2)}(t;j)\right]$, $j=1,\ldots,TF$, is called the “2nd-level adjoint sensitivity function.” Since the sources on the right-sides of Eqs. (93) and (94) stem from the first-order sensitivities $\partial R[h;F(\theta)]/\partial F_j$, $j=1,\ldots,TF$, they depend on the index $j$; this implies that to each first-order sensitivity $\partial R[h;F(\theta)]/\partial F_j$ there corresponds a distinct 2nd-LASS, having a distinct solution $a^{(2)}(t;j)$, a fact that has been emphasized by including the index $j$ in the list of arguments of this 2nd-level adjoint sensitivity function. Therefore, there will be as many 2nd-level adjoint functions as there are distinct first-order sensitivities $\partial R[h;F(\theta)]/\partial F_j$, which equals the number of components $F_j$ of the feature function $F(\theta)$. Notably, the integral operators on the left-sides of Eqs. (93) and (94) do not depend on the index $j$, which means that the same left-side needs to be inverted for computing every 2nd-level adjoint function, regardless of the source term on the right-sides of Eqs. (93) and (94) (which corresponds to the particular component of the feature function). Therefore, if the inverses (or factorizations) of the operators appearing on the left-sides of Eqs. (93) and (94) can be stored, these operators need not be inverted repeatedly, so the various 2nd-level adjoint functions can be computed most efficiently.
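The operator-reuse argument can be sketched numerically. Below, a final-value Volterra operator of the same type as in the illustrative model of Section 5 (cf. Eq. (105)) is discretized once into a matrix, which is then factorized a single time to solve two hypothetical right-hand sides ($j=1,2$) simultaneously; the first column is checked against the closed form of Eq. (106). The uniform-grid trapezoid discretization and all parameter values are assumptions for illustration:

```python
import numpy as np

# Assumed illustrative values and discretization (not from the paper).
F, S, El, Es, n = 0.9, 1.0, 1.0, 100.0, 2000
E = np.linspace(El, Es, n + 1)
h = E[1] - E[0]

# Discretize the final-value Volterra operator (cf. Eq. (105)):
#   (L a)(E_i) = a(E_i) - F * int_{E_i}^{Es} a(e)/e de   (trapezoid weights).
L = np.eye(n + 1)
for i in range(n + 1):
    wts = np.zeros(n + 1)
    wts[i:] = h / E[i:]
    wts[i] *= 0.5
    wts[-1] *= 0.5
    if i == n:
        wts[:] = 0.0          # empty integration range at E = Es
    L[i, :] -= F * wts

# Two hypothetical source terms ("j = 1, 2"), solved with ONE factorization of L:
rhs = np.stack([(S / Es) * (Es / E) ** F, np.ones_like(E)], axis=1)
sols = np.linalg.solve(L, rhs)   # columns: the two 2nd-level adjoint functions

# Verify column j = 1 against the closed form, Eq. (106).
exact = (S / Es) * (Es / E) ** F * (1.0 + F * np.log(Es / E))
assert np.max(np.abs(sols[:, 0] - exact)) < 0.02
```

The point of the sketch is that adding further right-hand sides (further components $F_j$) only adds cheap triangular solves, not new factorizations of the discretized operator.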
The second-order sensitivities of the decoder-response with respect to the optimal weights/parameters θ k , k = 1 , ... , T W , are obtained analytically by using the chain rule in conjunction with the expressions obtained in Eqs. (96) and (18), as follows:
$$\frac{\partial^2 R\left[F(\theta)\right]}{\partial\theta_k\,\partial\theta_j} = \frac{\partial}{\partial\theta_k}\left\{\sum_{i=1}^{TF}\frac{\partial R\left[F(\theta)\right]}{\partial F_i(\theta)}\,\frac{\partial F_i(\theta)}{\partial\theta_j}\right\};\qquad j,k=1,\ldots,TW. \tag{97}$$
When there are no feature functions but only individual model parameters, i.e., when $F_i(\theta) \equiv \theta_i$ for all $i=1,\ldots,TF \equiv TW$, the expression obtained in Eq. (96) yields directly the second-order sensitivities $\partial^2 R/\partial\theta_i\partial\theta_j$, for all $i,j=1,\ldots,TW$. In this case, the 2nd-LASS would need to be solved $TW$ times, rather than just $TF$ times ($TF < TW$) when $TF$ feature functions can be constructed.
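Equation (97) can be written as the matrix identity $\mathrm{Hess}_\theta R = J^\top(\mathrm{Hess}_F R)\,J + \sum_i(\partial R/\partial F_i)\,\mathrm{Hess}_\theta F_i$, where $J_{ij} = \partial F_i/\partial\theta_j$. The sketch below illustrates this identity on a deliberately simple, hypothetical pair of feature functions (not the model of this work) and verifies one mixed second derivative by finite differences:

```python
import numpy as np

# Hypothetical example: response depends on theta only through two feature functions.
def features(t):                      # F1 = t0*t1, F2 = t1 + t2**2  (made up)
    return np.array([t[0] * t[1], t[1] + t[2] ** 2])

def Rf(F):                            # made-up decoder response in feature space
    return F[0] ** 2 * F[1] + np.exp(F[1])

def grad_Rf(F):
    return np.array([2 * F[0] * F[1], F[0] ** 2 + np.exp(F[1])])

def hess_Rf(F):
    return np.array([[2 * F[1], 2 * F[0]],
                     [2 * F[0], np.exp(F[1])]])

t = np.array([0.3, 0.7, 0.5])
F = features(t)
J = np.array([[t[1], t[0], 0.0],      # J_{ij} = dF_i/dtheta_j
              [0.0,  1.0, 2 * t[2]]])
H1 = np.zeros((3, 3)); H1[0, 1] = H1[1, 0] = 1.0   # Hess_theta F1 (d2F1/dt0 dt1 = 1)
H2 = np.zeros((3, 3)); H2[2, 2] = 2.0              # Hess_theta F2 (d2F2/dt2^2 = 2)

# Eq. (97) as a matrix identity:
H_theta = J.T @ hess_Rf(F) @ J + grad_Rf(F)[0] * H1 + grad_Rf(F)[1] * H2

# Finite-difference check of the mixed entry d2R/dt0 dt1:
def R(t): return Rf(features(t))
e = 1e-5
fd = (R(t + [e, e, 0]) - R(t + [e, -e, 0])
      - R(t + [-e, e, 0]) + R(t + [-e, -e, 0])) / (4 * e * e)
assert abs(fd - H_theta[0, 1]) < 1e-4
```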

5. Illustrative Application of the 2nd-FASAM-NIE-V to Neutron Slowing Down in an Infinite Homogeneous Hydrogenous Medium

The 2nd-FASAM-NIE-V methodology developed in Section 4 will be applied in this Section to the illustrative model (considered in Section 3) that describes the energy distribution of neutrons in a homogeneous hydrogenous medium. As discussed in Section 4, the second-order sensitivities of the decoder response with respect to the feature functions $F(\theta)$, $\Sigma_d(\theta)$, and $S$ will be determined by considering the second-order sensitivities to be “the first-order sensitivities of the first-order sensitivities.” Thus, the first-order sensitivities obtained in Eqs. (67)‒(69) will play the role of “responses” in the application of the 2nd-FASAM-NIE-V methodology.

5.1. Computation of Second-Order Sensitivities Stemming from R / F θ

The second-order sensitivities stemming from the first-order sensitivity $\partial R/\partial F(\theta)$ will be determined from the expression of the first-order G-variation $\delta[\partial R/\partial F(\theta)]$, which is obtained, by definition, from Eq. (67) as shown below:
$$\delta\left[\frac{\partial R}{\partial F(\theta)}\right] = \frac{d}{d\varepsilon}\left\{\frac{S^0+\varepsilon\,\delta S}{E_s}\int_{E_l}^{E_s}\left[a^{(1),0}(E)+\varepsilon\,\delta a^{(1)}(E)\right]\left(\frac{E_s}{E}\right)^{F(\theta^0)+\varepsilon\,\delta F(\theta)}dE\right\}_{\varepsilon=0} = \left\{\delta\left[\frac{\partial R}{\partial F(\theta)}\right]\right\}_{dir} + \left\{\delta\left[\frac{\partial R}{\partial F(\theta)}\right]\right\}_{ind}, \tag{98}$$
where the direct-effect term $\left\{\delta\left[\partial R/\partial F(\theta)\right]\right\}_{dir}$ is defined as follows:
$$\left\{\delta\left[\frac{\partial R}{\partial F(\theta)}\right]\right\}_{dir} \equiv \frac{\delta S}{E_s}\left[\int_{E_l}^{E_s} a^{(1)}(E)\left(\frac{E_s}{E}\right)^{F(\theta)}dE\right]_{\theta=\theta^0} + \delta F(\theta)\,\frac{S}{E_s}\left[\int_{E_l}^{E_s} a^{(1)}(E)\left(\frac{E_s}{E}\right)^{F(\theta)}\ln\frac{E_s}{E}\,dE\right]_{\theta=\theta^0}, \tag{99}$$
while the indirect-effect term $\left\{\delta\left[\partial R/\partial F(\theta)\right]\right\}_{ind}$ is defined as follows:
$$\left\{\delta\left[\frac{\partial R}{\partial F(\theta)}\right]\right\}_{ind} \equiv \frac{S}{E_s}\left[\int_{E_l}^{E_s}\delta a^{(1)}(E)\left(\frac{E_s}{E}\right)^{F(\theta)}dE\right]_{\theta=\theta^0}. \tag{100}$$
The direct-effect term can be computed at this time. On the other hand, the indirect-effect term can be computed after determining the variational function δ a 1 E , which is the solution of the G-differentiated 1st-LASS defined in Eq. (50). By definition, the G-differential of Eq. (50) is provided by the following equation:
$$\frac{d}{d\varepsilon}\left[a^{(1),0}(E)+\varepsilon\,\delta a^{(1)}(E)\right]_{\varepsilon=0} - \frac{d}{d\varepsilon}\left\{\frac{F(\theta^0)+\varepsilon\,\delta F(\theta)}{E}\int_{E_l}^{E}\left[a^{(1),0}(e)+\varepsilon\,\delta a^{(1)}(e)\right]de\right\}_{\varepsilon=0} = \frac{d}{d\varepsilon}\left[\Sigma_d(\theta^0)+\varepsilon\,\delta\Sigma_d(\theta)\right]_{\varepsilon=0}. \tag{101}$$
Performing the operations indicated in Eq. (101) yields the following equation to be satisfied by the function δ a 1 E at the nominal parameter values (the superscript “zero” will be omitted for notational simplicity):
$$\delta a^{(1)}(E) - \frac{F(\theta)}{E}\int_{E_l}^{E}\delta a^{(1)}(e)\,de = \frac{\delta F(\theta)}{E}\int_{E_l}^{E} a^{(1)}(e)\,de + \delta\Sigma_d(\theta). \tag{102}$$
Evidently, Eq. (102) would need to be solved repeatedly, for every parameter variation, in order to compute the function $\delta a^{(1)}(E)$ that would correspond to the respective parameter variation. As was shown in Section 4, the need for repeatedly solving Eq. (102) can be avoided by deriving an alternative expression for the indirect-effect term which does not involve $\delta a^{(1)}(E)$. This alternative expression is derived by introducing the 2nd-LASS, the solution of which replaces the function $\delta a^{(1)}(E)$ in the alternative expression for the indirect-effect term. Notably, Eq. (102) is independent of variations in the forward function, $\delta C(E)$; therefore, the 2nd-level adjoint sensitivity function will comprise a single component, denoted as $a^{(2)}(E;1)$, and the 2nd-LASS will comprise a single equation for this component.
Following the principles of the 2nd-FASAM-NIE-V, we use Eq. (45) to construct the inner product of Eq. (102) with a function $a^{(2)}(E;1)$, where the argument “1” indicates that this 2nd-level adjoint sensitivity function corresponds to the first-order sensitivity $\partial R[C(E)]/\partial F(\theta)$, to obtain the following relation:
$$\int_{E_l}^{E_s} a^{(2)}(E;1)\,\delta a^{(1)}(E)\,dE - F(\theta)\int_{E_l}^{E_s}\frac{a^{(2)}(E;1)}{E}\,dE\int_{E_l}^{E}\delta a^{(1)}(e)\,de = \delta F(\theta)\int_{E_l}^{E_s}\frac{a^{(2)}(E;1)}{E}\,dE\int_{E_l}^{E} a^{(1)}(e)\,de + \delta\Sigma_d(\theta)\int_{E_l}^{E_s} a^{(2)}(E;1)\,dE. \tag{103}$$
The function a 2 E ; 1 will be determined by requiring the left-side of Eq. (103) to represent the indirect-effect term defined in Eq. (100). For this purpose, the left-side will be recast using “integration by parts” into the following form:
$$\int_{E_l}^{E_s} a^{(2)}(E;1)\,\delta a^{(1)}(E)\,dE - F(\theta)\int_{E_l}^{E_s}\frac{a^{(2)}(E;1)}{E}\,dE\int_{E_l}^{E}\delta a^{(1)}(e)\,de = \int_{E_l}^{E_s}\delta a^{(1)}(E)\left[a^{(2)}(E;1) - F(\theta)\int_E^{E_s} a^{(2)}(e;1)\,\frac{de}{e}\right]dE. \tag{104}$$
Requiring the right-side of Eq. (104) to represent the indirect-effect term defined in Eq. (100) yields the following Volterra-type 2nd-LASS to be satisfied by the 2nd-level adjoint sensitivity function a 2 E ; 1 :
$$a^{(2)}(E;1)-F(\theta)\int_{E}^{E_s}\frac{a^{(2)}(e;1)}{e}\,de=\frac{S}{E_s}\left(\frac{E_s}{E}\right)^{F(\theta)}.\tag{105}$$
The above 2nd-LASS is to be satisfied at the nominal parameter values $\theta^0$. Notably, Eq. (105) is a “final-value problem,” since the computation of $a^{(2)}(E;1)$ commences at the highest energy, where $a^{(2)}(E_s;1)=S/E_s$, and proceeds towards the lowest energy value, $E_l$. For subsequent verification purposes, the closed-form explicit expression of $a^{(2)}(E;1)$ obtained by solving Eq. (105) is as follows:
$$a^{(2)}(E;1)=E^{-F(\theta)}\left[-S\,F(\theta)\,E_s^{F(\theta)-1}\ln E+S\,F(\theta)\,E_s^{F(\theta)-1}\ln E_s+S\,E_s^{F(\theta)-1}\right]=\frac{S}{E_s}\left(\frac{E_s}{E}\right)^{F(\theta)}\left[1+F(\theta)\ln\frac{E_s}{E}\right].\tag{106}$$
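Because the 2nd-level adjoint function is available in closed form, it can be cross-checked numerically against the Volterra-type 2nd-LASS of Eq. (105) by simple quadrature. The following Python sketch performs this check; the parameter values are purely illustrative (hypothetical), and any $F(\theta)\neq 1$ would serve equally well:

```python
import numpy as np
from scipy.integrate import quad

# Illustrative (hypothetical) values of the feature functions F(theta), S
# and of the energy interval [E_l, E_s]; any F != 1 works.
F, S, El, Es = 0.5, 1.0, 1.0, 100.0

def a2_1(E):
    """Closed-form 2nd-level adjoint function a^(2)(E;1), Eq. (106)."""
    return (S / Es) * (Es / E) ** F * (1.0 + F * np.log(Es / E))

def residual(E):
    """Residual of the Volterra-type 2nd-LASS, Eq. (105)."""
    integral, _ = quad(lambda e: a2_1(e) / e, E, Es)
    return a2_1(E) - F * integral - (S / Es) * (Es / E) ** F

# Final-value condition a^(2)(E_s;1) = S/E_s, cf. the discussion of Eq. (105)
assert abs(a2_1(Es) - S / Es) < 1e-14
# The residual of Eq. (105) vanishes (to quadrature accuracy) on the interval
for E in (El, 5.0, 25.0, Es):
    assert abs(residual(E)) < 1e-8
```

The residual check confirms, independently of the analytical derivation, that Eq. (106) solves Eq. (105) with the stated final-value behavior.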
It follows from Eqs. (103)‒(105) that the indirect-effect term $\left\{\delta\left[\partial R/\partial F(\theta)\right]\right\}_{ind}$ is given by the following expression involving the 2nd-level adjoint sensitivity function $a^{(2)}(E;1)$:
$$\left\{\delta\left[\frac{\partial R}{\partial F(\theta)}\right]\right\}_{ind}=\delta F(\theta)\int_{E_l}^{E_s}\frac{a^{(2)}(E;1)}{E}\left[\int_{E_l}^{E}a^{(1)}(e)\,de\right]dE+\delta\Sigma_d(\theta)\int_{E_l}^{E_s}a^{(2)}(E;1)\,dE.\tag{107}$$
Adding the expression for the indirect-effect term $\left\{\delta\left[\partial R/\partial F(\theta)\right]\right\}_{ind}$ obtained in Eq. (107) to the expression for the direct-effect term $\left\{\delta\left[\partial R/\partial F(\theta)\right]\right\}_{dir}$ obtained in Eq. (99) yields the following expression for the G-differential $\delta\left[\partial R/\partial F(\theta)\right]$:
$$\delta\left[\frac{\partial R}{\partial F(\theta)}\right]=\delta F(\theta)\int_{E_l}^{E_s}\frac{a^{(2)}(E;1)}{E}\left[\int_{E_l}^{E}a^{(1)}(e)\,de\right]dE+\delta\Sigma_d(\theta)\int_{E_l}^{E_s}a^{(2)}(E;1)\,dE+\frac{\delta S}{E_s}\int_{E_l}^{E_s}a^{(1)}(E)\left(\frac{E_s}{E}\right)^{F(\theta)}dE+\delta F(\theta)\,\frac{S}{E_s}\int_{E_l}^{E_s}a^{(1)}(E)\left(\frac{E_s}{E}\right)^{F(\theta)}\ln\frac{E_s}{E}\,dE.\tag{108}$$
The above expression is to be evaluated at the nominal parameter values $\theta^0$.
It follows from the expression obtained in Eq. (108) that the second-order sensitivities (of the decoder response with respect to the feature functions) which stem from the first-order sensitivity $\partial R/\partial F(\theta)$ have the following expressions, to be evaluated at the nominal parameter values $\theta^0$:
$$\frac{\partial^2 R}{\partial F(\theta)\,\partial F(\theta)}=\int_{E_l}^{E_s}\frac{a^{(2)}(E;1)}{E}\left[\int_{E_l}^{E}a^{(1)}(e)\,de\right]dE+\frac{S}{E_s}\int_{E_l}^{E_s}a^{(1)}(E)\left(\frac{E_s}{E}\right)^{F(\theta)}\ln\frac{E_s}{E}\,dE;\tag{109}$$
$$\frac{\partial^2 R}{\partial \Sigma_d(\theta)\,\partial F(\theta)}=\int_{E_l}^{E_s}a^{(2)}(E;1)\,dE;\tag{110}$$
$$\frac{\partial^2 R}{\partial S\,\partial F(\theta)}=\frac{1}{E_s}\int_{E_l}^{E_s}a^{(1)}(E)\left(\frac{E_s}{E}\right)^{F(\theta)}dE.\tag{111}$$
Since the 1st-level adjoint sensitivity function $a^{(1)}(E)$ is already available, the sensitivity $\partial^2 R/\partial S\,\partial F(\theta)$ can be computed immediately. The closed-form explicit expressions for the above second-order sensitivities are obtained by inserting the expression obtained in Eq. (106) for $a^{(2)}(E;1)$ and the expression obtained in Eq. (51) for $a^{(1)}(E)$, and performing the respective integrations. Carrying out these operations yields the following expressions:
$$\frac{\partial^2 R}{\partial F(\theta)\,\partial F(\theta)}=S\,\Sigma_d(\theta)\left\{\frac{2}{\left[1-F(\theta)\right]^3}\left[1-\left(\frac{E_s}{E_l}\right)^{F(\theta)-1}\right]-\frac{2}{\left[1-F(\theta)\right]^2}\left(\frac{E_s}{E_l}\right)^{F(\theta)-1}\ln\frac{E_s}{E_l}-\frac{F(\theta)}{1-F(\theta)}\left(\frac{E_s}{E_l}\right)^{F(\theta)-1}\ln^2\frac{E_s}{E_l}\right\};\tag{112}$$
$$\frac{\partial^2 R}{\partial \Sigma_d(\theta)\,\partial F(\theta)}=\frac{S}{1-F(\theta)}\left\{\frac{1}{1-F(\theta)}\left[1-\left(\frac{E_s}{E_l}\right)^{F(\theta)-1}\right]-F(\theta)\left(\frac{E_s}{E_l}\right)^{F(\theta)-1}\ln\frac{E_s}{E_l}\right\};\tag{113}$$
$$\frac{\partial^2 R}{\partial S\,\partial F(\theta)}=\frac{\Sigma_d(\theta)}{1-F(\theta)}\left\{\frac{1}{1-F(\theta)}\left[1-\left(\frac{E_s}{E_l}\right)^{F(\theta)-1}\right]-F(\theta)\left(\frac{E_s}{E_l}\right)^{F(\theta)-1}\ln\frac{E_s}{E_l}\right\}.\tag{114}$$
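The adjoint-based integral expressions of Eqs. (110) and (111) can be checked against the closed forms of Eqs. (113) and (114) by numerical quadrature. In the Python sketch below, the parameter values are hypothetical, and $a^{(1)}(E)$ is taken in the closed form $\Sigma_d(\theta)\left[1-F(\theta)(E/E_l)^{F(\theta)-1}\right]/\left[1-F(\theta)\right]$, which is consistent with the closed-form sensitivities quoted above; Eq. (51) itself is not reproduced in this section, so treat that expression as an assumption of the sketch:

```python
import numpy as np
from scipy.integrate import quad

# Hypothetical illustrative values for F(theta), S, Sigma_d(theta), E_l, E_s
F, S, Sd, El, Es = 0.5, 1.0, 2.0, 1.0, 100.0
rho, L = (Es / El) ** (F - 1.0), np.log(Es / El)

# Assumed closed form of a^(1)(E) (cf. Eq. (51)); a^(2)(E;1) from Eq. (106)
a1 = lambda E: Sd / (1.0 - F) * (1.0 - F * (E / El) ** (F - 1.0))
a2_1 = lambda E: (S / Es) * (Es / E) ** F * (1.0 + F * np.log(Es / E))

# Eq. (110) by quadrature vs. its closed form, Eq. (113)
lhs_110 = quad(a2_1, El, Es)[0]
rhs_113 = S / (1.0 - F) * ((1.0 - rho) / (1.0 - F) - F * rho * L)
assert abs(lhs_110 - rhs_113) < 1e-8 * abs(rhs_113)

# Eq. (111) by quadrature vs. its closed form, Eq. (114)
lhs_111 = quad(lambda E: a1(E) * (Es / E) ** F, El, Es)[0] / Es
rhs_114 = Sd / (1.0 - F) * ((1.0 - rho) / (1.0 - F) - F * rho * L)
assert abs(lhs_111 - rhs_114) < 1e-8 * abs(rhs_114)
```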

5.2. Computation of Second-Order Sensitivities Stemming from $\partial R/\partial \Sigma_d(\theta)$

The second-order sensitivities stemming from the first-order sensitivity $\partial R/\partial \Sigma_d(\theta)$ will be determined from the expression of the first-order G-variation $\delta\left[\partial R/\partial \Sigma_d(\theta)\right]$, which is by definition obtained from Eq. (68) as shown below:
$$\delta\left[\frac{\partial R}{\partial \Sigma_d(\theta)}\right]=\left\{\frac{d}{d\varepsilon}\int_{E_l}^{E_s}\left[C^0(E)+\varepsilon\,\delta C(E)\right]dE\right\}_{\varepsilon=0}=\int_{E_l}^{E_s}\delta C(E)\,dE.\tag{115}$$
Comparing Eq. (115) with Eq. (35), it becomes apparent that the following relation holds:
$$\delta\left[\frac{\partial R}{\partial \Sigma_d(\theta)}\right]=\frac{1}{\Sigma_d(\theta)}\left\{\delta R\left[C;\theta;\delta C\right]\right\}_{ind}.\tag{116}$$
Replacing the expression obtained for $\left\{\delta R\left[C;\theta;\delta C\right]\right\}_{ind}$ in Eq. (65) into Eq. (116) yields the following expression:
$$\delta\left[\frac{\partial R}{\partial \Sigma_d(\theta)}\right]=\frac{F(\theta)\,\delta S}{E_s\,\Sigma_d(\theta)}\int_{E_l}^{E_s}a^{(1)}(E)\,dE+\frac{S\,\delta F(\theta)}{E_s\,\Sigma_d(\theta)}\int_{E_l}^{E_s}a^{(1)}(E)\left(\frac{E_s}{E}\right)^{F(\theta)}dE.\tag{117}$$
It follows from Eq. (117) that the second-order sensitivities that stem from $\partial R/\partial \Sigma_d(\theta)$ have the following expressions:
$$\frac{\partial^2 R}{\partial F(\theta)\,\partial \Sigma_d(\theta)}=\frac{S}{E_s\,\Sigma_d(\theta)}\int_{E_l}^{E_s}a^{(1)}(E)\left(\frac{E_s}{E}\right)^{F(\theta)}dE;\tag{118}$$
$$\frac{\partial^2 R}{\partial \Sigma_d(\theta)\,\partial \Sigma_d(\theta)}=0;\tag{119}$$
$$\frac{\partial^2 R}{\partial S\,\partial \Sigma_d(\theta)}=\frac{F(\theta)}{E_s\,\Sigma_d(\theta)}\int_{E_l}^{E_s}a^{(1)}(E)\,dE.\tag{120}$$
The explicit closed-form expressions of the above second-order sensitivities are obtained by replacing the expression of $a^{(1)}(E)$ in Eqs. (118) and (120), respectively, and performing the respective integrations, to obtain:
$$\frac{\partial^2 R}{\partial F(\theta)\,\partial \Sigma_d(\theta)}=\frac{S}{1-F(\theta)}\left\{\frac{1}{1-F(\theta)}\left[1-\left(\frac{E_s}{E_l}\right)^{F(\theta)-1}\right]-F(\theta)\left(\frac{E_s}{E_l}\right)^{F(\theta)-1}\ln\frac{E_s}{E_l}\right\};\tag{121}$$
$$\frac{\partial^2 R}{\partial S\,\partial \Sigma_d(\theta)}=\frac{F(\theta)}{1-F(\theta)}\left[1-\left(\frac{E_s}{E_l}\right)^{F(\theta)-1}\right].\tag{122}$$
The expression for $\partial^2 R/\partial F(\theta)\,\partial \Sigma_d(\theta)$ in Eq. (118) must be equivalent to the expression for $\partial^2 R/\partial \Sigma_d(\theta)\,\partial F(\theta)$ in Eq. (110). Notably, therefore, these mixed second-order sensitivities are computed twice, using distinct expressions involving distinct adjoint functions, which provides an intrinsic mechanism for the stringent verification of the accuracy of the computations used for obtaining the numerical values of the respective adjoint functions.
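This built-in verification mechanism can be exercised numerically: the same mixed sensitivity is evaluated once via the 2nd-level adjoint $a^{(2)}(E;1)$, Eq. (110), and once via the 1st-level adjoint $a^{(1)}(E)$, Eq. (118), and the two results are compared. The sketch below uses hypothetical parameter values, and assumes for $a^{(1)}(E)$ the closed form consistent with the results quoted above (Eq. (51) is not reproduced in this section):

```python
import numpy as np
from scipy.integrate import quad

# Hypothetical values; the symmetry check needs only F(theta) != 1.
F, S, Sd, El, Es = 0.7, 1.5, 2.0, 1.0, 50.0

# Assumed closed form of a^(1)(E) (cf. Eq. (51)); a^(2)(E;1) from Eq. (106)
a1 = lambda E: Sd / (1.0 - F) * (1.0 - F * (E / El) ** (F - 1.0))
a2_1 = lambda E: (S / Es) * (Es / E) ** F * (1.0 + F * np.log(Es / E))

# The same mixed second-order sensitivity, via two distinct adjoint functions:
d2R_SdF = quad(a2_1, El, Es)[0]                                   # Eq. (110)
d2R_FSd = (S / (Es * Sd)) * quad(lambda E: a1(E) * (Es / E) ** F,
                                 El, Es)[0]                       # Eq. (118)

assert abs(d2R_SdF - d2R_FSd) < 1e-8 * abs(d2R_SdF)
```

Agreement between the two quadratures is exactly the “intrinsic verification” criterion discussed in the text.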

5.3. Computation of Second-Order Sensitivities Stemming from $\partial R/\partial S$

The second-order sensitivities stemming from the first-order sensitivity $\partial R/\partial S$ will be determined from the expression of the first-order G-variation $\delta\left[\partial R/\partial S\right]$, which is by definition obtained from Eq. (69) as shown below:
$$\delta\left[\frac{\partial R}{\partial S}\right]=\left\{\frac{d}{d\varepsilon}\,\frac{F(\theta^0)+\varepsilon\,\delta F(\theta)}{E_s}\int_{E_l}^{E_s}\left[a^{(1),0}(E)+\varepsilon\,\delta a^{(1)}(E)\right]dE\right\}_{\varepsilon=0}=\left\{\delta\left[\frac{\partial R}{\partial S}\right]\right\}_{dir}+\left\{\delta\left[\frac{\partial R}{\partial S}\right]\right\}_{ind},\tag{123}$$
where the direct-effect term $\left\{\delta\left[\partial R/\partial S\right]\right\}_{dir}$ is defined as follows:
$$\left\{\delta\left[\frac{\partial R}{\partial S}\right]\right\}_{dir}\equiv\left\{\frac{\delta F(\theta)}{E_s}\int_{E_l}^{E_s}a^{(1)}(E)\,dE\right\}_{\theta=\theta^0},\tag{124}$$
while the indirect-effect term $\left\{\delta\left[\partial R/\partial S\right]\right\}_{ind}$ is defined as follows:
$$\left\{\delta\left[\frac{\partial R}{\partial S}\right]\right\}_{ind}\equiv\left\{\frac{F(\theta)}{E_s}\int_{E_l}^{E_s}\delta a^{(1)}(E)\,dE\right\}_{\theta=\theta^0}.\tag{125}$$
The direct-effect term $\left\{\delta\left[\partial R/\partial S\right]\right\}_{dir}$ can be computed at this time. To determine the indirect-effect term $\left\{\delta\left[\partial R/\partial S\right]\right\}_{ind}$ without needing to compute the variational function $\delta a^{(1)}(E)$ by repeatedly solving Eq. (102), the steps outlined in Section 5.1 are applied using a 2nd-level adjoint sensitivity function denoted as $a^{(2)}(E;3)$, where the argument “3” indicates that this function corresponds to the first-order sensitivity $\partial R/\partial S$. Thus, following the conceptual steps outlined in Eqs. (103)‒(107), but with the function $a^{(2)}(E;3)$ as the counterpart of the function $a^{(2)}(E;1)$ and the indirect-effect term $\left\{\delta\left[\partial R/\partial S\right]\right\}_{ind}$ as the counterpart of the indirect-effect term $\left\{\delta\left[\partial R/\partial F(\theta)\right]\right\}_{ind}$, leads to the following expression for the indirect-effect term:
$$\left\{\delta\left[\frac{\partial R}{\partial S}\right]\right\}_{ind}=\left\{\delta F(\theta)\int_{E_l}^{E_s}\frac{a^{(2)}(E;3)}{E}\left[\int_{E_l}^{E}a^{(1)}(e)\,de\right]dE+\delta\Sigma_d(\theta)\int_{E_l}^{E_s}a^{(2)}(E;3)\,dE\right\}_{\theta=\theta^0},\tag{126}$$
where the 2nd-level adjoint sensitivity function $a^{(2)}(E;3)$ is the solution of the following Volterra-type 2nd-LASS:
$$a^{(2)}(E;3)-F(\theta)\int_{E}^{E_s}\frac{a^{(2)}(e;3)}{e}\,de=\frac{F(\theta)}{E_s}.\tag{127}$$
For verification purposes, the explicit closed-form expression of the solution of Eq. (127) is provided below.
$$a^{(2)}(E;3)=\frac{F(\theta)}{E_s}\left(\frac{E_s}{E}\right)^{F(\theta)}.\tag{128}$$
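As with $a^{(2)}(E;1)$, the closed form of Eq. (128) can be verified numerically against the 2nd-LASS of Eq. (127) by quadrature; the parameter values in the following sketch are hypothetical:

```python
import numpy as np
from scipy.integrate import quad

# Hypothetical illustrative values of F(theta) and of the energy interval
F, El, Es = 0.5, 1.0, 100.0

a2_3 = lambda E: (F / Es) * (Es / E) ** F  # closed form, Eq. (128)

def residual(E):
    """Residual of the Volterra-type 2nd-LASS, Eq. (127)."""
    return a2_3(E) - F * quad(lambda e: a2_3(e) / e, E, Es)[0] - F / Es

# The residual vanishes (to quadrature accuracy) across the interval
for E in np.linspace(El, Es, 7):
    assert abs(residual(E)) < 1e-8
```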
Adding the expressions obtained in Eqs. (126) and (124) yields the following expression for the first G-variation $\delta\left[\partial R/\partial S\right]$:
$$\delta\left[\frac{\partial R}{\partial S}\right]=\frac{\delta F(\theta)}{E_s}\int_{E_l}^{E_s}a^{(1)}(E)\,dE+\delta F(\theta)\int_{E_l}^{E_s}\frac{a^{(2)}(E;3)}{E}\left[\int_{E_l}^{E}a^{(1)}(e)\,de\right]dE+\delta\Sigma_d(\theta)\int_{E_l}^{E_s}a^{(2)}(E;3)\,dE.\tag{129}$$
It follows from Eq. (129) that the second-order sensitivities stemming from the first-order sensitivity $\partial R/\partial S$ have the following expressions:
$$\frac{\partial^2 R}{\partial F(\theta)\,\partial S}=\frac{1}{E_s}\int_{E_l}^{E_s}a^{(1)}(E)\,dE+\int_{E_l}^{E_s}\frac{a^{(2)}(E;3)}{E}\left[\int_{E_l}^{E}a^{(1)}(e)\,de\right]dE;\tag{130}$$
$$\frac{\partial^2 R}{\partial \Sigma_d(\theta)\,\partial S}=\int_{E_l}^{E_s}a^{(2)}(E;3)\,dE;\tag{131}$$
$$\frac{\partial^2 R}{\partial S\,\partial S}=0.\tag{132}$$
Inserting the expressions of $a^{(2)}(E;3)$ and $a^{(1)}(E)$, respectively, into Eqs. (130) and (131), and performing the respective integrations, yields the following closed-form explicit expressions for the respective second-order sensitivities:
$$\frac{\partial^2 R}{\partial F(\theta)\,\partial S}=\frac{\Sigma_d(\theta)}{1-F(\theta)}\left\{\frac{1}{1-F(\theta)}\left[1-\left(\frac{E_s}{E_l}\right)^{F(\theta)-1}\right]-F(\theta)\left(\frac{E_s}{E_l}\right)^{F(\theta)-1}\ln\frac{E_s}{E_l}\right\};\tag{133}$$
$$\frac{\partial^2 R}{\partial \Sigma_d(\theta)\,\partial S}=\frac{F(\theta)}{1-F(\theta)}\left[1-\left(\frac{E_s}{E_l}\right)^{F(\theta)-1}\right].\tag{134}$$
The expression for $\partial^2 R/\partial S\,\partial F(\theta)$ in Eq. (111) must be equivalent to the expression for $\partial^2 R/\partial F(\theta)\,\partial S$ in Eq. (130). The expression for $\partial^2 R/\partial S\,\partial \Sigma_d(\theta)$ in Eq. (122) must be equivalent to the expression for $\partial^2 R/\partial \Sigma_d(\theta)\,\partial S$ in Eq. (131). The equivalences of these corresponding expressions provide stringent verification criteria for the accuracy of the computation of the respective adjoint functions.
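Both symmetry statements can be exercised numerically, including the nested quadrature required by Eq. (130). As before, the parameter values in the sketch below are hypothetical, and $a^{(1)}(E)$ is assumed in the closed form consistent with the closed-form sensitivities quoted above (Eq. (51) itself is not reproduced here):

```python
import numpy as np
from scipy.integrate import quad

# Hypothetical illustrative values
F, S, Sd, El, Es = 0.5, 1.0, 2.0, 1.0, 100.0
rho, L = (Es / El) ** (F - 1.0), np.log(Es / El)

# Assumed closed form of a^(1)(E) (cf. Eq. (51)); a^(2)(E;3) from Eq. (128)
a1 = lambda E: Sd / (1.0 - F) * (1.0 - F * (E / El) ** (F - 1.0))
a2_3 = lambda E: (F / Es) * (Es / E) ** F

# Eq. (130): d2R/dF dS via the 2nd-level adjoint a^(2)(E;3), nested quadrature
d2R_FS = quad(a1, El, Es)[0] / Es \
       + quad(lambda E: a2_3(E) / E * quad(a1, El, E)[0], El, Es)[0]
# Closed form of its symmetric counterpart d2R/dS dF, Eq. (114)
d2R_SF = Sd / (1.0 - F) * ((1.0 - rho) / (1.0 - F) - F * rho * L)
assert abs(d2R_FS - d2R_SF) < 1e-6 * abs(d2R_SF)

# Eq. (131) by quadrature vs. Eq. (122): d2R/dSigma_d dS = d2R/dS dSigma_d
d2R_SdS = quad(a2_3, El, Es)[0]
d2R_SSd = F / (1.0 - F) * (1.0 - rho)
assert abs(d2R_SdS - d2R_SSd) < 1e-8 * abs(d2R_SSd)
```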

5.4. Discussion: Direct Computation of Second-Order Sensitivities Versus Their Indirect Computation via Feature Functions

Notably, the 9 second-order sensitivities of the decoder response with respect to the 3 feature functions, $F(\theta)$, $\Sigma_d(\theta)$ and $S$, were computed using 3 adjoint computations; each of these adjoint computations corresponds to one of the 3 first-order sensitivities of the decoder response with respect to the feature functions. Only 6 of these 9 second-order sensitivities have distinct values; the mixed sensitivities were computed twice, using different adjoint functions, thus providing stringent verification criteria for the numerical computations of these functions. The second-order sensitivities of the decoder response with respect to the primary model parameters are obtained by applying the “chain-rule of differentiation” provided in Eq. (97) to the second-order sensitivities with respect to the feature functions.
On the other hand, the direct computation of the second-order sensitivities of the decoder response with respect to the primary model parameters would be performed by treating each of the first-order sensitivities defined in Eqs. (54)‒(57) as a “decoder/model response.” As shown in Section 3.1, there would be $TW+1$ first-order sensitivities in this case, which means that there would be $TW+1$ “2nd-level adjoint sensitivity systems” to be solved, each one having a source term that corresponds to one of the $TW+1$ first-order sensitivities. Evidently, it is considerably more advantageous computationally to compute the second-order sensitivities via feature functions, whenever possible, rather than directly with respect to the primary model parameters.
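The computational advantage can be made concrete with a back-of-the-envelope count of the large-scale (2nd-LASS) solves required by each route. The net sizes $T$ and $W$ below are hypothetical, chosen only to illustrate the $3$ versus $TW+1$ comparison:

```python
# Count of large-scale 2nd-LASS solves: feature-space route vs. direct
# parameter-space route. T ("time" nodes) and W (weights per node) are
# hypothetical NIE-net sizes, giving TW + 1 primary parameters (Section 3.1).
def second_level_solves(n_feature_functions, T, W):
    feature_space = n_feature_functions  # one solve per feature sensitivity
    parameter_space = T * W + 1          # one solve per parameter sensitivity
    return feature_space, parameter_space

# Slowing-down model: 3 feature functions F(theta), Sigma_d(theta), S
assert second_level_solves(3, 10, 100) == (3, 1001)
assert second_level_solves(3, 50, 1000) == (3, 50001)
```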

6. Discussion and Conclusions

This work has introduced the general mathematical framework of the 2nd-FASAM-NIE-V methodology. The acronym “2nd-FASAM-NIE-V” stands for “Second-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integral Equations of Volterra Type.” The 2nd-FASAM-NIE-V encompasses the mathematical framework of the (first-order) 1st-FASAM-NIE-V methodology, which enables the most efficient computation of the exact expressions of all first-order sensitivities with respect to the feature functions and also with respect to the optimal values of the NIE-net’s parameters/weights, after the respective NIE-Volterra-net was optimized to represent the underlying physical system. The 1st-FASAM-NIE-V methodology requires a single large-scale computation for determining the first-level adjoint sensitivity function that is subsequently used for computing the sensitivities using conventional numerical quadrature formulas. The 2nd-FASAM-NIE-V requires as many large-scale computations (to solve the 2nd-Level Adjoint Sensitivity System) as there are first-order sensitivities of the decoder response with respect to the feature functions. Subsequently, the second-order sensitivities of the decoder response with respect to the primary model parameters are obtained by applying the “chain-rule of differentiation” to the second-order sensitivities with respect to the feature functions.
The application of the 1st-FASAM-NIE-V and the 2nd-FASAM-NIE-V methodologies has been illustrated by using a well-known model for neutron slowing down in a homogeneous hydrogenous medium. This model has been chosen because the application of the 1st-FASAM-NIE-V and the 2nd-FASAM-NIE-V yields tractable explicit exact expressions for all quantities of interest, including the various adjoint sensitivity functions and first- and second-order sensitivities of the decoder response with respect to all feature functions and also with respect to the primary model parameters. This illustrative application highlights the unsurpassed efficiency of the 1st-FASAM-NIE-V and the 2nd-FASAM-NIE-V for second-order sensitivity analysis of NIE-Volterra nets. Ongoing research aims at developing the “Second-Order Features Adjoint Sensitivity Analysis Methodology for Neural Integro-Differential Equations,” which will enable, for the first time, the exact computation of second-order sensitivities of decoder responses with respect to optimized weights/parameters, based on the NIDE-models introduced in [22].

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed at the corresponding author.

Conflicts of Interest

The author declares no conflicts of interest.

References

  1. Chen, R.T.Q.; Rubanova, Y.; Bettencourt, J.; Duvenaud, D.K. Neural ordinary differential equations. In Advances in Neural Information Processing Systems; Curran Associates, Inc.: New York, NY, USA, 2018; Volume 31, pp. 6571–6583. [Google Scholar] [CrossRef]
  2. Ruthotto, L.; Haber, E. Deep neural networks motivated by partial differential equations. J. Math. Imaging Vis. 2018, 62, 352–364. [Google Scholar] [CrossRef]
  3. Lu, Y.; Zhong, A.; Li, Q.; Dong, B. Beyond finite layer neural networks: Bridging deep architectures and numerical differential equations. In Proceedings of the International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; pp. 3276–3285. [Google Scholar]
  4. Dupont, E.; Doucet, A.; Teh, Y.W. Augmented Neural ODEs. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; Volume 32, pp. 14–15. [Google Scholar]
  5. Grathwohl, W.; Chen, R.T.Q.; Bettencourt, J.; Sutskever, I.; Duvenaud, D. Ffjord: Free-form continuous dynamics for scalable reversible generative models. In Proceedings of the International Conference on Learning Representations, New Orleans, LA, USA, 6–9 May 2019. [Google Scholar]
  6. Zhong, Y.D.; Dey, B.; Chakraborty, A. Symplectic ode-net: Learning Hamiltonian dynamics with control. In Proceedings of the International Conference on Learning Representations, Addis Ababa, Ethiopia, 30 April 2020. [Google Scholar]
  7. Kidger, P.; Morrill, J.; Foster, J.; Lyons, T. Neural controlled differential equations for irregular time series. In Proceedings of the Advances in Neural Information Processing Systems, Virtual, 6–12 December 2020; Volume 33, pp. 6696–6707. [Google Scholar]
  8. Morrill, J.; Salvi, C.; Kidger, P.; Foster, J. Neural rough differential equations for long time series. In Proceedings of the International Conference on Machine Learning, Virtual, 18–24 July 2021; pp. 7829–7838. [Google Scholar]
  9. Kidger, P. On Neural Differential Equations. arXiv 2022, arXiv:2202.02435. [Google Scholar]
  10. Rokhlin, V. Rapid solution of integral equations of classical potential theory. J. Comput. Phys. 1985, 60, 187–207. [Google Scholar] [CrossRef]
  11. Rokhlin, V. Rapid solution of integral equations of scattering theory in two dimensions. J. Comput. Phys. 1990, 86, 414–439. [Google Scholar] [CrossRef]
  12. Greengard, L.; Kropinski, M.C. An integral equation approach to the incompressible Navier-Stokes equations in two dimensions. SIAM J. Sci. Comput. 1998, 20, 318–336. [Google Scholar] [CrossRef]
  13. Effati, S.; Buzhabadi, R. A neural network approach for solving Fredholm integral equations of the second kind. Neural Comput. Appl. 2012, 21, 843–852. [Google Scholar] [CrossRef]
  14. Zappala, E.; de Oliveira Fonseca, A.H.; Caro, J.O.; van Dijk, D. Neural Integral Equations. arXiv 2023, arXiv:2209.15190v4. [Google Scholar]
  15. Xiong, Y.; Zeng, Z.; Chakraborty, R.; Tan, M.; Fung, G.; Li, Y.; Singh, V. Nyströmformer: A Nyström-based algorithm for approximating self-attention. Proc. AAAI Conf. Artif. Intell. 2021, 35, 14138. [Google Scholar] [PubMed]
  16. Cacuci, D.G. Introducing the nth-Order Features Adjoint Sensitivity Analysis Methodology for Nonlinear Systems (nth-FASAM-N): I. Mathematical Framework. Am. J. Comput. Math. 2024, 14, 11–42. [CrossRef] See also: Cacuci, D.G. The nth-Order Comprehensive Adjoint Sensitivity Analysis Methodology (nth-CASAM): Overcoming the Curse of Dimensionality in Sensitivity and Uncertainty Analysis, Volume III: Nonlinear Systems; Springer Nature: Cham, Switzerland, 2023. [CrossRef]
  17. Cacuci, D.G. Introducing the Second-Order Features Adjoint Sensitivity Analysis Methodology for Neural Ordinary Differential Equations. I: Mathematical Framework. Processes 2024, 12, 2660. [Google Scholar] [CrossRef]
  18. Cacuci, D.G. Introducing the Second-Order Features Adjoint Sensitivity Analysis Methodology for Fredholm-Type Neural Integral Equations. Mathematics 2025, 13, 14. [Google Scholar] [CrossRef]
  19. Weinberg, A.M.; Wigner, E.P. The Physical Theory of Neutron Chain Reactors; The University of Chicago Press: Chicago, IL, USA, 1958. [Google Scholar]
  20. Lamarsh, J.R. Introduction to Nuclear Reactor Theory; Addison-Wesley Publishing Company: Reading, Massachusetts, USA, 1966. [Google Scholar]
  21. Duderstadt, J.J.; Hamilton, L.J. Nuclear Reactor Analysis; John Wiley & Sons: New York, USA, 1976. [Google Scholar]
  22. Zappala, E.; de Oliveira Fonseca, A.H.; Moberly, A.H.; Higley, J.M.; Abdallah, C.; Cardin, J.; Van Dijk, D. Neural Integro-Differential Equations. arXiv 2022, arXiv:2206.14282v1. [Google Scholar]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.