5.1. First-Order Sensitivities of the Flux Response
The first-order sensitivity of the response
is provided by the first-order G-differential of the expression in Eq. (31), which is, by definition, obtained as follows:
The variation
is the solution of the “First-Level Variational Sensitivity System” (1st-LVSS), which is obtained by G-differentiating Eqs. (13)‒(15), yielding the following expressions:
Performing the operations involving the scalar
in Eqs. (63)‒(68) yields the following expression for the 1st-LVSS:
The 1st-LVSS comprising Eqs. (69)‒(74) represents the specific form taken on by the general NODE-representation of the 1st-LVSS provided by Eqs. (44) and (45) for the Nordheim-Fuchs model. Comparing Eqs. (69)‒(74) to Eqs. (44) and (45) indicates the following correspondences:
It is evident that the 1st-LVSS would need to be solved repeatedly in order to compute the 1st-level variational function for every possible variation in the model parameters and in the initial conditions (“encoder”). This computationally expensive path can be avoided by applying the concepts of the 1st-CASAM-NODE previously outlined in Subsection 4.1, as follows:
1. Consider that the 1st-level variational function , is an element in a Hilbert space denoted as , , comprising elements of the form , , and being endowed with the inner product introduced in Eq. (50), which takes on the following particular form for the Nordheim-Fuchs model:
2. Use Eq. (79) to form the inner product of Eqs. (69)‒(71) with a yet undefined function , to obtain the following relation, which is the particular form taken on by Eq. (51) for the Nordheim-Fuchs model:
3. Integrating by parts the terms on the left side of Eq. (80) yields the following relation:
The relation obtained in Eq. (81) is the particular form taken on by Eq. (52) for the Nordheim-Fuchs model.
4. The definition of the function is now completed by requiring that: (i) the integral term on the right side of Eq. (81) represent the G-differential defined in Eq. (62), and (ii) the unknown values of the components of be eliminated from Eq. (81). These requirements will be satisfied if the function is the solution of the following “1st-Level Adjoint Sensitivity System (1st-LASS)”:
It is important to note that if the vector-valued function is linear in (in which case the NODE would be linear), then the 1st-level adjoint sensitivity function would not depend on , so the “forward solution path” would not need to be stored in order to compute . Otherwise, however, the “forward solution path” would need to be stored in order to compute .
5. Using Eqs. (84), (85), (80), (62), (72), (73) and (74) in Eq. (81) yields the following expression for the first G-differential of the response under consideration:
It follows from Eq. (86) that the first-order sensitivities of the response
with respect to the parameters and initial conditions underlying the Nordheim-Fuchs model have the following expressions, all of which are to be evaluated at the nominal values of the respective parameters and functions (but the superscript “zero” is omitted to simplify the notation):
5.4. First-Order Sensitivities of the Thermal Conductivity Response
The first-order G-differential of the response
defined in Eq. (34) is obtained as follows:
where the direct-effect and the indirect-effect terms, respectively, are defined as follows:
The direct-effect term yields the following sensitivities which can be evaluated immediately:
The indirect-effect term can be evaluated only after determining the variational function
, which is the solution of the 1st-LVSS defined by Eqs. (69)‒(74). The need to solve the 1st-LVSS repeatedly can be circumvented by applying the principles of the 1st-CASAM-NODE, as previously outlined. Thus, following the same procedure as detailed in Section 5.1 leads to the following 1st-LASS for the 1st-level adjoint sensitivity function, denoted as
, for computing the sensitivities stemming from the indirect-effect term
:
It is important to note that all of the following 1st-Level Adjoint Sensitivity Systems, enumerated in items (i) through (iv) below:
(i) the 1st-LASS defined by Eqs. (84) and (85), which is solved to obtain the corresponding 1st-level adjoint sensitivity function needed for computing the sensitivities of the component of the state function ;
(ii) the 1st-LASS defined by Eqs. (95) and (96), which is solved to obtain the corresponding 1st-level adjoint sensitivity function needed for computing the sensitivities of the component of the state function ;
(iii) the 1st-LASS defined by Eqs. (98) and (99), which is solved to obtain the corresponding 1st-level adjoint sensitivity function needed for computing the sensitivities of the component of the state function ; and
(iv) the 1st-LASS defined by Eqs. (104) and (105), which is solved to obtain the corresponding 1st-level adjoint sensitivity function needed for computing the sensitivities stemming from the indirect-effect term ;
have the same structures/operators on their left sides, and the respective adjoint sensitivity functions all satisfy the same final-time conditions; only the source terms on the right sides of the respective 1st-LASS differ from each other. Consequently, the same numerical procedures and/or neural nets can be used for computing the respective 1st-level adjoint sensitivity functions.
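The practical consequence of this shared structure can be illustrated with a minimal sketch. The two-component linear system below is a toy chosen purely for illustration (it is not the Nordheim-Fuchs model), and the matrix `A`, the horizon `T`, the time-integrated responses, and all function names are assumptions. The point is only that a single backward solver, here called `solve_adjoint`, serves every response once it is handed the appropriate source term.

```python
# Toy linear system dx/dt = A x on [0, T]; A, T and the responses are hypothetical.
A = [[-1.0, 0.5],
     [0.0, -2.0]]
T, N = 1.0, 2000
H = T / N

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def rk4_step(rhs, t, y, h):
    """One classical Runge-Kutta step; a negative h integrates backward in time."""
    k1 = rhs(t, y)
    k2 = rhs(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = rhs(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = rhs(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def solve_adjoint(source):
    """Solve -dpsi/dt = A^T psi + source(t) backward from psi(T) = 0; return psi(0)."""
    A_t = [list(col) for col in zip(*A)]  # transpose of A
    def rhs(t, psi):
        return [-(a + s) for a, s in zip(matvec(A_t, psi), source(t))]
    psi, t = [0.0, 0.0], T
    for _ in range(N):
        psi = rk4_step(rhs, t, psi, -H)
        t -= H
    return psi

def unit_source(k):
    """Source term for the toy response R_k = integral of x_k(t) over [0, T]."""
    return lambda t: [1.0 if i == k else 0.0 for i in range(2)]

def response(k, x0):
    """Forward evaluation of R_k (RK4 in time, trapezoidal rule for the integral)."""
    rhs = lambda t, x: matvec(A, x)
    x, t, total = list(x0), 0.0, 0.0
    for _ in range(N):
        x_new = rk4_step(rhs, t, x, H)
        total += 0.5 * H * (x[k] + x_new[k])
        x, t = x_new, t + H
    return total

# Same operator, same final-time condition, different sources:
# psi(0) holds the sensitivities of R_k to the two initial conditions.
PSI0 = {k: solve_adjoint(unit_source(k)) for k in (0, 1)}
for k, psi0 in PSI0.items():
    print(f"dR_{k}/dx0 =", psi0)
```

Because the toy system is linear, each `PSI0[k][j]` should agree with `response(k, e_j)` for the unit initial condition `e_j`, which offers a quick consistency check of the shared solver.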
Since the NODE is a first-order ODE, the corresponding 1st-LASS is solved “backwards” in time, starting at the final time-step
, as indicated by the general 1st-CASAM-NODE methodology presented in Section 4. If the NODE is linear in the state function (dependent variable)
, then the 1st-LASS will be independent of
, so the “forward solution path” would not need to be stored in order to compute the 1st-level adjoint sensitivity functions. In contradistinction, if the NODE is nonlinear in the state function (dependent variable)
, then the 1st-LASS will depend on
, so the “forward solution path” would need to be stored in order to compute the respective 1st-level adjoint sensitivity functions.
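The storage requirement can be seen in a minimal sketch that uses two scalar toy equations, which are assumptions made for illustration and are not taken from the Nordheim-Fuchs model: a linear equation dx/dt = -a x and a nonlinear equation dx/dt = -a x². For the toy response R, defined as the integral of x(t) over [0, T], the adjoint equation is -dψ/dt = (∂f/∂x) ψ + 1 with ψ(T) = 0, and ψ(0) gives dR/dx(0). The names `forward_path` and `adjoint_at_zero` are hypothetical.

```python
import math

A_COEF, X0, T, N = 1.0, 1.0, 1.0, 2000  # hypothetical toy values
H = T / N

def forward_path(f):
    """RK4 forward solve of dx/dt = f(x); returns the stored path x(t_n), n = 0..N."""
    path, x = [X0], X0
    for _ in range(N):
        k1 = f(x)
        k2 = f(x + 0.5 * H * k1)
        k3 = f(x + 0.5 * H * k2)
        k4 = f(x + H * k3)
        x += H / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        path.append(x)
    return path

def adjoint_at_zero(jac):
    """Backward trapezoidal solve of -dpsi/dt = jac(n) * psi + 1 with psi(T) = 0.

    jac(n) must return df/dx evaluated at the grid time t_n.
    """
    psi = 0.0
    for n in range(N - 1, -1, -1):
        psi = (psi * (1 + 0.5 * H * jac(n + 1)) + H) / (1 - 0.5 * H * jac(n))
    return psi

# Linear case: df/dx = -a is constant, so no forward path is required.
psi_lin = adjoint_at_zero(lambda n: -A_COEF)

# Nonlinear case: df/dx = -2 a x(t) depends on the state, so the
# forward path must be computed and stored before the backward solve.
path = forward_path(lambda x: -A_COEF * x * x)
psi_nl = adjoint_at_zero(lambda n: -2.0 * A_COEF * path[n])

print("linear    dR/dx0 =", psi_lin)  # closed form: (1 - exp(-a T)) / a
print("nonlinear dR/dx0 =", psi_nl)   # closed form: T / (1 + a x0 T)
```

In the nonlinear case the backward solve indexes into `path` at every step, which is exactly the stored “forward solution path”; in the linear case that array is never built.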
Furthermore, the same formal expressions are obtained for the sensitivities of the responses considered. Thus, the respective 1st-level adjoint sensitivity functions differ from each other according to the response considered, but the quadrature-schemes needed to evaluate the integrals defining the respective sensitivities are the same. Therefore, the same numerical procedures and/or neural nets can be used for computing the respective integrals that define the 1st-order sensitivities, while using the appropriate/corresponding 1st-level adjoint sensitivity functions. If the decoder-response depends on parameters/weights, additional sensitivities arise from the respective non-vanishing “direct-effect term.”
If simple relations can be obtained among the responses of interest, such as Eqs. (11) and (16) for the illustrative paradigm example, then the sensitivities of the various responses can be obtained by using these relationships, but this is seldom the case in practice.
5.5. Most Efficient Computation of First-Order Sensitivities: Application of the 1st-FASAM-N
In most, if not all, practical situations, the equations modeling the physical system under consideration can be recast to suit the computation of the response under consideration and, consequently, the computation of the response sensitivities with respect to the underlying model parameters. For example, the response
involves just the function
; hence, this response would ideally be computed, together with its sensitivities to parameters, by using an equation containing as few dependent variables as possible other than the ones [e.g.,
] needed for computing the response. Such an equation was obtained in Eq. (17), which contains just the dependent variable
, so it would be more advantageous to use it for the sensitivity analysis of
rather than use the entire system of equations underlying the Nordheim-Fuchs model, as was done, for illustrative purposes, in Section 5.1. Furthermore, the form of Eq. (17) indicates that the “features” (i.e., functions) of model parameters characterizing this balance equation can be chosen as follows:
where the vector of
primary model parameters is defined as follows:
Note that the vector includes the initial condition .
In terms of the “feature function”
, Eq. (17) can alternatively be written as follows:
In terms of the feature function
, the solution of Eq. (108) has the following form:
Of course, a specific NODE would need to be constructed to model Eq. (108).
The form of Eq. (108) is suitable for applying the “nth-Order Features Adjoint Sensitivity Analysis Methodology for Nonlinear Systems (nth-FASAM-N)” [24], which is the most efficient methodology for computing sensitivities, particularly for sensitivities of second- and higher-order. This methodology considers the specific “features” of model parameters, such as the function
, to compute sensitivities with respect to model parameters more efficiently than by considering directly the respective primary parameters.
For the computation of 1st-order sensitivities, the 1st-FASAM-N commences by constructing the 1st-Level Variational Sensitivity System (1st-LVSS) for the variational function
by applying the definition of the first-order G-differential to Eq. (108), which yields:
Performing the operations indicated in Eqs. (110) and (111) yields the following expression for the 1st-LVSS satisfied by the variational function
:
The 1st-LVSS represented by Eq. (112) is to be solved at the nominal values of the parameters and the state function, but the superscript “0” (which indicates “nominal values”) has been omitted to simplify the notation.
Numerically, the 1st-LVSS would need to be solved anew for the various variations
,
, in the components of the feature function
. This need for repeatedly solving the 1st-LVSS can be avoided by constructing the corresponding 1st-Level Adjoint Sensitivity System (1st-LASS). The Hilbert space appropriate for the construction of the 1st-LASS corresponding to Eq. (112) is endowed with the inner product of Eq. (79), which takes on the following particular form:
Using Eq. (114) to form the inner product of Eq. (112) with a yet undefined function
yields the following relation:
Integrating by parts the left side of Eq. (115) yields the following relation:
Identifying the integral on the right-side of Eq. (116) with the G-differential
of the response
obtained in Eq. (32) and eliminating the unknown value
from the right-side of Eq. (116) by setting
yields the following 1st-Level Adjoint Sensitivity System (1st-LASS) for the 1st-level adjoint sensitivity function
:
The 1st-LASS represented by Eqs. (117) and (118) is independent of variations in the feature functions (and/or parameters), so it would need to be solved only once, numerically. In the present case, the 1st-LASS can be solved analytically to obtain the following closed-form expression for the 1st-level adjoint sensitivity function
:
where
denotes the Heaviside functional.
Using Eqs. (116)‒(118) in Eq. (115) yields the following expression for the first-order total G-differential
of the response
in terms of the 1st-level adjoint function
:
It follows from Eqs. (120), (119) and (109) that the two sensitivities of the response
with respect to the two components of the feature function
have the following expressions:
The above expressions are to be evaluated at the nominal parameter values but the superscript “zero” has been omitted, for simplicity. The expressions obtained in Eqs. (121) and (122) can be verified by differentiating the expression provided in Eq. (109), evaluated at a user-chosen time within the interval .
The sensitivities of the response
with respect to the model parameters and initial condition are obtained by using the following “chain-rule” relationship:
The explicit expressions for the specific sensitivities of the response
with respect to the parameters underlying the feature functions are obtained using Eq. (123) in conjunction with Eqs. (121) and (122) while recalling the definitions of the feature functions
and
defined in Eq. (106). The detailed expressions of these sensitivities are as follows:
Notably, the application of the 1st-FASAM-N requires one “large-scale” computation to solve the 1st-LASS, cf. Eqs. (117) and (118), which is a single ODE, to obtain the 1st-level adjoint function , which is a scalar-valued function. However, solving the forward model, cf. Eq. (17), and the corresponding 1st-LASS, comprising Eqs. (117) and (118), would require the construction of a separate (albeit simpler) NODE. The 1st-level adjoint function is subsequently used in evaluating two integrals (by quadrature) to obtain the two sensitivities of the response with respect to the two components and of the feature function . Subsequently, all of the response sensitivities with respect to the model’s primary parameters are obtained analytically by using the chain rule to differentiate the components of the feature function with respect to the underlying model parameters and initial conditions.
In contradistinction, if one wishes to compute directly the sensitivities of the response with respect to the model parameters and initial conditions, it has been shown in Subsections 5.1‒5.4 that the original NODE can be used to solve (backward in time) the 1st-LASS, which comprises a system of three coupled ODEs (rather than a single ODE if the 1st-FASAM-N is used) for obtaining the 1st-level adjoint function, which is a vector-valued function comprising three components, cf. for the response . The respective vector-valued 1st-level adjoint function is subsequently used in computing six (rather than two, if the 1st-FASAM-N is used) integrals (by quadrature) to obtain the six sensitivities of the respective response with respect to the six model parameters.
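The feature-based route described above (one adjoint solve, sensitivities with respect to the features, then the chain rule) can be sketched on a toy problem. Everything in the sketch is an assumption made for illustration and does not reproduce Eq. (17) or Eq. (106): the scalar model dx/dt = -f1 x with x(0) = f2, the hypothetical features f1 = α1 α2 / α3 and f2 = α4, and the response R taken as the integral of x(t) over [0, T]. For this toy, the adjoint equation is -dψ/dt = -f1 ψ + 1 with ψ(T) = 0, which gives ∂R/∂f1 as minus the integral of ψ(t) x(t) and ∂R/∂f2 = ψ(0).

```python
import math

T, N = 1.0, 2000  # hypothetical horizon and grid size
H = T / N

def features(alpha):
    """Hypothetical features: f1 lumps three parameters, f2 is the initial condition."""
    a1, a2, a3, a4 = alpha
    return a1 * a2 / a3, a4

def feature_jacobian(alpha):
    """Analytic derivatives d f_i / d alpha_j (rows: f1, f2)."""
    a1, a2, a3, a4 = alpha
    return [[a2 / a3, a1 / a3, -a1 * a2 / a3 ** 2, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def response(alpha):
    """Closed-form R for the toy model, used only to check the sensitivities."""
    f1, f2 = features(alpha)
    return f2 * (1.0 - math.exp(-f1 * T)) / f1

def feature_sensitivities(alpha):
    """One backward adjoint solve, then quadrature; returns [dR/df1, dR/df2]."""
    f1, f2 = features(alpha)
    psi = [0.0] * (N + 1)  # psi(T) = 0
    for n in range(N - 1, -1, -1):  # trapezoidal rule, backward in time
        psi[n] = (psi[n + 1] * (1 - 0.5 * H * f1) + H) / (1 + 0.5 * H * f1)
    x = [f2 * math.exp(-f1 * n * H) for n in range(N + 1)]  # forward solution
    integrand = [p * xi for p, xi in zip(psi, x)]
    quad = H * (sum(integrand) - 0.5 * (integrand[0] + integrand[-1]))
    return [-quad, psi[0]]

def parameter_sensitivities(alpha):
    """Chain rule: dR/dalpha_j = sum_i (dR/df_i) (df_i/dalpha_j)."""
    dR_df = feature_sensitivities(alpha)
    jac = feature_jacobian(alpha)
    return [sum(dR_df[i] * jac[i][j] for i in range(2)) for j in range(4)]

alpha_nominal = [2.0, 3.0, 4.0, 1.5]  # hypothetical nominal parameter values
print(parameter_sensitivities(alpha_nominal))
```

In this toy, four parameter sensitivities follow from a single scalar adjoint solve and two feature sensitivities; the chain-rule step involves no further ODE solves. The results can be checked against central finite differences of `response`.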
Equations similar to Eq. (17) can be derived for the reactor-flux and reactor temperature responses, so the 1st-FASAM-N can be applied in a similar fashion to compute the first-order sensitivities of these responses. Using the sensitivities with respect to the reactor temperature response would readily provide the first-order sensitivities of the reactor thermal conductivity response. However, a specific NODE would need to be constructed for each of these responses. Of course, each of these specific NODEs would have a much simpler structure than the NODE for solving simultaneously the system of coupled ODEs presented in Subsections 5.1 through 5.4.