1. Introduction
The two most usual approaches involved in simulation based engineering –SBE–, the one mainly based on physics and the more recent one based on the use of data manipulated by advanced machine learning techniques, both present inherent limitations.
In general, the physics-based modeling framework produces responses that approximate the real ones quite well, as long as an accurate enough model exists. The main difficulties when considering such a physics-based approach are: (i) the fidelity of the model itself, assumed calibrated; (ii) the impact induced by variability and uncertainty; and (iii) the computing time required for solving the complex and intricate mathematical models.
On the other hand, the data-driven framework is not fully satisfactory either, because even if the data (assumed noise-free) represent the reality well, their extrapolation or interpolation in space and time from the collected data (at particular locations and time instants) usually entails a noticeable accuracy loss.
Of course, by increasing the amount of collected data, one could expect to approximate the real solution with more fidelity; however, data is not always simple to collect, not always possible to access, and in all cases collecting data is expensive (cost of sensors, cost of communication and analysis, ...). Equipping a very large industrial or civil infrastructure with millions of sensors to cover its whole spatial extent seems simply unreasonable.
Moreover, even when the solution is properly approximated, two difficulties persist: (i) the explainability of the solution, compulsory to certify solutions and decisions; and (ii) the domain of validity when extrapolating far from the domain where data was collected.
In view of the limitations of both existing procedures, a natural gateway consists of allying both to reconcile agility and fidelity. The hybrid paradigm seems a valuable and appealing option. It considers the reality to be expressible as the addition of two contributions: the existing knowledge (the state-of-the-art physics-based model or any other kind of knowledge-based model) and the part of the reality that the model ignores, the so-called ignorance (also called deviation, gap or discrepancy).
1.1. The three main SBE methodologies revisited
To introduce and discuss the different simulation based engineering frameworks, we consider a certain field $u(\mathbf{x})$ describing the evolution of the variable $u$ along the space defined by the coordinates $\mathbf{x} \in \Omega$.
We assume that there exists a certain knowledge on the addressed physics, described by a model that in general consists of a system of algebraic equations or a system of differential equations, complemented with the appropriate boundary and initial conditions ensuring the problem solvability. The solution provided by the just referred model, which, as indicated, represents the existing knowledge on the problem at hand, is denoted by $u^{m}(\mathbf{x})$.
Due to the partial knowledge on the addressed physical phenomenon, the calculated solution $u^{m}(\mathbf{x})$ is expected to differ from the reference one, $u^{r}(\mathbf{x})$, which is the one intended to be represented with maximum fidelity.
Thus, we define the residual $R(\mathbf{x})$ according to
$$R(\mathbf{x}) = u^{r}(\mathbf{x}) - u^{m}(\mathbf{x}),$$
where the error $E$ can be computed from its norm, $E = \|R(\mathbf{x})\|$.
To reduce that error, different possibilities exist:
Physics-based model improvement. This approach consists of refining the modelling by enriching the model itself, such that the solution of the refined model exhibits a smaller error.
Fully data-driven description. The data-driven route consists of performing a large sampling of the space $\Omega$ at points $\mathbf{x}_i$, $i = 1, \dots, N$, with $N$ large enough and with the points' locations maximizing the domain coverage. These points are grouped into the set $\mathcal{S}$.
The coverage is defined by the convex hull of the set $\mathcal{S}$, $\mathrm{CH}(\mathcal{S})$, ensuring interpolation for $\mathbf{x} \in \mathrm{CH}(\mathcal{S})$, and limiting the risky extrapolation to the region of $\Omega$ outside the convex hull $\mathrm{CH}(\mathcal{S})$.
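As an illustration of this coverage notion, a minimal sketch (assuming a hypothetical set of sample points in a two-dimensional domain $\Omega = [0,1]^2$) that classifies a query point as interpolation or extrapolation with respect to the convex hull of the collected points could read:

```python
import numpy as np
from scipy.spatial import Delaunay

# Hypothetical sampling of a 2D domain Omega = [0, 1]^2
rng = np.random.default_rng(0)
S = rng.random((200, 2))        # the collected points x_i grouped into the set S

# The Delaunay triangulation of S gives direct access to its convex hull CH(S)
tri = Delaunay(S)

def is_interpolation(x):
    """True if x lies inside CH(S), where inferring u(x) is an interpolation;
    False flags a risky extrapolation outside the covered region."""
    return bool(tri.find_simplex(np.atleast_2d(x))[0] >= 0)

print(is_interpolation([0.5, 0.5]))   # inside the covered region
print(is_interpolation([1.5, 0.5]))   # outside the convex hull
```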
Factorial samplings try to maximize the coverage; however, factorial samplings, or the ones based on the use of Gauss-Lobatto quadratures, related to approximations making use of orthogonal polynomials [3], fail when the dimensionality $d$ of the space increases.
When $d$ increases, sparse sampling is preferred, the Latin hypercube for instance. Samplings based on Gaussian processes –GP– aim at distributing the points in the locations where the uncertainty is maximum (with respect to the predictions inferred from the previously collected data).
Finally, the so-called active learning techniques drive the sampling aiming at maximizing the representation of a certain goal-oriented quantity of interest [32].
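As a minimal illustration of such sparse samplings (a sketch only, with illustrative dimensionality, sample size and bounds, not the designs used in the cited references), a Latin hypercube design can be generated with SciPy as follows:

```python
from scipy.stats import qmc

d, N = 6, 64                      # dimensionality and number of samples (illustrative values)
sampler = qmc.LatinHypercube(d=d, seed=0)
X_unit = sampler.random(n=N)      # N points in the unit hypercube [0, 1]^d

# Rescale the design to the actual bounds of each coordinate of Omega
lower, upper = [0.0] * d, [10.0] * d
X = qmc.scale(X_unit, lower, upper)
print(X.shape)                    # (64, 6): one row per collected point x_i
```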
In what follows, we assume a generic sampling giving access to the reference solution at the sampling points, $u^{r}(\mathbf{x}_i)$, $i = 1, \dots, N$, assuming perfect measurability.
Now, to infer the solution at any other point $\mathbf{x} \in \Omega$, it suffices to construct an interpolation or approximation, more generally an adequate regression $\tilde{u}(\mathbf{x})$, with $\tilde{u}(\mathbf{x}_i) \approx u^{r}(\mathbf{x}_i)$, $i = 1, \dots, N$.
Different possibilities exist, among them: regularized polynomial regressions [29], neural networks –NN– [16, 31], support vector regression –SVR– [9], decision trees and their random forest counterparts [4, 23], ... to name a few.
The trickiest issue concerns the error evaluation, which is quantified on a part of the data kept outside the training-set, the so-called test-set, used to assess the performances of the trained regression.
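To fix ideas, a minimal sketch of this training/test rationale, with synthetic data standing for the measured $u^{r}(\mathbf{x}_i)$ and a regularized polynomial regression as one of the possibilities listed above (all sizes and hyper-parameters are illustrative), could read:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

# Synthetic stand-in for the sampled reference solution u^r(x_i)
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(500, 2))
u_ref = np.sin(3 * X[:, 0]) * np.cos(2 * X[:, 1])

# Part of the data is kept outside the training-set: the test-set
X_train, X_test, u_train, u_test = train_test_split(X, u_ref, test_size=0.2, random_state=0)

# Regularized polynomial regression (ridge on polynomial features)
reg = make_pipeline(PolynomialFeatures(degree=5), Ridge(alpha=1e-3))
reg.fit(X_train, u_train)

# The error is quantified on the test-set only
err = mean_squared_error(u_test, reg.predict(X_test))
print(f"test mean squared error: {err:.3e}")
```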
The main drawbacks of such a general procedure, particularly exacerbated in the multi-dimensional case ($d \gg 1$), are the following:
– the ability to explain the regression $\tilde{u}(\mathbf{x})$;
– the size of the data-set, $N$, scaling with the problem dimensionality $d$;
– the optimal sampling to cover $\Omega$ while guaranteeing the accuracy of $\tilde{u}(\mathbf{x})$, or that of the goal-oriented quantities of interest.
Hybrid approach. It proceeds by embracing the physics-based and data-driven approaches, which, as described in the next section, improves the physics-based accuracy (while profiting from the physics-based explanatory capabilities) through a data-driven enrichment that, in turn, and under certain conditions, needs a smaller amount of data than the fully data-driven approach just discussed [6].
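As a conceptual sketch of this hybrid rationale (a minimal 1D illustration only: the functions u_model and u_ref are synthetic stand-ins for the knowledge-based prediction and the measured reality, and a random forest, one of the regressions mentioned above, learns the deviation):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def u_model(x):
    return np.sin(x)                         # simplified knowledge-based model

def u_ref(x):
    return np.sin(x) + 0.3 * np.cos(3 * x)   # reality = knowledge + ignorance

x_data = np.linspace(0.0, 6.0, 40).reshape(-1, 1)                 # collected points
deviation = u_ref(x_data).ravel() - u_model(x_data).ravel()       # the model's ignorance

# The data-driven enrichment only has to capture the part of the reality the model ignores
gap = RandomForestRegressor(n_estimators=200, random_state=0).fit(x_data, deviation)

def u_hybrid(x):
    return u_model(x).ravel() + gap.predict(x)

x_new = np.array([[2.5]])
print(u_hybrid(x_new), u_ref(x_new).ravel())
```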
5. Application to the evaluation of the top-oil temperature of an electric power transformer
This section addresses an applicative case of industrial relevance. The aging of transformers exhibits a correlation with the temperature of the oil all along their lives in service; moreover, the oil temperature seems to be an appealing feature to anticipate faults, and consequently to be used in predictive maintenance.
Some simplified models exist to evaluate the oil temperature depending on the transformer delivered power and the ambient temperature. Standards propose different simplified models for that purpose, such as the one considered below.
Because of the complexity of a transformer, large in size and embracing numerous physics and scales, an appealing modeling route consists of using a simplified model and then enriching it from the available collected data [22].
The just referred correction comes from the time-integration of a dynamical system, involving a parametric loading (delivered power and ambient temperature), learnt from the available data under stability constraints.
To illustrate the construction of the enriched model for the top-oil temperature prediction $\theta^{\mathrm{oil}}(t)$, we consider as inputs the ambient temperature $\theta^{\mathrm{amb}}(t)$ and the transformer load $K(t)$, which are both measured every hour. Thus, the model parameters read $\boldsymbol{\mu}(t) = \{\theta^{\mathrm{amb}}(t), K(t)\}$.
The transformer oil temperature results from
$$\theta^{\mathrm{oil}}(t) = \theta^{\mathrm{sim}}(t) + \Delta\theta(t),$$
where $\Delta\theta(t)$ represents the deviation and $\theta^{\mathrm{sim}}(t)$ the temperature predicted by a state-of-the-art simplified model [33], Eq. (46), which involves the following quantities:
– the position of the tap-changer;
– $K$: the load factor, a ratio between the nominal load current and the actual current;
– $P$: the iron, copper and supplementary losses. The power that heats the oil is composed of the losses that do not depend on the transformer load (iron losses, supposed constant) and the losses that depend on the transformer load (copper and supplementary losses), which depend on the average winding temperature and the load factor, with $k$ being a correction factor related to the material resistivity;
– $\theta^{\mathrm{amb}}$: the ambient temperature;
– $\theta^{\mathrm{sim}}$: the simulated top-oil temperature;
– $\Delta\theta^{\mathrm{oil}} = \theta^{\mathrm{sim}} - \theta^{\mathrm{amb}}$: the temperature difference between the simulated top-oil temperature and the ambient temperature;
– $R$ and $C$: the thermal resistance and thermal capacitance of the equivalent transformer thermal circuit;
– $\theta^{w}$: the average winding temperature;
– $\Delta\theta^{w} = \theta^{w} - \theta^{\mathrm{sim}}$: the difference between the average winding temperature and the simulated oil temperature $\theta^{\mathrm{sim}}$, supposed constant and found during the commissioning test (standards).
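The exact form of Eq. (46) is the one prescribed by the standard [33]. As an illustration only, a minimal sketch assuming the classical first-order equivalent thermal-circuit form with a quadratic load dependence of the losses (a plausible simplified form consistent with the quantities above, not necessarily the exact standard equation), integrated with an explicit Euler scheme, could read:

```python
import numpy as np

def top_oil_model(theta_amb, K, dt=3600.0,
                  R=5.0e-4, C=2.2e7, P_iron=20.0e3, P_copper_nom=80.0e3):
    """Assumed first-order thermal-circuit model (illustration, not the exact Eq. (46)):
    C * d(theta_sim - theta_amb)/dt = P - (theta_sim - theta_amb) / R.
    theta_amb [degC] and K [-] are hourly series; all parameter values are placeholders."""
    theta_sim = np.empty_like(theta_amb)
    d_theta = 0.0                                # top-oil temperature rise over ambient
    for n in range(len(theta_amb)):
        P = P_iron + P_copper_nom * K[n] ** 2    # load-independent + load-dependent losses
        d_theta += dt / C * (P - d_theta / R)    # explicit Euler step of the thermal circuit
        theta_sim[n] = theta_amb[n] + d_theta
    return theta_sim

# Hypothetical one-week hourly loading
hours = np.arange(7 * 24)
theta_amb = 15.0 + 8.0 * np.sin(2 * np.pi * hours / 24)
K = 0.7 + 0.2 * np.sin(2 * np.pi * hours / 24 + 1.0)
theta_sim = top_oil_model(theta_amb, K)
```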
The physics-based model (46) is calibrated from the available experimental data, from which the parameters $R$ and $C$ are obtained. Two physics-based models are used: a linear one, where $R$ and $C$ are assumed constant, and a nonlinear physics-based model, where $R$ and $C$ are temperature dependent, that is, both coefficients depend on the top-oil temperature $\theta^{\mathrm{sim}}$.
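A minimal calibration sketch, reusing the hypothetical top_oil_model function from the previous sketch and a hypothetical measured series theta_oil_meas (neither being the actual data or model of the paper), could fit constant $R$ and $C$ by least squares:

```python
import numpy as np
from scipy.optimize import least_squares

def residuals(p, theta_amb, K, theta_oil_meas):
    """Mismatch between the simplified model prediction and the measured top-oil
    temperature, for a candidate pair of constant parameters p = (R, C)."""
    R, C = p
    return top_oil_model(theta_amb, K, R=R, C=C) - theta_oil_meas

p0 = np.array([5.0e-4, 2.0e7])   # initial guess for (R, C), placeholder values
fit = least_squares(residuals, p0, args=(theta_amb, K, theta_oil_meas))
R_cal, C_cal = fit.x             # calibrated linear model: constant R and C
```

For the nonlinear physics-based model, $R$ and $C$ would instead be simple parametric functions of the top-oil temperature, whose coefficients are fitted in the same way.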
The available experimental data $\theta^{\mathrm{oil}}(t)$, the prediction from the calibrated physics-based nonlinear model (46), $\theta^{\mathrm{sim}}(t)$, and the deviation between both of them, $\Delta\theta(t)$, are all depicted in Figure 6.
The model correction (the deviation model, $\Delta\theta$) is also obtained using two different approaches: a linearized ResNet, whose update is linear in the deviation, and its nonlinear ResNet counterpart. In both cases, $\mathbf{z}$ represents the augmented features set, which contains the physical features $\boldsymbol{\mu}$ augmented with the model prediction $\theta^{\mathrm{sim}}$.
Functions $g$ and $h$ are described by using two LSTM architectures (described in Table 1 and Table 2, respectively, for the linearized ResNet) that consider the extended features involved in $\mathbf{z}$ at the present time, as well as at the previous 4 time-steps.
Thus, the linearized correction dynamical model reads
$$\Delta\theta_{n+1} = \Delta\theta_{n} + g(\mathbf{z}_{n}, \dots, \mathbf{z}_{n-4})\,\Delta\theta_{n} + h(\mathbf{z}_{n}, \dots, \mathbf{z}_{n-4}), \qquad (49)$$
where the subscript $n$ refers to the time $t_n$.
The neural network architectures considered for describing functions $g$ and $h$ are both based on the use of LSTM layers combined with a deep dense neural network layer, as described in Table 1 and Table 2 for the linearized ResNet. They were built up by using the TensorFlow Keras libraries. The inputs involved in Eq. (49) are shown in Figure 7.
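Table 1 and Table 2 give the exact architectures; as an illustration only, a minimal TensorFlow Keras sketch of such an LSTM-plus-dense network (layer sizes are placeholders, not the values of Tables 1 and 2), mapping the window of the last 5 time-steps of the extended features $\mathbf{z}$ to the scalar coefficients $g$ and $h$, could read:

```python
from tensorflow.keras import layers, Model

n_steps = 5      # present time-step plus the previous 4
n_features = 3   # extended features z: ambient temperature, load and model prediction

def make_coefficient_network(name):
    """One network per coefficient (g or h) of the linearized correction model.
    Layer sizes are illustrative placeholders, not the values of Tables 1 and 2."""
    inputs = layers.Input(shape=(n_steps, n_features))
    x = layers.LSTM(32)(inputs)           # recurrent layer digesting the feature window
    x = layers.Dense(16, activation="relu")(x)
    outputs = layers.Dense(1)(x)          # scalar coefficient at the current time-step
    return Model(inputs, outputs, name=name)

g_net = make_coefficient_network("g_net")
h_net = make_coefficient_network("h_net")
```

In practice both networks would be trained jointly, by minimizing the discrepancy between the predicted and measured deviations, with the stability constraints added to the training.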
The training was performed on the initial part of the available measures, while the testing was performed on the remaining ones.
The prediction performed with the corrected (enriched) model is depicted in Figure 8, on the testing interval. It is important to note that the learnt models $g$ and $h$ were used to integrate the dynamical problem that governs the time evolution of the deviation, Eq. (49); that is, the deviation at time $t_{n+1}$ was computed from the one at time $t_n$, and then it served to compute the one at the next time step $t_{n+2}$, and so on.
To distinguish between the known deviation $\Delta\theta$ and the one computed from the integrator based on the learnt dynamics $g$ and $h$, the latter will be denoted by the hat symbol, i.e. $\widehat{\Delta\theta}$.
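A minimal sketch of such an integration loop (assuming the update form of Eq. (49) and reusing the hypothetical g_net and h_net networks from the previous sketch) could read:

```python
import numpy as np

def integrate_deviation(delta0, z_windows, g_net, h_net):
    """Roll-out of the learnt deviation dynamics from an initial condition:
    each new deviation is computed from the *previously computed* one (hat quantities),
    not from the measured deviation. z_windows has shape (n_times, n_steps, n_features)."""
    delta_hat = np.empty(len(z_windows) + 1)
    delta_hat[0] = delta0
    for n, z_n in enumerate(z_windows):
        g_n = g_net.predict(z_n[None, ...], verbose=0).item()
        h_n = h_net.predict(z_n[None, ...], verbose=0).item()
        # Assumed update of Eq. (49): delta_{n+1} = delta_n + g_n * delta_n + h_n
        delta_hat[n + 1] = delta_hat[n] + g_n * delta_hat[n] + h_n
    return delta_hat
```

The enriched top-oil temperature is then recovered by adding the integrated deviation to the physics-based prediction, following the decomposition introduced above.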
From the computed correction, the model was enriched, exhibiting excellent accuracy and stability performances.
The fact that the linearized dynamical system operating on the solution correction exhibits good performances proves that most of the problem nonlinearities were captured by the first-order simplified model. When it comes to the nonlinear version of the correction, the dynamic version of the model is written in a manner similar to equation (49), using the same dynamical integration form.
To compare the performances of the hybrid model (the data-driven enrichment of the simplified physics-based model), the model governing the experimental data evolution was learnt using the same rationale, but now applied directly on the data; the parametric functions related to the measured oil temperature (the counterparts of $g$ and $h$) depend exclusively on the input features $\boldsymbol{\mu}$.
Again, the training was performed using the same model presented in Table 1 and Table 2, with the same training and test data-sets.
Figure 9 depicts the results obtained from the integration of the data-driven model, where again the hat symbol refers to the integrated temperature, proving the stability performances. However, when comparing the residual error (with respect to the experimental data) of the fully data-driven prediction and the one related to the physics-based enriched solution obtained from the data-driven model of the deviation, both compared in Figure 10, the hybrid modeling framework performs better, ensuring higher accuracy.
Table 3 compares the mean value of the errors associated with the different tested models. From Table 3, one can infer that:
– the hybrid approach improves the physics-based model performances;
– enriching a richer nonlinear physics-based model outperforms enriching the linear counterpart of the simplified physics-based model;
– when the considered physics-based models are too far from the reference solution (experimental data), the data-driven model can outperform the hybrid modeling.
Table 3.
Comparing the different built models: mean error (in °C) on the testing set. The nonlinear physics-based model mean error was 3.91 °C when used alone, over the same testing set, and 3.25 °C for the linear physics-based model.
| ResNet | Fully data-driven | HT from a linear physical model | HT from a nonlinear physical model |
| --- | --- | --- | --- |
| Linear stabilized | 2.173 | 3.143 | 1.620 |
| Nonlinear stabilized | 1.716 | 1.516 | 1.439 |
To show the effect of the stabilization, a ResNet is trained without enforcing the stability constraints previously proposed, using the same formulation but with no conditions imposed during the calculation of $g$ and $h$.
The discrete form provides excellent predictions. It is important to note that, here, the solution at time $t_{n+1}$, $\Delta\theta_{n+1}$, is computed from the exact deviation at time $t_n$, $\Delta\theta_n$.
However, a full integration, where $\widehat{\Delta\theta}_{n+1}$ is computed from the previously computed $\widehat{\Delta\theta}_{n}$, produces extremely bad predictions, a direct consequence of the lack of stability.
Figure 11 proves the importance of using stable formulations when the learnt model is expected to serve for performing integrations from an initial condition, as is always the case in prognosis applications.
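The difference between the two evaluation modes can be made explicit with a small sketch (an illustration only, assuming the same update form as above and hypothetical precomputed coefficient sequences g_vals and h_vals, of length one less than the measured deviation series):

```python
import numpy as np

def one_step_errors(delta_meas, g_vals, h_vals):
    """One-step predictions: delta at t_{n+1} is predicted from the *measured* delta at t_n.
    Errors can stay small even if the learnt dynamics are unstable."""
    pred = delta_meas[:-1] + g_vals * delta_meas[:-1] + h_vals
    return pred - delta_meas[1:]

def rollout_errors(delta_meas, g_vals, h_vals):
    """Full integration: delta at t_{n+1} is predicted from the previously *computed*
    deviation, so any instability of the learnt dynamics is amplified step after step."""
    delta_hat = np.empty_like(delta_meas)
    delta_hat[0] = delta_meas[0]
    for n in range(len(delta_meas) - 1):
        delta_hat[n + 1] = delta_hat[n] + g_vals[n] * delta_hat[n] + h_vals[n]
    return delta_hat - delta_meas
```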