
Machine Learning Method for Approximate Solutions for Reaction-Diffusion Equations with Multivalued Interaction Functions

Abstract
This paper presents machine learning methods for approximate solutions of reaction-diffusion equations with multivalued interaction functions. This approach addresses the challenge of finding all possible solutions for such equations, which often lack uniqueness. The proposed method utilizes physics-informed neural networks (PINNs) to approximate generalized solutions.
Keywords: 
Subject: Computer Science and Mathematics - Computational Mathematics

1. Introduction

In this paper we develop machine learning methods for approximate solutions of classes of reaction-diffusion equations with multivalued interaction functions, for which the Cauchy problem may have non-unique solutions. The relevance of this problem is primarily due to the lack of methods for finding all solutions of such mathematical objects, so every additional approximate method can be expected to yield further solutions. Moreover, existing methods for the approximate solution of nonlinear systems with partial derivatives without uniqueness are mostly theoretical and are used primarily in qualitative studies [1,2]. The availability of computational power for parallel computations and the creation of open-source software libraries such as PyTorch [3] have stimulated a new wave of development in IT and artificial intelligence methods. Sample-based methods for approximate solutions of such problems were first proposed in [4]. To date, such systems with smooth nonlinearities have been studied both qualitatively and numerically. There remains a need to develop a methodology for approximating generalized solutions of nonlinear differential-operator systems without uniqueness using recurrent neural networks, sample-based methods, and variations of the Monte Carlo method.
Let $T, \nu > 0$, and let $u_0 : \mathbb{R}^2 \to \mathbb{R}$ be a sufficiently smooth function. We consider the problem:
$$\frac{\partial u}{\partial t}(x,t) \in \nu \nabla^2 u(x,t) - f(u(x,t)), \qquad (x,t) \in \mathbb{R}^2 \times [0,T], \tag{1}$$
with initial conditions:
$$u(x,0) = u_0(x), \qquad x \in \mathbb{R}^2, \tag{2}$$
where
$$f(s) := \begin{cases} \{0\}, & s < 0; \\ [0,1], & s = 0; \\ \{1\}, & s > 0. \end{cases} \tag{3}$$
For a fixed $u_0 \in C_0(\mathbb{R}^2)$, let $\Omega \subset \mathbb{R}^2$ be a bounded domain with sufficiently smooth boundary such that $\operatorname{supp} u_0 \subset \Omega$. According to [1] (see the book and references therein), there exists a weak solution $u = u(x,t) \in L^2(0,T; H_0^1(\Omega))$ with $\frac{\partial u}{\partial t} \in L^2(0,T; H^{-1}(\Omega))$ of Problem (1)–(2) in the following sense:
$$-\int_0^T \int_\Omega u(x,t)\, v(x)\, \eta'(t)\, dx\, dt + \int_0^T \int_\Omega \big( \nu\, \nabla u(x,t) \cdot \nabla v(x) + d(x,t)\, v(x) \big)\, \eta(t)\, dx\, dt = 0, \tag{4}$$
for all $v \in C_0^\infty(\Omega)$, $\eta \in C_0^\infty(0,T)$, where $d : \mathbb{R}^2 \times [0,T] \to \mathbb{R}$ is a measurable function such that
$$d(x,t) \in f(u(x,t)) \quad \text{for a.e. } (x,t) \in \mathbb{R}^2 \times (0,T). \tag{5}$$
Such inclusions with multivalued nonlinearities appear in problems of climatology (the Budyko–Sellers model), chemical kinetics (the Belousov–Zhabotinsky equations), biology (Lotka–Volterra systems with diffusion), neurophysiology (the FitzHugh–Nagumo system), and engineering and medicine (various synthesis and impulse control problems); see [1,2] and references therein.
The main goal of this paper is to develop an algorithm for approximating solutions of classes of reaction-diffusion equations with multivalued interaction functions that allow for non-unique solutions of the Cauchy problem (1)–(2), via so-called physics-informed neural networks (PINNs); see [5,6,7] and references therein.

2. Methodology of Approximate Solutions for Reaction-Diffusion Equations with Multivalued Interaction Functions

Fix an arbitrary $T > 0$ and a sufficiently smooth function $u_0 : \mathbb{R}^2 \to \mathbb{R}$. We approximate the function $f$ by the following Lipschitz functions:
$$f_k(s) := \begin{cases} 0, & s < 0; \\ k s, & s \in [0, \tfrac{1}{k}); \\ 1, & s \geq \tfrac{1}{k}, \end{cases} \qquad k = 1, 2, \ldots. \tag{6}$$
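For concreteness, $f$ and its Lipschitz regularizations $f_k$ are elementary to implement. The following short Python sketch is ours (it is not part of the paper's source code); it evaluates $f_k$ and, for reference, returns the set $f(s)$ as an interval.

```python
import numpy as np

def f_multivalued(s):
    """Return the set f(s) as an interval (lo, hi); it is multivalued only at s = 0."""
    if s < 0.0:
        return (0.0, 0.0)      # {0}
    elif s == 0.0:
        return (0.0, 1.0)      # the whole interval [0, 1]
    else:
        return (1.0, 1.0)      # {1}

def f_k(s, k):
    """Lipschitz approximation: 0 for s < 0, k*s on [0, 1/k), and 1 for s >= 1/k."""
    return np.clip(k * np.asarray(s, dtype=float), 0.0, 1.0)

# The approximations converge pointwise to a selection of f away from s = 0:
print(f_k([-0.5, 0.0, 0.005, 0.5], k=100))   # -> [0.  0.  0.5 1. ]
```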
For a fixed $k = 1, 2, \ldots$, consider the problem:
$$\frac{\partial u_k}{\partial t}(x,t) = \nu \nabla^2 u_k(x,t) - f_k(u_k(x,t)), \qquad (x,t) \in \mathbb{R}^2 \times [0,T], \tag{7}$$
with initial conditions:
$$u_k(x,0) = u_0(x), \qquad x \in \mathbb{R}^2. \tag{8}$$
According to [2] and references therein, for each $k = 1, 2, \ldots$ Problem (7)–(8) has a unique solution $u_k \in C^{2,1}(\mathbb{R}^2 \times [0,T])$. Moreover, [8] implies that each convergent subsequence $\{u_{k_l}\}_{l = 1, 2, \ldots} \subseteq \{u_k\}_{k = 1, 2, \ldots}$ of the corresponding solutions of Problem (7)–(8) converges weakly to a solution $u$ of Problem (1)–(2) in the space
$$W := \Big\{ z \in L^2(0,T; H_0^1(\Omega)) : \tfrac{\partial z}{\partial t} \in L^2(0,T; H^{-1}(\Omega)) \Big\} \tag{9}$$
endowed with the standard graph norm, where $\Omega \subset \mathbb{R}^2$ is a bounded domain with sufficiently smooth boundary and $\operatorname{supp} u_0 \subset \Omega$.
Thus, the first step of the algorithm is to replace the function $f$ in Problem (1)–(2) with $f_k$, that is, to pass to Problem (7)–(8) for sufficiently large $k$.
Let us now consider Problem (7)–(8) for sufficiently large $k$. Theorem 16.1.1 from [5] allows us to reformulate Problem (7)–(8) as an infinite dimensional stochastic optimization problem over a certain function space. More precisely, let $t \in C([0,T]; (0,\infty))$, $\xi \in C(\mathbb{R}^2; (0,\infty))$, let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space, and let $T : \Omega \to [0,T]$ and $X : \Omega \to \mathbb{R}^2$ be independent random variables. Assume for all $A \in \mathcal{B}([0,T])$, $B \in \mathcal{B}(\mathbb{R}^2)$ that
$$\mathbb{P}(T \in A) = \int_A t(s)\, ds \qquad \text{and} \qquad \mathbb{P}(X \in B) = \int_B \xi(x)\, dx.$$
Note that $f_k : \mathbb{R} \to \mathbb{R}$ is Lipschitz continuous, and let $L_k : C^{2,1}(\mathbb{R}^2 \times [0,T]; \mathbb{R}) \to [0,\infty]$ satisfy for all $v = (v(x,t))_{(x,t) \in \mathbb{R}^2 \times [0,T]} \in C^{2,1}(\mathbb{R}^2 \times [0,T])$ that
$$L_k(v) = \mathbb{E}\bigg[ \big| v(X,0) - u_0(X) \big|^2 + \Big| \frac{\partial v}{\partial t}(X,T) - \nu \nabla^2 v(X,T) + f_k(v(X,T)) \Big|^2 \bigg].$$
Theorem 16.1.1 from [5] implies that the following two statements are equivalent:
  • It holds that $L_k(u_k) = \inf_{v \in C^{2,1}(\mathbb{R}^2 \times [0,T])} L_k(v)$.
  • The function $u_k \in C^{2,1}(\mathbb{R}^2 \times [0,T])$ is the solution of Problem (7)–(8).
Thus, the second step of the algorithm is to reduce the regularized Problem (7)–(8) to the following infinite dimensional stochastic optimization problem in $C^{2,1}(\mathbb{R}^2 \times [0,T])$:
$$L_k(v) \to \min, \qquad v \in C^{2,1}(\mathbb{R}^2 \times [0,T]). \tag{10}$$
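For illustration, the loss $L_k(v)$ can be estimated by plain Monte Carlo sampling for any sufficiently smooth candidate $v$, with the derivatives in the residual of (7) obtained by automatic differentiation. The sketch below is ours and not part of the paper's source code; it assumes PyTorch and, purely for concreteness, uses the sampling densities chosen later in Section 3 (uniform $T$ and Gaussian $X$).

```python
import torch

def f_k_torch(s, k):
    """Torch version of the Lipschitz approximation f_k."""
    return torch.clamp(k * s, 0.0, 1.0)

def monte_carlo_loss(v, u0, k, nu, T_final, n_samples=4096):
    """Monte Carlo estimate of L_k(v): E|v(X,0) - u0(X)|^2
    + E|dv/dt(X,T) - nu * Laplacian v(X,T) + f_k(v(X,T))|^2."""
    x = 2.0 * torch.randn(n_samples, 2)          # X ~ N(0, 4 I_2), as in Section 3
    t = T_final * torch.rand(n_samples, 1)       # T ~ U[0, T_final]
    x.requires_grad_(True)
    t.requires_grad_(True)

    # initial-condition term
    init_term = (v(x, torch.zeros_like(t)) - u0(x)) ** 2

    # PDE residual term, derivatives via automatic differentiation
    u = v(x, t)
    u_t = torch.autograd.grad(u.sum(), t, create_graph=True)[0]
    grad_x = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    lap = 0.0
    for i in range(2):
        lap = lap + torch.autograd.grad(grad_x[:, i].sum(), x, create_graph=True)[0][:, i:i + 1]
    residual = u_t - nu * lap + f_k_torch(u, k)

    return (init_term + residual ** 2).mean()
```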
However, due to its infinite dimensionality, the optimization problem (10) is not yet suitable for numerical computations. Therefore, we apply the third step, the so-called Deep Galerkin Method (DGM) [9]: we transform this infinite dimensional stochastic optimization problem into a finite dimensional one by incorporating artificial neural networks (ANNs); see [5,9] and references therein. Let $a : \mathbb{R} \to \mathbb{R}$ be differentiable, let $h \in \mathbb{N}$, $l_1, l_2, \ldots, l_h, d \in \mathbb{N}$ satisfy $d = 4 l_1 + \big[ \sum_{k=2}^{h} l_k (l_{k-1} + 1) \big] + l_h + 1$, and let $L_{k,h} : \mathbb{R}^d \to [0,\infty)$ satisfy for all $\theta \in \mathbb{R}^d$ that
$$L_{k,h}(\theta) = L_k\big( N^{\theta,3}_{M_{a,l_1}, M_{a,l_2}, \ldots, M_{a,l_h}, \mathrm{id}_{\mathbb{R}}} \big) = \mathbb{E}\bigg[ \Big| N^{\theta,3}_{M_{a,l_1}, \ldots, M_{a,l_h}, \mathrm{id}_{\mathbb{R}}}(X, 0) - u_0(X) \Big|^2 + \Big| \frac{\partial N^{\theta,3}_{M_{a,l_1}, \ldots, M_{a,l_h}, \mathrm{id}_{\mathbb{R}}}}{\partial t}(X, T) - \nu \nabla^2 N^{\theta,3}_{M_{a,l_1}, \ldots, M_{a,l_h}, \mathrm{id}_{\mathbb{R}}}(X, T) + f_k\Big( N^{\theta,3}_{M_{a,l_1}, \ldots, M_{a,l_h}, \mathrm{id}_{\mathbb{R}}}(X, T) \Big) \Big|^2 \bigg],$$
where $M_{\psi,d}$ denotes the $d$-dimensional version of a function $\psi$, that is,
$$M_{\psi,d} : \mathbb{R}^d \to \mathbb{R}^d$$
is the function which satisfies, for all $x = (x_k)_{k \in \{1, 2, \ldots, d\}} \in \mathbb{R}^d$ and $y = (y_k)_{k \in \{1, 2, \ldots, d\}} \in \mathbb{R}^d$ with $y_k = \psi(x_k)$ for all $k \in \{1, 2, \ldots, d\}$, that
$$M_{\psi,d}(x) = y;$$
for each $d, L \in \mathbb{N}$, $l_0, l_1, \ldots, l_L \in \mathbb{N}$, $\theta \in \mathbb{R}^d$ satisfying $d \geq \sum_{k=1}^{L} l_k (l_{k-1} + 1)$, and for functions $\Psi_k : \mathbb{R}^{l_k} \to \mathbb{R}^{l_k}$, $k \in \{1, 2, \ldots, L\}$, we denote by $N^{\theta, l_0}_{\Psi_1, \Psi_2, \ldots, \Psi_L} : \mathbb{R}^{l_0} \to \mathbb{R}^{l_L}$ the realization function of the fully connected feedforward artificial neural network associated to $\theta$, with $L + 1$ layers of dimensions $(l_0, l_1, \ldots, l_L)$ and activation functions $(\Psi_1, \Psi_2, \ldots, \Psi_L)$, defined as
$$N^{\theta, l_0}_{\Psi_1, \Psi_2, \ldots, \Psi_L}(x) = \Big( \Psi_L \circ A^{\theta, \sum_{k=1}^{L-1} l_k (l_{k-1}+1)}_{l_L, l_{L-1}} \circ \Psi_{L-1} \circ A^{\theta, \sum_{k=1}^{L-2} l_k (l_{k-1}+1)}_{l_{L-1}, l_{L-2}} \circ \cdots \circ \Psi_2 \circ A^{\theta, l_1 (l_0 + 1)}_{l_2, l_1} \circ \Psi_1 \circ A^{\theta, 0}_{l_1, l_0} \Big)(x)$$
for all $x \in \mathbb{R}^{l_0}$; and for each $d, m, n \in \mathbb{N}$, $s \in \mathbb{N}_0 := \mathbb{N} \cup \{0\}$, $\theta = (\theta_1, \theta_2, \ldots, \theta_d) \in \mathbb{R}^d$ satisfying $d \geq s + m n + m$, the affine function $A^{\theta, s}_{m,n}$ from $\mathbb{R}^n$ to $\mathbb{R}^m$ associated to $(\theta, s)$ is defined as
$$A^{\theta, s}_{m,n}(x) = \begin{pmatrix} \theta_{s+1} & \theta_{s+2} & \cdots & \theta_{s+n} \\ \theta_{s+n+1} & \theta_{s+n+2} & \cdots & \theta_{s+2n} \\ \theta_{s+2n+1} & \theta_{s+2n+2} & \cdots & \theta_{s+3n} \\ \vdots & \vdots & & \vdots \\ \theta_{s+(m-1)n+1} & \theta_{s+(m-1)n+2} & \cdots & \theta_{s+mn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} + \begin{pmatrix} \theta_{s+mn+1} \\ \theta_{s+mn+2} \\ \vdots \\ \theta_{s+mn+m} \end{pmatrix}$$
for all $x = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$.
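Concretely, $N^{\theta,3}_{M_{a,l_1}, \ldots, M_{a,l_h}, \mathrm{id}_{\mathbb{R}}}$ is an ordinary fully connected feedforward network: input dimension $l_0 = 3$ (the coordinates $(x_1, x_2, t)$), hidden widths $l_1, \ldots, l_h$ with the activation $a$ applied componentwise, and a one-dimensional affine output layer. A minimal PyTorch sketch follows; the helper name and the Swish/SiLU choice for $a$ are ours, and the printed parameter count reproduces the formula for $d$ above.

```python
import torch
import torch.nn as nn

def make_network(hidden=(50, 50, 50, 50), activation=nn.SiLU):
    """Fully connected network R^3 -> R: input (x1, x2, t), hidden widths l_1, ..., l_h,
    activation a on every hidden layer, identity on the affine output layer."""
    dims = (3,) + tuple(hidden) + (1,)
    layers = []
    for i in range(len(dims) - 1):
        layers.append(nn.Linear(dims[i], dims[i + 1]))
        if i < len(dims) - 2:              # no activation after the output layer
            layers.append(activation())
    return nn.Sequential(*layers)

net = make_network()
# The parameter count matches d = 4*l_1 + sum_{k=2}^h l_k (l_{k-1} + 1) + l_h + 1:
print(sum(p.numel() for p in net.parameters()))   # 200 + 3*2550 + 51 = 7901 for l_1 = ... = l_4 = 50
```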
The final step of the derivation approximates the minimizer of $L_{k,h}$ by stochastic gradient descent optimization methods [5]. Let $\xi \in \mathbb{R}^d$, $J \in \mathbb{N}$, $(\gamma_n)_{n \in \mathbb{N}} \subseteq [0,\infty)$, and for each $n \in \mathbb{N}$, $j \in \{1, 2, \ldots, J\}$ let $T_{n,j} : \Omega \to [0,T]$ and $X_{n,j} : \Omega \to \mathbb{R}^2$ be random variables. Assume for each $n \in \mathbb{N}$, $j \in \{1, 2, \ldots, J\}$, $A \in \mathcal{B}([0,T])$, $B \in \mathcal{B}(\mathbb{R}^2)$ that
$$\mathbb{P}(T \in A) = \mathbb{P}(T_{n,j} \in A) \qquad \text{and} \qquad \mathbb{P}(X \in B) = \mathbb{P}(X_{n,j} \in B).$$
Let $\ell_{k,h} : \mathbb{R}^d \times \mathbb{R}^2 \times [0,T] \to \mathbb{R}$ be defined as
$$\ell_{k,h}(\theta, x, t) = \Big| N^{\theta,3}_{M_{a,l_1}, M_{a,l_2}, \ldots, M_{a,l_h}, \mathrm{id}_{\mathbb{R}}}(x, 0) - u_0(x) \Big|^2 + \Big| \frac{\partial N^{\theta,3}_{M_{a,l_1}, M_{a,l_2}, \ldots, M_{a,l_h}, \mathrm{id}_{\mathbb{R}}}}{\partial t}(x, t) - \nu \nabla^2 N^{\theta,3}_{M_{a,l_1}, M_{a,l_2}, \ldots, M_{a,l_h}, \mathrm{id}_{\mathbb{R}}}(x, t) + f_k\Big( N^{\theta,3}_{M_{a,l_1}, M_{a,l_2}, \ldots, M_{a,l_h}, \mathrm{id}_{\mathbb{R}}}(x, t) \Big) \Big|^2, \tag{13}$$
for each $\theta \in \mathbb{R}^d$, $x \in \mathbb{R}^2$, $t \in [0,T]$, and let $\Theta = (\Theta_n)_{n \in \mathbb{N}_0} : \mathbb{N}_0 \times \Omega \to \mathbb{R}^d$ satisfy for all $n \in \mathbb{N}$ that
$$\Theta_0 = \xi \qquad \text{and} \qquad \Theta_n = \Theta_{n-1} - \gamma_n \bigg[ \frac{1}{J} \sum_{j=1}^{J} (\nabla_\theta \ell_{k,h})(\Theta_{n-1}, X_{n,j}, T_{n,j}) \bigg]. \tag{14}$$
Ultimately, for sufficiently large $k, h, n \in \mathbb{N}$, the realization $N^{\Theta_n, 3}_{M_{a,l_1}, M_{a,l_2}, \ldots, M_{a,l_h}, \mathrm{id}_{\mathbb{R}}}$ is chosen as an approximation:
$$N^{\Theta_n, 3}_{M_{a,l_1}, M_{a,l_2}, \ldots, M_{a,l_h}, \mathrm{id}_{\mathbb{R}}} \approx u$$
of the unknown solution u of (1)–(2) in the space W defined in (9).
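In code, the recursion (14) amounts to mini-batch gradient descent on the Monte Carlo loss; in practice the update is usually carried out by a variant such as Adam. The following training-loop sketch reuses the monte_carlo_loss and make_network helpers introduced above (our own illustrative code, not the authors' implementation); it draws a fresh batch at every step, exactly as in (14).

```python
import torch

def train(u0, k=100, nu=1.0, T_final=3.0, n_steps=20_000, batch=256, lr=1e-3):
    """Approximate the solution of the regularized Problem (7)-(8) by minimizing the
    Monte Carlo loss with Adam (plain SGD with step sizes gamma_n would follow (14) literally)."""
    net = make_network()
    opt = torch.optim.Adam(net.parameters(), lr=lr)

    def v(x, t):                           # candidate v = realization N^{theta,3}(x, t)
        return net(torch.cat([x, t], dim=1))

    for _ in range(n_steps):
        opt.zero_grad()
        loss = monte_carlo_loss(v, u0, k=k, nu=nu, T_final=T_final, n_samples=batch)
        loss.backward()
        opt.step()
    return net
```

The default values of k, nu and the learning rate here are placeholders for illustration; they are not taken from the paper.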
So, the following theorem is justified.
Theorem 1. 
Let $T > 0$ and $u_0 \in C_0(\mathbb{R}^2)$. Then the sequence $\big\{ N^{\Theta_n, 3}_{M_{a,l_1}, M_{a,l_2}, \ldots, M_{a,l_h}, \mathrm{id}_{\mathbb{R}}} \big\}_{k, h, n}$ defined in (13)–(14) has an accumulation point in the weak topology of the space $W$ defined in (9). Moreover, each such partial limit is a solution of Problem (1)–(2) in the sense of (4)–(5).
Proof. 
According to Steps 1–4 above, to derive PINNs we approximate $u$ in the space $W$ defined in (9) by a deep ANN $N^\theta : \mathbb{R}^2 \times [0,T] \to \mathbb{R}$ with parameters $\theta \in \mathbb{R}^d$ and minimize the empirical risk associated with $L_k$ over the parameter space $\mathbb{R}^d$. More precisely, we approximate the solution $u$ of (1)–(2) by $N^{\theta^*}$, where
$$\theta^* \in \arg\min_{\theta \in \mathbb{R}^d} \frac{1}{n} \sum_{i=1}^{n} \bigg[ \big| N^\theta(X_i, 0) - u_0(X_i) \big|^2 + \Big| \frac{\partial N^\theta}{\partial t}(X_i, T_i) - \nu \nabla^2 N^\theta(X_i, T_i) + f_k\big( N^\theta(X_i, T_i) \big) \Big|^2 \bigg]$$
for a suitable choice of training data $\{(X_i, T_i)\}_{i=1}^{n}$. Here $n \in \mathbb{N}$ denotes the number of training samples and the pairs $(X_i, T_i)$, $i \in \{1, 2, \ldots, n\}$, denote realizations of the random variables $X$ and $T$.
Analogously, to derive DGMs we approximate $u$ by the realization $G^\theta : \mathbb{R}^2 \times [0,T] \to \mathbb{R}$ of a deep Galerkin method (DGM) network with parameters $\theta \in \mathbb{R}^d$ and minimize the empirical risk associated with $L_{k,h}$ over the parameter space $\mathbb{R}^d$. More precisely, we approximate the solution $u$ of (1)–(2) by $G^{\theta^*}$, where
$$\theta^* \in \arg\min_{\theta \in \mathbb{R}^d} \frac{1}{n} \sum_{i=1}^{n} \bigg[ \big| G^\theta(X_i, 0) - u_0(X_i) \big|^2 + \Big| \frac{\partial G^\theta}{\partial t}(X_i, T_i) - \nu \nabla^2 G^\theta(X_i, T_i) + f_k\big( G^\theta(X_i, T_i) \big) \Big|^2 \bigg]$$
for a suitable choice of training data $\{(X_i, T_i)\}_{i=1}^{n}$. Here $n \in \mathbb{N}$ denotes the number of training samples and the pairs $(X_i, T_i)$, $i \in \{1, 2, \ldots, n\}$, denote realizations of the random variables $X$ and $T$.    □
The empirical risk minimization problems for PINNs and DGMs are typically solved using SGD or variants thereof, such as Adam [5]. The gradients of the empirical risk with respect to the parameters θ can be computed efficiently using automatic differentiation, which is commonly available in deep learning frameworks such as TensorFlow and PyTorch. We provide implementation details and numerical simulations for PINNs and DGMs in the next section.

3. Numerical Implementation

Let us present a straightforward implementation of the method detailed in the previous section for approximating a solution $u \in W$ of Problem (1)–(2) with the initial condition $u_0(x) := \psi(x_1^2 + x_2^2)$, where
$$\psi(s) := \begin{cases} \sin\!\Big( 8 \pi \exp\!\big( 1 - \tfrac{3}{3 - s} \big) \Big), & s \in [0, 3); \\[2pt] 0, & \text{otherwise}, \end{cases} \tag{15}$$
$(x_1, x_2) \in \mathbb{R}^2$. Let $k = 0.01$. This implementation follows the original proposal of [6]. First, 20,000 realizations of the random variable $(X, T)$ are chosen, where $T$ is uniformly distributed over $[0, 3]$ and $X$ follows a normal distribution on $\mathbb{R}^2$ with mean $0 \in \mathbb{R}^2$ and covariance matrix $4 I_2 \in \mathbb{R}^{2 \times 2}$. A fully connected feedforward ANN with 4 hidden layers of 50 neurons each and the Swish activation function is then trained. The training process uses batches of size 256 sampled from the 20,000 preselected realizations of $(X, T)$, and optimization is carried out with the Adam stochastic gradient descent method. A plot of the resulting approximation of the solution $u$ after 20,000 training steps is shown in Figure 1.
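For readers who wish to reproduce a rough version of this experiment, the following sketch wires the helpers from Section 2 into the setup described here and evaluates the trained network on the grid shown in Figure 1. It is our own illustrative code, not the authors' Source code 1: the expression for $\psi$ follows our reading of (15), the values of $\nu$ and $k$ are assumptions, and fresh mini-batches are drawn at every step instead of resampling from a fixed pool of 20,000 points.

```python
import math
import torch

def psi(s):
    """Oscillating bump supported on [0, 3), following our reading of (15)."""
    safe = torch.clamp(s, max=2.999)                       # avoid division by zero at s = 3
    val = torch.sin(8 * math.pi * torch.exp(1.0 - 3.0 / (3.0 - safe)))
    return torch.where(s < 3.0, val, torch.zeros_like(s))

def u0(x):                                                  # u_0(x) = psi(x_1^2 + x_2^2)
    return psi((x ** 2).sum(dim=1, keepdim=True))

# train as sketched in Section 2; nu and k are illustrative choices, not values from the paper
net = train(u0, k=100, nu=0.01, T_final=3.0, n_steps=20_000, batch=256)

# evaluate the approximation on [-3, 3]^2 at the times shown in Figure 1
xs = torch.linspace(-3.0, 3.0, 101)
gx, gy = torch.meshgrid(xs, xs, indexing="ij")
grid = torch.stack([gx.reshape(-1), gy.reshape(-1)], dim=1)
for t_val in (0.0, 0.6, 1.2, 1.8, 2.4, 3.0):
    t = torch.full((grid.shape[0], 1), t_val)
    with torch.no_grad():
        u_approx = net(torch.cat([grid, t], dim=1)).reshape(101, 101)
    # u_approx can now be plotted, e.g. with matplotlib's imshow
```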

4. Conclusions

In this paper, we presented a novel machine learning methodology for approximating solutions to reaction-diffusion equations with multivalued interaction functions, a class of equations characterized by non-unique solutions. The proposed approach leverages the power of physics-informed neural networks (PINNs) to provide approximate solutions, addressing the need for new methods in this domain.
Our methodology consists of four key steps:
  • Approximation of the Interaction Function: We replaced the multivalued interaction function with a sequence of Lipschitz continuous functions, ensuring the problem becomes well-posed.
  • Formulation of the Optimization Problem: The regularized problem was reformulated as an infinite-dimensional stochastic optimization problem.
  • Application of Deep Galerkin Method (DGM): We transformed the infinite-dimensional problem into a finite-dimensional one by incorporating artificial neural networks (ANNs).
  • Optimization and Approximation: Using stochastic gradient descent (SGD) optimization methods, we approximated the minimizer of the empirical risk, yielding an approximation of the unknown solution.
The numerical implementation demonstrated the effectiveness of the proposed method. We used a fully connected feed-forward ANN to approximate the solution of a reaction-diffusion equation with specific initial conditions. The results showed that the PINN method could approximate solutions accurately, as evidenced by the visual plots.
The key contributions of this paper are as follows:
  • Development of a Machine Learning Framework: We established a robust framework using PINNs to tackle reaction-diffusion equations with multivalued interaction functions.
  • Handling Non-Uniqueness: Our method addresses the challenge of non-unique solutions, providing a practical tool for approximating generalized solutions.
  • Numerical Validation: We provided a detailed implementation and numerical validation, demonstrating the practical applicability of the proposed approach.
Future work could explore the extension of this methodology to other classes of partial differential equations with multivalued interaction functions, as well as further optimization and refinement of the neural network architectures used in the approximation process. The integration of more advanced machine learning techniques and the exploration of their impact on the accuracy and efficiency of the solutions also present promising avenues for research.

Author Contributions

All the authors contributed equally to this work.

Funding

This research was funded by EIT Manufacturing asbl, 0123U103025, grant: “EuroSpaceHub - increasing the transfer of space innovations and technologies by bringing together the scientific community, industry and startups in the space industry”. The second and the third authors were partially supported by NRFU project No. 2023.03/0074 “Infinite-dimensional evolutionary equations with multivalued and stochastic dynamics”.

Institutional Review Board Statement

The authors have nothing to declare.

Informed Consent Statement

The authors have nothing to declare.

Data Availability Statement

Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.

Conflicts of Interest

The authors have no relevant financial or non-financial interests to disclose.

References

  1. Zgurovsky, M.Z.; Mel’nik, V.S.; Kasyanov, P.O. Evolution Inclusions and Variation Inequalities for Earth Data Processing I: Operator Inclusions and Variation Inequalities for Earth Data Processing; Vol. 24, Springer Science & Business Media, 2010.
  2. Zgurovsky, M.Z.; Kasyanov, P.O. Qualitative and quantitative analysis of nonlinear systems; Springer, 2018.
  3. Paszke, A.; Gross, S.; Chintala, S.; Chanan, G. PyTorch, 2016. Accessed on June 5, 2024.
  4. Rust, J. Using randomization to break the curse of dimensionality. Econometrica 1997, 65, 487–516.
  5. Jentzen, A.; Kuckuck, B.; von Wurstemberger, P. Mathematical introduction to deep learning: methods, implementations, and theory. arXiv preprint arXiv:2310.20360 2023.
  6. Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 2019, 378, 686–707.
  7. Beck, C.; Hutzenthaler, M.; Jentzen, A.; Kuckuck, B. An overview on deep learning-based approximation methods for partial differential equations. Discrete and Continuous Dynamical Systems - B 2023, 28, 3697–3746.
  8. Zgurovsky, M.Z.; Kasyanov, P.O.; Kapustyan, O.V.; Valero, J.; Zadoianchuk, N.V. Evolution Inclusions and Variation Inequalities for Earth Data Processing III: Long-Time Behavior of Evolution Inclusions Solutions in Earth Data Analysis; Vol. 27, Springer Science & Business Media, 2012.
  9. Sirignano, J.; Spiliopoulos, K. DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics 2018, 375, 1339–1364.
Figure 1. Plots of the functions $[-3,3]^2 \ni x \mapsto U(x,t) \in \mathbb{R}$, where $t \in \{0, 0.6, 1.2, 1.8, 2.4, 3\}$ and $U \in C(\mathbb{R}^2 \times [0,3])$ is an approximation of the solution $u$ of Problem (1)–(2) with $u_0(x) := \psi(x_1^2 + x_2^2)$, where $\psi$ is defined in (15), computed by means of the PINN method as implemented in Source code 1.
Listing 1. Modified version of source code from Section 16.3 of [5].
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.