On the Convergence of Randomized Block Kaczmarz Algorithm for Solving Matrix Equation

Submitted: 10 October 2023; Posted: 11 October 2023
Abstract
The randomized block Kaczmarz method and the randomized extended block Kaczmarz method are proposed for solving the matrix equation $AXB = C$, where the matrices $A$ and $B$ may be full rank or rank deficient. These methods are iterative, require no matrix-matrix multiplication, and are especially suitable for solving large-scale matrix equations. It is proved theoretically that these methods converge to the solution or the least-squares solution of the matrix equation. Numerical results show that these methods are more efficient than existing algorithms for high-dimensional matrix equations.
Keywords: 
Subject: Computer Science and Mathematics - Computational Mathematics

MSC:  65F10; 65F45; 65H10

1. Introduction

Consider the linear matrix equation
$$AXB = C, \tag{1}$$
where $A \in \mathbb{R}^{m\times p}$, $B \in \mathbb{R}^{q\times n}$ and $C \in \mathbb{R}^{m\times n}$. Such problems arise in linear control and filtering theory for continuous- or discrete-time large-scale dynamical systems. If Eq. (1) is consistent, $X^* = A^\dagger CB^\dagger$ is its minimum Frobenius-norm solution; if Eq. (1) is inconsistent, $X^* = A^\dagger CB^\dagger$ is its minimum Frobenius-norm least-squares solution. When the matrices $A$ and $B$ are small and dense, direct methods based on the QR factorization are attractive [1,2]. For large $A$ and $B$, however, iterative methods have attracted much attention [3,4,5,6,7]. Recently, Du et al. [8] proposed the randomized block coordinate descent (RBCD) method for the matrix least-squares problem $\min_{X\in\mathbb{R}^{p\times q}}\|C - AXB\|_F$ without a strong convexity assumption; this method requires the matrix $B$ to have full row rank. Wu et al. [9] introduced two Kaczmarz-type methods for the consistent matrix equation $AXB = C$: the relaxed greedy randomized Kaczmarz (ME-RGRK) method and the maximal weighted residual Kaczmarz (ME-MWRK) method. Although their row and column index selection strategies are time-consuming, the ideas of these two methods are well suited to solving large-scale consistent matrix equations.
In this paper, the randomized Kaczmarz method [10] and the randomized extended Kaczmarz method [11] are used to solve the consistent and inconsistent matrix equation (1) using only matrix-vector products.
All results in this paper hold over the complex field, but for simplicity we restrict the discussion to the real field.
In this paper, we denote by $A^T$, $A^\dagger$, $r(A)$, $R(A)$, $\|A\|_F = \sqrt{\operatorname{trace}(A^TA)}$ and $\langle A, B\rangle_F = \operatorname{trace}(A^TB)$ the transpose, the Moore-Penrose generalized inverse, the rank of $A$, the column space of $A$, the Frobenius norm of $A$ and the inner product of two matrices $A$ and $B$, respectively. For an integer $n \ge 1$, let $[n] = \{1, 2, \ldots, n\}$. We use $I$ to denote the identity matrix, whose order is clear from the context. In addition, for a given matrix $G = (g_{ij}) \in \mathbb{R}^{m\times n}$, we use $G_{i,:}$, $G_{:,j}$, $\sigma_{\max}(G)$ and $\sigma_{\min}(G)$ to denote the $i$th row, the $j$th column, the maximum singular value and the smallest nonzero singular value of $G$, respectively. Let $\mathbb{E}_k$ denote the expected value conditional on the first $k$ iterations, that is, $\mathbb{E}_k[\cdot] = \mathbb{E}[\cdot \mid i_0, j_0, i_1, j_1, \ldots, i_{k-1}, j_{k-1}]$, where $i_s$ and $j_s$ ($s = 0, 1, \ldots, k-1$) are the row and column indices chosen at the $s$th iteration. Let the conditional expectation with respect to the random row index be $\mathbb{E}_k^i[\cdot] = \mathbb{E}[\cdot \mid i_0, j_0, \ldots, i_{k-1}, j_{k-1}, j_k]$ and with respect to the random column index be $\mathbb{E}_k^j[\cdot] = \mathbb{E}[\cdot \mid i_0, j_0, \ldots, i_{k-1}, j_{k-1}, i_k]$. By the law of total expectation, it holds that $\mathbb{E}_k[\cdot] = \mathbb{E}_k^i[\mathbb{E}_k^j[\cdot]]$.
The organization of this paper is as follows. In Section 2, we discuss randomized block Kaczmarz methods (ME-RBK and ME-PRBK) for finding the minimal F-norm solution ($A^\dagger CB^\dagger$) of the consistent matrix Eq. (1). In Section 3, we discuss the randomized extended block Kaczmarz method (ME-REBK) for finding the minimal F-norm least-squares solution of the matrix Eq. (1). In Section 4, some numerical examples are provided to illustrate the effectiveness of our new methods. Finally, some brief concluding remarks are given in Section 5.

2. The Randomized Block Kaczmarz Method for Consistent Equation

At the $k$th iteration, the Kaczmarz method randomly selects a row $i \in [m]$ of $A$ and performs an orthogonal projection of the current estimate matrix $X^{(k)}$ onto the corresponding hyperplane $H_i = \{X \in \mathbb{R}^{p\times q} \mid A_{i,:}XB = C_{i,:}\}$, that is,
$$\min_{X\in\mathbb{R}^{p\times q}} \frac{1}{2}\|X - X^{(k)}\|_F^2 \quad \mathrm{s.t.}\quad A_{i,:}XB = C_{i,:}. \tag{2}$$
The Lagrangian function of the constrained optimization problem (2) is
$$L(X, Y) = \frac{1}{2}\|X - X^{(k)}\|_F^2 + \langle Y,\ A_{i,:}XB - C_{i,:}\rangle, \tag{3}$$
where $Y \in \mathbb{R}^{1\times n}$ is the Lagrange multiplier. By matrix differentiation, we obtain the gradient of $L(X, Y)$ and set $\nabla L(X, Y) = 0$ to find the stationary matrix:
$$\begin{cases} \nabla_X L(X, Y)\big|_{X^{(k+1)}} = X^{(k+1)} - X^{(k)} + A_{i,:}^TYB^T = 0, \\ \nabla_Y L(X, Y)\big|_{X^{(k+1)}} = A_{i,:}X^{(k+1)}B - C_{i,:} = 0. \end{cases} \tag{4}$$
By the first equation of (4), we have $X^{(k+1)} = X^{(k)} - A_{i,:}^TYB^T$. Substituting this into the second equation of (4) yields $Y = -\frac{1}{\|A_{i,:}\|_2^2}(C_{i,:} - A_{i,:}X^{(k)}B)(B^TB)^\dagger$. So the projected randomized block Kaczmarz (ME-PRBK) method for solving $AXB = C$ iterates as
$$X^{(k+1)} = X^{(k)} + \frac{A_{i,:}^T}{\|A_{i,:}\|_2^2}\,(C_{i,:} - A_{i,:}X^{(k)}B)\,B^\dagger. \tag{5}$$
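For concreteness, here is a minimal MATLAB sketch of one ME-PRBK step (5). This is our own illustration, not code from the paper: the variable names are ours, the row index i is assumed to have been sampled with probability proportional to $\|A_{i,:}\|_2^2$, and $B^\dagger$ is assumed to have been precomputed once as pinvB outside the iteration loop.

```matlab
% One ME-PRBK step (5) -- an illustrative sketch, not the authors' code.
% Assumes: X is the current p-by-q iterate, i is the sampled row index,
% and pinvB = pinv(B) was formed once before the iteration starts.
Ai = A(i, :);
X  = X + (Ai' / (Ai * Ai')) * ((C(i, :) - (Ai * X) * B) * pinvB);
```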
However, in practice, it is very expensive to calculate the pseudoinverse of large-scale matrices. Next, we generalize the average block Kaczmarz method [12] for solving linear equations to matrix equations.
At the $k$th step, we obtain the approximate solution $X^{(k+1)}$ by projecting the current estimate $X^{(k)}$ onto the hyperplane $H_{i,j} = \{X \in \mathbb{R}^{p\times q} \mid A_{i,:}XB_{:,j} = C_{i,j}\}$. Using the Lagrange multiplier method, we can obtain the following Kaczmarz method for $AXB = C$:
$$X^{(k+1)} = X^{(k)} + \frac{A_{i,:}^T\,(C_{i,j} - A_{i,:}X^{(k)}B_{:,j})\,B_{:,j}^T}{\|A_{i,:}\|_2^2\,\|B_{:,j}\|_2^2}. \tag{6}$$
Inspired by the idea of the average block Kaczmarz algorithm for $Ax = b$, we consider the average block Kaczmarz method for $AXB = C$ with respect to $B$:
$$X^{(k+1)} = X^{(k)} + \lambda\,\frac{A_{i,:}^T}{\|A_{i,:}\|_2^2}\sum_{j=1}^{n} v_j^k\,(C_{i,j} - A_{i,:}X^{(k)}B_{:,j})\,\frac{B_{:,j}^T}{\|B_{:,j}\|_2^2}, \tag{7}$$
where $\lambda$ is the stepsize and the $v_j^k$ are weights satisfying $v_j^k \ge 0$ and $\sum_{j=1}^{n} v_j^k = 1$. If $v_j^k = \frac{\|B_{:,j}\|_2^2}{\|B\|_F^2}$, then
$$X^{(k+1)} = X^{(k)} + \frac{\lambda}{\|B\|_F^2}\cdot\frac{A_{i,:}^T}{\|A_{i,:}\|_2^2}\,(C_{i,:} - A_{i,:}X^{(k)}B)\,B^T. \tag{8}$$
Setting $\alpha = \frac{\lambda}{\|B\|_F^2} > 0$, we obtain the following randomized block Kaczmarz iteration:
$$X^{(k+1)} = X^{(k)} + \frac{\alpha}{\|A_{i,:}\|_2^2}\,A_{i,:}^T\big(C_{i,:} - (A_{i,:}X^{(k)})B\big)B^T, \quad k = 0, 1, 2, \ldots, \tag{9}$$
where $i$ is selected with probability $p_i = \frac{\|A_{i,:}\|_2^2}{\|A\|_F^2}$. The cost of each iteration of this method is $4q(n+p) + p + 1 - 2q$ flops if the squared row norms of $A$ have been calculated in advance. We describe this method as Algorithm 1, which is called the ME-RBK algorithm.
[Algorithm 1: The ME-RBK algorithm (pseudocode presented as an image in the original).]
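To make the iteration concrete, the following MATLAB sketch implements (9) as we read Algorithm 1; the function name, interface and fixed iteration count are our own choices, and a practical driver would add the relative-error stopping test used in Section 4.

```matlab
function X = me_rbk(A, B, C, alpha, maxit)
% Sketch of the ME-RBK iteration (9); naming and interface are ours.
p   = size(A, 2);  q = size(B, 1);
X   = zeros(p, q);              % X(0) = 0 satisfies the range conditions of Theorem 1
nrm = sum(A.^2, 2);             % squared row norms ||A_{i,:}||_2^2
cdf = cumsum(nrm) / sum(nrm);   % cumulative distribution of p_i = ||A_{i,:}||_2^2 / ||A||_F^2
for k = 1:maxit
    i  = find(cdf >= rand, 1);  % sample row i with probability p_i
    Ai = A(i, :);
    % update (9); the parenthesization keeps every product matrix-vector sized
    X  = X + (alpha / nrm(i)) * (Ai' * ((C(i, :) - (Ai * X) * B) * B'));
end
end
```

Note that each pass touches a single row of $A$, consistent with the per-iteration flop count given above.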
Remark 1. 
Note that the problem of finding a solution of $AXB = C$ can be posed as the following linear least-squares problem:
$$\min_{X\in\mathbb{R}^{p\times q}}\frac{1}{2}\|AXB - C\|_F^2 = \min_{X\in\mathbb{R}^{p\times q}}\frac{1}{2}\sum_{i=1}^{m}\|A_{i,:}XB - C_{i,:}\|_2^2. \tag{10}$$
Define the component function
$$f_i(X) = \frac{1}{2}\|A_{i,:}XB - C_{i,:}\|_2^2;$$
differentiating with respect to $X$ gives its gradient
$$\nabla f_i(X) = A_{i,:}^T\,(A_{i,:}XB - C_{i,:})\,B^T.$$
Therefore, the randomized block Kaczmarz method (9) is equivalent to one step of the stochastic gradient descent method [13] applied to (10) with stepsize $\frac{\alpha}{\|A_{i,:}\|_2^2}$.
First, we give the following lemma, whose proof can be found in [8].
Lemma 1. 
[8] Let $A \in \mathbb{R}^{m\times p}$ and $B \in \mathbb{R}^{q\times n}$ be any nonzero matrices. Let
$$\mathcal{M} = \{M \in \mathbb{R}^{p\times q} \mid \exists\, Y \in \mathbb{R}^{m\times n}\ \mathrm{s.t.}\ M = A^TYB^T\}.$$
For any matrix $M \in \mathcal{M}$, it holds that
$$\|AMB\|_F^2 \ge \sigma_{\min}^2(A)\,\sigma_{\min}^2(B)\,\|M\|_F^2.$$
Remark 2. 
$M \in \mathcal{M}$ means that $M_{:,j} \in R(A^T)$, $j = 1, 2, \ldots, q$, and $(M_{i,:})^T \in R(B)$, $i = 1, 2, \ldots, p$. In fact, $\mathcal{M}$ is well defined because $0 \in \mathcal{M}$ and $A^\dagger CB^\dagger \in \mathcal{M}$.
In the following theorem, using the idea of the RK method [10], we prove that Algorithm 1 converges to the least F-norm solution of $AXB = C$.
Theorem 1. 
Assume $0 < \alpha < \frac{2}{\|B\|_2^2}$. If Eq. (1) is consistent, the sequence $\{X^{(k)}\}$ generated by the ME-RBK method, starting from an initial matrix $X^{(0)} \in \mathbb{R}^{p\times q}$ with $X_{:,j}^{(0)} \in R(A^T)$, $j = 1, 2, \ldots, q$, and $(X_{i,:}^{(0)})^T \in R(B)$, $i = 1, 2, \ldots, p$, converges linearly to $A^\dagger CB^\dagger$ in mean square. Moreover, the solution error in expectation for the iteration sequence $X^{(k)}$ obeys
$$\mathbb{E}\|X^{(k)} - A^\dagger CB^\dagger\|_F^2 \le \rho^k\,\|X^{(0)} - A^\dagger CB^\dagger\|_F^2, \tag{11}$$
where $\rho = 1 - \frac{(2\alpha - \alpha^2\|B\|_2^2)\,\sigma_{\min}^2(A)\,\sigma_{\min}^2(B)}{\|A\|_F^2}$ and $i \in [m]$ is picked with probability $p_i(A) = \frac{\|A_{i,:}\|_2^2}{\|A\|_F^2}$.
Proof. 
For $k = 0, 1, 2, \ldots$, by (9) and $A_{i,:}(A^\dagger CB^\dagger)B = C_{i,:}$ (consistency), we have
$$\begin{aligned} X^{(k+1)} - A^\dagger CB^\dagger &= X^{(k)} + \frac{\alpha}{\|A_{i,:}\|_2^2}A_{i,:}^T(C_{i,:} - A_{i,:}X^{(k)}B)B^T - A^\dagger CB^\dagger \\ &= (X^{(k)} - A^\dagger CB^\dagger) - \frac{\alpha}{\|A_{i,:}\|_2^2}A_{i,:}^TA_{i,:}(X^{(k)} - A^\dagger CB^\dagger)BB^T, \end{aligned}$$
then
$$\begin{aligned} \|X^{(k+1)} - A^\dagger CB^\dagger\|_F^2 ={}& \|X^{(k)} - A^\dagger CB^\dagger\|_F^2 + \frac{\alpha^2}{\|A_{i,:}\|_2^4}\big\|A_{i,:}^TA_{i,:}(X^{(k)} - A^\dagger CB^\dagger)BB^T\big\|_F^2 \\ &- \frac{2\alpha}{\|A_{i,:}\|_2^2}\big\langle X^{(k)} - A^\dagger CB^\dagger,\ A_{i,:}^TA_{i,:}(X^{(k)} - A^\dagger CB^\dagger)BB^T\big\rangle_F. \end{aligned}$$
It follows from
$$\begin{aligned} \frac{\alpha^2}{\|A_{i,:}\|_2^4}\big\|A_{i,:}^TA_{i,:}(X^{(k)} - A^\dagger CB^\dagger)BB^T\big\|_F^2 &= \frac{\alpha^2}{\|A_{i,:}\|_2^2}\big\|A_{i,:}(X^{(k)} - A^\dagger CB^\dagger)BB^T\big\|_2^2 &&(\text{by } \operatorname{trace}(uu^T) = \|u\|_2^2 \text{ for any vector } u)\\ &\le \frac{\alpha^2\|B\|_2^2}{\|A_{i,:}\|_2^2}\big\|A_{i,:}(X^{(k)} - A^\dagger CB^\dagger)B\big\|_2^2 &&(\text{by } \|u^TB^T\|_2 = \|Bu\|_2 \le \|B\|_2\|u\|_2) \end{aligned}$$
and
$$\begin{aligned} \frac{2\alpha}{\|A_{i,:}\|_2^2}\big\langle X^{(k)} - A^\dagger CB^\dagger,\ A_{i,:}^TA_{i,:}(X^{(k)} - A^\dagger CB^\dagger)BB^T\big\rangle_F &= \frac{2\alpha}{\|A_{i,:}\|_2^2}\operatorname{trace}\big(B^T(X^{(k)} - A^\dagger CB^\dagger)^TA_{i,:}^TA_{i,:}(X^{(k)} - A^\dagger CB^\dagger)B\big) &&(\text{by } \operatorname{trace}(MN) = \operatorname{trace}(NM))\\ &= \frac{2\alpha}{\|A_{i,:}\|_2^2}\big\|A_{i,:}(X^{(k)} - A^\dagger CB^\dagger)B\big\|_2^2 \end{aligned}$$
that
$$\|X^{(k+1)} - A^\dagger CB^\dagger\|_F^2 \le \|X^{(k)} - A^\dagger CB^\dagger\|_F^2 - \frac{2\alpha - \alpha^2\|B\|_2^2}{\|A_{i,:}\|_2^2}\big\|A_{i,:}(X^{(k)} - A^\dagger CB^\dagger)B\big\|_2^2.$$
By taking the conditional expectation, we have
$$\begin{aligned} \mathbb{E}_k\|X^{(k+1)} - A^\dagger CB^\dagger\|_F^2 &\le \|X^{(k)} - A^\dagger CB^\dagger\|_F^2 - \mathbb{E}_k\bigg[\frac{2\alpha - \alpha^2\|B\|_2^2}{\|A_{i,:}\|_2^2}\big\|A_{i,:}(X^{(k)} - A^\dagger CB^\dagger)B\big\|_2^2\bigg] \\ &= \|X^{(k)} - A^\dagger CB^\dagger\|_F^2 - \frac{2\alpha - \alpha^2\|B\|_2^2}{\|A\|_F^2}\big\|A(X^{(k)} - A^\dagger CB^\dagger)B\big\|_F^2. \end{aligned}$$
By $X^{(0)} \in \mathcal{M}$ and $A^\dagger CB^\dagger \in \mathcal{M}$, we have $X^{(0)} - A^\dagger CB^\dagger \in \mathcal{M}$. Noting that $A_{i,:}^T = A^TI_{:,i}$, it is easy to show by induction that $X^{(k)} - A^\dagger CB^\dagger \in \mathcal{M}$. Then, by Lemma 1 and $0 < \alpha < \frac{2}{\|B\|_2^2}$, we can get
$$\begin{aligned} \mathbb{E}_k\|X^{(k+1)} - A^\dagger CB^\dagger\|_F^2 &\le \|X^{(k)} - A^\dagger CB^\dagger\|_F^2 - \frac{(2\alpha - \alpha^2\|B\|_2^2)\,\sigma_{\min}^2(A)\,\sigma_{\min}^2(B)}{\|A\|_F^2}\|X^{(k)} - A^\dagger CB^\dagger\|_F^2 \\ &= \bigg(1 - \frac{(2\alpha - \alpha^2\|B\|_2^2)\,\sigma_{\min}^2(A)\,\sigma_{\min}^2(B)}{\|A\|_F^2}\bigg)\|X^{(k)} - A^\dagger CB^\dagger\|_F^2. \end{aligned} \tag{12}$$
Finally, by (12) and induction on the iteration index $k$, we obtain the estimate (11).    □
Remark 3. 
By an approach similar to that used in the proof of Theorem 1, one can prove that the iterate $X^{(k)}$ generated by ME-PRBK (5) satisfies the following estimate:
$$\mathbb{E}\|X^{(k)} - A^\dagger CB^\dagger\|_F^2 \le \hat{\rho}^{\,k}\,\|X^{(0)} - A^\dagger CB^\dagger\|_F^2, \tag{13}$$
where $\hat{\rho} = 1 - \frac{\sigma_{\min}^2(A)\,\sigma_{\min}^2(B)}{\|A\|_F^2\,\|B\|_2^2}$. The convergence factor of GRK in [14] is $\rho_{GRK} = 1 - \frac{\sigma_{\min}^2(A)\,\sigma_{\min}^2(B)}{\|A\|_F^2\,\|B\|_F^2}$. It is obvious that
$$\rho_{GRK} > \min_{0<\alpha<2/\|B\|_2^2}\rho = 1 - \frac{\sigma_{\min}^2(A)\,\sigma_{\min}^2(B)}{\|A\|_F^2\,\|B\|_2^2} = \hat{\rho},$$
and $\rho < \rho_{GRK}$ when $\frac{1 - \sqrt{1 - \|B\|_2^2/\|B\|_F^2}}{\|B\|_2^2} < \alpha < \frac{1 + \sqrt{1 - \|B\|_2^2/\|B\|_F^2}}{\|B\|_2^2}$. This means that the convergence factor of ME-PRBK is the smallest, and the factor of ME-RBK can be smaller than that of GRK when $\alpha$ is properly selected.

3. The Randomized Extended Block Kaczmarz Method for Inconsistent Equation

In [11,15,16], the authors proved that the Kaczmarz method does not converge to the least-squares solution of $Ax = b$ when the linear system is inconsistent. Analogously, if the matrix Eq. (1) is inconsistent, the above ME-PRBK method does not converge to $A^\dagger CB^\dagger$. The following theorem gives an error bound of the ME-PRBK method for the inconsistent matrix equation.
Theorem 2. 
Assume that the consistent equation $AXB = C$ has a solution $X^* = A^\dagger CB^\dagger$. Let $\hat{X}^{(k)}$ denote the $k$th iterate of the ME-PRBK method applied to the inconsistent equation $AXB = C + W$ for any $W \in \mathbb{R}^{m\times n}$, starting from an initial matrix $X^{(0)} \in \mathbb{R}^{p\times q}$ with $X_{:,j}^{(0)} \in R(A^T)$, $j = 1, 2, \ldots, q$, and $(X_{i,:}^{(0)})^T \in R(B)$, $i = 1, 2, \ldots, p$. In exact arithmetic, it holds that
$$\mathbb{E}\|\hat{X}^{(k)} - A^\dagger CB^\dagger\|_F^2 \le \hat{\rho}^{\,k}\,\|X^{(0)} - A^\dagger CB^\dagger\|_F^2 + \frac{1 - \hat{\rho}^{\,k}}{1 - \hat{\rho}}\cdot\frac{\|WB^\dagger\|_F^2}{\|A\|_F^2},$$
where $\hat{\rho}$ is the convergence factor given in Remark 3.
Proof. 
Set $H_i = \{X \mid A_{i,:}XB = C_{i,:}\}$ and $\hat{H}_i = \{X \mid A_{i,:}XB = C_{i,:} + W_{i,:}\}$. Let $Y$ denote the iterate obtained by applying one step of the ME-PRBK method for the consistent equation $AXB = C$ from $\hat{X}^{(k)}$, that is,
$$Y = \hat{X}^{(k)} + \frac{A_{i,:}^T}{\|A_{i,:}\|_2^2}\,(C_{i,:} - A_{i,:}\hat{X}^{(k)}B)\,B^\dagger.$$
It follows from
$$\begin{aligned} \big\langle Y - A^\dagger CB^\dagger,\ \hat{X}^{(k+1)} - Y\big\rangle_F &= \Big\langle Y - A^\dagger CB^\dagger,\ \frac{A_{i,:}^T}{\|A_{i,:}\|_2^2}W_{i,:}B^\dagger\Big\rangle_F = \operatorname{trace}\Big((B^\dagger)^TW_{i,:}^T\frac{A_{i,:}}{\|A_{i,:}\|_2^2}(Y - A^\dagger CB^\dagger)\Big) \\ &= \operatorname{trace}\Big(W_{i,:}^T\,\frac{A_{i,:}YB - A_{i,:}A^\dagger CB^\dagger B}{\|A_{i,:}\|_2^2}\,(B^TB)^\dagger\Big) = 0 \end{aligned}$$
and
$$\|\hat{X}^{(k+1)} - Y\|_F^2 = \frac{1}{\|A_{i,:}\|_2^4}\big\|A_{i,:}^TW_{i,:}B^\dagger\big\|_F^2 = \frac{1}{\|A_{i,:}\|_2^4}\operatorname{trace}\big((B^\dagger)^TW_{i,:}^TA_{i,:}A_{i,:}^TW_{i,:}B^\dagger\big) = \frac{\|W_{i,:}B^\dagger\|_2^2}{\|A_{i,:}\|_2^2}$$
that
$$\|\hat{X}^{(k+1)} - A^\dagger CB^\dagger\|_F^2 = \|Y - A^\dagger CB^\dagger\|_F^2 + \|\hat{X}^{(k+1)} - Y\|_F^2 = \|Y - A^\dagger CB^\dagger\|_F^2 + \frac{\|W_{i,:}B^\dagger\|_2^2}{\|A_{i,:}\|_2^2}. \tag{14}$$
By taking the conditional expectation on both sides of (14), we can obtain
$$\begin{aligned} \mathbb{E}_k\|\hat{X}^{(k+1)} - A^\dagger CB^\dagger\|_F^2 &= \mathbb{E}_k\|Y - A^\dagger CB^\dagger\|_F^2 + \mathbb{E}_k\frac{\|W_{i,:}B^\dagger\|_2^2}{\|A_{i,:}\|_2^2} \\ &\le \bigg(1 - \frac{\sigma_{\min}^2(A)\,\sigma_{\min}^2(B)}{\|A\|_F^2\,\|B\|_2^2}\bigg)\|\hat{X}^{(k)} - A^\dagger CB^\dagger\|_F^2 + \frac{\|WB^\dagger\|_F^2}{\|A\|_F^2}. \end{aligned}$$
The inequality is obtained by Remark 3. Applying this recursive relation iteratively, we have
$$\begin{aligned} \mathbb{E}\|\hat{X}^{(k+1)} - A^\dagger CB^\dagger\|_F^2 &\le \hat{\rho}\,\mathbb{E}\|\hat{X}^{(k)} - A^\dagger CB^\dagger\|_F^2 + \frac{\|WB^\dagger\|_F^2}{\|A\|_F^2} \\ &\le \hat{\rho}^2\,\mathbb{E}\|\hat{X}^{(k-1)} - A^\dagger CB^\dagger\|_F^2 + (\hat{\rho} + 1)\frac{\|WB^\dagger\|_F^2}{\|A\|_F^2} \\ &\le \cdots \le \hat{\rho}^{\,k+1}\|\hat{X}^{(0)} - A^\dagger CB^\dagger\|_F^2 + (\hat{\rho}^{\,k} + \cdots + \hat{\rho} + 1)\frac{\|WB^\dagger\|_F^2}{\|A\|_F^2} \\ &= \hat{\rho}^{\,k+1}\|X^{(0)} - A^\dagger CB^\dagger\|_F^2 + \frac{1 - \hat{\rho}^{\,k+1}}{1 - \hat{\rho}}\cdot\frac{\|WB^\dagger\|_F^2}{\|A\|_F^2}. \end{aligned}$$
This completes the proof.    □
Next, we use the idea of the randomized extended Kaczmarz method (see [16,17,18] for details) to compute the least-squares solution of the inconsistent Eq. (1). At each iteration, $Z^{(k)}$ is the $k$th iterate of ME-RBK applied to $A^TZB^T = 0$ with the initial guess $Z^{(0)}$, and $X^{(k+1)}$ is a one-step ME-RBK update for $AXB = C - Z^{(k+1)}$ from $X^{(k)}$. We can get the following randomized extended block Kaczmarz iteration:
$$\begin{cases} Z^{(k+1)} = Z^{(k)} - \dfrac{\alpha}{\|A_{:,j}\|_2^2}\,A_{:,j}\Big(\big((A_{:,j}^TZ^{(k)})B^T\big)B\Big), \\[2mm] X^{(k+1)} = X^{(k)} + \dfrac{\alpha}{\|A_{i,:}\|_2^2}\,A_{i,:}^T\big(C_{i,:} - Z_{i,:}^{(k+1)} - (A_{i,:}X^{(k)})B\big)B^T, \end{cases} \tag{15}$$
where $\alpha > 0$ is the step size, and $i$ and $j$ are selected with probabilities $p_i(A) = \frac{\|A_{i,:}\|_2^2}{\|A\|_F^2}$ and $\hat{p}_j(A) = \frac{\|A_{:,j}\|_2^2}{\|A\|_F^2}$, respectively. The cost of each iteration of this method is $4n(q+m) + m + 1 - 2n - q$ flops for updating $Z^{(k)}$ and $(4q+1)(n+p) + 1 - 2q$ flops for updating $X^{(k)}$, if the squared row and column norms of $A$ have been calculated in advance. We describe this method as Algorithm 2, which is called the ME-REBK algorithm.
[Algorithm 2: The ME-REBK algorithm (pseudocode presented as an image in the original).]
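A MATLAB sketch of the ME-REBK iteration (15) is given below, in the same spirit as the ME-RBK sketch in Section 2; the initialization Z(0) = C (which satisfies the conditions of Theorem 3), the function name and the interface are our own choices.

```matlab
function [X, Z] = me_rebk(A, B, C, alpha, maxit)
% Sketch of the ME-REBK iteration (15); naming and interface are ours.
p  = size(A, 2);  q = size(B, 1);
X  = zeros(p, q);                                  % X(0) = 0
Z  = C;                                            % Z(0) = C satisfies the conditions of Theorem 3
rn = sum(A.^2, 2);  rcdf = cumsum(rn) / sum(rn);   % row distribution p_i
cn = sum(A.^2, 1);  ccdf = cumsum(cn) / sum(cn);   % column distribution p_hat_j
for k = 1:maxit
    j  = find(ccdf >= rand, 1);                    % column step: ME-RBK on A'ZB' = 0
    Aj = A(:, j);
    Z  = Z - (alpha / cn(j)) * (Aj * (((Aj' * Z) * B') * B));
    i  = find(rcdf >= rand, 1);                    % row step: ME-RBK on AXB = C - Z
    Ai = A(i, :);
    X  = X + (alpha / rn(i)) * (Ai' * ((C(i, :) - Z(i, :) - (Ai * X) * B) * B'));
end
end
```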
Theorem 3. 
Assume $0 < \alpha < \frac{2}{\|B\|_2^2}$. Let $\{Z^{(k)}\}$ denote the iterates of ME-RBK applied to $A^TZB^T = 0$, starting from an initial matrix $Z^{(0)} \in \mathbb{R}^{m\times n}$ with $Z_{:,j}^{(0)} \in C_{:,j} + R(A)$, $j = 1, 2, \ldots, n$, and $(Z_{i,:}^{(0)})^T \in (C_{i,:})^T + R(B^T)$, $i = 1, 2, \ldots, m$. Then $\{Z^{(k)}\}$ converges linearly to $C - AA^\dagger CB^\dagger B$ in mean square, and the solution error in expectation for the iteration sequence $Z^{(k)}$ obeys
$$\mathbb{E}\big\|Z^{(k)} - (C - AA^\dagger CB^\dagger B)\big\|_F^2 \le \rho^k\,\big\|Z^{(0)} - (C - AA^\dagger CB^\dagger B)\big\|_F^2, \tag{16}$$
where $\rho$ is as in Theorem 1 and the $j$th column of $A$ is selected with probability $\hat{p}_j(A) = \frac{\|A_{:,j}\|_2^2}{\|A\|_F^2}$.
Proof. 
The proof is similar to that of Theorem 1, so we omit it. □
Theorem 4. 
Assume $0 < \alpha < \frac{2}{\|B\|_2^2}$. Let the sequence $\{X^{(k)}\}$ be generated by the ME-REBK method for $AXB = C$, starting from initial matrices $X^{(0)} \in \mathbb{R}^{p\times q}$ and $Z^{(0)} \in \mathbb{R}^{m\times n}$, where $X_{:,j}^{(0)} \in R(A^T)$, $j = 1, 2, \ldots, q$, $(X_{i,:}^{(0)})^T \in R(B)$, $i = 1, 2, \ldots, p$, $Z_{:,j}^{(0)} \in C_{:,j} + R(A)$, $j = 1, 2, \ldots, n$, and $(Z_{i,:}^{(0)})^T \in (C_{i,:})^T + R(B^T)$, $i = 1, 2, \ldots, m$. For any $\varepsilon > 0$, it holds that
$$\mathbb{E}\|X^{(k)} - A^\dagger CB^\dagger\|_F^2 \le \frac{(1+\varepsilon)^{k+1} - (1+\varepsilon)}{\varepsilon^2}\cdot\frac{\alpha^2\|B\|_2^2\,\rho^k}{\|A\|_F^2}\,\big\|Z^{(0)} - (C - AA^\dagger CB^\dagger B)\big\|_F^2 + (1+\varepsilon)^k\rho^k\,\|X^{(0)} - A^\dagger CB^\dagger\|_F^2, \tag{17}$$
where $i \in [m]$ and $j \in [p]$ are picked with probabilities $p_i(A) = \frac{\|A_{i,:}\|_2^2}{\|A\|_F^2}$ and $\hat{p}_j(A) = \frac{\|A_{:,j}\|_2^2}{\|A\|_F^2}$, respectively.
Proof. 
Let $X^{(k)}$ denote the $k$th iterate of the ME-REBK method for $AXB = C$, and let $\tilde{X}^{(k+1)}$ be the one-step Kaczmarz update for the matrix equation $AXB = AA^\dagger CB^\dagger B$ from $X^{(k)}$, i.e.,
$$\tilde{X}^{(k+1)} = X^{(k)} + \frac{\alpha}{\|A_{i,:}\|_2^2}\,A_{i,:}^T\big(A_{i,:}A^\dagger CB^\dagger B - A_{i,:}X^{(k)}B\big)B^T.$$
We have
$$\tilde{X}^{(k+1)} - A^\dagger CB^\dagger = X^{(k)} - A^\dagger CB^\dagger - \frac{\alpha}{\|A_{i,:}\|_2^2}\,A_{i,:}^TA_{i,:}(X^{(k)} - A^\dagger CB^\dagger)BB^T$$
and
$$X^{(k+1)} - \tilde{X}^{(k+1)} = \frac{\alpha}{\|A_{i,:}\|_2^2}\,A_{i,:}^T\big(C_{i,:} - Z_{i,:}^{(k+1)} - A_{i,:}A^\dagger CB^\dagger B\big)B^T.$$
For any $\varepsilon > 0$, by the triangle inequality and Young's inequality, we can get
$$\begin{aligned} \|X^{(k+1)} - A^\dagger CB^\dagger\|_F^2 &= \big\|(X^{(k+1)} - \tilde{X}^{(k+1)}) + (\tilde{X}^{(k+1)} - A^\dagger CB^\dagger)\big\|_F^2 \\ &\le \big(\|X^{(k+1)} - \tilde{X}^{(k+1)}\|_F + \|\tilde{X}^{(k+1)} - A^\dagger CB^\dagger\|_F\big)^2 \\ &= \|X^{(k+1)} - \tilde{X}^{(k+1)}\|_F^2 + \|\tilde{X}^{(k+1)} - A^\dagger CB^\dagger\|_F^2 + 2\|X^{(k+1)} - \tilde{X}^{(k+1)}\|_F\,\|\tilde{X}^{(k+1)} - A^\dagger CB^\dagger\|_F \\ &\le \Big(1 + \frac{1}{\varepsilon}\Big)\|X^{(k+1)} - \tilde{X}^{(k+1)}\|_F^2 + (1+\varepsilon)\,\|\tilde{X}^{(k+1)} - A^\dagger CB^\dagger\|_F^2. \end{aligned} \tag{18}$$
By taking the conditional expectation on both sides of (18), we have
$$\mathbb{E}_k\|X^{(k+1)} - A^\dagger CB^\dagger\|_F^2 \le \Big(1 + \frac{1}{\varepsilon}\Big)\mathbb{E}_k\|X^{(k+1)} - \tilde{X}^{(k+1)}\|_F^2 + (1+\varepsilon)\,\mathbb{E}_k\|\tilde{X}^{(k+1)} - A^\dagger CB^\dagger\|_F^2. \tag{19}$$
It follows from
$$\begin{aligned} \|X^{(k+1)} - \tilde{X}^{(k+1)}\|_F^2 &= \Big\|\frac{\alpha}{\|A_{i,:}\|_2^2}\,A_{i,:}^T\big(C_{i,:} - Z_{i,:}^{(k+1)} - A_{i,:}A^\dagger CB^\dagger B\big)B^T\Big\|_F^2 \\ &= \frac{\alpha^2}{\|A_{i,:}\|_2^2}\operatorname{trace}\Big(B\big(C_{i,:} - Z_{i,:}^{(k+1)} - A_{i,:}A^\dagger CB^\dagger B\big)^T\big(C_{i,:} - Z_{i,:}^{(k+1)} - A_{i,:}A^\dagger CB^\dagger B\big)B^T\Big) \\ &\le \frac{\alpha^2\|B\|_2^2}{\|A_{i,:}\|_2^2}\,\big\|C_{i,:} - Z_{i,:}^{(k+1)} - A_{i,:}A^\dagger CB^\dagger B\big\|_2^2 \end{aligned}$$
that
$$\begin{aligned} \mathbb{E}_k\|X^{(k+1)} - \tilde{X}^{(k+1)}\|_F^2 &\le \alpha^2\|B\|_2^2\,\mathbb{E}_k^j\mathbb{E}_k^i\frac{\big\|C_{i,:} - Z_{i,:}^{(k+1)} - A_{i,:}A^\dagger CB^\dagger B\big\|_2^2}{\|A_{i,:}\|_2^2} \\ &= \alpha^2\|B\|_2^2\,\mathbb{E}_k^j\frac{1}{\|A\|_F^2}\sum_{i=1}^{m}\big\|C_{i,:} - Z_{i,:}^{(k+1)} - A_{i,:}A^\dagger CB^\dagger B\big\|_2^2 \\ &= \frac{\alpha^2\|B\|_2^2}{\|A\|_F^2}\,\mathbb{E}_k\big\|Z^{(k+1)} - (C - AA^\dagger CB^\dagger B)\big\|_F^2. \end{aligned}$$
It yields
$$\begin{aligned} \mathbb{E}\|X^{(k+1)} - \tilde{X}^{(k+1)}\|_F^2 &\le \frac{\alpha^2\|B\|_2^2}{\|A\|_F^2}\,\mathbb{E}\big\|Z^{(k+1)} - (C - AA^\dagger CB^\dagger B)\big\|_F^2 \\ &\le \frac{\alpha^2\|B\|_2^2}{\|A\|_F^2}\,\rho^{k+1}\big\|Z^{(0)} - (C - AA^\dagger CB^\dagger B)\big\|_F^2 \quad (\text{by Theorem 3}). \end{aligned} \tag{20}$$
By $X^{(0)} - A^\dagger CB^\dagger \in \mathcal{M}$, we have $X^{(k)} - A^\dagger CB^\dagger \in \mathcal{M}$. Then, by using Theorem 1, we can get
$$\mathbb{E}_k\big[\|\tilde{X}^{(k+1)} - A^\dagger CB^\dagger\|_F^2\big] = \mathbb{E}_k\Big\|X^{(k)} - A^\dagger CB^\dagger - \frac{\alpha}{\|A_{i,:}\|_2^2}\,A_{i,:}^TA_{i,:}(X^{(k)} - A^\dagger CB^\dagger)BB^T\Big\|_F^2 \le \rho\,\|X^{(k)} - A^\dagger CB^\dagger\|_F^2,$$
then
$$\mathbb{E}\|\tilde{X}^{(k+1)} - A^\dagger CB^\dagger\|_F^2 \le \rho\,\mathbb{E}\|X^{(k)} - A^\dagger CB^\dagger\|_F^2. \tag{21}$$
Combining (19), (20) and (21) yields
$$\begin{aligned} \mathbb{E}\|X^{(k+1)} - A^\dagger CB^\dagger\|_F^2 &\le \Big(1+\frac{1}{\varepsilon}\Big)\mathbb{E}\|X^{(k+1)} - \tilde{X}^{(k+1)}\|_F^2 + (1+\varepsilon)\,\mathbb{E}\|\tilde{X}^{(k+1)} - A^\dagger CB^\dagger\|_F^2 \\ &\le \Big(1+\frac{1}{\varepsilon}\Big)\frac{\alpha^2\|B\|_2^2\,\rho^{k+1}}{\|A\|_F^2}\big\|Z^{(0)} - (C - AA^\dagger CB^\dagger B)\big\|_F^2 + (1+\varepsilon)\rho\,\mathbb{E}\|X^{(k)} - A^\dagger CB^\dagger\|_F^2 \\ &\le \Big(1+\frac{1}{\varepsilon}\Big)\frac{\alpha^2\|B\|_2^2\,\rho^{k+1}}{\|A\|_F^2}\big\|Z^{(0)} - (C - AA^\dagger CB^\dagger B)\big\|_F^2\big[1 + (1+\varepsilon)\big] + (1+\varepsilon)^2\rho^2\,\mathbb{E}\|X^{(k-1)} - A^\dagger CB^\dagger\|_F^2 \\ &\le \cdots \le \Big(1+\frac{1}{\varepsilon}\Big)\frac{\alpha^2\|B\|_2^2\,\rho^{k+1}}{\|A\|_F^2}\big\|Z^{(0)} - (C - AA^\dagger CB^\dagger B)\big\|_F^2\sum_{i=0}^{k}(1+\varepsilon)^i + (1+\varepsilon)^{k+1}\rho^{k+1}\,\|X^{(0)} - A^\dagger CB^\dagger\|_F^2 \\ &= \frac{(1+\varepsilon)^{k+2} - (1+\varepsilon)}{\varepsilon^2}\cdot\frac{\alpha^2\|B\|_2^2\,\rho^{k+1}}{\|A\|_F^2}\big\|Z^{(0)} - (C - AA^\dagger CB^\dagger B)\big\|_F^2 + (1+\varepsilon)^{k+1}\rho^{k+1}\,\|X^{(0)} - A^\dagger CB^\dagger\|_F^2. \end{aligned}$$
This completes the proof. □
Remark 4. 
Replacing $B^T$ in (15) with $B^\dagger$, we obtain the following projection-based randomized extended block Kaczmarz method (ME-PREBK) iteration:
$$\begin{cases} Z^{(k+1)} = Z^{(k)} - \dfrac{\alpha}{\|A_{:,j}\|_2^2}\,A_{:,j}\Big(\big((A_{:,j}^TZ^{(k)})B^T\big)(B^\dagger)^T\Big), \\[2mm] X^{(k+1)} = X^{(k)} + \dfrac{\alpha}{\|A_{i,:}\|_2^2}\,A_{i,:}^T\big(C_{i,:} - Z_{i,:}^{(k+1)} - (A_{i,:}X^{(k)})B\big)B^\dagger. \end{cases} \tag{22}$$
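In code, ME-PREBK differs from the ME-REBK sketch above only in the trailing factors: under the same naming as that sketch, one precomputes pinvB = pinv(B) once and changes the two updates inside the loop as follows.

```matlab
pinvB = pinv(B);   % formed once, outside the iteration loop
% replacements for the two updates in the ME-REBK sketch, following (22):
Z = Z - (alpha / cn(j)) * (Aj * (((Aj' * Z) * B') * pinvB'));
X = X + (alpha / rn(i)) * (Ai' * ((C(i, :) - Z(i, :) - (Ai * X) * B) * pinvB));
```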

4. Numerical Experiments

In this section, we present some experimental results of the proposed algorithms for solving various matrix equations and compare them with the ME-RGRK and ME-MWRK methods in [9] for consistent matrix equations and with the RBCD method in [8] for inconsistent matrix equations. All experiments are carried out using MATLAB (version R2020a) on a DESKTOP-8CBRR86 with an Intel(R) Core(TM) i7-4712MQ CPU @ 2.30GHz, 8GB RAM and Windows 10.
All computations are started from the initial guess $X^{(0)} = 0$ with step size $\alpha = \frac{1.5}{\|B\|_2^2}$, and terminated once the relative error (RE) of the solution, defined by
$$RE = \frac{\|X^{(k)} - X^*\|_F^2}{\|X^*\|_F^2}$$
at the current iterate $X^{(k)}$, satisfies $RE < 10^{-6}$ or the number of iteration steps exceeds the maximum $K = 50000$, where $X^* = A^\dagger CB^\dagger$. We report the average number of iterations (denoted as "IT") and the average computing time in seconds (denoted as "CPU") over 20 repeated runs of the corresponding method. Three types of examples are tested; $A$ and $B$ are generated as follows, and a MATLAB sketch of the Type II construction is given after the list.
  • Type I: For given $m, p, q, n$, the entries of $A$ and $B$ are drawn from the standard normal distribution, i.e., A = randn(m, p), B = randn(q, n).
  • Type II: As in [14], for given $m$, $p$ and $r_1 = \operatorname{rank}(A)$, we construct a matrix $A$ by $A = U_1D_1V_1^T$, where $U_1 \in \mathbb{R}^{m\times r_1}$ and $V_1 \in \mathbb{R}^{p\times r_1}$ have orthonormal columns, and $D_1 \in \mathbb{R}^{r_1\times r_1}$ is a diagonal matrix whose first $r_1 - 2$ diagonal entries are uniformly distributed numbers in $[\sigma_{\min}(A), \sigma_{\max}(A)]$ and whose last two diagonal entries are $\sigma_{\max}(A)$ and $\sigma_{\min}(A)$ (see the sketch after this list). The matrix $B$ is generated by the same method with parameters $q$, $n$, $r_2 = \operatorname{rank}(B)$.
  • Type III: The real-world sparse matrices come from the University of Florida sparse matrix collection [19].
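For reproducibility, here is one possible MATLAB construction of a Type II test matrix, as we read the description above; the helper name and argument list are ours.

```matlab
function A = type2_matrix(m, p, r, sigma_min, sigma_max)
% Sketch of the Type II construction A = U1*D1*V1' (our naming);
% assumes 2 <= r <= min(m, p).
U1 = orth(randn(m, r));                 % m-by-r with orthonormal columns
V1 = orth(randn(p, r));                 % p-by-r with orthonormal columns
d  = [sigma_min + (sigma_max - sigma_min) * rand(r - 2, 1); sigma_max; sigma_min];
A  = U1 * diag(d) * V1';                % rank-r, singular values in [sigma_min, sigma_max]
end
```

The matrix B would be generated the same way with the parameters (q, n, r2).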

4.1. Consistent Matrix Equation

Given $A$ and $B$, we set $C = AX^*B$ with X* = randn(p, q) to construct a consistent matrix equation. In Table 1, Table 2 and Table 3, we report the average IT and CPU of the ME-RGRK, ME-MWRK, ME-RBK and ME-PRBK methods for solving consistent matrix equations. In the following tables, the item '>' indicates that the number of iteration steps exceeds the maximum iteration count (50000), and the item '-' indicates that the method does not converge.
From these tables, we can see that the ME-RBK and ME-PRBK methods vastly outperform the ME-RGRK and ME-MWRK methods in terms of both IT and CPU time, regardless of whether the matrices $A$ and $B$ have full column/row rank or not. As the matrix dimension increases, the CPU times of the ME-RBK and ME-PRBK methods grow slowly, while the running times of ME-RGRK and ME-MWRK increase dramatically.
In addition, when the matrix size is small, the ME-PRBK method is competitive because the pseudoinverse is inexpensive and the number of iteration steps is small. When the matrix size is large, the ME-RBK method is more attractive because it does not need to calculate the pseudoinverse (see the last line in Table 1).

4.2. Inconsistent Matrix Equation

To construct an inconsistent matrix equation, we set $C = AX^*B + R$, where $X^*$ and $R$ are random matrices generated by X* = randn(p, q) and R = delta*randn(m, n) with $\delta \in (0, 1)$. Numerical results of the RBCD, IME-REBK and IME-PREBK methods are listed in Table 4, Table 5 and Table 6. From these tables, we can see that the IME-PREBK method is better than the RBCD method in terms of IT and CPU time, especially when the ratio $\sigma_{\max}/\sigma_{\min}$ is large (see the last line in Table 5). The IME-REBK method is not competitive when $B$ has full row rank because it needs to solve two equations. However, when $B$ does not have full row rank, the RBCD method does not converge, while the IME-REBK and IME-PREBK methods do.

5. Conclusions

In this paper, we have proposed a randomized block Kaczmarz algorithm for solving the consistent matrix equation $AXB = C$ and its extended version for the inconsistent case. Theoretically, we have proved that the proposed algorithms converge linearly to the unique minimal F-norm solution or least-squares solution (i.e., $A^\dagger CB^\dagger$) without requiring $A$ and $B$ to have full column/row rank. Numerical results show the effectiveness of the algorithms. Since the proposed algorithms only require one row or one column of $A$ at each iteration, without matrix-matrix products, they are suitable for scenarios in which the matrix $A$ is too large to fit in memory or matrix multiplication is considerably expensive.

Author Contributions

Conceptualization, L.X.; Methodology, L.X. and W.B.; Validation, L.X. and W.L.; Writing—original draft preparation, L.X.; Writing—review and editing, L.X., W.B. and W.L.; Software, L.X.; Visualization, L.X. and W.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data Availability Statement

The datasets that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgments

The authors are thankful to the referees for their constructive comments and valuable suggestions, which have greatly improved the original manuscript of this paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Fausett, D.W.; Fulton, C.T. Large least squares problems involving Kronecker products. SIAM J. Matrix Anal. Appl. 1994, 15, 219–227.
  2. Zha, H.Y. Comments on large least squares problems involving Kronecker products. SIAM J. Matrix Anal. Appl. 1995, 16, 1172.
  3. Peng, Z.Y. An iterative method for the least squares symmetric solution of the linear matrix equation AXB = C. Appl. Math. Comput. 2005, 170, 711–723.
  4. Ding, F.; Liu, P.X.; Ding, J. Iterative solutions of the generalized Sylvester matrix equations by using the hierarchical identification principle. Appl. Math. Comput. 2008, 197, 41–50.
  5. Huang, G.X.; Yin, F.; Guo, K. An iterative method for the skew-symmetric solution and the optimal approximate solution of the matrix equation AXB = C. J. Comput. Appl. Math. 2008, 212, 231–244.
  6. Wang, X.; Li, Y.; Dai, L. On Hermitian and skew-Hermitian splitting iteration methods for the linear matrix equation AXB = C. Comput. Math. Appl. 2013, 65, 657–664.
  7. Shafiei, S.G.; Hajarian, M. Developing Kaczmarz method for solving Sylvester matrix equations. J. Franklin Inst. 2022, 359, 8991–9005.
  8. Du, K.; Ruan, C.C.; Sun, X.H. On the convergence of a randomized block coordinate descent algorithm for a matrix least squares problem. Appl. Math. Lett. 2022, 124, 107689.
  9. Wu, N.C.; Liu, C.Z.; Zuo, Q. On the Kaczmarz methods based on relaxed greedy selection for solving matrix equation AXB = C. J. Comput. Appl. Math. 2022, 413, 114374.
  10. Strohmer, T.; Vershynin, R. A randomized Kaczmarz algorithm with exponential convergence. J. Fourier Anal. Appl. 2009, 15, 262–278.
  11. Zouzias, A.; Freris, N.M. Randomized extended Kaczmarz for solving least squares. SIAM J. Matrix Anal. Appl. 2013, 34, 773–793.
  12. Necoara, I. Faster randomized block Kaczmarz algorithms. SIAM J. Matrix Anal. Appl. 2019, 40, 1425–1452.
  13. Nemirovski, A.; Juditsky, A.; Lan, G.; Shapiro, A. Robust stochastic approximation approach to stochastic programming. SIAM J. Optim. 2009, 19, 1574–1609.
  14. Niu, Y.Q.; Zheng, B. On global randomized block Kaczmarz algorithm for solving large-scale matrix equations. arXiv 2022, arXiv:2204.13920.
  15. Needell, D. Randomized Kaczmarz solver for noisy linear systems. BIT Numer. Math. 2010, 50, 395–403.
  16. Ma, A.; Needell, D.; Ramdas, A. Convergence properties of the randomized extended Gauss–Seidel and Kaczmarz methods. SIAM J. Matrix Anal. Appl. 2015, 36, 1590–1604.
  17. Du, K. Tight upper bounds for the convergence of the randomized extended Kaczmarz and Gauss–Seidel algorithms. Numer. Linear Algebra Appl. 2019, 26, e2233.
  18. Du, K.; Si, W.T.; Sun, X.H. Randomized extended average block Kaczmarz for solving least squares. SIAM J. Sci. Comput. 2020, 42, A3541–A3559.
  19. Davis, T.A.; Hu, Y. The University of Florida sparse matrix collection. ACM Trans. Math. Softw. 2011, 38, 1–25.
Table 1. IT and CPU of ME-RGRK, ME-MWRK, ME-RBK and ME-PRBK for the consistent matrix equations with Type I. [Table given as an image in the original.]
Table 2. IT and CPU of ME-RGRK, ME-MWRK, ME-RBK and ME-PRBK for the consistent matrix equations with Type II. [Table given as an image in the original.]
Table 3. IT and CPU of ME-RGRK, ME-MWRK, ME-RBK and ME-PRBK for the consistent matrix equations with Type III. [Table given as an image in the original.]
Table 4. IT and CPU of RBCD, IME-REBK and IME-PREBK for the inconsistent matrix equations with Type I. [Table given as an image in the original.]
Table 5. IT and CPU of RBCD, IME-REBK and IME-PREBK for the inconsistent matrix equations with Type II. [Table given as an image in the original.]
Table 6. IT and CPU of RBCD, IME-REBK and IME-PREBK for the inconsistent matrix equations with Type III. [Table given as an image in the original.]