Open Access. Published by De Gruyter, September 25, 2024, under the CC BY 4.0 license.

Convergence of Peaceman-Rachford splitting method with Bregman distance for three-block nonconvex nonseparable optimization

  • Ying Zhao, Heng-you Lan (corresponding author), and Hai-yang Xu
From the journal Demonstratio Mathematica

Abstract

It is of strong theoretical significance and has promising application prospects to explore three-block nonconvex optimization with a nonseparable structure, which models many problems in machine learning, statistics, and image and signal processing. In this article, by combining the Bregman distance with the Peaceman-Rachford splitting method, we propose a novel three-block Bregman Peaceman-Rachford splitting method (3-BPRSM). Under general assumptions, global convergence is established via optimality conditions. Furthermore, we prove strong convergence when the augmented Lagrange function satisfies the Kurdyka-Łojasiewicz property. In addition, if the associate function possessing the Kurdyka-Łojasiewicz property has a particular structure, then linear and sublinear convergence rates of 3-BPRSM can be guaranteed. Finally, a preliminary numerical experiment demonstrates the effectiveness of the method.

MSC 2010: 90C26; 65K05; 49K35; 41A25

1 Introduction

As we all know, optimization plays a crucial role in solving various complex models across diverse fields. One notable example is the application of heuristic computing with sequential quadratic programming to solve a nonlinear hepatitis B virus model, which was proposed by Umar et al. [1]. Optimization techniques are employed to fine-tune the parameters of the model, which enables a more accurate representation of the complex dynamics of hepatitis B virus transmission. The iterative nature of sequential quadratic programming allows researchers to handle the model's nonlinearity and to enhance its predictive capabilities. This approach not only contributes to a deeper understanding of the biological mechanisms underlying hepatitis B virus propagation, but also highlights the pivotal role of optimization in refining mathematical models in the field of virology. The integration of heuristic computing with optimization techniques proves instrumental in advancing research and providing valuable insights into the dynamics of infectious diseases. Other models solved by optimization can be found in [2–8] and the references therein.

Nonconvex optimization is indispensable in practical engineering applications and offers versatile solutions to complex problems across various domains. For instance, Bian and Chen [9] addressed image processing in Examples 5.1 and 5.2; Xu et al. applied the brand new features of $\ell_{1/2}$ regularization to a typical compressed sensing problem, i.e., the sparse signal recovery problem [10, (52), p. 1020]; and Lin et al. tested the performance of the linearized alternating direction method with parallel splitting and adaptive penalty on the sparse representation and low-rank recovery problems (2), (3), and (5) on page 289 of [11]. Meanwhile, Ames and Hong [12] extended sparse principal component analysis and estimated the solution of penalized eigenproblems of the form (4.1) [12, p. 741] with generally nonconcave objective functions. Further, Lin et al. demonstrated the utility of the alternating direction method of multipliers (ADMM) through identified sparsity structures, among a mass-spring system, a network with 100 unstable nodes, and block sparsity in a bio-chemical reaction example (see, respectively, pages 2429 and 2430 in [13]). Thus, studying nonconvex optimization holds both strong theoretical significance and promising application prospects. In particular, to give a partial answer to the open problem on the convergence analysis of ADMM for solving large-scale nonconvex separable optimization, Guo et al. [14] considered the following two-block nonconvex optimization problem:

(1.1) $\min\ \xi(x) + \zeta(y) \quad \text{s.t.}\quad Mx + y = d,$

which arises, for instance, when seeking the sparsest solution of a linear system [15, p. 14]. Moreover, the formulation (1.1), which finds a minimizer of composite objective functions in the fields of signal and image processing and machine learning, was investigated by Wang et al. [15], where $M\in\mathbb{R}^{m\times n}$ is a given matrix, $d\in\mathbb{R}^{m}$ is the observed data, $\xi:\mathbb{R}^{n}\to\mathbb{R}$ is usually a (quadratic or logistic) loss function, and $\zeta:\mathbb{R}^{m}\to\mathbb{R}$ is often a regularizer such as the $\ell_{1}$ norm or the $\ell_{1/2}$ quasi-norm.

However, when the location parameters of a linear regression model differ for each group, that is, when the location parameters express individual effects of groups, one of the composite objective functions of the form (1.1) is often nonseparable. For example, Ohishi et al. [16] presented a setting in which the parameter vector $\mu$ is estimated by minimizing a penalized residual sum of squares, which includes the nonseparable objective function $\|y - X\beta - R\mu\|_{2}^{2}$, where $\beta$ is a vector of regression coefficients. Indeed, nonseparable functions arise naturally in image and signal processing, including the fused Lasso [16], group Lasso [17], and total variation regularization [18], as well as in statistics and machine learning, such as in large-scale problems [19]. Because such nonseparable structures exist objectively, Liu et al. [20] addressed the convergence analysis of a splitting method for the following constrained nonconvex nonseparable optimization problem:

(1.2) $\min\ \xi(x) + \zeta(y) + l(x,y) \quad \text{s.t.}\quad Mx + y = d,$

which reduces to (1.1) when the nonseparable function $l(x,y)\equiv 0$ for all $(x,y)\in\mathbb{R}^{n}\times\mathbb{R}^{m}$. As He and Yuan [21] pointed out, "many applications such as sparse and/or low-rank optimization, compressive sensing, statistical learning, computer vision and large-scale distributed wireless network can be modeled or reformulated as a convex minimization model with linear constraints and a separable objective function in form of the sum of more than one function," and numerous problems in practical engineering applications can be effectively modeled as three-block nonconvex (nonseparable) optimization problems. For instance, He and Yuan successfully addressed the patch low-rank image decomposition model [21, (7.2), p. 815]. In addition, the SCAD-$\ell_{2}$ model extensively explored by Zeng and Xie [22] serves as a notable example. Furthermore, Zhang et al. [23] applied pADMM to resolve the challenge of precisely extracting the background/foreground from a given video. Thus, the models (1.1) and (1.2) may not fully describe the aforementioned practical application problems, and additional variables need to be introduced to better represent their complexity. Hence, it is necessary and valuable to explore the following three-block nonconvex constrained optimization problem with a nonseparable coupled term:

(1.3) $\min\ f(x) + g(y) + h(z) + l(x,y,z) \quad \text{s.t.}\quad Ax + y + z = b,$

where $f:\mathbb{R}^{n}\to\mathbb{R}\cup\{+\infty\}$ is proper and lower semicontinuous, $g:\mathbb{R}^{m}\to\mathbb{R}$, $h:\mathbb{R}^{m}\to\mathbb{R}$, and $l:\mathbb{R}^{n}\times\mathbb{R}^{m}\times\mathbb{R}^{m}\to\mathbb{R}$ are continuously differentiable, and $A\in\mathbb{R}^{m\times n}$ and $b\in\mathbb{R}^{m}$ are a given matrix and vector, respectively. If the nonseparable term $l(x,y,z)\equiv 0$ for every $(x,y,z)\in\mathbb{R}^{n}\times\mathbb{R}^{m}\times\mathbb{R}^{m}$, then (1.3) reduces to the separable three-block optimization problem with linear constraints considered by He and Yuan [21]. In fact, the form of (1.3) has many applications in various fields such as image processing [9,17,21], video processing [23], and background/foreground extraction [23,24].
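For orientation, the following instance (essentially the test problem (4.3) treated in Section 4, with $h\equiv 0$) shows how a quadratic coupling term produces the nonseparable structure required in (1.3):

\[
\min_{x,y,z}\; e\|x\|_{1/2}^{1/2} + \frac{1}{2}\|y\|^{2} + \underbrace{\frac{1}{2}\|D_{1}x + D_{2}y + z\|^{2}}_{l(x,y,z)} \quad\text{s.t.}\quad Ax + y + z = b,
\]

where the coupling term $l$ cannot be written as a sum of functions of $x$, $y$, and $z$ separately.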

On the other hand, to solve (1.1) or (1.2), many methods have been presented, such as ADMM [25], the Douglas-Rachford splitting method [26], the Peaceman-Rachford splitting method (PRSM) [27], and the forward-backward splitting algorithm [28]. The main difference between PRSM and ADMM in the iterative process is that PRSM adds an intermediate multiplier update, so PRSM can also be seen as a symmetric version of ADMM. As Gabay [29] pointed out, although PRSM is less robust and requires more restrictive conditions than ADMM to guarantee convergence, PRSM converges faster than ADMM whenever it does converge. Many scholars have found that adding an appropriate Bregman distance to ADMM can effectively simplify the subproblem computation or yield closed-form subproblem solutions, and can further improve numerical performance. Indeed, Li and Pong [30] proposed a variant of ADMM that adds a Bregman distance term to the second subproblem of the classical ADMM. Significant further advances on ADMM and its variants have been made in the past few years; for more detail, one can refer to [31,32] and the references therein.

In 2008, Wang et al. [33] indicated that, as a natural extension of ADMM, multiblock ADMM is a widely used scheme and has also been found very useful in solving various nonconvex optimization problems. Meanwhile, considering the three-block nonconvex optimization problem with a linear equality constraint in which the coefficient matrix of one variable is the negative identity matrix, Yang et al. [24] studied the classical ADMM with a relaxation factor in the Lagrange multiplier update step. Furthermore, Chao et al. [34] pointed out that ADMM or its direct extension may fail to converge when the number of blocks exceeds two, when a nonconvex function is involved, or when a nonseparable structure is present. Thus, a natural idea is to employ or extend PRSM to solve the problem (1.3). However, Chen et al. [35] presented a counterexample demonstrating that the direct extension of ADMM to multiblock convex optimization problems may fail to converge, a discovery that has attracted significant attention from the research community. Hence, PRSM or its direct extension may also fail to converge when the number of blocks exceeds two, nonconvex functions are present, or a nonseparable structure is involved. What is more, the study of PRSM has been limited to two-block problems and a few three-block separable problems, and the three-block nonseparable case has not been reported. Indeed, Bnouhachem and Rassias [36] proposed a Bregman proximal PRSM for solving a separable convex minimization model and investigated global convergence and the convergence rate of the method in the ergodic sense under standard assumptions. Li et al. [37] applied a strictly contractive PRSM to a convex minimization problem with two linear constraints and a nonseparable structure and proved its convergence. In addition, Liu et al. [38] dealt with a special class of multiblock convex optimization problems whose nonseparable structure is a quadratic function. Chao et al. [39] adopted the substitution method to study an approximate block minimization method of multipliers, obtained the convergence of such multiblock optimization problems, and provided the corresponding iteration complexity results.

Inspired by the aforementioned work and motivated by the challenge of three-block nonseparable problems, the purpose and main contribution of this article is to propose a three-block Bregman Peaceman-Rachford splitting method (3-BPRSM) for the nonconvex and nonseparable problem (1.3) and to prove its convergence.

The novelty of this article can be summarized as follows:

  1. We extend PRSM from two-block to three-block problems; that is, the model (1.3) in this article is more general compared with the existing work in [36–39], and we prove its convergence and obtain convergence rates.

  2. Adding the Bregman distance to the subproblem of $x\in\mathbb{R}^{n}$ allows the descent property of the augmented Lagrange function to be obtained without imposing any additional restrictions on the function $f(x)$ or the matrix $A$.

  3. The introduction of relaxation factors makes the iteration widely representative, covering several types of ADMM and PRSM. Therefore, 3-BPRSM provides a relatively broad splitting algorithm framework for the three-block problem (1.3), which allows a unified analysis of the theoretical properties of a large class of Bregman-type Peaceman-Rachford splitting algorithms.

The remainder of this article is organized as follows: some basic concepts and necessary foreshadowing for further analysis are presented in Section 2. We present 3-BPRSM and analyze its convergence for three-block nonconvex and nonseparable problems in Section 3. In Section 4, a numerical example is implemented to illustrate and validate our major results. Finally, Section 5 concludes with some discussions of future work.

2 Preliminaries

Denote by $\|\cdot\|$ the Euclidean norm of a vector. Let $\operatorname{dom} f = \{x\in\mathbb{R}^{n} : f(x) < +\infty\}$ be the domain of a function $f:\mathbb{R}^{n}\to\mathbb{R}\cup\{+\infty\}$. For a symmetric positive semidefinite matrix $H$, $\|x\|_{H}^{2} = x^{\top}Hx$, where $H\succeq 0$ ($H\succ 0$) indicates that $H$ is symmetric positive semidefinite (positive definite). Moreover, $\lambda_{\min}(H)\|x\|^{2} \le x^{\top}Hx \le \lambda_{\max}(H)\|x\|^{2}$ holds for all $x\in\mathbb{R}^{n}$, where $\lambda_{\min}(H)$ and $\lambda_{\max}(H)$ denote the minimum and maximum eigenvalues of the symmetric matrix $H$, respectively.

Definition 2.1

Let $f:\mathbb{R}^{n}\to\mathbb{R}$. If there exists $L_{f}>0$ such that $\|f(x) - f(y)\| \le L_{f}\|x-y\|$ for all $x,y\in\mathbb{R}^{n}$, then $f$ is called $L_{f}$-Lipschitz continuous.

Definition 2.2

Let $S\subseteq\mathbb{R}^{n}$. The distance from a point $x\in\mathbb{R}^{n}$ to $S$ is defined as $d(x,S) = \inf_{y\in S}\|y-x\|$. In particular, when $S=\emptyset$, set $d(x,S)=+\infty$.

Definition 2.3

Let $S\subseteq\mathbb{R}^{n}$ be a nonempty convex set, and let $f$ be a function defined on $S$. If for all $x_{1},x_{2}\in S$ and each $\alpha\in(0,1)$, the following inequality holds:

$f(\alpha x_{1} + (1-\alpha)x_{2}) \le \alpha f(x_{1}) + (1-\alpha)f(x_{2}),$

then $f$ is said to be convex on $S$. In particular, if for $x_{1}\neq x_{2}$ we have

$f(\alpha x_{1} + (1-\alpha)x_{2}) < \alpha f(x_{1}) + (1-\alpha)f(x_{2}),$

then $f$ is called strictly convex on $S$.

Further, if for all $x_{1},x_{2}\in S$ and any $\alpha\in(0,1)$, there exists a constant $\sigma>0$ such that

$f(\alpha x_{1} + (1-\alpha)x_{2}) \le \alpha f(x_{1}) + (1-\alpha)f(x_{2}) - \frac{1}{2}\sigma\alpha(1-\alpha)\|x_{1}-x_{2}\|^{2},$

then $f$ is said to be strongly $\sigma$-convex.

Definition 2.4

Let $f:\mathbb{R}^{n}\to\mathbb{R}\cup\{+\infty\}$ be a proper lower semicontinuous function.

  1. The Fréchet subdifferential of $f$ at $x\in\operatorname{dom} f$ is defined as follows:

    $\hat{\partial} f(x) = \Big\{x^{*}\in\mathbb{R}^{n} : \liminf_{y\to x,\, y\neq x}\frac{f(y)-f(x)-\langle x^{*}, y-x\rangle}{\|y-x\|} \ge 0\Big\}.$

    In particular, $\hat{\partial} f(x)=\emptyset$ when $x\notin\operatorname{dom} f$.

  2. The limiting subdifferential of $f$ at $x\in\operatorname{dom} f$ is defined by

    $\partial f(x) = \{x^{*}\in\mathbb{R}^{n} : \exists\, x^{k}\to x,\ f(x^{k})\to f(x),\ \hat{x}^{k}\in\hat{\partial} f(x^{k}),\ \hat{x}^{k}\to x^{*}\}.$

Definition 2.5

[40] For a differentiable convex function $\phi:\mathbb{R}^{n}\to\mathbb{R}$, the corresponding Bregman distance is defined as follows:

$\Delta_{\phi}(x,y) = \phi(x) - \phi(y) - \langle\nabla\phi(y), x-y\rangle, \quad \forall x,y\in\mathbb{R}^{n}.$

In particular, when $\phi(x) = \|x\|^{2}$, the corresponding Bregman distance reduces to $\|x-y\|^{2}$, which is the classical (squared) Euclidean distance [41].
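As a small illustration (a sketch, not part of the original article), the Bregman distance of Definition 2.5 can be computed numerically for any differentiable convex $\phi$ supplied together with its gradient; the hypothetical helper below also checks the Euclidean special case $\phi(x)=\|x\|^{2}$.

    import numpy as np

    def bregman_distance(phi, grad_phi, x, y):
        # Delta_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>
        return phi(x) - phi(y) - np.dot(grad_phi(y), x - y)

    # Euclidean special case: phi(x) = ||x||^2 gives Delta_phi(x, y) = ||x - y||^2.
    phi = lambda v: np.dot(v, v)
    grad_phi = lambda v: 2.0 * v
    x, y = np.array([1.0, 2.0]), np.array([0.0, -1.0])
    assert np.isclose(bregman_distance(phi, grad_phi, x, y), np.dot(x - y, x - y))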

Definition 2.6

If $f$ is continuously differentiable on a nonempty open convex set $S\subseteq\mathbb{R}^{n}$, then the following first-order expansion holds:

$f(y) = f(x) + \langle\nabla f(x), y-x\rangle + o(\|y-x\|), \quad \forall x,y\in S.$

Definition 2.7

A differentiable function $f$ defined on a nonempty open convex set $S\subseteq\mathbb{R}^{n}$ is

  1. convex if and only if

    $f(y) \ge f(x) + \langle\nabla f(x), y-x\rangle, \quad \forall x,y\in S,$

    or equivalently,

    $\langle\nabla f(x) - \nabla f(y), x-y\rangle \ge 0, \quad \forall x,y\in S;$

  2. strongly $\sigma$-convex if and only if

    $f(y) \ge f(x) + \langle\nabla f(x), y-x\rangle + \frac{\sigma}{2}\|y-x\|^{2}, \quad \forall x,y\in S.$

Definition 2.8

The subdifferential of a proper lower semicontinuous function $f$ has the following basic and important properties:

  1. $\hat{\partial} f(x)\subseteq\partial f(x)$ holds for all $x\in\mathbb{R}^{n}$, where $\hat{\partial} f(x)$ is a closed convex set and $\partial f(x)$ is a closed set.

  2. If $x_{k}^{*}\in\partial f(x^{k})$, $f(x^{k})\to f(x)$, and $\lim_{k\to\infty}(x^{k}, x_{k}^{*}) = (x, x^{*})$, then $x^{*}\in\partial f(x)$; that is, $\partial f$ is closed.

  3. If $\hat{x}\in\mathbb{R}^{n}$ is a local minimizer of $f$, then $0\in\partial f(\hat{x})$. A point $\hat{x}$ is called a critical point of $f$ when $0\in\partial f(\hat{x})$; the set of critical points of $f$ is denoted by $\operatorname{crit} f$.

  4. If $g:\mathbb{R}^{n}\to\mathbb{R}$ is continuously differentiable, then $\partial(f+g)(x) = \partial f(x) + \nabla g(x)$ holds for all $x\in\operatorname{dom} f$.

Let $\Phi_{\eta}$ be the set of all concave functions $\varphi:[0,\eta)\to[0,+\infty)$ that satisfy the following conditions: (i) $\varphi(0)=0$; (ii) $\varphi$ is continuous at $0$ and continuously differentiable on $(0,\eta)$; (iii) $\varphi'(t)>0$ for each $t\in(0,\eta)$.

Lemma 2.9

[42] (Kurdyka-Łojasiewicz property (KLP)) Let $f:\mathbb{R}^{n}\to\mathbb{R}\cup\{+\infty\}$ be a proper lower semicontinuous function and $\bar{x}\in\operatorname{dom}(\partial f) = \{x\in\mathbb{R}^{n} : \partial f(x)\neq\emptyset\}$. Write $[\eta_{1} < f < \eta_{2}] = \{x\in\mathbb{R}^{n} : \eta_{1} < f(x) < \eta_{2}\}$. If there exist $\eta\in(0,+\infty]$, a neighborhood $U$ of $\bar{x}$, and a concave function $\varphi\in\Phi_{\eta}$ such that the following inequality holds for all $x\in U\cap[f(\bar{x}) < f < f(\bar{x})+\eta]$:

$\varphi'(f(x) - f(\bar{x}))\, d(0, \partial f(x)) \ge 1,$

then $f$ is said to have the KLP at $\bar{x}$. Meanwhile, $\varphi$ is called the associate function of $f$ with the KLP.
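For intuition (an illustrative calculation, not taken from the cited reference), the power function $f(x) = |x|^{p}$ with $p>1$ satisfies the KLP at $\bar{x}=0$ with associate function $\varphi(t) = t^{1/p}$, which is of the form $\varphi(t) = c\,t^{1-\theta}$ used later in Theorem 3.11 with exponent $\theta = (p-1)/p$:

\[
\varphi'\big(f(x)-f(0)\big)\, d\big(0,\partial f(x)\big)
= \frac{1}{p}\,|x|^{1-p}\cdot p\,|x|^{p-1} = 1 \ge 1, \qquad x\neq 0 .
\]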

Lemma 2.10

[40] Let $\Delta_{\phi}(x,y)$ be the Bregman distance corresponding to a differentiable convex function $\phi:\mathbb{R}^{n}\to\mathbb{R}$. Then

  1. nonnegativity: $\Delta_{\phi}(x,y)\ge 0$ and $\Delta_{\phi}(x,x)=0$ hold for all $x,y\in\mathbb{R}^{n}$;

  2. convexity: $\Delta_{\phi}(x,y)$ is convex in $x$, but not necessarily convex in $y$;

  3. strong convexity: if $\phi$ is strongly $\sigma$-convex, then $\Delta_{\phi}(x,y)\ge\frac{\sigma}{2}\|x-y\|^{2}$ for all $x,y\in\mathbb{R}^{n}$.
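A case used later is worth recording explicitly (a direct computation, stated here for the reader's convenience): in Section 4 the Bregman kernel is a weighted quadratic, and for $\phi(x) = \frac{1}{2}\|x\|_{G}^{2}$ with a symmetric matrix $G\succ 0$ one obtains

\[
\Delta_{\phi}(x,y) = \tfrac{1}{2}x^{\top}Gx - \tfrac{1}{2}y^{\top}Gy - \langle Gy,\, x-y\rangle = \tfrac{1}{2}\|x-y\|_{G}^{2} \;\ge\; \tfrac{\lambda_{\min}(G)}{2}\|x-y\|^{2},
\]

so such a $\phi$ is strongly $\sigma$-convex with $\sigma = \lambda_{\min}(G)$.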

Lemma 2.11

[42] Assume that $H(x,y,z) = p(x) + q(y) + h(z)$, where $p:\mathbb{R}^{n}\to\mathbb{R}\cup\{+\infty\}$ and $q,h:\mathbb{R}^{m}\to\mathbb{R}\cup\{+\infty\}$ are proper lower semicontinuous functions. Then the following identity holds for all $(x,y,z)\in\operatorname{dom} H = \operatorname{dom} p\times\operatorname{dom} q\times\operatorname{dom} h$:

$\partial H(x,y,z) = \partial_{x}H(x,y,z)\times\partial_{y}H(x,y,z)\times\partial_{z}H(x,y,z).$

Lemma 2.12

[43] (Uniform KLP) Let $\Omega$ be a compact set and $f:\mathbb{R}^{n}\to\mathbb{R}\cup\{+\infty\}$ be proper and lower semicontinuous. If $f$ is constant on $\Omega$ and satisfies the KLP of Lemma 2.9 at every point of $\Omega$, then there exist $\zeta>0$, $\eta>0$, and $\varphi\in\Phi_{\eta}$ such that for all $\bar{x}\in\Omega$ and all $x\in\{x\in\mathbb{R}^{n} : d(x,\Omega)<\zeta\}\cap[f(\bar{x}) < f(x) < f(\bar{x})+\eta]$, the following inequality holds:

$\varphi'(f(x) - f(\bar{x}))\, d(0, \partial f(x)) \ge 1.$

Lemma 2.13

[44] Suppose that $p:\mathbb{R}^{n}\to\mathbb{R}$ is continuously differentiable and that its gradient $\nabla p$ is $L_{p}$-Lipschitz continuous. Then

$|p(y) - p(x) - \langle\nabla p(x), y-x\rangle| \le \frac{L_{p}}{2}\|y-x\|^{2}, \quad \forall x,y\in\mathbb{R}^{n}.$

Lemma 2.14

  1. If $\{e_{k}\}$ is monotonically decreasing and bounded from below, then the limit of $\{e_{k}\}$ exists.

  2. If $\{e_{k}\}$ is monotone and there exists an infinite subsequence $e_{k_{j}}\to e$, then $\lim_{k\to\infty}e_{k} = e$.

3 Algorithm and convergence analysis

Aiming at the problem (1.3) and combining the Bregman distance with PRSM, the following 3-BPRSM is proposed; Figure 1 is a flowchart of the proposed 3-BPRSM.

Figure 1: Flowchart of the proposed 3-BPRSM.

Algorithm 1. 3-BPRSM
Input: initial points $x^{0}\in\operatorname{dom} f$, $y^{0}\in\operatorname{dom} g$, $z^{0}\in\operatorname{dom} h$ with $(x^{0},y^{0},z^{0})\in\operatorname{dom} l$; vector $b$, multiplier $\lambda^{0}$, parameters $\beta$, $r$, $s$; iteration counter $k=0$.
1: for $k = 0, 1, 2, 3, \ldots$ do
2:   $x^{k+1} \in \operatorname{argmin}_{x}\{\mathcal{L}_{\beta}(x, y^{k}, z^{k}, \lambda^{k}) + \Delta_{\phi}(x, x^{k})\}$ (3.1)
3:   $\lambda^{k+\frac{1}{2}} = \lambda^{k} - r\beta(Ax^{k+1} + y^{k} + z^{k} - b)$ (3.2)
4:   $y^{k+1} \in \operatorname{argmin}_{y}\{\mathcal{L}_{\beta}(x^{k+1}, y, z^{k}, \lambda^{k+\frac{1}{2}})\}$ (3.3)
5:   $z^{k+1} \in \operatorname{argmin}_{z}\{\mathcal{L}_{\beta}(x^{k+1}, y^{k+1}, z, \lambda^{k+\frac{1}{2}})\}$ (3.4)
6:   if $\|Ax^{k+1} + y^{k+1} + z^{k+1} - b\|_{2} \le \sqrt{m}\cdot 10^{-4}$, then
7:     break
8:   else
9:     $\lambda^{k+1} = \lambda^{k+\frac{1}{2}} - s\beta(Ax^{k+1} + y^{k+1} + z^{k+1} - b)$ (3.5)
10:   end if
11:   return $x^{k+1}$, $\lambda^{k+\frac{1}{2}}$, $y^{k+1}$, $z^{k+1}$
12: end for
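To make the update order concrete, here is a minimal Python sketch of one possible realization of Algorithm 1. The subproblem solvers solve_x, solve_y, and solve_z are hypothetical placeholders (in practice they are problem-specific, e.g., the closed-form updates (4.4) in Section 4), $\mathcal{L}_{\beta}$ denotes the augmented Lagrange function defined in (3.6) below, and the tolerance mirrors the stopping rule in line 6.

    import numpy as np

    def three_block_bprsm(solve_x, solve_y, solve_z, A, b, beta, r, s,
                          x0, y0, z0, lam0, max_iter=500, tol_scale=1e-4):
        # solve_x(x, y, z, lam): approximates argmin_x L_beta(x, y, z, lam) + Delta_phi(x, x_prev)
        # solve_y(x, z, lam):    approximates argmin_y L_beta(x, y, z, lam)
        # solve_z(x, y, lam):    approximates argmin_z L_beta(x, y, z, lam)
        x, y, z, lam = x0, y0, z0, lam0
        m = b.shape[0]
        for _ in range(max_iter):
            x = solve_x(x, y, z, lam)                                 # step (3.1)
            lam_half = lam - r * beta * (A @ x + y + z - b)           # step (3.2)
            y = solve_y(x, z, lam_half)                               # step (3.3)
            z = solve_z(x, y, lam_half)                               # step (3.4)
            residual = A @ x + y + z - b
            if np.linalg.norm(residual) <= np.sqrt(m) * tol_scale:    # stopping rule (line 6)
                break
            lam = lam_half - s * beta * residual                      # step (3.5)
        return x, y, z, lam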

The augmented Lagrange function for (1.3) is defined as follows:

(3.6) $\mathcal{L}_{\beta}(x,y,z,\lambda) = f(x) + g(y) + h(z) + l(x,y,z) - \langle\lambda, Ax+y+z-b\rangle + \frac{\beta}{2}\|Ax+y+z-b\|^{2},$

where $\lambda\in\mathbb{R}^{m}$ is the Lagrange multiplier and $\beta\,(>0)$ is the penalty parameter. Suppose that $\Delta_{\phi}$ is the Bregman distance with respect to the differentiable convex function $\phi$, and write $\omega = (x,y,z,\lambda)$. According to the augmented Lagrange function (3.6), one can define:

(3.7) $\partial_{x}\mathcal{L}_{\beta}(\omega) = \partial f(x) + \nabla_{x}l(x,y,z) - A^{\top}\lambda + \beta A^{\top}(Ax+y+z-b),$
$\nabla_{y}\mathcal{L}_{\beta}(\omega) = \nabla g(y) + \nabla_{y}l(x,y,z) - \lambda + \beta(Ax+y+z-b),$
$\nabla_{z}\mathcal{L}_{\beta}(\omega) = \nabla h(z) + \nabla_{z}l(x,y,z) - \lambda + \beta(Ax+y+z-b),$
$\nabla_{\lambda}\mathcal{L}_{\beta}(\omega) = -(Ax+y+z-b).$
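For readers who want to monitor the descent property numerically, a direct Python sketch of the augmented Lagrange function (3.6) is given below; f, g, h, and l are assumed to be user-supplied callables (hypothetical placeholders, not functions defined in this article).

    import numpy as np

    def augmented_lagrangian(f, g, h, l, A, b, beta, x, y, z, lam):
        # L_beta(x, y, z, lambda) = f(x) + g(y) + h(z) + l(x, y, z)
        #                           - <lambda, Ax + y + z - b> + (beta / 2) * ||Ax + y + z - b||^2
        residual = A @ x + y + z - b
        return (f(x) + g(y) + h(z) + l(x, y, z)
                - np.dot(lam, residual) + 0.5 * beta * np.dot(residual, residual))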

From (3.7), we can obtain the following result.

Lemma 3.1

Let $(x^{*},y^{*},z^{*},\lambda^{*})$ be a stationary point of $\mathcal{L}_{\beta}(x,y,z,\lambda)$. Then $0\in\partial\mathcal{L}_{\beta}(x^{*},y^{*},z^{*},\lambda^{*})$ if and only if the following equalities hold by Lemma 2.11:

$A^{\top}\lambda^{*} - \nabla_{x}l(x^{*},y^{*},z^{*})\in\partial f(x^{*}),\quad \lambda^{*} - \nabla_{y}l(x^{*},y^{*},z^{*}) = \nabla g(y^{*}),\quad \lambda^{*} - \nabla_{z}l(x^{*},y^{*},z^{*}) = \nabla h(z^{*}),\quad Ax^{*}+y^{*}+z^{*}-b = 0.$

The augmented Lagrange function $\mathcal{L}_{\beta}(x,y,z,\lambda)$ has the following property for all $\theta\in\mathbb{R}$ and $(x,y,z,\lambda)\in\mathbb{R}^{n}\times\mathbb{R}^{m}\times\mathbb{R}^{m}\times\mathbb{R}^{m}$:

(3.8) $\mathcal{L}_{\beta}(x,y,z,\lambda-\theta(Ax+y+z-b)) = \mathcal{L}_{\beta}(x,y,z,\lambda) + \theta\|Ax+y+z-b\|^{2},$

which follows directly from $-\langle\lambda-\theta u, u\rangle = -\langle\lambda, u\rangle + \theta\|u\|^{2}$ with $u = Ax+y+z-b$.

To simplify the analysis, we introduce the notation

$v = (x,y,z),\quad v^{*} = (x^{*},y^{*},z^{*}),\quad v^{k} = (x^{k},y^{k},z^{k}),\quad \omega = (v,\lambda),\quad \omega^{*} = (v^{*},\lambda^{*}),\quad \omega^{k} = (v^{k},\lambda^{k}).$

To analyze the convergence, some basic assumptions of the problem (1.3) are given.

Assumption 3.2

  1. $f:\mathbb{R}^{n}\to\mathbb{R}\cup\{+\infty\}$ is proper and lower semicontinuous; $g:\mathbb{R}^{m}\to\mathbb{R}$, $h:\mathbb{R}^{m}\to\mathbb{R}$, and $l:\mathbb{R}^{n}\times\mathbb{R}^{m}\times\mathbb{R}^{m}\to\mathbb{R}$ are differentiable, and the gradients $\nabla g$, $\nabla h$, and $\nabla l$ are Lipschitz continuous with constants $L_{g}$, $L_{h}$, and $L_{l}$, respectively.

  2. $\phi$ is a strongly $\sigma$-convex differentiable function, and the gradient $\nabla\phi$ is $L_{\phi}$-Lipschitz continuous.

Remark 3.3

According to the optimality conditions of 3-BPRSM, one can obtain

(3.9) $0\in\partial f(x^{k+1}) + \nabla_{x}l(x^{k+1},y^{k},z^{k}) - A^{\top}\lambda^{k} + \beta A^{\top}(Ax^{k+1}+y^{k}+z^{k}-b) + \nabla\phi(x^{k+1}) - \nabla\phi(x^{k}),$
$0 = \nabla g(y^{k+1}) + \nabla_{y}l(x^{k+1},y^{k+1},z^{k}) - \lambda^{k+\frac{1}{2}} + \beta(Ax^{k+1}+y^{k+1}+z^{k}-b),$
$0 = \nabla h(z^{k+1}) + \nabla_{z}l(v^{k+1}) - \lambda^{k+\frac{1}{2}} + \beta(Ax^{k+1}+y^{k+1}+z^{k+1}-b).$

To analyze the monotonicity of $\{\mathcal{L}_{\beta}(\omega^{k})\}$, let

(3.10) $\delta_{1} \coloneqq \delta_{1}(r,s,\beta) = \frac{\sigma}{2} - \frac{6[L_{l}^{2}+\beta^{2}(1-s)^{2}\lambda_{\max}(A^{\top}A)]}{(r+s)\beta},$
$\delta_{2} \coloneqq \delta_{2}(r,s,\beta) = \frac{\beta-L_{g}-L_{l}}{2} - \frac{rs\beta}{r+s} - \frac{6[L_{l}^{2}+\beta^{2}(1-s)^{2}]}{(r+s)\beta} = \frac{[r(1-2s)+s-12(1-s)^{2}]\beta^{2} - (r+s)(L_{g}+L_{l})\beta - 12L_{l}^{2}}{2(r+s)\beta},$
$\delta_{3} \coloneqq \delta_{3}(r,s,\beta) = \frac{\beta-L_{h}-L_{l}}{2} - \frac{rs\beta}{r+s} - \frac{6[(L_{h}+L_{l})^{2}+\beta^{2}(1-s)^{2}]}{(r+s)\beta} = \frac{[r(1-2s)+s-12(1-s)^{2}]\beta^{2} - (r+s)(L_{h}+L_{l})\beta - 12(L_{h}+L_{l})^{2}}{2(r+s)\beta},$
$\delta \coloneqq \delta(r,s,\beta) = \min\{\delta_{1}, \delta_{2}, \delta_{3}\}.$

Lemma 3.4

If Assumption 3.2 holds and $r$ and $s$ in 3-BPRSM satisfy $r+s>0$, then

(3.11) $\mathcal{L}_{\beta}(\omega^{k+1}) \le \mathcal{L}_{\beta}(\omega^{k}) - \delta\|v^{k+1}-v^{k}\|^{2}, \quad \forall k\ge 0,$

where $\delta$ is defined in (3.10).

Proof

On the one hand, according to (3.6) and (3.4), we have

$\mathcal{L}_{\beta}(v^{k+1},\lambda^{k+\frac{1}{2}}) - \mathcal{L}_{\beta}(x^{k+1},y^{k+1},z^{k},\lambda^{k+\frac{1}{2}})$
$= h(z^{k+1}) - h(z^{k}) + l(v^{k+1}) - l(x^{k+1},y^{k+1},z^{k}) - \langle\lambda^{k+\frac{1}{2}}, z^{k+1}-z^{k}\rangle + \frac{\beta}{2}\|Ax^{k+1}+y^{k+1}+z^{k+1}-b\|^{2} - \frac{\beta}{2}\|Ax^{k+1}+y^{k+1}+z^{k}-b\|^{2}$
$= h(z^{k+1}) - h(z^{k}) + l(v^{k+1}) - l(x^{k+1},y^{k+1},z^{k}) - \langle\lambda^{k+\frac{1}{2}} - \beta(Ax^{k+1}+y^{k+1}+z^{k+1}-b), z^{k+1}-z^{k}\rangle - \frac{\beta}{2}\|z^{k+1}-z^{k}\|^{2}$
$= h(z^{k+1}) - h(z^{k}) + l(v^{k+1}) - l(x^{k+1},y^{k+1},z^{k}) - \langle\nabla h(z^{k+1}) + \nabla_{z}l(v^{k+1}), z^{k+1}-z^{k}\rangle - \frac{\beta}{2}\|z^{k+1}-z^{k}\|^{2},$

where the second equality is obtained from $\|c\|^{2}-\|d\|^{2} = \langle c+d, c-d\rangle$ and the third from the optimality condition for $z^{k+1}$ in (3.9). Taking advantage of the Lipschitz continuity of $\nabla h$ and $\nabla l$ together with Lemma 2.13, we have

$h(z^{k+1}) - h(z^{k}) - \langle\nabla h(z^{k+1}), z^{k+1}-z^{k}\rangle \le \frac{L_{h}}{2}\|z^{k+1}-z^{k}\|^{2}$

and

$l(v^{k+1}) - l(x^{k+1},y^{k+1},z^{k}) - \langle\nabla_{z}l(v^{k+1}), z^{k+1}-z^{k}\rangle \le \frac{L_{l}}{2}\|z^{k+1}-z^{k}\|^{2}.$

Thus, one has

(3.12) $\mathcal{L}_{\beta}(v^{k+1},\lambda^{k+\frac{1}{2}}) - \mathcal{L}_{\beta}(x^{k+1},y^{k+1},z^{k},\lambda^{k+\frac{1}{2}}) \le -\frac{\beta-L_{h}-L_{l}}{2}\|z^{k+1}-z^{k}\|^{2}.$

Similarly, by (3.3) and the optimality condition for $y^{k+1}$ in (3.9), the following inequality can be deduced in the same way:

(3.13) $\mathcal{L}_{\beta}(x^{k+1},y^{k+1},z^{k},\lambda^{k+\frac{1}{2}}) - \mathcal{L}_{\beta}(x^{k+1},y^{k},z^{k},\lambda^{k+\frac{1}{2}}) \le -\frac{\beta-L_{g}-L_{l}}{2}\|y^{k+1}-y^{k}\|^{2}.$

Moreover, it is known from (3.2), (3.5), and (3.8) that

(3.14) $\mathcal{L}_{\beta}(x^{k+1},y^{k},z^{k},\lambda^{k+\frac{1}{2}}) - \mathcal{L}_{\beta}(x^{k+1},y^{k},z^{k},\lambda^{k}) = r\beta\|Ax^{k+1}+y^{k}+z^{k}-b\|^{2},$

(3.15) $\mathcal{L}_{\beta}(\omega^{k+1}) - \mathcal{L}_{\beta}(v^{k+1},\lambda^{k+\frac{1}{2}}) = s\beta\|Ax^{k+1}+y^{k+1}+z^{k+1}-b\|^{2}.$

In addition, since $x^{k+1}$ is the optimal solution of (3.1), combining with Lemma 2.10, the following inequality holds:

(3.16) $\mathcal{L}_{\beta}(x^{k+1},y^{k},z^{k},\lambda^{k}) - \mathcal{L}_{\beta}(\omega^{k}) \le -\Delta_{\phi}(x^{k+1},x^{k}) \le -\frac{\sigma}{2}\|x^{k+1}-x^{k}\|^{2}.$

Hence, according to (3.12)-(3.16), one can obtain

(3.17) $\mathcal{L}_{\beta}(\omega^{k+1}) - \mathcal{L}_{\beta}(\omega^{k})$
$\overset{(3.15)}{=} \mathcal{L}_{\beta}(v^{k+1},\lambda^{k+\frac{1}{2}}) + s\beta\|Ax^{k+1}+y^{k+1}+z^{k+1}-b\|^{2} - \mathcal{L}_{\beta}(\omega^{k})$
$\overset{(3.12)}{\le} \mathcal{L}_{\beta}(x^{k+1},y^{k+1},z^{k},\lambda^{k+\frac{1}{2}}) - \frac{\beta-L_{h}-L_{l}}{2}\|z^{k+1}-z^{k}\|^{2} + s\beta\|Ax^{k+1}+y^{k+1}+z^{k+1}-b\|^{2} - \mathcal{L}_{\beta}(\omega^{k})$
$\overset{(3.13)}{\le} \mathcal{L}_{\beta}(x^{k+1},y^{k},z^{k},\lambda^{k+\frac{1}{2}}) - \frac{\beta-L_{h}-L_{l}}{2}\|z^{k+1}-z^{k}\|^{2} - \frac{\beta-L_{g}-L_{l}}{2}\|y^{k+1}-y^{k}\|^{2} + s\beta\|Ax^{k+1}+y^{k+1}+z^{k+1}-b\|^{2} - \mathcal{L}_{\beta}(\omega^{k})$
$\overset{(3.14)}{=} \mathcal{L}_{\beta}(x^{k+1},y^{k},z^{k},\lambda^{k}) - \frac{\beta-L_{h}-L_{l}}{2}\|z^{k+1}-z^{k}\|^{2} - \frac{\beta-L_{g}-L_{l}}{2}\|y^{k+1}-y^{k}\|^{2} + r\beta\|Ax^{k+1}+y^{k}+z^{k}-b\|^{2} + s\beta\|Ax^{k+1}+y^{k+1}+z^{k+1}-b\|^{2} - \mathcal{L}_{\beta}(\omega^{k})$
$\overset{(3.16)}{\le} -\frac{\sigma}{2}\|x^{k+1}-x^{k}\|^{2} - \frac{\beta-L_{h}-L_{l}}{2}\|z^{k+1}-z^{k}\|^{2} - \frac{\beta-L_{g}-L_{l}}{2}\|y^{k+1}-y^{k}\|^{2} + r\beta\|Ax^{k+1}+y^{k}+z^{k}-b\|^{2} + s\beta\|Ax^{k+1}+y^{k+1}+z^{k+1}-b\|^{2}.$

On the other hand, the following equalities hold by the multiplier update formulas (3.2) and (3.5):

(3.18) $Ax^{k+1}+y^{k}+z^{k}-b = -\frac{1}{(r+s)\beta}(\lambda^{k+1}-\lambda^{k}) - \frac{s}{r+s}(y^{k+1}-y^{k}) - \frac{s}{r+s}(z^{k+1}-z^{k}),$

(3.19) $Ax^{k+1}+y^{k+1}+z^{k+1}-b = -\frac{1}{(r+s)\beta}(\lambda^{k+1}-\lambda^{k}) + \frac{r}{r+s}(y^{k+1}-y^{k}) + \frac{r}{r+s}(z^{k+1}-z^{k}).$

Then, from (3.5) and (3.9), we have

(3.20) $\lambda^{k+1} = \nabla h(z^{k+1}) + \nabla_{z}l(v^{k+1}) + \beta(1-s)(Ax^{k+1}+y^{k+1}+z^{k+1}-b).$

Combining this with the Lipschitz continuity of $\nabla h$ and $\nabla l$, it is derived that

(3.21) $\|\lambda^{k+1}-\lambda^{k}\| = \|\nabla h(z^{k+1})-\nabla h(z^{k}) + \nabla_{z}l(v^{k+1})-\nabla_{z}l(v^{k}) + \beta(1-s)[A(x^{k+1}-x^{k})+(y^{k+1}-y^{k})+(z^{k+1}-z^{k})]\|$
$\le L_{l}\|x^{k+1}-x^{k}\| + L_{l}\|y^{k+1}-y^{k}\| + (L_{h}+L_{l})\|z^{k+1}-z^{k}\| + \beta|1-s|\,(\|A(x^{k+1}-x^{k})\| + \|y^{k+1}-y^{k}\| + \|z^{k+1}-z^{k}\|).$

Furthermore, according to the Cauchy inequality, the following inequality holds:

(3.22) $\frac{1}{6}\|\lambda^{k+1}-\lambda^{k}\|^{2} \le \beta^{2}(1-s)^{2}(\|A(x^{k+1}-x^{k})\|^{2} + \|y^{k+1}-y^{k}\|^{2} + \|z^{k+1}-z^{k}\|^{2}) + L_{l}^{2}\|x^{k+1}-x^{k}\|^{2} + L_{l}^{2}\|y^{k+1}-y^{k}\|^{2} + (L_{h}+L_{l})^{2}\|z^{k+1}-z^{k}\|^{2}.$

Thus, combining (3.18), (3.19), (3.22), and $r+s>0$, one has

(3.23) $r\beta\|Ax^{k+1}+y^{k}+z^{k}-b\|^{2} + s\beta\|Ax^{k+1}+y^{k+1}+z^{k+1}-b\|^{2}$
$= \frac{1}{(r+s)\beta}\|\lambda^{k+1}-\lambda^{k}\|^{2} + \frac{rs\beta}{r+s}\|y^{k+1}-y^{k}\|^{2} + \frac{rs\beta}{r+s}\|z^{k+1}-z^{k}\|^{2}$
$\le \frac{6[L_{l}^{2}+\beta^{2}(1-s)^{2}\lambda_{\max}(A^{\top}A)]}{(r+s)\beta}\|x^{k+1}-x^{k}\|^{2} + \left(\frac{6[L_{l}^{2}+\beta^{2}(1-s)^{2}]}{(r+s)\beta} + \frac{rs\beta}{r+s}\right)\|y^{k+1}-y^{k}\|^{2} + \left(\frac{6[(L_{h}+L_{l})^{2}+\beta^{2}(1-s)^{2}]}{(r+s)\beta} + \frac{rs\beta}{r+s}\right)\|z^{k+1}-z^{k}\|^{2}.$

Finally, substituting (3.23) into (3.17), the following inequality holds:

$\mathcal{L}_{\beta}(\omega^{k+1}) - \mathcal{L}_{\beta}(\omega^{k}) \le -\left(\frac{\sigma}{2} - \frac{6[L_{l}^{2}+\beta^{2}(1-s)^{2}\lambda_{\max}(A^{\top}A)]}{(r+s)\beta}\right)\|x^{k+1}-x^{k}\|^{2} - \left(\frac{\beta-L_{g}-L_{l}}{2} - \frac{6[L_{l}^{2}+\beta^{2}(1-s)^{2}]}{(r+s)\beta} - \frac{rs\beta}{r+s}\right)\|y^{k+1}-y^{k}\|^{2} - \left(\frac{\beta-L_{h}-L_{l}}{2} - \frac{6[(L_{h}+L_{l})^{2}+\beta^{2}(1-s)^{2}]}{(r+s)\beta} - \frac{rs\beta}{r+s}\right)\|z^{k+1}-z^{k}\|^{2}$
$= -\delta_{1}(r,s,\beta)\|x^{k+1}-x^{k}\|^{2} - \delta_{2}(r,s,\beta)\|y^{k+1}-y^{k}\|^{2} - \delta_{3}(r,s,\beta)\|z^{k+1}-z^{k}\|^{2} \le -\delta\|v^{k+1}-v^{k}\|^{2}.$

The proof is completed.□

It follows that $\{\mathcal{L}_{\beta}(\omega^{k})\}$ has the sufficient descent property when $\delta>0$, which can be achieved by choosing suitable values of $(r,s,\beta,\sigma)$. Some restrictions on the parameters $(r,s,\beta,\sigma)$ are given below.

Assumption 3.5

In 3-BPRSM, the strong convexity coefficient $\sigma$ of the function $\phi$ and the parameters $(r,s,\beta)$ satisfy the following conditions:

  1. $r+s>0$ and $r(1-2s)+s-12(1-s)^{2}>0$; the solution set $D$ of these two inequalities can be described as follows: either $s<\frac{1}{2}$ and $r\in\left(\frac{12(1-s)^{2}-s}{1-2s}, +\infty\right)$, or $s\in\left(\frac{6-\sqrt{6}}{5}, \frac{6+\sqrt{6}}{5}\right)$ with $s>\frac{1}{2}$ and $r\in\left(-s, \frac{12(1-s)^{2}-s}{1-2s}\right)$;

  2. $\beta > \beta_{0} \coloneqq \max\{\varpi, \varrho\}$, where

    $\varpi = \frac{(r+s)(L_{g}+L_{l}) + \sqrt{(r+s)^{2}(L_{g}+L_{l})^{2} + 48[r(1-2s)+s-12(1-s)^{2}]L_{l}^{2}}}{2[r(1-2s)+s-12(1-s)^{2}]},$

    $\varrho = \frac{(r+s)(L_{h}+L_{l}) + \sqrt{(r+s)^{2}(L_{h}+L_{l})^{2} + 48[r(1-2s)+s-12(1-s)^{2}](L_{h}+L_{l})^{2}}}{2[r(1-2s)+s-12(1-s)^{2}]};$

  3. $\sigma > \frac{12[L_{l}^{2}+\beta^{2}(1-s)^{2}\lambda_{\max}(A^{\top}A)]}{(r+s)\beta}.$

In particular, $r=s$ is admissible when $s\in\left(\frac{6}{7}, 1\right)$; therefore, the parameters $r$ and $s$ can take the same value in this range.

Lemma 3.6

If both Assumptions 3.2 and 3.5 hold, then $\delta_{1}$, $\delta_{2}$, and $\delta_{3}$ defined by (3.10) are positive, and hence so is $\delta$. Thus, $\{\mathcal{L}_{\beta}(\omega^{k})\}$ decreases monotonically.

Proof

Indeed, $\delta_{1}>0$, $\delta_{2}>0$, and $\delta_{3}>0$ clearly hold under Assumption 3.5. Further, $\delta>0$ holds by (3.10). Finally, $\{\mathcal{L}_{\beta}(\omega^{k})\}$ decreases monotonically by (3.11).□

Lemma 3.7

If both Assumptions 3.2 and 3.5 hold and the sequence $\{\omega^{k}\}$ generated by 3-BPRSM is bounded, then $\sum_{k=0}^{+\infty}\|\omega^{k+1}-\omega^{k}\|^{2} < +\infty$.

Proof

By the boundedness of the sequence $\{\omega^{k}\}$, there exists a subsequence $\{\omega^{k_{j}}\}$ such that $\lim_{j\to+\infty}\omega^{k_{j}} = \omega^{*}$. Since $f(x)$ is lower semicontinuous and $g(y)$, $h(z)$, and $l(x,y,z)$ are continuous, the augmented Lagrange function $\mathcal{L}_{\beta}(\cdot)$ is also lower semicontinuous. Hence, $\mathcal{L}_{\beta}(\omega^{*}) \le \liminf_{j\to+\infty}\mathcal{L}_{\beta}(\omega^{k_{j}}) \le \mathcal{L}_{\beta}(\omega^{0}) < +\infty$, which implies that $\{\mathcal{L}_{\beta}(\omega^{k_{j}})\}$ is bounded from below. Since $\{\mathcal{L}_{\beta}(\omega^{k})\}$ is monotonically decreasing and has a convergent subsequence, the whole sequence converges by Lemma 2.14, and $\lim_{k\to+\infty}\mathcal{L}_{\beta}(\omega^{k}) = \inf_{k}\mathcal{L}_{\beta}(\omega^{k}) \ge \mathcal{L}_{\beta}(\omega^{*})$. Therefore, the inequality $\mathcal{L}_{\beta}(\omega^{k}) \ge \mathcal{L}_{\beta}(\omega^{*})$ holds for all $k$. In addition, by (3.11), one has

(3.24) $\delta\|v^{k+1}-v^{k}\|^{2} \le \mathcal{L}_{\beta}(\omega^{k}) - \mathcal{L}_{\beta}(\omega^{k+1}), \quad \forall k\ge 0.$

Summing up (3.24) from $k=0$ to $k=q$, the following inequality holds since $\mathcal{L}_{\beta}(\omega^{0}) < +\infty$:

$\delta\sum_{k=0}^{q}\|v^{k+1}-v^{k}\|^{2} \le \mathcal{L}_{\beta}(\omega^{0}) - \mathcal{L}_{\beta}(\omega^{q+1}) \le \mathcal{L}_{\beta}(\omega^{0}) - \mathcal{L}_{\beta}(\omega^{*}) < +\infty.$

Then, because $\delta>0$, one immediately knows that $\sum_{k=0}^{+\infty}\|v^{k+1}-v^{k}\|^{2} < +\infty$, and hence

$\sum_{k=0}^{+\infty}\|x^{k+1}-x^{k}\|^{2} < +\infty,\quad \sum_{k=0}^{+\infty}\|y^{k+1}-y^{k}\|^{2} < +\infty,\quad \sum_{k=0}^{+\infty}\|z^{k+1}-z^{k}\|^{2} < +\infty.$

Finally, by combining with (3.22), we obtain $\sum_{k=0}^{+\infty}\|\lambda^{k+1}-\lambda^{k}\|^{2} < +\infty$, and so $\sum_{k=0}^{+\infty}\|\omega^{k+1}-\omega^{k}\|^{2} < +\infty$. The proof is complete.□

Theorem 3.8

(Global convergence) Let $\Omega$ be the set of accumulation points of $\{\omega^{k}\}$. Assume that Assumptions 3.2 and 3.5 hold and that $\{\omega^{k}\}$ generated by 3-BPRSM is bounded. Then the following three results are true:

  1. $\Omega$ is a nonempty compact set, and $d(\omega^{k}, \Omega)\to 0$ as $k\to+\infty$;

  2. $\Omega\subseteq\operatorname{crit}\mathcal{L}_{\beta}$;

  3. The whole sequence $\{\mathcal{L}_{\beta}(\omega^{k})\}$ is convergent, and $\mathcal{L}_{\beta}(\omega^{*}) = \lim_{k\to+\infty}\mathcal{L}_{\beta}(\omega^{k}) = \inf_{k}\mathcal{L}_{\beta}(\omega^{k})$ holds for all $\omega^{*}\in\Omega$. Further, $\mathcal{L}_{\beta}(\cdot)$ is finite and constant on $\Omega$.

Proof

(i) Conclusion (i) holds by the definition of $\Omega$, the boundedness of $\{\omega^{k}\}$, and Lemma 3.7.

(ii) Let $\omega^{*} = (x^{*},y^{*},z^{*},\lambda^{*})\in\Omega$. Then there exists a subsequence $\{\omega^{k_{j}}\}$ of $\{\omega^{k}\}$ converging to $\omega^{*}$. Since $\lim_{k\to+\infty}\|\omega^{k+1}-\omega^{k}\| = 0$ by Lemma 3.7, we also have $\lim_{j\to+\infty}\omega^{k_{j}+1} = \omega^{*}$. Letting $k = k_{j}\to+\infty$ in (3.2), we see that $\{\lambda^{k_{j}+\frac{1}{2}}\}$ is bounded, so (along a further subsequence if necessary) $\lim_{j\to+\infty}\lambda^{k_{j}+\frac{1}{2}} = \lambda^{**}$. Taking the limit $k = k_{j}\to+\infty$ in (3.2) and (3.5) yields

$\lambda^{**} = \lambda^{*} - r\beta(Ax^{*}+y^{*}+z^{*}-b), \quad \lambda^{*} = \lambda^{**} - s\beta(Ax^{*}+y^{*}+z^{*}-b).$

Because $r+s>0$, one can obtain from the aforementioned equations that

$Ax^{*}+y^{*}+z^{*}-b = 0, \quad \lambda^{**} = \lambda^{*}.$

Thus, $v^{*} = (x^{*},y^{*},z^{*})$ is a feasible point of the problem (1.3). In addition, noting that $x^{k_{j}+1}$ is the optimal solution of (3.1), the following inequality holds:

$f(x^{k_{j}+1}) + l(x^{k_{j}+1},y^{k_{j}},z^{k_{j}}) - \langle\lambda^{k_{j}}, Ax^{k_{j}+1}\rangle + \frac{\beta}{2}\|Ax^{k_{j}+1}+y^{k_{j}}+z^{k_{j}}-b\|^{2} + \Delta_{\phi}(x^{k_{j}+1},x^{k_{j}})$
$\le f(x^{*}) + l(x^{*},y^{k_{j}},z^{k_{j}}) - \langle\lambda^{k_{j}}, Ax^{*}\rangle + \frac{\beta}{2}\|Ax^{*}+y^{k_{j}}+z^{k_{j}}-b\|^{2} + \Delta_{\phi}(x^{*},x^{k_{j}}).$

Combining $\lim_{j\to+\infty}\omega^{k_{j}} = \lim_{j\to+\infty}\omega^{k_{j}+1} = \omega^{*}$ and the continuous differentiability of $\phi$ with the aforementioned inequality, we know that $\limsup_{j\to+\infty}f(x^{k_{j}+1}) \le f(x^{*})$. Also, by the lower semicontinuity of $f$, it follows that $f(x^{*}) \le \liminf_{j\to+\infty}f(x^{k_{j}+1})$, so we obtain

(3.25) $\lim_{j\to+\infty}f(x^{k_{j}+1}) = f(x^{*}).$

In addition, $\lim_{j\to+\infty}[\nabla\phi(x^{k_{j}+1}) - \nabla\phi(x^{k_{j}})] = 0$ according to Assumption 3.2 and $\lim_{j\to+\infty}\|x^{k_{j}+1}-x^{k_{j}}\| = 0$. Hence, taking into account the closedness of $\partial f$, the continuity of $\nabla g$ and $\nabla h$, as well as (3.25), letting $k = k_{j}\to+\infty$ and taking the limit in (3.9), we obtain

$A^{\top}\lambda^{*} - \nabla_{x}l(v^{*})\in\partial f(x^{*}),\quad \lambda^{*} - \nabla_{y}l(v^{*}) = \nabla g(y^{*}),\quad \lambda^{*} - \nabla_{z}l(v^{*}) = \nabla h(z^{*}),\quad Ax^{*}+y^{*}+z^{*}-b = 0,$

and $\omega^{*}\in\operatorname{crit}\mathcal{L}_{\beta}$ by Lemma 3.1.

(iii) Let $\omega^{*}\in\Omega$. There exists at least one subsequence $\{\omega^{k_{j}}\}$ of $\{\omega^{k}\}$ converging to $\omega^{*}$. According to (3.6) and (3.11), $\lim_{j\to+\infty}\mathcal{L}_{\beta}(\omega^{k_{j}+1}) = \mathcal{L}_{\beta}(\omega^{*})$ holds. Combining the monotonicity of $\{\mathcal{L}_{\beta}(\omega^{k})\}$ with Lemma 2.14, we see that the whole sequence $\{\mathcal{L}_{\beta}(\omega^{k})\}$ is convergent. Thus, $+\infty > \mathcal{L}_{\beta}(\omega^{0}) \ge \lim_{k\to+\infty}\mathcal{L}_{\beta}(\omega^{k}) = \inf_{k}\mathcal{L}_{\beta}(\omega^{k}) = \mathcal{L}_{\beta}(\omega^{*})$. Hence, $\mathcal{L}_{\beta}(\omega^{*}) = \lim_{k\to+\infty}\mathcal{L}_{\beta}(\omega^{k}) < +\infty$ holds for all $\omega^{*}\in\Omega$, and $\mathcal{L}_{\beta}(\cdot)$ is finite and constant on $\Omega$. The proof is completed.□

Lemma 3.9

Let

(3.26) $\varepsilon_{1}^{k+1} = \beta A^{\top}(y^{k+1}-y^{k}) + \beta A^{\top}(z^{k+1}-z^{k}) - A^{\top}(\lambda^{k+1}-\lambda^{k}) - [\nabla\phi(x^{k+1})-\nabla\phi(x^{k})] + \nabla_{x}l(v^{k+1}) - \nabla_{x}l(x^{k+1},y^{k},z^{k}),$
$\varepsilon_{2}^{k+1} = -\frac{s}{r+s}(\lambda^{k+1}-\lambda^{k}) + \frac{rs\beta}{r+s}(y^{k+1}-y^{k}+z^{k+1}-z^{k}) - \nabla_{y}l(x^{k+1},y^{k+1},z^{k}) + \nabla_{y}l(v^{k+1}) + \beta(z^{k+1}-z^{k}),$
$\varepsilon_{3}^{k+1} = -\frac{s}{r+s}(\lambda^{k+1}-\lambda^{k}) + \frac{rs\beta}{r+s}(y^{k+1}-y^{k}+z^{k+1}-z^{k}),$
$\varepsilon_{4}^{k+1} = \frac{1}{(r+s)\beta}(\lambda^{k+1}-\lambda^{k}) - \frac{r}{r+s}(y^{k+1}-y^{k}+z^{k+1}-z^{k}).$

If both $\nabla\phi$ and $\nabla l$ are Lipschitz continuous, then $\varepsilon^{k+1} \coloneqq (\varepsilon_{1}^{k+1}, \varepsilon_{2}^{k+1}, \varepsilon_{3}^{k+1}, \varepsilon_{4}^{k+1})\in\partial\mathcal{L}_{\beta}(\omega^{k+1})$ and there exists a constant $\zeta>0$ such that

(3.27) $d(0, \partial\mathcal{L}_{\beta}(\omega^{k+1})) \le \zeta\|v^{k+1}-v^{k}\|, \quad \forall k\ge 0.$

Proof

First, according to (3.9), there exists $\xi^{k+1}\in\partial f(x^{k+1})$ such that

$\xi^{k+1} + \nabla_{x}l(x^{k+1},y^{k},z^{k}) - A^{\top}\lambda^{k} + \beta A^{\top}(Ax^{k+1}+y^{k}+z^{k}-b) + \nabla\phi(x^{k+1}) - \nabla\phi(x^{k}) = 0.$

Combining (3.26) and (3.7), one has

$\varepsilon_{1}^{k+1} = \beta A^{\top}(y^{k+1}-y^{k}) + \beta A^{\top}(z^{k+1}-z^{k}) - A^{\top}(\lambda^{k+1}-\lambda^{k}) - [\nabla\phi(x^{k+1})-\nabla\phi(x^{k})] + \nabla_{x}l(v^{k+1}) - \nabla_{x}l(x^{k+1},y^{k},z^{k}) + \{\xi^{k+1} + \nabla_{x}l(x^{k+1},y^{k},z^{k}) - A^{\top}\lambda^{k} + \beta A^{\top}(Ax^{k+1}+y^{k}+z^{k}-b) + \nabla\phi(x^{k+1}) - \nabla\phi(x^{k})\}$
$= \xi^{k+1} + \nabla_{x}l(v^{k+1}) - A^{\top}\lambda^{k+1} + \beta A^{\top}(Ax^{k+1}+y^{k+1}+z^{k+1}-b) \in \partial_{x}\mathcal{L}_{\beta}(\omega^{k+1}).$

Second, substituting $\lambda^{k+\frac{1}{2}}$ from (3.5) into (3.9) yields

$s\beta(Ax^{k+1}+y^{k+1}+z^{k+1}-b) = \nabla g(y^{k+1}) + \nabla_{y}l(x^{k+1},y^{k+1},z^{k}) - \lambda^{k+1} + \beta(Ax^{k+1}+y^{k+1}+z^{k}-b).$

Combining (3.7), (3.19), and (3.26), one has

$\varepsilon_{2}^{k+1} = -\frac{s}{r+s}(\lambda^{k+1}-\lambda^{k}) + \frac{rs\beta}{r+s}(y^{k+1}-y^{k}+z^{k+1}-z^{k}) - \nabla_{y}l(x^{k+1},y^{k+1},z^{k}) + \nabla_{y}l(v^{k+1}) + \beta(z^{k+1}-z^{k})$
$= s\beta(Ax^{k+1}+y^{k+1}+z^{k+1}-b) - \nabla_{y}l(x^{k+1},y^{k+1},z^{k}) + \nabla_{y}l(v^{k+1}) + \beta(z^{k+1}-z^{k})$
$= \nabla g(y^{k+1}) + \nabla_{y}l(v^{k+1}) - \lambda^{k+1} + \beta(Ax^{k+1}+y^{k+1}+z^{k+1}-b) = \nabla_{y}\mathcal{L}_{\beta}(\omega^{k+1}).$

Similarly, substituting (3.5) into the optimality condition for $z^{k+1}$ in (3.9), we obtain

$s\beta(Ax^{k+1}+y^{k+1}+z^{k+1}-b) = \nabla h(z^{k+1}) + \nabla_{z}l(v^{k+1}) - \lambda^{k+1} + \beta(Ax^{k+1}+y^{k+1}+z^{k+1}-b).$

By combining (3.7), (3.19), and (3.26), we have

$\varepsilon_{3}^{k+1} = s\beta\left[-\frac{1}{(r+s)\beta}(\lambda^{k+1}-\lambda^{k}) + \frac{r}{r+s}(y^{k+1}-y^{k}+z^{k+1}-z^{k})\right] = s\beta(Ax^{k+1}+y^{k+1}+z^{k+1}-b)$
$= \nabla h(z^{k+1}) + \nabla_{z}l(v^{k+1}) - \lambda^{k+1} + \beta(Ax^{k+1}+y^{k+1}+z^{k+1}-b) = \nabla_{z}\mathcal{L}_{\beta}(\omega^{k+1}).$

Again, from (3.7), (3.19), and (3.26), we know that

$\varepsilon_{4}^{k+1} = \frac{1}{(r+s)\beta}(\lambda^{k+1}-\lambda^{k}) - \frac{r}{r+s}(y^{k+1}-y^{k}+z^{k+1}-z^{k}) = -(Ax^{k+1}+y^{k+1}+z^{k+1}-b) \overset{(3.7)}{=} \nabla_{\lambda}\mathcal{L}_{\beta}(\omega^{k+1}).$

So, $\varepsilon^{k+1}\in\partial\mathcal{L}_{\beta}(\omega^{k+1})$.

Finally, combining (3.26) with the Lipschitz continuity of $\nabla\phi$ and $\nabla l$, there exists $\zeta_{1}>0$ such that

$\|\varepsilon^{k+1}\| \le \zeta_{1}(\|x^{k+1}-x^{k}\| + \|y^{k+1}-y^{k}\| + \|z^{k+1}-z^{k}\| + \|\lambda^{k+1}-\lambda^{k}\|).$

In addition, by (3.21), there exists $\zeta_{2}>0$ such that

$\|\lambda^{k+1}-\lambda^{k}\| \le \zeta_{2}(\|x^{k+1}-x^{k}\| + \|y^{k+1}-y^{k}\| + \|z^{k+1}-z^{k}\|).$

Considering the aforementioned two inequalities and $\varepsilon^{k+1}\in\partial\mathcal{L}_{\beta}(\omega^{k+1})$, similarly to the proof of Lemma 9 in [20], one now knows that

$d(0, \partial\mathcal{L}_{\beta}(\omega^{k+1})) \le \|\varepsilon^{k+1}\| \le \zeta_{1}(1+\zeta_{2})(\|x^{k+1}-x^{k}\| + \|y^{k+1}-y^{k}\| + \|z^{k+1}-z^{k}\|) \le \zeta\|v^{k+1}-v^{k}\|,$

where $\zeta \coloneqq 2\zeta_{1}(1+\zeta_{2})$. The proof is completed.□

Theorem 3.10

(Strong convergence) If both Assumptions 3.2 and 3.5 hold, $\{\omega^{k}\}$ is bounded, and $\mathcal{L}_{\beta}(\cdot)$ satisfies the KLP, then

$\sum_{k=0}^{+\infty}\|\omega^{k+1}-\omega^{k}\| < +\infty.$

Furthermore, $\{\omega^{k}\}$ converges to a critical point of $\mathcal{L}_{\beta}(\cdot)$.

Proof

According to Theorem 3.8 (iii), we have $\mathcal{L}_{\beta}(\omega^{*}) = \lim_{k\to+\infty}\mathcal{L}_{\beta}(\omega^{k}) = \inf_{k}\mathcal{L}_{\beta}(\omega^{k})$ for all $\omega^{*}\in\Omega$. Next, we consider two cases.

(Case I) If there exists an integer $k_{0}$ such that $\mathcal{L}_{\beta}(\omega^{k_{0}}) = \mathcal{L}_{\beta}(\omega^{*})$, then from (3.11) and Lemma 3.6, one can obtain

$\delta\|v^{k+1}-v^{k}\|^{2} \le \mathcal{L}_{\beta}(\omega^{k}) - \mathcal{L}_{\beta}(\omega^{k+1}) \le \mathcal{L}_{\beta}(\omega^{k_{0}}) - \mathcal{L}_{\beta}(\omega^{*}) = 0, \quad \forall k\ge k_{0}.$

Thereby, $v^{k+1} = v^{k}$ holds for all $k\ge k_{0}$. Together with (3.21), it follows that $\lambda^{k+1} = \lambda^{k}$ for all $k\ge k_{0}$. Further, $\omega^{k+1} = \omega^{k_{0}}\in\Omega$ holds for all $k\ge k_{0}$, and the conclusion follows.

(Case II) Suppose that $\mathcal{L}_{\beta}(\omega^{k}) > \mathcal{L}_{\beta}(\omega^{*})$ holds for all $k\ge 0$. As is known from Theorem 3.8 (iii), $\mathcal{L}_{\beta}(\cdot)$ is constant on $\Omega$; thus, $\mathcal{L}_{\beta}(\cdot)$ satisfies the uniform KLP of Lemma 2.12. For the parameters $\zeta$ and $\eta$ in Lemma 2.12, since $d(\omega^{k},\Omega)\to 0$ and $\mathcal{L}_{\beta}(\omega^{*}) = \inf_{k}\mathcal{L}_{\beta}(\omega^{k})$, there exists a positive integer $\tilde{k}$ such that

$d(\omega^{k},\Omega) < \zeta, \quad \mathcal{L}_{\beta}(\omega^{*}) < \mathcal{L}_{\beta}(\omega^{k}) < \mathcal{L}_{\beta}(\omega^{*}) + \eta, \quad \forall k>\tilde{k}.$

Thus, by Lemma 2.12, we have

$\varphi'(\mathcal{L}_{\beta}(\omega^{k}) - \mathcal{L}_{\beta}(\omega^{*}))\, d(0, \partial\mathcal{L}_{\beta}(\omega^{k})) \ge 1, \quad \forall k>\tilde{k}.$

In addition, the following inequality holds by the concavity of $\varphi$:

$\varphi(\mathcal{L}_{\beta}(\omega^{k}) - \mathcal{L}_{\beta}(\omega^{*})) - \varphi(\mathcal{L}_{\beta}(\omega^{k+1}) - \mathcal{L}_{\beta}(\omega^{*})) \ge \varphi'(\mathcal{L}_{\beta}(\omega^{k}) - \mathcal{L}_{\beta}(\omega^{*}))(\mathcal{L}_{\beta}(\omega^{k}) - \mathcal{L}_{\beta}(\omega^{k+1})).$

Simultaneously, because $\frac{1}{\varphi'(\mathcal{L}_{\beta}(\omega^{k}) - \mathcal{L}_{\beta}(\omega^{*}))} \le d(0, \partial\mathcal{L}_{\beta}(\omega^{k})) \le \zeta\|v^{k}-v^{k-1}\|$ by (3.27) and $\varphi'(\mathcal{L}_{\beta}(\omega^{k}) - \mathcal{L}_{\beta}(\omega^{*})) > 0$, one has

$\mathcal{L}_{\beta}(\omega^{k}) - \mathcal{L}_{\beta}(\omega^{k+1}) \le \frac{\varphi(\mathcal{L}_{\beta}(\omega^{k}) - \mathcal{L}_{\beta}(\omega^{*})) - \varphi(\mathcal{L}_{\beta}(\omega^{k+1}) - \mathcal{L}_{\beta}(\omega^{*}))}{\varphi'(\mathcal{L}_{\beta}(\omega^{k}) - \mathcal{L}_{\beta}(\omega^{*}))} \le \zeta\|v^{k}-v^{k-1}\|\,[\varphi(\mathcal{L}_{\beta}(\omega^{k}) - \mathcal{L}_{\beta}(\omega^{*})) - \varphi(\mathcal{L}_{\beta}(\omega^{k+1}) - \mathcal{L}_{\beta}(\omega^{*}))].$

Combining Lemma 3.4 with the aforementioned inequalities, we have

$\delta\|v^{k+1}-v^{k}\|^{2} \le \zeta\|v^{k}-v^{k-1}\|\,[\varphi(\mathcal{L}_{\beta}(\omega^{k}) - \mathcal{L}_{\beta}(\omega^{*})) - \varphi(\mathcal{L}_{\beta}(\omega^{k+1}) - \mathcal{L}_{\beta}(\omega^{*}))], \quad \forall k>\tilde{k},$

which further implies

$\|v^{k+1}-v^{k}\| \le \sqrt{\|v^{k}-v^{k-1}\|}\,\sqrt{\frac{\zeta}{\delta}[\varphi(\mathcal{L}_{\beta}(\omega^{k}) - \mathcal{L}_{\beta}(\omega^{*})) - \varphi(\mathcal{L}_{\beta}(\omega^{k+1}) - \mathcal{L}_{\beta}(\omega^{*}))]}, \quad \forall k>\tilde{k}.$

According to the aforementioned inequality and $2\sqrt{ab} \le a + b$ for all $a,b\ge 0$, we can obtain

(3.28) $2\|v^{i+1}-v^{i}\| \le \|v^{i}-v^{i-1}\| + \frac{\zeta}{\delta}[\varphi(\mathcal{L}_{\beta}(\omega^{i}) - \mathcal{L}_{\beta}(\omega^{*})) - \varphi(\mathcal{L}_{\beta}(\omega^{i+1}) - \mathcal{L}_{\beta}(\omega^{*}))], \quad \forall i>\tilde{k}.$

Then, summing (3.28) from $i = k+1\ (\ge\tilde{k}+1)$ to $i = q$ and rearranging terms, we have

$2\sum_{i=k+1}^{q}\|v^{i+1}-v^{i}\| \le \sum_{i=k+1}^{q}\|v^{i}-v^{i-1}\| + \frac{\zeta}{\delta}[\varphi(\mathcal{L}_{\beta}(\omega^{k+1}) - \mathcal{L}_{\beta}(\omega^{*})) - \varphi(\mathcal{L}_{\beta}(\omega^{q+1}) - \mathcal{L}_{\beta}(\omega^{*}))].$

Since $\varphi(\mathcal{L}_{\beta}(\omega^{q+1}) - \mathcal{L}_{\beta}(\omega^{*})) \ge 0$, one further has

$\sum_{i=k+1}^{q}\|v^{i+1}-v^{i}\| \le \sum_{i=k+1}^{q}(\|v^{i}-v^{i-1}\| - \|v^{i+1}-v^{i}\|) + \frac{\zeta}{\delta}\varphi(\mathcal{L}_{\beta}(\omega^{k+1}) - \mathcal{L}_{\beta}(\omega^{*})) \le \|v^{k+1}-v^{k}\| + \frac{\zeta}{\delta}\varphi(\mathcal{L}_{\beta}(\omega^{k+1}) - \mathcal{L}_{\beta}(\omega^{*})).$

Taking the limit $q\to+\infty$ in the aforementioned inequality, one then has

(3.29) $\sum_{i=k+1}^{+\infty}\|v^{i+1}-v^{i}\| \le \|v^{k+1}-v^{k}\| + \frac{\zeta}{\delta}\varphi(\mathcal{L}_{\beta}(\omega^{k+1}) - \mathcal{L}_{\beta}(\omega^{*})) < +\infty, \quad \forall k\ge\tilde{k}.$

In particular, when $k = \tilde{k}$, we have

$\sum_{i=\tilde{k}+1}^{+\infty}\|v^{i+1}-v^{i}\| \le \|v^{\tilde{k}+1}-v^{\tilde{k}}\| + \frac{\zeta}{\delta}\varphi(\mathcal{L}_{\beta}(\omega^{\tilde{k}+1}) - \mathcal{L}_{\beta}(\omega^{*})) < +\infty.$

So $\sum_{k=0}^{+\infty}\|v^{k+1}-v^{k}\| < +\infty$, and it follows from (3.21) that $\sum_{k=0}^{+\infty}\|\lambda^{k+1}-\lambda^{k}\| < +\infty$. Finally, from $\omega^{k} = (v^{k},\lambda^{k})$, we obtain

$\sum_{k=0}^{+\infty}\|\omega^{k+1}-\omega^{k}\| \le \sum_{k=0}^{+\infty}\|v^{k+1}-v^{k}\| + \sum_{k=0}^{+\infty}\|\lambda^{k+1}-\lambda^{k}\| < +\infty.$

Hence, $\{\omega^{k}\}$ is a Cauchy sequence and therefore convergent. From Theorem 3.8, it can be seen that $\{\omega^{k}\}$ converges to a critical point of $\mathcal{L}_{\beta}(\cdot)$. The proof is completed.□

Theorem 3.11

(Convergence rate analysis) Suppose that both Assumptions 3.2 and 3.5 hold, $\{\omega^{k}\}$ generated by 3-BPRSM is bounded, $\mathcal{L}_{\beta}(\cdot)$ satisfies the KLP, and the associate function is $\varphi(t) = c\,t^{1-\theta}$ with $\theta\in[0,1)$ and $c>0$. Then the following conclusions hold:

  1. $\lim_{k\to+\infty}\omega^{k} = \omega^{*}\in\operatorname{crit}\mathcal{L}_{\beta}$.

  2. If $\theta = 0$, then $\{\omega^{k}\}$ converges after a finite number of iterations, i.e., there exists $k$ such that $\omega^{k} = \omega^{*}$.

  3. If $\theta\in(0,\frac{1}{2}]$, then there is a constant $\tau\in[0,1)$ such that $\|\omega^{k}-\omega^{*}\| = O(\tau^{k})$; thus, 3-BPRSM converges linearly.

  4. If $\theta\in(\frac{1}{2},1)$, then $\|\omega^{k}-\omega^{*}\| = O(k^{-\frac{1-\theta}{2\theta-1}})$, and so 3-BPRSM converges sublinearly.

Proof

First, conclusion (i) holds from Theorem 3.10, and $\mathcal{L}_{\beta}(\cdot)$ satisfies the KLP at $\omega^{*}$. Now let us analyze conclusions (ii)-(iv).

For the case $\theta = 0$, one has $\varphi(t) = ct$ and $\varphi'(t) = c$. If there exists a positive integer $k_{0}$ such that $\mathcal{L}_{\beta}(\omega^{k_{0}}) = \mathcal{L}_{\beta}(\omega^{*})$, then, combining with the proof of (Case I) in Theorem 3.10, $\{\omega^{k}\}$ terminates at iteration $k_{0}$. If $\mathcal{L}_{\beta}(\omega^{k}) > \mathcal{L}_{\beta}(\omega^{*})$ holds for all $k$, then we have $\varphi'(\mathcal{L}_{\beta}(\omega^{k}) - \mathcal{L}_{\beta}(\omega^{*}))\,d(0,\partial\mathcal{L}_{\beta}(\omega^{k})) = c\,d(0,\partial\mathcal{L}_{\beta}(\omega^{k})) \ge 1$ for all $k>\tilde{k}$ by the proof of (Case II) in Theorem 3.10, which contradicts (3.27) since $\|v^{k+1}-v^{k}\|\to 0$. Thus, there exists a positive integer $\hat{k}\,(>\tilde{k})$ such that $\mathcal{L}_{\beta}(\omega^{k}) = \mathcal{L}_{\beta}(\omega^{*})$ for all $k>\hat{k}$, and conclusion (ii) holds from (Case I) of Theorem 3.10.

Next, let us discuss the case $\theta>0$. Setting $\Xi_{k} \coloneqq \sum_{i=k}^{+\infty}\|v^{i+1}-v^{i}\|$, it follows from (3.29) that

(3.30) $\Xi_{k+1} \le (\Xi_{k}-\Xi_{k+1}) + \frac{\zeta}{\delta}\varphi(\mathcal{L}_{\beta}(\omega^{k+1}) - \mathcal{L}_{\beta}(\omega^{*})), \quad \forall k\ge\tilde{k}.$

Meanwhile, since $\mathcal{L}_{\beta}(\cdot)$ satisfies the KLP at $\omega^{*}$, one immediately knows that

$\varphi'(\mathcal{L}_{\beta}(\omega^{k+1}) - \mathcal{L}_{\beta}(\omega^{*}))\, d(0, \partial\mathcal{L}_{\beta}(\omega^{k+1})) \ge 1.$

In combination with $\varphi(t) = c\,t^{1-\theta}$, the aforementioned inequality is equivalent to

(3.31) $(\mathcal{L}_{\beta}(\omega^{k+1}) - \mathcal{L}_{\beta}(\omega^{*}))^{\theta} \le c(1-\theta)\, d(0, \partial\mathcal{L}_{\beta}(\omega^{k+1})).$

From (3.27), we also know that

(3.32) $d(0, \partial\mathcal{L}_{\beta}(\omega^{k+1})) \le \zeta\|v^{k+1}-v^{k}\| = \zeta(\Xi_{k}-\Xi_{k+1}).$

Combining (3.31) and (3.32), there exists $\gamma>0$ such that

$\varphi(\mathcal{L}_{\beta}(\omega^{k+1}) - \mathcal{L}_{\beta}(\omega^{*})) = c(\mathcal{L}_{\beta}(\omega^{k+1}) - \mathcal{L}_{\beta}(\omega^{*}))^{1-\theta} \le \gamma(\Xi_{k}-\Xi_{k+1})^{\frac{1-\theta}{\theta}}.$

According to (3.30), one has

(3.33) $\Xi_{k+1} \le (\Xi_{k}-\Xi_{k+1}) + \frac{\zeta\gamma}{\delta}(\Xi_{k}-\Xi_{k+1})^{\frac{1-\theta}{\theta}}, \quad \forall k\ge\tilde{k}.$

Based on (3.33), the remaining proof is completed with the help of the relevant conclusions of Attouch and Bolte [45].

By [45], if $\theta\in(0,\frac{1}{2}]$, then there exist $c_{1}>0$ and $\tau\in[0,1)$ such that

(3.34) $\|v^{k}-v^{*}\| \le c_{1}\tau^{k}, \quad\text{i.e.,}\quad \|x^{k}-x^{*}\| = O(\tau^{k}),\ \|y^{k}-y^{*}\| = O(\tau^{k}),\ \|z^{k}-z^{*}\| = O(\tau^{k}).$

Further, if $\theta\in(\frac{1}{2},1)$, then there exists $c_{2}>0$ such that

(3.35) $\|v^{k}-v^{*}\| \le c_{2}k^{-\frac{1-\theta}{2\theta-1}}, \quad\text{i.e.,}\quad \|x^{k}-x^{*}\| = O(k^{-\frac{1-\theta}{2\theta-1}}),\ \|y^{k}-y^{*}\| = O(k^{-\frac{1-\theta}{2\theta-1}}),\ \|z^{k}-z^{*}\| = O(k^{-\frac{1-\theta}{2\theta-1}}).$

In addition, from Lemma 3.1, (3.20), and the Lipschitz continuity of $\nabla h$ and $\nabla l$, we obtain

(3.36) $\|\lambda^{k}-\lambda^{*}\| = \|\nabla h(z^{k})-\nabla h(z^{*}) + \nabla_{z}l(v^{k})-\nabla_{z}l(v^{*}) + \beta(1-s)[A(x^{k}-x^{*})+(y^{k}-y^{*})+(z^{k}-z^{*})]\| = O(\|v^{k}-v^{*}\|).$

Finally, combining (3.34)-(3.36), conclusions (iii) and (iv) are verified.□

4 Numerical simulations

Sparse signal reconstruction from incomplete observation data sets is a prominent research area in compressed sensing. The objective is to find the sparsest solution of a system of linear equations [15, p. 14]:

(4.1) $\min\ \|x\|_{0} \quad \text{s.t.}\quad Cx = b,$

where $C\in\mathbb{R}^{m\times n}$ is a measurement matrix, $b\in\mathbb{R}^{m}$ is the observation data, $e>0$ is a regularization parameter, and $\|x\|_{0}$ is the number of nonzero elements of $x$. How can an $n$-dimensional sparse signal $x$ be recovered? Wang et al. described in [15] that the system (4.1), its linear transformations, and its regularized versions are NP-hard problems. To surmount this difficulty, $\ell_{0}$ regularization is often relaxed to $\ell_{1/2}$ regularization. Instead of directly solving (4.1), researchers commonly solve the following constrained optimization problem [10,15] (see also [34, problem (79), p. 11]):

(4.2) $\min\ e\|x\|_{1/2}^{1/2} + \frac{1}{2}\|y\|^{2} \quad \text{s.t.}\quad Cx - y = b,$

where $\|x\|_{1/2} = \left(\sum_{i=1}^{n}|x_{i}|^{1/2}\right)^{2}$.
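As a quick check of the notation (a small illustration, not from the original article), the $\ell_{1/2}$ quasi-norm and hence the regularization term in (4.2) can be evaluated as follows:

    import numpy as np

    def l_half_quasinorm(x):
        # ||x||_{1/2} = (sum_i |x_i|^{1/2})^2, so ||x||_{1/2}^{1/2} = sum_i |x_i|^{1/2}
        return np.sum(np.sqrt(np.abs(x))) ** 2

    x = np.array([4.0, 0.0, -1.0])
    # the sum of square roots is 2 + 0 + 1 = 3, hence ||x||_{1/2} = 9 and ||x||_{1/2}^{1/2} = 3
    assert np.isclose(l_half_quasinorm(x), 9.0)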

Based on (4.2) and the natural presence of a nonseparable structure, we now consider the following optimization problem with linear constraint:

(4.3) $\min\ e\|x\|_{1/2}^{1/2} + \frac{1}{2}\|y\|^{2} + \frac{1}{2}\|D_{1}x + D_{2}y + z\|^{2} \quad \text{s.t.}\quad Ax + y + z = b,$

which was studied by Chao et al. in [34, (81), p. 12] when $A_{2} = E$, the identity matrix.

To evaluate the effectiveness of 3-BPRSM, we apply it to the nonconvex optimization problem (4.3) with $\phi(x) = \frac{1}{2}\|x\|^{2}_{\mu_{1}I - \beta A^{\top}A}$, which leads to the following iterative scheme:

(4.4) $x^{k+1} = \mathcal{S}\!\left(\frac{1}{\mu_{1}}\left[G_{1}x^{k} - D_{1}^{\top}(D_{1}x^{k}+D_{2}y^{k}+z^{k}) - \beta A^{\top}\!\left(y^{k}+z^{k}-b-\frac{\lambda^{k}}{\beta}\right)\right];\ \frac{2e}{\mu_{1}}\right),$
$\lambda^{k+\frac{1}{2}} = \lambda^{k} - r\beta(Ax^{k+1}+y^{k}+z^{k}-b),$
$y^{k+1} = \frac{1}{1+\beta}\left[-D_{2}^{\top}(D_{1}x^{k+1}+D_{2}y^{k}+z^{k}) - \beta\left(Ax^{k+1}+z^{k}-b-\frac{\lambda^{k+\frac{1}{2}}}{\beta}\right)\right],$
$z^{k+1} = \frac{1}{1+\beta}\left[-(D_{1}x^{k+1}+D_{2}y^{k+1}) - \beta\left(Ax^{k+1}+y^{k+1}-b-\frac{\lambda^{k+\frac{1}{2}}}{\beta}\right)\right],$
$\lambda^{k+1} = \lambda^{k+\frac{1}{2}} - s\beta(Ax^{k+1}+y^{k+1}+z^{k+1}-b),$

where $G_{1} = \mu_{1}I - \beta A^{\top}A$ comes from the Bregman kernel $\phi$, and $\mathcal{S}(\cdot\,;\cdot)$ denotes the half shrinkage (half thresholding) operator [10], defined componentwise as $\mathcal{S}(x,\tau) = (s_{\tau}(x_{1}), \ldots, s_{\tau}(x_{n}))^{\top}$ with

$s_{\tau}(x_{i}) = \begin{cases}\frac{2x_{i}}{3}\left(1 + \cos\frac{2}{3}\big(\pi - \varphi_{\tau}(x_{i})\big)\right), & |x_{i}| > \frac{\sqrt[3]{54}}{4}\tau^{2/3},\\ 0, & \text{otherwise},\end{cases}$

and

$\varphi_{\tau}(x) = \arccos\left(\frac{\tau}{8}\left(\frac{|x|}{3}\right)^{-3/2}\right).$
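To make the operator concrete, here is a minimal Python sketch of the componentwise half shrinkage operator exactly as reconstructed above; it is a hedged illustration of the half-thresholding rule attributed to [10], and the threshold constant and trigonometric formula should be checked against the original reference before reuse.

    import numpy as np

    def half_shrinkage(x, tau):
        # Componentwise half-thresholding operator S(x, tau).
        x = np.asarray(x, dtype=float)
        out = np.zeros_like(x)
        threshold = (54.0 ** (1.0 / 3.0)) / 4.0 * tau ** (2.0 / 3.0)
        mask = np.abs(x) > threshold
        # phi_tau(x) = arccos((tau / 8) * (|x| / 3)^(-3/2)) on the active components
        phi = np.arccos((tau / 8.0) * (np.abs(x[mask]) / 3.0) ** (-1.5))
        out[mask] = (2.0 * x[mask] / 3.0) * (1.0 + np.cos((2.0 / 3.0) * (np.pi - phi)))
        return out

Above the threshold the arccos argument stays in $[0, \sqrt{2}/2]$, so the operator is well defined on the active components.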

In this example, the measurement matrix $A$ is generated from a normal distribution $N(0,1)$ and normalized so that its columns have unit ($\ell_{1/2}$) norm. The signals $x$ and $y$ are generated with 100 nonzero entries drawn from a Gaussian distribution, the initial values $x^{0}$, $y^{0}$, $z^{0}$, and $\lambda^{0}$ are all set to $0$, and the observation vector is generated from the ground-truth signals as $b = Ax + y + z + \nu$ with $\nu\sim N(0, 10^{-3}I)$. We set $\mu_{1} = 30$, $\beta = 20$, and the regularization parameter $c = 0.1$. The residual at iteration $k$ is defined as $r^{k} = Ax^{k} + y^{k} + z^{k} - b$, and the termination criterion is

$\|r^{k}\|_{2} \le \sqrt{m}\cdot 10^{-4}.$

To validate the effectiveness, we compare 3-BPRSM with LBADMM [34]. The numerical results are presented in Table 1. The code is implemented in MATLAB R2022a, and the computations are performed on a computer running Windows 10 with an Intel(R) Core(TM) i7-8550U 1.80 GHz CPU and 8 GB of memory. The reported results include the number of iterations (Iter($k$)) and the objective function value ($f$-val). When $n = 1{,}500$ and $m = 1{,}500$, 3-BPRSM takes a total of 435 iterations and converges to the objective value 260.78. As the values of $n$ and $m$ increase, the iteration time grows, but the convergence of the algorithm is still ensured, and 3-BPRSM outperforms LBADMM.

Table 1

Comparison of iteration performance between 3-BPRSM and LBADMM

(n, m, m)                Alg        Metric    k=30      k=60      k=90      k=120     k=150     min ‖r‖₂ / f-val
(1,500, 1,500, 1,500)    LBADMM     ‖r‖₂      0.0149    0.0095    0.005     0.0038
                                    f-val     426.08    346.53    324.89    318.91
                         3-BPRSM    ‖r‖₂      0.6265    0.0569    0.0245    0.0177    0.0146    0.0039
                                    f-val     438.59    319.35    290.67    280.04    274.69    260.78
(3,000, 3,000, 3,000)    LBADMM     ‖r‖₂      0.0198    0.0096    0.0062    0.0054
                                    f-val     776.68    642.71    606.32    603.88
                         3-BPRSM    ‖r‖₂      0.8995    0.1094    0.0392    0.0266    0.0235    0.0054
                                    f-val     799.43    588.99    536.66    514.98    503.62    484.49
(6,000, 6,000, 6,000)    LBADMM     ‖r‖₂      0.0295    0.0161    0.0103    0.0075
                                    f-val     1482.94   1212.92   1135.46   1121.64
                         3-BPRSM    ‖r‖₂      0.6311    0.0728    0.0455    0.0326    0.0295    0.0077
                                    f-val     1518.09   1126.23   1043.22   1010.47   991.68    959.42

Part of the computational results are presented in Figures 2, 3, and 4, which show the trends of the objective value and of the residual $\|r\|_{2}$, where $r^{k} = Ax^{k} + y^{k} + z^{k} - b$.

Figure 2: Comparison of convergence when n = 1,500 and m = 1,500. (a) The objective value and (b) ‖r‖₂ for n = 1,500 and m = 1,500.

Figure 3: Comparison of convergence when n = 3,000 and m = 3,000. (a) The objective value and (b) ‖r‖₂ for n = 3,000 and m = 3,000.

Figure 4: Comparison of convergence when n = 6,000 and m = 6,000. (a) The objective value and (b) ‖r‖₂ for n = 6,000 and m = 6,000.

5 Concluding remarks

In this article, we propose a novel 3-BPRSM tailored for three-block optimization problems with a nonconvex and nonseparable structure. The efficacy of 3-BPRSM is demonstrated on the problem of sparse signal reconstruction in compressed sensing with incomplete observation data; the method is able to handle nonconvexity and nonseparability and is therefore particularly suited to this setting. By integrating 3-BPRSM, we aim to improve the reconstruction of sparse signals and to contribute to the wider field of compressed sensing. Finally, a comparative analysis demonstrates the superior performance of 3-BPRSM over LBADMM.

In summary, we have not only theoretically verified the convergence and convergence rates of 3-BPRSM but also validated its effectiveness through a numerical example motivated by practical applications. However, despite the encouraging performance of 3-BPRSM, further refinement is clearly needed, particularly in the context of sparse signal reconstruction from incomplete observation datasets in compressive sensing. We aim to identify examples that align more closely with real-world scenarios in order to articulate the practical significance of our algorithm more clearly.

Moreover, we believe that the following open questions, posed with reference to [46–48], will play a positive guiding role in future research:

  1. In problem (1.3), the vectors y and z share the same dimension m, which is a special case. Can problem (1.3) be generalized so that y and z have different dimensions? With a linear constraint of the form Ax + By + Cz = 0, the model extends to a more general constrained optimization (a sketch of one such generalization is given after this list). Can convergence still be guaranteed in this case?

  2. In the method proposed in this article, a Bregman distance is added only to the x-subproblem. If Bregman distances were also added to the y- and z-subproblems, could the numerical performance be improved further?

  3. Is it possible to handle the nonseparable structure in the objective function by linear approximation, so that subproblems which are otherwise difficult to solve because of the nonseparable coupling become tractable (see the second display after this list)?

  4. In validating the effectiveness of Algorithm 1, we considered a single synthetic numerical example. How can the 3-BPRSM proposed in this article be applied to real practical problems? Addressing this would demonstrate the applicability of the algorithm in realistic scenarios and provide the reader with more practical validation in future work.
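
To make the first and third open questions more concrete, the two displays below are informal sketches only: the functions f, g, h, the coupling term H, the Bregman kernel φ, and the penalty parameter β are notation introduced here for illustration and are not taken verbatim from the formulation in this article. A generalized three-block model in the spirit of question 1 could read

    \min_{x\in\mathbb{R}^{n},\, y\in\mathbb{R}^{m_{1}},\, z\in\mathbb{R}^{m_{2}}} \; f(x)+g(y)+h(z)+H(x,y,z) \quad \text{s.t.} \quad Ax+By+Cz=0,

so that y and z no longer need to share the same dimension m, provided A, B, and C have compatible sizes. For question 3, one could linearize the nonseparable coupling term H around the current iterate in the x-subproblem while keeping the Bregman proximal term, giving an update of the (assumed) form

    x^{k+1}\in\arg\min_{x}\Big\{ f(x)+\langle \nabla_{x}H(x^{k},y^{k},z^{k}),\, x-x^{k}\rangle + \frac{\beta}{2}\,\Vert Ax+By^{k}+Cz^{k}\Vert^{2} + \Delta_{\varphi}(x,x^{k}) \Big\},

where Δ_φ(·,·) denotes the Bregman distance with kernel φ and the multiplier term of the augmented Lagrangian is omitted for brevity; whether such a linearized scheme preserves the convergence guarantees established for 3-BPRSM is exactly the open point.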

Acknowledgements

The authors are grateful for the reviewer’s valuable comments and constructive remarks that improved the manuscript.

  1. Funding information: This work was partially supported by the Innovation Fund of Postgraduate, Sichuan University of Science & Engineering (Y2023333) and the Scientific Research and Innovation Team Program of Sichuan University of Science and Technology (SUSE652B002).

  2. Author contributions: All authors have read and agreed to the final manuscript. YZ, LH, and XH conceived the study and designed the models; YZ and XH performed the formal analysis, wrote the main manuscript text, developed the model code, and performed the simulations; LH supervised, reviewed, and edited the manuscript.

  3. Conflict of interest: No potential conflict of interest was reported by the authors.

  4. Ethical approval: The conducted research is not related to either human or animal use.

  5. Data availability statement: Data sharing is not applicable to this article as no data sets were generated or analyzed during this study.

References

[1] M. Umar, Z. Sabir, M. A. Z. Raja, H. M. Baskonus, M. R. Ali, and N. A. Shah, Heuristic computing with sequential quadratic programming for solving a nonlinear hepatitis B virus model, Math. Comput. Simulat. 212 (2023), 234–248, DOI: https://doi.org/10.1016/j.matcom.2023.04.034.

[2] Z. Sabir, M. A. Z. Raja, H. M. Baskonus, and A. Ciancio, Numerical performance using the neural networks to solve the nonlinear biological quarantined based COVID-19 model, Atti Accad. Peloritana Pericolanti Cl. Sci. Fis. Mat. Natur. 101 (2023), no. 1, A10, 18 pp.

[3] A. Ayub, Z. Sabir, S. B. Said, H. M. Baskonus, R. Sadat, and M. R. Ali, Nature analysis of Cross fluid flow with inclined magnetic dipole, Microsyst. Technol. 29 (2023), 697–714, DOI: https://doi.org/10.1007/s00542-023-05438-5.

[4] Q. L. Chen, Z. Sabir, M. A. Z. Raja, W. Gao, and H. M. Baskonus, A fractional study based on the economic and environmental mathematical model, Alexandria Eng. J. 65 (2023), 761–770, DOI: https://doi.org/10.1016/j.aej.2022.09.033.

[5] Z. Sabir, M. Umar, M. A. Z. Raja, H. M. Baskonus, and W. Gao, Designing of Morlet wavelet as a neural network for a novel prevention category in the HIV system, Int. J. Biomath. 15 (2022), no. 4, Paper No. 2250012, 22 pp, DOI: https://doi.org/10.1142/S1793524522500127.

[6] Z. Sabir, H. A. Wahab, S. Javeed, and H. M. Baskonus, An efficient stochastic numerical computing framework for the nonlinear higher order singular models, Fractal Fract. 5 (2021), no. 4, Paper No. 176, 14 pp, DOI: https://doi.org/10.3390/fractalfract5040176.

[7] M. Umar, Z. Sabir, M. A. Z. Raja, H. M. Baskonus, S. W. Yao, and E. Ilhan, A novel study of Morlet neural networks to solve the nonlinear HIV infection system of latently infected cells, Results Phys. 25 (2021), Paper No. 104235, 13 pp, DOI: https://doi.org/10.1016/j.rinp.2021.104235.

[8] Y. G. Sánchez, Z. Sabir, H. Günerhan, and H. M. Baskonus, Analytical and approximate solutions of a novel nervous stomach mathematical model, Discrete Dyn. Nat. Soc. 2020 (2020), no. 1, 5063271, DOI: https://doi.org/10.1155/2020/5063271.

[9] W. Bian and X. J. Chen, Linearly constrained non-Lipschitz optimization for image restoration, SIAM J. Imaging Sci. 8 (2015), 2294–2322, DOI: https://doi.org/10.1137/140985639.

[10] Z. B. Xu, X. Y. Chang, F. M. Xu, and H. Zhang, L1/2 regularization: A thresholding representation theory and a fast solver, IEEE Trans. Neural Netw. Learn. Syst. 23 (2012), no. 7, 1013–1027, DOI: https://doi.org/10.1109/TNNLS.2012.2197412.

[11] Z. C. Lin, R. S. Liu, and H. Li, Linearized alternating direction method with parallel splitting and adaptive penalty for separable convex programs in machine learning, Mach. Learn. 99 (2015), 287–325, DOI: https://doi.org/10.1007/s10994-014-5469-5.

[12] B. P. W. Ames and M. Y. Hong, Alternating direction method of multipliers for penalized zero-variance discriminant analysis, Comput. Optim. Appl. 64 (2016), 725–754, DOI: https://doi.org/10.1007/s10589-016-9828-y.

[13] F. Lin, M. Fardad, and M. R. Jovanovic, Design of optimal sparse feedback gains via the alternating direction method of multipliers, IEEE Trans. Autom. Control 58 (2013), no. 9, 2426–2431, DOI: https://doi.org/10.1109/TAC.2013.2257618.

[14] K. Guo, D. R. Han, and T. T. Wu, Convergence of alternating direction method for minimizing sum of two nonconvex functions with linear constraints, Int. J. Comput. Math. 94 (2017), 1653–1669, DOI: https://doi.org/10.1080/00207160.2016.1227432.

[15] F. H. Wang, Z. B. Xu, and H. K. Xu, Convergence of alternating direction method with multipliers for non-convex composite problems, arXiv preprint (2014), https://arxiv.org/abs/1410.8625.

[16] M. Ohishi, K. Fukui, K. Okamura, Y. Itoh, and H. Yanagihara, Coordinate optimization for generalized fused Lasso, Commun. Stat. Theory Methods 50 (2021), 5955–5973, DOI: https://doi.org/10.1080/03610926.2021.1931888.

[17] X. L. Lu and X. B. Lü, ADMM for image restoration based on nonlocal simultaneous sparse Bayesian coding, Signal Process. Image Commun. 70 (2019), 157–173, DOI: https://doi.org/10.1016/j.image.2018.09.012.

[18] B. Wahlberg, S. Boyd, M. Annergren, and Y. Wang, An ADMM algorithm for a class of total variation regularized estimation problems, IFAC Proc. 45 (2012), 83–88, DOI: https://doi.org/10.3182/20120711-3-BE-2027.00310.

[19] M. Meselhi, R. Sarker, D. Essam, and S. Elsayed, A decomposition approach for large-scale non-separable optimization problems, Appl. Soft Comput. 115 (2022), 108168, DOI: https://doi.org/10.1016/j.asoc.2021.108168.

[20] P. J. Liu, J. B. Jian, B. He, and X. Z. Jiang, Convergence of Bregman Peaceman-Rachford splitting method for nonconvex nonseparable optimization, J. Oper. Res. Soc. China 11 (2023), 707–733, DOI: https://doi.org/10.1007/s40305-022-00411-x.

[21] B. S. He and X. M. Yuan, A class of ADMM-based algorithms for three-block separable convex programming, Comput. Optim. Appl. 70 (2018), no. 3, 791–826, DOI: https://doi.org/10.1007/s10589-018-9994-1.

[22] L. M. Zeng and J. Xie, Group variable selection via SCAD-L2, Statistics 48 (2014), no. 1, 49–66, DOI: https://doi.org/10.1080/02331888.2012.719513.

[23] C. Zhang, Y. Z. Song, X. J. Cai, and D. R. Han, An extended proximal ADMM algorithm for three-block nonconvex optimization problems, J. Comput. Appl. Math. 398 (2021), Paper No. 113681, pp. 1–14, DOI: https://doi.org/10.1016/j.cam.2021.113681.

[24] L. Yang, T. K. Pong, and X. J. Chen, Alternating direction method of multipliers for a class of nonconvex and nonsmooth problems with applications to background/foreground extraction, SIAM J. Imaging Sci. 10 (2017), 74–110, DOI: https://doi.org/10.1137/15M1027528.

[25] D. Gabay and B. Mercier, A dual algorithm for the solution of nonlinear variational problems via finite element approximation, Comput. Math. Appl. 2 (1976), 17–40, DOI: https://doi.org/10.1016/0898-1221(76)90003-1.

[26] J. Eckstein and D. Bertsekas, On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators, Math. Program. 55 (1992), 293–318, DOI: https://doi.org/10.1007/BF01581204.

[27] D. W. Peaceman and H. H. Rachford Jr., The numerical solution of parabolic and elliptic differential equations, J. Soc. Indust. Appl. Math. 3 (1955), 28–41, DOI: https://doi.org/10.1137/0103003.

[28] P. L. Combettes and V. R. Wajs, Signal recovery by proximal forward-backward splitting, Multiscale Model. Simul. 4 (2005), 1168–1200, DOI: https://doi.org/10.1137/050626090.

[29] D. Gabay, Chapter IX Applications of the method of multipliers to variational inequalities, Stud. Math. Appl. 15 (1983), 299–331, DOI: https://doi.org/10.1016/S0168-2024(08)70034-1.

[30] G. Y. Li and T. K. Pong, Global convergence of splitting methods for nonconvex composite optimization, SIAM J. Optim. 25 (2015), 2434–2460, DOI: https://doi.org/10.1137/140998135.

[31] Z. H. Jia, X. Gao, X. Cai, and D. R. Han, Local linear convergence of the alternating direction method of multipliers for nonconvex separable optimization problems, J. Optim. Theory Appl. 188 (2021), 1–25, DOI: https://doi.org/10.1007/s10957-020-01782-y.

[32] R. I. Bot and D. K. Nguyen, The proximal alternating direction method of multipliers in the nonconvex setting: convergence analysis and rates, Math. Oper. Res. 45 (2020), 682–712, DOI: https://doi.org/10.1287/moor.2019.1008.

[33] F. H. Wang, W. F. Cao, and Z. B. Xu, Convergence of multiblock Bregman ADMM for nonconvex composite problems, Sci. China Inf. Sci. 61 (2018), no. 12, Paper No. 122101, pp. 1–12, DOI: https://doi.org/10.1007/s11432-017-9367-6.

[34] M. T. Chao, Z. Deng, and J. B. Jian, Convergence of linear Bregman ADMM for nonconvex and nonsmooth problems with nonseparable structure, Complexity 2020 (2020), Paper No. 6237942, pp. 1–14, DOI: https://doi.org/10.1155/2020/6237942.

[35] C. H. Chen, B. S. He, Y. Y. Ye, and X. M. Yuan, The direct extension of ADMM for multiblock convex minimization problems is not necessarily convergent, Math. Program. 155 (2016), 57–79, DOI: https://doi.org/10.1007/s10107-014-0826-5.

[36] A. Bnouhachem and M. T. Rassias, A Bregman proximal Peaceman-Rachford splitting method for convex programming, Appl. Set-Valued Anal. Optim. 4 (2022), 129–143, DOI: https://doi.org/10.23952/asvao.4.2022.2.01.

[37] P. Li, Y. Shen, S. Jiang, Z. Liu, and C. H. Chen, Convergence study on strictly contractive Peaceman-Rachford splitting method for nonseparable convex minimization models with quadratic coupling terms, Comput. Optim. Appl. 78 (2021), 87–124, DOI: https://doi.org/10.1007/s10589-020-00229-4.

[38] F. X. Liu, L. L. Xu, Y. H. Sun, and D. R. Han, A proximal alternating direction method for multiblock coupled convex optimization, J. Ind. Manag. Optim. 15 (2018), no. 2, 723–737, DOI: https://doi.org/10.3934/jimo.2018067.

[39] M. T. Chao, C. Z. Cheng, and D. Y. Liang, A proximal block minimization method of multipliers with a substitution procedure, Optim. Methods Softw. 30 (2015), 825–842, DOI: https://doi.org/10.1080/10556788.2014.992432.

[40] L. M. Bregman, The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming, USSR Comput. Math. Math. Phys. 7 (1967), no. 3, 200–217, DOI: https://doi.org/10.1016/0041-5553(67)90040-7.

[41] A. Banerjee, S. Merugu, I. S. Dhillon, and J. Ghosh, Clustering with Bregman divergences, J. Mach. Learn. Res. 6 (2005), 1705–1749, https://dl.acm.org/doi/pdf/10.5555/1046920.1194902.

[42] H. Attouch, J. Bolte, P. Redont, and A. Soubeyran, Proximal alternating minimization and projection methods for nonconvex problems: an approach based on the Kurdyka-Łojasiewicz inequality, Math. Oper. Res. 35 (2010), 438–457, DOI: https://doi.org/10.48550/arXiv.0801.1780.

[43] J. Bolte, S. Sabach, and M. Teboulle, Proximal alternating linearized minimization for nonconvex and nonsmooth problems, Math. Program. 146 (2014), 459–494, DOI: https://doi.org/10.1007/s10107-013-0701-9.

[44] Y. Nesterov, Introductory Lectures on Convex Optimization: A Basic Course, Springer Science & Business Media, Berlin, 2013.

[45] H. Attouch and J. Bolte, On the convergence of the proximal algorithm for nonsmooth functions involving analytic features, Math. Program. 116 (2009), 5–16, DOI: https://doi.org/10.1007/s10107-007-0133-5.

[46] M. N. Raihen and S. Akter, Prediction modeling using deep learning for the classification of grape-type dried fruits, Int. J. Math. Comput. Eng. 2 (2024), no. 1, 1–12, DOI: https://doi.org/10.2478/ijmce-2024-0001.

[47] M. Omar and D. Burrell, From text to threats: A language model approach to software vulnerability detection, Int. J. Math. Comput. Eng. 2 (2024), no. 1, 23–34, DOI: https://doi.org/10.2478/ijmce-2024-0003.

[48] D. Dalal, P. Kumar, and C. Cattani, Optimizing industrial growth through alternative forest biomass resources: A mathematical model using DDE, Int. J. Math. Comput. Eng. 1 (2023), no. 2, 187–200, DOI: https://doi.org/10.2478/ijmce-2023-0015.

Received: 2023-07-27
Revised: 2024-04-01
Accepted: 2024-05-17
Published Online: 2024-09-25

© 2024 the author(s), published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.
