In mathematics, a symmetric matrix $M$ with real entries is positive-definite if the real number $x^\mathsf{T} M x$ is positive for every nonzero real column vector $x$, where $x^\mathsf{T}$ is the row vector transpose of $x$.[1] More generally, a Hermitian matrix (that is, a complex matrix equal to its conjugate transpose) is positive-definite if the real number $z^* M z$ is positive for every nonzero complex column vector $z$, where $z^*$ denotes the conjugate transpose of $z$.

Positive semi-definite matrices are defined similarly, except that the scalars $x^\mathsf{T} M x$ and $z^* M z$ are required to be positive or zero (that is, nonnegative). Negative-definite and negative semi-definite matrices are defined analogously. A matrix that is not positive semi-definite and not negative semi-definite is sometimes called indefinite.

Ramifications

It follows from the above definitions that a matrix is positive-definite if and only if it is the matrix of a positive-definite quadratic form or Hermitian form. In other words, a matrix is positive-definite if and only if it defines an inner product.

Positive-definite and positive-semidefinite matrices can be characterized in many ways, which may explain the importance of the concept in various parts of mathematics. A matrix M is positive-definite if and only if it satisfies any of the following equivalent conditions.

  • $M$ is congruent with a diagonal matrix with positive real entries.
  • $M$ is symmetric or Hermitian, and all its eigenvalues are real and positive.
  • $M$ is symmetric or Hermitian, and all its leading principal minors are positive.
  • There exists an invertible matrix $B$ with conjugate transpose $B^*$ such that $M = B^* B$.

A matrix is positive semi-definite if it satisfies similar equivalent conditions where "positive" is replaced by "nonnegative", "invertible matrix" is replaced by "matrix", and the word "leading" is removed.

Positive-definite and positive-semidefinite real matrices are at the basis of convex optimization: given a function of several real variables that is twice differentiable, if its Hessian matrix (matrix of its second partial derivatives) is positive-definite at a point $p$, then the function is convex near $p$, and, conversely, if the function is convex near $p$, then the Hessian matrix is positive-semidefinite at $p$.

The set of positive definite matrices is an open convex cone, while the set of positive semi-definite matrices is a closed convex cone.[2]

Some authors use more general definitions of definiteness, including some non-symmetric real matrices, or non-Hermitian complex ones.

Definitions

In the following definitions, $x^\mathsf{T}$ is the transpose of $x$, $z^*$ is the conjugate transpose of $z$, and $\mathbf{0}$ denotes the n-dimensional zero-vector.

Definitions for real matrices

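An $n \times n$ symmetric real matrix $M$ is said to be positive-definite if $x^\mathsf{T} M x > 0$ for all non-zero $x$ in $\mathbb{R}^n$. Formally,

$$M \text{ positive-definite} \quad \iff \quad x^\mathsf{T} M x > 0 \text{ for all } x \in \mathbb{R}^n \setminus \{\mathbf{0}\}$$

An $n \times n$ symmetric real matrix $M$ is said to be positive-semidefinite or non-negative-definite if $x^\mathsf{T} M x \geq 0$ for all $x$ in $\mathbb{R}^n$. Formally,

$$M \text{ positive semi-definite} \quad \iff \quad x^\mathsf{T} M x \geq 0 \text{ for all } x \in \mathbb{R}^n$$

An $n \times n$ symmetric real matrix $M$ is said to be negative-definite if $x^\mathsf{T} M x < 0$ for all non-zero $x$ in $\mathbb{R}^n$. Formally,

$$M \text{ negative-definite} \quad \iff \quad x^\mathsf{T} M x < 0 \text{ for all } x \in \mathbb{R}^n \setminus \{\mathbf{0}\}$$

An $n \times n$ symmetric real matrix $M$ is said to be negative-semidefinite or non-positive-definite if $x^\mathsf{T} M x \leq 0$ for all $x$ in $\mathbb{R}^n$. Formally,

$$M \text{ negative semi-definite} \quad \iff \quad x^\mathsf{T} M x \leq 0 \text{ for all } x \in \mathbb{R}^n$$

An $n \times n$ symmetric real matrix which is neither positive semidefinite nor negative semidefinite is called indefinite.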

Definitions for complex matrices

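The following definitions all involve the term $z^* M z$. Notice that this is always a real number for any Hermitian square matrix $M$.

An $n \times n$ Hermitian complex matrix $M$ is said to be positive-definite if $z^* M z > 0$ for all non-zero $z$ in $\mathbb{C}^n$. Formally,

$$M \text{ positive-definite} \quad \iff \quad z^* M z > 0 \text{ for all } z \in \mathbb{C}^n \setminus \{\mathbf{0}\}$$

An $n \times n$ Hermitian complex matrix $M$ is said to be positive semi-definite or non-negative-definite if $z^* M z \geq 0$ for all $z$ in $\mathbb{C}^n$. Formally,

$$M \text{ positive semi-definite} \quad \iff \quad z^* M z \geq 0 \text{ for all } z \in \mathbb{C}^n$$

An $n \times n$ Hermitian complex matrix $M$ is said to be negative-definite if $z^* M z < 0$ for all non-zero $z$ in $\mathbb{C}^n$. Formally,

$$M \text{ negative-definite} \quad \iff \quad z^* M z < 0 \text{ for all } z \in \mathbb{C}^n \setminus \{\mathbf{0}\}$$

An $n \times n$ Hermitian complex matrix $M$ is said to be negative semi-definite or non-positive-definite if $z^* M z \leq 0$ for all $z$ in $\mathbb{C}^n$. Formally,

$$M \text{ negative semi-definite} \quad \iff \quad z^* M z \leq 0 \text{ for all } z \in \mathbb{C}^n$$

An $n \times n$ Hermitian complex matrix which is neither positive semidefinite nor negative semidefinite is called indefinite.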

Consistency between real and complex definitions

Since every real matrix is also a complex matrix, the definitions of "definiteness" for the two classes must agree.

For complex matrices, the most common definition says that $M$ is positive-definite if and only if $z^* M z$ is real and positive for every non-zero complex column vector $z$. This condition implies that $M$ is Hermitian (i.e. its transpose is equal to its conjugate): since $z^* M z$ is real, it equals its conjugate transpose $z^* M^* z$ for every $z$, which implies $M = M^*$.

By this definition, a positive-definite real matrix $M$ is Hermitian, hence symmetric; and $z^\mathsf{T} M z$ is positive for all non-zero real column vectors $z$. However the last condition alone is not sufficient for $M$ to be positive-definite. For example, if

$$M = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix},$$

then for any real vector $z$ with entries $a$ and $b$ we have $z^\mathsf{T} M z = (a - b)a + (a + b)b = a^2 + b^2$, which is always positive if $z$ is not zero. However, if $z$ is the complex vector with entries 1 and $i$, one gets

$$z^* M z = \begin{bmatrix} 1 & -i \end{bmatrix} M \begin{bmatrix} 1 \\ i \end{bmatrix} = \begin{bmatrix} 1 + i & 1 - i \end{bmatrix} \begin{bmatrix} 1 \\ i \end{bmatrix} = 2 + 2i,$$

which is not real. Therefore, $M$ is not positive-definite.

On the other hand, for a symmetric real matrix $M$, the condition "$z^\mathsf{T} M z > 0$ for all nonzero real vectors $z$" does imply that $M$ is positive-definite in the complex sense.
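The gap between the real and the complex condition can be checked numerically. The following is a minimal sketch, assuming NumPy and taking the non-symmetric matrix with rows (1, 1) and (−1, 1) as the example; the helper name `quadratic_form` is hypothetical and introduced only for illustration.

```python
import numpy as np

# Example matrix (an assumption for this sketch): real but not symmetric.
M = np.array([[1.0, 1.0],
              [-1.0, 1.0]])

def quadratic_form(M, z):
    """Return z* M z, where z* is the conjugate transpose of z."""
    z = np.asarray(z)
    return np.conj(z) @ M @ z

# For a real vector (a, b) the value is a^2 + b^2, hence positive.
real_value = quadratic_form(M, [3.0, -2.0])

# For the complex vector (1, i) the value has a non-zero imaginary part.
complex_value = quadratic_form(M, [1.0, 1.0j])

print(real_value, complex_value)
```

The second value is not real, so this matrix fails the complex definition even though every real vector gives a positive result.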

Notation

If a Hermitian matrix $M$ is positive semi-definite, one sometimes writes $M \succeq 0$, and if $M$ is positive-definite one writes $M \succ 0$. To denote that $M$ is negative semi-definite one writes $M \preceq 0$, and to denote that $M$ is negative-definite one writes $M \prec 0$.

The notion comes from functional analysis where positive semidefinite matrices define positive operators. If two matrices $A$ and $B$ satisfy $B - A \succeq 0$, we can define a non-strict partial order $B \succeq A$ that is reflexive, antisymmetric, and transitive. It is not a total order, however, as $B - A$, in general, may be indefinite.

A common alternative notation is $M \geq 0$, $M > 0$, $M \leq 0$, and $M < 0$ for positive semi-definite and positive-definite, negative semi-definite and negative-definite matrices, respectively. This may be confusing, as sometimes nonnegative matrices (respectively, nonpositive matrices) are also denoted in this way.

Examples

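  • The identity matrix $I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ is positive-definite (and as such also positive semi-definite). It is a real symmetric matrix, and, for any non-zero column vector $z$ with real entries $a$ and $b$, one has

    $$z^\mathsf{T} I z = \begin{bmatrix} a & b \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = a^2 + b^2.$$

    Seen as a complex matrix, for any non-zero column vector $z$ with complex entries $a$ and $b$ one has

    $$z^* I z = \begin{bmatrix} \overline{a} & \overline{b} \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \overline{a} a + \overline{b} b = |a|^2 + |b|^2.$$

    Either way, the result is positive since $z$ is not the zero vector (that is, at least one of $a$ and $b$ is not zero).
  • The real symmetric matrix $M = \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix}$ is positive-definite since for any non-zero column vector $z$ with entries $a$, $b$ and $c$, we have $z^\mathsf{T} M z = 2a^2 - 2ab + 2b^2 - 2bc + 2c^2 = a^2 + (a - b)^2 + (b - c)^2 + c^2$. This result is a sum of squares, and therefore non-negative; and is zero only if $a = b = c = 0$, that is, when $z$ is the zero vector.
  • For any real invertible matrix $A$, the product $A^\mathsf{T} A$ is a positive definite matrix (if the means of the columns of $A$ are 0, then this is also called the covariance matrix). A simple proof is that for any non-zero vector $z$, $z^\mathsf{T} A^\mathsf{T} A z = (Az)^\mathsf{T} (Az) = \|Az\|^2 > 0$, since the invertibility of matrix $A$ means that $Az \neq 0$.
  • The example $M$ above shows that a matrix in which some elements are negative may still be positive definite. Conversely, a matrix whose entries are all positive is not necessarily positive definite, as for example $N = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$, for which $\begin{bmatrix} -1 & 1 \end{bmatrix} N \begin{bmatrix} -1 & 1 \end{bmatrix}^\mathsf{T} = -2 < 0$.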

Eigenvalues

Let $M$ be an $n \times n$ Hermitian matrix (this includes real symmetric matrices). All eigenvalues of $M$ are real, and their signs characterize its definiteness:

  • $M$ is positive definite if and only if all of its eigenvalues are positive.
  • $M$ is positive semi-definite if and only if all of its eigenvalues are non-negative.
  • $M$ is negative definite if and only if all of its eigenvalues are negative.
  • $M$ is negative semi-definite if and only if all of its eigenvalues are non-positive.
  • $M$ is indefinite if and only if it has both positive and negative eigenvalues.
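The eigenvalue test above translates directly into a numerical check. The following is a minimal sketch, assuming NumPy; the function name `classify_definiteness` and the tolerance `tol` are hypothetical choices made for this illustration, and in floating point the boundary between "definite" and "semi-definite" is only as sharp as that tolerance.

```python
import numpy as np

def classify_definiteness(M, tol=1e-10):
    """Classify a Hermitian matrix by the signs of its eigenvalues.

    `tol` is an assumed numerical threshold: eigenvalues with absolute
    value below it are treated as zero.
    """
    M = np.asarray(M)
    if not np.allclose(M, M.conj().T):
        raise ValueError("matrix must be real symmetric or Hermitian")
    eigenvalues = np.linalg.eigvalsh(M)  # real eigenvalues, ascending
    if np.all(eigenvalues > tol):
        return "positive definite"
    if np.all(eigenvalues >= -tol):
        return "positive semi-definite"
    if np.all(eigenvalues < -tol):
        return "negative definite"
    if np.all(eigenvalues <= tol):
        return "negative semi-definite"
    return "indefinite"

print(classify_definiteness(np.eye(2)))
print(classify_definiteness(np.array([[1.0, 2.0], [2.0, 1.0]])))
```

Note that the zero matrix is both positive and negative semi-definite; this sketch reports only the first matching label.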

Let $P D P^{-1}$ be an eigendecomposition of $M$, where $P$ is a unitary complex matrix whose columns comprise an orthonormal basis of eigenvectors of $M$, and $D$ is a real diagonal matrix whose main diagonal contains the corresponding eigenvalues. The matrix $M$ may be regarded as a diagonal matrix $D$ that has been re-expressed in coordinates of the (eigenvectors) basis $P$. Put differently, applying $M$ to some vector $z$, giving $Mz$, is the same as changing the basis to the eigenvector coordinate system using $P^{-1}$, giving $P^{-1} z$, applying the stretching transformation $D$ to the result, giving $D P^{-1} z$, and then changing the basis back using $P$, giving $P D P^{-1} z$.

With this in mind, the one-to-one change of variable $y = P^* z$ shows that $z^* M z$ is real and positive for any complex vector $z$ if and only if $y^* D y$ is real and positive for any $y$; in other words, if $D$ is positive definite. For a diagonal matrix, this is true only if each element of the main diagonal, that is, every eigenvalue of $M$, is positive. Since the spectral theorem guarantees all eigenvalues of a Hermitian matrix to be real, the positivity of eigenvalues can be checked using Descartes' rule of alternating signs when the characteristic polynomial of a real, symmetric matrix $M$ is available.

Decomposition

Let $M$ be an $n \times n$ Hermitian matrix. $M$ is positive semidefinite if and only if it can be decomposed as a product $M = B^* B$ of a matrix $B$ with its conjugate transpose.

When $M$ is real, $B$ can be real as well and the decomposition can be written as $M = B^\mathsf{T} B$.

$M$ is positive definite if and only if such a decomposition exists with $B$ invertible. More generally, $M$ is positive semidefinite with rank $k$ if and only if a decomposition exists with a $k \times n$ matrix $B$ of full row rank (i.e. of rank $k$). Moreover, for any decomposition $M = B^* B$, $\operatorname{rank}(M) = \operatorname{rank}(B)$.[3]

Proof

If $M = B^* B$, then $x^* M x = (x^* B^*)(B x) = \|Bx\|^2 \geq 0$, so $M$ is positive semidefinite. If moreover $B$ is invertible then the inequality is strict for $x \neq 0$, so $M$ is positive definite. If $B$ is $k \times n$ of rank $k$, then $\operatorname{rank}(M) = \operatorname{rank}(B^*) = k$.

In the other direction, suppose $M$ is positive semidefinite. Since $M$ is Hermitian, it has an eigendecomposition $M = Q^{-1} D Q$ where $Q$ is unitary and $D$ is a diagonal matrix whose entries are the eigenvalues of $M$. Since $M$ is positive semidefinite, the eigenvalues are non-negative real numbers, so one can define $D^{1/2}$ as the diagonal matrix whose entries are non-negative square roots of eigenvalues. Then $M = Q^{-1} D Q = Q^* D Q = Q^* D^{1/2} D^{1/2} Q = (D^{1/2} Q)^* (D^{1/2} Q) = B^* B$ for $B = D^{1/2} Q$. If moreover $M$ is positive definite, then the eigenvalues are (strictly) positive, so $D^{1/2}$ is invertible, and hence $B = D^{1/2} Q$ is invertible as well. If $M$ has rank $k$, then it has exactly $k$ positive eigenvalues and the others are zero, hence in $B = D^{1/2} Q$ all but $k$ rows are all zeroed. Cutting the zero rows gives a $k \times n$ matrix $B'$ such that $B'^* B' = B^* B = M$.

The columns $b_1, \dots, b_n$ of $B$ can be seen as vectors in the complex or real vector space $\mathbb{C}^k$ or $\mathbb{R}^k$, respectively. Then the entries of $M$ are inner products (that is dot products, in the real case) of these vectors: $M_{ij} = \langle b_i, b_j \rangle$. In other words, a Hermitian matrix $M$ is positive semidefinite if and only if it is the Gram matrix of some vectors $b_1, \dots, b_n$. It is positive definite if and only if it is the Gram matrix of some linearly independent vectors. In general, the rank of the Gram matrix of vectors $b_1, \dots, b_n$ equals the dimension of the space spanned by these vectors.[4]
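The constructive direction of the proof can be sketched in code: take an eigendecomposition, scale by the square roots of the eigenvalues, and drop the zero rows. This is a minimal illustration assuming NumPy; the name `gram_factor` and the threshold `tol` are hypothetical, and the factor returned is only one of many valid choices.

```python
import numpy as np

def gram_factor(M, tol=1e-10):
    """Return a k x n matrix B with B* B = M for positive semidefinite M.

    Follows the eigendecomposition argument: B = D^{1/2} Q with the zero
    rows removed, where k is the numerical rank (an assumption governed
    by `tol`).
    """
    w, V = np.linalg.eigh(M)           # M = V diag(w) V*, so Q = V*
    if np.any(w < -tol):
        raise ValueError("matrix is not positive semidefinite")
    w = np.clip(w, 0.0, None)          # remove tiny negative round-off
    B = np.sqrt(w)[:, None] * V.conj().T
    return B[w > tol]                  # keep rows with positive eigenvalues

M = np.array([[1.0, 1.0],
              [1.0, 1.0]])             # rank-one example
B = gram_factor(M)
print(B.shape, np.allclose(B.conj().T @ B, M))
```

The number of rows of the returned factor equals the numerical rank of the input, which matches the rank statement in the text.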

Uniqueness up to unitary transformations

The decomposition is not unique: if $M = B^* B$ for some $k \times n$ matrix $B$ and if $Q$ is any unitary $k \times k$ matrix (meaning $Q^* Q = Q Q^* = I$), then $M = B^* B = B^* Q^* Q B = A^* A$ for $A = QB$.

However, this is the only way in which two decompositions can differ: the decomposition is unique up to unitary transformations. More formally, if $A$ is a $k \times n$ matrix and $B$ is an $\ell \times n$ matrix such that $A^* A = B^* B$, then there is an $\ell \times k$ matrix $Q$ with orthonormal columns (meaning $Q^* Q = I_{k \times k}$) such that $B = QA$.[5] When $\ell = k$ this means $Q$ is unitary.

This statement has an intuitive geometric interpretation in the real case: let the columns of $A$ and $B$ be the vectors $a_1, \dots, a_n$ and $b_1, \dots, b_n$ in $\mathbb{R}^k$. A real unitary matrix is an orthogonal matrix, which describes a rigid transformation (an isometry of Euclidean space $\mathbb{R}^k$) preserving the 0 point (i.e. rotations and reflections, without translations). Therefore, the dot products $a_i \cdot a_j$ and $b_i \cdot b_j$ are equal if and only if some rigid transformation of $\mathbb{R}^k$ transforms the vectors $a_1, \dots, a_n$ to $b_1, \dots, b_n$ (and 0 to 0).

Square root

A Hermitian matrix $M$ is positive semidefinite if and only if there is a positive semidefinite matrix $B$ (in particular $B$ is Hermitian, so $B^* = B$) satisfying $M = BB$. This matrix $B$ is unique,[6] is called the non-negative square root of $M$, and is denoted with $B = M^{1/2}$. When $M$ is positive definite, so is $M^{1/2}$, hence it is also called the positive square root of $M$.

The non-negative square root should not be confused with other decompositions $M = B^* B$. Some authors use the name square root and $M^{1/2}$ for any such decomposition, or specifically for the Cholesky decomposition, or any decomposition of the form $M = BB$; others only use it for the non-negative square root.

If $M \succ N \succ 0$, then $M^{1/2} \succ N^{1/2} \succ 0$.
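One way to compute the non-negative square root numerically is through the eigendecomposition, taking square roots of the eigenvalues while keeping the eigenvectors. The sketch below assumes NumPy and a Hermitian positive semidefinite input; the function name `psd_sqrt` is hypothetical.

```python
import numpy as np

def psd_sqrt(M):
    """Return the Hermitian non-negative square root of a PSD matrix M.

    Assumes M is Hermitian positive semidefinite; tiny negative
    eigenvalues caused by round-off are clipped to zero.
    """
    w, V = np.linalg.eigh(M)               # M = V diag(w) V*
    w = np.clip(w, 0.0, None)
    return (V * np.sqrt(w)) @ V.conj().T   # V diag(sqrt(w)) V*

M = np.array([[2.0, 1.0],
              [1.0, 2.0]])
R = psd_sqrt(M)
print(np.allclose(R @ R, M), np.allclose(R, R.conj().T))
```

Unlike a Cholesky factor, the matrix returned here is itself Hermitian and positive semidefinite, which is what distinguishes the non-negative square root from other factorizations of the same matrix.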

Cholesky decomposition

A Hermitian positive semidefinite matrix $M$ can be written as $M = L L^*$, where $L$ is lower triangular with non-negative diagonal (equivalently $M = B^* B$ where $B = L^*$ is upper triangular); this is the Cholesky decomposition. If $M$ is positive definite, then the diagonal of $L$ is positive and the Cholesky decomposition is unique. Conversely, if $L$ is lower triangular with nonnegative diagonal, then $L L^*$ is positive semidefinite. The Cholesky decomposition is especially useful for efficient numerical calculations. A closely related decomposition is the LDL decomposition, $M = L D L^*$, where $D$ is diagonal and $L$ is lower unitriangular.
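In numerical practice, attempting a Cholesky factorization is a common way to test positive definiteness. The sketch below assumes NumPy, whose `numpy.linalg.cholesky` returns the lower-triangular factor and raises `LinAlgError` when the factorization fails; the wrapper name `is_positive_definite` is hypothetical. Because that routine does not itself verify symmetry, the sketch checks it explicitly.

```python
import numpy as np

def is_positive_definite(M):
    """Return True if M is Hermitian and admits a Cholesky factorization."""
    M = np.asarray(M)
    if not np.allclose(M, M.conj().T):
        return False                      # not symmetric/Hermitian
    try:
        np.linalg.cholesky(M)             # succeeds only for PD input
        return True
    except np.linalg.LinAlgError:
        return False

M = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])
L = np.linalg.cholesky(M)                 # lower triangular, M = L L*
print(is_positive_definite(M), np.allclose(L @ L.conj().T, M))
```

This test only distinguishes positive definite matrices; a singular positive semidefinite matrix may be rejected, so a separate eigenvalue check is a safer choice for the semidefinite case.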

Williamson theorem

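Any $2n \times 2n$ positive definite Hermitian real matrix $M$ can be diagonalized via symplectic (real) matrices. More precisely, Williamson's theorem ensures the existence of symplectic $S \in \mathbf{Sp}(2n, \mathbb{R})$ and diagonal real positive $D \in \mathbb{R}^{n \times n}$ such that $S M S^\mathsf{T} = D \oplus D$.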

Other characterizations

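Let $M$ be an $n \times n$ real symmetric matrix, and let $B_1(M) = \{x \in \mathbb{R}^n : x^\mathsf{T} M x \leq 1\}$ be the "unit ball" defined by $M$. Then we have the following:

  • $B_1(v v^\mathsf{T})$ is a solid slab sandwiched between the hyperplanes $\pm\{w : \langle w, v \rangle = 1\}$.
  • $M \succeq 0$ if and only if $B_1(M)$ is an ellipsoid, or an ellipsoidal cylinder.
  • $M \succ 0$ if and only if $B_1(M)$ is bounded, that is, it is an ellipsoid.
  • If $N \succ 0$, then $M \succeq N$ if and only if $B_1(M) \subseteq B_1(N)$; $M \succ N$ if and only if $B_1(M) \subseteq \operatorname{int} B_1(N)$.
  • If $N \succ 0$, then $M \succeq \frac{v v^\mathsf{T}}{v^\mathsf{T} N v}$ for all $v \neq 0$ if and only if $B_1(M) \subseteq \bigcap_{v^\mathsf{T} N v = 1} B_1(v v^\mathsf{T})$. So, since the polar dual of an ellipsoid is also an ellipsoid with the same principal axes, with inverse lengths, we have $B_1(N^{-1}) = \bigcap_{v^\mathsf{T} N v = 1} B_1(v v^\mathsf{T})$. That is, if $N$ is positive-definite, then $M \succeq \frac{v v^\mathsf{T}}{v^\mathsf{T} N v}$ for all $v \neq 0$ if and only if $M \succeq N^{-1}$.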

Let $M$ be an $n \times n$ Hermitian matrix. The following properties are equivalent to $M$ being positive definite:

The associated sesquilinear form is an inner product
The sesquilinear form defined by $M$ is the function $\langle \cdot, \cdot \rangle$ from $\mathbb{C}^n \times \mathbb{C}^n$ to $\mathbb{C}$ such that $\langle x, y \rangle = y^* M x$ for all $x$ and $y$ in $\mathbb{C}^n$, where $y^*$ is the conjugate transpose of $y$. For any complex matrix $M$, this form is linear in $x$ and semilinear in $y$. Therefore, the form is an inner product on $\mathbb{C}^n$ if and only if $\langle z, z \rangle$ is real and positive for all nonzero $z$; that is, if and only if $M$ is positive definite. (In fact, every inner product on $\mathbb{C}^n$ arises in this fashion from a Hermitian positive definite matrix.)
Its leading principal minors are all positive
The kth leading principal minor of a matrix $M$ is the determinant of its upper-left $k \times k$ sub-matrix. It turns out that a matrix is positive definite if and only if all these determinants are positive. This condition is known as Sylvester's criterion, and provides an efficient test of positive definiteness of a symmetric real matrix. Namely, the matrix is reduced to an upper triangular matrix by using elementary row operations, as in the first part of the Gaussian elimination method, taking care to preserve the sign of its determinant during the pivoting process. Since the kth leading principal minor of a triangular matrix is the product of its diagonal elements up to row $k$, Sylvester's criterion is equivalent to checking whether its diagonal elements are all positive. This condition can be checked each time a new row $k$ of the triangular matrix is obtained.
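Sylvester's criterion can be illustrated by computing the leading principal minors directly. The sketch below assumes NumPy and a real symmetric input; the function names are hypothetical. It evaluates each determinant separately for clarity, which is less efficient than the elimination-based procedure described above and is meant only as a small-scale illustration.

```python
import numpy as np

def leading_principal_minors(M):
    """Return the determinants of the upper-left k x k blocks, k = 1..n."""
    M = np.asarray(M, dtype=float)
    n = M.shape[0]
    return [np.linalg.det(M[:k, :k]) for k in range(1, n + 1)]

def sylvester_positive_definite(M):
    """Sylvester's criterion for a real symmetric matrix (assumed symmetric)."""
    return all(minor > 0 for minor in leading_principal_minors(M))

M = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])
print(leading_principal_minors(M), sylvester_positive_definite(M))
```

For the tridiagonal example the three minors are 2, 3 and 4 up to round-off, so the criterion is satisfied.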

A positive semidefinite matrix is positive definite if and only if it is invertible.[7] A matrix $M$ is negative (semi)definite if and only if $-M$ is positive (semi)definite.

Quadratic forms

The (purely) quadratic form associated with a real $n \times n$ matrix $M$ is the function $Q : \mathbb{R}^n \to \mathbb{R}$ such that $Q(x) = x^\mathsf{T} M x$ for all $x$. $M$ can be assumed symmetric by replacing it with $\tfrac{1}{2}(M + M^\mathsf{T})$, since any asymmetric part will be zeroed-out in the double-sided product.

A symmetric matrix $M$ is positive definite if and only if its quadratic form is a strictly convex function.

More generally, any quadratic function from $\mathbb{R}^n$ to $\mathbb{R}$ can be written as $x^\mathsf{T} M x + b^\mathsf{T} x + c$, where $M$ is a symmetric $n \times n$ matrix, $b$ is a real n-vector, and $c$ a real constant. In the $n = 1$ case, this is a parabola, and just like in the $n = 1$ case, we have

Theorem: This quadratic function is strictly convex, and hence has a unique finite global minimum, if and only if $M$ is positive definite.

Proof: If $M$ is positive definite, then the function is strictly convex. Its gradient $2Mx + b$ is zero at the unique point $x = -\tfrac{1}{2} M^{-1} b$, which must be the global minimum since the function is strictly convex. If $M$ is not positive definite, then there exists some vector $v \neq 0$ such that $v^\mathsf{T} M v \leq 0$, so the function $f(t) = (tv)^\mathsf{T} M (tv) + b^\mathsf{T} (tv) + c$ is a line or a downward parabola, thus not strictly convex and not having a global minimum.
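The stationary point in the proof can be computed by solving a linear system rather than forming an inverse. The sketch below assumes NumPy and assumes the quadratic function is written as x^T M x + b^T x + c with M symmetric positive definite, so that the gradient is 2 M x + b; the function name `quadratic_minimizer` is hypothetical.

```python
import numpy as np

def quadratic_minimizer(M, b):
    """Return the unique minimizer of x^T M x + b^T x + c.

    Assumes M is symmetric positive definite; the gradient 2 M x + b
    vanishes exactly at the returned point.
    """
    M = np.asarray(M, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.linalg.solve(2.0 * M, -b)

M = np.array([[2.0, 0.0],
              [0.0, 1.0]])
b = np.array([-4.0, 2.0])
x_star = quadratic_minimizer(M, b)
print(x_star, 2.0 * M @ x_star + b)   # minimizer and its (zero) gradient
```

The constant c does not affect the location of the minimum, which is why it does not appear as a parameter.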

For this reason, positive definite matrices play an important role in optimization problems.

Simultaneous diagonalization

One symmetric matrix and another matrix that is both symmetric and positive definite can be simultaneously diagonalized. This is so although simultaneous diagonalization is not necessarily performed with a similarity transformation. This result does not extend to the case of three or more matrices. In this section we write for the real case. Extension to the complex case is immediate.

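Let $M$ be a symmetric and $N$ a symmetric and positive definite matrix. Write the generalized eigenvalue equation as $(M - \lambda N) x = 0$, where we impose that $x$ be normalized, i.e. $x^\mathsf{T} N x = 1$. Now we use Cholesky decomposition to write the inverse of $N$ as $Q^\mathsf{T} Q$. Multiplying by $Q$ and letting $x = Q^\mathsf{T} y$, we get $Q (M - \lambda N) Q^\mathsf{T} y = 0$, which can be rewritten as $(Q M Q^\mathsf{T}) y = \lambda y$ where $y^\mathsf{T} y = 1$. Manipulation now yields $M X = N X \Lambda$, where $X$ is a matrix having as columns the generalized eigenvectors and $\Lambda$ is a diagonal matrix of the generalized eigenvalues. Now premultiplication with $X^\mathsf{T}$ gives the final result: $X^\mathsf{T} M X = \Lambda$ and $X^\mathsf{T} N X = I$, but note that this is no longer an orthogonal diagonalization with respect to the inner product where $y^\mathsf{T} y = 1$. In fact, we diagonalized $M$ with respect to the inner product induced by $N$.[8]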
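The construction above can be followed step by step in code. The sketch below assumes NumPy and real inputs, with one matrix symmetric and the other symmetric positive definite; the function name `simultaneous_diagonalization` is hypothetical. It obtains a factor of the inverse of the positive definite matrix from its Cholesky factor, which is one possible choice and not necessarily the one intended by the source.

```python
import numpy as np

def simultaneous_diagonalization(M, N):
    """Return (eigenvalues, X) with X^T M X diagonal and X^T N X = I.

    Assumes M is real symmetric and N is real symmetric positive definite.
    """
    L = np.linalg.cholesky(N)                  # N = L L^T
    Q = np.linalg.inv(L)                       # hence N^{-1} = Q^T Q
    eigenvalues, Y = np.linalg.eigh(Q @ M @ Q.T)
    X = Q.T @ Y                                # generalized eigenvectors
    return eigenvalues, X

M = np.array([[1.0, 2.0],
              [2.0, -3.0]])
N = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigenvalues, X = simultaneous_diagonalization(M, N)
print(np.round(X.T @ M @ X, 10))
print(np.round(X.T @ N @ X, 10))
```

The columns of the returned matrix are orthonormal with respect to the inner product induced by the positive definite matrix, not the standard one, which mirrors the remark at the end of the paragraph.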

Note that this result does not contradict what is said on simultaneous diagonalization in the article Diagonalizable matrix, which refers to simultaneous diagonalization by a similarity transformation. Our result here is more akin to a simultaneous diagonalization of two quadratic forms, and is useful for optimization of one form under conditions on the other.

Properties

Induced partial ordering

For arbitrary square matrices $M$, $N$ we write $M \geq N$ if $M - N \geq 0$, i.e., $M - N$ is positive semi-definite. This defines a partial ordering on the set of all square matrices. One can similarly define a strict partial ordering $M > N$. The ordering is called the Loewner order.

Inverse of positive definite matrix

Every positive definite matrix is invertible and its inverse is also positive definite.[9] If $M \geq N > 0$, then $N^{-1} \geq M^{-1} > 0$.[10] Moreover, by the min-max theorem, the kth largest eigenvalue of $M$ is greater than or equal to the kth largest eigenvalue of $N$.

Scaling

If $M$ is positive definite and $r > 0$ is a real number, then $rM$ is positive definite.[11]

Addition

  • If $M$ and $N$ are positive-definite, then the sum $M + N$ is also positive-definite.[11]
  • If $M$ and $N$ are positive-semidefinite, then the sum $M + N$ is also positive-semidefinite.
  • If $M$ is positive-definite and $N$ is positive-semidefinite, then the sum $M + N$ is also positive-definite.

Multiplication

  • If $M$ and $N$ are positive definite, then the products $MNM$ and $NMN$ are also positive definite. If $MN = NM$, then $MN$ is also positive definite.
  • If $M$ is positive semidefinite, then $A^* M A$ is positive semidefinite for any (possibly rectangular) matrix $A$. If $M$ is positive definite and $A$ has full column rank, then $A^* M A$ is positive definite.[12]

Trace

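The diagonal entries $m_{ii}$ of a positive-semidefinite matrix are real and non-negative. As a consequence the trace satisfies $\operatorname{tr}(M) \geq 0$. Furthermore,[13] since every principal sub-matrix (in particular, 2-by-2) is positive semidefinite,

$$|m_{ij}| \leq \sqrt{m_{ii} m_{jj}} \quad \text{for all } i, j,$$

and thus, when $n \geq 1$,

$$\max_{i,j} |m_{ij}| \leq \max_i m_{ii}.$$

An $n \times n$ Hermitian matrix $M$ is positive definite if it satisfies the following trace inequalities:[14]

$$\operatorname{tr}(M) > 0 \quad \text{and} \quad \frac{(\operatorname{tr}(M))^2}{\operatorname{tr}(M^2)} > n - 1.$$

Another important result is that for any $M$ and $N$ positive-semidefinite matrices, $\operatorname{tr}(MN) \geq 0$. This follows by writing $\operatorname{tr}(MN) = \operatorname{tr}(M^{1/2} N M^{1/2})$. The matrix $M^{1/2} N M^{1/2}$ is positive-semidefinite and thus has non-negative eigenvalues, whose sum, the trace, is therefore also non-negative.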

Hadamard product

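If $M, N \geq 0$, although $MN$ is not necessarily positive semidefinite, the Hadamard product is: $M \circ N \geq 0$ (this result is often called the Schur product theorem).[15]

Regarding the Hadamard product of two positive semidefinite matrices $M = (m_{ij}) \geq 0$, $N \geq 0$, there are two notable inequalities:

  • Oppenheim's inequality: $\det(M \circ N) \geq \det(N) \prod_i m_{ii}$.[16]
  • $\det(M \circ N) \geq \det(M) \det(N)$.[17]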
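A small numerical example can make the Schur product theorem and the two determinant inequalities concrete. The sketch below assumes NumPy; the two matrices are hypothetical examples chosen for illustration, and a single example is of course only a sanity check, not a proof.

```python
import numpy as np

# Two symmetric positive definite example matrices (assumed for illustration).
M = np.array([[2.0, 1.0],
              [1.0, 2.0]])
N = np.array([[1.0, -1.0],
              [-1.0, 2.0]])

ordinary = M @ N          # ordinary product: not symmetric here
hadamard = M * N          # entrywise (Hadamard) product

hadamard_eigenvalues = np.linalg.eigvalsh(hadamard)
det_hadamard = np.linalg.det(hadamard)
oppenheim_bound = np.linalg.det(N) * np.prod(np.diag(M))
product_bound = np.linalg.det(M) * np.linalg.det(N)

print(np.allclose(ordinary, ordinary.T))     # symmetry of M N
print(hadamard_eigenvalues)                  # all non-negative
print(det_hadamard, oppenheim_bound, product_bound)
```

In this example the ordinary product is not even symmetric, while the entrywise product has only positive eigenvalues and its determinant dominates both lower bounds.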

Kronecker product

If $M, N \geq 0$, although $MN$ is not necessarily positive semidefinite, the Kronecker product is: $M \otimes N \geq 0$.

Frobenius product

If $M, N \geq 0$, although $MN$ is not necessarily positive semidefinite, the Frobenius inner product is non-negative: $M : N \geq 0$ (Lancaster–Tismenetsky, The Theory of Matrices, p. 218).

Convexity

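The set of positive semidefinite symmetric matrices is convex. That is, if $M$ and $N$ are positive semidefinite, then for any $\alpha$ between 0 and 1, $\alpha M + (1 - \alpha) N$ is also positive semidefinite. For any vector $x$: $x^\mathsf{T} (\alpha M + (1 - \alpha) N) x = \alpha x^\mathsf{T} M x + (1 - \alpha) x^\mathsf{T} N x \geq 0$.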

This property guarantees that semidefinite programming problems converge to a globally optimal solution.

Relation with cosine

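The positive-definiteness of a matrix $A$ expresses that the angle $\theta$ between any vector $x$ and its image $Ax$ is always $-\pi/2 < \theta < +\pi/2$:

$$\cos\theta = \frac{x^\mathsf{T} A x}{\|x\| \, \|Ax\|} = \frac{\langle x, Ax \rangle}{\|x\| \, \|Ax\|}, \qquad \theta = \theta(x, Ax) = \text{the angle between } x \text{ and } Ax.$$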

Further properties

  1. If $M$ is a symmetric Toeplitz matrix, i.e. the entries $m_{ij}$ are given as a function of their absolute index differences: $m_{ij} = h(|i - j|)$, and the strict inequality $\sum_{j \neq 0} |h(j)| < h(0)$ holds, then $M$ is strictly positive definite.
  2. Let $M > 0$ and $N$ Hermitian. If $MN + NM \geq 0$ (resp., $MN + NM > 0$) then $N \geq 0$ (resp., $N > 0$).[18]
  3. If $M > 0$ is real, then there is a $\delta > 0$ such that $M > \delta I$, where $I$ is the identity matrix.
  4. If $M_k$ denotes the leading $k \times k$ minor, $\det(M_k) / \det(M_{k-1})$ is the kth pivot during LU decomposition.
  5. A matrix is negative definite if its kth order leading principal minor is negative when $k$ is odd, and positive when $k$ is even.
  6. If $M$ is a real positive definite matrix, then there exists a positive real number $m$ such that for every vector $v$, $v^\mathsf{T} M v \geq m \|v\|_2^2$.
  7. A Hermitian matrix is positive semidefinite if and only if all of its principal minors are nonnegative. It is however not enough to consider the leading principal minors only, as is checked on the diagonal matrix with entries 0 and −1.

Block matrices and submatrices

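A positive $2n \times 2n$ matrix may also be defined by blocks:

$$M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$$

where each block is $n \times n$. By applying the positivity condition, it immediately follows that $A$ and $D$ are Hermitian, and $C = B^*$.

We have that $z^* M z \geq 0$ for all complex $z$, and in particular for $z = \begin{bmatrix} v \\ 0 \end{bmatrix}$. Then

$$\begin{bmatrix} v^* & 0 \end{bmatrix} \begin{bmatrix} A & B \\ B^* & D \end{bmatrix} \begin{bmatrix} v \\ 0 \end{bmatrix} = v^* A v \geq 0.$$

A similar argument can be applied to $D$, and thus we conclude that both $A$ and $D$ must be positive definite. The argument can be extended to show that any principal submatrix of $M$ is itself positive definite.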

Converse results can be proved with stronger conditions on the blocks, for instance, using the Schur complement.

Local extrema

A general quadratic form $f(x)$ on $n$ real variables $x_1, \dots, x_n$ can always be written as $x^\mathsf{T} M x$, where $x$ is the column vector with those variables, and $M$ is a symmetric real matrix. Therefore, the matrix being positive definite means that $f$ has a unique minimum (zero) when $x$ is zero, and is strictly positive for any other $x$.

More generally, a twice-differentiable real function $f$ on $n$ real variables has a local minimum at arguments $x_1, \dots, x_n$ if its gradient is zero and its Hessian (the matrix of all second derivatives) is positive definite at that point; a Hessian that is only positive semi-definite there is necessary for a local minimum but not sufficient. Similar statements can be made for negative definite and semi-definite matrices.

Covariance

In statistics, the covariance matrix of a multivariate probability distribution is always positive semi-definite; and it is positive definite unless one variable is an exact linear function of the others. Conversely, every positive semi-definite matrix is the covariance matrix of some multivariate distribution.

Extension for non-Hermitian square matrices

The definition of positive definite can be generalized by designating any complex matrix $M$ (e.g. real non-symmetric) as positive definite if $\Re\{z^* M z\} > 0$ for all non-zero complex vectors $z$, where $\Re\{c\}$ denotes the real part of a complex number $c$.[19] Only the Hermitian part $\tfrac{1}{2}(M + M^*)$ determines whether the matrix is positive definite, and is assessed in the narrower sense above. Similarly, if $x$ and $M$ are real, we have $x^\mathsf{T} M x > 0$ for all real nonzero vectors $x$ if and only if the symmetric part $\tfrac{1}{2}(M + M^\mathsf{T})$ is positive definite in the narrower sense. It is immediately clear that $x^\mathsf{T} M x = \sum_{ij} x_i M_{ij} x_j$ is insensitive to transposition of $M$.

Consequently, a non-symmetric real matrix with only positive eigenvalues does not need to be positive definite. For example, the matrix $M = \begin{bmatrix} 4 & 9 \\ 1 & 4 \end{bmatrix}$ has positive eigenvalues yet is not positive definite; in particular a negative value of $x^\mathsf{T} M x$ is obtained with the choice $x = \begin{bmatrix} -1 & 1 \end{bmatrix}^\mathsf{T}$ (which is the eigenvector associated with the negative eigenvalue of the symmetric part of $M$).
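The role of the symmetric part can be checked numerically. The sketch below assumes NumPy and uses the non-symmetric matrix with rows (4, 9) and (1, 4) as the example; the variable names are hypothetical. It compares the eigenvalues of the matrix itself with those of its symmetric part and evaluates the quadratic form on the vector (−1, 1).

```python
import numpy as np

# Example (an assumption for this sketch): real, non-symmetric matrix.
M = np.array([[4.0, 9.0],
              [1.0, 4.0]])

eigenvalues = np.sort(np.linalg.eigvals(M).real)   # eigenvalues of M itself
symmetric_part = (M + M.T) / 2.0
symmetric_eigenvalues = np.linalg.eigvalsh(symmetric_part)

x = np.array([-1.0, 1.0])
value = x @ M @ x                                   # quadratic form x^T M x

print(eigenvalues, symmetric_eigenvalues, value)
```

Both eigenvalues of the matrix are positive, yet its symmetric part has a negative eigenvalue, and the quadratic form is negative on the corresponding eigenvector.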

In summary, the distinguishing feature between the real and complex case is that a bounded positive operator on a complex Hilbert space is necessarily Hermitian, or self-adjoint. The general claim can be argued using the polarization identity. That is no longer true in the real case.

Applications

Heat conductivity matrix

Fourier's law of heat conduction, giving heat flux $q$ in terms of the temperature gradient $g = \nabla T$, is written for anisotropic media as $q = -Kg$, in which $K$ is the symmetric thermal conductivity matrix. The negative is inserted in Fourier's law to reflect the expectation that heat will always flow from hot to cold. In other words, since the temperature gradient $g$ always points from cold to hot, the heat flux $q$ is expected to have a negative inner product with $g$, so that $q^\mathsf{T} g < 0$. Substituting Fourier's law then gives this expectation as $g^\mathsf{T} K g > 0$, implying that the conductivity matrix should be positive definite.

References

  1. ^ van den Bos, Adriaan (March 2007). "Appendix C: Positive semidefinite and positive definite matrices". Parameter Estimation for Scientists and Engineers (.pdf) (online ed.). John Wiley & Sons. pp. 259–263. doi:10.1002/9780470173862. ISBN 978-047-017386-2. Print ed. ISBN 9780470147818
  2. ^ Boyd, Stephen; Vandenberghe, Lieven (8 March 2004). Convex Optimization. Cambridge University Press. doi:10.1017/cbo9780511804441. ISBN 978-0-521-83378-3.
  3. ^ Horn & Johnson (2013), p. 440, Theorem 7.2.7
  4. ^ Horn & Johnson (2013), p. 441, Theorem 7.2.10
  5. ^ Horn & Johnson (2013), p. 452, Theorem 7.3.11
  6. ^ Horn & Johnson (2013), p. 439, Theorem 7.2.6 with $k = 2$
  7. ^ Horn & Johnson (2013), p. 431, Corollary 7.1.7
  8. ^ Horn & Johnson (2013), p. 485, Theorem 7.6.1
  9. ^ Horn & Johnson (2013), p. 438, Theorem 7.2.1
  10. ^ Horn & Johnson (2013), p. 495, Corollary 7.7.4(a)
  11. ^ a b Horn & Johnson (2013), p. 430, Observation 7.1.3
  12. ^ Horn & Johnson (2013), p. 431, Observation 7.1.8
  13. ^ Horn & Johnson (2013), p. 430
  14. ^ Wolkowicz, Henry; Styan, George P.H. (1980). "Bounds for Eigenvalues using Traces". Linear Algebra and its Applications (29). Elsevier: 471–506.
  15. ^ Horn & Johnson (2013), p. 479, Theorem 7.5.3
  16. ^ Horn & Johnson (2013), p. 509, Theorem 7.8.16
  17. ^ Styan, G.P. (1973). "Hadamard products and multivariate statistical analysis". Linear Algebra and Its Applications. 6: 217–240., Corollary 3.6, p. 227
  18. ^ Bhatia, Rajendra (2007). Positive Definite Matrices. Princeton, New Jersey: Princeton University Press. p. 8. ISBN 978-0-691-12918-1.
  19. ^ Weisstein, Eric W. "Positive definite matrix". MathWorld. Wolfram Research. Retrieved 26 July 2012.
