[线性代数] - Courant Fischer定理

Courant Fischer定理给出了一个关于Hermitian矩阵特征值的变分特性描述

推导

Hermitian矩阵定义

满足以下式子的矩阵称为Hermitian矩阵

H=H,H=HTH = H^*, H^* = \overline{H}^T

性质

  • AA是Hermitian矩阵,则xCn\forall x \in \mathbb{C}^nxTAxx^TAx是实数
  • Hermitian矩阵的特征值都是实数
  • Hermitian矩阵属于不同特征值的特征向量都互为正交

Rayleigh 商(Rayleigh Quotient)

R=xAxxxR = \frac{x^*Ax}{x^*x}

极大极小定理

ARn×nA \in \mathbb{R}^{n \times n}的Hermitian矩阵,且特征值为λ1λ2λn\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n,则有

max=xAxxx=λ1xRn,xOmin=xAxxx=λnxRn,xO\begin{aligned} max &= \frac{x^*Ax}{x^*x} = \lambda_1 & x \in \mathbb{R}^n, x \notin {O} \\ min &= \frac{x^*Ax}{x^*x} = \lambda_n & x \in \mathbb{R}^n, x \notin {O} \end{aligned}

即Rayleigh商的上界是最大特征值λ1\lambda_1,下界是最小特征值λn\lambda_n,令

Λ=[λ1000λ2000λn]U=[u1u2un]\begin{aligned} \Lambda &= \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} \\ U &= \begin{bmatrix}u_1 & u_2 & \cdots & u_n\end{bmatrix} \end{aligned}

其中uiu_iλi\lambda_i对应的特征向量,且为标准正交基,即

ui=1uiTuj=0(ij)\begin{aligned} \parallel u_i\parallel &= 1 \\ u^T_iu_j &= 0 \quad (i \neq j) \end{aligned}

则有

R=xAxxx=xUΛUxxIx=xUΛUxxUUxR = \frac{x^*Ax}{x^*x} = \frac{x^*U\Lambda U^*x}{x^*Ix} = \frac{x^*U\Lambda U^*x}{x^*UU^*x}

z=Ux=[zi]z = U^*x = [z_i],则

R=zΛzzzλ1(z12++zn2)z12++zn2=λ1R = \frac{z^*\Lambda z}{z^*z} \leq \frac{\lambda_1(|z_1|^2 + \cdots + |z_n|^2)}{|z_1|^2 + \cdots + |z_n|^2} = \lambda_1

同理

R=zΛzzzλn(z12++zn2)z12++zn2=λnR = \frac{z^*\Lambda z}{z^*z} \geq \frac{\lambda_n(|z_1|^2 + \cdots + |z_n|^2)}{|z_1|^2 + \cdots + |z_n|^2} = \lambda_n

x=u1x = u_1,即z=Uu1=[100]Tz = U^*u_1 = \begin{bmatrix}1&0&\cdots&0\end{bmatrix}^T,故zΛzzz=λ1\frac{z^*\Lambda z}{z^*z} = \lambda_1
x=unx = u_n,即z=Uun=[001]Tz = U^*u_n = \begin{bmatrix}0&0&\cdots&1\end{bmatrix}^T,故zΛzzz=λn\frac{z^*\Lambda z}{z^*z} = \lambda_n
以上式子表明:Rayleigh商的最大值与最小值分別发生于对应 λ1\lambda_1λn\lambda_n 的特征向量u1u_1unu_n

几何意义

Rayleigh定理的几何意义在于,如果限制xx为单位向量,λ1\lambda_1λn\lambda_n给出二次型XAxX^*Ax的最大值与最小值,考虑Ii=[000100]TI_i = \begin{bmatrix}0&0&\cdots&0&1&0&\cdots&0\end{bmatrix}^T,其中第ii个元素为11,则

eiAei=aiie_i^*Ae_i = a_{ii}

Rayleigh定理有这个必然结果:Hermitian矩阵A=[aij]A = [a_{ij}]的任意元素主对角元也落在λ1\lambda_1λn\lambda_n之间,即λnaiiλi\lambda_n \leq a_{ii} \leq \lambda_{i},可以利用此性质估计Hermitian矩阵的最大特征值的下界和最小特征值的上界。

A=[123254349]A = \begin{bmatrix} 1&2&3 \\ 2&5&4 \\ 3&4&9 \end{bmatrix}

即可知道AA的最大特征值9\geq 9,最小特征值1\leq 1

其他特征值

我们可以通过Rayleigh定理求出Hermitian矩阵的最大和最小特征值,我们还可以求出其他特征值λ2,,λn1\lambda_2, \cdots, \lambda_{n-1},我们考虑与u1u_1的正交补空间里的向量xx,既有xu1x \perp u_1,则

z=Ux=[u1u2un]=(0z2zn)Tz = U^*x = \begin{bmatrix}u_1&u_2&\cdots&u_n\end{bmatrix} = \begin{pmatrix}0 & z_2 & \cdots & z_n\end{pmatrix}^T

zΛzzz=λ2z22++λnzn2z12++zn2λ2(z22++zn2)z12++zn2=λ2\frac{z^*\Lambda z}{z^*z} = \frac{\lambda_2|z_2|^2 + \cdots + \lambda_n|z_n|^2}{|z_1|^2 + \cdots + |z_n|^2} \leq \frac{\lambda_2(|z_2|^2 + \cdots + |z_n|^2)}{|z_1|^2 + \cdots + |z_n|^2} = \lambda_2

也就可以得出

max(xAxxx)=λ2,(xu1,x0)max(\frac{x^*Ax}{x^*x}) = \lambda_2, (x \perp u_1, x \neq 0)

分割定理

ARn×nA \in \mathbb{R}^{n \times n}的Hermitian矩阵,B=UHAUB = U^HAU,其中UUn×(n1)n\times (n-1)阶矩阵,且UHU=In1U^HU=I_{n-1},即

U=[0TIn1]=[000100010001]U = \begin{bmatrix} 0^T \\ I_{n-1} \end{bmatrix}=\begin{bmatrix} 0 & 0 & \cdots & 0 \\ 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}

再设AABB的特征值分别为

λ1λ2λnμ1μ2μn\begin{aligned} \lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n \\ \mu_1 \geq \mu_2 \geq \cdots \geq \mu_n \end{aligned}

λ1μ1λ2μuμn1λn\lambda_1 \geq \mu_1 \geq \lambda_2 \geq \mu_u \geq \cdots \geq \mu_{n-1} \geq \lambda_n