Inversion with SVD

Authors
Affiliations
University of British Columbia
University of British Columbia

Fundamental understanding about non-uniqueness and ill-conditioning of the linear inverse problem is readily achieved with the aid of Singular Value Decomposition (SVD). In this chapter we show how this decomposition ……

Recall

The Inverse Problem from Inverse Theory Overview

Given observations diobs, i=1,,Nd^{obs}_i, ~i=1,\dots,N, uncertainties ϵj\epsilon_j and the ability to forward model F[m]=d\mathcal{F}[m]=d

Find the earth model mm that gave rise to the data

The Linear Inverse Problem from Linear Tikhonov Inversion

di=vgi(x)m(x)dxgi(x)=eipxcos(2πiqx)\begin{aligned} d_i&=\int_vg_i(x)m(x)dx \\ g_i(x)&=e^{ipx}cos(2\pi iqx) \end{aligned}

where djd_j is the ith datum, gjg_j the kernel function for the ith datum, and mm is the model

Each datum is an inner product of the model with the kernel function. If we evaluate one case: di=gm=4.89d_i=\mathbf{g}\cdot\mathbf{m}=4.89

Basic Challenges of the Inverse Problem: Non-uniqueness

The data are defined such that

di=j=1MGijmjd=Gm,   G:(N×M)\begin{aligned} d_i&=\sum^M_{j=1}G_{ij}m_j \\ \mathbf{d}&=\mathbf{Gm}, ~~~G:(N\times M)\end{aligned}

Typically M>NM>N creating an underdetermined problem, which has infinitely many solutions. The inverse problem in this case is also ill-conditioned and small changes in the data will yield large changes in the model.

In this chapter we consider the question: How do we understand and deal with these elements?

Solving Gm=d\mathbf{Gm}=\mathbf{d}

Given Gm=d\mathbf{Gm}=\mathbf{d} and knowing GRN×M\mathbf{G}\in\R^{N\times M}, mRM\mathbf{m}\in\R^M, and dRN\mathbf{d}\in\R^N, we want to write m=G1d\mathbf{m}=\mathbf{G}^{-1}\mathbf{d}, but G\mathbf{G} is not sparse and G1\mathbf{G}^{-1} does not exist.

For example, if we write Gm=dGm=d as m1+2m2=2m_1+2m_2=2 such that G=[1,2]G=[1,2], d=2d=2, and m=(m1,m2)Tm=(m_1,m_2)^T and consider m=G1dm=G^{-1}d.

Singular Value Decomposition

G=UpΛVpT\mathbf{G}=\mathbf{U}_p\mathbf{\Lambda}\mathbf{V}^T_p

di=vgi(x)m(x)dxd_i=\int_vg_i(x)m(x)dx

Gp=( g1  gN )\mathbf{G}_p =\begin{pmatrix} ~-& \mathbf{g}_1 & -~ \\ & \vdots & \\ ~-& \mathbf{g}_N & - ~\end{pmatrix}, where G\mathbf{G} is a (M×N)(M\times N) matrix

Up=(u1up)\mathbf{U}_p =\begin{pmatrix} |& & | \\ \mathbf{u}_1 & \dots & \mathbf{u}_p\\ |& & | \end{pmatrix}, where Up\mathbf{U}_p is a (N×P)(N\times P) matrix and ui\mathbf{u}_i are the singular vectors of the data space (UpTUp=Ip)(\mathbf{U}_p^T\mathbf{U}_p=\mathbf{I}_p)

Λp=(λ1λp)\mathbf{\Lambda}_p =\begin{pmatrix} \lambda_1& & \\ & \ddots & \\ & & \lambda_p\\\end{pmatrix}, where Λp\mathbf{\Lambda}_p is a (P×P)(P\times P) matrix and λi\lambda_i are the singular values (λ1λ2λp>0)(\lambda_1\geq\lambda_2\geq\dots\lambda_p>0)

Vp=(v1vp)\mathbf{V}_p =\begin{pmatrix} |& & | \\ \mathbf{v}_1 & \dots & \mathbf{v}_p\\ |& & | \end{pmatrix}, where Vp\mathbf{V}_p is a (M×P)(M\times P) matrix and vi\mathbf{v}_i are the singular vectors of the model space (VpTVp=Ip)(\mathbf{V}^T_p\mathbf{V}_p=\mathbf{I}_p)

m\mathbf{m}^{\parallel}: activated portion of the model space

m\mathbf{m}^{\perp}: annihilator space

Solving with SVD

Gm=d, G=UΛVTUΛVTm=dΛVT=UTdVTm=Λ1UTdVVTm=VΛ1UTd\begin{aligned}\mathbf{Gm}&=\mathbf{d}, ~\mathbf{G}=\mathbf{U\Lambda V}^T \\ \mathbf{U\Lambda V}^T\mathbf{m}&=\mathbf{d}\\ \mathbf{\Lambda V}^T&=\mathbf{U}^T\mathbf{d}\\ \mathbf{V}^T\mathbf{m}&=\mathbf{\Lambda}^{-1}\mathbf{U}^T\mathbf{d}\\ \mathbf{VV}^T\mathbf{m}&=\mathbf{V}\mathbf{\Lambda}^{-1}\mathbf{U}^T\mathbf{d}\end{aligned}

Define G=VΛ1UT, mc=Gd\mathbf{G}^\dagger=\mathbf{V\Lambda}^{-1}\mathbf{U}^T, ~\mathbf{m}_c=\mathbf{G}^\dagger\mathbf{d}

m=mc=VVTm=VΛ1UTd\mathbf{m}^\parallel=\mathbf{m}_c=\mathbf{VV}^T\mathbf{m}=\mathbf{V\Lambda}^{-1}\mathbf{U}^T\mathbf{d}

mc=VΛ1UTd=i=1N(uiTdλi)vi\mathbf{m}_c=\mathbf{V\Lambda}^{-1}\mathbf{U^T}\mathbf{d}=\sum^N_{i=1}\left(\frac{\mathbf{u}_i^T\mathbf{d}}{\lambda_i}\right)\mathbf{v}_i

Effects: successively add vectors

Each vector vi\mathbf{v}_i is scaled by uiTdλi\frac{\mathbf{u}_i^T\mathbf{d}}{\lambda_i}\rightarrow\infty if λi0\lambda_i\rightarrow0

ϕd=Gmd2\phi_d=\|\mathbf{Gm}-\mathbf{d}\|^2

ϕm=m2\phi_m=\|\mathbf{m}\|^2

mc=i=1p(uiTdλi)vi\mathbf{m}_c=\sum^p_{i=1}\left(\frac{\mathbf{u}_i^T\mathbf{d}}{\lambda_i}\right)\mathbf{v}_i

Ill-conditionedness

mc=i=1p(uiTdλi)vi\mathbf{m}_c=\sum^p_{i=1}\left(\frac{\mathbf{u}_i^T\mathbf{d}}{\lambda_i}\right)\mathbf{v}_i

Small singular value \leftrightarrow highly oscillatory, large amplitude of noise

Truncated SVD

If data are inaccurate the noise is also amplified by 1λi\frac{1}{\lambda_i}

mc=i=1q(uiTdλi)vi+i=q+1N(uiTdλi)vi\mathbf{m}_c=\sum^q_{i=1}\left(\frac{\mathbf{u}_i^T\mathbf{d}}{\lambda_i}\right)\mathbf{v}_i + \sum^N_{i=q+1}\left(\frac{\mathbf{u}_i^T\mathbf{d}}{\lambda_i}\right)\mathbf{v}_i but the second term causes more harm than good

So we simply use mc=i=1q(uiTdλi)vi\mathbf{m}_c=\sum^q_{i=1}\left(\frac{\mathbf{u}_i^T\mathbf{d}}{\lambda_i}\right)\mathbf{v}_i, meaning the solution lies in a small sub-space.

Truncated SVD treats/takes care of issues caused by non-uniqueness and ill-conditioning

Consider Tikhonov Solution

Data misfit ϕd=Gmd2\phi_d=\|\mathbf{Gm}-\mathbf{d}\|^2

Regularization (Model norm?) ϕm=m2\phi_m=\|\mathbf{m}\|^2

Minimize ϕ=ϕd+βϕm=Gmd2+βmc2\begin{aligned}\phi&=\phi_d+\beta\phi_m\\ &=\|\mathbf{Gm}-\mathbf{d}\|^2+\beta\|\mathbf{m}_c\|^2\end{aligned}

Minimum g=mϕ=0\nabla\mathbf{g}=\nabla_m\phi=0

Differentiate with respect to vector ϕm=2GT(Gmd)+2βm=0\frac{\partial\phi}{\partial\mathbf{m}}=2\mathbf{G}^T(\mathbf{Gm}-\mathbf{d})+2\beta\mathbf{m}=0

So (GTG+βI)m=GTd(\mathbf{G}^T\mathbf{G}+\beta\mathbf{I})\mathbf{m}=\mathbf{G}^T\mathbf{d}, where GTG\mathbf{G}^T\mathbf{G} is a rank deficient (M×M)(M\times M) matrix, where rank(G)N, M>Nrank(\mathbf{G})\leq N, ~M>N, but β>0\beta>0, so adding βI\beta\mathbf{I} makes all the Eigen values β\geq\beta

Solve Am=b\mathbf{Am}=\mathbf{b}, letting A=GTG+βI\mathbf{A}=\mathbf{G}^T\mathbf{G}+\beta\mathbf{I} and b=GTd\mathbf{b}=\mathbf{G}^T\mathbf{d}

Tikhonov solution v TSVD

Tikhonov, TSVD, weighted TSVD

(GTG+βI)m=GTd(\mathbf{G}^T\mathbf{G}+\beta\mathbf{I})\mathbf{m}=\mathbf{G}^T\mathbf{d}

Exact correspondence is obtained by modifying to include weights

mc=VTΛ1Ud\mathbf{m}_c=\mathbf{VT\Lambda}^{-1}\mathbf{Ud}

mc=i=1pti(uiTdλi)vi\mathbf{m}_c=\sum^p_{i=1}t_i\left(\frac{\mathbf{u}_i^T\mathbf{d}}{\lambda_i}\right)\mathbf{v}_i

T=(t1tp),  0ti1\mathbf{T} =\begin{pmatrix} t_1& & \\ & \ddots & \\ & & t_p \end{pmatrix},~~0\leq t_i\leq 1

ti=λi2λi2+βt_i=\frac{\lambda_i^2}{\lambda_i^2+\beta}

Summary

Gm=d\mathbf{Gm}=\mathbf{d}

G=UΛVT\mathbf{G}=\mathbf{U\Lambda V}^T

mc=i=1pti(uiTdλi)vi\mathbf{m}_c=\sum^p_{i=1}t_i\left(\frac{\mathbf{u}_i^T\mathbf{d}}{\lambda_i}\right)\mathbf{v}_i

Small singular values are associated with high frequency vectors in the model space and amplify the noise

Instabilities are handled by truncating, adjusting their influence (ti=λi2λi2+βt_i=\frac{\lambda_i^2}{\lambda_i^2+\beta}), which is exactly the same as solving the Tikhonov problem.

Minimize ϕ=ϕd+βϕm=Gmd2+βmc2\begin{aligned}\phi&=\phi_d+\beta\phi_m\\ &=\|\mathbf{Gm}-\mathbf{d}\|^2+\beta\|\mathbf{m}_c\|^2\end{aligned}

What does the solution tell us?

A full data set can recover information about m\mathbf{m}^{\parallel}

The regularized solution is in a reduced region of the model space

The geophysical model lies outside of this region. To explore that we need to incorporate more information.