Linear Equations

Definition: Systems of Linear Equations

A system of linear equations (or linear system) is a collection of one or more linear equations involving the same variables. A system of $m$ equations in $n$ unknowns $x_1, \ldots, x_n$ has the form

$$ \begin{align*} a_{1,1} x_1 + a_{1,2} x_2 + \cdots + a_{1,n} x_n & = b_1 \\ a_{2,1} x_1 + a_{2,2} x_2 + \cdots + a_{2,n} x_n & = b_2 \\ \vdots & \\ a_{m,1} x_1 + a_{m,2} x_2 + \cdots + a_{m,n} x_n & = b_m. \end{align*} $$

The system of linear equations can be written in matrix form ${Ax=b}$, where

$$ \begin{align*} A = \left( \begin{array}{cccc} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{array} \right), \quad x = \left( \begin{array}{c} x_1 \\ x_2 \\ \vdots \\ x_n \end{array} \right), \quad b = \left( \begin{array}{c} b_1 \\ b_2 \\ \vdots \\ b_m \end{array} \right), \end{align*} $$

so that $A \in \mathbb{R}^{m \times n}$, $x \in \mathbb{R}^n$ and $b \in \mathbb{R}^m$.

Definition: Banded Systems
A banded matrix is a matrix whose non-zero entries are confined to a diagonal band, comprising the main diagonal and zero or more diagonals on either side.
Definition: Symmetric Matrices

A square matrix $A$ is symmetric if ${A=A^T}$, that is, $a_{i,j}=a_{j,i}$ for all indices $i$ and $j$.

A square matrix is said to be Hermitian if the matrix is equal to its conjugate transpose, i.e. $a_{i,j}=\overline{a_{j,i}}$ for all indices $i$ and $j$. A Hermitian matrix is written as $A=A^H$.

Definition: Positive Definite Matrices
A matrix, $M$, is said to be positive definite if it is symmetric (or Hermitian) and all its eigenvalues are real and positive.
Definition: Nonsingular Matrices
A square matrix $A$ is non-singular (or invertible) if there exists a matrix $A^{-1}$ such that ${A^{-1}A = A A^{-1} = I,}$ where $I$ is the identity matrix.
Note: Properties of Nonsingular Matrices

For a nonsingular matrix, the following all hold:

  • A nonsingular matrix has full rank.
  • A square matrix is nonsingular if and only if its determinant is non-zero.
  • If a matrix is singular, both versions of Gaussian elimination (with and without pivoting) will fail due to division by a zero pivot, yielding a floating-point exception.
Definition: Residual
If $\tilde{x}$ is an approximate solution to the linear problem ${Ax=b}$, then the residual is defined as ${r = A \tilde{x}-b}$.
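
The residual is cheap to evaluate and is the usual way to judge an approximate solution without knowing the exact one. A small illustrative computation (the numbers are made up):

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
x_tilde = np.array([0.8, 1.4])   # here x_tilde is the exact solution, so r should vanish
r = A @ x_tilde - b              # residual r = A x_tilde - b
print(r)                         # [0. 0.]
```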

Direct Methods

Algorithm: Gaussian Elimination

Gaussian elimination is a method to solve systems of linear equations based on forward elimination (a series of row-wise operations) to convert the matrix, $A$, to upper triangular form (echelon form), and then back substitution to solve the system. The row operations are:

  • row swapping
  • row scaling, i.e. multiplying by a non-zero scalar
  • row addition, i.e. adding a multiple of one row to another

Forward elimination is written as

1  For $k=1$ to $n-1$:
2  $\hspace{2cm}$ For $i = k + 1$ to $n$:
3  $\hspace{4cm}$ Set $m = \dfrac{a_{i,k}}{a_{k,k}}$
4  $\hspace{4cm}$ For $j = k$ to $n$:
5  $\hspace{6cm} a_{i,j} = a_{i,j} - m \, a_{k,j}$
6  $\hspace{4cm}$ End
7  $\hspace{4cm} b_{i} = b_{i} - m \, b_{k}$
8  $\hspace{2cm}$ End
9  End

Then back substitute, starting with the last unknown:

10 Set $x_n = \dfrac{b_n}{a_{n,n}}$
11 For $i = n-1$ down to $1$:
12 $\hspace{2cm} y = b_i$
13 $\hspace{2cm}$ For $j = i+1$ to $n$:
14 $\hspace{4cm} y = y - a_{i,j} x_j$
15 $\hspace{2cm}$ End
16 $\hspace{2cm} x_i = \dfrac{y}{a_{i,i}}$
17 End
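
The two stages above translate directly into NumPy. The sketch below omits pivoting and assumes every pivot $a_{k,k}$ is non-zero; the function name and test system are illustrative.

```python
import numpy as np

def gaussian_elimination(A, b):
    """Solve Ax = b by forward elimination and back substitution.
    Minimal sketch without pivoting; assumes all pivots are non-zero."""
    A = np.array(A, dtype=float)
    b = np.array(b, dtype=float)
    n = len(b)
    # Forward elimination: zero out the entries below the pivot in each column
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]          # multiplier for row i
            A[i, k:] -= m * A[k, k:]
            b[i] -= m * b[k]
    # Back substitution on the resulting upper triangular system
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        y = b[i] - A[i, i + 1:] @ x[i + 1:]
        x[i] = y / A[i, i]
    return x

A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
print(gaussian_elimination(A, b))   # agrees with np.linalg.solve(A, b)
```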

Algorithm: Gaussian Elimination with Scaled Partial Pivoting

A pivot element is the entry of the matrix $A$ that is used at a given step to eliminate the entries below it. Pivoting, i.e. choosing which row supplies the pivot, helps reduce errors due to rounding.

1  Compute the scale vector $s$ with entries $s_i = \max_{j=1,\ldots,n}\left| a_{i,j}\right|$
2  For $k=1$ to $n-1$:
3  $\hspace{2cm}$ For $i = k$ to $n$:
4  $\hspace{4cm}$ Compute $\left| \dfrac{a_{i,k}}{s_i} \right|$
5  $\hspace{2cm}$ End
6  $\hspace{2cm}$ Find row with largest relative pivot element, denote this as row $j$
7  $\hspace{2cm}$ Swap rows $k$ and $j$
8  $\hspace{2cm}$ Swap entries $k$ and $j$ in vector $s$
9  $\hspace{2cm}$ Perform step $k$ of forward elimination, eliminating the entries below the pivot $a_{k,k}$
10 End

The matrix will be in row-echelon form, so that back substitution can now be performed.
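
A rough NumPy sketch of the pivoting loop is given below; the function name is illustrative, and it returns the triangularised matrix and right-hand side ready for back substitution.

```python
import numpy as np

def forward_elimination_spp(A, b):
    """Forward elimination with scaled partial pivoting (illustrative sketch).
    At each step the row with the largest |a_ik| / s_i is moved into the pivot position."""
    A = np.array(A, dtype=float)
    b = np.array(b, dtype=float)
    n = len(b)
    s = np.max(np.abs(A), axis=1)                 # scale vector s_i = max_j |a_ij|
    for k in range(n - 1):
        # choose the row (from k onwards) with the largest relative pivot element
        j = k + np.argmax(np.abs(A[k:, k]) / s[k:])
        A[[k, j]] = A[[j, k]]                     # swap rows k and j
        b[[k, j]] = b[[j, k]]
        s[[k, j]] = s[[j, k]]                     # keep the scale vector aligned
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]
            A[i, k:] -= m * A[k, k:]
            b[i] -= m * b[k]
    return A, b   # A is now upper triangular; back substitution follows
```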

Definition: Upper and Lower Triangular Matrices
A square matrix is said to be lower triangular if all the elements above the main diagonal are zero, and upper triangular if all the entries below the main diagonal are zero.
Theorem: $LU$-Decomposition

Let ${A \in \mathbb{R}^{n \times n}}$ be invertible. Then there exists a decomposition of $A$ such that ${A=LU}$, where $L$ is a lower triangular matrix and $U$ is an upper triangular matrix, with

$$ L = U_1^{-1} U_2^{-1} \cdots U_{n-1}^{-1} $$

where each matrix $U_i$ describes the $i$-th step of forward elimination. The upper triangular matrix $U$ is given by

$$ U = U_{n-1} \cdots U_2 U_1 A. $$
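
The factors can be accumulated during forward elimination by storing each multiplier in $L$. A minimal sketch, assuming no pivoting is required (all pivots non-zero); the function name is illustrative.

```python
import numpy as np

def lu_decompose(A):
    """Compute A = L U by Gaussian elimination without pivoting (illustrative sketch).
    L collects the elimination multipliers; assumes non-zero pivots."""
    U = np.array(A, dtype=float)
    n = U.shape[0]
    L = np.eye(n)
    for k in range(n - 1):
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]     # multiplier used in elimination step k
            U[i, k:] -= L[i, k] * U[k, k:]
    return L, U

A = np.array([[4.0, 3.0], [6.0, 3.0]])
L, U = lu_decompose(A)
print(np.allclose(L @ U, A))   # True
```
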
Definition: Cholesky-Decomposition
A symmetric, positive definite matrix $A$ can be decomposed as $A=\tilde{L}\tilde{L}^T$, where ${\tilde{L} = L D^{1/2}}$, with $L$ and $D$ taken from the decomposition ${A = LDL^T}$; here $D$ is a diagonal matrix whose elements $d_i$ are all positive, so that $D^{1/2}$ has diagonal elements $\sqrt{d_i}$.
Algorithm: Cholesky Algorithm

Cholesky’s algorithm to compute the entries of the triangular matrix $L$ can be written as

1  For $i=1$ to $n$:
2  $\hspace{2cm}$ For $j = 1$ to $i-1$:
3  $\hspace{4cm}$ Set $y = a_{i,j}$
4  $\hspace{4cm}$ For $k = 1$ to $j-1$:
5  $\hspace{6cm}$ $y = y - l_{i,k} l_{j,k}$
6  $\hspace{4cm}$ End
7  $\hspace{4cm}$ $l_{i,j} = y / l_{j,j}$
8  $\hspace{2cm}$ End
9  $\hspace{2cm}$ Set $y = a_{i,i}$
10 $\hspace{2cm}$ For $k= 1$ to $i-1$:
11 $\hspace{4cm}$ $y = y - l_{i,k} l_{i,k}$
12 $\hspace{2cm}$ End
13 $\hspace{2cm}$ If $y \le 0$:
14 $\hspace{4cm}$ exit, as the matrix is not positive definite
15 $\hspace{2cm}$ Else:
16 $\hspace{4cm}$ $l_{i,i} = \sqrt{y}$
17 $\hspace{2cm}$ End
18 End
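
The pseudocode translates almost line for line into NumPy. The sketch below is illustrative; the final check against `np.linalg.cholesky` assumes the test matrix is symmetric positive definite.

```python
import numpy as np

def cholesky(A):
    """Cholesky factorisation A = L L^T for a symmetric positive definite matrix.
    Illustrative sketch following the pseudocode above; raises if A is not SPD."""
    n = A.shape[0]
    L = np.zeros_like(A, dtype=float)
    for i in range(n):
        for j in range(i):
            y = A[i, j] - L[i, :j] @ L[j, :j]
            L[i, j] = y / L[j, j]
        y = A[i, i] - L[i, :i] @ L[i, :i]
        if y <= 0:
            raise ValueError("matrix is not positive definite")
        L[i, i] = np.sqrt(y)
    return L

A = np.array([[4.0, 2.0], [2.0, 3.0]])
L = cholesky(A)
print(np.allclose(L @ L.T, A))                # True
print(np.allclose(L, np.linalg.cholesky(A)))  # matches NumPy's routine
```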

Indirect Methods

For a non-singular matrix $A$, consider the iterative scheme

$$ Q x_{k +1}= \left( Q - A \right) x_{k} + b. $$

This is equivalent to

$$ x_{k +1} = \left( I - Q^{-1} A \right) x_{k} + Q^{-1} b. $$
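
As a sketch, the update can be carried out by solving with $Q$ at each step; here `num_iters` is an illustrative fixed iteration count (in practice one would monitor the residual), and $Q$ is whichever splitting matrix a particular method prescribes.

```python
import numpy as np

def splitting_iteration(A, b, Q, x0, num_iters=50):
    """Generic iteration Q x_{k+1} = (Q - A) x_k + b, written as the
    equivalent correction step x_{k+1} = x_k + Q^{-1}(b - A x_k)."""
    x = np.array(x0, dtype=float)
    for _ in range(num_iters):
        x = x + np.linalg.solve(Q, b - A @ x)   # Q is chosen so this solve is cheap
    return x
```
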
Definition: Spectral Radius of a Matrix

The spectral radius of a matrix $A$ is defined as

$$ \rho \left( A \right) = \max \left\{ \left| \lambda_1 \right|, \left| \lambda_2 \right|, \ldots, \left| \lambda_n \right| \right\} $$

where the $\lambda_i$ are the eigenvalues of the matrix.

Theorem: Convergence
The iterative scheme converges if and only if the spectral radius of the matrix ${I - Q^{-1} A}$ is less than one, i.e. ${\rho\left( I - Q^{-1} A \right) \lt 1}$.
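
This criterion is easy to test numerically: form the iteration matrix $I - Q^{-1} A$ and take the largest eigenvalue magnitude. The matrices below are illustrative examples, not taken from the notes.

```python
import numpy as np

def spectral_radius(M):
    """Largest eigenvalue magnitude of a square matrix."""
    return np.max(np.abs(np.linalg.eigvals(M)))

A = np.array([[4.0, 1.0], [1.0, 3.0]])
Q = np.diag(np.diag(A))                    # a Jacobi-style splitting, Q = D
G = np.eye(2) - np.linalg.solve(Q, A)      # iteration matrix I - Q^{-1} A
print(spectral_radius(G) < 1)              # True, so this scheme converges
```
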
Definition: Richardson Iteration

Let $Q=I$, then Richardson iteration computes the sequence of vectors

$$ x_{k +1} = \left(I - A \right) x_{k} + b. $$

This sequence may converge to $x = A^{-1} b$, depending on $A$.

The modified Richardson iteration instead takes ${Q = \tfrac{1}{\omega} I}$, so that

$$ x_{k +1}= x_{k} + \omega \left( b - A x_{k} \right). $$
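
A minimal sketch of the modified Richardson iteration; the choice of $\omega$, the test system, and the fixed iteration count are illustrative.

```python
import numpy as np

def richardson(A, b, omega, x0=None, num_iters=100):
    """Modified Richardson iteration x_{k+1} = x_k + omega (b - A x_k)."""
    x = np.zeros_like(b, dtype=float) if x0 is None else np.array(x0, dtype=float)
    for _ in range(num_iters):
        x = x + omega * (b - A @ x)
    return x

A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
print(richardson(A, b, omega=0.4))   # approaches np.linalg.solve(A, b)
```
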
Definition: Jacobi Iteration

The Jacobi iteration scheme has $Q=D$, so

$$ x_{k +1} = \left(I - D^{-1} A \right) x_{k} + D^{-1} b, $$

where $D$ is the diagonal matrix formed from the diagonal elements of $A$.
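
Because $D$ is diagonal, applying $D^{-1}$ is just an element-wise division by the diagonal of $A$. A minimal sketch with an illustrative, diagonally dominant test system:

```python
import numpy as np

def jacobi(A, b, x0=None, num_iters=100):
    """Jacobi iteration x_{k+1} = x_k + D^{-1}(b - A x_k), with D = diag(A)."""
    d = np.diag(A)
    x = np.zeros_like(b, dtype=float) if x0 is None else np.array(x0, dtype=float)
    for _ in range(num_iters):
        x = x + (b - A @ x) / d
    return x

A = np.array([[4.0, 1.0], [2.0, 5.0]])     # diagonally dominant
b = np.array([1.0, 2.0])
print(np.allclose(jacobi(A, b), np.linalg.solve(A, b)))   # True
```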

Definition: Diagonally Dominant Matrices

A matrix $A \in \mathbb{R}^{n \times n}$ is said to be diagonally dominant if, for every row, the absolute value of the diagonal element is greater than or equal to the sum of the magnitudes of all the other elements in that row, i.e.

$$ \left| a_{i,i} \right| \ge \sum\limits_{\substack{j=1 \\ j \neq i}}^n \left| a_{i,j} \right| \quad \text{for all } i = 1, \ldots, n. $$
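The condition is straightforward to check row by row; the helper below (`is_diagonally_dominant` is an illustrative name) compares each diagonal magnitude with the sum of the off-diagonal magnitudes in its row.

```python
import numpy as np

def is_diagonally_dominant(A):
    """Check |a_ii| >= sum_{j != i} |a_ij| for every row."""
    diag = np.abs(np.diag(A))
    off_diag = np.sum(np.abs(A), axis=1) - diag
    return bool(np.all(diag >= off_diag))

print(is_diagonally_dominant(np.array([[4.0, 1.0, 2.0],
                                       [1.0, 3.0, 1.0],
                                       [0.0, 2.0, 5.0]])))   # True
```
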
Theorem: Convergence of Jacobi Scheme
If the matrix $A$ is diagonally dominant, then the Jacobi scheme converges for any initial guess $x_0$.
Definition: Gauss-Seidel Scheme

Let $Q=L + D$, then the Gauss-Seidel scheme is given by

$$ \begin{equation*} (D + L) x_{k+1} = -U x_{k} + b \end{equation*} $$

where $L$ is the strictly lower triangular part of $A$, $D$ is the diagonal part of $A$, and $U$ is the strictly upper triangular part, so that ${A = L + D + U}$.
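
In practice the triangular system $(D + L)x_{k+1} = -Ux_k + b$ is solved by sweeping through the components in order, reusing each updated entry immediately. A minimal sketch with a fixed number of sweeps (an illustrative stopping rule):

```python
import numpy as np

def gauss_seidel(A, b, x0=None, num_iters=100):
    """Gauss-Seidel sweep: updated components are reused within the same sweep."""
    n = len(b)
    x = np.zeros(n) if x0 is None else np.array(x0, dtype=float)
    for _ in range(num_iters):
        for i in range(n):
            sigma = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
            x[i] = (b[i] - sigma) / A[i, i]
    return x

A = np.array([[4.0, 1.0], [2.0, 5.0]])
b = np.array([1.0, 2.0])
print(np.allclose(gauss_seidel(A, b), np.linalg.solve(A, b)))   # True
```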

Theorem: Convergence of Gauss-Seidel Scheme
If the matrix $A$ is diagonally dominant, then the Gauss-Seidel scheme converges for any initial guess $x_0$.
Definition: Successive Over Relaxation (SOR)

Let $Q=L + \dfrac{1}{\omega}D$; then the method of successive over relaxation (SOR) is given by

$$ \begin{equation*} (D + \omega L) x_{k+1} = -\left( \left(\omega - 1 \right)D + \omega U \right) x_{k} + \omega b. \end{equation*} $$
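
Component-wise, SOR is a Gauss-Seidel sweep blended with the previous iterate via $\omega$. A minimal sketch; the value of $\omega$ and the test system are illustrative.

```python
import numpy as np

def sor(A, b, omega, x0=None, num_iters=100):
    """SOR sweep for (D + omega L) x_{k+1} = -((omega - 1) D + omega U) x_k + omega b,
    written component-wise so no explicit triangular solve is needed."""
    n = len(b)
    x = np.zeros(n) if x0 is None else np.array(x0, dtype=float)
    for _ in range(num_iters):
        for i in range(n):
            sigma = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
            x[i] = (1 - omega) * x[i] + omega * (b[i] - sigma) / A[i, i]
    return x

A = np.array([[4.0, 1.0], [2.0, 5.0]])
b = np.array([1.0, 2.0])
print(np.allclose(sor(A, b, omega=1.1), np.linalg.solve(A, b)))   # True
```
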
Theorem: Convergence of Successive Over Relaxation Scheme
Let $A$ be a symmetric matrix with positive entries on the diagonal and let ${\omega \in \left(0,2\right)}$. Then the method of successive over relaxation converges if and only if $A$ is positive definite.

Fundamental Theorem of Numerical Analysis

Definition: Stable

A numerical method is said to be stable if and only if any initial error $e_0$ is damped during the iterations, i.e. ${\left\| e_k \right\| < \left\| e_0 \right\| }$ for all ${k \ge 1}$.

Note that $\Vert x \Vert $ is a norm of a vector, such as ${\Vert x \Vert_2 = \sqrt{x_1^2 + x_2^2 + \ldots + x_n^2} }$.
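
For reference, common vector norms are available directly in NumPy; the values below are illustrative.

```python
import numpy as np

x = np.array([3.0, 4.0])
print(np.linalg.norm(x))          # 2-norm: 5.0
print(np.linalg.norm(x, np.inf))  # maximum norm: 4.0
```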

Definition: Consistent
A numerical method is said to be consistent if any fixed point $x^{\ast}$ of the iteration is a solution to the problem being solved.

For linear systems, a fixed point, $x^{\ast}$, fulfils

$$ x^{\ast} = \left( I - Q^{-1} A \right) x^{\ast} + Q^{-1} b \Leftrightarrow A x^{\ast} = b. $$

For an iterative method applied to a linear system, the error satisfies

$$ e_k = \left( I - Q^{-1} A \right)^k e_0, $$

so the method is stable, i.e. ${\Vert e_k \Vert < \Vert e_0 \Vert}$, provided ${\Vert I - Q^{-1} A \Vert < 1}$, where ${\Vert I - Q^{-1} A \Vert}$ is the induced matrix norm.

Definition: Convergent
A numerical method is said to be convergent if ${x_k \rightarrow x^{\ast}}$ as ${k \rightarrow \infty}$ where $x^{\ast}$ is the exact solution.
Theorem: Fundamental Theorem of Numerical Analysis
A numerical method is convergent if and only if it is consistent and stable.
Example:

An .ipynb notebook with an example of iterative solvers for linear systems can be accessed online [here].

It can be downloaded from [here] as a Python file, or as a notebook from [here].