Linear Equations
A system of linear equations (or linear system) is a collection of one or more linear equations involving the same variables. Suppose there are $m$ equations in $n$ unknown variables,
$$ \begin{align*} a_{1,1} x_1 + a_{1,2} x_2 + \ldots + a_{1,n} x_n & = b_1 \\ a_{2,1} x_1 + a_{2,2} x_2 + \cdots + a_{2,n} x_n & = b_2 \\ \vdots & \\ a_{m,1} x_1 + a_{m,2} x_2 + \cdots + a_{m,n} x_n & = b_m. \end{align*} $$The system of linear equations can be written in matrix form ${Ax=b}$, where
$$ \begin{align*} A = \left( \begin{array}{cccc} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{array} \right), \quad x = \left( \begin{array}{c} x_1 \\ x_2 \\ \vdots \\ x_n \end{array} \right), \quad b = \left( \begin{array}{c} b_1 \\ b_2 \\ \vdots \\ b_m \end{array} \right), \end{align*} $$so that $A \in \mathbb{R}^{m \times n}$, $x \in \mathbb{R}^n$ and $b \in \mathbb{R}^m$.
A square matrix $A$ is symmetric if ${A=A^T}$, that is, $a_{i,j}=a_{j,i}$ for all indices $i$ and $j$.
A square matrix is said to be Hermitian if it is equal to its conjugate transpose, written $A=A^H$; that is, $a_{i,j}=\overline{a_{j,i}}$ for all indices $i$ and $j$.
For a nonsingular matrix, the following all hold (a quick numerical check is sketched after this list):
- A nonsingular matrix has full rank.
- A square matrix is nonsingular if and only if its determinant is non-zero.
- If a matrix is singular, Gaussian elimination (with or without pivoting) eventually encounters a zero pivot, and the elimination step fails with a division by zero.
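As a quick illustration of these properties, NumPy can report the rank and determinant of a matrix and solve the corresponding system; a minimal sketch (the $2 \times 2$ matrices are just examples):

```python
import numpy as np

# A nonsingular matrix: full rank, non-zero determinant, unique solution of Ax = b.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
b = np.array([1.0, 2.0])
print(np.linalg.matrix_rank(A))   # 2 -> full rank
print(np.linalg.det(A))           # 10.0 -> non-zero
print(np.linalg.solve(A, b))      # unique solution

# A singular matrix: rank deficient, determinant zero (up to rounding).
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])
print(np.linalg.matrix_rank(S))   # 1
print(np.linalg.det(S))           # 0.0
```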
Direct Methods
Gaussian elimination is a method to solve systems of linear equations based on forward elimination (a series of row-wise operations) that converts the matrix $A$ to upper triangular (echelon) form, followed by back substitution to solve the system. The row operations are:
- row swapping
- row scaling, i.e. multiplying by a non-zero scalar
- row addition, i.e. adding a multiple of one row to another
Forward elimination is written as
1 For $k=1$ to $n-1$:
2 $\hspace{2cm}$ For $i = k + 1$ to $n$:
3 $\hspace{4cm}$ Set $m = \dfrac{a_{i,k}}{a_{k,k}}$
4 $\hspace{4cm}$ For $j = k$ to $n$:
5 $\hspace{6cm} a_{i,j} = a_{i,j} - m \, a_{k,j}$
6 $\hspace{4cm}$ End
7 $\hspace{4cm} b_{i} = b_{i} - m \, b_{k}$
8 $\hspace{2cm}$ End
9 End
The multiplier $m$ is stored before row $i$ is updated, so that every column update and the update of $b_i$ use the original value of $a_{i,k}$.
Then back substitute, starting with the last unknown:
10 Set $x_n = \dfrac{b_n}{a_{n,n}}$
11 For $i = n-1$ to $1$:
12 $\hspace{2cm} y = b_i$
13 $\hspace{2cm}$ For $j = n$ to $i+1$:
14 $\hspace{4cm} y = y - a_{i,j} x_j$
15 $\hspace{2cm}$ End
16 $\hspace{2cm} x_i = \dfrac{y}{a_{i,i}}$
17 End
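The two loops translate directly into NumPy; a minimal sketch with no pivoting, so it assumes every pivot $a_{k,k}$ is non-zero (the function name and test system are illustrative):

```python
import numpy as np

def gauss_solve(A, b):
    """Solve Ax = b by forward elimination and back substitution (no pivoting)."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)

    # Forward elimination: zero the entries below each pivot a[k, k].
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]        # multiplier for row i
            A[i, k:] -= m * A[k, k:]     # update row i
            b[i] -= m * b[k]             # update the right-hand side

    # Back substitution, starting with the last unknown.
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        y = b[i] - A[i, i + 1:] @ x[i + 1:]
        x[i] = y / A[i, i]
    return x

A = np.array([[4.0, 1.0], [2.0, 3.0]])
b = np.array([1.0, 2.0])
print(gauss_solve(A, b))                 # matches np.linalg.solve(A, b)
```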
A pivot element is the entry of the matrix $A$ that is used, at each elimination step, to eliminate the entries below it. Choosing a large pivot (pivoting) helps reduce errors due to rounding. Scaled partial pivoting proceeds as follows:
1 Find the vector of maximal absolute row values $\vec{s}$ with entries $s_i= \max\limits_{j=1,\ldots,n} \left| a_{i,j} \right|$
2 For $k=1$ to $n-1$:
3 $\hspace{2cm}$ For $i = k$ to $n$:
4 $\hspace{4cm}$ Compute $\left| \dfrac{a_{i,k}}{s_i} \right|$
5 $\hspace{2cm}$ End
6 $\hspace{2cm}$ Find row with largest relative pivot element, denote this as row $j$
7 $\hspace{2cm}$ Swap rows $k$ and $j$
8 $\hspace{2cm}$ Swap entries $k$ and $j$ in vector $\vec{s}$
9 $\hspace{2cm}$ Eliminate the entries below the pivot $a_{k,k}$ using row $k$ (the forward elimination step for column $k$)
10 End
The matrix will be in row-echelon form, so that back substitution can now be performed.
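A NumPy sketch of elimination with scaled partial pivoting, in the same style as above; the $2 \times 2$ test system (with its tiny leading coefficient) is only illustrative:

```python
import numpy as np

def gauss_scaled_pivot(A, b):
    """Gaussian elimination with scaled partial pivoting, then back substitution."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    s = np.max(np.abs(A), axis=1)              # scale vector: largest |a_ij| per row

    for k in range(n - 1):
        # Row with the largest pivot relative to its scale ...
        j = k + np.argmax(np.abs(A[k:, k]) / s[k:])
        # ... is swapped (together with its scale) into position k.
        A[[k, j]] = A[[j, k]]
        b[[k, j]] = b[[j, k]]
        s[[k, j]] = s[[j, k]]

        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]
            A[i, k:] -= m * A[k, k:]
            b[i] -= m * b[k]

    # Back substitution on the row-echelon form.
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

A = np.array([[0.003, 59.14], [5.291, -6.13]])
b = np.array([59.17, 46.78])
print(gauss_scaled_pivot(A, b))                # close to [10., 1.]
```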
Let ${A \in \mathbb{R}^{n \times n}}$ be invertible, and suppose forward elimination can be carried out without row swaps. Then there exists a decomposition ${A=LU}$, where $L$ is a lower triangular matrix and $U$ is an upper triangular matrix, and
$$ L = U_1^{-1} U_2^{-1} \cdots U_{n-1}^{-1}, $$where each matrix $U_i$ describes the $i$-th step of forward elimination. The upper triangular matrix $U$ is given by
$$ U = U_{n-1} \cdots U_2 U_1 A. $$If $A$ is in addition symmetric positive definite, the factorization can be taken in the special form ${A = L L^T}$ (the Cholesky factorization). Cholesky's algorithm to compute the entries of the lower triangular matrix $L$ can be written as
1 For $i=1$ to $n$:
2 $\hspace{2cm}$ For $j = 1$ to $i-1$:
3 $\hspace{4cm}$ Set $y = a_{i,j}$
4 $\hspace{4cm}$ For $k = 1$ to $j-1$:
5 $\hspace{6cm}$ $y = y - l_{i,k} l_{j,k}$
6 $\hspace{4cm}$ End
7 $\hspace{4cm}$ $l_{i,j} = y / l_{j,j}$
8 $\hspace{2cm}$ End
9 $\hspace{2cm}$ Set $y = a_{i,i}$
10 $\hspace{2cm}$ For $k= 1$ to $i-1$:
11 $\hspace{4cm}$ $y = y - l_{i,k} l_{i,k}$
12 $\hspace{2cm}$ End
13 $\hspace{2cm}$ If $y \le 0$:
14 $\hspace{4cm}$ Exit, as $A$ is not positive definite and has no Cholesky factorization
15 $\hspace{2cm}$ Else:
16 $\hspace{4cm}$ $l_{i,i} = \sqrt{y}$
17 $\hspace{2cm}$ End
18 End
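A direct NumPy translation of this loop, as a sketch; in practice `np.linalg.cholesky` computes the same factor:

```python
import numpy as np

def cholesky(A):
    """Lower triangular L with A = L @ L.T, mirroring the loop above."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    L = np.zeros((n, n))
    for i in range(n):
        for j in range(i):                        # off-diagonal entries of row i
            y = A[i, j] - L[i, :j] @ L[j, :j]
            L[i, j] = y / L[j, j]
        y = A[i, i] - L[i, :i] @ L[i, :i]         # diagonal entry
        if y <= 0:
            raise ValueError("matrix is not positive definite")
        L[i, i] = np.sqrt(y)
    return L

A = np.array([[4.0, 2.0], [2.0, 3.0]])            # symmetric positive definite
L = cholesky(A)
print(np.allclose(L @ L.T, A))                    # True
```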
Indirect Methods
For a non-singular matrix $A$, consider the iterative scheme
$$ Q x_{k +1}= \left( Q - A \right) x_{k} + b, $$where $Q$ is an invertible matrix chosen so that this system is easy to solve for $x_{k+1}$. The scheme is equivalent to
$$ x_{k +1} = \left( I - Q^{-1} A \right) x_{k} + Q^{-1} b. $$The spectral radius of a matrix $A$ is defined as
$$ \rho \left( A \right) = \max \left\{ \left| \lambda_1 \right|, \left| \lambda_2 \right|, \ldots, \left| \lambda_n \right| \right\}, $$where the $\lambda_i$ are the eigenvalues of the matrix. The iteration above converges for every starting vector $x_0$ if and only if $\rho\left( I - Q^{-1} A \right) < 1$.
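The spectral radius of the iteration matrix $I - Q^{-1} A$ is easy to check numerically; a small sketch with NumPy, using the Jacobi-style choice $Q = D$ on an illustrative matrix:

```python
import numpy as np

def spectral_radius(M):
    """Largest magnitude among the eigenvalues of M."""
    return np.max(np.abs(np.linalg.eigvals(M)))

A = np.array([[4.0, 1.0], [2.0, 3.0]])
Q = np.diag(np.diag(A))                    # Q = D
G = np.eye(2) - np.linalg.solve(Q, A)      # iteration matrix I - Q^{-1} A
print(spectral_radius(G))                  # about 0.41 < 1, so the iteration converges
```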
Let $Q=I$, then Richardson iteration computes the sequence of vectors
$$ x_{k +1} = \left(I - A \right) x_{k} + b. $$This sequence converges to $x = A^{-1} b$ for every starting vector precisely when $\rho\left(I - A\right) < 1$.
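A minimal Richardson iteration sketch; the matrix is chosen so that $\rho(I - A) < 1$ and the sequence converges:

```python
import numpy as np

def richardson(A, b, iters=50):
    """Richardson iteration x_{k+1} = (I - A) x_k + b, starting from x_0 = 0."""
    x = np.zeros_like(b)
    for _ in range(iters):
        x = x + (b - A @ x)                # same as (I - A) x + b
    return x

A = np.array([[1.0, 0.2], [0.1, 0.9]])     # eigenvalues of A close to 1
b = np.array([1.0, 2.0])
print(richardson(A, b), np.linalg.solve(A, b))
```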
The modified Richardson iteration scales the identity, ${Q = \tfrac{1}{\omega} I}$, so that
$$ x_{k +1}= x_{k} + \omega \left( b - A x_{k} \right). $$The Jacobi iteration scheme has $Q=D$, so
$$ x_{k +1} = \left(I - D^{-1} A \right) x_{k} + D^{-1} b, $$where $D$ is the diagonal matrix containing the diagonal elements of $A$.
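A Jacobi sweep in NumPy, as a sketch on a small diagonally dominant system:

```python
import numpy as np

def jacobi(A, b, iters=100):
    """Jacobi iteration: Q = D, the diagonal of A."""
    d = np.diag(A)                         # diagonal entries as a vector
    R = A - np.diag(d)                     # off-diagonal part L + U
    x = np.zeros_like(b)
    for _ in range(iters):
        x = (b - R @ x) / d                # solve D x_{k+1} = b - (L + U) x_k
    return x

A = np.array([[4.0, 1.0], [2.0, 3.0]])     # diagonally dominant
b = np.array([1.0, 2.0])
print(jacobi(A, b), np.linalg.solve(A, b))
```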
A matrix $A \in \mathbb{R}^{n \times n}$ is said to be diagonally dominant if, in every row, the absolute value of the diagonal element is greater than or equal to the sum of the magnitudes of all the other elements in that row, i.e.
$$ \left| a_{i,i} \right| \ge \sum\limits_{\substack{j=1 \\ j \neq i}}^n \left| a_{i,j} \right| \quad \text{for all } i = 1, \ldots, n. $$If the inequality is strict in every row ($A$ is strictly diagonally dominant), both the Jacobi and the Gauss-Seidel iterations converge. Let $Q=L + D$; then the Gauss-Seidel scheme is given by
$$ \begin{equation*} (D + L) x_{k+1} = -U x_{k} + b, \end{equation*} $$where $L$ is the strictly lower triangular part of $A$, $U$ is the strictly upper triangular part, and $D$ is the diagonal matrix of $A$.
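A corresponding Gauss-Seidel sketch; `np.linalg.solve` is used on the lower triangular system only to keep the code short (a dedicated triangular solver would be used in practice):

```python
import numpy as np

def gauss_seidel(A, b, iters=100):
    """Gauss-Seidel: solve (D + L) x_{k+1} = -U x_k + b on each sweep."""
    DL = np.tril(A)                        # D + L: lower triangle with the diagonal
    U = np.triu(A, k=1)                    # strictly upper triangle
    x = np.zeros_like(b)
    for _ in range(iters):
        x = np.linalg.solve(DL, b - U @ x)
    return x

A = np.array([[4.0, 1.0], [2.0, 3.0]])
b = np.array([1.0, 2.0])
print(gauss_seidel(A, b), np.linalg.solve(A, b))
```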
Let $Q=L + \dfrac{1}{\omega}D$; then the method of successive over-relaxation (SOR) is given by
$$ \begin{equation*} (D + \omega L) x_{k+1} = -\left( \left(\omega - 1 \right)D + \omega U \right) x_{k} + \omega b. \end{equation*} $$
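And an SOR sketch in the same style; the relaxation factor $\omega = 1.1$ and the test system are only illustrative, and $\omega = 1$ recovers Gauss-Seidel:

```python
import numpy as np

def sor(A, b, omega=1.1, iters=100):
    """SOR sweep: (D + omega L) x_{k+1} = -((omega - 1) D + omega U) x_k + omega b."""
    D = np.diag(np.diag(A))
    L = np.tril(A, k=-1)                   # strictly lower triangle
    U = np.triu(A, k=1)                    # strictly upper triangle
    M = D + omega * L
    N = -((omega - 1.0) * D + omega * U)
    x = np.zeros_like(b)
    for _ in range(iters):
        x = np.linalg.solve(M, N @ x + omega * b)
    return x

A = np.array([[4.0, 1.0], [2.0, 3.0]])
b = np.array([1.0, 2.0])
print(sor(A, b), np.linalg.solve(A, b))
```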
Fundamental Theorem of Numerical Analysis
A numerical method is said to be stable if and only if any initial error $e_0$ is damped during the iterations, i.e. ${\left\| e_k \right\| < \left\| e_0 \right\| }$ for $k \ge 1$.
Note that $\Vert x \Vert$ denotes a vector norm, for example the Euclidean norm ${\Vert x \Vert_2 = \sqrt{x_1^2 + x_2^2 + \ldots + x_n^2} }$.
For linear systems, a fixed point, $x^{\ast}$, fulfils
$$ x^{\ast} = \left( I - Q^{-1} A \right) x^{\ast} + Q^{-1} b \Leftrightarrow A x^{\ast} = b. $$The error $e_k = x_k - x^{\ast}$ of the iterative scheme satisfies
$$ e_k = \left( I - Q^{-1} A \right)^k e_0, $$so the method is stable, i.e. ${\Vert e_k \Vert < \Vert e_0 \Vert}$, whenever ${\Vert I - Q^{-1} A \Vert < 1}$, where ${\Vert I - Q^{-1} A \Vert}$ is a matrix norm compatible with the vector norm.
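As a closing check, the stability condition can be evaluated numerically for the Jacobi splitting $Q = D$ on the illustrative system used earlier:

```python
import numpy as np

A = np.array([[4.0, 1.0], [2.0, 3.0]])
Q = np.diag(np.diag(A))
G = np.eye(2) - np.linalg.solve(Q, A)        # iteration matrix I - Q^{-1} A

print(np.linalg.norm(G, 2))                  # induced 2-norm, below 1 here
print(np.max(np.abs(np.linalg.eigvals(G))))  # spectral radius, also below 1
```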