Solution of a linear system

 

Solution of a linear system[edit]

The steepest descent algorithm applied to the Wiener filter[11]

Gradient descent can be used to solve a system of linear equations

reformulated as a quadratic minimization problem. If the system matrix  is real symmetric and positive-definite, an objective function is defined as the quadratic function, with minimization of

so that

For a general real matrix linear least squares define

In traditional linear least squares for real  and  the Euclidean norm is used, in which case

The line search minimization, finding the locally optimal step size  on every iteration, can be performed analytically for quadratic functions, and explicit formulas for the locally optimal  are known.[5][12]

For example, for real symmetric and positive-definite matrix , a simple algorithm can be as follows,[5]

To avoid multiplying by  twice per iteration, we note that  implies , which gives the traditional algorithm,[13]

The method is rarely used for solving linear equations, with the conjugate gradient method being one of the most popular alternatives. The number of gradient descent iterations is commonly proportional to the spectral condition number  of the system matrix  (the ratio of the maximum to minimum eigenvalues of ), while the convergence of conjugate gradient method is typically determined by a square root of the condition number, i.e., is much faster. Both methods can benefit from preconditioning, where gradient descent may require less assumptions on the preconditioner.[13]

Solution of a non-linear system[edit]

Gradient descent can also be used to solve a system of nonlinear equations. Below is an example that shows how to use the gradient descent to solve for three unknown variables, x1x2, and x3. This example shows one iteration of the gradient descent.

Consider the nonlinear system of equations

Let us introduce the associated function

where

One might now define the objective function

which we will attempt to minimize. As an initial guess, let us use

We know that

where the Jacobian matrix  is given by

We calculate:

Thus

and

An animation showing the first 83 iterations of gradient descent applied to this example. Surfaces are isosurfaces of  at current guess , and arrows show the direction of descent. Due to a small and constant step size, the convergence is slow.

Now, a suitable  must be found such that

This can be done with any of a variety of line search algorithms. One might also simply guess  which gives

Evaluating the objective function at this value, yields

The decrease from  to the next step's value of

is a sizable decrease in the objective function. Further steps would reduce its value further until an approximate solution to the system was found.


Comments