Solution of a linear system
Solution of a linear system[edit]
Gradient descent can be used to solve a system of linear equations
reformulated as a quadratic minimization problem. If the system matrix is real symmetric and positive-definite, an objective function is defined as the quadratic function, with minimization of
so that
For a general real matrix , linear least squares define
In traditional linear least squares for real and the Euclidean norm is used, in which case
The line search minimization, finding the locally optimal step size on every iteration, can be performed analytically for quadratic functions, and explicit formulas for the locally optimal are known.[5][12]
For example, for real symmetric and positive-definite matrix , a simple algorithm can be as follows,[5]
To avoid multiplying by twice per iteration, we note that implies , which gives the traditional algorithm,[13]
The method is rarely used for solving linear equations, with the conjugate gradient method being one of the most popular alternatives. The number of gradient descent iterations is commonly proportional to the spectral condition number of the system matrix (the ratio of the maximum to minimum eigenvalues of ), while the convergence of conjugate gradient method is typically determined by a square root of the condition number, i.e., is much faster. Both methods can benefit from preconditioning, where gradient descent may require less assumptions on the preconditioner.[13]
Solution of a non-linear system[edit]
Gradient descent can also be used to solve a system of nonlinear equations. Below is an example that shows how to use the gradient descent to solve for three unknown variables, x1, x2, and x3. This example shows one iteration of the gradient descent.
Consider the nonlinear system of equations
Let us introduce the associated function
where
One might now define the objective function
which we will attempt to minimize. As an initial guess, let us use
We know that
where the Jacobian matrix is given by
We calculate:
Thus
and
Now, a suitable must be found such that
This can be done with any of a variety of line search algorithms. One might also simply guess which gives
Evaluating the objective function at this value, yields
The decrease from to the next step's value of
is a sizable decrease in the objective function. Further steps would reduce its value further until an approximate solution to the system was found.
Comments
Post a Comment