Introduction

Newton's method, also known as the Newton-Raphson method, is an iterative algorithm used to find the roots of a real-valued function. It can also be extended to find the minimum or maximum of a function. In this tutorial, we will explore the mathematical concepts behind Newton's method and provide Python code examples for both root-finding and optimization problems.

Finding Roots

Consider a real-valued function $f(x)$. The goal is to find a value $x^*$ such that $f(x^*) = 0$. Newton's method starts with an initial guess $x_0$ and iteratively updates it using the following formula:

$$ x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} $$

where $f'(x_n)$ is the derivative of $f(x)$ evaluated at $x_n$. The update comes from approximating $f$ by its tangent line at $x_n$ and taking the point where that line crosses zero as the next iterate.

Example: Single Variable Function

Let's find the root of the function $f(x) = x^3 - 2x - 5$.

def f(x):
    return x**3 - 2*x - 5

def f_prime(x):
    return 3*x**2 - 2

def newton_method(f, f_prime, x0, tolerance=1e-6, max_iterations=100):
    """Find a root of f starting from the initial guess x0."""
    x = x0
    for _ in range(max_iterations):
        fx = f(x)
        if abs(fx) < tolerance:  # stop once the residual is small enough
            return x
        fpx = f_prime(x)
        if fpx == 0:  # a zero derivative makes the update undefined
            raise ZeroDivisionError("f'(x) is zero; Newton's method cannot proceed")
        x = x - fx / fpx  # Newton update: x_{n+1} = x_n - f(x_n) / f'(x_n)
    return x

x0 = 2  # Initial guess
root = newton_method(f, f_prime, x0)
print(f"The root is approximately {root:.6f}")

Output:

The root is approximately 2.094551

Optimization

Newton's method can be extended to find the minimum or maximum of a function. In this case, we iteratively update the point $x_n$ using the following formula:

$$ x_{n+1} = x_n - \frac{f'(x_n)}{f''(x_n)} $$

where $f'(x_n)$ and $f''(x_n)$ are the first and second derivatives of $f(x)$ evaluated at $x_n$. This is simply the root-finding iteration applied to $f'(x) = 0$, so it converges to a stationary point of $f$; whether that point is a minimum or a maximum depends on the sign of $f''$ there.
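
To make this concrete, here is a minimal sketch for a single-variable problem. The test function $g(x) = x^4 - 3x^3 + 2$, its derivatives, and the helper newton_optimize are illustrations chosen for this example rather than anything defined earlier in the text.

def g(x):
    return x**4 - 3*x**3 + 2

def g_prime(x):
    return 4*x**3 - 9*x**2

def g_double_prime(x):
    return 12*x**2 - 18*x

def newton_optimize(g_prime, g_double_prime, x0, tolerance=1e-6, max_iterations=100):
    """Find a stationary point of g by applying Newton's method to g'."""
    x = x0
    for _ in range(max_iterations):
        gpx = g_prime(x)
        if abs(gpx) < tolerance:  # stop once the gradient is nearly zero
            return x
        x = x - gpx / g_double_prime(x)  # x_{n+1} = x_n - g'(x_n) / g''(x_n)
    return x

x0 = 3.0  # initial guess
x_star = newton_optimize(g_prime, g_double_prime, x0)
print(f"Stationary point near {x_star:.6f}")

Starting from $x_0 = 3$, the iteration converges to the stationary point $x = 2.25$, and since $g''(2.25) > 0$ this is a local minimum.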

Multivariate Optimization

For a multivariate function $f(x_1, x_2, \ldots, x_n)$, the gradient vector $\nabla f(x)$ and the Hessian matrix $H(x)$ are defined as follows:

Gradient vector: $\nabla f(x) = \begin{bmatrix} \frac{\partial f}{\partial x_1} \\ \frac{\partial f}{\partial x_2} \\ \vdots \\ \frac{\partial f}{\partial x_n} \end{bmatrix}$

Hessian matrix: $H(x) = \begin{bmatrix} \frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\ \frac{\partial^2 f}{\partial x_2 \partial x_1} & \frac{\partial^2 f}{\partial x_2^2} & \cdots & \frac{\partial^2 f}{\partial x_2 \partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial x_n \partial x_1} & \frac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_n^2} \end{bmatrix}$
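
As a small illustration of these definitions (the function $f(x_1, x_2) = x_1^2 + x_1 x_2 + 3x_2^2$ is chosen here for demonstration, not taken from the text), the gradient and Hessian can be written out directly:

import numpy as np

# f(x1, x2) = x1**2 + x1*x2 + 3*x2**2  (illustrative choice)
def grad_f(x):
    # partial derivatives: [2*x1 + x2, x1 + 6*x2]
    return np.array([2*x[0] + x[1], x[0] + 6*x[1]])

def hessian_f(x):
    # the Hessian of a quadratic function is constant
    return np.array([[2.0, 1.0],
                     [1.0, 6.0]])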

Newton's method for optimization in the multivariate case iteratively updates the point $x_n$ using the following formula: