class GradientDescent(maxiter=100, learning_rate=0.01, tol=1e-07, callback=None, perturbation=None)[source]

The gradient descent minimization routine.

For a function $$f$$ and an initial point $$\vec\theta_0$$, the standard (or "vanilla") gradient descent method is an iterative scheme that seeks a minimum $$\vec\theta^*$$ of $$f$$ by updating the parameters in the direction of the negative gradient of $$f$$

$\vec\theta_{n+1} = \vec\theta_{n} - \eta\,\vec\nabla f(\vec\theta_{n}),$

for a small learning rate $$\eta > 0$$.
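The update rule can be sketched in a few lines of NumPy. This is a standalone illustration, not the class's actual implementation: the helper name gradient_descent and the test function are hypothetical, and the stopping test on the norm of the parameter update follows the description of tol in the parameter list.

```python
import numpy as np


def gradient_descent(grad, theta0, learning_rate=0.01, maxiter=100, tol=1e-7):
    """Plain gradient descent (illustrative sketch, not the library code)."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(maxiter):
        # theta_{n+1} = theta_n - eta * grad f(theta_n)
        update = learning_rate * grad(theta)
        theta = theta - update
        # Stop once the parameter update is smaller than the tolerance.
        if np.linalg.norm(update) < tol:
            break
    return theta


# f(theta) = |theta|^2 / 2 has gradient theta and its minimum at the origin.
theta_star = gradient_descent(lambda t: t, [1.0, -2.0], learning_rate=0.1, maxiter=500)
```

A larger learning rate shortens the run on this quadratic, but for general functions a rate that is too large can overshoot the minimum.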

You can either provide the analytic gradient $$\vec\nabla f$$ as gradient_function in the optimize method or, if you do not provide it, let the optimizer use a finite difference approximation of the gradient. To adapt the size of the perturbation in the finite difference gradients, set the perturbation argument in the initializer.
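A symmetric (central) finite difference approximation of the gradient could look like the sketch below. The function name finite_difference_gradient is hypothetical, and the exact scheme used internally may differ in detail; the default perturbation of 0.01 is taken from the parameter description.

```python
import numpy as np


def finite_difference_gradient(f, theta, perturbation=0.01):
    """Central-difference gradient estimate (illustrative sketch)."""
    theta = np.asarray(theta, dtype=float)
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        # Shift only the i-th parameter, once forward and once backward.
        shift = np.zeros_like(theta)
        shift[i] = perturbation
        grad[i] = (f(theta + shift) - f(theta - shift)) / (2 * perturbation)
    return grad


# Example: f(t) = t0^2 + 3 * t1 has gradient (2 * t0, 3).
f = lambda t: t[0] ** 2 + 3 * t[1]
approx = finite_difference_gradient(f, [1.0, 2.0])
```

Each gradient estimate of this kind costs two function evaluations per parameter, which is why supplying an analytic gradient usually reduces the total number of evaluations.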

This optimizer supports a callback function. If provided in the initializer, the optimizer calls the callback in each iteration with the following information, in this order: current number of function evaluations, current parameters, current function value, norm of the current gradient.
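A callback with this argument order might simply record the optimization history. The sketch below drives it from a hand-written loop rather than from the optimizer itself; the names store_history, f and grad_f are hypothetical, and counting one function evaluation per iteration is an assumption made only for this illustration.

```python
import numpy as np

history = []


def store_history(nfevs, parameters, fvalue, gradient_norm):
    """Callback taking the four values in the documented order."""
    history.append((nfevs, np.array(parameters), fvalue, gradient_norm))


def f(theta):
    return float(np.sum(theta ** 2))


def grad_f(theta):
    return 2 * theta


theta = np.array([1.0, 1.0])
nfevs = 0
for _ in range(5):
    gradient = grad_f(theta)
    theta = theta - 0.1 * gradient
    nfevs += 1  # assumption: one evaluation of f per iteration
    store_history(nfevs, theta, f(theta), np.linalg.norm(gradient))
```

After the loop, history holds one tuple per iteration, which is convenient for plotting the function value against the number of evaluations.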

Examples

A minimal example that uses finite difference gradients with the default perturbation of 0.01 and the default learning rate of 0.01.

An example where the learning rate is an iterator and we supply the analytic gradient. Note how much faster this converges (i.e. fewer nfevs) compared to the previous example.
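The idea of a learning rate given as a callable that returns an iterator can be illustrated without the library. In the sketch below, decaying_learning_rate and gradient_descent are hypothetical names, the decay schedule is an arbitrary choice, and the loop is a simplified stand-in for the optimizer rather than its actual code.

```python
import itertools

import numpy as np


def decaying_learning_rate():
    """Generator function yielding a slowly shrinking learning rate."""
    eta = 0.5
    while True:
        yield eta
        eta *= 0.99


def gradient_descent(grad, theta0, learning_rate, maxiter=100, tol=1e-7):
    """Gradient descent accepting a float or a callable returning an iterator."""
    theta = np.asarray(theta0, dtype=float)
    # A constant is turned into an endless iterator of the same value.
    etas = learning_rate() if callable(learning_rate) else itertools.repeat(learning_rate)
    for _, eta in zip(range(maxiter), etas):
        update = eta * grad(theta)
        theta = theta - update
        if np.linalg.norm(update) < tol:
            break
    return theta


# Analytic gradient of f(theta) = |theta|^2 / 2 is theta itself.
theta_star = gradient_descent(lambda t: t, [1.0, -2.0], learning_rate=decaying_learning_rate)
```

Passing the generator function itself, rather than an already started generator, lets every optimization run begin with a fresh schedule.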

Parameters
• maxiter (int) – The maximum number of iterations.

• learning_rate (Union[float, Callable[[], Iterator]]) – A constant or generator yielding learning rates for the parameter updates. See the docstring for an example.

• tol (float) – If the norm of the parameter update is smaller than this threshold, the optimizer is considered converged.

• perturbation (Optional[float]) – If no gradient is passed to GradientDescent.optimize, the gradient is approximated with a symmetric finite difference scheme that uses a step of size perturbation in both directions (defaults to 1e-2 if required). Ignored if a gradient callable is passed to GradientDescent.optimize.

Methods

Attributes

