GradientDescent
- class GradientDescent(maxiter=100, learning_rate=0.01, tol=1e-07, callback=None, perturbation=None)
Bases: Optimizer
The gradient descent minimization routine.
For a function \(f\) and an initial point \(\vec\theta_0\), the standard (or "vanilla") gradient descent method is an iterative scheme to find the minimum \(\vec\theta^*\) of \(f\) by updating the parameters in the direction of the negative gradient of \(f\)
\[\vec\theta_{n+1} = \vec\theta_{n} - \eta\,\vec\nabla f(\vec\theta_{n}),\] for a small learning rate \(\eta > 0\).
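The update rule can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: the helper name `gradient_descent`, the quadratic test function, and the stopping rule on the norm of the update are all chosen for this example.

```python
import math

def gradient_descent(grad, theta0, learning_rate=0.01, maxiter=100, tol=1e-7):
    """Hypothetical helper: iterate theta <- theta - eta * grad(theta)."""
    theta = list(theta0)
    iteration = 0
    for iteration in range(1, maxiter + 1):
        # Step along the negative gradient, scaled by the learning rate.
        update = [learning_rate * g for g in grad(theta)]
        theta = [t - u for t, u in zip(theta, update)]
        # Assumed stopping rule: stop once the update norm drops below tol.
        if math.sqrt(sum(u * u for u in update)) < tol:
            break
    return theta, iteration

# Test function f(x, y) = (x - 1)^2 + (y + 2)^2 with its analytic gradient.
def grad_f(theta):
    x, y = theta
    return [2.0 * (x - 1.0), 2.0 * (y + 2.0)]

theta_star, n_iter = gradient_descent(grad_f, [0.0, 0.0], learning_rate=0.1, maxiter=500)
```

Here `theta_star` approaches the minimizer `(1, -2)` and `n_iter` reports how many updates were performed before the update norm fell below `tol`.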
You can either provide the analytic gradient \(\vec\nabla f\) as gradient_function in the optimize method or, if you do not provide it, the optimizer uses a finite difference approximation of the gradient. To adapt the size of the perturbation in the finite difference gradients, set the perturbation property in the initializer.
This optimizer supports a callback function. If provided in the initializer, the optimizer calls the callback in each iteration with the following information, in this order: the current number of function evaluations, the current parameters, the current function value, and the norm of the current gradient.
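As a rough illustration of both ideas, the sketch below approximates the gradient with a symmetric (central) difference and reports progress through a callback that takes the four values in the order listed. The function names and the way function evaluations are counted are assumptions made for this example; the library's internal bookkeeping may differ.

```python
import math

def finite_difference_gradient(f, theta, perturbation=0.01):
    """Central-difference estimate of the gradient of f at theta."""
    grad = []
    for i in range(len(theta)):
        plus, minus = list(theta), list(theta)
        plus[i] += perturbation
        minus[i] -= perturbation
        grad.append((f(plus) - f(minus)) / (2.0 * perturbation))
    return grad

def minimize_with_callback(f, theta0, learning_rate=0.1, maxiter=50, callback=None):
    """Hypothetical loop: gradient descent with finite-difference gradients."""
    theta = list(theta0)
    nfev = 0  # assumed convention: count every call to f
    for _ in range(maxiter):
        grad = finite_difference_gradient(f, theta)
        nfev += 2 * len(theta)  # two evaluations per parameter
        theta = [t - learning_rate * g for t, g in zip(theta, grad)]
        if callback is not None:
            value = f(theta)
            nfev += 1
            grad_norm = math.sqrt(sum(g * g for g in grad))
            # Order: evaluations so far, parameters, function value, gradient norm.
            callback(nfev, theta, value, grad_norm)
    return theta

history = []

def record(nfev, params, value, grad_norm):
    history.append((nfev, list(params), value, grad_norm))

def f(theta):
    return (theta[0] - 1.0) ** 2 + (theta[1] + 2.0) ** 2

result = minimize_with_callback(f, [0.0, 0.0], callback=record)
```

After the run, `history` holds one tuple per iteration, which is a convenient way to plot the function value or gradient norm against the number of evaluations.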
Examples
A minimal example that uses finite difference gradients with a default perturbation of 0.01 and a default learning rate of 0.01.
An example where the learning rate is an iterator and we supply the analytic gradient. Note how much faster this converges (i.e. fewer nfevs) compared to the previous example.
- Parameters
maxiter (int) -- The maximum number of iterations.
learning_rate (Union[float, Callable[[], Iterator]]) -- A constant or generator yielding learning rates for the parameter updates. See the docstring for an example.
tol (float) -- If the norm of the parameter update is smaller than this threshold, the optimizer is considered converged.
perturbation (Optional[float]) -- If no gradient is passed to GradientDescent.optimize, the gradient is approximated with a symmetric finite difference scheme with a perturbation of size perturbation in both directions (defaults to 1e-2 if required). Ignored if a gradient callable is passed to GradientDescent.optimize.
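The learning_rate type above admits either a constant or a zero-argument callable that returns an iterator. The following sketch shows one way such an argument could be consumed. The helpers `learning_rate_iterator`, `decaying_rate` and `descend` are hypothetical names for this illustration and are not part of the library.

```python
from itertools import repeat

def learning_rate_iterator(learning_rate):
    """Turn a float or a zero-argument factory into an iterator of rates."""
    if callable(learning_rate):
        return learning_rate()
    return repeat(float(learning_rate))

def decaying_rate(initial=0.5, decay=0.9):
    """Hypothetical schedule: eta_n = initial * decay**n."""
    def factory():
        eta = initial
        while True:
            yield eta
            eta *= decay
    return factory

def descend(grad, theta, learning_rate, maxiter=100):
    """Run maxiter updates, drawing one learning rate per iteration."""
    etas = learning_rate_iterator(learning_rate)
    for _ in range(maxiter):
        eta = next(etas)
        theta = [t - eta * g for t, g in zip(theta, grad(theta))]
    return theta

# Gradient of f(x) = (x - 3)^2, minimum at x = 3.
def grad_f(theta):
    return [2.0 * (theta[0] - 3.0)]

constant_result = descend(grad_f, [0.0], 0.01)
scheduled_result = descend(grad_f, [0.0], decaying_rate(initial=0.4, decay=0.95))
```

On this toy problem the decaying schedule ends much closer to the minimum than the constant rate of 0.01 within the same number of iterations, although the best schedule depends on the function being minimized.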
Methods
Get the support level dictionary.
Compute the gradient by numeric differentiation, in parallel, around the point x_center.
Minimize the scalar function.
Print algorithm-specific options.
Set max evals grouped
Sets or updates values in the options dictionary.
Wrap the function to implicitly inject the args at the call of the function.
Attributes
- bounds_support_level
Returns bounds support level
- gradient_support_level
Returns gradient support level
- initial_point_support_level
Returns initial point support level
- is_bounds_ignored
Returns is bounds ignored
- is_bounds_required
Returns is bounds required
- is_bounds_supported
Returns is bounds supported
- is_gradient_ignored
Returns is gradient ignored
- is_gradient_required
Returns is gradient required
- is_gradient_supported
Returns is gradient supported
- is_initial_point_ignored
Returns is initial point ignored
- is_initial_point_required
Returns is initial point required
- is_initial_point_supported
Returns is initial point supported
- setting
Return setting
- settings
- Return type
Dict[str, Any]