Adagrad

Background

A basic goal in designing optimization algorithms is solving convex optimization problems. Gradient descent (GD) is commonly used for convex problems in which the objective is an empirical loss, i.e., a differentiable function of the form

$$f(w) = \frac{1}{n} \sum_{i=1}^{n} f_i(w),$$

where each $f_i$ is the loss on the $i$-th training example. GD minimizes $f$ with the iterative update $w_{t+1} = w_t - \eta \, \nabla f(w_t)$ for some step size $\eta > 0$.
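To make the setup concrete, here is a minimal sketch of plain gradient descent on an empirical loss. It assumes a least-squares loss on synthetic data; the function names, step size `eta`, and iteration count are illustrative choices, not part of any specific reference implementation.

```python
import numpy as np

def empirical_loss(w, X, y):
    """Empirical loss f(w) = (1/n) * sum_i (x_i . w - y_i)^2 (mean squared error)."""
    return np.mean((X @ w - y) ** 2)

def gradient(w, X, y):
    """Gradient of the mean squared error with respect to w."""
    n = X.shape[0]
    return (2.0 / n) * X.T @ (X @ w - y)

def gradient_descent(X, y, eta=0.1, num_steps=200):
    """Plain GD: repeatedly apply w <- w - eta * grad f(w)."""
    w = np.zeros(X.shape[1])
    for _ in range(num_steps):
        w -= eta * gradient(w, X, y)
    return w

# Usage: recover the weights of a small synthetic linear model.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5])
w_hat = gradient_descent(X, y)
print(empirical_loss(w_hat, X, y))  # should be close to 0
```

A fixed step size like this is exactly what adaptive methods such as Adagrad replace with per-coordinate step sizes derived from past gradients.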