Abstract: An adaptive dual averaging (AdaDA) method is proposed, which brings AdaGrad's adaptive-matrix technique from stochastic gradient descent, where it reduces the hyperparameter search required in engineering practice, into the dual averaging framework. The AdaGrad adaptive matrix is introduced into the dual averaging method to form the adaptive dual averaging method, and its feasibility and convergence behavior are verified through convex optimization experiments. The mathematical derivation shows that, for general convex functions in the nonsmooth setting, AdaDA attains an optimal dimension-dependent individual convergence rate of O(1/√t), providing theoretical support for the method.
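To make the combination concrete, the following is a minimal sketch of dual averaging equipped with an AdaGrad-style diagonal adaptive matrix, in the spirit of the AdaDA method described above. It is an illustrative implementation under stated assumptions, not the paper's exact algorithm: the unconstrained dual-averaging update `x_{t+1} = x_0 − η H_t^{-1} Σ g_s` is used, with `H_t = diag(√(Σ g_s²)) + δI`; the function names `ada_da` and `subgrad` are hypothetical.

```python
import numpy as np

def ada_da(subgrad, x0, steps, eta=1.0, delta=1e-8):
    """Sketch of adaptive dual averaging: dual averaging whose proximal
    term is weighted by the AdaGrad diagonal adaptive matrix H_t."""
    x0 = np.asarray(x0, dtype=float)
    x = x0.copy()
    g_sum = np.zeros_like(x)   # running sum of subgradients (dual average)
    sq_sum = np.zeros_like(x)  # running sum of squared subgradients
    for t in range(1, steps + 1):
        g = subgrad(x)
        g_sum += g
        sq_sum += g * g
        H = np.sqrt(sq_sum) + delta  # AdaGrad diagonal matrix H_t
        # Unconstrained dual-averaging step with adaptive weighting
        x = x0 - eta * g_sum / H
    return x

# Nonsmooth convex test problem: f(x) = ||x - x_star||_1,
# with subgradient sign(x - x_star).
x_star = np.array([1.0, -2.0, 0.5])
x_final = ada_da(lambda x: np.sign(x - x_star), np.zeros(3), steps=5000)
```

Because the objective is nonsmooth, the iterates oscillate around the minimizer with an amplitude that shrinks roughly like η/√t, consistent with the O(1/√t) individual rate discussed above.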