Please don't provide what has been posted on answers because I
checked that and it's not even a little bit related to the problem.
Please help answer this. Thank you!
(approx. 25 pts) Problem 3 (A Limited-Memory Version of Adagrad): In this exercise, we investigate a limited-memory version of the gradient descent method with adaptive diagonal scaling. The update of the method is given by

$$x^{k+1} = x^k - \alpha_k D_k^{-1} \nabla f(x^k), \tag{2}$$

where $\alpha_k \geq 0$ is a suitable step size and $D_k \in \mathbb{R}^{n \times n}$ is a diagonal matrix that is chosen as follows:

$$D_k = \operatorname{diag}(v_1^k, v_2^k, \dots, v_n^k) \quad \text{and} \quad v_i^k = \sqrt{\epsilon + \sum_{j=t_m(k)}^{k} \left(\nabla f(x^j)\right)_i^2}, \quad t_m(k) = \max\{0, k-m\}, \quad \forall\, i = 1, \dots, n.$$

Here, $\epsilon > 0$, the memory constant $m \in \mathbb{N}$, and the initial point $x^0 \in \mathbb{R}^n$ are given parameters.

a) Show that the direction $d^k = -D_k^{-1} \nabla f(x^k)$ is a descent direction for all $k \in \mathbb{N}$ (assuming $\nabla f(x^k) \neq 0$).
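For anyone who wants to sanity-check part a) numerically before writing the proof, here is a minimal Python sketch of the update. It rests on a few assumptions that are not in the post: the scaling is read as the Adagrad-style square root of epsilon plus the squared gradient components summed over j = max{0, k-m}, ..., k; the step size is held constant; and the quadratic test function and the name `limited_memory_adagrad` are made up for illustration. The idea behind part a) is that every diagonal entry of D_k is at least sqrt(epsilon) > 0, so the slope grad f(x^k)^T d^k is minus a sum of nonnegative terms g_i^2 / v_i, which is strictly negative whenever the gradient is nonzero. The code just records that slope at every iteration.

```python
from collections import deque

import numpy as np


def limited_memory_adagrad(grad, x0, alpha=0.1, eps=1e-8, m=5, iters=100):
    """Sketch of x^{k+1} = x^k - alpha * D_k^{-1} grad f(x^k).

    Assumptions (not from the original post): constant step size alpha,
    and v_i^k = sqrt(eps + sum of squared i-th gradient components over
    the last m+1 iterates j = max(0, k-m), ..., k).
    """
    x = np.asarray(x0, dtype=float)
    # Holds the squared gradients for j = max(0, k-m), ..., k (at most m+1 items);
    # older entries drop out automatically.
    window = deque(maxlen=m + 1)
    slopes = []  # grad f(x^k)^T d^k at each iteration
    for _ in range(iters):
        g = grad(x)
        if not np.any(g):
            break  # stationary point: the exercise assumes a nonzero gradient
        window.append(g * g)
        v = np.sqrt(eps + np.stack(list(window)).sum(axis=0))  # diagonal of D_k
        d = -g / v  # d^k = -D_k^{-1} grad f(x^k)
        slopes.append(float(g @ d))
        x = x + alpha * d
    return x, slopes


# Hypothetical test problem: f(x) = 0.5 * (x_1^2 + 10 * x_2^2).
A = np.array([1.0, 10.0])
x_final, slopes = limited_memory_adagrad(lambda x: A * x, [1.0, -2.0], m=3, iters=50)
print(all(s < 0 for s in slopes))
```

Every recorded slope being negative is consistent with d^k being a descent direction, but it is only a check on one example and does not replace the argument asked for in a).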
answerhappygod