(a) (2 points) When computing the reward for an MDP, what does a higher discount factor mean? (b) (2 points) What is the

Business, Finance, Economics, Accounting, Operations Management, Computer Science, Electrical Engineering, Mechanical Engineering, Civil Engineering, Chemical Engineering, Algebra, Precalculus, Statistics and Probabilty, Advanced Math, Physics, Chemistry, Biology, Nursing, Psychology, Certifications, Tests, Prep, and more.
Post Reply
answerhappygod
Site Admin
Posts: 899604
Joined: Mon Aug 02, 2021 8:13 am

(a) (2 points) When computing the reward for an MDP, what does a higher discount factor mean? (b) (2 points) What is the

Post by answerhappygod »

A 2 Points When Computing The Reward For An Mdp What Does A Higher Discount Factor Mean B 2 Points What Is The 1
A 2 Points When Computing The Reward For An Mdp What Does A Higher Discount Factor Mean B 2 Points What Is The 1 (72.23 KiB) Viewed 40 times
(a) (2 points) When computing the reward for an MDP, what does a higher discount factor mean? (b) (2 points) What is the difference between the V function (VH) and the Q function (Q+) of a policy ? (c) (2 points) Although policy evaluation takes less time than value iteration, what can be an advantage of using value iteration over policy evaluation? Consider the n state MDP below. In the state n, there is just one action that collects the reward of +10. For every other state there are two actions: right which moves determinis- tically one step to the next state, or reset which moves back to state 1. There is a reward factor of +1 for right and 0 for reset. The discount factor is y = +10 -n-1 2 +1 n +1 +1 (a) (2 points) What is the optimal policy Topt? (e) (2 points) What is the optimal value of state n (Vope(n))? (f) (4 points) Compute the optimal value function, Vopt(k) for all k = 1,...,n-1, in terms of Vopt(n). (g) (6 points) Suppose you are doing value iteration to compute the values V(k) for all k <n. You start with all value estimates equal to 0. Show the values for all k after 1 and 2 iterations respectively.
Join a community of subject matter experts. Register for FREE to view solutions, replies, and use search function. Request answer by replying!
Post Reply