Problem 2. Answer the following questions and show your work for questions b. and c. a. In value iteration, let k be the

answerhappygod
Site Admin
Posts: 899604
Joined: Mon Aug 02, 2021 8:13 am


Post by answerhappygod »

Problem 2. Answer the following questions and show your work for questions b. and c.

a. In value iteration, let k be the iteration index. Write the formula to update Q_k(s,a) from R(s,a,s'), T(s,a,s'), V_{k-1}(s'), and γ, and write the formula to compute V_k(s) from Q_k(s,a).

Q_k(s,a) =

V_k(s) =

b. Consider the MDP with transition model and reward function as given in the table below. Triangle is player MAX. Assume the discount factor γ = 1. Given that V_0(s) = 0 for both states, fill in the values for V_1, V_2, Q_1, Q_2 in the figure below.

s    a    s'    T(s,a,s')    R(s,a,s')
A    1    A     0            0
A    1    B     1            0
A    2    A     1            1
A    2    B     0            0
A    3    A     0.5          0
A    3    B     0.5          0
B    1    A     0.5          10
B    1    B     0.5          0
B    2    A     1            0
B    2    B     0            0
B    3    A     0.5          2
B    3    B     0.5          4

[Figure: tree with layers, from top to bottom, V_2 (nodes A, B); Q_2 (nodes (A,1), (A,2), (A,3), (B,1), (B,2), (B,3)); V_1 (nodes A, B); Q_1 (nodes (A,1), (A,2), (A,3), (B,1), (B,2), (B,3)); V_0 (nodes A, B).]
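One way to check the hand calculation for part b is a short script. The sketch below uses the standard value-iteration updates, Q_k(s,a) = sum over s' of T(s,a,s') * [R(s,a,s') + γ * V_{k-1}(s')] and V_k(s) = max over a of Q_k(s,a), with γ = 1 and V_0 = 0 as stated in the problem. The names MODEL, q_update, v_from_q, and value_iteration are made up for this illustration. The T and R numbers are read from the problem's tables, and it is an assumption that the first block of rows belongs to state A and the second to state B.

```python
# Sketch of two value-iteration sweeps for the two-state MDP in part b.
# Assumption: the first block of table rows is for s = A, the second for s = B.

GAMMA = 1.0          # discount factor given in the problem
STATES = ["A", "B"]
ACTIONS = [1, 2, 3]

# MODEL[(s, a)] lists one (s_next, T(s,a,s_next), R(s,a,s_next)) tuple per row.
MODEL = {
    ("A", 1): [("A", 0.0, 0.0), ("B", 1.0, 0.0)],
    ("A", 2): [("A", 1.0, 1.0), ("B", 0.0, 0.0)],
    ("A", 3): [("A", 0.5, 0.0), ("B", 0.5, 0.0)],
    ("B", 1): [("A", 0.5, 10.0), ("B", 0.5, 0.0)],
    ("B", 2): [("A", 1.0, 0.0), ("B", 0.0, 0.0)],
    ("B", 3): [("A", 0.5, 2.0), ("B", 0.5, 4.0)],
}


def q_update(v_prev):
    """Q_k(s,a) = sum_s' T(s,a,s') * (R(s,a,s') + GAMMA * V_{k-1}(s'))."""
    return {
        (s, a): sum(t * (r + GAMMA * v_prev[s_next]) for s_next, t, r in rows)
        for (s, a), rows in MODEL.items()
    }


def v_from_q(q):
    """V_k(s) = max_a Q_k(s,a); the MAX player picks the best action."""
    return {s: max(q[(s, a)] for a in ACTIONS) for s in STATES}


def value_iteration(num_iters):
    """Return [(Q_1, V_1), (Q_2, V_2), ...] starting from V_0(s) = 0."""
    v = {s: 0.0 for s in STATES}
    history = []
    for _ in range(num_iters):
        q = q_update(v)
        v = v_from_q(q)
        history.append((q, v))
    return history


if __name__ == "__main__":
    for k, (q, v) in enumerate(value_iteration(2), start=1):
        print(f"k = {k}")
        for (s, a), value in sorted(q.items()):
            print(f"  Q_{k}({s},{a}) = {value}")
        for s in STATES:
            print(f"  V_{k}({s}) = {v[s]}")
```

Under these assumptions the first sweep gives V_1(A) = 1 and V_1(B) = 5, and the second gives V_2(A) = 5 and V_2(B) = 8. If the table rows are assigned to the states differently, only the MODEL dictionary needs to change.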