5. (10 points) Given the following racing MDP model: States: Cool, Warm, Overheated Actions: Slow, Fast Transition funct
Posted: Thu Jun 30, 2022 7:50 pm
5. (10 points) Given the following racing MDP model:

States: Cool, Warm, Overheated
Actions: Slow, Fast
Transition function: as shown in the figure
Reward function: the signed numbers in the figure

[Figure: transition diagram over the states Cool, Warm, and Overheated with the actions Slow and Fast. Labels that survive extraction: probabilities 1.0 and 0.5, rewards +2 and -10. A table with rows V₂, V₁, V₀ accompanies the diagram; the numbers 2, 1, and 0 appear near the state labels Cool, Warm, and Overheated.]

1) (4 points) Write the value iteration equation.
2) (6 points) Given V₀ and discount = 1, what are the values of V₁ and V₂ for each state? Please show the details of the calculation.
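
For anyone working through this, the standard value iteration update is V_{k+1}(s) = max over a of the sum over s' of T(s, a, s') * [R(s, a, s') + discount * V_k(s')]. Below is a minimal Python sketch of two such updates. The figure is not readable here, so the transition table is an assumption: it is the commonly used racing example (Slow pays +1, Fast pays +2, overheating pays -10), which may differ from the model in the question. V0 is also assumed to be all zeros, and the names `T` and `bellman_update` are made up for illustration.

```python
# ASSUMED model: the figure did not survive, so this is the commonly used
# racing example, not necessarily the one in the question.
# T[state][action] = list of (probability, next_state, reward)
T = {
    "Cool": {
        "Slow": [(1.0, "Cool", 1.0)],
        "Fast": [(0.5, "Cool", 2.0), (0.5, "Warm", 2.0)],
    },
    "Warm": {
        "Slow": [(0.5, "Cool", 1.0), (0.5, "Warm", 1.0)],
        "Fast": [(1.0, "Overheated", -10.0)],
    },
    "Overheated": {},  # terminal state: no actions available
}


def bellman_update(V, gamma=1.0):
    """Apply one value iteration step and return the new value table."""
    new_V = {}
    for state, actions in T.items():
        if not actions:
            # Terminal state: its value is fixed at 0.
            new_V[state] = 0.0
            continue
        # Expected return of each action, then take the best one.
        new_V[state] = max(
            sum(p * (r + gamma * V[nxt]) for p, nxt, r in outcomes)
            for outcomes in actions.values()
        )
    return new_V


# ASSUMED starting values; replace with the V0 given in the question.
V0 = {"Cool": 0.0, "Warm": 0.0, "Overheated": 0.0}
V1 = bellman_update(V0, gamma=1.0)
V2 = bellman_update(V1, gamma=1.0)
print("V1:", V1)
print("V2:", V2)
```

Under those assumptions the sketch gives V1 = 2, 1, 0 and V2 = 3.5, 2.5, 0 for Cool, Warm, Overheated. For example, V2(Cool) takes the better of Slow, 1.0 * (1 + 2) = 3, and Fast, 0.5 * (2 + 2) + 0.5 * (2 + 1) = 3.5. If the V0 or the transitions in the actual figure differ, the numbers change, but the update step stays the same.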