Reinforcement Learning (a) (4 points) What is the difference between Model-based and Model-free methods for rein- forcem


Post by answerhappygod »

Reinforcement Learning

(a) (4 points) What is the difference between model-based and model-free methods for reinforcement learning?

(b) (12 points) Consider a system with two states and two actions. You perform actions and observe the rewards and transitions listed below. Each step lists the current state, reward, action, and resulting transition as S_i; R = r; a_k: S_i → S_j. Apply the Q-learning algorithm with a learning rate of α = 0.5 and a discount factor of γ = 0.5. The Q-learning equation for each step t+1 is given below:

Q_{t+1}(a, s) ← Q_t(a, s) + α(R(s) + γ[max_{a'} Q_t(a', s')])

The Q-table entries for step t = 0 are initialized to zero.

Q     S1    S2
a1    0     0
a2    0     0

Draw the Q-table after performing each step below.
i) t=1: S1; R = -10; a1: S1 → S
ii) t=2: S1; R = -10; a2: S1 → S2
iii) t=3: S2; R = +20; a: S2 → S
iv) t=4: S; R = -10; a2: S → S2

(c) (4 points) After step t = 4, what is the optimal policy for the states S1 and S2?
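For anyone working through part (b), a single tabular Q-learning step can be sketched roughly as follows. This is a minimal illustration, not the expected solution. It assumes the textbook form of the update, which subtracts the current estimate Q_t(a, s) inside the parentheses; that term does not appear in the equation as posted, so check it against your course notes. The helper names `q_update` and `greedy_policy` and the sample transitions are hypothetical and are not the steps from the question.

```python
from typing import Dict, List, Tuple

# Q-table keyed by (state, action); the post writes Q(a, s), same quantity.
QTable = Dict[Tuple[str, str], float]


def q_update(q: QTable, state: str, action: str, reward: float,
             next_state: str, actions: List[str],
             alpha: float = 0.5, gamma: float = 0.5) -> QTable:
    """Return a new Q-table after one standard Q-learning update.

    ASSUMPTION: uses the textbook rule
        Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    new_q = dict(q)  # copy so each step's table can be kept and drawn
    new_q[(state, action)] = q[(state, action)] + alpha * (td_target - q[(state, action)])
    return new_q


def greedy_policy(q: QTable, states: List[str], actions: List[str]) -> Dict[str, str]:
    """Pick the highest-valued action in each state (ties go to the first action)."""
    return {s: max(actions, key=lambda a: q[(s, a)]) for s in states}


states = ["S1", "S2"]
actions = ["a1", "a2"]
q0: QTable = {(s, a): 0.0 for s in states for a in actions}

# Hypothetical transitions for illustration only, NOT the ones in the question.
q1 = q_update(q0, "S1", "a1", reward=4.0, next_state="S2", actions=actions)
q2 = q_update(q1, "S2", "a2", reward=1.0, next_state="S1", actions=actions)

print(q1[("S1", "a1")])  # 0 + 0.5 * (4 + 0.5 * 0 - 0) = 2.0
print(q2[("S2", "a2")])  # 0 + 0.5 * (1 + 0.5 * 2 - 0) = 1.0
print(greedy_policy(q2, states, actions))
```

Returning a fresh table on every call keeps the table from each step available, which matches the "draw the Q-table after each step" wording. Reading off the greedy action per state from the final table is one way to approach part (c).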