Consider the following environment: your agent is placed next to a cliff and must get to the goal. The shortest path to


Post by answerhappygod »

Consider the following environment: your agent is placed next to a cliff and must get to the goal. The shortest path to the goal is to move along the edge of the cliff. There is also a longer path to the goal that requires the agent to first move away from the cliff, and then towards the goal. The reward for reaching the goal is 100 points, and the reward for falling off the cliff is −1000 points. Every move incurs a reward of −1. Assume we use an epsilon-greedy policy for exploration. If we would like to learn the shortest path, should we use an on-policy or off-policy algorithm? Explain why. Note: reading Chapter 6 of Sutton & Barto will help you answer this question.