3. (2pts) We will formulate Tic-Tac-Toe as an environment in which we can train a reinforcement learning agent. You will
answerhappygod
3. (2pts) We will formulate Tic-Tac-Toe as an environment in which we can train a reinforcement learning agent. You will play as X's, and your opponent will play as O's. Two-player games such as Tic-Tac-Toe are often modeled using game theory, in which we also try to predict our opponent's moves. For simplicity, we ignore modeling the opponent's moves and treat the opponent's actions as a source of randomness within the environment. Assume you always go first. What are the states and actions within the Tic-Tac-Toe reinforcement learning environment? How does the current state affect the actions you can take?
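One way to make the question concrete is to sketch the environment in Python. This is only an illustrative sketch under stated assumptions, not the course's reference solution: the state is taken to be the full 3x3 board (nine cells, each empty, X, or O), an action is the index of the cell where X is placed, and the opponent is simulated by a uniformly random legal O move. The function names (`initial_state`, `legal_actions`, `step`) and the reward values (+1 win, -1 loss, 0 otherwise) are hypothetical choices that the question itself does not specify.

```python
import random

EMPTY, X, O = " ", "X", "O"

# The eight index triples that form a winning line on a 3x3 board.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def initial_state():
    """State = tuple of 9 cells in row-major order; the board starts empty."""
    return (EMPTY,) * 9


def legal_actions(state):
    """Actions = indices of empty cells, so the state restricts the action set."""
    return [i for i, cell in enumerate(state) if cell == EMPTY]


def winner(state):
    """Return X or O if that player has completed a line, else None."""
    for a, b, c in WIN_LINES:
        if state[a] != EMPTY and state[a] == state[b] == state[c]:
            return state[a]
    return None


def place(state, index, mark):
    """Return a new state with `mark` written into cell `index`."""
    return state[:index] + (mark,) + state[index + 1:]


def step(state, action, rng=random):
    """Apply the agent's X move, then a random O move (assumed opponent model).

    Returns (next_state, reward, done). Reward values are an assumption.
    """
    if action not in legal_actions(state):
        raise ValueError(f"cell {action} is already occupied")
    state = place(state, action, X)
    if winner(state) == X:
        return state, 1.0, True
    if not legal_actions(state):  # board full after X's move: draw
        return state, 0.0, True
    # The opponent is treated as environment randomness, not as a modeled player.
    state = place(state, rng.choice(legal_actions(state)), O)
    if winner(state) == O:
        return state, -1.0, True
    return state, 0.0, False
```

Calling `step(initial_state(), 4)` places an X in the center and lets the environment answer with a random O. After that, `legal_actions` returns only the seven cells that are still empty, which is how the current state limits the available actions.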