Hands-On Intelligent Agents with OpenAI Gym: a step-by-step guide to develop AI agents using deep reinforcement learning
Implement intelligent agents using PyTorch to solve classic AI problems, play console games like Atari, and perform tasks such as autonomous driving using the CARLA driving simulator.

Key Features:
- Explore the OpenAI Gym toolkit and interface to use over 700 learning tasks
- Implement agents to solve si...
Other Authors:
Format: E-book
Language: English
Published: Birmingham ; Mumbai : Packt, 2018
Edition: 1st edition
Subjects:
View at Universitat Ramon Llull Library: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009630749006719
Table of Contents:
- Cover
- Title Page
- Copyright and Credits
- Dedication
- Packt Upsell
- Contributors
- Table of Contents
- Preface
- Chapter 1: Introduction to Intelligent Agents and Learning Environments
- What is an intelligent agent?
- Learning environments
- What is OpenAI Gym?
- Understanding the features of OpenAI Gym
- Simple environment interface
- Comparability and reproducibility
- Ability to monitor progress
- What can you do with the OpenAI Gym toolkit?
- Creating your first OpenAI Gym environment
- Creating and visualizing a new Gym environment
- Summary
- Chapter 2: Reinforcement Learning and Deep Reinforcement Learning
- What is reinforcement learning?
- Understanding what AI means and what's in it in an intuitive way
- Supervised learning
- Unsupervised learning
- Reinforcement learning
- Practical reinforcement learning
- Agent
- Rewards
- Environment
- State
- Model
- Value function
- State-value function
- Action-value function
- Policy
- Markov Decision Process
- Planning with dynamic programming
- Monte Carlo learning and temporal difference learning
- SARSA and Q-learning
- Deep reinforcement learning
- Practical applications of reinforcement and deep reinforcement learning algorithms
- Summary
- Chapter 3: Getting Started with OpenAI Gym and Deep Reinforcement Learning
- Code repository, setup, and configuration
- Prerequisites
- Creating the conda environment
- Minimal install - the quick and easy way
- Complete install of OpenAI Gym learning environments
- Instructions for Ubuntu
- Instructions for macOS
- MuJoCo installation
- Completing the OpenAI Gym setup
- Installing tools and libraries needed for deep reinforcement learning
- Installing prerequisite system packages
- Installing Compute Unified Device Architecture (CUDA)
- Installing PyTorch
- Summary
- Chapter 4: Exploring the Gym and its Features
- Exploring the list of environments and nomenclature
- Nomenclature
- Exploring the Gym environments
- Understanding the Gym interface
- Spaces in the Gym
- Summary
- Chapter 5: Implementing your First Learning Agent - Solving the Mountain Car problem
- Understanding the Mountain Car problem
- The Mountain Car problem and environment
- Implementing a Q-learning agent from scratch
- Revisiting Q-learning
- Implementing a Q-learning agent using Python and NumPy
- Defining the hyperparameters
- Implementing the Q_Learner class's __init__ method
- Implementing the Q_Learner class's discretize method
- Implementing the Q_Learner's get_action method
- Implementing the Q_learner class's learn method
- Full Q_Learner class implementation
- Training the reinforcement learning agent at the Gym
- Testing and recording the performance of the agent
- A simple and complete Q-Learner implementation for solving the Mountain Car problem
- Summary
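The Chapter 5 headings above outline a tabular Q-learning agent: hyperparameters, state discretization, epsilon-greedy action selection, and a learn step. A minimal sketch of that update rule on a toy 5-state chain MDP (names and environment are illustrative, not the book's `Q_Learner` code):

```python
import numpy as np

# Tabular Q-learning on a tiny 5-state chain MDP, standing in for the
# discretized Mountain Car state space.
N_STATES, N_ACTIONS = 5, 2          # actions: 0 = left, 1 = right
ALPHA, GAMMA, EPSILON = 0.1, 0.99, 0.1
rng = np.random.default_rng(0)
Q = np.ones((N_STATES, N_ACTIONS))  # optimistic init encourages exploration

def env_step(state, action):
    """Deterministic chain: taking 'right' in the last state pays +1 and ends."""
    if action == 1:
        if state == N_STATES - 1:
            return 0, 1.0, True
        return state + 1, 0.0, False
    return max(state - 1, 0), 0.0, False

for _ in range(500):                # episodes
    state, done = 0, False
    while not done:
        # epsilon-greedy action selection
        if rng.random() < EPSILON:
            action = int(rng.integers(N_ACTIONS))
        else:
            action = int(np.argmax(Q[state]))
        next_state, reward, done = env_step(state, action)
        # Q-learning update: bootstrap from the greedy next-state value
        td_target = reward + (0.0 if done else GAMMA * Q[next_state].max())
        Q[state, action] += ALPHA * (td_target - Q[state, action])
        state = next_state

greedy_policy = Q.argmax(axis=1)    # learned policy: always head right
```

The same update drives the book's Mountain Car agent; only the discretization of the continuous observation into table indices is extra.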
- Chapter 6: Implementing an Intelligent Agent for Optimal Control using Deep Q-Learning
- Improving the Q-learning agent
- Using neural networks to approximate Q-functions
- Implementing a shallow Q-network using PyTorch
- Implementing the Shallow_Q_Learner
- Solving the Cart Pole problem using a Shallow Q-Network
- Experience replay
- Implementing the experience memory
- Implementing the replay experience method for the Q-learner class
- Revisiting the epsilon-greedy action policy
- Implementing an epsilon decay schedule
- Implementing a deep Q-learning agent
- Implementing a deep convolutional Q-network in PyTorch
- Using the target Q-network to stabilize an agent's learning
- Logging and visualizing an agent's learning process
- Using TensorBoard for logging and visualizing a PyTorch RL agent's progress
- Managing hyperparameters and configuration parameters
- Using a JSON file to easily configure parameters
- The parameters manager
- A complete deep Q-learner to solve complex problems with raw pixel input
- The Atari Gym environment
- Customizing the Atari Gym environment
- Implementing custom Gym environment wrappers
- Reward clipping
- Preprocessing Atari screen image frames
- Normalizing observations
- Random no-ops on reset
- Fire on reset
- Episodic life
- Max and skip-frame
- Wrapping the Gym environment
- Training the deep Q-learner to play Atari games
- Putting together a comprehensive deep Q-learner
- Hyperparameters
- Launching the training process
- Testing performance of your deep Q-learner in Atari games
- Summary
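Among the Chapter 6 headings above, experience replay is the most self-contained piece. A minimal uniform replay memory sketch (an illustrative version, not the book's implementation):

```python
import random
from collections import deque, namedtuple

# One stored transition from the environment
Experience = namedtuple("Experience", ["obs", "action", "reward", "next_obs", "done"])

class ExperienceMemory:
    def __init__(self, capacity=100000):
        # deque with maxlen evicts the oldest experience once full
        self.buffer = deque(maxlen=capacity)

    def store(self, *transition):
        self.buffer.append(Experience(*transition))

    def sample(self, batch_size):
        # Uniform random minibatches break the correlation between
        # consecutive steps that destabilizes Q-network training
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)

memory = ExperienceMemory(capacity=1000)
for t in range(50):
    memory.store(t, t % 3, 0.0, t + 1, False)
batch = memory.sample(8)
```

The deep Q-learner samples such minibatches every step and trains the network on them instead of on the latest transition alone.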
- Chapter 7: Creating Custom OpenAI Gym Environments - CARLA Driving Simulator
- Understanding the anatomy of Gym environments
- Creating a template for custom Gym environment implementations
- Registering custom environments with OpenAI Gym
- Creating an OpenAI Gym-compatible CARLA driving simulator environment
- Configuration and initialization
- Configuration
- Initialization
- Implementing the reset method
- Customizing the CARLA simulation using the CarlaSettings object
- Adding cameras and sensors to a vehicle in CARLA
- Implementing the step function for the CARLA environment
- Accessing camera or sensor data
- Sending actions to control agents in CARLA
- Continuous action space in CARLA
- Discrete action space in CARLA
- Sending actions to the CARLA simulation server
- Determining the end of episodes in the CARLA environment
- Testing the CARLA Gym environment
- Summary
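Chapter 7 above revolves around a template for custom Gym environments. The shape of that template, sketched without importing gym so it stays self-contained (a real environment would subclass `gym.Env` and declare `observation_space` and `action_space`; the toy dynamics here are illustrative only):

```python
class CustomEnv:
    """Skeleton of a Gym-style environment: reset/step with the standard
    (observation, reward, done, info) return convention."""

    def __init__(self):
        self.state = 0

    def reset(self):
        """Begin a new episode and return the initial observation."""
        self.state = 0
        return self.state

    def step(self, action):
        """Advance one timestep; return (observation, reward, done, info)."""
        self.state += 1 if action == 1 else -1
        done = abs(self.state) >= 3
        reward = 1.0 if self.state >= 3 else 0.0
        return self.state, reward, done, {}

env = CustomEnv()
obs = env.reset()
obs, reward, done, info = env.step(1)
```

The CARLA environment in the chapter fills in the same `reset` and `step` slots with simulator calls, camera/sensor reads, and action forwarding.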
- Chapter 8: Implementing an Intelligent & Autonomous Car Driving Agent using Deep Actor-Critic Algorithm
- The deep n-step advantage actor-critic algorithm
- Policy gradients
- The likelihood ratio trick
- The policy gradient theorem
- Actor-critic algorithm
- Advantage actor-critic algorithm
- n-step advantage actor-critic algorithm
- n-step returns
- Implementing the n-step return calculation
- Deep n-step advantage actor-critic algorithm
- Implementing a deep n-step advantage actor critic agent
- Initializing the actor and critic networks
- Gathering n-step experiences using the current policy
- Calculating the actor's and critic's losses
- Updating the actor-critic model
- Tools to save/load, log, visualize, and monitor
- An extension - asynchronous deep n-step advantage actor-critic
- Training an intelligent and autonomous driving agent
- Training and testing the deep n-step advantage actor-critic agent
- Training the agent to drive a car in the CARLA driving simulator
- Summary
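The n-step return named in the Chapter 8 outline above is the quantity the critic bootstraps: R_t = r_t + g*r_{t+1} + ... + g^(n-1)*r_{t+n-1} + g^n * V(s_{t+n}). It can be computed by folding the rewards backwards from the critic's bootstrap value (function name and shapes are illustrative, not the book's code):

```python
def n_step_return(rewards, bootstrap_value, gamma=0.99):
    """Fold n rewards backwards from V(s_{t+n}) into the n-step return."""
    g = bootstrap_value
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# 3-step example with rewards [1, 0, 1] and critic estimate V(s_{t+3}) = 2.0:
val = n_step_return([1.0, 0.0, 1.0], 2.0, gamma=0.99)
```

Subtracting the critic's value of the current state from this return gives the n-step advantage used in the actor's loss.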
- Chapter 9: Exploring the Learning Environment Landscape - Roboschool, Gym-Retro, StarCraft-II, DeepMindLab
- Gym interface-compatible environments
- Roboschool
- Quickstart guide to setting up and running Roboschool environments
- Gym Retro
- Quickstart guide to setting up and running Gym Retro
- Other open source Python-based learning environments
- StarCraft II - PySC2
- Quick start guide to setting up and running the StarCraft II PySC2 environment
- Downloading the StarCraft II Linux packages
- Downloading the SC2 maps
- Installing PySC2
- Playing StarCraft II yourself or running sample agents
- DeepMind Lab
- DeepMind Lab learning environment interface
- reset(episode=-1, seed=None)
- step(action, num_steps=1)
- observations()
- is_running()
- observation_spec()
- action_spec()
- num_steps()
- fps()
- events()
- close()
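The method signatures listed above define DeepMind Lab's environment interface. How they compose into an interaction loop can be illustrated with a stand-in class (the real environment comes from the `deepmind_lab` module; this stub only mimics the listed method names and calling pattern):

```python
class StubLabEnv:
    """Stand-in mimicking the DeepMind Lab interface methods listed above."""

    def __init__(self, max_steps=10):
        self._max_steps = max_steps
        self._steps = 0
        self._running = False

    def reset(self, episode=-1, seed=None):
        self._steps = 0
        self._running = True

    def is_running(self):
        return self._running

    def step(self, action, num_steps=1):
        # Applies `action` for num_steps frames and returns the reward earned
        self._steps += num_steps
        if self._steps >= self._max_steps:
            self._running = False   # episode over
        return 1.0

    def observations(self):
        return {"RGB_INTERLEAVED": [0] * 12}   # placeholder frame data

    def num_steps(self):
        return self._steps

# The typical interaction loop built from the interface above
env = StubLabEnv(max_steps=5)
env.reset(seed=42)
total_reward = 0.0
while env.is_running():
    obs = env.observations()                 # dict keyed by observation name
    total_reward += env.step(action=0, num_steps=1)
```

A real agent would choose `action` based on `obs` and the environment's `action_spec()` rather than passing a constant.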
- Quick start guide to setting up and running DeepMind Lab
- Setting up and installing DeepMind Lab and its dependencies
- Playing the game, testing a randomly acting agent, or training your own!
- Summary
- Chapter 10: Exploring the Learning Algorithm Landscape - DDPG (Actor-Critic), PPO (Policy-Gradient), Rainbow (Value-Based)
- Deep Deterministic Policy Gradients
- Core concepts
- Proximal Policy Optimization
- Core concept
- Off-policy learning
- On-policy
- Rainbow
- Core concept
- DQN
- Double Q-Learning
- Prioritized experience replay
- Dueling networks
- Multi-step learning/n-step learning
- Distributional RL
- Noisy nets
- Quick summary of advantages and applications
- Summary
- Other Books You May Enjoy
- Index