Hands-on reinforcement learning with Python: master reinforcement learning and deep reinforcement learning by building intelligent app

A hands-on guide enriched with examples to master deep reinforcement learning algorithms with Python. About This Book: Your entry point into the world of artificial intelligence using the power of Python. An example-rich guide to master various RL and DRL algorithms. Explore various state-of-the-art arc...

Bibliographic Details
Other Authors:	Ravichandiran, Sudharsan (author)
Format:	eBook
Language:	English
Published:	Birmingham, London ; Mumbai : Packt, 2018.
Edition:	1st edition
Subjects:	Machine learning.
See on Biblioteca Universitat Ramon Llull:	https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009630398506719

Table of Contents:

Cover
Title Page
Copyright and Credits
Dedication
Packt Upsell
Contributors
Table of Contents
Preface
Chapter 1: Introduction to Reinforcement Learning
What is RL?
RL algorithm
How RL differs from other ML paradigms
Elements of RL
Agent
Policy function
Value function
Model
Agent environment interface
Types of RL environment
Deterministic environment
Stochastic environment
Fully observable environment
Partially observable environment
Discrete environment
Continuous environment
Episodic and non-episodic environment
Single and multi-agent environment
RL platforms
OpenAI Gym and Universe
DeepMind Lab
RL-Glue
Project Malmo
ViZDoom
Applications of RL
Education
Medicine and healthcare
Manufacturing
Inventory management
Finance
Natural Language Processing and Computer Vision
Summary
Questions
Further reading
Chapter 2: Getting Started with OpenAI and TensorFlow
Setting up your machine
Installing Anaconda
Installing Docker
Installing OpenAI Gym and Universe
Common error fixes
OpenAI Gym
Basic simulations
Training a robot to walk
OpenAI Universe
Building a video game bot
TensorFlow
Variables, constants, and placeholders
Variables
Constants
Placeholders
Computation graph
Sessions
TensorBoard
Adding scope
Summary
Questions
Further reading
Chapter 3: The Markov Decision Process and Dynamic Programming
The Markov chain and Markov process
Markov Decision Process
Rewards and returns
Episodic and continuous tasks
Discount factor
The policy function
State value function
State-action value function (Q function)
The Bellman equation and optimality
Deriving the Bellman equation for value and Q functions
Solving the Bellman equation
Dynamic programming
Value iteration
Policy iteration
Solving the frozen lake problem
Value iteration
Policy iteration
Summary
Questions
Further reading
Chapter 4: Gaming with Monte Carlo Methods
Monte Carlo methods
Estimating the value of pi using Monte Carlo
Monte Carlo prediction
First visit Monte Carlo
Every visit Monte Carlo
Let's play Blackjack with Monte Carlo
Monte Carlo control
Monte Carlo exploration starts
On-policy Monte Carlo control
Off-policy Monte Carlo control
Summary
Questions
Further reading
Chapter 5: Temporal Difference Learning
TD learning
TD prediction
TD control
Q learning
Solving the taxi problem using Q learning
SARSA
Solving the taxi problem using SARSA
The difference between Q learning and SARSA
Summary
Questions
Further reading
Chapter 6: Multi-Armed Bandit Problem
The MAB problem
The epsilon-greedy policy
The softmax exploration algorithm
The upper confidence bound algorithm
The Thompson sampling algorithm
Applications of MAB
Identifying the right advertisement banner using MAB
Contextual bandits
Summary
Questions
Further reading
Chapter 7: Deep Learning Fundamentals
Artificial neurons
ANNs
Input layer
Hidden layer
Output layer
Activation functions
Deep diving into ANN
Gradient descent
Neural networks in TensorFlow
RNN
Backpropagation through time
Long Short-Term Memory RNN
Generating song lyrics using LSTM RNN
Convolutional neural networks
Convolutional layer
Pooling layer
Fully connected layer
CNN architecture
Classifying fashion products using CNN
Summary
Questions
Further reading
Chapter 8: Atari Games with Deep Q Network
What is a Deep Q Network?
Architecture of DQN
Convolutional network
Experience replay
Target network
Clipping rewards
Understanding the algorithm
Building an agent to play Atari games
Double DQN
Prioritized experience replay
Dueling network architecture
Summary
Questions
Further reading
Chapter 9: Playing Doom with a Deep Recurrent Q Network
DRQN
Architecture of DRQN
Training an agent to play Doom
Basic Doom game
Doom with DRQN
DARQN
Architecture of DARQN
Summary
Questions
Further reading
Chapter 10: The Asynchronous Advantage Actor Critic Network
The Asynchronous Advantage Actor Critic
The three As
The architecture of A3C
How A3C works
Driving up a mountain with A3C
Visualization in TensorBoard
Summary
Questions
Further reading
Chapter 11: Policy Gradients and Optimization
Policy gradient
Lunar Lander using policy gradients
Deep deterministic policy gradient
Swinging a pendulum
Trust Region Policy Optimization
Proximal Policy Optimization
Summary
Questions
Further reading
Chapter 12: Capstone Project - Car Racing Using DQN
Environment wrapper functions
Dueling network
Replay memory
Training the network
Car racing
Summary
Questions
Further reading
Chapter 13: Recent Advancements and Next Steps
Imagination augmented agents
Learning from human preference
Deep Q learning from demonstrations
Hindsight experience replay
Hierarchical reinforcement learning
MAXQ Value Function Decomposition
Inverse reinforcement learning
Summary
Questions
Further reading
Assessments
Other Books You May Enjoy
Index

Similar Items