Python Machine Learning: Machine learning and deep learning with Python, scikit-learn, and TensorFlow

Unlock modern machine learning and deep learning techniques with Python, using the latest cutting-edge open source Python libraries.

About This Book
  • Second edition of the bestselling book on machine learning
  • A practical approach to key frameworks in data science, machine learning, and deep learning…

Bibliographic Details
Other Authors: Raschka, Sebastian (author); Mirjalili, Vahid (author)
Format: eBook
Language: English
Published: Birmingham, England; Mumbai, [India]: Packt, 2017.
Edition: Second edition, fully revised and updated
Series: Expert Insight
View at the Universitat Ramon Llull Library: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009630672006719
Table of Contents:
  • Cover
  • Copyright
  • Credits
  • About the Authors
  • About the Reviewers
  • www.PacktPub.com
  • Packt is Searching for Authors Like You
  • Table of Contents
  • Preface
  • Chapter 1: Giving Computers the Ability to Learn from Data
  • Building intelligent machines to transform data into knowledge
  • The three different types of machine learning
  • Making predictions about the future with supervised learning
  • Classification for predicting class labels
  • Regression for predicting continuous outcomes
  • Solving interactive problems with reinforcement learning
  • Discovering hidden structures with unsupervised learning
  • Finding subgroups with clustering
  • Dimensionality reduction for data compression
  • Introduction to the basic terminology and notations
  • A roadmap for building machine learning systems
  • Preprocessing - getting data into shape
  • Training and selecting a predictive model
  • Evaluating models and predicting unseen data instances
  • Using Python for machine learning
  • Installing Python and packages from the Python Package Index
  • Using the Anaconda Python distribution and package manager
  • Packages for scientific computing, data science, and machine learning
  • Summary
  • Chapter 2: Training Simple Machine Learning Algorithms for Classification
  • Artificial neurons - a brief glimpse into the early history of machine learning
  • The formal definition of an artificial neuron
  • The perceptron learning rule
  • Implementing a perceptron learning algorithm in Python
  • An object-oriented perceptron API
  • Training a perceptron model on the Iris dataset
  • Adaptive linear neurons and the convergence of learning
  • Minimizing cost functions with gradient descent
  • Implementing Adaline in Python
  • Improving gradient descent through feature scaling
  • Large-scale machine learning and stochastic gradient descent
  • Summary
  • Chapter 3: A Tour of Machine Learning Classifiers Using scikit-learn
  • Choosing a classification algorithm
  • First steps with scikit-learn - training a perceptron
  • Modeling class probabilities via logistic regression
  • Logistic regression intuition and conditional probabilities
  • Learning the weights of the logistic cost function
  • Converting an Adaline implementation into an algorithm for logistic regression
  • Training a logistic regression model with scikit-learn
  • Tackling overfitting via regularization
  • Maximum margin classification with support vector machines
  • Maximum margin intuition
  • Dealing with a nonlinearly separable case using slack variables
  • Alternative implementations in scikit-learn
  • Solving nonlinear problems using a kernel SVM
  • Kernel methods for linearly inseparable data
  • Using the kernel trick to find separating hyperplanes in high-dimensional space
  • Decision tree learning
  • Maximizing information gain - getting the most bang for your buck
  • Building a decision tree
  • Combining multiple decision trees via random forests
  • K-nearest neighbors - a lazy learning algorithm
  • Summary
  • Chapter 4: Building Good Training Sets - Data Preprocessing
  • Dealing with missing data
  • Identifying missing values in tabular data
  • Eliminating samples or features with missing values
  • Imputing missing values
  • Understanding the scikit-learn estimator API
  • Handling categorical data
  • Nominal and ordinal features
  • Creating an example dataset
  • Mapping ordinal features
  • Encoding class labels
  • Performing one-hot encoding on nominal features
  • Partitioning a dataset into separate training and test sets
  • Bringing features onto the same scale
  • Selecting meaningful features
  • L1 and L2 regularization as penalties against model complexity
  • A geometric interpretation of L2 regularization
  • Sparse solutions with L1 regularization
  • Sequential feature selection algorithms
  • Assessing feature importance with random forests
  • Summary
  • Chapter 5: Compressing Data via Dimensionality Reduction
  • Unsupervised dimensionality reduction via principal component analysis
  • The main steps behind principal component analysis
  • Extracting the principal components step by step
  • Total and explained variance
  • Feature transformation
  • Principal component analysis in scikit-learn
  • Supervised data compression via linear discriminant analysis
  • Principal component analysis versus linear discriminant analysis
  • The inner workings of linear discriminant analysis
  • Computing the scatter matrices
  • Selecting linear discriminants for the new feature subspace
  • Projecting samples onto the new feature space
  • LDA via scikit-learn
  • Using kernel principal component analysis for nonlinear mappings
  • Kernel functions and the kernel trick
  • Implementing a kernel principal component analysis in Python
  • Example 1 - separating half-moon shapes
  • Example 2 - separating concentric circles
  • Projecting new data points
  • Kernel principal component analysis in scikit-learn
  • Summary
  • Chapter 6: Learning Best Practices for Model Evaluation and Hyperparameter Tuning
  • Streamlining workflows with pipelines
  • Loading the Breast Cancer Wisconsin dataset
  • Combining transformers and estimators in a pipeline
  • Using k-fold cross-validation to assess model performance
  • The holdout method
  • K-fold cross-validation
  • Debugging algorithms with learning and validation curves
  • Diagnosing bias and variance problems with learning curves
  • Addressing over- and underfitting with validation curves
  • Fine-tuning machine learning models via grid search
  • Tuning hyperparameters via grid search
  • Algorithm selection with nested cross-validation
  • Looking at different performance evaluation metrics
  • Reading a confusion matrix
  • Optimizing the precision and recall of a classification model
  • Plotting a receiver operating characteristic
  • Scoring metrics for multiclass classification
  • Dealing with class imbalance
  • Summary
  • Chapter 7: Combining Different Models for Ensemble Learning
  • Learning with ensembles
  • Combining classifiers via majority vote
  • Implementing a simple majority vote classifier
  • Using the majority voting principle to make predictions
  • Evaluating and tuning the ensemble classifier
  • Bagging - building an ensemble of classifiers from bootstrap samples
  • Bagging in a nutshell
  • Applying bagging to classify samples in the Wine dataset
  • Leveraging weak learners via adaptive boosting
  • How boosting works
  • Applying AdaBoost using scikit-learn
  • Summary
  • Chapter 8: Applying Machine Learning to Sentiment Analysis
  • Preparing the IMDb movie review data for text processing
  • Obtaining the movie review dataset
  • Preprocessing the movie dataset into a more convenient format
  • Introducing the bag-of-words model
  • Transforming words into feature vectors
  • Assessing word relevancy via term frequency-inverse document frequency
  • Cleaning text data
  • Processing documents into tokens
  • Training a logistic regression model for document classification
  • Working with bigger data - online algorithms and out-of-core learning
  • Topic modeling with Latent Dirichlet Allocation
  • Decomposing text documents with LDA
  • LDA with scikit-learn
  • Summary
  • Chapter 9: Embedding a Machine Learning Model into a Web Application
  • Serializing fitted scikit-learn estimators
  • Setting up an SQLite database for data storage
  • Developing a web application with Flask
  • Our first Flask web application
  • Form validation and rendering
  • Setting up the directory structure
  • Implementing a macro using the Jinja2 templating engine
  • Adding style via CSS
  • Creating the result page
  • Turning the movie review classifier into a web application
  • Files and folders - looking at the directory tree
  • Implementing the main application as app.py
  • Setting up the review form
  • Creating a results page template
  • Deploying the web application to a public server
  • Creating a PythonAnywhere account
  • Uploading the movie classifier application
  • Updating the movie classifier
  • Summary
  • Chapter 10: Predicting Continuous Target Variables with Regression Analysis
  • Introducing linear regression
  • Simple linear regression
  • Multiple linear regression
  • Exploring the Housing dataset
  • Loading the Housing dataset into a data frame
  • Visualizing the important characteristics of a dataset
  • Looking at relationships using a correlation matrix
  • Implementing an ordinary least squares linear regression model
  • Solving regression for regression parameters with gradient descent
  • Estimating the coefficients of a regression model via scikit-learn
  • Fitting a robust regression model using RANSAC
  • Evaluating the performance of linear regression models
  • Using regularized methods for regression
  • Turning a linear regression model into a curve - polynomial regression
  • Adding polynomial terms using scikit-learn
  • Modeling nonlinear relationships in the Housing dataset
  • Dealing with nonlinear relationships using random forests
  • Decision tree regression
  • Random forest regression
  • Summary
  • Chapter 11: Working with Unlabeled Data - Clustering Analysis
  • Grouping objects by similarity using k-means
  • K-means clustering using scikit-learn
  • A smarter way of placing the initial cluster centroids using k-means++
  • Hard versus soft clustering
  • Using the elbow method to find the optimal number of clusters
  • Quantifying the quality of clustering via silhouette plots