Building Machine Learning Systems with Python: Explore machine learning and deep learning techniques for building intelligent systems using scikit-learn and TensorFlow
Get more from your data by creating practical machine learning systems with Python. Key Features: Develop your own Python-based machine learning system. Discover how Python offers multiple algorithms for modern machine learning systems. Explore key Python machine learning libraries to implement in your...
Other Authors: | , , |
---|---|
Format: | E-book |
Language: | English |
Published: | Birmingham, England: Packt, July 2018. |
Edition: | Third edition |
Subjects: | |
View at Biblioteca Universitat Ramon Llull: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009630734206719 |
Table of Contents:
- Cover
- Title Page
- Copyright and Credits
- Packt Upsell
- Contributors
- Table of Contents
- Preface
- Chapter 1: Getting Started with Python Machine Learning
- Machine learning and Python - a dream team
- What the book will teach you - and what it will not
- How to best read this book
- What to do when you are stuck
- Getting started
- Introduction to NumPy, SciPy, Matplotlib, and TensorFlow
- Installing Python
- Chewing data efficiently with NumPy and intelligently with SciPy
- Learning NumPy
- Indexing
- Handling nonexistent values
- Comparing the runtime
- Learning SciPy
- Fundamentals of machine learning
- Asking a question
- Getting answers
- Our first (tiny) application of machine learning
- Reading in the data
- Preprocessing and cleaning the data
- Choosing the right model and learning algorithm
- Before we build our first model
- Starting with a simple straight line
- Toward more complex models
- Stepping back to go forward - another look at our data
- Training and testing
- Answering our initial question
- Summary
- Chapter 2: Classifying with Real-World Examples
- The Iris dataset
- Visualization is a good first step
- Classifying with scikit-learn
- Building our first classification model
- Evaluation - holding out data and cross-validation
- How to measure and compare classifiers
- A more complex dataset and the nearest-neighbor classifier
- Learning about the seeds dataset
- Features and feature engineering
- Nearest neighbor classification
- Looking at the decision boundaries
- Which classifier to use
- Summary
- Chapter 3: Regression
- Predicting house prices with regression
- Multidimensional regression
- Cross-validation for regression
- Penalized or regularized regression
- L1 and L2 penalties
- Using Lasso or ElasticNet in scikit-learn
- Visualizing the Lasso path
- P-greater-than-N scenarios
- An example based on text documents
- Setting hyperparameters in a principled way
- Regression with TensorFlow
- Summary
- Chapter 4: Classification I - Detecting Poor Answers
- Sketching our roadmap
- Learning to classify classy answers
- Tuning the instance
- Tuning the classifier
- Fetching the data
- Slimming the data down to chewable chunks
- Preselecting and processing attributes
- Defining what a good answer is
- Creating our first classifier
- Engineering the features
- Training the classifier
- Measuring the classifier's performance
- Designing more features
- Deciding how to improve the performance
- Bias, variance and their trade-off
- Fixing high bias
- Fixing high variance
- High or low bias?
- Using logistic regression
- A bit of math with a small example
- Applying logistic regression to our post-classification problem
- Looking behind accuracy - precision and recall
- Slimming the classifier
- Ship it!
- Classification using TensorFlow
- Summary
- Chapter 5: Dimensionality Reduction
- Sketching our roadmap
- Selecting features
- Detecting redundant features using filters
- Correlation
- Mutual information
- Asking the model about the features using wrappers
- Other feature selection methods
- Feature projection
- Principal component analysis
- Sketching PCA
- Applying PCA
- Limitations of PCA and how LDA can help
- Multidimensional scaling
- Autoencoders, or neural networks for dimensionality reduction
- Summary
- Chapter 6: Clustering - Finding Related Posts
- Measuring the relatedness of posts
- How not to do it
- How to do it
- Preprocessing - similarity measured as a similar number of common words
- Converting raw text into a bag of words
- Counting words
- Normalizing word count vectors
- Removing less important words
- Stemming
- Installing and using NLTK
- Extending the vectorizer with NLTK's stemmer
- Stop words on steroids
- Our achievements and goals
- Clustering
- K-means
- Getting test data to evaluate our ideas
- Clustering posts
- Solving our initial challenge
- Another look at noise
- Tweaking the parameters
- Summary
- Chapter 7: Recommendations
- Rating predictions and recommendations
- Splitting into training and testing
- Normalizing the training data
- A neighborhood approach to recommendations
- A regression approach to recommendations
- Combining multiple methods
- Basket analysis
- Obtaining useful predictions
- Analyzing supermarket shopping baskets
- Association rule mining
- More advanced basket analysis
- Summary
- Chapter 8: Artificial Neural Networks and Deep Learning
- Using TensorFlow
- TensorFlow API
- Graphs
- Sessions
- Useful operations
- Saving and restoring neural networks
- Training neural networks
- Convolutional neural networks
- Recurrent neural networks
- LSTM for predicting text
- LSTM for image processing
- Summary
- Chapter 9: Classification II - Sentiment Analysis
- Sketching our roadmap
- Fetching the Twitter data
- Introducing the Naïve Bayes classifier
- Getting to know the Bayes theorem
- Being naïve
- Using Naïve Bayes to classify
- Accounting for unseen words and other oddities
- Accounting for arithmetic underflows
- Creating our first classifier and tuning it
- Solving an easy problem first
- Using all classes
- Tuning the classifier's parameters
- Cleaning tweets
- Taking the word types into account
- Determining the word types
- Successfully cheating using SentiWordNet
- Our first estimator
- Putting everything together
- Summary
- Chapter 10: Topic Modeling
- Latent Dirichlet allocation
- Building a topic model
- Comparing documents by topic
- Modeling the whole of Wikipedia
- Choosing the number of topics
- Summary
- Chapter 11: Classification III - Music Genre Classification
- Sketching our roadmap
- Fetching the music data
- Converting into WAV format
- Looking at music
- Decomposing music into sine-wave components
- Using FFT to build our first classifier
- Increasing experimentation agility
- Training the classifier
- Using a confusion matrix to measure accuracy in multiclass problems
- An alternative way to measure classifier performance using receiver-operator characteristics
- Improving classification performance with mel frequency cepstral coefficients
- Music classification using TensorFlow
- Summary
- Chapter 12: Computer Vision
- Introducing image processing
- Loading and displaying images
- Thresholding
- Gaussian blurring
- Putting the center in focus
- Basic image classification
- Computing features from images
- Writing your own features
- Using features to find similar images
- Classifying a harder dataset
- Local feature representations
- Image generation with adversarial networks
- Summary
- Chapter 13: Reinforcement Learning
- Types of reinforcement learning
- Policy and value network
- Q-network
- Excelling at games
- A small example
- Using TensorFlow for the text game
- Playing Breakout
- Summary
- Chapter 14: Bigger Data
- Learning about big data
- Using jug to break up your pipeline into tasks
- An introduction to tasks in jug
- Looking under the hood
- Using jug for data analysis
- Reusing partial results
- Using Amazon Web Services
- Creating your first virtual machines
- Installing Python packages on Amazon Linux
- Running jug on our cloud machine
- Automating the generation of clusters with cfncluster
- Summary
- Appendix A: Where to Learn More About Machine Learning
- Online courses
- Books
- Blogs
- Data sources
- Getting competitive
- All that was left out
- Summary
- Other Books You May Enjoy
- Index