Machine learning for hackers
If you're an experienced programmer interested in crunching data, this book will get you started with machine learning, a toolkit of algorithms that enables computers to train themselves to automate useful tasks. Authors Drew Conway and John Myles White help you understand machine learning and s...
Main author: | |
---|---|
Other authors: | , , |
Format: | E-book |
Language: | English |
Published: | Sebastopol, California : O'Reilly Media, 2012. |
Edition: | First edition |
Subjects: | |
View in the Universitat Ramon Llull Library: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009628120306719 |
Table of Contents:
- 1. Using R
- R for Machine Learning
- Downloading and Installing R
- IDEs and Text Editors
- Loading and Installing R Packages
- R Basics for Machine Learning
- Further Reading on R
- 2. Data Exploration
- Exploration versus Confirmation
- What Is Data?
- Inferring the Types of Columns in Your Data
- Inferring Meaning
- Numeric Summaries
- Means, Medians, and Modes
- Quantiles
- Standard Deviations and Variances
- Exploratory Data Visualization
- Visualizing the Relationships Between Columns
- 3. Classification: Spam Filtering
- This or That: Binary Classification
- Moving Gently into Conditional Probability
- Writing Our First Bayesian Spam Classifier
- Defining the Classifier and Testing It with Hard Ham
- Testing the Classifier Against All Email Types
- Improving the Results
- 4. Ranking: Priority Inbox
- How Do You Sort Something When You Don't Know the Order?
- Ordering Email Messages by Priority
- Priority Features of Email
- Writing a Priority Inbox
- Functions for Extracting the Feature Set
- Creating a Weighting Scheme for Ranking
- Weighting from Email Thread Activity
- Training and Testing the Ranker
- 5. Regression: Predicting Page Views
- Introducing Regression
- The Baseline Model
- Regression Using Dummy Variables
- Linear Regression in a Nutshell
- Predicting Web Traffic
- Defining Correlation
- 6. Regularization: Text Regression
- Nonlinear Relationships Between Columns: Beyond Straight Lines
- Introducing Polynomial Regression
- Methods for Preventing Overfitting
- Preventing Overfitting with Regularization
- Text Regression
- Logistic Regression to the Rescue
- 7. Optimization: Breaking Codes
- Introduction to Optimization
- Ridge Regression
- Code Breaking as Optimization
- 8. PCA: Building a Market Index
- Unsupervised Learning
- 9. MDS: Visually Exploring US Senator Similarity
- Clustering Based on Similarity
- A Brief Introduction to Distance Metrics and Multidirectional Scaling
- How Do US Senators Cluster?
- Analyzing US Senator Roll Call Data (101st–111th Congresses)
- 10. kNN: Recommendation Systems
- The k-Nearest Neighbors Algorithm
- R Package Installation Data
- 11. Analyzing Social Graphs
- Social Network Analysis
- Thinking Graphically
- Hacking Twitter Social Graph Data
- Working with the Google SocialGraph API
- Analyzing Twitter Networks
- Local Community Structure
- Visualizing the Clustered Twitter Network with Gephi
- Building Your Own "Who to Follow" Engine
- 12. Model Comparison
- SVMs: The Support Vector Machine
- Comparing Algorithms