Hands-on ensemble learning with R: a beginner's guide to combining the power of machine learning algorithms using ensemble techniques

Explore powerful R packages to create predictive models using ensemble methods. Key Features: Implement machine learning algorithms to build ensemble-efficient models; explore powerful R packages to create predictive models using ensemble methods; learn to build ensemble models on large datasets using a...

Bibliographic Details
Other Authors: Tattar, Prabhanjan Narayanachar (author)
Format: eBook
Language: English
Published: Birmingham ; Mumbai : Packt Publishing 2018.
Edition: 1st edition
See on Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009630745306719
Table of Contents:
  • Cover
  • Copyright
  • Contributors
  • Table of Contents
  • Preface
  • Chapter 1: Introduction to Ensemble Techniques
  • Datasets
  • Hypothyroid
  • Waveform
  • German Credit
  • Iris
  • Pima Indians Diabetes
  • US Crime
  • Overseas visitors
  • Primary Biliary Cirrhosis
  • Multishapes
  • Board Stiffness
  • Statistical/machine learning models
  • Logistic regression model
  • Logistic regression for hypothyroid classification
  • Neural networks
  • Neural network for hypothyroid classification
  • Naïve Bayes classifier
  • Naïve Bayes for hypothyroid classification
  • Decision tree
  • Decision tree for hypothyroid classification
  • Support vector machines
  • SVM for hypothyroid classification
  • The right model dilemma!
  • An ensemble purview
  • Complementary statistical tests
  • Permutation test
  • Chi-square and McNemar test
  • ROC test
  • Summary
  • Chapter 2: Bootstrapping
  • Technical requirements
  • The jackknife technique
  • The jackknife method for mean and variance
  • Pseudovalues method for survival data
  • Bootstrap - a statistical method
  • The standard error of correlation coefficient
  • The parametric bootstrap
  • Eigen values
  • Rule of thumb
  • The boot package
  • Bootstrap and testing hypotheses
  • Bootstrapping regression models
  • Bootstrapping survival models*
  • Bootstrapping time series models*
  • Summary
  • Chapter 3: Bagging
  • Technical requirements
  • Classification trees and pruning
  • Bagging
  • k-NN classifier
  • Analyzing waveform data
  • k-NN bagging
  • Summary
  • Chapter 4: Random Forests
  • Technical requirements
  • Random Forests
  • Variable importance
  • Proximity plots
  • Random Forest nuances
  • Comparisons with bagging
  • Missing data imputation
  • Clustering with Random Forest
  • Summary
  • Chapter 5: The Bare Bones Boosting Algorithms
  • Technical requirements
  • The general boosting algorithm
  • Adaptive boosting
  • Gradient boosting
  • Building it from scratch
  • Squared-error loss function
  • Using the adabag and gbm packages
  • Variable importance
  • Comparing bagging, random forests, and boosting
  • Summary
  • Chapter 6: Boosting Refinements
  • Technical requirements
  • Why does boosting work?
  • The gbm package
  • Boosting for count data
  • Boosting for survival data
  • The xgboost package
  • The h2o package
  • Summary
  • Chapter 7: The General Ensemble Technique
  • Technical requirements
  • Why does ensembling work?
  • Ensembling by voting
  • Majority voting
  • Weighted voting
  • Ensembling by averaging
  • Simple averaging
  • Weight averaging
  • Stack ensembling
  • Summary
  • Chapter 8: Ensemble Diagnostics
  • Technical requirements
  • What is ensemble diagnostics?
  • Ensemble diversity
  • Numeric prediction
  • Class prediction
  • Pairwise measure
  • Disagreement measure
  • Yule's or Q-statistic
  • Correlation coefficient measure
  • Cohen's statistic
  • Double-fault measure
  • Interrating agreement
  • Entropy measure
  • Kohavi-Wolpert measure
  • Disagreement measure for ensemble
  • Measurement of interrater agreement
  • Summary
  • Chapter 9: Ensembling Regression Models
  • Technical requirements
  • Pre-processing the housing data
  • Visualization and variable reduction
  • Variable clustering
  • Regression models
  • Linear regression model
  • Neural networks
  • Regression tree
  • Prediction for regression models
  • Bagging and Random Forests
  • Boosting regression models
  • Stacking methods for regression models
  • Summary
  • Chapter 10: Ensembling Survival Models
  • Core concepts of survival analysis
  • Nonparametric inference
  • Regression models - parametric and Cox proportional hazards models
  • Survival tree
  • Ensemble survival models
  • Summary
  • Chapter 11: Ensembling Time Series Models
  • Technical requirements
  • Time series datasets
  • AirPassengers
  • co2
  • uspop
  • gas
  • Car Sales
  • austres
  • WWWusage
  • Time series visualization
  • Core concepts and metrics
  • Essential time series models
  • Naïve forecasting
  • Seasonal, trend, and loess fitting
  • Exponential smoothing state space model
  • Auto-regressive Integrated Moving Average (ARIMA) models
  • Auto-regressive neural networks
  • Messing it all up
  • Bagging and time series
  • Ensemble time series models
  • Summary
  • Chapter 12: What's Next?
  • Bibliography
  • References
  • R package references
  • Other Books You May Enjoy
  • Index