The Statistics and Machine Learning with R Workshop: Unlock the Power of Efficient Data Science Modeling with This Hands-On Guide
Learn the fundamentals of statistics and machine learning using R libraries for data processing, visualization, model training, and statistical inference. Key Features: Advance your ML career with the help of detailed explanations, intuitive illustrations, and code examples. Gain practical insights int...
Other authors: | |
---|---|
Format: | E-book |
Language: | English |
Published: | Birmingham, England : Packt Publishing Ltd [2023] |
Edition: | First edition |
Subjects: | |
View at the Universitat Ramon Llull Library: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009781239706719 |
Table of Contents:
- Cover
- Title Page
- Copyright
- Dedication
- Contributors
- Table of Contents
- Preface
- Part 1: Statistics Essentials
- Chapter 1: Getting Started with R
- Technical requirements
- Introducing R
- Covering the R and RStudio basics
- Common data types in R
- Common data structures in R
- Vector
- Matrix
- Data frame
- List
- Control logic in R
- Relational operators
- Logical operators
- Conditional statements
- Loops
- Exploring functions in R
- Summary
- Chapter 2: Data Processing with dplyr
- Technical requirements
- Introducing tidyverse and dplyr
- Data transformation with dplyr
- Slicing the dataset using the filter() function
- Sorting the dataset using the arrange() function
- Adding or changing a column using the mutate() function
- Selecting columns using the select() function
- Selecting the top rows using the top_n() function
- Combining the five verbs
- Introducing other verbs
- Data aggregation with dplyr
- Counting observations using the count() function
- Aggregating data via group_by() and summarize()
- Data merging with dplyr
- Case study - working with the Stack Overflow dataset
- Summary
- Chapter 3: Intermediate Data Processing
- Technical requirements
- Transforming categorical and numeric variables
- Recoding categorical variables
- Creating variables using case_when()
- Binning numeric variables using cut()
- Reshaping the DataFrame
- Converting from long format into wide format using spread()
- Converting from wide format into long format using gather()
- Manipulating string data
- Creating strings
- Converting numbers into strings
- Connecting strings
- Working with stringr
- Basics of stringr
- Pattern matching in a string
- Splitting a string
- Replacing a string
- Putting it together
- Introducing regular expressions
- Working with tidy text mining
- Converting text into tidy data using unnest_tokens()
- Working with a document-term matrix
- Summary
- Chapter 4: Data Visualization with ggplot2
- Technical requirements
- Introducing ggplot2
- Building a scatter plot
- Understanding the grammar of graphics
- Geometries in graphics
- Understanding geometry in scatter plots
- Introducing bar charts
- Introducing line plots
- Controlling themes in graphics
- Adjusting themes
- Exploring ggthemes
- Summary
- Chapter 5: Exploratory Data Analysis
- Technical requirements
- EDA fundamentals
- Analyzing categorical data
- Summarizing categorical variables using counts
- Converting counts into proportions
- Marginal distribution and faceted bar charts
- Analyzing numerical data
- Visualization in higher dimensions
- Measuring the central concentration
- Measuring variability
- Working with skewed distributions
- EDA in practice
- Obtaining the stock price data
- Univariate analysis of individual stock prices
- Correlation analysis
- Summary
- Chapter 6: Effective Reporting with R Markdown
- Technical requirements
- Fundamentals of R Markdown
- Getting started with R Markdown
- Getting to know the YAML header
- Formatting textual information
- Writing R code
- Generating a financial analysis report
- Getting and displaying the data
- Performing data analysis
- Adding plots to the report
- Adding tables to the report
- Configuring code chunks
- Customizing R Markdown reports
- Adding a table of contents
- Creating a report with parameters
- Customizing the report style
- Summary
- Part 2: Fundamentals of Linear Algebra and Calculus in R
- Chapter 7: Linear Algebra in R
- Technical requirements
- Introducing linear algebra
- Working with vectors
- Working with matrices
- Matrix vector multiplication
- Matrix multiplication
- The identity matrix
- Transposing a matrix
- Inverting a matrix
- Solving a system of linear equations
- System of linear equations
- The solution to matrix-vector equations
- Geometric interpretation of solving a system of linear equations
- Obtaining a unique solution to a system of linear equations
- Overdetermined and underdetermined systems of linear equations
- Summary
- Chapter 8: Intermediate Linear Algebra in R
- Technical requirements
- Introducing the matrix determinant
- Interpreting the determinant
- Connection to the matrix rank
- Introducing the matrix trace
- Special properties of the matrix trace
- Understanding the matrix norm
- Understanding the vector norm
- Calculating the L1-norm of a vector
- Calculating the L2-norm of a vector
- Calculating the L∞-norm of a vector
- Understanding the matrix norm
- Calculating the L1-norm of a matrix
- Calculating the Frobenius norm of a matrix
- Calculating the infinity norm of a matrix
- Getting to know eigenvalues and eigenvectors
- Understanding scalar-vector multiplication
- Defining eigenvalues and eigenvectors
- Computing eigenvalues and eigenvectors
- Introducing principal component analysis
- Understanding the variance-covariance matrix
- Connecting to PCA
- Performing PCA
- Summary
- Chapter 9: Calculus in R
- Technical requirements
- Introducing calculus
- Differential and integral calculus
- More on functions
- Vertical line test
- Functional symmetry
- Increasing and decreasing functions
- Slope of a function
- Function composition
- Common functions
- Understanding limits
- Infinite limit
- Limit at infinity
- Introducing derivatives
- Common derivatives
- Common properties and rules of derivatives
- Introducing integral calculus
- Indefinite integrals
- Indefinite integrals of basic functions
- Properties of indefinite integrals
- Integration by parts
- Definite integrals
- Working with calculus in R
- Plotting basic functions
- Working with derivatives
- Using symbolic parameters
- Working with the second derivative
- Working with partial derivatives
- Working with integration in R
- More on antiderivatives
- Evaluating the definite integral
- Summary
- Part 3: Fundamentals of Mathematical Statistics in R
- Chapter 10: Probability Basics
- Technical requirements
- Introducing probability distribution
- Exploring common discrete probability distributions
- The Bernoulli distribution
- The binomial distribution
- The Poisson distribution
- Poisson approximation to binomial distribution
- The geometric distribution
- Comparing different discrete probability distributions
- Discovering common continuous probability distributions
- The normal distribution
- The exponential distribution
- Uniform distribution
- Generating normally distributed random samples
- Understanding common sampling distributions
- Common sampling distributions
- Understanding order statistics
- Extracting order statistics
- Calculating the value at risk
- Summary
- Chapter 11: Statistical Estimation
- Statistical inference for categorical data
- Statistical inference for a single parameter
- Introducing the General Social Survey dataset
- Calculating the sample proportion
- Calculating the confidence interval
- Interpreting the confidence interval of the sample proportion
- Hypothesis testing for the sample proportion
- Inference for the difference in sample proportions
- Type I and Type II errors
- Testing the independence of two categorical variables
- Introducing the contingency table
- Applying the chi-square test for independence between two categorical variables
- Statistical inference for numerical data
- Generating a bootstrap distribution for the median
- Constructing the bootstrapped confidence interval
- Re-centering a bootstrap distribution
- Introducing the central limit theorem used in t-distribution
- Constructing the confidence interval for the population mean using the t-distribution
- Performing hypothesis testing for two means
- Introducing ANOVA
- Summary
- Chapter 12: Linear Regression in R
- Introducing linear regression
- Understanding simple linear regression
- Introducing multiple linear regression
- Seeking a higher coefficient of determination
- More on adjusted R²
- Developing an MLR model
- Introducing Simpson's Paradox
- Working with categorical variables
- Introducing the interaction term
- Handling nonlinear terms
- More on the logarithmic transformation
- Working with the closed-form solution
- Dealing with multicollinearity
- Dealing with heteroskedasticity
- Introducing penalized linear regression
- Working with ridge regression
- Working with lasso regression
- Summary
- Chapter 13: Logistic Regression in R
- Technical requirements
- Introducing logistic regression
- Understanding the sigmoid function
- Grokking the logistic regression model
- Comparing logistic regression with linear regression
- Making predictions using the logistic regression model
- More on log odds and odds ratio
- Introducing the cross-entropy loss
- Evaluating a logistic regression model
- Dealing with an imbalanced dataset
- Penalized logistic regression
- Extending to multi-class classification
- Summary
- Chapter 14: Bayesian Statistics
- Technical requirements
- Introducing Bayesian statistics
- A first look into the Bayesian theorem
- Understanding the generative model
- Understanding prior distributions
- Introducing the likelihood function
- Introducing the posterior model
- Diving deeper into Bayesian inference