Statistical application development with R and Python: power of statistics using R and Python
Software Implementation Illustrated with R and Python. About This Book: Learn the nature of data through software which takes the preliminary concepts right away using R and Python. Understand data modeling and visualization to perform efficient statistical analysis with this guide. Get well versed wi...
Other Authors: | |
---|---|
Format: | eBook |
Language: | English |
Published: | Birmingham, England ; Mumbai, [India] : Packt, 2017. |
Edition: | Second edition |
Subjects: | |
See on Biblioteca Universitat Ramon Llull: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009630737506719 |
Table of Contents:
- Cover
- Copyright
- Credits
- About the Author
- Acknowledgment
- About the Reviewers
- www.PacktPub.com
- Customer Feedback
- Table of Contents
- Preface
- Chapter 1: Data Characteristics
- Questionnaire and its components
- Understanding the data characteristics in an R environment
- Experiments with uncertainty in computer science
- Installing and setting up R
- Using R packages
- RSADBE - the book's R package
- Python installation and setup
- Using pip for packages
- IDEs for R and Python
- The companion code bundle
- Discrete distributions
- Discrete uniform distribution
- Binomial distribution
- Hypergeometric distribution
- Negative binomial distribution
- Poisson distribution
- Continuous distributions
- Uniform distribution
- Exponential distribution
- Normal distribution
- Summary
- Chapter 2: Import/Export Data
- Packages and settings - R and Python
- Understanding data.frame and other formats
- Constants, vectors, and matrices
- Time for action - understanding constants, vectors, and basic arithmetic
- What just happened?
- Doing it in Python
- Time for action - matrix computations
- What just happened?
- Doing it in Python
- The list object
- Time for action - creating a list object
- What just happened?
- The data.frame object
- Time for action - creating a data.frame object
- What just happened?
- Have a go hero
- The table object
- Time for action - creating the Titanic dataset as a table object
- What just happened?
- Have a go hero
- Using utils and the foreign packages
- Time for action - importing data from external files
- What just happened?
- Doing it in Python
- Importing data from MySQL
- Doing it in Python
- Exporting data/graphs
- Exporting R objects
- Exporting graphs
- Time for action - exporting a graph
- What just happened?
- Managing R sessions
- Time for action - session management
- What just happened?
- Doing it in Python
- Pop quiz
- Summary
- Chapter 3: Data Visualization
- Packages and settings - R and Python
- Visualization techniques for categorical data
- Bar chart
- Going through the built-in examples of R
- Time for action - bar charts in R
- What just happened?
- Doing it in Python
- Have a go hero
- Dot chart
- Time for action - dot charts in R
- What just happened?
- Doing it in Python
- Spine and mosaic plots
- Time for action - spine plot for the shift and operator data
- What just happened?
- Time for action - mosaic plot for the Titanic dataset
- What just happened?
- Pie chart and the fourfold plot
- Visualization techniques for continuous variable data
- Boxplot
- Time for action - using the boxplot
- What just happened?
- Doing it in Python
- Histogram
- Time for action - understanding the effectiveness of histograms
- What just happened?
- Doing it in Python
- Have a go hero
- Scatter plot
- Time for action - plot and pairs R functions
- What just happened?
- Doing it in Python
- Have a go hero
- Pareto chart
- A brief peek at ggplot2
- Time for action - qplot
- What just happened?
- Time for action - ggplot
- What just happened?
- Pop quiz
- Summary
- Chapter 4: Exploratory Analysis
- Packages and settings - R and Python
- Essential summary statistics
- Percentiles, quantiles, and median
- Hinges
- Interquartile range
- Time for action - the essential summary statistics for The Wall dataset
- What just happened?
- Techniques for exploratory analysis
- The stem-and-leaf plot
- Time for action - the stem function in play
- What just happened?
- Letter values
- Data re-expression
- Have a go hero
- Bagplot - a bivariate boxplot
- Time for action - the bagplot display for multivariate datasets
- What just happened?
- Resistant line
- Time for action - resistant line as a first regression model
- What just happened?
- Smoothing data
- Time for action - smoothening the cow temperature data
- What just happened?
- Median polish
- Time for action - the median polish algorithm
- What just happened?
- Have a go hero
- Summary
- Chapter 5: Statistical Inference
- Packages and settings - R and Python
- Maximum likelihood estimator
- Visualizing the likelihood function
- Time for action - visualizing the likelihood function
- What just happened?
- Doing it in Python
- Finding the maximum likelihood estimator
- Using the fitdistr function
- Time for action - finding the MLE using mle and fitdistr functions
- What just happened?
- Confidence intervals
- Time for action - confidence intervals
- What just happened?
- Doing it in Python
- Hypothesis testing
- Binomial test
- Time for action - testing probability of success
- What just happened?
- Tests of proportions and the chi-square test
- Time for action - testing proportions
- What just happened?
- Tests based on normal distribution - one sample
- Time for action - testing one-sample hypotheses
- What just happened?
- Have a go hero
- Tests based on normal distribution - two sample
- Time for action - testing two-sample hypotheses
- What just happened?
- Have a go hero
- Doing it in Python
- Summary
- Chapter 6: Linear Regression Analysis
- Packages and settings - R and Python
- The essence of regression
- The simple linear regression model
- What happens to the arbitrary choice of parameters?
- Time for action - the arbitrary choice of parameters
- What just happened?
- Building a simple linear regression model
- Time for action - building a simple linear regression model
- What just happened?
- Have a go hero
- ANOVA and the confidence intervals
- Time for action - ANOVA and the confidence intervals
- What just happened?
- Model validation
- Time for action - residual plots for model validation
- What just happened?
- Doing it in Python
- Have a go hero
- Multiple linear regression model
- Averaging k simple linear regression models or a multiple linear regression model
- Time for action - averaging k simple linear regression models
- What just happened?
- Building a multiple linear regression model
- Time for action - building a multiple linear regression model
- What just happened?
- The ANOVA and confidence intervals for the multiple linear regression model
- Time for action - the ANOVA and confidence intervals for the multiple linear regression model
- What just happened?
- Have a go hero
- Useful residual plots
- Time for action - residual plots for the multiple linear regression model
- What just happened?
- Regression diagnostics
- Leverage points
- Influential points
- DFFITS and DFBETAS
- The multicollinearity problem
- Time for action - addressing the multicollinearity problem for the gasoline data
- What just happened?
- Doing it in Python
- Model selection
- Stepwise procedures
- The backward elimination
- The forward selection
- The stepwise regression
- Criterion-based procedures
- Time for action - model selection using the backward, forward, and AIC criteria
- What just happened?
- Have a go hero
- Summary
- Chapter 7: Logistic Regression Model
- Packages and settings - R and Python
- The binary regression problem
- Time for action - limitation of linear regression model
- What just happened?
- Probit regression model
- Time for action - understanding the constants
- What just happened?
- Doing it in Python
- Logistic regression model
- Time for action - fitting the logistic regression model
- What just happened?
- Doing it in Python
- Hosmer-Lemeshow goodness-of-fit test statistic
- Time for action - Hosmer-Lemeshow goodness-of-fit statistic
- What just happened?
- Model validation and diagnostics
- Residual plots for the GLM
- Time for action - residual plots for logistic regression model
- What just happened?
- Doing it in Python
- Have a go hero
- Influence and leverage for the GLM
- Time for action - diagnostics for the logistic regression
- What just happened?
- Have a go hero
- Receiving operator curves
- Time for action - ROC construction
- What just happened?
- Doing it in Python
- Logistic regression for the German credit screening dataset
- Time for action - logistic regression for the German credit dataset
- What just happened?
- Doing it in Python
- Have a go hero
- Summary
- Chapter 8: Regression Models with Regularization
- Packages and settings - R and Python
- The overfitting problem
- Time for action - understanding overfitting
- What just happened?
- Doing it in Python
- Have a go hero
- Regression spline
- Basis functions
- Piecewise linear regression model
- Time for action - fitting piecewise linear regression models
- What just happened?
- Natural cubic splines and the general B-splines
- Time for action - fitting the spline regression models
- What just happened?
- Ridge regression for linear models
- Protecting against overfitting
- Time for action - ridge regression for the linear regression model
- What just happened?
- Doing it in Python
- Ridge regression for logistic regression models
- Time for action - ridge regression for the logistic regression model
- What just happened?
- Another look at model assessment
- Time for action - selecting iteratively and other topics
- What just happened?
- Pop quiz
- Summary
- Chapter 9: Classification and Regression Trees