R programming by example: practical, hands-on projects to help you get started with R

This step-by-step guide demonstrates how to build simple-to-advanced applications through examples in R using modern tools. About This Book: Get a firm hold on the fundamentals of R through practical hands-on examples. Get started with good R programming fundamentals for data science. Exploit the diffe...

Bibliographic Details
Other Authors: Navarro, Omar Trejo, author
Format: E-book
Language: English
Published: Birmingham, England : Packt, 2017.
Edition: 1st edition
Subjects:
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009630715506719
Table of Contents:
  • Cover
  • Title Page
  • Copyright
  • Credits
  • About the Author
  • About the Reviewer
  • www.PacktPub.com
  • Customer Feedback
  • Table of Contents
  • Preface
  • Chapter 1: Introduction to R
  • What R is and what it isn't
  • The inspiration for R - the S language
  • R is a high quality statistical computing system
  • R is a flexible programming language
  • R is free, as in freedom and as in free beer
  • What R is not good for
  • Comparing R with other software
  • The interpreter and the console
  • Tools to work efficiently with R
  • Pick an IDE or a powerful editor
  • The send to console functionality
  • The efficient write-execute loop
  • Executing R code in non-interactive sessions
  • How to use this book
  • Tracking state with symbols and variables
  • Working with data types and data structures
  • Numerics
  • Special values
  • Characters
  • Logicals
  • Vectors
  • Factors
  • Matrices
  • Lists
  • Data frames
  • Divide and conquer with functions
  • Optional arguments
  • Functions as arguments
  • Operators are functions
  • Coercion
  • Complex logic with control structures
  • If… else conditionals
  • For loops
  • While loops
  • The examples in this book
  • Summary
  • Chapter 2: Understanding Votes with Descriptive Statistics
  • This chapter's required packages
  • The Brexit votes example
  • Cleaning and setting up the data
  • Summarizing the data into a data frame
  • Getting intuition with graphs and correlations
  • Visualizing variable distributions
  • Using matrix scatter plots for a quick overview
  • Getting a better look with detailed scatter plots
  • Understanding interactions with correlations
  • Creating a new dataset with what we've learned
  • Building new variables with principal components
  • Putting it all together into high-quality code
  • Planning before programming
  • Understanding the fundamentals of high-quality code
  • Programming by visualizing the big picture
  • Summary
  • Chapter 3: Predicting Votes with Linear Models
  • Required packages
  • Setting up the data
  • Training and testing datasets
  • Predicting votes with linear models
  • Checking model assumptions
  • Checking linearity with scatter plots
  • Checking normality with histograms and quantile-quantile plots
  • Checking homoscedasticity with residual plots
  • Checking no collinearity with correlations
  • Measuring accuracy with score functions
  • Programmatically finding the best model
  • Generating model combinations
  • Predicting votes from wards with unknown data
  • Summary
  • Chapter 4: Simulating Sales Data and Working with Databases
  • Required packages
  • Designing our data tables
  • The basic variables
  • Simplifying assumptions
  • Potential pitfalls
  • The too-much-empty-space problem
  • The too-much-repeated-data problem
  • Simulating the sales data
  • Simulating numeric data according to distribution assumptions
  • Simulating categorical values using factors
  • Simulating dates within a range
  • Simulating numbers under shared restrictions
  • Simulating strings for complex identifiers
  • Putting everything together
  • Simulating the client data
  • Simulating the client messages data
  • Working with relational databases
  • Summary
  • Chapter 5: Communicating Sales with Visualizations
  • Required packages
  • Extending our data with profit metrics
  • Building blocks for reusable high-quality graphs
  • Starting with simple applications for bar graphs
  • Adding a third dimension with colors
  • Graphing top performers with bar graphs
  • Graphing disaggregated data with boxplots
  • Scatter plots with joint and marginal distributions
  • Pricing and profitability by protein source and continent
  • Client birth dates, gender, and ratings
  • Developing our own graph type - radar graphs
  • Exploring with interactive 3D scatter plots
  • Looking at dynamic data with time-series
  • Looking at geographical data with static maps
  • Navigating geographical data with interactive maps
  • Maps you can navigate and zoom-in to
  • High-tech-looking interactive globe
  • Summary
  • Chapter 6: Understanding Reviews with Text Analysis
  • This chapter's required packages
  • What is text analysis and how does it work?
  • Preparing, training, and testing data
  • Building the corpus with tokenization and data cleaning
  • Document feature matrices
  • Training models with cross validation
  • Training our first predictive model
  • Improving speed with parallelization
  • Computing predictive accuracy and confusion matrices
  • Improving our results with TF-IDF
  • Adding flexibility with N-grams
  • Reducing dimensionality with SVD
  • Extending our analysis with cosine similarity
  • Digging deeper with sentiment analysis
  • Testing our predictive model with unseen data
  • Retrieving text data from Twitter
  • Summary
  • Chapter 7: Developing Automatic Presentations
  • Required packages
  • Why invest in automation?
  • Literate programming as a content creation methodology
  • Reproducibility as a benefit of literate programming
  • The basic tools for an automation pipeline
  • A gentle introduction to Markdown
  • Text
  • Headers
  • Header Level 1
  • Header Level 2
  • Header Level 3
  • Header Level 4
  • Lists
  • Tables
  • Links
  • Images
  • Quotes
  • Code
  • Mathematics
  • Extending Markdown with R Markdown
  • Code chunks
  • Tables
  • Graphs
  • Chunk options
  • Global chunk options
  • Caching
  • Producing the final output with knitr
  • Developing graphs and analysis as we normally would
  • Building our presentation with R Markdown
  • Summary
  • Chapter 8: Object-Oriented System to Track Cryptocurrencies
  • This chapter's required packages
  • The cryptocurrencies example
  • A brief introduction to object-oriented programming
  • The purpose of object-oriented programming
  • Important concepts behind object-oriented languages
  • Encapsulation
  • Polymorphism
  • Hierarchies
  • Classes and constructors
  • Public and private methods
  • Interfaces, factories, and patterns in general
  • Introducing three object models in R - S3, S4, and R6
  • The first source of confusion - various object models
  • The second source of confusion - generic functions
  • The S3 object model
  • Classes, constructors, and composition
  • Public methods and polymorphism
  • Encapsulation and mutability
  • Inheritance
  • The S4 object model
  • Classes, constructors, and composition
  • Public methods and polymorphism
  • Encapsulation and mutability
  • Inheritance
  • The R6 object model
  • Classes, constructors, and composition
  • Public methods and polymorphism
  • Encapsulation and mutability
  • Inheritance
  • Active bindings
  • Finalizers
  • The architecture behind our cryptocurrencies system
  • Starting simple with timestamps using S3 classes
  • Implementing cryptocurrency assets using S4 classes
  • Implementing our storage layer with R6 classes
  • Communicating available behavior with a database interface
  • Implementing a database-like storage system with CSV files
  • Easily allowing new database integration with a factory
  • Encapsulating multiple databases with a storage layer
  • Retrieving live data for markets and wallets with R6 classes
  • Creating a very simple requester to isolate API calls
  • Developing our exchanges infrastructure
  • Developing our wallets infrastructure
  • Implementing our wallet requesters
  • Finally introducing users with S3 classes
  • Helping ourselves with a centralized settings file
  • Saving our initial user data into the system
  • Activating our system with two simple functions
  • Some advice when working with object-oriented systems
  • Summary
  • Chapter 9: Implementing an Efficient Simple Moving Average
  • Required packages
  • Starting by using good algorithms
  • Just how much impact can algorithm selection have?
  • How fast is fast enough?
  • Calculating simple moving averages inefficiently
  • Simulating the time-series
  • Our first (very inefficient) attempt at an SMA
  • Understanding why R can be slow
  • Object immutability
  • Interpreted dynamic typing
  • Memory-bound processes
  • Single-threaded processes
  • Measuring by profiling and benchmarking
  • Profiling fundamentals with Rprof()
  • Benchmarking manually with system.time()
  • Benchmarking automatically with microbenchmark()
  • Easily achieving high benefit-cost improvements
  • Using the simple data structure for the job
  • Vectorizing as much as possible
  • Removing unnecessary logic
  • Moving checks out of iterative processes
  • If you can, avoid iterating at all
  • Using R's way of iterating efficiently
  • Avoiding sending data structures with overheads
  • Using parallelization to divide and conquer
  • How deep does the parallelization rabbit hole go?
  • Practical parallelization with R
  • Using C++ and Fortran to accelerate calculations
  • Using an old-school approach with Fortran
  • Using a modern approach with C++
  • Looking back at what we have achieved
  • Other topics of interest to enhance performance
  • Preallocating memory to avoid duplication
  • Making R code a bit faster with byte code compilation
  • Just-in-time (JIT) compilation of R code
  • Using memoization or cache layers
  • Improving our data and memory management
  • Using specialized packages for performance
  • Flexibility and power with cloud computing
  • Specialized R distributions
  • Summary
  • Chapter 10: Adding Interactivity with Dashboards
  • Required packages
  • Introducing the Shiny application architecture and reactivity