R data analysis projects: build end-to-end analytics systems to get deeper insights from your data

Get valuable insights from your data by building data analysis systems from scratch with R. About This Book: A handy guide to take your understanding of data analysis with R to the next level. Real-world projects that focus on problems in finance, network analysis, social media, and more. From data man...


Bibliographic Details
Other Authors: Subramanian, Gopi (author)
Format: Electronic book
Language: English
Published: Birmingham, England ; Mumbai, [India] : Packt Publishing 2017.
Edition: 1st edition
Subjects:
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009630405306719
Table of Contents:
  • Cover
  • Title Page
  • Copyright
  • Credits
  • About the Author
  • About the Reviewer
  • www.PacktPub.com
  • Customer Feedback
  • Table of Contents
  • Preface
  • Chapter 1: Association Rule Mining
  • Understanding the recommender systems
  • Transactions
  • Weighted transactions
  • Our web application
  • Retailer use case and data
  • Association rule mining
  • Support and confidence thresholds
  • The cross-selling campaign
  • Leverage
  • Conviction
  • Weighted association rule mining
  • Hyperlink-induced topic search (HITS)
  • Negative association rules
  • Rules visualization
  • Wrapping up
  • Summary
  • Chapter 2: Fuzzy Logic Induced Content-Based Recommendation
  • Introducing content-based recommendation
  • News aggregator use case and data
  • Designing the content-based recommendation engine
  • Building a similarity index
  • Bag-of-words
  • Term frequency
  • Document frequency
  • Inverse document frequency (IDF)
  • TFIDF
  • Why cosine similarity?
  • Searching
  • Polarity scores
  • Jaccard's distance
  • Jaccard's distance/index
  • Ranking search results
  • Fuzzy logic
  • Fuzzification
  • Defining the rules
  • Evaluating the rules
  • Defuzzification
  • Complete R Code
  • Summary
  • Chapter 3: Collaborative Filtering
  • Collaborative filtering
  • Memory-based approach
  • Model-based approach
  • Latent factor approach
  • Recommenderlab package
  • Popular approach
  • Use case and data
  • Designing and implementing collaborative filtering
  • Ratings matrix
  • Normalization
  • Train test split
  • Train model
  • User-based models
  • Item-based models
  • Factor-based models
  • Complete R Code
  • Summary
  • Chapter 4: Taming Time Series Data Using Deep Neural Networks
  • Time series data
  • Non-seasonal time series
  • Seasonal time series
  • Time series as a regression problem
  • Deep neural networks
  • Forward cycle
  • Backward cycle
  • Introduction to the MXNet R package
  • Symbolic programming in MXNet
  • Softmax activation
  • Use case and data
  • Deep networks for time series prediction
  • Training test split
  • Complete R code
  • Summary
  • Chapter 5: Twitter Text Sentiment Classification Using Kernel Density Estimates
  • Kernel density estimation
  • Twitter text
  • Sentiment classification
  • Dictionary methods
  • Machine learning methods
  • Our approach
  • Dictionary-based scoring
  • Text pre-processing
  • Term-frequency inverse document frequency (TFIDF)
  • Delta TFIDF
  • Building a sentiment classifier
  • Assembling an RShiny application
  • Complete R code
  • Summary
  • Chapter 6: Record Linkage - Stochastic and Machine Learning Approaches
  • Introducing our use case
  • Demonstrating the use of RecordLinkage package
  • Feature generation
  • String features
  • Phonetic features
  • Stochastic record linkage
  • Expectation maximization method
  • Weights-based method
  • Machine learning-based record linkage
  • Unsupervised learning
  • Supervised learning
  • Building an RShiny application
  • Complete R code
  • Feature generation
  • Expectation maximization method
  • Weights-based method
  • Machine learning method
  • RShiny application
  • Summary
  • Chapter 7: Streaming Data Clustering Analysis in R
  • Streaming data and its challenges
  • Bounded problems
  • Drift
  • Single pass
  • Real time
  • Introducing stream clustering
  • Macro-cluster
  • Introducing the stream package
  • Data stream data
  • DSD as a static simulator
  • DSD as a simulator with drift
  • DSD connecting to memory, file, or database
  • Inflight operation
  • Can we connect this DSD to an actual data stream?
  • Data stream task
  • Use case and data
  • Speed layer
  • Batch layer
  • Reservoir sampling
  • Complete R code
  • Summary
  • Chapter 8: Analyze and Understand Networks Using R
  • Graphs in R
  • Degree of a vertex
  • Strength of a vertex
  • Adjacency Matrix
  • More networks in R
  • Centrality of a vertex
  • Farness and Closeness of a node
  • Finding the shortest path between nodes
  • Random walk on a graph
  • Use case and data
  • Data preparation
  • Product network analysis
  • Building an RShiny application
  • The complete R script
  • Summary
  • Index