Machine learning techniques for text: apply modern techniques with Python for text processing, dimensionality reduction, classification, and evaluation

Machine learning and Python offer unique opportunities to process text data. This book will equip you with the skills you need to undertake a role in the field. The content keeps the right balance between need-to-know theory and hands-on practice, grounding the discussion around different case studi...


Bibliographic Details
Other Authors:	Tsourakis, Nikos (author)
Format:	E-book
Language:	English
Published:	Birmingham, England : Packt Publishing Ltd [2022]
Edition:	1st ed.
Subjects:	Text data mining. Machine learning. Machine learning > Computer programs. Python (Computer program language)
View in the Universitat Ramon Llull Library:	https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009701333306719

Table of Contents:

Cover
Title Page
Copyright and Credits
Acknowledgments
Contributors
Table of Contents
Preface
Chapter 1: Introducing Machine Learning for Text
The language phenomenon
The data explosion
The era of AI
Relevant research fields
The machine learning paradigm
Taxonomy of machine learning techniques
Supervised learning
Unsupervised learning
Semi-supervised learning
Reinforcement learning
Visualization of the data
Evaluation of the results
Summary
Chapter 2: Detecting Spam Emails
Technical requirements
Understanding spam detection
Explaining feature engineering
Extracting word representations
Using label encoding
Using one-hot encoding
Using token count encoding
Using tf-idf encoding
Executing data preprocessing
Tokenizing the input
Removing stop words
Stemming the words
Lemmatizing the words
Performing classification
Getting the data
Creating the train and test sets
Preprocessing the data
Extracting the features
Introducing the Support Vector Machines algorithm
Understanding Bayes' theorem
Measuring classification performance
Calculating accuracy
Calculating precision and recall
Calculating the F-score
Creating ROC and AUC
Creating precision-recall curves
Summary
Chapter 3: Classifying Topics of Newsgroup Posts
Technical requirements
Understanding topic classification
Performing exploratory data analysis
Executing dimensionality reduction
Understanding principal component analysis
Understanding linear discriminant analysis
Putting PCA and LDA into action
Introducing the k-nearest neighbors algorithm
Performing feature extraction
Performing cross-validation
Performing classification
Comparison to the baseline model
Introducing the random forest algorithm
Constructing a decision tree
Performing classification
Extracting word embedding representation
Understanding word embedding
Performing vector arithmetic
Performing classification
Using the fastText tool
Summary
Chapter 4: Extracting Sentiments from Product Reviews
Technical requirements
Understanding sentiment analysis
Performing exploratory data analysis
Using the Software dataset
Exploiting the ratings of products
Extracting the word count of reviews
Exploiting the helpfulness score
Introducing linear regression
Putting linear regression into action
Introducing logistic regression
Understanding gradient descent
Using logistic regression
Creating training and test sets
Performing classification
Applying regularization
Introducing deep neural networks
Understanding logic gates
Understanding perceptrons
Understanding artificial neurons
Creating artificial neural networks
Training artificial neural networks
Performing classification
Summary
Chapter 5: Recommending Music Titles
Technical requirements
Understanding recommender systems
Performing exploratory data analysis
Cleaning the data
Extracting information from the data
Understanding the Pearson correlation
Introducing content-based filtering
Extracting music recommendations
Introducing collaborative filtering
Using memory-based collaborative recommenders
Applying SVD
Clustering handwritten text
Applying t-SNE
Using model-based collaborative systems
Introducing autoencoders
Summary
Chapter 6: Teaching Machines to Translate
Technical requirements
Understanding machine translation
Introducing rule-based machine translation
Using direct machine translation
Using transfer-based machine translation
Using interlingual machine translation
Introducing example-based machine translation
Introducing statistical machine translation
Modeling the translation problem
Creating the models
Introducing sequence-to-sequence learning
Deciphering the encoder/decoder architecture
Understanding long short-term memory units
Putting seq2seq in action
Measuring translation performance
Summary
Chapter 7: Summarizing Wikipedia Articles
Technical requirements
Understanding text summarization
Introducing web scraping
Scraping popular quotes
Scraping book reviews
Scraping Wikipedia articles
Performing extractive summarization
Performing abstractive summarization
Introducing the attention mechanism
Introducing transformers
Putting the transformer into action
Measuring summarization performance
Summary
Chapter 8: Detecting Hateful and Offensive Language
Technical requirements
Introducing social networks
Understanding BERT
Pre-training phase
Fine-tuning phase
Putting BERT into action
Introducing boosting algorithms
Understanding AdaBoost
Understanding gradient boosting
Understanding XGBoost
Creating validation sets
Learning the myth of Icarus
Extracting the datasets
Treating imbalanced datasets
Classifying with BERT
Training the classifier
Applying early stopping
Understanding CNN
Adding pooling layers
Including CNN layers
Summary
Chapter 9: Generating Text in Chatbots
Technical requirements
Understanding text generation
Creating a retrieval-based chatbot
Understanding language modeling
Understanding perplexity
Building a language model
Creating a generative chatbot
Using a pre-trained model
Creating the GUI
Creating the web chatbot
Fine-tuning a pre-trained model
Summary
Chapter 10: Clustering Speech-to-Text Transcriptions
Technical requirements
Understanding text clustering
Preprocessing the data
Using speech-to-text
Introducing the K-means algorithm
Putting K-means into action
Introducing DBSCAN
Putting DBSCAN into action
Assessing DBSCAN
Introducing the hierarchical clustering algorithm
Putting hierarchical clustering into action
Introducing the LDA algorithm
Putting LDA into action
Summary
Index
Other Books You May Enjoy
