Machine learning techniques for text: apply modern techniques with Python for text processing, dimensionality reduction, classification, and evaluation
Machine learning and Python offer unique opportunities to process text data. This book will equip you with the skills you need to undertake a role in the field. The content keeps the right balance between need-to-know theory and hands-on practice, grounding the discussion around different case studi...
Other Authors: | |
---|---|
Format: | E-book |
Language: | English |
Published: | Birmingham, England : Packt Publishing Ltd [2022] |
Edition: | 1st ed |
Subjects: | |
View at Biblioteca Universitat Ramon Llull: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009701333306719 |
Table of Contents:
- Cover
- Title Page
- Copyright and Credits
- Acknowledgments
- Contributors
- Table of Contents
- Preface
- Chapter 1: Introducing Machine Learning for Text
- The language phenomenon
- The data explosion
- The era of AI
- Relevant research fields
- The machine learning paradigm
- Taxonomy of machine learning techniques
- Supervised learning
- Unsupervised learning
- Semi-supervised learning
- Reinforcement learning
- Visualization of the data
- Evaluation of the results
- Summary
- Chapter 2: Detecting Spam Emails
- Technical requirements
- Understanding spam detection
- Explaining feature engineering
- Extracting word representations
- Using label encoding
- Using one-hot encoding
- Using token count encoding
- Using tf-idf encoding
- Executing data preprocessing
- Tokenizing the input
- Removing stop words
- Stemming the words
- Lemmatizing the words
- Performing classification
- Getting the data
- Creating the train and test sets
- Preprocessing the data
- Extracting the features
- Introducing the Support Vector Machines algorithm
- Understanding Bayes' theorem
- Measuring classification performance
- Calculating accuracy
- Calculating precision and recall
- Calculating the F-score
- Creating ROC and AUC
- Creating precision-recall curves
- Summary
- Chapter 3: Classifying Topics of Newsgroup Posts
- Technical requirements
- Understanding topic classification
- Performing exploratory data analysis
- Executing dimensionality reduction
- Understanding principal component analysis
- Understanding linear discriminant analysis
- Putting PCA and LDA into action
- Introducing the k-nearest neighbors algorithm
- Performing feature extraction
- Performing cross-validation
- Performing classification
- Comparison to the baseline model
- Introducing the random forest algorithm
- Contracting a decision tree
- Performing classification
- Extracting word embedding representation
- Understanding word embedding
- Performing vector arithmetic
- Performing classification
- Using the fastText tool
- Summary
- Chapter 4: Extracting Sentiments from Product Reviews
- Technical requirements
- Understanding sentiment analysis
- Performing exploratory data analysis
- Using the Software dataset
- Exploiting the ratings of products
- Extracting the word count of reviews
- Exploiting the helpfulness score
- Introducing linear regression
- Putting linear regression into action
- Introducing logistic regression
- Understanding gradient descent
- Using logistic regression
- Creating training and test sets
- Performing classification
- Applying regularization
- Introducing deep neural networks
- Understanding logic gates
- Understanding perceptrons
- Understanding artificial neurons
- Creating artificial neural networks
- Training artificial neural networks
- Performing classification
- Summary
- Chapter 5: Recommending Music Titles
- Technical requirements
- Understanding recommender systems
- Performing exploratory data analysis
- Cleaning the data
- Extracting information from the data
- Understanding the Pearson correlation
- Introducing content-based filtering
- Extracting music recommendations
- Introducing collaborative filtering
- Using memory-based collaborative recommenders
- Applying SVD
- Clustering handwritten text
- Applying t-SNE
- Using model-based collaborative systems
- Introducing autoencoders
- Summary
- Chapter 6: Teaching Machines to Translate
- Technical requirements
- Understanding machine translation
- Introducing rule-based machine translation
- Using direct machine translation
- Using transfer-based machine translation
- Using interlingual machine translation
- Introducing example-based machine translation
- Introducing statistical machine translation
- Modeling the translation problem
- Creating the models
- Introducing sequence-to-sequence learning
- Deciphering the encoder/decoder architecture
- Understanding long short-term memory units
- Putting seq2seq in action
- Measuring translation performance
- Summary
- Chapter 7: Summarizing Wikipedia Articles
- Technical requirements
- Understanding text summarization
- Introducing web scraping
- Scraping popular quotes
- Scraping book reviews
- Scraping Wikipedia articles
- Performing extractive summarization
- Performing abstractive summarization
- Introducing the attention mechanism
- Introducing transformers
- Putting the transformer into action
- Measuring summarization performance
- Summary
- Chapter 8: Detecting Hateful and Offensive Language
- Technical requirements
- Introducing social networks
- Understanding BERT
- Pre-training phase
- Fine-tuning phase
- Putting BERT into action
- Introducing boosting algorithms
- Understanding AdaBoost
- Understanding gradient boosting
- Understanding XGBoost
- Creating validation sets
- Learning the myth of Icarus
- Extracting the datasets
- Treating imbalanced datasets
- Classifying with BERT
- Training the classifier
- Applying early stopping
- Understanding CNN
- Adding pooling layers
- Including CNN layers
- Summary
- Chapter 9: Generating Text in Chatbots
- Technical requirements
- Understanding text generation
- Creating a retrieval-based chatbot
- Understanding language modeling
- Understanding perplexity
- Building a language model
- Creating a generative chatbot
- Using a pre-trained model
- Creating the GUI
- Creating the web chatbot
- Fine-tuning a pre-trained model
- Summary
- Chapter 10: Clustering Speech-to-Text Transcriptions
- Technical requirements
- Understanding text clustering
- Preprocessing the data
- Using speech-to-text
- Introducing the K-means algorithm
- Putting K-means into action
- Introducing DBSCAN
- Putting DBSCAN into action
- Assessing DBSCAN
- Introducing the hierarchical clustering algorithm
- Putting hierarchical clustering into action
- Introducing the LDA algorithm
- Putting LDA into action
- Summary
- Index
- Other Books You May Enjoy