Python 3 Text Processing with NLTK 3 Cookbook: Over 80 practical recipes on natural language processing techniques using Python's NLTK 3.0
In Detail: This book will show you the essential techniques of text and language processing. Starting with tokenization, stemming, and the WordNet dictionary, you'll progress to part-of-speech tagging...
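The description names the techniques the recipes build on: tokenization, stemming, and WordNet lookups. As a point of reference only (not code from the book), a minimal sketch of those three steps with NLTK 3, assuming the punkt and wordnet data packages have already been fetched via nltk.download():

```python
# Illustrative sketch, not taken from the book: sentence tokenization,
# stemming, and a WordNet lookup with NLTK 3. Assumes the 'punkt' and
# 'wordnet' data packages are already installed via nltk.download().
from nltk.tokenize import sent_tokenize, word_tokenize
from nltk.stem import PorterStemmer
from nltk.corpus import wordnet

text = "NLTK makes text processing approachable. It ships with many corpora."

# Split the paragraph into sentences, then the first sentence into words.
sentences = sent_tokenize(text)
print(sentences)
print(word_tokenize(sentences[0]))

# Reduce a word to its stem with the Porter algorithm.
print(PorterStemmer().stem("processing"))  # 'process'

# Look up WordNet synsets for a word and show the first definition.
synsets = wordnet.synsets("cookbook")
if synsets:
    print(synsets[0].definition())
```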
Other Authors: | |
---|---|
Format: | eBook |
Language: | English |
Published: | Birmingham, England: Packt Publishing Ltd, 2014 |
Edition: | Second edition |
Subjects: | |
View at the Universitat Ramon Llull Library: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009627973506719 |
Table of Contents:
- Intro
- Python 3 Text Processing with NLTK 3 Cookbook
- Table of Contents
- Python 3 Text Processing with NLTK 3 Cookbook
- Credits
- About the Author
- About the Reviewers
- www.PacktPub.com
- Support files, eBooks, discount offers, and more
- Why Subscribe?
- Free Access for Packt account holders
- Preface
- What this book covers
- What you need for this book
- Who this book is for
- Conventions
- Reader feedback
- Customer support
- Downloading the example code
- Errata
- Piracy
- Questions
- 1. Tokenizing Text and WordNet Basics
- Introduction
- Tokenizing text into sentences
- Getting ready
- How to do it...
- How it works...
- There's more...
- Tokenizing sentences in other languages
- See also
- Tokenizing sentences into words
- How to do it...
- How it works...
- There's more...
- Separating contractions
- PunktWordTokenizer
- WordPunctTokenizer
- See also
- Tokenizing sentences using regular expressions
- Getting ready
- How to do it...
- How it works...
- There's more...
- Simple whitespace tokenizer
- See also
- Training a sentence tokenizer
- Getting ready
- How to do it...
- How it works...
- There's more...
- See also
- Filtering stopwords in a tokenized sentence
- Getting ready
- How to do it...
- How it works...
- There's more...
- See also
- Looking up Synsets for a word in WordNet
- Getting ready
- How to do it...
- How it works...
- There's more...
- Working with hypernyms
- Part of speech (POS)
- See also
- Looking up lemmas and synonyms in WordNet
- How to do it...
- How it works...
- There's more...
- All possible synonyms
- Antonyms
- See also
- Calculating WordNet Synset similarity
- How to do it...
- How it works...
- There's more...
- Comparing verbs
- Path and Leacock Chodorow (LCH) similarity
- See also
- Discovering word collocations
- Getting ready
- How to do it...
- How it works...
- There's more...
- Scoring functions
- Scoring ngrams
- See also
- 2. Replacing and Correcting Words
- Introduction
- Stemming words
- How to do it...
- How it works...
- There's more...
- The LancasterStemmer class
- The RegexpStemmer class
- The SnowballStemmer class
- See also
- Lemmatizing words with WordNet
- Getting ready
- How to do it...
- How it works...
- There's more...
- Combining stemming with lemmatization
- See also
- Replacing words matching regular expressions
- Getting ready
- How to do it...
- How it works...
- There's more...
- Replacement before tokenization
- See also
- Removing repeating characters
- Getting ready
- How to do it...
- How it works...
- There's more...
- See also
- Spelling correction with Enchant
- Getting ready
- How to do it...
- How it works...
- There's more...
- The en_GB dictionary
- Personal word lists
- See also
- Replacing synonyms
- Getting ready
- How to do it...
- How it works...
- There's more...
- CSV synonym replacement
- YAML synonym replacement
- See also
- Replacing negations with antonyms
- How to do it...
- How it works...
- There's more...
- See also
- 3. Creating Custom Corpora
- Introduction
- Setting up a custom corpus
- Getting ready
- How to do it...
- How it works...
- There's more...
- Loading a YAML file
- See also
- Creating a wordlist corpus
- Getting ready
- How to do it...
- How it works...
- There's more...
- Names wordlist corpus
- English words corpus
- See also
- Creating a part-of-speech tagged word corpus
- Getting ready
- How to do it...
- How it works...
- There's more...
- Customizing the word tokenizer
- Customizing the sentence tokenizer
- Customizing the paragraph block reader
- Customizing the tag separator
- Converting tags to a universal tagset
- See also
- Creating a chunked phrase corpus
- Getting ready
- How to do it...
- How it works...
- There's more...
- Tree leaves
- Treebank chunk corpus
- CoNLL2000 corpus
- See also
- Creating a categorized text corpus
- Getting ready
- How to do it...
- How it works...
- There's more...
- Category file
- Categorized tagged corpus reader
- Categorized corpora
- See also
- Creating a categorized chunk corpus reader
- Getting ready
- How to do it...
- How it works...
- There's more...
- Categorized CoNLL chunk corpus reader
- See also
- Lazy corpus loading
- How to do it...
- How it works...
- There's more...
- Creating a custom corpus view
- How to do it...
- How it works...
- There's more...
- Block reader functions
- Pickle corpus view
- Concatenated corpus view
- See also
- Creating a MongoDB-backed corpus reader
- Getting ready
- How to do it...
- How it works...
- There's more...
- See also
- Corpus editing with file locking
- Getting ready
- How to do it...
- How it works...
- 4. Part-of-speech Tagging
- Introduction
- Default tagging
- Getting ready
- How to do it...
- How it works...
- There's more...
- Evaluating accuracy
- Tagging sentences
- Untagging a tagged sentence
- See also
- Training a unigram part-of-speech tagger
- How to do it...
- How it works...
- There's more...
- Overriding the context model
- Minimum frequency cutoff
- See also
- Combining taggers with backoff tagging
- How to do it...
- How it works...
- There's more...
- Saving and loading a trained tagger with pickle
- See also
- Training and combining ngram taggers
- Getting ready
- How to do it...
- How it works...
- There's more...
- Quadgram tagger
- See also
- Creating a model of likely word tags
- How to do it...
- How it works...
- There's more...
- See also
- Tagging with regular expressions
- Getting ready
- How to do it...
- How it works...
- There's more...
- See also
- Affix tagging
- How to do it...
- How it works...
- There's more...
- Working with min_stem_length
- See also
- Training a Brill tagger
- How to do it...
- How it works...
- There's more...
- Tracing
- See also
- Training the TnT tagger
- How to do it...
- How it works...
- There's more...
- Controlling the beam search
- Significance of capitalization
- See also
- Using WordNet for tagging
- Getting ready
- How to do it...
- How it works...
- See also
- Tagging proper names
- How to do it...
- How it works...
- See also
- Classifier-based tagging
- How to do it...
- How it works...
- There's more...
- Detecting features with a custom feature detector
- Setting a cutoff probability
- Using a pre-trained classifier
- See also
- Training a tagger with NLTK-Trainer
- How to do it...
- How it works...
- There's more...
- Saving a pickled tagger
- Training on a custom corpus
- Training with universal tags
- Analyzing a tagger against a tagged corpus
- Analyzing a tagged corpus
- See also
- 5. Extracting Chunks
- Introduction
- Chunking and chinking with regular expressions
- Getting ready
- How to do it...
- How it works...
- There's more...
- Parsing different chunk types
- Parsing alternative patterns
- Chunk rule with context
- See also
- Merging and splitting chunks with regular expressions
- How to do it...
- How it works...
- There's more...
- Specifying rule descriptions
- See also
- Expanding and removing chunks with regular expressions
- How to do it...
- How it works...
- There's more...
- See also
- Partial parsing with regular expressions
- How to do it...
- How it works...
- There's more...
- The ChunkScore metrics
- Looping and tracing chunk rules
- See also
- Training a tagger-based chunker
- How to do it...
- How it works...
- There's more...
- Using different taggers
- See also
- Classification-based chunking
- How to do it...
- How it works...
- There's more...
- Using a different classifier builder
- See also
- Extracting named entities
- How to do it...
- How it works...
- There's more...
- Binary named entity extraction
- See also
- Extracting proper noun chunks
- How to do it...
- How it works...
- There's more...
- See also
- Extracting location chunks
- How to do it...
- How it works...
- There's more...
- See also
- Training a named entity chunker
- How to do it...
- How it works...
- There's more...
- See also
- Training a chunker with NLTK-Trainer
- How to do it...
- How it works...
- There's more...
- Saving a pickled chunker
- Training a named entity chunker
- Training on a custom corpus
- Training on parse trees
- Analyzing a chunker against a chunked corpus
- Analyzing a chunked corpus
- See also
- 6. Transforming Chunks and Trees
- Introduction
- Filtering insignificant words from a sentence
- Getting ready
- How to do it...
- How it works...
- There's more...
- See also
- Correcting verb forms
- Getting ready
- How to do it...
- How it works...
- See also
- Swapping verb phrases
- How to do it...
- How it works...
- There's more...
- See also
- Swapping noun cardinals
- How to do it...
- How it works...
- See also
- Swapping infinitive phrases
- How to do it...
- How it works...
- There's more...
- See also
- Singularizing plural nouns
- How to do it...
- How it works...
- See also
- Chaining chunk transformations
- How to do it...
- How it works...
- There's more...
- See also
- Converting a chunk tree to text
- How to do it...
- How it works...
- There's more...