Elasticsearch: The Definitive Guide

Whether you need full-text search or real-time analytics of structured data—or both—the Elasticsearch distributed search engine is an ideal way to put your data to work. This practical guide not only shows you how to search, analyze, and explore data with Elasticsearch, but also helps you deal with...


Bibliographic Details
Other Authors:	Gormley, Clinton (author), Tong, Zachary (author), Demarest, Rebecca (illustrator)
Format:	E-book
Language:	English
Published:	Sebastopol, California : O'Reilly Media 2015.
Edition:	1st edition
Subjects:	Application software > Development. Application software > Development > Management.
View in Universitat Ramon Llull Library:	https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009629585406719

Table of Contents:

Intro
Table of Contents
Foreword
Preface
Who Should Read This Book
Why We Wrote This Book
Elasticsearch Version
How to Read This Book
Navigating This Book
Online Resources
Conventions Used in This Book
Using Code Examples
Safari® Books Online
How to Contact Us
Acknowledgments
Part I. Getting Started
Chapter 1. You Know, for Search...
Installing Elasticsearch
Installing Marvel
Running Elasticsearch
Viewing Marvel and Sense
Talking to Elasticsearch
Java API
RESTful API with JSON over HTTP
Document Oriented
JSON
Finding Your Feet
Let's Build an Employee Directory
Indexing Employee Documents
Retrieving a Document
Search Lite
Search with Query DSL
More-Complicated Searches
Full-Text Search
Phrase Search
Highlighting Our Searches
Analytics
Tutorial Conclusion
Distributed Nature
Next Steps
Chapter 2. Life Inside a Cluster
An Empty Cluster
Cluster Health
Add an Index
Add Failover
Scale Horizontally
Then Scale Some More
Coping with Failure
Chapter 3. Data In, Data Out
What Is a Document?
Document Metadata
_index
_type
_id
Other Metadata
Indexing a Document
Using Our Own ID
Autogenerating IDs
Retrieving a Document
Retrieving Part of a Document
Checking Whether a Document Exists
Updating a Whole Document
Creating a New Document
Deleting a Document
Dealing with Conflicts
Optimistic Concurrency Control
Using Versions from an External System
Partial Updates to Documents
Using Scripts to Make Partial Updates
Updating a Document That May Not Yet Exist
Updates and Conflicts
Retrieving Multiple Documents
Cheaper in Bulk
Don't Repeat Yourself
How Big Is Too Big?
Chapter 4. Distributed Document Store
Routing a Document to a Shard
How Primary and Replica Shards Interact
Creating, Indexing, and Deleting a Document
Retrieving a Document
Partial Updates to a Document
Multidocument Patterns
Why the Funny Format?
Chapter 5. Searching-The Basic Tools
The Empty Search
hits
took
shards
timeout
Multi-index, Multitype
Pagination
Search Lite
The _all Field
More Complicated Queries
Chapter 6. Mapping and Analysis
Exact Values Versus Full Text
Inverted Index
Analysis and Analyzers
Built-in Analyzers
When Analyzers Are Used
Testing Analyzers
Specifying Analyzers
Mapping
Core Simple Field Types
Viewing the Mapping
Customizing Field Mappings
Updating a Mapping
Testing the Mapping
Complex Core Field Types
Multivalue Fields
Empty Fields
Multilevel Objects
Mapping for Inner Objects
How Inner Objects are Indexed
Arrays of Inner Objects
Chapter 7. Full-Body Search
Empty Search
Query DSL
Structure of a Query Clause
Combining Multiple Clauses
Queries and Filters
Performance Differences
When to Use Which
Most Important Queries and Filters
term Filter
terms Filter
range Filter
exists and missing Filters
bool Filter
match_all Query
match Query
multi_match Query
bool Query
Combining Queries with Filters
Filtering a Query
Just a Filter
A Query as a Filter
Validating Queries
Understanding Errors
Understanding Queries
Chapter 8. Sorting and Relevance
Sorting
Sorting by Field Values
Multilevel Sorting
Sorting on Multivalue Fields
String Sorting and Multifields
What Is Relevance?
Understanding the Score
Understanding Why a Document Matched
Fielddata
Chapter 9. Distributed Search Execution
Query Phase
Fetch Phase
Search Options
preference
timeout
routing
search_type
scan and scroll
Chapter 10. Index Management
Creating an Index
Deleting an Index
Index Settings
Configuring Analyzers
Custom Analyzers
Creating a Custom Analyzer
Types and Mappings
How Lucene Sees Documents
How Types Are Implemented
Avoiding Type Gotchas
The Root Object
Properties
Metadata: _source Field
Metadata: _all Field
Metadata: Document Identity
Dynamic Mapping
Customizing Dynamic Mapping
date_detection
dynamic_templates
Default Mapping
Reindexing Your Data
Index Aliases and Zero Downtime
Chapter 11. Inside a Shard
Making Text Searchable
Immutability
Dynamically Updatable Indices
Deletes and Updates
Near Real-Time Search
refresh API
Making Changes Persistent
flush API
Segment Merging
optimize API
Part II. Search in Depth
Chapter 12. Structured Search
Finding Exact Values
term Filter with Numbers
term Filter with Text
Internal Filter Operation
Combining Filters
Bool Filter
Nesting Boolean Filters
Finding Multiple Exact Values
Contains, but Does Not Equal
Equals Exactly
Ranges
Ranges on Dates
Ranges on Strings
Dealing with Null Values
exists Filter
missing Filter
exists/missing on Objects
All About Caching
Independent Filter Caching
Controlling Caching
Filter Order
Chapter 13. Full-Text Search
Term-Based Versus Full-Text
The match Query
Index Some Data
A Single-Word Query
Multiword Queries
Improving Precision
Controlling Precision
Combining Queries
Score Calculation
Controlling Precision
How match Uses bool
Boosting Query Clauses
Controlling Analysis
Default Analyzers
Configuring Analyzers in Practice
Relevance Is Broken!
Chapter 14. Multifield Search
Multiple Query Strings
Prioritizing Clauses
Single Query String
Know Your Data
Best Fields
dis_max Query
Tuning Best Fields Queries
tie_breaker
multi_match Query
Using Wildcards in Field Names
Boosting Individual Fields
Most Fields
Multifield Mapping
Cross-fields Entity Search
A Naive Approach
Problems with the most_fields Approach
Field-Centric Queries
Problem 1: Matching the Same Word in Multiple Fields
Problem 2: Trimming the Long Tail
Problem 3: Term Frequencies
Solution
Custom _all Fields
cross-fields Queries
Per-Field Boosting
Exact-Value Fields
Chapter 15. Proximity Matching
Phrase Matching
Term Positions
What Is a Phrase
Mixing It Up
Multivalue Fields
Closer Is Better
Proximity for Relevance
Improving Performance
Rescoring Results
Finding Associated Words
Producing Shingles
Multifields
Searching for Shingles
Performance
Chapter 16. Partial Matching
Postcodes and Structured Data
prefix Query
wildcard and regexp Queries
Query-Time Search-as-You-Type
Index-Time Optimizations
Ngrams for Partial Matching
Index-Time Search-as-You-Type
Preparing the Index
Querying the Field
Edge n-grams and Postcodes
Ngrams for Compound Words
Chapter 17. Controlling Relevance
Theory Behind Relevance Scoring
Boolean Model
Term Frequency/Inverse Document Frequency (TF/IDF)
Vector Space Model
Lucene's Practical Scoring Function
Query Normalization Factor
Query Coordination
Index-Time Field-Level Boosting
Query-Time Boosting
Boosting an Index
t.getBoost()
Manipulating Relevance with Query Structure
Not Quite Not
boosting Query
Ignoring TF/IDF
constant_score Query
function_score Query
Boosting by Popularity
modifier
factor
boost_mode
max_boost
Boosting Filtered Subsets
filter Versus query
functions
score_mode
Random Scoring
The Closer, The Better
Understanding the price Clause
Scoring with Scripts
Pluggable Similarity Algorithms
Okapi BM25
Changing Similarities
Configuring BM25
Relevance Tuning Is the Last 10%
Part III. Dealing with Human Language
Chapter 18. Getting Started with Languages
Using Language Analyzers
Configuring Language Analyzers
Pitfalls of Mixing Languages
At Index Time
At Query Time
Identifying Language
One Language per Document
Foreign Words
One Language per Field
Mixed-Language Fields
Split into Separate Fields
Analyze Multiple Times
Use n-grams
Chapter 19. Identifying Words
standard Analyzer
standard Tokenizer
Installing the ICU Plug-in
icu_tokenizer
Tidying Up Input Text
Tokenizing HTML
Tidying Up Punctuation
Chapter 20. Normalizing Tokens
In That Case
You Have an Accent
Retaining Meaning
Living in a Unicode World
Unicode Case Folding
Unicode Character Folding
Sorting and Collations
Case-Insensitive Sorting
Differences Between Languages
Unicode Collation Algorithm
Unicode Sorting
Specifying a Language
Customizing Collations
Chapter 21. Reducing Words to Their Root Form
Algorithmic Stemmers
Using an Algorithmic Stemmer
Dictionary Stemmers
Hunspell Stemmer
Installing a Dictionary
Per-Language Settings
Creating a Hunspell Token Filter
Hunspell Dictionary Format
Choosing a Stemmer
Stemmer Performance
Stemmer Quality
Stemmer Degree
Making a Choice
Controlling Stemming
Preventing Stemming
Customizing Stemming
Stemming in situ
Is Stemming in situ a Good Idea
Chapter 22. Stopwords: Performance Versus Precision
Pros and Cons of Stopwords
Using Stopwords
Stopwords and the Standard Analyzer
Maintaining Positions
Specifying Stopwords
Using the stop Token Filter
