Entity resolution and information quality

Customers and products are the heart of any business, and corporations collect more data about them every year. However, just because you have data doesn't mean you can use it effectively. If not properly integrated, data can actually encourage false conclusions that result in bad decisions an...

Bibliographic Details
Main author: Talburt, John R.
Format: E-book
Language: English
Published: Amsterdam ; Boston : Elsevier/Morgan Kaufmann, c2011.
Edition: 1st edition
Subjects:
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009628128006719
Table of Contents:
  • Front Cover; Entity Resolution and Information Quality; Copyright; Dedication; Contents; Foreword; Preface; Motivation for the Book; Audience; Organization of the Material; Acknowledgements; Chapter 1: Principles of Entity Resolution; Entity Resolution; Background; Entity versus Entity Reference; Entity Resolution Activities; Entity Reference Extraction: ERA1; Entity Reference Preparation; Summary; Review Questions; Chapter 2: Principles of Information Quality; Information Quality; IQ versus DQ; Shannon Information Theory; Fisher Information; Value of Information
  • IQ and the Quality of Information; Two IP Examples; IQ Management; Fitness for Use; IQ and the Organization; Information versus Process; IQ and HPC; The Evolution of Information Quality; Problem Recognition: The Data-Cleaning Phase; Root Cause Detection: The Prevention Phase; Information as a Product Phase; Information as an Asset; IQ as an Academic Discipline; TDQM and the MIT IQ Program; The UALR IQ Graduate Program; IQ and ER; Summary; Review Questions; Chapter 3: Entity Resolution Models; Overview; The Fellegi-Sunter Model; Deterministic and Probabilistic Matching; School Enrollment Example
  • Pattern Weights and Linkage Rules; Calculating Weight Ratios; Comparing Attribute Values; SERF Model; Match and Merge Functions; Generic ER Defined; Consistent ER; R-Swoosh Algorithm; Other Swoosh Algorithms; Other ER Algorithms; Algebraic Model; Equivalence Relations; Equivalence Classes and Partitions; ER as an Equivalence Relation; ER Scenarios; Partition Similarity; Talburt-Wang Index; Rand Index and Adjusted Rand Index; Other Measures of ER Outcomes; ER Metrics; ER Consistency; ER Accuracy; Synthetic Data Experiment; ENRES Meta-Model; Summary; Review Questions
  • Chapter 4: Entity-Based Data Integration; Introduction; Formal Framework for Describing EBDI; Optimizing Selection Operator Accuracy; Naïve Selection Operators; Selection Operator Evaluation; Strategies for Optimizing Selection; More Complex Selection Rules; Selection Based on Multiple Attributes; Nonselection Integration Operators; Summary; Review Questions; Chapter 5: Entity Resolution Systems; Introduction; DataFlux dfPowerStudio; Establish Agreement Rules; Assess Quality; Data Cleansing; Convert to Common Layout; Cluster Equivalent Records; Evaluation of Results
  • Infoglide Identity Resolution Engine; Search Scenario; Discovery Scenario; Acxiom AbiliTec; Identity Management Process; Link Append Process; Summary; Review Questions; Chapter 6: The Oyster Project; Background; OYSTER Logic; Run Script Document Structure; Attributes Description Document Structure; In-Memory Structures; Identity List; Identity Index; Link Index; Transitive Equivalence Example; Describing the Identities; Describing Identity Rules; Running the Example; Asserted Equivalence Example; Febrl: Open-Source Project; Summary; Review Questions
  • Chapter 7: Trends in Entity Resolution Research and Applications