Real-time big data analytics: design, process, and analyze large sets of complex data in real time
Design, process, and analyze large sets of complex data in real time. About This Book: Get acquainted with transformations and database-level interactions, and ensure the reliability of messages processed using Storm. Implement strategies to solve the challenges of real-time data processing. Load datase...
Other Authors: | , |
---|---|
Format: | E-book |
Language: | English |
Published: | Birmingham: Packt Publishing, 2016. |
Edition: | 1st edition |
Series: | Community experience distilled. |
Subjects: | |
View at Biblioteca Universitat Ramon Llull: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009630357406719 |
Table of Contents:
- Preface; Chapter 1: Introducing the Big Data Technology Landscape and Analytics Platform; Big Data - a phenomenon; The Big Data dimensional paradigm; The Big Data ecosystem; The Big Data infrastructure; Components of the Big Data ecosystem; The Big Data analytics architecture; Building business solutions; Dataset processing; Solution implementation; Presentation; Distributed batch processing; Batch processing in distributed mode; Push code to data; Distributed databases (NoSQL)
- Advantages of NoSQL databases; Choosing a NoSQL database; Real-time processing; The telecoms or cellular arena; Transportation and logistics; The connected vehicle; The financial sector; Summary; Chapter 2: Getting Acquainted with Storm; An overview of Storm; The journey of Storm; Storm abstractions; Streams; Topology; Spouts; Bolts; Storm architecture and its components; A Zookeeper cluster; A Storm cluster; How and when to use Storm; Storm internals; Storm parallelism; Storm internal message processing; Summary; Chapter 3: Processing Data with Storm; Storm input sources; Meet Kafka
- Getting to know more about Kafka; Other sources for input to Storm; A file as an input source; A socket as an input source; Kafka as an input source; Reliability of data processing; The concept of anchoring and reliability; The Storm acking framework; Storm simple patterns; Joins; Batching; Storm persistence; Storm's JDBC persistence framework; Summary; Chapter 4: Introduction to Trident and Optimizing Storm Performance; Working with Trident; Transactions; Trident topology; Trident tuples; Trident spout; Trident operations; Merging and joining; Filter; Function; Aggregation; Grouping
- State maintenance; Understanding LMAX; Memory and cache; Ring buffer - the heart of the disruptor; Producers; Consumers; Storm internode communication; ZeroMQ; Storm ZeroMQ configurations; Netty; Understanding the Storm UI; Storm UI landing page; Topology home page; Optimizing Storm performance; Summary; Chapter 5: Getting Acquainted with Kinesis; Architectural overview of Kinesis; Benefits and use cases of Amazon Kinesis; High-level architecture; Components of Kinesis; Creating a Kinesis streaming service; Access to AWS Kinesis; Configuring the development environment; Creating Kinesis streams
- Creating Kinesis stream producers; Creating Kinesis stream consumers; Generating and consuming crime alerts; Summary; Chapter 6: Getting Acquainted with Spark; An overview of Spark; Batch data processing; Real-time data processing; Apache Spark - a one-stop solution; When to use Spark - practical use cases; The architecture of Spark; High-level architecture; Spark extensions/libraries; Spark packaging structure and core APIs; The Spark execution model - master-worker view; Resilient distributed datasets (RDD); RDD - by definition; Fault tolerance; Storage; Persistence; Shuffling
- Writing and executing our first Spark program