Building Python real-time applications with Storm: learn to process massive real-time data streams using Storm and Python, no Java required
Learn to process massive real-time data streams using Storm and Python - no Java required! About This Book: Learn to use Apache Storm and the Python Petrel library to build distributed applications that process large streams of data. Explore sample applications in real-time and analyze them in the pop...
Other Authors: | , |
---|---|
Format: | E-book |
Language: | English |
Published: | Birmingham : Packt Publishing 2015. |
Edition: | 1st edition |
Series: | Community experience distilled. |
Subjects: | |
View at Biblioteca Universitat Ramon Llull: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009629674606719 |
Table of Contents:
- Cover
- Copyright
- Credits
- About the Authors
- About the Reviewers
- www.PacktPub.com
- Table of Contents
- Preface
- Chapter 1: Getting Acquainted with Storm
- Overview of Storm
- Before the Storm era
- Key features of Storm
- Storm cluster modes
- Developer mode
- Single-machine Storm cluster
- Multimachine Storm cluster
- The Storm client
- Prerequisites for a Storm installation
- Zookeeper installation
- Storm installation
- Enabling native (Netty only) dependency
- Netty configuration
- Starting daemons
- Playing with optional configurations
- Summary
- Chapter 2: The Storm Anatomy
- Storm processes
- Supervisor
- Zookeeper
- The Storm UI
- Storm-topology-specific terminologies
- The worker process, executor, and task
- Worker processes
- Executors
- Tasks
- Interprocess communication
- A physical view of a Storm cluster
- Stream grouping
- Fault tolerance in Storm
- Guaranteed tuple processing in Storm
- XOR magic in acking
- Tuning parallelism in Storm - scaling a distributed computation
- Summary
- Chapter 3: Introducing Petrel
- What is Petrel?
- Building a topology
- Packaging a topology
- Logging events and errors
- Managing third-party dependencies
- Installing Petrel
- Creating your first topology
- Sentence spout
- Splitter bolt
- Word Counting Bolt
- Defining a topology
- Running the topology
- Troubleshooting
- Productivity tips with Petrel
- Improving startup performance
- Enabling and using logging
- Automatic logging of fatal errors
- Summary
- Chapter 4: Example Topology - Twitter
- Twitter analysis
- Twitter's Streaming API
- Creating a Twitter app to use the Streaming API
- The topology configuration file
- The Twitter stream spout
- Splitter bolt
- Rolling word count bolt
- The intermediate rankings bolt
- The total rankings bolt
- Defining the topology
- Running the topology
- Summary
- Chapter 5: Persistence Using Redis and MongoDB
- Finding the top n ranked topics using Redis
- The topology configuration file - the Redis case
- Rolling word count bolt - the Redis case
- Total rankings bolt - the Redis case
- Defining the topology - the Redis case
- Running the topology - the Redis case
- Finding the hourly count of tweets by city name using MongoDB
- Defining the topology - the MongoDB case
- Running the topology - the MongoDB case
- Summary
- Chapter 6: Petrel in Practice
- Testing a bolt
- Example - testing SplitSentenceBolt
- Example - testing SplitSentenceBolt with WordCountBolt
- Debugging
- Installing Winpdb
- Add Winpdb breakpoint
- Launching and attaching the debugger
- Profiling your topology's performance
- Split sentence bolt log
- Word count bolt log
- Summary
- Appendix: Managing Storm Using Supervisord
- Storm administration over a cluster
- Introducing supervisord
- Supervisord components
- Supervisord installation
- Summary
- Index