Modern Network Observability: A Hands-On Approach Using Open Source Tools Such As Telegraf, Prometheus, and Grafana
As modern IT services and software architectures such as microservices rely increasingly on network performance, the relevance of networks has never been greater. Network observability has emerged as a critical evolution of traditional monitoring, providing the deep visibility needed to manage today...
Other Authors: | , , |
---|---|
Format: | E-book |
Language: | English |
Published: |
Birmingham, England :
Packt Publishing
[2024]
|
Edition: | First edition |
Subjects: | |
View at Universitat Ramon Llull Library: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009853625606719 |
Table of Contents:
- Cover
- Title page
- Copyright and credits
- Dedication
- Foreword 1
- Foreword 2
- Contributors
- Table of Contents
- Preface
- Part 1: Understanding Monitoring and Observability
- Chapter 1: Introduction to Monitoring and Observability
- Defining network observability
- Network monitoring evolution
- What has worked so far
- Trends and requirements
- Network observability pillars
- Data quality
- Scalability and interoperability
- Actionable data
- Assisted analysis
- Benefits
- Summary
- Chapter 2: Role of Monitoring and Observability in Network Infrastructure
- Networking in the 2020s
- Technological changes
- Cultural changes
- Transforming data into information
- The importance of using business terms
- Defining KPIs
- From data to information
- Expectations for network observability
- Heterogeneous and enriched data
- Proactive role in network automation
- Full visibility of network state
- Faster, more accurate, and at scale
- Summary
- Chapter 3: Data's Role in Network Observability
- Network monitoring and telemetry
- Challenges of traditional network monitoring
- Network telemetry
- Network observability framework
- Collecting data, in practice
- Agent-based versus agentless approach
- Network data collection methods
- Setting up the lab environment
- Summary
- Part 2: Building an Effective Observability Stack
- Chapter 4: Observability Stack Architecture
- The components of an observability platform
- The importance of a well-designed observability stack
- Why does an observability stack need to be well designed?
- What does it mean to be a well-designed platform?
- Understanding data pipelines for observability
- The versatility of data pipelines
- Unpacking ETL in data pipelines
- Challenges and best practices
- Scalability
- Reliability
- Flexibility, extensibility, and customization
- Cost management
- Other tips and best practices
- Setting up a lab environment
- Lab scenarios
- Summary
- Chapter 5: Data Collectors
- A deep dive into data collectors
- Key characteristics
- A look into Telegraf
- Telegraf architecture
- Telegraf configuration
- Telegraf SNMP input plugin
- Telegraf synthetic monitoring input plugins
- Telegraf gNMI input plugin
- Telegraf exec input plugins
- A look into Logstash
- Logstash architecture
- Logstash syslog input
- Summary
- Chapter 6: Data Distribution and Processing
- Understanding data normalization
- Observability data models
- Breaking down metrics and the data model
- Enhancing insights with data enrichment
- Data enrichment injection
- Data enrichment at query time
- The scale of the observability data pipeline
- Why message brokers/buses matter in observability
- Summary
- Chapter 7: Data Storage Solutions for Network Observability
- Databases for observability
- Time series databases
- Matching databases with observability needs
- A look into Prometheus TSDB
- Prometheus architecture
- Writing to Prometheus TSDB
- Reading from Prometheus TSDB (PromQL)
- Prometheus rules
- A look at Grafana Loki
- Grafana Loki architecture
- Writing to Loki
- Reading from Loki (LogQL)
- Loki rules
- Persistence tips and best practices
- Performance and scale
- Automation is your best friend
- Summary
- Chapter 8: Visualization - Bringing Network Observability to Life
- Data visualization principles
- A look into Grafana
- Architecture
- Setting up the lab environment
- Creating your first Grafana dashboard
- Visualization tips and best practices
- Summary
- Chapter 9: Alerting - Network Monitoring and Incident Management
- Incident management and alerts
- Challenges and considerations on alerting
- Alert aggregation and correlation
- Alert engine architecture
- A look into rulers and Alertmanager
- Architecture
- Creating your first alerts
- Grafana for alerts
- External integrations
- Alerting tips and best practices
- Addressing common alert challenges
- Build on top of communication and transparency
- Healthy incident management process
- The role of AI in alerting
- Summary
- Chapter 10: Real-World Observability Architectures
- Observability stack options
- All-in-one open source tools
- Commercial off-the-shelf tools
- Controller-based systems
- Time series versus snapshot observability
- Comparing build versus buy decision points
- Defining requirements
- Evaluating in-house capabilities and resources
- Cost analysis
- Assessing risks
- Comparing features and flexibility
- Making a decision
- Orchestrating an observability platform
- Deployment methodologies and orchestration
- Summary
- Part 3: Using Your Network Observability Data
- Chapter 11: Applications of Your Observability Data - Driving Business Success
- The business value of observability data
- Capacity planning
- Percentiles
- Forecasting
- Defining health status
- Treating your network as a service
- Monitoring SLIs, SLOs, and SLAs for optimal network performance
- How to treat a network as a service
- Architecting dashboards
- Network-related personas
- Dashboard types
- Summary
- Chapter 12: Automation Powered by Observability Data - Streamlining Network Operations
- Setting up the lab environment
- Advanced automation techniques with event-driven automation
- Event-driven automation
- Closed-loop automation
- Event-driven automation with Prefect
- Summary
- Chapter 13: Leveraging Artificial Intelligence for Enhanced Network Observability
- AI and ML fundamentals
- ML algorithms
- Neural networks and language models
- Real-world AIOps
- Lab requirements
- Validating operational changes
- Assisted root cause analysis
- Summary
- Appendix A
- A lab environment
- Hardware requirements
- Software requirements
- Step 0 - Git repository setup
- Step 1 - VM provisioning
- Step 2 - interacting with the lab scenarios
- Step 3 - removing the lab environment
- Step 4 - managing lab scenarios
- Summary
- Index
- Other Books You May Enjoy