Mastering Prometheus: Gain Expert Tips to Monitoring Your Infrastructure, Applications, and Services

Learn how to effectively implement, manage, and optimize Prometheus for monitoring your systems.

Key Features:
  • Achieve high availability with Prometheus by using Thanos
  • Integrate Prometheus into your broader observability stack with OpenTelemetry
  • Tweak, tune, and debug Prometheus to reliably scale wit...

Bibliographic Details
Other Authors: Hegedus, William, author
Format: eBook
Language: English
Published: Birmingham, England : Packt Publishing [2024]
Edition: First edition
See on Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009815726406719
Table of Contents:
  • Cover
  • Copyright
  • Contributors
  • Table of Contents
  • Preface
  • Part 1: Fundamentals of Prometheus
  • Chapter 1: Observability, Monitoring, and Prometheus
  • A brief history of monitoring
  • Nagios
  • A word on SNMP
  • Enter the cloud
  • Introduction to observability concepts
  • Metrics
  • Logs
  • Traces
  • Other signals
  • Tying signals together
  • Getting data out of systems
  • Prometheus's role in observability
  • Alerting
  • Dashboarding
  • What Prometheus is not
  • Summary
  • Further reading
  • Chapter 2: Deploying Prometheus
  • Technical requirements
  • Components of a Prometheus stack
  • Prometheus
  • Alertmanager
  • Node Exporter
  • Grafana
  • Provisioning Kubernetes
  • Configuring the linode-cli
  • Creating a Kubernetes cluster
  • Deploying Prometheus
  • Prometheus Operator overview
  • Deploying kube-prometheus
  • Summary
  • Further reading
  • Chapter 3: The Prometheus Data Model and PromQL
  • Technical requirements
  • Prometheus's data model
  • Prometheus' TSDB
  • Head block
  • WAL
  • Blocks and chunks
  • Index
  • Compaction
  • PromQL basics
  • Syntax overview
  • Query operators
  • Query functions
  • Summary
  • Further reading
  • Chapter 4: Using Service Discovery
  • Technical requirements
  • Service discovery overview
  • Using service discovery
  • Relabeling
  • Using service discovery in a cloud provider
  • Linode service discovery
  • Custom service discovery endpoints with HTTP SD
  • Summary
  • Further reading
  • Chapter 5: Effective Alerting with Prometheus
  • Technical requirements
  • Alertmanager configuration and routing
  • Routing
  • Receivers
  • Inhibitions
  • Validating
  • Alertmanager templating
  • Configuring templates
  • Defining your own templates
  • Highly available (HA) alerting
  • Cluster sizing
  • Making robust alerts
  • Use logical/set binary operators
  • Use appropriate "for" durations
  • Use _over_time functions
  • Anomaly detection
  • Unit-testing alerting rules
  • Summary
  • Further reading
  • Part 2: Scaling Prometheus
  • Chapter 6: Advancing Prometheus - Sharding, Federation, and High Availability
  • Technical requirements
  • Prometheus' limitations
  • Cardinality
  • Long-term storage
  • Sharding Prometheus
  • Sharding by service
  • Sharding with relabeling
  • Federating Prometheus
  • Achieving high availability (HA) in Prometheus
  • HA via the Prometheus Operator
  • Cleanup
  • Summary
  • Further reading
  • Chapter 7: Optimizing and Debugging Prometheus
  • Technical requirements
  • Controlling cardinality
  • Identifying cardinality issues
  • Remediating cardinality issues
  • Using limits
  • Recording rules
  • Recording rule conventions
  • Scrape jitter
  • Using pprof
  • Using promtool for pprof data
  • Query logging and limits
  • Query logging
  • Query limits
  • Tuning garbage collection
  • Using GOMEMLIMIT
  • Summary
  • Further reading
  • Chapter 8: Enabling Systems Monitoring with the Node Exporter
  • Technical requirements
  • Node Exporter overview
  • What is in an exporter?
  • Default collectors
  • conntrack
  • cpu
  • diskstats
  • filesystem
  • loadavg
  • meminfo
  • netdev
  • pressure
  • Others
  • The textfile collector
  • Troubleshooting the Node Exporter
  • Summary
  • Further reading
  • Part 3: Extending Prometheus
  • Chapter 9: Utilizing Remote Storage Systems with Prometheus
  • Technical requirements
  • Understanding remote write and remote read
  • Remote read
  • Remote write
  • Using VictoriaMetrics
  • Deployment methods
  • Deploying to Kubernetes
  • Using Grafana Mimir
  • Comparing to VictoriaMetrics
  • Deploying to Kubernetes
  • Summary
  • Further reading
  • Chapter 10: Extending Prometheus Globally with Thanos
  • Technical requirements
  • Overview of Thanos
  • Why use Thanos?
  • Thanos Sidecar
  • Deploying Thanos Sidecar
  • Thanos Compactor
  • Vertical compaction
  • Downsampling
  • Deploying Thanos Compactor
  • Thanos Query
  • Deploying Thanos Query
  • Scaling Thanos Query
  • Thanos Query Frontend
  • Query sharding and splitting
  • Caching
  • Deploying Thanos Query Frontend
  • Thanos Store
  • Deploying Thanos Store
  • Scaling Thanos Store
  • Thanos Ruler
  • Stateless mode
  • Deploying Thanos Ruler
  • Thanos Receiver
  • Deploying Thanos Receiver
  • Thanos tools
  • Cleanup
  • Summary
  • Further reading
  • Chapter 11: Jsonnet and Monitoring Mixins
  • Technical requirements
  • Overview of Jsonnet
  • Syntax
  • Using Jsonnet
  • Generating files
  • Formatting and linting
  • Monitoring Mixins
  • Mixin structure
  • Using and extending mixins
  • Summary
  • Further reading
  • Chapter 12: Utilizing Continuous Integration Pipelines with Prometheus
  • Technical requirements
  • GitHub Actions
  • Validation in CI
  • Using promtool
  • Using amtool
  • Linting Prometheus rules with Pint
  • Configuring Pint
  • Integrating Pint with CI
  • Summary
  • Further reading
  • Chapter 13: Defining and Alerting on SLOs
  • Technical requirements
  • Understanding SLIs, SLOs, and SLAs
  • Why SLOs matter
  • Types of SLOs
  • Defining SLOs with Prometheus data
  • Window-based SLOs
  • Alerting on SLOs
  • Using Sloth and Pyrra for SLOs
  • Sloth
  • Pyrra
  • Summary
  • Further reading
  • Chapter 14: Integrating Prometheus with OpenTelemetry
  • Technical requirements
  • Introducing OpenTelemetry
  • OTel specification
  • OpenTelemetry line protocol
  • OpenTelemetry collector
  • Collecting Prometheus metrics with the OpenTelemetry collector
  • Sending metrics to Prometheus with the OpenTelemetry collector
  • Configuring Prometheus
  • Configuring OpenTelemetry collector
  • Summary
  • Further reading
  • Chapter 15: Beyond Prometheus
  • Technical requirements
  • Extending observability past Prometheus
  • Logs
  • Traces
  • Connecting the dots across observability systems
  • Logging with Loki
  • Tracing with Tempo
  • Summary
  • Further reading
  • Index
  • Other Books You May Enjoy