97 things every SRE should know : collective wisdom from the experts
"Site reliability engineering (SRE) is more relevant than ever. Knowing how to keep systems reliable has become a critical skill. With this practical book, newcomers and old hats alike will explore a broad range of conversations happening in SRE. You'll get actionable advice on several top...
Other Authors: | , |
---|---|
Format: | eBook |
Language: | English |
Published: | Sebastopol, California : O'Reilly Media, Incorporated [2020] |
See on Biblioteca Universitat Ramon Llull: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009631409906719 |
Table of Contents:
- New to SRE. Site reliability engineering in six words / Alex Hidalgo
- Do we know why we really want reliability? / Niall Murphy
- Building self-regulating processes / Denise Yu
- Four engineers of an SRE seder / Jacob Scott
- The reliability stack / Alex Hidalgo
- Infrastructure: it's where the power is / Charity Majors
- Thinking about resilience / Justin Li
- Observability in the development cycle / Charity Majors and Liz Fong-Jones
- There is no magic / Bouke van der Bijl
- How Wikipedia is served to you / Effie Mouzeli
- Why you should understand (a little) about TCP / Julia Evans
- The importance of a management interface / Salim Virji
- When it comes to storage, think distributed / Salim Virji
- The role of cardinality / Charity Majors and Liz Fong-Jones
- Security is like an onion / Lucas Fontes
- Use your words / Tanya Reilly
- Where to SRE / Fatema Boxwala
- Dear future team / Frances Rees
- Sustainability and burnout / Denise Yu
- Don't take advice from graybeards / John Looney
- Facing that first page / Andrew Louis
- Zero to one. SRE, at any size, is cultural / Matthew Huxtable
- Everyone is an SRE in a small organization / Matthew Huxtable
- Auditing your environment for improvements / Joan O'Callaghan
- With incident response, start small / Thai Wood
- Solo SRE: effecting large-scale change as a single individual / Ashley Poole
- Design goals for SLO measurement / Ben Sigelman
- I have an error budget - now what? / Alex Hidalgo
- How to change things / Joan O'Callaghan
- Methodological debugging / Avishai Ish-Shalom and Nati Cohen
- How startups can build an SRE mindset / Tamara Miner
- Bootstrapping SRE in enterprises / Vanessa Yiu
- It's okay not to know, and it's okay to be wrong / Todd Palino
- Storytelling is a superpower / Anita Clarke
- Get your work recognized: write a brag document / Julia Evans and Karla Burnett
- One to ten. Making work visible / Lorin Hochstein
- An overlooked engineering skill / Murali Suriar
- Unpacking the on-call divide / Jason Hand
- The maestros of incident response / Andrew Louis
- Effortless incident management / Suhail Patel, Miles Bryant, and Chris Evans
- If you're doing runbooks, do them well / Spike Lindsey
- Why I hate our playbooks / Frances Rees
- What machines do well / Michelle Brush
- Integrating empathy into SRE tools / Daniella Niyonkuru
- Using ChatOps to implement empathy / Daniella Niyonkuru
- Move fast to unbreak things / Michelle Brush
- You don't know for sure until it runs in production / Ingrid Epure
- Sometimes the fix is the problem / Jake Pittis
- Legendary / Elise Gale
- Metrics are not SLIs (the measure everything trap) / Brian Murphy
- When SLOs attack: pathological SLOs and how to fix them / Narayan Desai
- Holistic approach to product reliability / Kristine Chen and Bart Ponurkiewicz
- In search of the lost time / Ingrid Epure
- Unexpected lessons from office hours / Tamara Miner
- Building tools for internal customers that they actually want to use / Vinessa Wan
- It's about the individuals and interactions / Vinessa Wan
- The human baseline in SRE / Effie Mouzeli
- Remotely productive or productively remote / Avleen Vig
- Of margins and individuals / Kurt Andersen
- The importance of margins in systems / Kurt Andersen
- Fewer spreadsheets, more napkins / Jacob Bednarz
- Sneaking in your DevOps deliciously / Vinessa Wan
- Effecting SRE cultural changes in enterprise / Vanessa Yiu
- To all the SREs I've loved / Felix Glaser
- Complex: the most overloaded word in technology / Laura Nolan
- Ten to hundred. The best advice I can give to teams / Nicole Forsgren
- Create your supporting artifacts / Daria Barteneva and Eva Parish
- The order of operations for getting SLO buy-in / David K. Rensin
- Heroes are necessary, but hero culture is not / Lei Lopez
- On-call rotations that people want to join / Miles Bryant, Chris Evans, and Suhail Patel
- Study of human factors and team culture to improve pager fatigue / Daria Barteneva
- Optimize for MTTBTB (mean time to back to bed) / Spike Lindsey
- Mitigating and preventing cascading failures / Rita Lu
- On-call health: the metric you could be measuring / Caitie McCaffrey
- The SRE as a diplomat / Johnny Boursiquot
- Test your disaster plan / Tanya Reilly
- Why training matters to an SRE practice and SRE matters to your training program / Jennifer Petoff
- The power of uniformity / Chris Evans, Suhail Patel, and Miles Bryant
- Bytes per user value / Arshia Mufti
- Make your engineering blog a priority / Anita Clarke
- Don't let anyone run code in your context / John Looney
- Trading places: SRE and product / Shubheksha Jalan
- You see teams, I see product / Avleen Vig
- The performance emergency fund / Dawn Parzych
- Important but not urgent: roadmaps for SREs / Laura Nolan
- The future of SRE. That 50% thing / Tanya Reilly
- Following the path of safety-critical systems / Heidy Khlaaf
- The importance of formal specification / Hillel Wayne
- Risk and rot in sociotechnical systems / Laura Nolan
- SRE in crisis / Niall Murphy
- Expected risk limitations / Blake Bisset
- Beyond local risk: accounting for Angry Birds / Blake Bisset
- A word from software safety nerds / J. Paul Reed
- Incidents: a window into gaps / Lorin Hochstein
- The third age of SRE / Björn "Beorn" Rabenstein.