Site Reliability Engineering (SRE) is a lot. But the most important component is reliability.
SRE is the set of principles, practices, and organizational constructs that keep shit happening while allowing innovation to take place. It’s about delivering what you promised while incrementally improving your service with new features.
This WTFinar tackles the first steps toward understanding SRE. It focuses on service level indicators (SLIs) and service level objectives (SLOs), the building blocks of error budgets. (Wow, so many acronyms!) We figure out the happy(ish) medium around acceptable downtime and how to weave that into KPI-based allowances.
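To make the acronyms a little less scary, here is a minimal sketch of how an SLI, an SLO, and an error budget fit together. It is an illustration only, not the method presented in the session: the function names, the 30-day window, and the 99.9% target are all assumptions picked for the example.

```python
# A minimal sketch of SLI / SLO / error budget arithmetic.
# All names and numbers here are illustrative assumptions,
# not taken from the webinar itself.


def sli(good_events: int, total_events: int) -> float:
    """SLI as the fraction of events that were 'good' (e.g. successful requests)."""
    if total_events == 0:
        return 1.0  # no traffic, so nothing has failed
    return good_events / total_events


def allowed_downtime_minutes(slo: float, window_days: int = 30) -> float:
    """Minutes of downtime an availability SLO tolerates over the window."""
    total_minutes = window_days * 24 * 60
    return (1.0 - slo) * total_minutes


def error_budget_remaining(slo: float, good_events: int, total_events: int) -> float:
    """Fraction of the error budget still unspent (negative means overspent)."""
    if total_events == 0:
        return 1.0
    allowed_bad = (1.0 - slo) * total_events
    actual_bad = total_events - good_events
    if allowed_bad == 0:
        # A 100% SLO leaves no budget at all.
        return 1.0 if actual_bad == 0 else float("-inf")
    return 1.0 - actual_bad / allowed_bad


if __name__ == "__main__":
    slo = 0.999  # assumed target: 99.9% of requests succeed
    print(f"SLI: {sli(999_500, 1_000_000):.4%}")
    print(f"Allowed downtime: {allowed_downtime_minutes(slo):.1f} min per 30 days")
    print(f"Budget remaining: {error_budget_remaining(slo, 999_500, 1_000_000):.0%}")
```

With these assumed numbers, a 99.9% SLO leaves roughly 43.2 minutes of downtime per 30 days, and 500 failed requests out of a million burns about half of the budget. The remaining half is the room a team has for shipping new features before reliability work has to take priority.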
Nathen Harvey gets the people side of tech. He’s built a career on this deep understanding, helping teams realise their potential while aligning tech to business outcomes. A Cloud Developer Advocate at Google, he bridges a damn big gap as he helps the industry understand and apply DevOps and SRE practices in the cloud. A Cloud whisperer, if you will.
Jamie Dobson is co-founder and CEO of Container Solutions, a professional services company that specialises in Cloud Native transformation. With clients like Shell, Adidas, and other large enterprises, CS helps organisations not only navigate technology solutions but also adapt their internal culture and set business strategy. Jamie is the co-author of the new book Cloud Native Transformation: Practical Patterns for Innovation (O'Reilly Media, 2020). A veteran software engineer, he specialises in leadership and organisational strategy, and is a frequent presenter at conferences.