This book discusses how to train Site Reliability Engineers, or SREs. Before we go any further, we'd like to clarify the term "SRE."
"SRE" means a variety of things:
- A Site Reliability Engineer or a Site Reliability Engineering team, depending on the context (singular, SRE, or plural, SREs)
- Site Reliability Engineering concepts, discipline, or way of thinking (SRE)
- Belonging to an SRE individual, team, or way of thinking (SRE's or SREs')
In this book, we share our experience ramping up new SREs, but we also look at other use cases. For example, we have talked with several smaller organizations that have successfully ramped people up to perform SRE (or SRE-like) functions.
While much of this book focuses on the specific experience of Google SRE, we aim to present best practices and lessons learned over the past several years that can be applied to organizations of varying size and maturity.
This open access book is provided compliments of Google Cloud. You can download the Training Site Reliability Engineers ebook for free in PDF format (2.3 MB).
Table of Contents
Identifying Your SRE Training Needs
Instructional Design Principles
How to "SRE" an SRE Training Program
Summary and Conclusions
Example Training Design Document