In this episode, I talk with Erin Dees, Principal Engineer at Stitch Fix, about Site Reliability Engineering. We discuss being on call, incident response, SLAs and SLOs, incident severity levels, recovering from incidents, and more.
- Effective Testing with RSpec 3
- Google Site Reliability Engineering book
- The Phoenix Project