What Are SLIs, SLOs, SLAs & Error Budgets in Site Reliability Engineering?

Site Reliability Engineering (SRE) is a discipline that applies software engineering practices to IT operations. It automates tasks such as production system management, change management, incident response, and even emergency response, which systems administrators (sysadmins) would otherwise perform manually. This helps organizations improve the reliability of their software systems. SRE uses SLIs, SLOs, error budgets, and SLAs …