Site reliability engineering (SRE) is a discipline that automates IT operations tasks, such as production system management, change management, incident response, and even emergency response, that would otherwise be performed manually by systems administrators (sysadmins). It helps organizations improve the reliability of their software systems. SRE uses SLIs, SLOs, error budgets, and SLAs …
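To make the relationship between an SLI, an SLO, and an error budget concrete, here is a minimal sketch in Python. It assumes the conventional request-based definitions (SLI = good events / total events; error budget = the share of events the SLO allows to fail, i.e. 1 minus the SLO target), which this excerpt does not spell out. The function names `availability_sli`, `allowed_failures`, and `budget_remaining` are hypothetical, chosen for illustration only.

```python
def availability_sli(good_events: int, total_events: int) -> float:
    """SLI: the measured fraction of events that were good."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return good_events / total_events


def allowed_failures(slo_target: float, total_events: int) -> float:
    """Error budget expressed as a count of events that may fail.

    Assumes the usual definition: error budget = 1 - SLO target.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    return (1.0 - slo_target) * total_events


def budget_remaining(slo_target: float, good_events: int, total_events: int) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = allowed_failures(slo_target, total_events)
    failed = total_events - good_events
    return (budget - failed) / budget


if __name__ == "__main__":
    # Example: a 99.9% SLO over 1,000,000 requests, of which 500 failed.
    print(availability_sli(999_500, 1_000_000))
    print(allowed_failures(0.999, 1_000_000))
    print(budget_remaining(0.999, 999_500, 1_000_000))
```

In this sketch the SLO is an internal target, and the budget is simply the room between that target and perfection; an SLA would be the externally agreed commitment, typically set looser than the SLO.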