With so much data moving through an organization, which metrics should staff focus on? Determining your must-haves should start with your industry’s regulations, business agreements, and partner contracts, such as service-level agreements (SLAs; see https://en.wikipedia.org/wiki/Service-level_agreement). In this post I’ll review four metrics that can help us meet two common SLA components: availability and data integrity.

Availability

  • Mean Time to Failure (MTTF): refers to the average amount of time a system or component operates before experiencing a failure. This metric is useful for understanding the reliability of a system over time.
  • Mean Time to Repair (MTTR): refers to the average amount of time it takes to repair a failed component or system. This metric is useful for understanding how quickly a system can be restored to normal operation after a failure.
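
To make the two availability metrics concrete, here is a minimal sketch that derives them from an incident log. The incident timestamps, the 30-day observation window, and the function name are hypothetical, and MTTF is approximated as total uptime divided by the number of failures. It also computes the commonly used steady-state availability ratio, MTTF / (MTTF + MTTR).

```python
from datetime import datetime, timedelta

# Hypothetical incident log: (failure_start, service_restored) pairs.
incidents = [
    (datetime(2024, 1, 10, 8, 0), datetime(2024, 1, 10, 10, 0)),
    (datetime(2024, 1, 20, 12, 0), datetime(2024, 1, 20, 13, 0)),
]
# Hypothetical observation window (30 days).
window_start = datetime(2024, 1, 1)
window_end = datetime(2024, 1, 31)


def availability_metrics(incidents, window_start, window_end):
    """Return (mttf, mttr, availability) for the observation window."""
    if not incidents:
        raise ValueError("at least one incident is needed to compute the means")
    # Total time spent down across all incidents.
    downtime = sum((end - start for start, end in incidents), timedelta())
    uptime = (window_end - window_start) - downtime
    count = len(incidents)
    mttf = uptime / count    # average operating time per failure
    mttr = downtime / count  # average time to restore service
    availability = mttf / (mttf + mttr)  # fraction of time the system was up
    return mttf, mttr, availability


mttf, mttr, availability = availability_metrics(incidents, window_start, window_end)
print(f"MTTF: {mttf}, MTTR: {mttr}, availability: {availability:.4%}")
```

An availability target in an SLA, such as 99.9%, can then be compared directly against the computed ratio.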



Data Integrity

  • Recovery Point Objective (RPO): refers to the maximum acceptable data loss during a disaster. It is expressed in terms of time, such as “the maximum acceptable data loss is four hours.” This metric is useful for understanding how much data a system can afford to lose in the event of a disaster.
  • Recovery Time Objective (RTO): refers to the maximum acceptable time it takes to restore a system to normal operation after a disaster. This metric is useful for understanding how quickly a system needs to be restored in order to meet the needs of the business.
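
As a rough illustration of how these two objectives can be checked after an incident or a recovery drill, the sketch below compares the measured data loss window and recovery time against the targets. The four-hour RPO comes from the example above. The two-hour RTO, the timestamps, and the function name are assumptions made for the example.

```python
from datetime import datetime, timedelta


def check_recovery(last_backup, failure, restored, rpo, rto):
    """Compare one incident (or drill) against the RPO and RTO targets."""
    # Anything written after the last good backup is lost.
    data_loss_window = failure - last_backup
    # Time from the failure until normal operation resumed.
    recovery_time = restored - failure
    return {
        "data_loss_window": data_loss_window,
        "recovery_time": recovery_time,
        "rpo_met": data_loss_window <= rpo,
        "rto_met": recovery_time <= rto,
    }


# Hypothetical incident timeline and targets.
result = check_recovery(
    last_backup=datetime(2024, 3, 1, 2, 0),
    failure=datetime(2024, 3, 1, 5, 30),
    restored=datetime(2024, 3, 1, 8, 0),
    rpo=timedelta(hours=4),
    rto=timedelta(hours=2),
)
print(result)
```

In this made-up timeline the backup was recent enough to satisfy the RPO, but the restore took longer than the RTO allows, which would point toward speeding up the recovery process rather than backing up more often.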



Closing

I’ve reviewed some of the most common metrics I’ve found in SLAs. Most importantly, you’ll first want to determine what your own SLAs require. From there, testing backups helps ensure that your organization can actually meet its RPO and RTO. By regularly restoring backups and verifying their integrity, you can be confident that you have the data you need to recover from a disaster.
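
A backup test can be as simple as restoring a copy and comparing checksums. The sketch below is one possible shape for such a test, not a full backup tool: the file copy stands in for whatever restore procedure you actually use, and the file names and the verify_restore helper are hypothetical.

```python
import hashlib
import shutil
import tempfile
import time
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large backups do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(backup_path, restore_path, expected_sha256):
    """Restore a backup, time the restore, and check the result's checksum."""
    started = time.monotonic()
    # Assumption: a plain file copy stands in for the real restore procedure.
    shutil.copyfile(backup_path, restore_path)
    elapsed = time.monotonic() - started
    return sha256_of(restore_path) == expected_sha256, elapsed


# Demonstration with throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "data.db"
    original.write_bytes(b"example records")
    expected = sha256_of(original)  # recorded when the backup is taken

    backup = Path(tmp) / "data.db.bak"
    shutil.copyfile(original, backup)

    ok, seconds = verify_restore(backup, Path(tmp) / "restored.db", expected)
    print(f"restore verified: {ok}, took {seconds:.3f}s")
```

The elapsed time from a realistic restore gives you a measured figure to compare against your RTO, and the checksum match confirms that the restored data is the data you backed up.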

In summary, MTTF, MTTR, RPO, and RTO are important metrics for understanding the performance and reliability of a system. Testing backups and defining service-level objectives (SLOs) alongside your SLAs can help your organization measure and improve these metrics so it can keep delivering for the business.