Engineering:Mean time to recovery


Mean time to recovery (MTTR)[1][2][3] is the average time that a device will take to recover from any failure. Examples of such devices range from self-resetting fuses (where the MTTR would be very short, probably seconds), to whole systems which have to be repaired or replaced.

The MTTR is usually part of a maintenance contract, where the user would pay more for a system with an MTTR of 24 hours than for one with an MTTR of, say, 7 days. This does not mean the supplier guarantees to have the system up and running again within 24 hours (or 7 days) of being notified of the failure. It does mean that the average repair time will tend towards 24 hours (or 7 days). A more useful maintenance contract measure is the maximum time to recovery, which can be easily measured and for which the supplier can be held accountable.
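The difference between an average and a maximum can be illustrated with a minimal sketch. The repair durations below are invented for illustration, and the function names are hypothetical rather than part of any standard tool.

```python
from statistics import mean

# Hypothetical repair durations in hours for five incidents (illustrative data only).
repair_hours = [6.0, 12.0, 20.0, 30.0, 52.0]


def mean_time_to_recovery(durations):
    """Return the arithmetic mean of the observed recovery durations."""
    return mean(durations)


def max_time_to_recovery(durations):
    """Return the longest observed recovery duration."""
    return max(durations)


mttr = mean_time_to_recovery(repair_hours)
worst = max_time_to_recovery(repair_hours)

# Incidents that individually took longer than the 24-hour figure.
over_24 = [d for d in repair_hours if d > 24.0]

print(f"MTTR: {mttr} h, maximum: {worst} h, incidents over 24 h: {len(over_24)}")
```

With this sample data the mean is exactly 24 hours, yet two of the five repairs took longer than 24 hours, which is why a maximum time to recovery is the easier figure to hold a supplier to.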

Note that some suppliers interpret MTTR as 'mean time to respond', while others take it to mean 'mean time to replace/repair/recover/resolve'. The former indicates only that the supplier will acknowledge a problem and initiate mitigation within a certain timeframe, not that service will be restored within it.

Some systems may have an MTTR of zero, which means that they have redundant components that can take over the instant the primary one fails; RAID is an example. However, the failed device in such a redundant configuration still needs to be returned to service, so the device itself has a non-zero MTTR even if the system as a whole (through redundancy) has an MTTR of zero. As long as service is maintained, this is a minor issue.
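As a rough sketch of how these readings of MTTR diverge, the example below computes three figures from the same incident log: mean time to respond, system-level mean time to recovery, and device-level mean time to recovery. The `Incident` record, its field names, and the timestamps are all assumptions made for illustration. Both incidents model a redundant configuration in which service never stops.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Incident:
    """Hypothetical incident record; all times are hours on a shared clock."""
    failed_at: float            # primary device fails
    acknowledged_at: float      # supplier acknowledges the problem
    service_restored_at: float  # system is delivering service again
    device_repaired_at: float   # failed device is back in service


# Illustrative data: a redundant component takes over at the instant of failure,
# so service is restored at the same moment the device fails.
incidents = [
    Incident(0.0, 1.0, 0.0, 10.0),
    Incident(100.0, 103.0, 100.0, 130.0),
]

# 'Mean time to respond': failure to acknowledgement.
mean_time_to_respond = mean(i.acknowledged_at - i.failed_at for i in incidents)

# System-level MTTR: failure to restoration of service.
system_mttr = mean(i.service_restored_at - i.failed_at for i in incidents)

# Device-level MTTR: failure to the failed device being returned to service.
device_mttr = mean(i.device_repaired_at - i.failed_at for i in incidents)

print(mean_time_to_respond, system_mttr, device_mttr)
```

For this data the system-level MTTR is zero while the device-level MTTR is 20 hours and the mean time to respond is 2 hours, so a contract should state which of these quantities it means by MTTR.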

References