Software:Oracle Zero Data Loss Recovery Appliance

Oracle Recovery Appliance
Original author(s): Oracle Corporation
Initial release: September 2014
Operating system: Oracle Linux
Platform: Zero Data Loss Recovery Appliance
License: Commercial
Website: www.oracle.com/zdlra


The Oracle Zero Data Loss Recovery Appliance[1] is a computing platform that combines Oracle Corporation (Oracle) hardware and software optimized for Oracle Database backup and recovery. According to Oracle, the Recovery Appliance was designed to virtually guarantee recoverability of the most vital Oracle databases, for which the loss of data would be catastrophic[2][3][4].

Because Oracle owns both the Oracle Database software and the Recovery Appliance, it can offer integrated recoverability capabilities that are unavailable to non-Oracle backup and recovery products[5][6][7][8]. After evaluating a Recovery Appliance, industry analyst firm ESG[9] (via ESG Lab) coined the term "Fiduciary Class Data Recovery"[10][11][12] to denote the highest level of trust required by financial institutions, a level that ESG considers the Recovery Appliance to fulfill.

The Recovery Appliance debuted in 2014 as a member of Oracle Corporation's family of Engineered Systems[13] and uses the same hardware and software as the popular Oracle Exadata Database Machine, plus additional software specific to the Recovery Appliance for backup, recovery, replication, monitoring and management.

The Recovery Appliance is available in a "Base Rack" configuration, which can be incrementally expanded to a "Full Rack" or to larger "multi-rack" configurations. A Base Rack is capable of managing over 100 terabytes of data, a Full Rack over 700 terabytes, and multi-rack configurations more than 13 petabytes of data[1]. Because the Recovery Appliance only needs to store data that has changed, the actual size of the databases it protects can be many times larger than its storage capacity[14].

Backup and Recovery of Databases

Backing up databases is an accepted practice of IT professionals. A backup is a separate copy of data that can be restored and used in place of a damaged or unavailable database. Recovery is thus the main goal, and backups are a mechanism to enable recovery. If a backup copy is itself corrupt, it cannot be used for recovery. If the backup data isn’t current, some or all of the changes made to the database since the backup could be lost[15].

The value of being able to recover data varies greatly, depending on the nature of the database. Financial transactions, medical records, and national security information are examples of databases that require the greatest level of data protection. The average cost of downtime for such databases is estimated at millions of U.S. dollars per hour[2].

Due to the high cost of database downtime, critical databases are usually protected with a standby database that is closely synchronized with the primary database using a "data replication[16]" mechanism. If the primary database becomes unavailable, the applications can switch to the standby (now the primary) and continue to operate. However, until the original primary is restored and synchronized, the organization is exposed if the new primary also becomes unavailable. Backups are, therefore, the ultimate failsafe for data protection. The more important the database, the more important its backups and their recoverability.

Backup and Recovery Steps

The general process of database backup and recovery includes the following steps:

Backup

  1. On a regular schedule, run a backup utility that makes a copy of the full database or just the incremental changes since the last backup. Typically a full backup of the database is made once a week followed by incremental backups daily.
  2. In between backups, periodically save (archive) changes in the form of transaction logs.

Recovery

  1. Restore the last full database backup, then apply all subsequent incremental backups in order, plus any subsequently archived transaction logs.
  2. If possible, recover completed transactions that occurred after the last archived log. This usually requires re-entry based on a separate journal maintained by the applications. If not possible, data loss has occurred.
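The ordering in the recovery steps above can be sketched in a few lines of Python. This is only an illustrative model, not how any particular backup utility works: the `Backup` record, its `kind` values and the integer `taken_at` timestamps are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    kind: str      # hypothetical labels: "full", "incremental", or "log"
    taken_at: int  # hypothetical timestamp, e.g. hours on a shared clock


def plan_recovery(backups):
    """Return the pieces to apply, in order, to recover as far forward as possible."""
    ordered = sorted(backups, key=lambda b: b.taken_at)
    fulls = [b for b in ordered if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup available; recovery is not possible")
    last_full = fulls[-1]
    later = [b for b in ordered if b.taken_at > last_full.taken_at]
    # Step 1a: every incremental taken after the last full backup, oldest first.
    incrementals = [b for b in later if b.kind == "incremental"]
    # Step 1b: only the archived logs newer than the last piece restored so far.
    last_point = incrementals[-1].taken_at if incrementals else last_full.taken_at
    logs = [b for b in later if b.kind == "log" and b.taken_at > last_point]
    return [last_full, *incrementals, *logs]
```

Anything committed after the newest archived log in the returned plan falls under step 2: it has to be re-entered from another source, or it is lost.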

Backup and Recovery Challenges

Oracle designed the Recovery Appliance to overcome problems that commonly plague backup and recovery of critical databases[1][17]:

  • Long-running backups - Critical databases are often the largest, so a weekly full backup can take a long time to complete. The longer the backup runs, the more computing and backup resources it requires, and the higher the likelihood of a failure. Some vendors encourage full backups in order to highlight the ability of their products to deduplicate data. Full backups are also faster to recover from than a series of incremental backups, creating an incentive for more frequent full backups.
    • Recovery Appliance - After an initial full backup, the Recovery Appliance uses an "incremental forever" strategy, and only performs a backup of the changes, minimizing the time required. "Virtual full" backups are automatically compiled in the background on the Recovery Appliance to provide the fast recovery of a full backup.
  • Application slow-down during backups - Most backup processes consume significant computing resources on the system where the database resides, particularly if data compression and deduplication are performed there on full backups. This impacts the performance of running applications and may force other applications to be turned off until backups complete.
    • Recovery Appliance - With the incremental forever approach, backups run for the shortest time possible and have minimal impact on other workloads. All post processing of backups is then performed on the Recovery Appliance and not on the production systems.
  • Excessive backup storage - The cost of storing backups can grow quickly, depending on how many backup copies are maintained, whether they are full or incremental backups, and how much deduplication is applied. Most databases change by only a small percentage each day, but may still be fully backed up to simplify recovery.
    • Recovery Appliance - Only changed data is stored for backups, which is the same as deduplicating all databases at their source without any of the processing overhead associated with deduplication on the production system. Not having to perform full backups can easily reduce the backup storage requirements by 75% or more, depending upon database size and the rate of change.[14]
  • Corrupt backups - If a backup has corrupt or missing data, it cannot be used for recovery. Data can be corrupted in many ways as it moves from the primary database through the computing infrastructure and across the network to the backup destination, because many hardware and software components along the way can introduce corruption. Corruption is difficult to detect ahead of time and is often discovered only when it is too late and a recovery is already in progress. Operational mistakes can also produce an incomplete backup, and incomplete backups can result in unrecoverable databases.
    • Recovery Appliance - The Recovery Appliance understands internal Oracle database block formats, which enables deep levels of data validation. All backup data is validated at various times in the Recovery Appliance to ensure that recovery operations will always restore valid data. If a corruption or missing data is detected, the Recovery Appliance is able to repair it automatically or alert the administrator.
  • Lost transactions - Ensuring no data loss is one of the biggest challenges with database recovery, since backups occur periodically on a set schedule, whereas the database is constantly changing. If backups are daily it is possible to lose an entire day of changes during recovery, which is unacceptable for critical databases. A common practice is to periodically capture changes in the form of transaction logs maintained by the database. This practice reduces, but doesn't eliminate, the likelihood of lost transactions. Continuous data protection is the process of capturing changes as they occur in between backups, and adding them to the recovery process to eliminate most or all lost transactions.
    • Recovery Appliance - Redo logging is the fundamental means of implementing transactional changes within the Oracle database. In between backups, Oracle databases can continuously send redo directly from inside the database to the Recovery Appliance. This provides real-time continuous data protection, so databases are protected as transactions happen, up to the last sub-second before a failure.
  • Error-prone recovery processes - Data protection is a multi-step process, with recovery occurring under stress. If any data is corrupt or missing or a recovery step is skipped or out of sequence, the recovery will fail. Seldom does an organization practice recovery or place high value on the certainty of recovery, as it is rarely required. This large gap between the cost of downtime and the perceived need for recovery often leads to underinvestment, until it is too late.
    • Recovery Appliance - The Recovery Appliance introduces the concept of protection policies, which define recovery goals that are enforced on a per-database basis. Using protection policies, databases can be grouped by recovery service tier, with a new database inheriting the policies of the tier to which it is added. When a failure occurs and a database must be restored, Recovery Appliance automates the steps to recover the database either to the point of failure or to a specified point in time.
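The storage argument in the list above can be made concrete with a back-of-the-envelope model. The sketch below compares a weekly-full-plus-daily-incremental schedule with an "incremental forever" schedule. It rests on simplifying assumptions that do not come from the source: each incremental is exactly `db_tb * daily_change` in size, no compression or deduplication is applied, and every backup in the retention window is kept. The function names and parameters are hypothetical.

```python
def weekly_full_storage(db_tb, daily_change, weeks):
    """Storage (TB) for one full backup per week plus six daily incrementals."""
    fulls = weeks * db_tb
    incrementals = weeks * 6 * db_tb * daily_change
    return fulls + incrementals


def incremental_forever_storage(db_tb, daily_change, weeks):
    """Storage (TB) for one initial full backup followed by daily incrementals only."""
    days = weeks * 7
    return db_tb + (days - 1) * db_tb * daily_change


def reduction(db_tb, daily_change, weeks):
    """Fraction of storage saved by incremental forever under this simple model."""
    before = weekly_full_storage(db_tb, daily_change, weeks)
    after = incremental_forever_storage(db_tb, daily_change, weeks)
    return 1 - after / before
```

For a hypothetical 10 TB database with a 2% daily change rate retained for 4 weeks, this model gives about 44.8 TB against 15.4 TB, a reduction of roughly 66%. The figure moves with database size, change rate and retention period, which is why the savings quoted above are stated as dependent on those factors.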

The Evolution of Recovery Appliance

The Recovery Appliance debuted in 2014, with new generations released yearly.

Model          Release Date   Base Rack Capacity   Full Rack Capacity   Full Rack Backup & Restore
ZDLRA X4[18]   2014           37 TB*               224 TB               12 TB/hour
ZDLRA X5[19]   2015           50 TB                340 TB               12 TB/hour
ZDLRA X6       2016           94 TB                580 TB               12 TB/hour
ZDLRA X7       2017           119 TB               729 TB               24 TB/hour

*TB = terabyte
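As a rough reading aid for the throughput column, the sketch below estimates how long moving a given amount of data would take at each generation's quoted full-rack rate. It assumes the quoted rate is sustained for the whole transfer, which real workloads may not achieve. The dictionary and function names are hypothetical; the numbers are copied from the table above.

```python
# Full-rack backup and restore rate per generation, in TB/hour (from the table).
FULL_RACK_RATE_TB_PER_HOUR = {
    "X4": 12,
    "X5": 12,
    "X6": 12,
    "X7": 24,
}


def transfer_hours(data_tb, model):
    """Estimated hours to back up or restore data_tb terabytes at the quoted rate."""
    return data_tb / FULL_RACK_RATE_TB_PER_HOUR[model]
```

Under this assumption, moving the full 729 TB of an X7 Full Rack at 24 TB/hour would take a little over 30 hours, while the same volume at the 12 TB/hour of earlier generations would take about twice as long.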

References

  1. Various authors (2017). "Oracle Zero Data Loss Recovery Appliance X7 Technical Data Sheet". http://www.oracle.com/technetwork/database/availability/recovery-appliance-ds-2297776.pdf. 
  2. Moore, Fred (June 1, 2015). "White Paper: Implementing a Modern Backup Architecture: Oracle's Tiered Data Protection Strategy". https://horison.com/publications/implementing-a-modern-backup-architecture. 
  3. "CIO Magazine White Paper: Extreme Protection That Eliminates Data Loss for All of Your Oracle Databases". http://www.oracle.com/us/products/oracle-zdlra-final1-2301371.pdf?source=%3Aow%3Alp%3Acpo%3A%3A. 
  4. "Video: Resume Business Faster With Engineered Database Recovery". http://medianetwork.oracle.com/video/player/5070016766001. 
  5. Vellante, David (October 22, 2015). "Oracle Backup and Recovery Strategies: Moving to Data-Protection-as-a-Service". https://wikibon.com/oracle-backup-and-recovery-strategies-moving-to-data-protection-as-a-service/. 
  6. Floyer, David (September 2, 2016). "Real-time Recovery Architecture as a Service". https://wikibon.com/real-time-recovery-architecture-as-a-service/. 
  7. Goodwin, Phil (November 1, 2016). "Oracle's Zero Data Loss Recovery Appliance: A Transaction DVR for the Enterprise". http://www.oracle.com/us/corporate/analystreports/idc-zero-data-loss-rec-app-3350796.pdf. 
  8. "Taneja Group Whitepaper: FULL DATABASE PROTECTION WITHOUT THE FULL BACKUP PAIN". http://www.oracle.com/us/products/engineered-systems/tg-recovery-app-lights-year-ahead-2842666.pdf?source=%3Aow%3Alp%3Acpo%3A%3A. 
  9. "Enterprise Strategy Group". https://www.esg-global.com/. 
  10. Choinski, Sr, Vinny (October 1, 2016). "ESG Lab Validation: Zero Data Loss Recovery Appliance from Oracle". https://research.esg-global.com/reportaction/oraclezdlra/Toc?SearchTerms=oracle. 
  11. Peters, Mark (November 3, 2016). "Better Business Protection – Fiduciary Class Data Recovery". https://www.esg-global.com/blog/better-business-protection-fiduciary-class-data-recovery. 
  12. Peters, Mark (2016). "Understanding Oracle Engineered Storage: Oracle Zero Data Loss Recovery Appliance". http://www.enhancedreports.com/oracle/zero-data-loss-recovery-appliance.html. 
  13. Hollis, Chuck (September 9, 2015). "Chuck's Blog: Grown-up IT for Grown-Up Applications". http://chucksblog.typepad.com/chucks_blog/2015/09/grown-up-it-for-grown-up-applications.html. 
  14. Craft, Chris (February 21, 2018). "De-Duplication in ZDLRA". https://chriscraftoracle.wordpress.com/2018/02/21/de-duplication-in-zdlra/. 
  15. Miller, Lawrence (2017). Database Protection for Dummies. John Wiley & Sons, Inc. pp. 1–35. ISBN 978-1-119-37957-7. https://go.oracle.com/LP=48477?elqCampaignId=49435&src1=ad:pas:go:dg:stor&src2=wwmk160606p00067c0001&SC=sckw=WWMK160606P00067C0001&mkwid=sGGgtXNTW%7cpcrid%7c238844513031%7cpkw%7cdatabase%20backup%20and%20recovery%7cpmt%7ce%7cpdv%7cc%7csckw=srch:database%20backup%20and%20recovery&gclid=CjwKCAjwkYDbBRB6EiwAR0T_-via_PHAabILjjVjJPczumLzdLKeDLCvmV-iLULWRskzHbQ6ahfoLhoChNwQAvD_BwE&gclsrc=aw.ds 
  16. Rouse, Margaret (June 1, 2018). "What is Database Replication". https://searchdatamanagement.techtarget.com/definition/database-replication. 
  17. "Database Recovery Without the Drama". http://www.oracle.com/webfolder/assets/infographics/recovery-appliance/index.html. 
  18. Bradley, Marcie (September 29, 2014). "Oracle Reinvents Database Protection with Zero Data Loss Recovery Appliance". Oracle Corporation Press Release. https://www.oracle.com/corporate/pressrelease/zero-data-loss-092914.html. 
  19. Whitaker, Teri (January 21, 2015). "Oracle Tackles Data Center Cost and Complexity with Next-Generation Engineered Systems". Oracle Corporation Press Release. https://www.oracle.com/sn/corporate/pressrelease/x5-launch-20150121.html. 
