US7685189B2 - Optimizing backup and recovery utilizing change tracking - Google Patents

Optimizing backup and recovery utilizing change tracking

Publication number
US7685189B2
Authority
US
United States
Prior art keywords
data
file
backup
application
application data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related, expires
Application number
US11/616,686
Other versions
US20080162599A1 (en
Inventor
Harshwardhan Mittal
Kushal Suresh Narkhede
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Corp
Priority to US11/616,686
Assigned to MICROSOFT CORPORATION: assignment of assignors interest (see document for details). Assignors: MITTAL, HARSHWARDHAN; NARKHEDE, KUSHAL SURESH
Publication of US20080162599A1
Application granted
Publication of US7685189B2
Assigned to MICROSOFT TECHNOLOGY LICENSING, LLC: assignment of assignors interest (see document for details). Assignors: MICROSOFT CORPORATION
Legal status: Expired - Fee Related
Adjusted expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1448Management of the data involved in backup or backup restore
    • G06F11/1451Management of the data involved in backup or backup restore by selection of backup contents
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1458Management of the backup or restore process
    • G06F11/1469Backup restoration techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1458Management of the backup or restore process
    • G06F11/1464Management of the backup or restore process for networked environments

Definitions

  • Data protection and restoration is a critical component of any business enterprise. Typically, the time a recovery program needs to recover data from a backup is proportional to the size of the data source. Recovery time is important to a business, because the application is not functional during recovery.
  • the recovery process starts when a user detects an anomaly in an application behavior due to a system failure. For example, the application may be unavailable or some important data (e.g. a critical SQL table) is missing.
  • a recovery process is initiated.
  • the recovery process can be broken down into three phases: (1) preserving the state of the application data after the system failure and before restoring the data; (2) recovering the application data from a backup; and (3) setting up backup protection post recovery.
  • the whole process involves 3× the amount of data transfer (1× of the data is transferred to preserve the state of the application data after the system failure and before restoring the data, 1× of the data is transferred to recover the application data from a backup, and 1× of the data is transferred to create the first backup point). Even with modern hardware and a moderate size of data, the whole process takes significant time (hours).
  • administrators will skip the first step to reduce recovery time and overwrite the existing application data without performing a backup.
  • the administrator may be able to salvage additional data from the preserved data which could not be recovered from a previous backup of the application data.
  • Embodiments of the invention overcome one or more deficiencies in known recovery programs by transferring only the data that has changed since the last backup from the backup server to the application server.
  • the application server tracks changes to a storage device associated with the application data to allow the backup server to transfer only the changes from the last backup to the application server.
  • the application server generates changes to the application data utilizing the tracked changes to the storage device.
  • the application server transfers the changed data to the backup server allowing the backup server to preserve the current state of the application data.
  • This data may be used to create a new recovery point (replica) or to preserve the current state of the application data after a system failure.
  • embodiments of the invention may comprise various other methods and apparatuses.
  • FIG. 1 is an exemplary block diagram illustrating a system for recovering application data according to an embodiment of the invention.
  • FIG. 2 is an exemplary flow chart embodying aspects of the invention for preserving and recovering application data according to an embodiment of the invention.
  • FIG. 3 is a flow diagram for creating an express backup according to an embodiment of the invention.
  • FIG. 4 is a flow diagram for generating file difference records for the recovery of application data according to an aspect of the invention.
  • FIG. 5 is a flow diagram for generating file difference records for the recovery of application data according to another aspect of the invention.
  • an embodiment of the invention includes an application server 102 and a backup server 104 .
  • the application server 102 includes application data 106 modified by an application program 108 and protected by the recovery program 110 .
  • the application data 106 is associated with one or more application programs such as an email application or database application and is contained on one or more storage devices 114 .
  • FIG. 1 shows one example of a general purpose computing device in the form of application server 102 and backup server 104 .
  • the computer has one or more processors or processing units and a system memory.
  • Computer readable media which include both volatile and nonvolatile media, removable and non-removable media, may be any available medium that may be accessed by the computer (e.g. application server 102 and backup server 104 ).
  • computer readable media comprise computer storage media and communication media.
  • Computer storage media include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data.
  • computer storage media include RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that may be used to store the desired information and that may be accessed by the computer (e.g. application server 102 and backup server 104 ).
  • Communication media typically embody computer readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism and include any information delivery media.
  • Wired media such as a wired network or direct-wired connection
  • wireless media such as acoustic, RF, infrared, and other wireless media
  • Combinations of any of the above are also included within the scope of computer readable media.
  • the recovery program 110 tracks the changes to the storage device 114 throughout the duration of protection of the application data 106 .
  • the recovery program 110 monitors writes to the volume of the storage device 114 and maintains the change tracking data 112 .
  • the recovery program 110 monitors writes to the file system of the storage device 114 and maintains the change tracking data 112 .
  • the change tracking data 112 includes metadata about the writes to each protected volume or file system of the storage device 114 since the last backup (backup data N 116 D).
  • the change tracking data 112 does not include the actual data written to the volume or file system, only an indication of the blocks of the volume or file system that have been written since the last backup (backup data N 116 D).
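The tracking described above can be sketched as a set of block numbers that is updated on every write. This is a minimal illustration only: the class name `ChangeTracker`, the 4096-byte block size, and the `on_write`/`reset` hooks are assumptions made for the example and do not come from the patent text.

```python
class ChangeTracker:
    """Records which blocks were written since the last backup.

    Only block numbers are kept; the written data itself is never stored.
    """

    def __init__(self, block_size=4096):
        self.block_size = block_size  # assumed block size, for illustration
        self.changed_blocks = set()

    def on_write(self, offset, length):
        """Mark every block touched by a write of `length` bytes at `offset`."""
        if length <= 0:
            return
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        self.changed_blocks.update(range(first, last + 1))

    def reset(self):
        """Start a new tracking interval, e.g. after a successful backup."""
        self.changed_blocks.clear()


tracker = ChangeTracker(block_size=4096)
tracker.on_write(offset=0, length=100)       # touches block 0
tracker.on_write(offset=4000, length=200)    # straddles blocks 0 and 1
tracker.on_write(offset=40960, length=4096)  # exactly block 10
print(sorted(tracker.changed_blocks))        # [0, 1, 10]
```

Because the structure holds only block indices, its size depends on how many distinct blocks were written, not on how much data was written to them.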
  • the application server 102 and backup server 104 may also include other removable/non-removable, volatile/nonvolatile computer storage media such as the storage device 114 .
  • Removable/non-removable, volatile/nonvolatile computer storage media that may be used in the exemplary operating environment include, but are not limited to, magnetic disk drive, optical disk drive, magnetic tape cassettes, flash memory cards, digital versatile disks, digital video tape, solid state RAM, solid state ROM, and the like.
  • the drives or other mass storage devices and their associated computer storage media discussed above and illustrated in FIG. 1 provide storage of computer readable instructions, data structures, program modules and other data for the application server 102 and backup server 104 .
  • storage device 114 is illustrated as application data 106 , application program 108 , recovery program 110 and change tracking data 112 .
  • the backup server 104 includes backup data 116 A, 116 B, 116 C, 116 D (replicas of the application data) and file change data 118 A, 118 B, 118 C, 118 D.
  • the backup data 116 A, 116 B, 116 C, 116 D is a copy of the application data 106 for a particular point in time and corresponds to a recovery point.
  • each copy of the backup data 116 A, 116 B, 116 C, 116 D is associated with file change data 118 A, 118 B, 118 C, 118 D.
  • backup data N 116 D and file change data N 118 D correspond to an nth recovery point while backup data 2 116 B and file change data 2 118 B correspond to a second recovery point.
  • the file change data 118 A, 118 B, 118 C, 118 D indicates the changes to the application data 106 between any two recovery points.
  • file change data 118 C indicates the changes to the application data 106 that occurred between backup data 116 B and backup data 116 C.
  • the file change data 118 A, 118 B, 118 C, 118 D does not include the actual data written to the application data 106 , but instead includes an indication of the blocks that have been written between two consecutive recovery points.
  • the application server 102 communicates with the backup server 104 through a logical communication connection 120 .
  • the application server 102 and backup server 104 may operate in a networked environment using the logical communication connection 120 .
  • the logical communication connection 120 depicted in FIG. 1 includes a local area network (LAN) and a wide area network (WAN), but may also include other networks.
  • LAN and/or WAN may be a wired network, a wireless network, a combination thereof, and so on.
  • Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets, and global computer networks (e.g., the Internet).
  • program modules depicted relative to the application server 102 and backup server 104 may be stored in a remote memory storage device (not shown).
  • the network connections shown are exemplary and other means of establishing a communications link between the computers may be used.
  • FIG. 2 is a flow diagram for restoring the application data 106 of the application program 108 according to the system illustrated in FIG. 1 .
  • the recovery program 110 initiates change tracking generating change tracking data 112 .
  • the change tracking data 112 is written to a data structure of the memory of the application server 102 .
  • the change tracking data 112 does not include the actual data written to the storage device, only an indication of the blocks of the storage device that have been written since the last backup (backup data N 116 D).
  • a system failure related to the application program 108 is detected.
  • the application program 108 may be completely or partially unavailable or some important data (e.g. a SQL table) is missing.
  • the recovery process is initiated by determining the type of system failure at steps 206 and 214 .
  • system failures may be classified into the following three categories: corruption, crash, and disaster.
  • Corruption is a logical corruption of the application data 106 and includes transient hardware errors, bit flips, and the accidental deletion of data by users.
  • the storage device 114 of the application server 102 (including the change tracking data 112 ) is intact and the operating system is executing normally.
  • a crash results when the application server 102 has crashed or unexpectedly shut down, which results in an unmanaged reboot of the operating system.
  • the application data 106 and/or the change tracking data 112 may be corrupted, but the storage device 114 is in a healthy state.
  • the last category, disaster, occurs when the storage device 114 of the application server 102 has failed and has been replaced.
  • a check is made to determine if the storage device 114 has been replaced. If so, the type of system failure is a disaster and a request for the backup data (e.g. backup data 116 A, 116 B, 116 C, 116 D) associated with one recovery point is made to the backup server 104 at 208 .
  • the volume identifier of the storage device 114 is checked to see if it has changed since the last backup (e.g. backup data N 116 D). If so, the storage device 114 has been replaced.
  • the backup server 104 transfers the backup data (e.g. backup data 116 A, 116 B, 116 C, 116 D) corresponding to the requested recovery point to the application server 102 .
  • backup data 116 A is associated with a first recovery point
  • backup data 116 D is associated with an nth recovery point
  • backup data 116 C is associated with an (n-1)th recovery point.
  • the backup data 116 A, 116 B, 116 C, 116 D associated with the requested recovery point is applied to the application data 106 to complete the recovery of the application data 106 .
  • the recovery program marks the storage device 114 as dirty to indicate that a clean shutdown has not occurred when change tracking is initiated at 202 .
  • the storage device 114 is marked as clean to indicate that the change tracking data 112 is available and that a clean shut down has occurred. If the recovery program detects that the storage device 114 has a dirty status, a crash type system failure has occurred and the change tracking data 112 should not be used.
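The three failure categories and the two checks described above (volume identifier, then dirty flag) can be sketched as a small decision function. The names `VolumeState`, `FailureType`, and `classify_failure` are hypothetical, and the sketch assumes the dirty flag is read at the moment the recovery program inspects it, before tracking is restarted.

```python
from dataclasses import dataclass
from enum import Enum


class FailureType(Enum):
    CORRUPTION = "corruption"  # storage and change tracking data intact
    CRASH = "crash"            # unmanaged reboot; tracking data not trusted
    DISASTER = "disaster"      # storage device was replaced


@dataclass
class VolumeState:
    volume_id: str  # identifier of the volume currently attached
    dirty: bool     # True if no clean shutdown was recorded


def classify_failure(volume, volume_id_at_last_backup):
    """Classify a failure using the checks described in the text."""
    if volume.volume_id != volume_id_at_last_backup:
        # Volume identifier changed since the last backup: device replaced.
        return FailureType.DISASTER
    if volume.dirty:
        # No clean shutdown: the change tracking data should not be used.
        return FailureType.CRASH
    return FailureType.CORRUPTION


print(classify_failure(VolumeState("vol-B", dirty=False), "vol-A"))
print(classify_failure(VolumeState("vol-A", dirty=True), "vol-A"))
print(classify_failure(VolumeState("vol-A", dirty=False), "vol-A"))
```

In this reading, a disaster leads to a full transfer of the requested recovery point, a crash leads to regenerating the change tracking data, and a corruption lets the existing change tracking data be used directly.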
  • the change tracking data 112 is generated by comparing a checksum for the application data 106 to a checksum for the last backup, backup data N 116 D. The checksum is compared on a block by block basis. For example, if the checksum for a block of the application data 106 is different from the checksum for the corresponding block of backup data N 116 D, the block has been changed and an indication is written to the change tracking data 112 .
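A block-by-block checksum comparison of this kind might look as follows. The text does not name a checksum algorithm, so SHA-256 is an assumption here, as are the function names and the tiny 4-byte block size used to keep the example readable. The sketch also compares two in-memory buffers for simplicity; a real system would more plausibly exchange checksums than data.

```python
import hashlib


def block_checksums(data, block_size):
    """Return one checksum per fixed-size block of `data`."""
    return [hashlib.sha256(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def rebuild_change_tracking(current, last_backup, block_size=4):
    """Return the indices of blocks whose checksums differ."""
    cur = block_checksums(current, block_size)
    old = block_checksums(last_backup, block_size)
    changed = {i for i, (a, b) in enumerate(zip(cur, old)) if a != b}
    # Blocks that exist on only one side are treated as changed.
    changed.update(range(min(len(cur), len(old)), max(len(cur), len(old))))
    return changed


last_backup = b"AAAABBBBCCCCDDDD"
current = b"AAAAbbbbCCCCdddd"
print(sorted(rebuild_change_tracking(current, last_backup)))  # [1, 3]
```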
  • file change data (FCD) 122 is generated for each file of the application data 106 as a function of the file system and the application data 106 .
  • the file change data 122 indicates the blocks of application data 106 that have changed since the last backup, backup data N 116 D.
  • the application indicates the files belonging to the application, while the file system of the storage device indicates which blocks on the storage device belong to each file; thus it is possible to determine which blocks belong to the application data.
  • the intersection of the application data blocks with the change tracking data determines changed blocks for application data 106 since last backup.
  • file systems include FAT (File Allocation Table), NTFS (New Technology File System), HFS (Hierarchical File System), HFS+ (Hierarchical File System Plus), ext2 (second extended file system) and ext3 (third extended file system). Therefore, it is possible to determine which blocks belong to the application data 106 as a function of the application data 106 and the file system of the storage device 114 . Furthermore, the change tracking data 112 indicates the blocks of the storage that have changed since the last backup. Therefore, it is possible to generate file change data 122 (the blocks of the files of the application data that have changed since the last backup) as a function of the change tracking data 112 and the determined blocks belonging to the application data 106 .
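The intersection described above can be sketched with ordinary sets. The file names, the `file_block_map` structure standing in for file system metadata, and the choice to omit unchanged files from the result are all assumptions made for illustration.

```python
def file_change_data(app_files, file_block_map, changed_blocks):
    """Per application file, list the blocks that changed since the last backup.

    app_files      -- files the application reports as its own
    file_block_map -- file name -> blocks of the volume holding that file
                      (stands in for file system metadata)
    changed_blocks -- change tracking data for the whole volume
    """
    fcd = {}
    for name in app_files:
        hits = sorted(set(file_block_map.get(name, ())) & changed_blocks)
        if hits:  # unchanged files are left out in this sketch
            fcd[name] = hits
    return fcd


file_block_map = {
    "db.mdf": [1, 2, 3, 4],
    "db.ldf": [10, 11],
    "notes.txt": [20, 21],  # on the same volume, but not application data
}
app_files = ["db.mdf", "db.ldf"]
changed_blocks = {2, 4, 10, 20}

fcd = file_change_data(app_files, file_block_map, changed_blocks)
print(fcd)  # {'db.mdf': [2, 4], 'db.ldf': [10]}
```

Block 20 was written, but it belongs to a file outside the application data, so it does not appear in the file change data.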
  • a first file difference record 124 is generated as a function of the application data 106 and the file change data.
  • the first file difference record 124 is used by the backup server 104 to construct a copy of the current state of the application data 106 (PD1).
  • the first file difference record 124 data structure contains data changed between PD1 and last backup (e.g. backup data N 116 D). It is possible to construct the second version of PD1 by applying the first file difference record 124 to the last backup (e.g. backup data N 116 D).
  • the first file difference record 124 and the file change data 122 for each file of the application data 106 are sent to the backup server 104 .
  • the backup server 104 applies the first file difference record 124 to the last backup (e.g. backup data N 116 D) to create a new version of the application data 106 to preserve the current state of the application data 106 (PD1) before attempting a recovery.
  • the first file difference record 124 can be kept on the application server 102 for duration of recovery and discarded on the successful completion of the recovery procedure.
  • the size of the first file difference record 124 is only a fraction of the size of whole PD1.
  • this optimization reduces time taken for backup and consumes a fraction of disk space consumed by PD1.
  • the PD1 data may be preferred to a previous recovery point in case the backup application fails to recover the desired recovery point during the recovery procedure.
  • the PD1 data may also be used for doing analysis of what went wrong or to salvage additional data which could not be recovered by the recovery program 110 . Some administrators skip this step to reduce recovery time and overwrite the application data 106 , but this can be disastrous if recovery from the backup fails.
  • step 224 can be executed post recovery (after step 228 ) to save application downtime.
  • the backup server 104 generates a second file difference record 124 as a function of the last backup data (e.g. backup data N 116 D) and the file change data.
  • the second file difference record 124 is the data applied to the application data 106 to restore that data so it is consistent with the backup data.
  • the second file difference record 124 is transferred to the application server 102 .
  • the size of the second file difference record 124 is only a fraction of the size of whole last backup.
  • this optimization reduces time taken for backup and consumes a fraction of disk space consumed by last backup (e.g. backup data N 116 D).
  • the application server 102 applies the second file difference record 124 to the application data 106 to recover the data and make the application data 106 consistent with the last backup. Because only the changed data represented by the second file difference record 124 is transferred from the backup server 104 to the application server 102 , the time, space and bandwidth requirements for recovery may be reduced by a large factor (see Appendix A). Advantageously, reduction in recovery time reduces the downtime for the application and improves business efficiency.
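The two file difference records can be illustrated with one pair of helper functions used in both directions. The data model (each file as a dictionary from block number to block contents), the helper names, and the sample data are assumptions for this sketch and are not the patent's data structures.

```python
def make_difference_record(source, fcd):
    """Copy, from `source`, the contents of exactly the blocks listed in the FCD."""
    return {name: {b: source[name][b] for b in blocks}
            for name, blocks in fcd.items()}


def apply_difference_record(target, record):
    """Return a copy of `target` with the recorded blocks overwritten."""
    result = {name: dict(blocks) for name, blocks in target.items()}
    for name, changes in record.items():
        result.setdefault(name, {}).update(changes)
    return result


# Hypothetical data: one file of four blocks; block 2 was damaged after the backup.
last_backup = {"db.mdf": {0: b"aaaa", 1: b"bbbb", 2: b"cccc", 3: b"dddd"}}
pd1 = {"db.mdf": {0: b"aaaa", 1: b"bbbb", 2: b"????", 3: b"dddd"}}
fcd = {"db.mdf": [2]}

# Application server -> backup server: preserve the current state (PD1).
first_record = make_difference_record(pd1, fcd)
preserved_pd1 = apply_difference_record(last_backup, first_record)

# Backup server -> application server: restore the last backup.
second_record = make_difference_record(last_backup, fcd)
restored = apply_difference_record(pd1, second_record)

blocks_sent = sum(len(changes) for changes in second_record.values())
print(preserved_pd1 == pd1, restored == last_backup, blocks_sent)  # True True 1
```

In this example one block out of four crosses the network in each direction, which is the saving the text attributes to transferring only changed data.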
  • the next express backup can be achieved by synchronizing the first recovery point post system failure to the backup data recovered.
  • the change tracking data is used by the application server to generate the file change data and file difference records.
  • the file change data and file difference records are transferred to the backup server.
  • the backup server then applies the file difference records to the backup data to generate the next recovery point after the system failure.
  • FIG. 3 is a flow diagram for creating an express backup according to an embodiment of the invention.
  • the protection of the application data is initiated when the backup server 104 receives a first replica or backup data (e.g. backup data 1 ) from the application server 102 .
  • the first replica is a copy of the application data 106 at some initial recovery point.
  • backup server 104 receives a first file change data (e.g. file change data 122 ) from the application server 102 .
  • the first file change data indicates the blocks of application data 106 that have changed since the last backup.
  • the backup server 104 receives a first file difference record (e.g. file difference record 124 ) from the application server 102 .
  • the first file difference record is generated as a function of the application data 106 and the file change data.
  • the first file difference record data structure contains data changed since the time of the last backup (e.g. backup data N 116 D).
  • the backup server 104 applies the first file difference record to the first replica to generate a second replica of the application data 106 for a second recovery point.
  • bandwidth and disk space to create the express backup are minimized because only the file difference record and file change data are transferred from the application server 102 to the backup server 104 to create the second replica. Additionally, after the file difference records have been applied to the first replica, the file difference records may be deleted. However, the file change data is kept on the application server for recovery purposes as explained above with respect to FIG. 2 . Additionally, the time needed to recover the application data 106 will be proportional to the time difference between backup and recovery, which can be reduced to the desired extent by increasing the frequency of the express backups.
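The express backup flow can be sketched from the backup server's side. The `BackupServer` class, its flat block-number-to-contents replica model, and the choice to keep every replica as a full in-memory copy are simplifying assumptions; the text does not say how replicas are stored.

```python
class BackupServer:
    """Keeps one replica and one file change data entry per recovery point."""

    def __init__(self, first_replica):
        self.replicas = [dict(first_replica)]  # recovery point 1
        self.file_change_data = [set()]        # no changes precede point 1

    def express_backup(self, fcd, difference_record):
        """Create the next recovery point from the previous replica.

        fcd               -- block numbers changed since the last backup
        difference_record -- block number -> new contents for those blocks
        """
        if set(difference_record) != set(fcd):
            raise ValueError("difference record does not match file change data")
        new_replica = dict(self.replicas[-1])
        new_replica.update(difference_record)
        self.replicas.append(new_replica)
        self.file_change_data.append(set(fcd))
        return len(self.replicas)  # number of the new recovery point


server = BackupServer({1: b"a", 2: b"b", 3: b"c"})
point = server.express_backup(fcd={2}, difference_record={2: b"B"})
print(point, server.replicas[-1])  # 2 {1: b'a', 2: b'B', 3: b'c'}
```

Only block 2 is sent to create the second recovery point, and the first replica is left untouched so that it remains available as a recovery point.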
  • FIG. 4 is a flow diagram for generating file difference records for the recovery of application data 106 from the last backup according to an aspect of the invention.
  • the backup server 104 receives a request for file difference records and file change data (e.g. file change data 122 ) from the application server 102 .
  • the file difference record represents data modifications between backup data corresponding to a backup recovery point and the application data 106 .
  • the file change data indicates the blocks of the application data 106 that have changed since the last backup.
  • file difference records are generated as a function of the received file change data and the last backup data (e.g. backup data N ).
  • the generated file difference records include modified data from the last backup data as indicated by the received file change data.
  • the generated file difference records are transferred to the application server 102 .
  • the application server 102 applies the transferred file difference records to the application data 106 , restoring the application data 106 to the state of the backup data corresponding to the backup recovery point.
  • the backup server 104 receives a request for file difference records associated with a target recovery point and file change data (e.g. file change data 122 ) from the application server 102 .
  • a recovery point corresponds to a version (or state) of the backup data (e.g. backup data N , backup data N-1 , backup data 1 ).
  • the received file change data indicates the blocks of the application data 106 that have changed since the last backup.
  • the current recovery point is initialized to the last recovery point and, at 506 , the total file change data is initialized to the received file change data (e.g. file change data 122 ).
  • at 508 , a check is made to determine whether the current recovery point is equal to the target recovery point. If the current recovery point is not equal to the target recovery point, at 510 , the current recovery point is set to the next previous recovery point. At 512 , the total file change data is calculated as the total file change data unioned with the file change data (e.g. file change data N ) of the current recovery point. Steps 508 - 512 are repeated until the current recovery point equals the target recovery point.
  • the file difference records are generated as a function of the total file change data and the backup data associated with the target recovery point.
  • the total file change data indicates the blocks of the application data 106 that have changed since target recovery point.
  • the generated file difference records include data for each file from the backup data associated with the target recovery point that is different from the current state of the application data 106 as indicated by the total file change data.
  • the generated file difference records are transferred to the application server 102 .
  • the application server 102 applies the file difference records to the application data 106 to recover the application to the target recovery point.
  • the state of the application data will be in the same state as the backup data at the target recovery point.
  • the amount of data transferred from the backup server to the application server is minimized.
  • the bandwidth needed to transfer the data and the time needed to recover the data to the target recovery point are also minimized.
  • the backup server 104 receives a request from the application server 102 indicating that the target recovery point is the third recovery point and that blocks 1 , 2 , 4 , 10 of the application data 106 have been modified since the last backup (step 502 ).
  • the current recovery point is initialized to the fifth (last) recovery point (step 504 ) and the total file change data is initialized to the received file change data (step 506 ).
  • the current recovery point is set to the next previous recovery point ( 4 ) (step 510 ).
  • the file change data corresponding to the fourth recovery point ( 6 , 10 , 12 ) is unioned to the total file change data ( 1 , 2 , 4 , 10 ) resulting in a total file change data of ( 1 , 2 , 4 , 6 , 10 , 12 ).
  • the file change data corresponding to the fourth recovery point indicates the changes that occurred to the application data between the fifth recovery point and the fourth recovery point.
  • the steps 508 - 512 are repeated for the fourth recovery point.
  • the current recovery point is set to the next previous recovery point ( 3 ) (step 510 ).
  • the file change data corresponding to the third recovery point ( 5 , 7 ) is unioned to the total file change data ( 1 , 2 , 4 , 6 , 10 , 12 ) resulting in a total file change data of ( 1 , 2 , 4 , 5 , 6 , 7 , 10 , 12 ).
  • the file change data corresponding to the third recovery point indicates the changes that occurred to the application data between the fourth recovery point and the third recovery point.
  • the total file change data of ( 1 , 2 , 4 , 5 , 6 , 7 , 10 , 12 ) contains a list of blocks that have been changed between the current state of the application data 106 and the third recovery point. Table 1 summarizes the data of this example.
  • the file difference records are generated as a function of the total file change data ( 1 , 2 , 4 , 5 , 6 , 7 , 10 , 12 ) such that the generated file difference records include data for each file from the backup data associated with the third recovery point that is different from the current state of the application data 106 as indicated by the total file change data.
  • the generated file difference records are transferred to the application server 102 and applied to the application data to recover the application to the third recovery point. After the file difference records have been applied to the application data 106 , the state of the application data will be in the same state as the backup data at the third recovery point.
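The worked example above can be reproduced with a short loop that follows steps 504 through 512. The function name and the `fcd_by_point` dictionary are hypothetical; the sketch follows the example's convention that the file change data stored with recovery point k covers the changes between point k+1 and point k.

```python
def total_file_change_data(received_fcd, fcd_by_point, last_point, target_point):
    """Union the file change data from the last recovery point back to the target."""
    if not 1 <= target_point <= last_point:
        raise ValueError("target recovery point out of range")
    current = last_point               # step 504
    total = set(received_fcd)          # step 506
    while current != target_point:     # step 508
        current -= 1                   # step 510: next previous recovery point
        total |= fcd_by_point[current] # step 512
    return total


# Data from the example: five recovery points, target is the third.
received_fcd = {1, 2, 4, 10}                 # changed since the last backup
fcd_by_point = {4: {6, 10, 12}, 3: {5, 7}}   # as given for points 4 and 3

total = total_file_change_data(received_fcd, fcd_by_point,
                               last_point=5, target_point=3)
print(sorted(total))  # [1, 2, 4, 5, 6, 7, 10, 12]
```

When the target is the last recovery point the loop body never runs and the result is the received file change data alone, which corresponds to the simpler recovery from the last backup.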
  • programs and other executable program components such as the application program 108 and the recovery program 110 are illustrated herein as discrete blocks. It is recognized, however, that such programs and components reside at various times in different storage components of the portable application server 102 , and are executed by the data processor(s) of the devices.
  • Embodiments of the invention may be implemented with computer-executable instructions.
  • the computer-executable instructions may be organized into one or more computer-executable components or modules.
  • Aspects of the invention may be implemented with any number and organization of such components or modules. For example, aspects of the invention are not limited to the specific computer-executable instructions or the specific components or modules illustrated in the figures and described herein.
  • Other embodiments of the invention may include different computer-executable instructions or components having more or less functionality than illustrated and described herein.

Abstract

Application data associated with an application program located on a storage device is restored utilizing tracked changed blocks of the storage device. The changed blocks of the storage device are tracked as the application program modifies the application data. When a system failure has occurred, file change data is generated for each file of the application data as a function of tracked changed blocks of the storage device and file system metadata of the storage device. Additionally, a file difference record for each changed file is generated to indicate changes between the application data and the last backup. The file difference record is applied to the application data such that the application data corresponds to the state of the last backup.

Description

BACKGROUND
Data protection and restoration is a critical component of any business enterprise. Typically, the time a recovery program needs to recover data from a backup is proportional to the size of the data source. Recovery time is important to a business, because the application is not functional during recovery.
The recovery process starts when a user detects an anomaly in application behavior due to a system failure. For example, the application may be unavailable or some important data (e.g. a critical SQL table) may be missing. Once an administrator is notified, a recovery process is initiated. In general, the recovery process can be broken down into three phases: (1) preserving the state of the application data after the system failure and before restoring the data; (2) recovering the application data from a backup; and (3) setting up backup protection post recovery. For example, if the total size of protected application data is X, the whole process involves 3X of data transfer (1X of data is transferred to preserve the state of the application data after the system failure and before restoring the data, 1X of data is transferred to recover the application data from a backup, and 1X of data is transferred to create the first backup point). Even with modern hardware and a moderate size of data, the whole process takes significant time (hours).
In some cases, administrators will skip the first step to reduce recovery time and overwrite the existing application data without performing a backup. However, it is desirable to back up the current state of the application data to an alternate location. This allows the administrator to recover data in case the recovery process fails, and the preserved data may be analyzed to determine the cause of the system failure. In some cases, the administrator may be able to salvage additional data from the preserved data which could not be recovered from a previous backup of the application data.
Additionally, there are three distinct states of the application data during the recovery process: (1) the state of the data on the protected server prior to the recovery (PD1), (2) the state of the data on backup server corresponding to the recovery point chosen (BD), and (3) the state of the data on protected server after recovery (PD2). In prior art recovery systems, no application agnostic process is used to transition from one state to another.
SUMMARY
Embodiments of the invention overcome one or more deficiencies in known recovery programs by transferring only the data that has changed since the last backup from the backup server to the application server. The application server tracks changes to a storage device associated with the application data to allow the backup server to transfer only the changes from the last backup to the application server.
In another aspect, the application server generates changes to the application data utilizing the tracked changes to the storage device. The application server transfers the changed data to the backup server allowing the backup server to preserve the current state of the application data. This data may be used to create a new recovery point (replica) or to preserve the current state of the application data after a system failure. Alternatively, embodiments of the invention may comprise various other methods and apparatuses.
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
Other features will be in part apparent and in part pointed out hereinafter.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is an exemplary block diagram illustrating a system for recovering application data according to an embodiment of the invention.
FIG. 2 is an exemplary flow chart embodying aspects of the invention for preserving and recovering application data according to an embodiment of the invention.
FIG. 3 is a flow diagram for creating an express backup according to an embodiment of the invention.
FIG. 4 is a flow diagram for generating file difference records for the recovery of application data according to an aspect of the invention.
FIG. 5 is a flow diagram for generating file difference records for the recovery of application data according to another aspect of the invention.
Corresponding reference characters indicate corresponding parts throughout the drawings.
DETAILED DESCRIPTION
Referring now to the drawings, an embodiment of the invention includes an application server 102 and a backup server 104. The application server 102 includes application data 106 modified by an application program 108 and protected by the recovery program 110. The application data 106 is associated with one or more application programs such as an email application or database application and is contained on one or more storage devices 114.
FIG. 1 shows one example of a general purpose computing device in the form of application server 102 and backup server 104. In one embodiment of the invention, the computer (e.g. application server 102 and backup server 104) is suitable for use in the other figures illustrated and described herein. The computer (e.g. application server 102 and backup server 104) has one or more processors or processing units and a system memory. The computer (e.g. application server 102 and backup server 104) typically has at least some form of computer readable media. Computer readable media, which include both volatile and nonvolatile media, removable and non-removable media, may be any available medium that may be accessed by the computer (e.g. application server 102 and backup server 104). By way of example and not limitation, computer readable media comprise computer storage media and communication media.
Computer storage media include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. For example, computer storage media include RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that may be used to store the desired information and that may be accessed by the computer (e.g. application server 102 and backup server 104).
Communication media typically embody computer readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism and include any information delivery media. Those skilled in the art are familiar with the modulated data signal, which has one or more of its characteristics set or changed in such a manner as to encode information in the signal. Wired media, such as a wired network or direct-wired connection, and wireless media, such as acoustic, RF, infrared, and other wireless media, are examples of communication media. Combinations of any of the above are also included within the scope of computer readable media.
The recovery program 110 tracks the changes to the storage device 114 throughout the duration of protection of the application data 106. In one embodiment, the recovery program 110 monitors writes to the volume of the storage device 114 and maintains the change tracking data 112. Alternatively, the recovery program 110 monitors writes to the file system of the storage device 114 and maintains the change tracking data 112. The change tracking data 112 includes metadata about the writes to each protected volume or file system of the storage device 114 since the last backup (backup data N 116D). The change tracking data 112 does not include the actual data written to the volume or file system, only an indication of the blocks of the volume or file system that have been written since the last backup (backup data N 116D).
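The block-level tracking described above can be illustrated with a minimal sketch. The class name `ChangeTracker`, the 4096-byte block size, and the byte-offset interface are assumptions made for illustration only; the description does not specify how writes are intercepted or how the change tracking data 112 is laid out.

```python
class ChangeTracker:
    """Records which blocks were written since the last backup, not their contents."""

    def __init__(self, block_size=4096):
        self.block_size = block_size  # assumed block size in bytes
        self.changed_blocks = set()   # block indices only, never the written data

    def record_write(self, offset, length):
        """Mark every block touched by a write of `length` bytes at byte `offset`."""
        if length <= 0:
            return
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        self.changed_blocks.update(range(first, last + 1))

    def reset(self):
        """Clear the record, e.g. once a new backup has completed."""
        self.changed_blocks.clear()


tracker = ChangeTracker(block_size=4096)
tracker.record_write(offset=0, length=100)       # touches block 0
tracker.record_write(offset=4000, length=200)    # spans blocks 0 and 1
tracker.record_write(offset=40960, length=4096)  # exactly block 10
print(sorted(tracker.changed_blocks))            # [0, 1, 10]
```

Because only block indices are stored, the memory used grows with the number of distinct blocks touched rather than with the volume of data written.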
The application server 102 and backup server 104 may also include other removable/non-removable, volatile/nonvolatile computer storage media such as the storage device 114. Removable/non-removable, volatile/nonvolatile computer storage media that may be used in the exemplary operating environment include, but are not limited to, magnetic disk drive, optical disk drive, magnetic tape cassettes, flash memory cards, digital versatile disks, digital video tape, solid state RAM, solid state ROM, and the like. The drives or other mass storage devices and their associated computer storage media discussed above and illustrated in FIG. 1, provide storage of computer readable instructions, data structures, program modules and other data for the application server 102 and backup server 104. In FIG. 1, for example, storage device 114 is illustrated as application data 106, application program 108, recovery program 110 and change tracking data 112.
The backup server 104 includes backup data 116A, 116B, 116C, 116D (replicas of the application data) and file change data 118A, 118B, 118C, 118D. The backup data 116A, 116B, 116C, 116D is a copy of the application data 106 for a particular point in time and corresponds to a recovery point. Also, each copy of the backup data 116A, 116B, 116C, 116D is associated with file change data 118A, 118B, 118C, 118D. For example, backup data N 116D and file change data N 118D correspond to an nth recovery point while backup data 2 116B and file change data 2 118B correspond to a second recovery point.
The file change data 118A, 118B, 118C, 118D indicates the changes to the application data 106 between any two recovery points. For example, referring to FIG. 1, file change data 118C indicates the changes to the application data 106 that occurred between backup data 116B and backup data 116C. In one embodiment, the file change data 118A, 118B, 118C, 118D does not include the actual data written to the application data 106, but instead includes an indication of the blocks that have been written between two consecutive recovery points.
In an embodiment, the application server 102 communicates with the backup server 104 through a logical communication connection 120. The application server 102 and backup server 104 may operate in a networked environment using the logical communication connection 120. The logical communication connection 120 depicted in FIG. 1 includes a local area network (LAN) and a wide area network (WAN), but may also include other networks. The LAN and/or WAN may be a wired network, a wireless network, a combination thereof, and so on. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets, and global computer networks (e.g., the Internet).
In a networked environment, program modules depicted relative to the application server 102 and backup server 104, or portions thereof, may be stored in a remote memory storage device (not shown). The network connections shown are exemplary and other means of establishing a communications link between the computers may be used.
FIG. 2 is a flow diagram for restoring the application data 106 of the application program 108 according to the system illustrated in FIG. 1. At 202, the recovery program 110 initiates change tracking, generating change tracking data 112. In one embodiment, the change tracking data 112 is written to a data structure of the memory of the application server 102. In this embodiment, the change tracking data 112 does not include the actual data written to the storage device, only an indication of the blocks of the storage device that have been written since the last backup (backup data N 116D). Advantageously, because the actual changed data is not written to the change tracking data 112, the recovery program 110 minimizes the use of system resources.
At 204, a system failure related to the application program 108 is detected. For example, the application program 108 may be completely or partially unavailable or some important data (e.g. a SQL table) is missing. After the system failure is detected, the recovery process is initiated by determining the type of system failure at steps 206 and 214.
Generally, system failures may be classified into the following three categories: corruption, crash, and disaster. Corruption is a logical corruption of the application data 106 and includes transient hardware errors, bit flips, and the accidental deletion of data by users. In this case, the storage device 114 of the application server 102 (including the change tracking data 112) is intact and the operating system is executing normally. A crash results when the application server 102 has crashed or unexpectedly shut down, which results in an unmanaged reboot of the operating system. In this case, the application data 106 and/or the change tracking data 112 may be corrupted, but the storage device 114 is in a healthy state. The last category, disaster, occurs when the storage device 114 of the application server 102 has failed and has been replaced.
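The three categories, together with the checks at steps 206 and 214, can be sketched as a small decision function. The function name, the use of a volume identifier comparison as the replacement test, and the boolean `change_tracking_available` flag are illustrative assumptions; an actual recovery program may rely on other signals.

```python
from enum import Enum


class FailureType(Enum):
    CORRUPTION = "corruption"
    CRASH = "crash"
    DISASTER = "disaster"


def classify_failure(volume_id_at_backup, current_volume_id, change_tracking_available):
    """Classify a system failure from two assumed inputs.

    - A changed volume identifier is taken to mean the storage device was replaced.
    - `change_tracking_available` is False when the device was left marked dirty,
      i.e. no clean shutdown occurred and the tracking data cannot be trusted.
    """
    if current_volume_id != volume_id_at_backup:
        return FailureType.DISASTER    # step 206: full backup data is requested
    if not change_tracking_available:
        return FailureType.CRASH       # step 216: tracking data must be regenerated
    return FailureType.CORRUPTION      # step 218: tracking data can be used as-is


print(classify_failure("vol-A", "vol-B", True))   # FailureType.DISASTER
print(classify_failure("vol-A", "vol-A", False))  # FailureType.CRASH
print(classify_failure("vol-A", "vol-A", True))   # FailureType.CORRUPTION
```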
At 206, a check is made to determine if the storage device 114 has been replaced. If so, the type of system failure is a disaster and a request for the backup data (e.g. backup data 116A, 116B, 116C, 116D) associated with one recovery point is made to the backup server 104 at 208. In an embodiment, the volume identifier of the storage device 114 is checked to see if it has changed since the last backup (e.g. backup data N 116D). If so, the storage device 114 has been replaced. Alternatively, it may be determined whether the storage device has been replaced by checking a unique signature of the storage device, by administrator input to the recovery program, or by checking an RFID tag associated with the storage device. At 210, the backup server 104 transfers the backup data (e.g. backup data 116A, 116B, 116C, 116D) corresponding to the requested recovery point to the application server 102. For example, backup data 116A is associated with a first recovery point, backup data 116D is associated with an nth recovery point, and backup data 116C is associated with an (n-1)th recovery point. At 212, the backup data 116A, 116B, 116C, 116D associated with the requested recovery point is applied to the application data 106 to complete the recovery of the application data 106.
If it is determined that the storage device has not been replaced at 206, a check is made to see if the change tracking data (CTD) 112 is available on the application server 102 at 214. In an embodiment, the recovery program marks the storage device 114 as dirty to indicate that a clean shutdown has not occurred when change tracking is initiated at 202. At the time of a clean shutdown, the storage device 114 is marked as clean to indicate that the change tracking data 112 is available and that a clean shut down has occurred. If the recovery program detects that the storage device 114 has a dirty status, a crash type system failure has occurred and the change tracking data 112 should not be used.
If the change tracking data 112 is available, a corruption type system failure has occurred and the process continues at 218. If the change tracking data is not available, a crash type system failure has occurred and the change tracking data 112 needs to be generated at 216. In an embodiment, the change tracking data 112 is generated by comparing a checksum for the application data 106 to a checksum for the last backup, backup data N 116D. The checksum is compared on a block by block basis. For example, if the checksum for a block of the application data 106 is different than the checksum for the corresponding block of backup data N 116D, the block has been changed and an indication is written to the change tracking data 112.
At 218, file change data (FCD) 122 is generated for each file of the application data 106 as a function of the file system and the application data 106. The file change data 122 indicates the blocks of application data 106 that have changed since the last backup, backup data N 116D. In one embodiment, the application indicates the files belonging to the application, while the file system of the storage device indicates which blocks on the storage device belong to each file; thus it is possible to determine which blocks belong to the application data. The intersection of the application data blocks with the change tracking data determines the changed blocks for the application data 106 since the last backup.
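A minimal sketch of regenerating change tracking data by block-by-block checksum comparison follows. The choice of SHA-256, the tiny 4-byte block size, and the function names are assumptions for illustration; the description does not name a checksum algorithm or say where the backup checksums are stored.

```python
import hashlib


def block_checksums(data, block_size):
    """Return one checksum per fixed-size block of `data` (bytes)."""
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]


def regenerate_change_tracking(app_data, backup_checksums, block_size):
    """Return the set of block indices whose checksum differs from the last backup."""
    current = block_checksums(app_data, block_size)
    changed = set()
    for index in range(max(len(current), len(backup_checksums))):
        cur = current[index] if index < len(current) else None
        old = backup_checksums[index] if index < len(backup_checksums) else None
        if cur != old:  # a block that exists on only one side also counts as changed
            changed.add(index)
    return changed


backup = b"AAAABBBBCCCCDDDD"   # stand-in for the last backup
current = b"AAAAXXXXCCCCDDDY"  # stand-in for the application data after a crash
changed = regenerate_change_tracking(current, block_checksums(backup, 4), 4)
print(sorted(changed))  # [1, 3]
```

Only the checksums, not the backup blocks themselves, need to be compared against the application data, which is why this step can avoid a full data transfer.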
Examples of file systems include FAT (File Allocation Table), NTFS (New Technology File System), HFS (Hierarchical File System), HFS+ (Hierarchical File System Plus), ext2 (second extended file system) and ext3 (third extended file system). Therefore, it is possible to determine which blocks belong to the application data 106 as a function of the application data 106 and the file system of the storage device 114. Furthermore, the change tracking data 112 indicates the blocks of the storage that have changed since the last backup. Therefore, it is possible to generate file change data 122 (the blocks of the files of the application data that have changed since the last backup) as a function of the change tracking data 112 and the determined blocks belonging to the application data 106.
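The intersection described above can be sketched in a few lines. The mapping from file name to block list stands in for file system metadata, and the file names and block numbers are hypothetical.

```python
def generate_file_change_data(file_block_map, changed_blocks):
    """Intersect each application file's blocks with the tracked changed blocks.

    file_block_map: dict of file name -> list of block indices (assumed to come
                    from file system metadata for the files the application owns).
    changed_blocks: set of block indices from the change tracking data.
    Returns a dict of file name -> sorted changed blocks, omitting unchanged files.
    """
    file_change_data = {}
    for name, blocks in file_block_map.items():
        hit = sorted(set(blocks) & changed_blocks)
        if hit:
            file_change_data[name] = hit
    return file_change_data


file_block_map = {"mail.edb": [10, 11, 12, 13], "mail.log": [20, 21]}
changed_blocks = {3, 11, 13, 21, 40}  # blocks 3 and 40 belong to no application file
print(generate_file_change_data(file_block_map, changed_blocks))
# {'mail.edb': [11, 13], 'mail.log': [21]}
```

Blocks that changed on the volume but belong to no application file are dropped, so later steps only move data that the application actually owns.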
At 220, a first file difference record 124 (FDR) is generated as a function of the application data 106 and the file change data. The first file difference record 124 is used by the backup server 104 to construct a copy of the current state of the application data 106 (PD1). In an embodiment, the first file difference record 124 data structure contains data changed between PD1 and last backup (e.g. backup data N 116D). It is possible to construct the second version of PD1 by applying the first file difference record 124 to the last backup (e.g. backup data N 116D).
At 222, the first file difference record 124 and the file change data 122 for each file of the application data 106 are sent to the backup server 104. At 224, the backup server 104 applies the first file difference record 124 to the last backup (e.g. backup data N 116D) to create a new version of the application data 106 to preserve the current state of the application data 106 (PD1) before attempting a recovery. The first file difference record 124 can be kept on the application server 102 for duration of recovery and discarded on the successful completion of the recovery procedure. Furthermore, the size of the first file difference record 124 is only a fraction of the size of whole PD1. Advantageously, this optimization reduces time taken for backup and consumes a fraction of disk space consumed by PD1.
The PD1 data may be preferred to a previous recovery point in case the backup application fails to recover the desired recovery point during the recovery procedure. The PD1 data may also be used for doing analysis of what went wrong or to salvage additional data which could not be recovered by the recovery program 110. Some administrators skip this step to reduce recovery time and overwrite the application data 106, but this can be disastrous if recovery from the backup fails. Advantageously, because only the changed data represented by the first file difference record 124 is transferred from the application server 102 to the backup server 104, the time to perform this preservation step is minimized. Alternatively, step 224 can be executed post recovery (after step 228) to save application downtime.
At 226, the backup server 104 generates a second file difference record 124 as a function of the last backup data (e.g. backup data N 116D) and the file change data. The second file difference record 124 is the data applied to the application data 106 to restore that data so it is consistent with the backup data. At 228, the second file difference record 124 is transferred to the application server 102. The size of the second file difference record 124 is only a fraction of the size of whole last backup. Advantageously, this optimization reduces time taken for backup and consumes a fraction of disk space consumed by last backup (e.g. backup data N 116D).
At 230, the application server 102 applies the second file difference record 124 to the application data 106 to recover the data and make the application data 106 consistent with the last backup. Because only the changed data represented by the second file difference record 124 is transferred from the backup server 104 to the application server 102, the time, space and bandwidth requirements for recovery may be reduced by a large factor (see Appendix A). Advantageously, reduction in recovery time reduces the downtime for the application and improves business efficiency.
After recovery, protection needs to be set up again to protect against future system failures. This step is done after the application is online, so it is not part of the recovery downtime, but it is part of the overall recovery process. Because the backup server already has a copy of the backup data associated with the recovery point and all changes to the application data are tracked after recovery, it is not necessary to perform a full backup to restart the protection process. Furthermore, this optimization is independent of the type of failure. The next express backup can be achieved by synchronizing the first recovery point post system failure to the backup data recovered. Then, the change tracking data is used by the application server to generate the file change data and file difference records. The file change data and file difference records are transferred to the backup server. Next, the backup server applies the file difference records to the backup data to generate the next recovery point after the system failure. Advantageously, the need to perform a complete full backup after system recovery is eliminated and the time and network bandwidth requirements for recovery are reduced.
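The two file difference records of steps 220 through 230 can be sketched together. Here a single file is modeled as a dictionary from block index to block contents; that representation, and the function names `make_fdr` and `apply_fdr`, are simplifying assumptions, and the sketch ignores files that grew, shrank, or were deleted.

```python
def make_fdr(source_blocks, changed):
    """Copy only the changed blocks out of `source_blocks` (dict: index -> bytes)."""
    return {b: source_blocks[b] for b in changed if b in source_blocks}


def apply_fdr(target_blocks, fdr):
    """Return a copy of `target_blocks` with the blocks in `fdr` overwritten."""
    result = dict(target_blocks)
    result.update(fdr)
    return result


last_backup = {0: b"a0", 1: b"b0", 2: b"c0", 3: b"d0"}  # backup data N
app_data = {0: b"a0", 1: b"BAD", 2: b"c0", 3: b"d1"}    # PD1, after the failure
file_change_data = {1, 3}                               # blocks changed since backup

# Steps 220-224: the application server builds the first FDR from PD1 and the
# backup server applies it to the last backup to preserve the current state.
first_fdr = make_fdr(app_data, file_change_data)
preserved_pd1 = apply_fdr(last_backup, first_fdr)

# Steps 226-230: the backup server builds the second FDR from the last backup and
# the application server applies it to roll the application data back.
second_fdr = make_fdr(last_backup, file_change_data)
recovered = apply_fdr(app_data, second_fdr)

print(preserved_pd1 == app_data)  # True
print(recovered == last_backup)   # True
print(len(first_fdr), "of", len(app_data), "blocks transferred in each direction")
```

In both directions only the blocks listed in the file change data cross the network, which is the source of the savings the description claims.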
FIG. 3 is a flow diagram for creating an express backup according to an embodiment of the invention. At 302, the protection of the application data is initiated when the backup server 104 receives a first replica or backup data (e.g. backup data1) from the application server 102. The first replica is a copy of the application data 106 at some initial recovery point. At 304, backup server 104 receives a first file change data (e.g. file change data 122) from the application server 102. The first file change data indicates the blocks of application data 106 that have changed since the last backup.
At 306, the backup server 104 receives a first file difference record (e.g. file difference record 124) from the application server 102. The first file difference record is generated as a function of the application data 106 and the file change data. In an embodiment, the first file difference record data structure contains data changed since the time of the last backup (e.g. backup data N 116D). At 308, the backup server 104 applies the first file difference record to the first replica to generate a second replica of the application data 106 for a second recovery point.
Advantageously, the bandwidth and disk space needed to create the express backup are minimized because only the file difference record and file change data are transferred from the application server 102 to the backup server 104 to create the second replica. Additionally, after the file difference records have been applied to the first replica, the file difference records may be deleted. However, the file change data is kept on the application server for recovery purposes as explained above with respect to FIG. 2. Additionally, the time needed to recover the application data 106 will be proportional to the time difference between backup and recovery, which can be reduced to the desired extent by increasing the frequency of the express backups.
FIG. 4 is a flow diagram for generating file difference records for the recovery of application data 106 from the last backup according to an aspect of the invention. At 402, the backup server 104 receives a request for file difference records and file change data (e.g. file change data 122) from the application server 102. The file difference record represents data modifications between backup data corresponding to a backup recovery point and the application data 106. In an embodiment, the file change data indicates the blocks of the application data 106 that have changed since the last backup. At 404, file difference records are generated as a function of the received file change data and the last backup data (e.g. backup dataN). The generated file difference records include modified data from the last backup data as indicated by the received file change data. At 406, the generated file difference records are transferred to the application server 102. The application server 102 applies the transferred file difference records to the application data 106, restoring the application data 106 to the state of the backup data corresponding to the backup recovery point.
Referring to FIG. 5, at 502, the backup server 104 receives a request for file difference records associated with a target recovery point and file change data (e.g. file change data 122) from the application server 102. A recovery point corresponds to a version (or state) of the backup data (e.g. backup dataN, backup dataN-1, backup data1). In an embodiment, the received file change data indicates the blocks of the application data 106 that have changed since the last backup. At 504, the current recovery point is initialized to the last recovery point and, at 506, the total file change data is initialized to the received file change data (e.g. file change data 122).
At 508, it is determined if the current recovery point is equal to the target recovery point. If the current recovery point is not equal to the target recovery point, at 510, the current recovery point is set to the next previous recovery point. At 512, the total file change data is calculated as the total file change data unioned with the file change data (e.g. file change dataN) of the current recovery point. Steps 508-512 are repeated until the current recovery point is equal to the target recovery point.
If the current recovery point is equal to the target recovery point, at 514, the file difference records are generated as a function of the total file change data and the backup data associated with the target recovery point. In an embodiment, the total file change data indicates the blocks of the application data 106 that have changed since the target recovery point. The generated file difference records include data for each file from the backup data associated with the target recovery point that is different from the current state of the application data 106 as indicated by the total file change data.
At 516, the generated file difference records are transferred to the application server 102. The application server 102 applies the file difference records to the application data 106 to recover the application to the target recovery point. After the file difference records have been applied to the application data 106, the state of the application data will be in the same state as the backup data at the target recovery point. Advantageously, because only the file difference records containing changes between the target recovery point and the current state of the application data are transferred, the amount of data transferred from the backup server to the application server is minimized. Thus, the bandwidth needed to transfer the data and the time needed to recover the data to the target recovery point are also minimized.
For example, suppose there are five total recovery points available on the backup server 104. The recovery points are numbered sequentially, where the first recovery point corresponds to the oldest available backup and the fifth recovery point corresponds to the last backup before the request. Additionally, the backup server 104 receives a request from the application server 102 indicating that the target recovery point is the third recovery point and that blocks 1, 2, 4, 10 of the application data 106 have been modified since the last backup (step 502). The current recovery point is initialized to the fifth (last) recovery point (step 504) and the total file change data is initialized to the received file change data (step 506).
At step 508, it is determined that the current recovery point (5) is not equal to the target recovery point (3). Therefore, the current recovery point is set to the next previous recovery point (4) (step 510). Also, at step 512, the file change data corresponding to the fourth recovery point (6, 10, 12) is unioned to the total file change data (1, 2, 4, 10) resulting in a total file change data of (1, 2, 4, 6, 10, 12). The file change data corresponding to the fourth recovery point indicates the changes that occurred to the application data between the fifth recovery point and the fourth recovery point.
Next, the steps 508-512 are repeated for the fourth recovery point. At step 508, it is determined that the current recovery point (4) is not equal to the target recovery point (3). Therefore, the current recovery point is set to the next previous recovery point (3) (step 510). Also, at step 512, the file change data corresponding to the third recovery point (5, 7) is unioned to the total file change data (1, 2, 4, 6, 10, 12) resulting in a total file change data of (1, 2, 4, 5, 6, 7, 10, 12). The file change data corresponding to the third recovery point indicates the changes that occurred to the application data between the fourth recovery point and the third recovery point.
At step 508, it is determined that the current recovery point (3) is equal to the target recovery point (3) and the loop terminates. The total file change data of (1, 2, 4, 5, 6, 7, 10, 12) contains a list of blocks that have been changed between the current state of the application data 106 and the third recovery point. Table 1 summarizes the data of this example.
TABLE 1
Recovery Point | File change data | Total file change data
5 | 1, 2, 4, 10 (from request) | 1, 2, 4, 10
4 | 6, 10, 12 | 1, 2, 4, 6, 10, 12
3 | 5, 7 | 1, 2, 4, 5, 6, 7, 10, 12
The file difference records are generated as a function of the total file change data (1, 2, 4, 5, 6, 7, 10, 12) such that the generated file difference records include data for each file from the backup data associated with the third recovery point that is different from the current state of the application data 106 as indicated by the total file change data. The generated file difference records are transferred to the application server 102 and applied to the application data to recover the application to the third recovery point. After the file difference records have been applied to the application data 106, the state of the application data will be in the same state as the backup data at the third recovery point.
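The loop of steps 504 through 512, applied to the five-recovery-point example above, can be sketched as follows. Recovery points are modeled as consecutive integers and the per-recovery-point file change data as a dictionary keyed by recovery point number; both are assumptions made for illustration.

```python
def total_file_change_data(request_fcd, fcd_by_recovery_point, last_point, target_point):
    """Union the requested file change data with that of each earlier recovery point.

    request_fcd: blocks changed since the last backup (sent by the application server).
    fcd_by_recovery_point: dict of recovery point -> blocks changed between that
        recovery point and the next newer one (assumed layout).
    """
    if target_point > last_point:
        raise ValueError("target recovery point is newer than the last recovery point")
    current = last_point                 # step 504
    total = set(request_fcd)             # step 506
    while current != target_point:       # step 508
        current -= 1                     # step 510: move to the next previous point
        total |= set(fcd_by_recovery_point[current])  # step 512
    return total


fcd_by_recovery_point = {4: {6, 10, 12}, 3: {5, 7}}
total = total_file_change_data({1, 2, 4, 10}, fcd_by_recovery_point,
                               last_point=5, target_point=3)
print(sorted(total))  # [1, 2, 4, 5, 6, 7, 10, 12]
```

Block 10 appears in both the request and the fourth recovery point's file change data, and the union keeps it once, so it is transferred only once.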
For purposes of illustration, programs and other executable program components, such as the application program 108 and the recovery program 110, are illustrated herein as discrete blocks. It is recognized, however, that such programs and components reside at various times in different storage components of the portable application server 102, and are executed by the data processor(s) of the devices.
The order of execution or performance of the operations in embodiments of the invention illustrated and described herein is not essential, unless otherwise specified. That is, the operations may be performed in any order, unless otherwise specified, and embodiments of the invention may include additional or fewer operations than those disclosed herein. For example, it is contemplated that executing or performing a particular operation before, contemporaneously with, or after another operation is within the scope of aspects of the invention.
Embodiments of the invention may be implemented with computer-executable instructions. The computer-executable instructions may be organized into one or more computer-executable components or modules. Aspects of the invention may be implemented with any number and organization of such components or modules. For example, aspects of the invention are not limited to the specific computer-executable instructions or the specific components or modules illustrated in the figures and described herein. Other embodiments of the invention may include different computer-executable instructions or components having more or less functionality than illustrated and described herein.
When introducing elements of aspects of the invention or the embodiments thereof, the articles “a,” “an,” “the,” and “said” are intended to mean that there are one or more of the elements. The terms “comprising,” “including,” and “having” are intended to be inclusive and mean that there may be additional elements other than the listed elements.
Having described aspects of the invention in detail, it will be apparent that modifications and variations are possible without departing from the scope of aspects of the invention as defined in the appended claims. As various changes could be made in the above constructions, products, and methods without departing from the scope of aspects of the invention, it is intended that all matter contained in the above description and shown in the accompanying drawings shall be interpreted as illustrative and not in a limiting sense.
APPENDIX A
  • Experimental results:
  • In order to measure the performance gains of this approach, some experiments were conducted on an email server. The experiment below assumes a corruption system failure. Below is the summary of the findings:
  • Size of protected data source: 200 GB
  • Churn rate (Logs generated): 5% per day (10 GB per day)
  • Frequency of full backups: once per day
  • Corresponding changes in database file: 30 GB
  • Size of full backup without optimizations: 200 GB
  • Size of backup with optimizations: 30 GB
  • Percentage gain: 85%
  • Recovery is done 12 hours after the last full backup
  • Amount of logs: 5 GB
  • Changes to database: 15 GB
  • Step 1 (Backup of data source on application server)
  • Data transferred without proposed optimization: 200 GB
  • Data transferred with proposed optimizations: 15 GB
  • Step 2 (Recovery of data source from backup server)
  • Data transferred without proposed optimization: 200 GB
  • Data transferred with proposed optimizations: 15 GB
  • Step 3 (Re-protect data)
  • Data transferred without proposed optimization: 200 GB
  • Data transferred with proposed optimization: 15 GB
  • Overall percentage gain: 92.5%
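The appendix does not state how the percentage gains were computed. A plausible reading, assumed here, is that the gain is the reduction in data transferred relative to the unoptimized baseline, with the overall figure taken across the three steps combined. The short Python sketch below reproduces the reported figures under that assumption; the function name `percentage_gain` is hypothetical.

```python
def percentage_gain(baseline_gb, optimized_gb):
    """Return the reduction in data transferred, as a percentage of the baseline."""
    return 100.0 * (baseline_gb - optimized_gb) / baseline_gb


# Daily backup: 200 GB without optimizations versus 30 GB with them.
backup_gain = percentage_gain(200, 30)

# Steps 1 to 3: (without optimization, with optimization), in GB.
steps = [(200, 15), (200, 15), (200, 15)]
overall_gain = percentage_gain(sum(b for b, _ in steps),
                               sum(o for _, o in steps))

print(backup_gain, overall_gain)  # 85.0 92.5
```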

Claims (15)

1. A method for preserving application data on a storage device of a system at the approximate time of a failure of the system by determining changed application data blocks of a storage device since the last backup where the application data is stored as backup data, said application data being associated with an application program and located on said storage device, said method comprising:
determining the changed application data blocks on the storage device as a function of the application data and the backup data, wherein said determining comprises:
receiving an indication that a system failure has occurred;
determining a checksum for each block of application data as it exists on the storage device of the system at the time of receiving the system failure indication;
receiving a checksum for each corresponding block of backup data from a backup server, said backup data comprising a copy of the application data stored on the backup server at a time prior to the time the indication of a system failure was received; and
determining the changed application blocks of the application data corresponding to the determined checksums of blocks of the application data on the storage device of the system to the received checksums of the blocks of backup data stored on the backup server;
in response to the received indication that the system failure has occurred, generating file change data for each file of the application data on the storage device of the system corresponding to the determined changed application blocks on the storage device of the system and file system metadata, wherein the file system metadata includes information about file system write operations to the storage device since the last backup, said file change data indicating the blocks of each file that have been modified on the storage device of the system since the last backup of the application data was stored on the backup server;
generating a first file difference record for each file of the application data on the storage device of the system as a function of the generated file change data, said first file difference record indicating the modifications to each file of the application data on the storage device since the last backup of the application data was stored on the backup server;
transferring the generated first file difference record to the backup server, said backup server applying the first file difference record to the last backup of the application data stored on the backup server to generate a replica of the application data on the backup server, said backup server retaining the first file difference record;
receiving a second file difference record for each changed file corresponding to the generated file change data, the second file difference record indicating changes between the transferred first file difference record corresponding to the application data on the storage device and the replica; and
applying the second file difference record to the application data such that application data on the storage device corresponds to the state of replica data.
2. The method of claim 1, wherein the determining the changed blocks of the storage device further comprises tracking blocks on the storage device including application data that have been changed since the last backup of the application data to the backup server.
3. The method of claim 2, wherein the indication that a system failure had occurred comprises one or more of the following: the application program is completely unavailable, the application program is partially unavailable and at least part of the application data is missing.
4. The method of claim 1, wherein the indication that a system failure has occurred comprises one or more of the following: the application program is completely unavailable, the application program is partially unavailable and at least part of the application data is missing.
5. The method of claim 1, wherein the application program comprises one or more of the following: an email application and a database application.
6. The method of claim 1, wherein the file system of the storage device comprises one or more of the following: FAT (File Allocation Table), NTFS (New Technology File System), HFS (Hierarchical File System), HFS+ (Hierarchical File System Plus), ext2 (second extended file system) and ext3 (third extended file system).
7. The method of claim 1, further comprising transferring backup data to the backup server, said backup data associated with the application program and being used by the backup server to create a replica of the application data at a point in time prior to the system failure, said replica being a copy of the application data on the system and representing the state of application data at a point in time.
8. The method of claim 1 wherein the system failure comprises at least one of a corruption of the system, a crash of the system, and a disaster of the system.
9. The method of claim 8 wherein the corruption of the system comprises a corruption of application data of the system, wherein the crash of the system comprises a crash of the application server of the system, and the disaster of the system comprises a failure of the storage device of the system.
10. A system for preserving application data on a primary computing device coupled to a data communication network, said application data associated with an application program, said application data located on a storage device of a primary computing device to a state associated with replica data, said replica data representing a state of the application data at a point in time prior to a failure of the computing device, said system comprising:
a backup computing device coupled to the network;
the primary computing device configured for executing computer-readable instructions to perform steps comprising:
transferring application data on the storage device of the primary computing device as backup data to the backup computing device via the network, said backup data being used by the backup computing device to create a replica of the application data at a point in time prior to a failure, said replica representing the state of the application data at a point in time prior to a failure;
receiving an indication that a system failure has occurred;
determining a checksum for each block of application data as it exists on the storage device of the primary computing device at the time of receiving the system failure indication;
receiving a checksum for each corresponding block of backup data from the backup computing device via the network, said backup data comprising a copy of the application data stored on the backup computing device at the approximate time the indication of the system failure was received;
determining the changed application blocks of the application data corresponding to the determined checksums of blocks of the application data on the storage device of the primary computing device to the received checksums of the blocks of backup data stored on the backup computing device;
in response to receiving the indication that the failure occurred, generating file change data for each file of the application data on the storage device of the primary computing device corresponding to the determined changed application blocks on the storage device of the primary computing device and file system metadata, said generated file change data indicating the blocks of a file associated with the application data that have been modified on the storage device of the primary computing device since the time of the last replica of the application data;
transferring the generated file change data to the backup computing device as backup data via the network;
generating a first file difference record for each file of the application data on the storage device of the primary computing device corresponding to the transferred file change data, the first file difference record indicating changes between the application data on the storage device of the primary computing device and the replica on the backup computing device corresponding to the transferred file change data;
applying the first file difference record to the application data on the storage device of the primary computing device such that application data corresponds to the state of replica data;
receiving a second file difference record for each file of the application data on the storage device of the primary computing device corresponding to the file change data, said second file difference record indicating the modifications to each file since the last transfer of application data as backup data to the backup computing device; and
transferring the second file difference record to the primary computing device, said primary computing device applying the second file difference record to the application data stored on the storage device of the primary computing device.
11. The system of claim 10, wherein the indication that a failure occurs comprises one or more of the following: the application program is completely unavailable, the application program is partially unavailable and at least part of the application data is missing.
12. The system of claim 10, wherein the application program comprises one or more of the following: an email application and a database application.
13. The system of claim 10, wherein the file system of the storage device comprises one or more of the following: FAT (File Allocation Table), NTFS (New Technology File System), HFS (Hierarchical File System), HFS+ (Hierarchical File System Plus), ext2 (second extended file system) and ext3 (third extended file system).
14. The system of claim 10, wherein the failure comprises at least one of a corruption of the primary computing device, a crash of the primary computing device, and a disaster of the primary computing device.
15. The method of claim 14, wherein the corruption of the primary computing device comprises a corruption of application data of the primary computing device, wherein the crash of the primary computing device comprises a crash of the application server of the primary computing device, and the disaster of the primary computing device comprises a failure of the storage device of the primary computing device.
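The checksum comparison recited in claims 1 and 10 can be illustrated with a short Python sketch. It is a simplified example under stated assumptions and does not define or limit the claims: the claims name neither a checksum algorithm nor a block size, so SHA-256 and a 4096-byte block are assumed, and the function names `block_checksums` and `changed_blocks` are hypothetical. In the claimed arrangement the backup checksums would be received from the backup server; here both sides are computed locally for demonstration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size; the claims do not specify one


def block_checksums(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 hex digest per fixed-size block of data."""
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]


def changed_blocks(local, backup):
    """Return indices of blocks whose checksums differ or that exist on one side only."""
    longest = max(len(local), len(backup))
    return [i for i in range(longest)
            if i >= len(local) or i >= len(backup) or local[i] != backup[i]]


# Demonstration: four zero-filled blocks, with blocks 1 and 3 modified afterwards.
backup_data = bytes(4 * BLOCK_SIZE)
current = bytearray(backup_data)
current[BLOCK_SIZE] = 1
current[3 * BLOCK_SIZE + 10] = 7

changed = changed_blocks(block_checksums(bytes(current)),
                         block_checksums(backup_data))
print(changed)  # [1, 3]
```

Only the indices of the differing blocks need to be carried forward, which is what allows the file change data to be limited to the modified portions of each file.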
US11/616,686 2006-12-27 2006-12-27 Optimizing backup and recovery utilizing change tracking Expired - Fee Related US7685189B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US11/616,686 US7685189B2 (en) 2006-12-27 2006-12-27 Optimizing backup and recovery utilizing change tracking

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US11/616,686 US7685189B2 (en) 2006-12-27 2006-12-27 Optimizing backup and recovery utilizing change tracking

Publications (2)

Publication Number Publication Date
US20080162599A1 US20080162599A1 (en) 2008-07-03
US7685189B2 true US7685189B2 (en) 2010-03-23

Family

ID=39585510

Family Applications (1)

Application Number Title Priority Date Filing Date
US11/616,686 Expired - Fee Related US7685189B2 (en) 2006-12-27 2006-12-27 Optimizing backup and recovery utilizing change tracking

Country Status (1)

Country Link
US (1) US7685189B2 (en)

Legal Events

Date Code Title Description
AS Assignment

Owner name: MICROSOFT CORPORATION, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:MITTAL, HARSHWARDHAN;NARKHEDE, KUSHAL SURESH;REEL/FRAME:018686/0763

Effective date: 20061221

Owner name: MICROSOFT CORPORATION,WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:MITTAL, HARSHWARDHAN;NARKHEDE, KUSHAL SURESH;REEL/FRAME:018686/0763

Effective date: 20061221

STCF Information on status: patent grant

Free format text: PATENTED CASE

FPAY Fee payment

Year of fee payment: 4

AS Assignment

Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MICROSOFT CORPORATION;REEL/FRAME:034542/0001

Effective date: 20141014

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552)

Year of fee payment: 8

FEPP Fee payment procedure

Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

LAPS Lapse for failure to pay maintenance fees

Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STCH Information on status: patent discontinuation

Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362

FP Lapsed due to failure to pay maintenance fee

Effective date: 20220323