CN103415842A - Systems and methods for data management virtualization - Google Patents

Publication number
CN103415842A
Authority
CN
China
Prior art keywords
data
time
copy
hash
time state
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2011800617167A
Other languages
Chinese (zh)
Other versions
CN103415842B (en)
Inventor
A·阿述托什
C·A·普罗文扎诺
D·F·常
P·J·阿伯尔克罗姆比埃
M·穆塔里克
M·A·罗曼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Actifio Inc
Original Assignee
Actifio Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US 12/947,436 (US8904126B2)
Priority claimed from US 12/947,393 (US8788769B2)
Priority claimed from US 12/947,383 (US8396905B2)
Priority claimed from US 12/947,438 (US8299944B2)
Priority claimed from US 12/947,375 (US8843489B2)
Priority claimed from US 12/947,385 (US9858155B2)
Priority claimed from US 12/947,513 (US8417674B2)
Priority claimed from US 12/947,418 (US8402004B2)
Application filed by Actifio Inc
Publication of CN103415842A
Application granted
Publication of CN103415842B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1448Management of the data involved in backup or backup restore
    • G06F11/1453Management of the data involved in backup or backup restore using de-duplication of the data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1456Hardware arrangements for backup
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608Saving storage space on storage systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/064Management of blocks
    • G06F3/0641De-duplication techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0683Plurality of storage devices
    • G06F3/0685Hybrid storage combining heterogeneous device types, e.g. hierarchical storage, hybrid arrays
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1448Management of the data involved in backup or backup restore
    • G06F11/1451Management of the data involved in backup or backup restore by selection of backup contents
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1458Management of the backup or restore process
    • G06F11/1469Backup restoration techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/84Using snapshots, i.e. a logical point-in-time copy of the data

Abstract

Systems and methods for data management virtualization are disclosed. The systems have a data management engine for performing data management functions, including at least a snapshot function and a back-up function, and a service level policy engine that controls the scheduling of the data management functions using service level agreements in electronic form. Data objects are organized in a deduplicated content store using an organized arrangement of temporal structures to represent states of data over time. Hash signatures are used to track content segments for performing deduplicated copies from a first deduplicated store to a second deduplicated store. Garbage collection is performed on the deduplicated content store for only content segments that have changed relative to an immediately-prior state of a given data object.

Description

Systems and methods for data management virtualization
Cross-reference to related applications
This application claims priority to the following U.S. patent applications, each filed on November 16, 2010: No. 12/947,393, entitled "System and Method for Performing Backup or Restore Operations Utilizing Difference Information and Timeline State Information"; No. 12/947,385, entitled "System and Method for Managing Data with Service Level Agreements That May Specify Non-Uniform Copying of Data"; No. 12/947,436, entitled "System and Method for Performing a Plurality of Prescribed Data Management Functions in a Manner That Reduces Redundant Access Operations to Primary Storage"; No. 12/947,418, entitled "System and Method for Creating Deduplicated Copies of Data by Tracking Temporal Relationships Among Copies and by Ingesting Difference Data"; No. 12/947,375, entitled "System and Method for Managing Deduplicated Copies of Data Using Temporal Relationships Among Copies"; No. 12/947,383, entitled "System and Method for Improved Garbage Collection Operations in a Deduplicated Store by Tracking Temporal Relationships Among Copies"; No. 12/947,513, entitled "System and Method for Creating Deduplicated Copies of Data by Sending Difference Data Between Two Near-Neighbor Temporal States"; and No. 12/947,438, entitled "System and Method for Creating Deduplicated Copies of Data Storing Non-Lossy Encodings of Data Directly in a Content Addressable Store".
Technical field
The present invention relates generally to data management, data protection, disaster recovery and business continuity. More specifically, the present invention relates to systems and methods for performing restore operations using difference information and timeline state information.
Background
The business requirements for managing the lifecycle of application data have traditionally been met by deploying multiple point solutions, each of which addresses one part of the lifecycle. This has resulted in a complex and expensive infrastructure in which multiple copies of data are created and moved multiple times to individual storage repositories. The adoption of server virtualization has become a catalyst for simple, agile and low-cost compute infrastructure. This has led to larger deployments of virtual hosts and storage, further widening the gap between the emerging compute models and current data management implementations.
Applications that provide business services depend on storage of their data at various stages of its lifecycle. Fig. 1 shows a typical set of data management operations that would be applied to the data of an application, for example a database underlying a business service such as payroll management. In order to provide a business service, application 102 requires primary data storage 122 with some contracted level of reliability and availability.
Backups 104 are made to guard against corruption or loss of the primary data storage through hardware or software failure or human error. Typically, backups may be made daily or weekly to local disk or tape 124, and moved less frequently (weekly or monthly) to a remote, physically secure location 125.
Concurrent development and testing 106 of new applications based on the same database requires a development team to have access to another copy of the data 126. Such a snapshot might be made weekly, depending on development schedules.
Compliance 108 with legal or voluntary policies may require that some data be retained for safe future access for some number of years; usually, data is copied regularly (say, monthly) to a long-term archiving system 128.
Disaster recovery services 110 guard against catastrophic loss of data if the systems providing primary business services fail due to some physical disaster. Primary data is copied (130) to a physically distinct location as frequently as is feasible given other constraints (such as cost). In the event of a disaster, the primary site can be reconstructed and the data moved back from the safe copy.
Business continuity services 112 provide a facility for ensuring continued business services should the primary site become compromised. Usually this requires a hot copy 132 of the primary data that is kept in near lockstep with the primary data, as well as duplicate systems and applications and mechanisms for switching incoming requests to the business continuity servers.
Thus, data management is currently a collection of point applications managing different parts of the lifecycle. This is an artifact of the evolution of data management solutions over the last two decades.
Brief description of the drawings
Fig. 1 is a simplified diagram of current methods used to manage the data lifecycle for a business service.
Fig. 2 is an overview of the management of data throughout its lifecycle by a single data management virtualization system.
Fig. 3 is a simplified block diagram of the data management virtualization system.
Fig. 4 is a view of the data management virtualization engine.
Fig. 5 illustrates the object management and data movement engine.
Fig. 6 illustrates the storage pool manager.
Fig. 7 illustrates the decomposition of a service level agreement.
Fig. 8 illustrates the application specific module.
Fig. 9 illustrates the service policy manager.
Fig. 10 is a flowchart of the service policy scheduler.
Fig. 11 is a block diagram of the content addressable storage (CAS) provider.
Fig. 12 shows the definition of an object handle within the CAS system.
Fig. 13 shows the data model and operations for the temporal relationship graph stored for objects within the CAS.
Fig. 14 is a diagram representing the operation of a garbage collection algorithm in the CAS.
Fig. 15 is a flowchart of the operation of copying an object into the CAS.
Fig. 16 is a system diagram of a typical deployment of the data management virtualization system.
Fig. 17 is a schematic diagram of a characteristic physical server device for use with the data management virtualization system.
Detailed description
Current data management architectures and implementations such as those described above involve multiple applications that address different parts of data lifecycle management, all of which perform certain common functions: (a) make a copy of the application data (the frequency of this action is commonly termed the recovery point objective (RPO)), (b) store the copy in an exclusive storage repository, typically in a proprietary format, and (c) retain the copy for a certain duration, measured as the retention time. The primary differences among the point solutions are the frequency of the RPO, the retention time, and the characteristics of the individual storage repositories used, including capacity, cost and geographic location.
This disclosure pertains to data management virtualization. Data management activities such as backup, replication and archiving are virtualized in that they do not have to be configured and run individually and separately. Instead, the user defines the business requirements for the lifecycle of the data, and the data management virtualization system performs these operations automatically. A snapshot is taken from primary storage to secondary storage; this snapshot is then used for a backup operation to other secondary storage. Essentially any number of these backups may be made, providing the level of data protection specified by a service level agreement.
This disclosure also pertains to a system for backing up data from a first storage pool to a second storage pool using difference information between time-states. The system incorporates a data management engine for performing data management functions, including at least a backup function that creates a backup copy of data. The data management engine is operable to execute a sequence of snapshot operations to create point-in-time images of application data on the first storage pool. Each successive point-in-time image corresponds to a specific, successive time-state of the application data, and each snapshot operation creates difference information indicating which application data has changed, and the content of the changed application data, for the corresponding time-state. The data management engine is also operable to execute at least one backup function for the application data that is scheduled to execute at non-consecutive time-states, and to maintain history information that includes time-state information indicating the time-state of the last backup function performed on the application data for the corresponding backup copy of data. The data management engine creates composite difference information from the difference information for each time-state between the time-state of the last backup function performed on the application data and the time-state of the currently scheduled backup function, and sends the composite difference information to the second storage pool, where it is compiled with the backup copy of data at the last time-state to create a backup copy of data for the current time-state.
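The composite-difference step described above can be sketched as follows. This is a minimal illustration only, assuming that each snapshot's difference information is a mapping from block number to new block content; that representation and the names `composite_diff` and `apply_diff` are hypothetical and do not come from the disclosure.

```python
from typing import Dict, List

# Assumed representation: the difference information for one time-state maps
# each changed block number to that block's new content.
Diff = Dict[int, bytes]


def composite_diff(diffs_by_state: List[Diff], last_backup: int, current: int) -> Diff:
    """Merge the per-state diffs for time-states last_backup+1 .. current.

    diffs_by_state[t] holds the changes made between time-state t-1 and t.
    When a block changed more than once, the latest content wins.
    """
    merged: Diff = {}
    for t in range(last_backup + 1, current + 1):
        merged.update(diffs_by_state[t])
    return merged


def apply_diff(backup: Dict[int, bytes], diff: Diff) -> Dict[int, bytes]:
    """Compile a diff into an existing backup copy without mutating it."""
    updated = dict(backup)
    updated.update(diff)
    return updated


# Snapshots exist for every time-state, but backups run only at states 0 and 3.
diffs = [
    {},                      # state 0: baseline
    {1: b"B1"},              # state 1
    {1: b"B2", 2: b"C1"},    # state 2
    {3: b"D1"},              # state 3
]
backup_at_0 = {0: b"A0", 1: b"B0", 2: b"C0", 3: b"D0"}
delta = composite_diff(diffs, last_backup=0, current=3)
backup_at_3 = apply_diff(backup_at_0, delta)
```

In this sketch only three blocks cross to the second pool, even though four block writes occurred across the three intermediate time-states.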
Data management virtualization technology according to this disclosure is based on an architecture and implementation that follow these guiding principles.
First, define the business requirements of an application with a service level agreement (SLA) covering its entire data lifecycle. The SLA is much more than a single RPO, retention time and recovery time objective (RTO). It describes the data protection characteristics for each stage of the data lifecycle. Each application may have a different SLA.
Second, provide a unified data management virtualization engine that manages the data protection lifecycle, moving data across the various storage repositories with improved use of storage capacity and network bandwidth. The data management virtualization system achieves these improvements by leveraging the extended capabilities of modern storage systems, by tracking the portions of the data that have changed over time, and by using deduplication and compression algorithms that reduce the amount of data that needs to be copied and moved.
Third, leverage a single master copy of the application data as the basis for multiple elements within the lifecycle. Many data management operations, such as backup, archiving and replication, depend on a stable, consistent copy of the data to be protected. The data management virtualization system leverages a single copy of the data for multiple purposes. A single instance of the data maintained by the system may serve as the source from which each data management function makes additional copies as needed. This contrasts with the traditional approach, in which application data must be copied multiple times by multiple independent data management applications.
Fourth, abstract physical storage resources into a series of data protection storage pools, which are virtualized out of different classes of storage, including local and remote disk, solid-state memory, tape and optical media, and private, public and/or hybrid storage clouds. The storage pools provide access that is independent of the type, physical location or underlying storage technology. The business requirements for the lifecycle of data may call for copying the data to different types of storage media at different times. The data management virtualization system allows the user to classify and aggregate different storage media into storage pools: for example, a quick recovery pool consisting of high-speed disks, and a cost-efficient long-term storage pool, which may be a deduplicated store on high-capacity disks or a tape library. The data management virtualization system can move data among these pools to take advantage of the unique characteristics of each storage medium. The abstraction of storage pools provides access independent of the type, physical location or underlying storage technology.
Fifth, improve the movement of data between storage pools and disaster locations by utilizing underlying device capabilities and post-deduplication application data. The data management virtualization system discovers the capabilities of the storage systems that make up the storage pools, and takes advantage of these capabilities to move data efficiently. If the storage system is a disk array that supports creating a snapshot or clone of a data volume, the data management virtualization system will take advantage of this capability and use a snapshot to make a copy of the data, rather than reading the data from one place and writing it to another. Similarly, if a storage system supports change tracking, the data management virtualization system will update an older copy with just the changes to create a new copy efficiently. When moving data across a network, the data management virtualization system uses deduplication and compression algorithms that avoid sending data that is already available on the other side of the network.
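The capability-driven choice described in the fifth principle can be illustrated with a short sketch. The `Capability` enum, the `choose_copy_method` function, the returned strings, and the ordering of the checks (snapshot preferred over change tracking) are all assumptions made for illustration; the disclosure does not specify them.

```python
from enum import Enum, auto
from typing import Set


class Capability(Enum):
    """Hypothetical capabilities a storage system might report on discovery."""
    SNAPSHOT = auto()
    CHANGE_TRACKING = auto()


def choose_copy_method(capabilities: Set[Capability], has_older_copy: bool) -> str:
    """Pick the cheapest available way to produce a new copy.

    Assumed preference order: array snapshot, then incremental update of an
    older copy from tracked changes, then a full read-and-write copy.
    """
    if Capability.SNAPSHOT in capabilities:
        return "snapshot"
    if Capability.CHANGE_TRACKING in capabilities and has_older_copy:
        return "incremental"
    return "full-copy"


disk_array = {Capability.SNAPSHOT, Capability.CHANGE_TRACKING}
tracking_only = {Capability.CHANGE_TRACKING}
plain_disk: Set[Capability] = set()
```

Note that change tracking only helps when an older copy exists to be updated; without one, this sketch falls back to a full copy.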
One key aspect of improving data movement is recognizing that application data changes slowly over time. A copy of an application made today will generally have a lot in common with the copy of the same application made yesterday. In fact, today's copy of the data can be represented as yesterday's copy plus a series of delta transformations, where the size of the delta transformations is usually much smaller than all of the data in the copy itself. The data management virtualization system captures and records these transformations in the form of bitmaps or extent lists. In one embodiment of the system, the underlying storage resources (a disk array or a server virtualization system) are capable of tracking the changes made to a volume or file; in these environments, the data management virtualization system queries the storage resources to obtain these change lists and saves them with the data being protected.
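To make the idea of representing today's copy as yesterday's copy plus delta transformations concrete, here is a small sketch using an extent list. It assumes equal-length images and byte-granular extents of the form `(offset, replacement bytes)`; these choices and the function names are illustrative only, since a real system would more likely track changes at block granularity.

```python
from typing import List, Tuple

Extent = Tuple[int, bytes]  # (byte offset, replacement bytes); assumed format


def diff_extents(old: bytes, new: bytes) -> List[Extent]:
    """Return the changed byte ranges between two equal-length images."""
    if len(old) != len(new):
        raise ValueError("this sketch assumes equal-length images")
    extents: List[Extent] = []
    start = None  # start offset of the changed run currently being scanned
    for i in range(len(new)):
        changed = old[i] != new[i]
        if changed and start is None:
            start = i
        elif not changed and start is not None:
            extents.append((start, new[start:i]))
            start = None
    if start is not None:
        extents.append((start, new[start:]))
    return extents


def apply_extents(old: bytes, extents: List[Extent]) -> bytes:
    """Rebuild the newer image from the older image plus the extent list."""
    image = bytearray(old)
    for offset, data in extents:
        image[offset:offset + len(data)] = data
    return bytes(image)


yesterday = b"AAAABBBBCCCCDDDD"
today = b"AAAAXXBBCCCCDDDY"
delta = diff_extents(yesterday, today)
rebuilt = apply_extents(yesterday, delta)
```

Here the delta carries three bytes of payload for a sixteen-byte image, which is the size relationship the paragraph above describes.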
In the preferred embodiment of the data management virtualization system, there is a mechanism for eavesdropping on the primary data access path of the application. This enables the data management virtualization system to observe which parts of the application data are modified and to generate its own bitmap of modified data. If, for example, the application modifies blocks 100, 200 and 300 during a particular period, the data management virtualization system will eavesdrop on these events and create a bitmap indicating that these particular blocks were modified. When processing the next copy of the application data, the data management virtualization system will process only blocks 100, 200 and 300, since it knows that these were the only blocks modified.
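The block 100, 200 and 300 example can be sketched as a simple change-tracking bitmap. The `ChangeTracker` class and its methods are hypothetical names; how writes are actually intercepted on the data path is outside the scope of this sketch.

```python
from typing import List


class ChangeTracker:
    """Records which blocks were written since the last copy (illustrative)."""

    def __init__(self, num_blocks: int) -> None:
        self.num_blocks = num_blocks
        self.bitmap = bytearray((num_blocks + 7) // 8)  # one bit per block

    def record_write(self, block: int) -> None:
        """Called for every write observed on the data access path."""
        if not 0 <= block < self.num_blocks:
            raise IndexError(block)
        self.bitmap[block // 8] |= 1 << (block % 8)

    def changed_blocks(self) -> List[int]:
        """Blocks that must be processed for the next copy, in ascending order."""
        return [b for b in range(self.num_blocks)
                if self.bitmap[b // 8] & (1 << (b % 8))]

    def reset(self) -> None:
        """Clear the bitmap once a copy has been taken."""
        self.bitmap = bytearray(len(self.bitmap))


tracker = ChangeTracker(num_blocks=1024)
for block in (100, 200, 300, 200):  # block 200 is written twice
    tracker.record_write(block)
to_copy = tracker.changed_blocks()
```

A bitmap records each modified block once regardless of how many times it was written, so the next copy touches three blocks out of 1024.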
In one embodiment of the system, where the primary storage for the application is a modern disk array or storage virtualization appliance, the data management virtualization system takes advantage of the point-in-time snapshot capability of the underlying storage device to make the initial copy of the data. This virtual copy mechanism is a fast, efficient and low-impact technique for creating the initial copy; it does not guarantee that all the bits will be copied or stored together. Instead, virtual copies are constructed by maintaining metadata and data structures, such as copy-on-write volume bitmaps or extents, that allow the copies to be reconstructed at access time. The copy has a lightweight impact on the application and on the primary storage device. In another embodiment, where the application is based on a server virtualization system such as VMware or Xen, the data management virtualization system uses the similar virtual machine snapshot capability built into the server virtualization system. When a virtual copy capability is not available, the data management virtualization system may include its own built-in snapshot mechanism.
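A toy copy-on-write snapshot shows how a virtual copy can be reconstructed at access time without copying every block up front. The `Volume` and `Snapshot` classes below are a simplified illustration, not a description of how any particular disk array or hypervisor implements snapshots.

```python
from typing import Dict, List


class Snapshot:
    """Point-in-time view that stores only blocks overwritten after it was taken."""

    def __init__(self, volume: "Volume") -> None:
        self.volume = volume
        self.saved: Dict[int, bytes] = {}  # block index -> content at snapshot time

    def read(self, index: int) -> bytes:
        # Unchanged blocks are read straight from the live volume.
        if index in self.saved:
            return self.saved[index]
        return self.volume.blocks[index]


class Volume:
    def __init__(self, blocks: List[bytes]) -> None:
        self.blocks = list(blocks)
        self.snapshots: List[Snapshot] = []

    def snapshot(self) -> Snapshot:
        """Creating a snapshot copies no data; it only registers metadata."""
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, index: int, data: bytes) -> None:
        # Copy-on-write: preserve the old content the first time a block is
        # overwritten after each snapshot, then update the live block.
        for snap in self.snapshots:
            snap.saved.setdefault(index, self.blocks[index])
        self.blocks[index] = data


vol = Volume([b"a", b"b", b"c"])
snap = vol.snapshot()
vol.write(1, b"B")
vol.write(1, b"BB")
```

After two writes to the same block, the snapshot holds exactly one preserved block, which is why this kind of copy has a light impact on primary storage.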
The snapshot can be used as a data primitive underlying all of the data management functions supported by the system. Because it is lightweight, the snapshot can be used as an internal operation even when the requested operation is not itself a snapshot; it is created to enable and facilitate other operations.
When a snapshot is created, certain preparatory operations may be needed in order to create a coherent snapshot or coherent image, so that the image can be restored to a state that is usable by the application. These preparatory operations need only be performed once, even if the snapshot will be leveraged across multiple data management functions in the system, such as backup copies scheduled according to a policy. The preparatory operations may include application quiescence, which includes flushing data caches and freezing the state of the application. They may also include other operations known in the art, and other operations useful for retaining a complete image, such as collecting metadata from the application to be stored with the image.
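The "prepare once, reuse many times" pattern might look like the following sketch. `PayrollApp`, `quiesced` and `take_coherent_snapshot` are hypothetical stand-ins; a real application would expose its own flush, freeze and thaw mechanisms.

```python
from contextlib import contextmanager
from typing import Dict, Iterator, List


class PayrollApp:
    """Stand-in application that records which control calls it received."""

    def __init__(self) -> None:
        self.calls: List[str] = []
        self.frozen = False
        self.data: Dict[str, int] = {"rows": 3}

    def flush_caches(self) -> None:
        self.calls.append("flush_caches")

    def freeze(self) -> None:
        self.frozen = True
        self.calls.append("freeze")

    def thaw(self) -> None:
        self.frozen = False
        self.calls.append("thaw")

    def metadata(self) -> Dict[str, str]:
        return {"type": "database"}


@contextmanager
def quiesced(app: PayrollApp) -> Iterator[None]:
    """Flush and freeze the application, and always thaw it afterwards."""
    app.flush_caches()
    app.freeze()
    try:
        yield
    finally:
        app.thaw()


def take_coherent_snapshot(app: PayrollApp) -> dict:
    """Capture data plus metadata while the application is quiescent."""
    with quiesced(app):
        return {"data": dict(app.data), "metadata": app.metadata()}


app = PayrollApp()
image = take_coherent_snapshot(app)  # quiescence happens exactly once
backup_copy = dict(image)            # the backup function reuses the image
devtest_copy = dict(image)           # so does the development/test function
```

Using a context manager ensures the application is thawed even if capturing the image raises an exception.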
Fig. 2 illustrates one way in which a virtualized data management system can address the data lifecycle requirements described earlier, in accordance with these principles.
To serve local backup requirements, a sequence of efficient snapshots is made within local high-availability storage 202. Some of these snapshots are used to serve development/test requirements without making another copy. For longer-term retention of local backups, a copy is made efficiently into long-term local storage 204, which in this implementation uses deduplication to reduce repeated copying. The copies within long-term storage may be accessed as backups or treated as an archive, depending on the retention policy applied by the SLA. A copy of the data is made to remote storage 206 in order to satisfy the requirements for remote backup and business continuity; again, a single set of copies serves both purposes. As an alternative for remote backup and disaster recovery, a further copy of the data may be made efficiently to a repository 208 hosted by a commercial or private cloud storage provider.
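How a deduplicating long-term store reduces repeated copying can be sketched with a small content-addressed store. Fixed-size chunking, SHA-256 as the hash function, and the `DedupStore` class are assumptions chosen for brevity; the disclosure does not prescribe them here.

```python
import hashlib
from typing import Dict, List


class DedupStore:
    """Keeps one copy of each distinct chunk, keyed by its hash."""

    def __init__(self, chunk_size: int = 4) -> None:
        self.chunk_size = chunk_size
        self.chunks: Dict[str, bytes] = {}

    def put(self, data: bytes) -> List[str]:
        """Store data and return the ordered list of chunk hashes describing it."""
        recipe: List[str] = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # skip chunks already stored
            recipe.append(digest)
        return recipe

    def get(self, recipe: List[str]) -> bytes:
        """Reassemble a stored copy from its list of chunk hashes."""
        return b"".join(self.chunks[d] for d in recipe)


store = DedupStore()
monday = store.put(b"AAAABBBBCCCC")
tuesday = store.put(b"AAAAXXXXCCCC")  # only the middle chunk is new
```

Two twelve-byte copies occupy four stored chunks rather than six, because the unchanged chunks are shared between them.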
The data management virtualization system
Fig. 3 illustrates the high-level components of a data management virtualization system that implements the above principles. Preferably, the system comprises the basic functional components described further below.
Application 300 creates and owns the data. This is the software system that has been deployed by the user, for example an e-mail system, a database system or a financial reporting system, in order to satisfy some computational need. The application typically runs on a server and uses storage. For illustrative purposes, only one application is shown. In reality, there may be hundreds or even thousands of applications managed by a single data management virtualization system.
Storage resources 302 are where application data is stored throughout its lifecycle. The storage resources are the physical storage assets that the user has acquired to address data storage requirements, including internal disk drives, disk arrays, optical and tape storage libraries, and cloud-based storage systems. The storage resources consist of primary storage 310, where the online, active copy of the application data is stored, and secondary storage 312, where additional copies of the application data are stored for purposes such as backup, disaster recovery, archiving, indexing, reporting and other uses. Secondary storage resources may include additional storage within the same enclosure as the primary storage, as well as storage based on similar or different storage technologies within the same data center, at another location, or across the internet.
One or more management workstations 308 allow the user to specify a service level agreement (SLA) 304 that defines the lifecycle of the application data. A management workstation is a desktop or laptop computer or a mobile computing device used to configure, monitor and control the data management virtualization system. A service level agreement is a detailed specification that captures the business requirements for the creation, retention and deletion of secondary copies of the application data. The SLA is much more than the simple RTO and RPO used in traditional data management applications to represent the frequency of copies and the anticipated restore time for a single class of secondary storage. The SLA captures multiple stages of the data lifecycle specification, and allows non-uniform frequency and retention specifications within each class of secondary storage. The SLA is described in more detail with reference to Fig. 7.
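One possible way to model an SLA with a different frequency and retention per class of secondary storage is sketched below. The `Policy` and `ServiceLevelAgreement` types, their field names, the pool names and the example intervals are all hypothetical; the actual decomposition of the SLA is the subject of Fig. 7.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class Policy:
    """Copy frequency and retention for one class of secondary storage."""
    pool: str
    frequency: timedelta
    retention: timedelta


@dataclass(frozen=True)
class ServiceLevelAgreement:
    application: str
    policies: List[Policy]


def is_copy_due(policy: Policy, last_copy: Optional[datetime], now: datetime) -> bool:
    """A copy is due if none exists yet or the policy's interval has elapsed."""
    return last_copy is None or now - last_copy >= policy.frequency


def is_expired(policy: Policy, created: datetime, now: datetime) -> bool:
    """A copy may be deleted once it is older than the policy's retention."""
    return now - created > policy.retention


sla = ServiceLevelAgreement(
    application="payroll-db",
    policies=[
        Policy("snapshot", frequency=timedelta(hours=1), retention=timedelta(days=2)),
        Policy("dedup", frequency=timedelta(days=1), retention=timedelta(days=30)),
        Policy("remote", frequency=timedelta(weeks=1), retention=timedelta(days=365)),
    ],
)
now = datetime(2011, 1, 10, 12, 0)
last_copies = {
    "snapshot": datetime(2011, 1, 10, 10, 30),
    "dedup": datetime(2011, 1, 10, 0, 0),
    "remote": None,
}
due = [p.pool for p in sla.policies if is_copy_due(p, last_copies[p.pool], now)]
```

A single RPO and retention pair could not express this example, in which the same application is copied hourly to one pool and weekly to another, each with its own retention.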
Data management virtualization engine 306 manages the entire lifecycle of the application data as specified in the SLA. It may manage a large number of SLAs for a large number of applications. The data management virtualization engine takes input from the user through the management workstation and interacts with the applications to discover each application's primary storage resources. The data management virtualization engine makes decisions about what data needs to be protected and which secondary storage resources best fulfill the protection needs. For example, if an enterprise designates its accounting data as requiring copies at very short intervals for business continuity purposes as well as for backup purposes, the engine may decide, according to an appropriate set of SLAs, to create copies of the accounting data at a short interval in a first storage pool, and to create backup copies of the accounting data at a longer interval in a second storage pool. This is determined by the business requirements of the storage application.
The engine then makes copies of the application data using the advanced capabilities of the storage resources, where available. In the example above, the engine may schedule the short-interval business continuity copy using a storage appliance's built-in virtual copy or snapshot capabilities. The data management virtualization engine moves the application data among the storage resources in order to satisfy the business requirements captured in the SLA. The data management virtualization engine is described in more detail with reference to Fig. 4.
The data management virtualization system as a whole may be deployed within a single host computer system or appliance, or it may be one logical entity that is physically distributed across a network of general-purpose and purpose-built systems. Certain components of the system may also be deployed within a computing or storage cloud.
In one embodiment of the data management virtualization system, the data management virtualization engine runs largely as multiple processes on a fault-tolerant, redundant pair of computers. Certain components of the data management virtualization engine may run close to the application, within the application servers. Other components may run close to the primary and secondary storage, within the storage fabric or in the storage systems themselves. The management stations are typically desktop and laptop computers and mobile devices that connect to the engine over a secure network.
The Data Management Virtualization Engine
Fig. 4 illustrates an architectural overview of the Data Management Virtualization Engine 306 according to certain embodiments of the invention. The Engine 306 includes the following modules:
Application Specific Module 402. This module is responsible for controlling the application 300 and collecting metadata from it. Application metadata includes information about the application, such as the type of application, details of its configuration, the location of its data stores, and its current operating state. Controlling the operation of the application includes actions such as flushing cached data to disk, freezing and thawing application I/O, rotating or truncating log files, and shutting down and restarting the application. The Application Specific Module performs these operations and sends and receives metadata in response to commands from the Service Level Policy Engine 406, described below. The Application Specific Module is described in more detail in connection with Fig. 8.
Service Level Policy Engine 406. This module acts on the SLA 304 provided by the user to make decisions regarding the creation, movement and deletion of copies of the application data. Each SLA describes the business requirements related to the protection of one application. The Service Level Policy Engine analyzes each SLA and arrives at a series of actions, each of which involves copying application data from one storage location to another. The Service Level Policy Engine then reviews these actions to determine priorities and dependencies, and schedules and initiates the data movement jobs. The Service Level Policy Engine is described in more detail in connection with Fig. 9.
Object Manager and Data Movement Engine 410. This module creates a composite object consisting of the application data, the application metadata and the SLA, and moves it through different storage pools according to instructions from the Policy Engine. The Object Manager receives instructions from the Service Level Policy Engine 406 in the form of a command to create a copy of application data in a particular pool, based either on the live primary data 413 belonging to the application 300 or on an existing copy, e.g. 415, in another pool. The copy of the composite object created by the Object Manager and Data Movement Engine is self-contained and self-describing, in that it contains not only the application data but also the application metadata and the SLA for the application. The Object Manager and Data Movement Engine is described in more detail in connection with Fig. 5.
Storage Pool Manager 412. This module is a component that adapts and abstracts the underlying physical storage resources 302 and presents them as storage pools 418. The physical storage resources are the actual storage assets, such as disk arrays and tape libraries, that the user has deployed to support the life cycle of the data of the user's applications. These storage resources may be based on different storage technologies, such as disk, tape, flash memory or optical storage. The storage resources may also have different geographic locations, cost and speed attributes, and may support different protocols. The role of the Storage Pool Manager is to combine and aggregate the storage resources and to mask the differences between their programming interfaces. The Storage Pool Manager presents the physical storage resources to the Object Manager 410 as a set of storage pools whose characteristics make them suitable for particular stages in the life cycle of application data. The Storage Pool Manager is described in more detail in connection with Fig. 6.
Object Manager and Data Movement Engine
Fig. 5 illustrates the Object Manager and Data Movement Engine 410. The Object Manager and Data Movement Engine discovers and uses the virtual storage resources 510 presented to it by the Pool Managers 504. It accepts requests from the Service Level Policy Engine 406 to create and maintain data storage object instances from the resources in a storage pool, and it copies application data among instances of storage objects from the storage pools according to instructions from the Service Level Policy Engine. The target pool selected for the copy implicitly designates the business operation being selected, e.g. backup, replication or restore. The Service Level Policy Engine resides either locally to the Object Manager (on the same system) or remotely, and communicates using a protocol over standard networking communication. TCP/IP may be used in a preferred embodiment, as it is well understood, widely available, and allows the Service Level Policy Engine to be located locally to the Object Manager or remotely with little modification.
In one embodiment, the system may deploy the Service Level Policy Engine on the same computer system as the Object Manager for ease of implementation. In another embodiment, the system may employ multiple systems, each hosting a subset of the components, if that is beneficial or convenient for an application, without changing the design.
The Object Manager 501 and the Storage Pool Managers 504 are software components that may reside on the computer system platform that interconnects the storage resources and the computer systems that use those storage resources, on which the user's applications reside. The placement of these software components on the interconnect platform is designated as a preferred embodiment; it may provide the ability to connect customer systems to storage via communication protocols widely used for such applications (e.g. Fibre Channel, iSCSI, etc.), and may also provide ease of deployment of the various software components.
The Object Manager 501 and the Storage Pool Manager 504 communicate with the underlying storage virtualization platform via the application programming interfaces made available by the platform. These interfaces allow the software components to query and control the behavior of the computer system and how it interconnects the storage resources and the computer system on which the user's application resides. As is common practice, the components apply modularity techniques to allow replacement of the intercommunication code that is particular to a given platform.
The Object Manager and the Storage Pool Managers communicate via a protocol. Its messages are transmitted over standard networking protocols, e.g. TCP/IP, or over standard interprocess communication (IPC) mechanisms typically available on the computer system. Depending on the particular computer platform, this allows comparable communication between the components whether they reside on the same computer platform or on multiple computer platforms connected by a network. For ease of deployment, the current configuration has all of the local software components residing on the same computer system. As described above, this is not a strict requirement of the design, and the components can be reconfigured in the future as needed.
Object Manager
The Object Manager 501 is a software component for maintaining data storage objects, and provides a set of protocol operations to control it. The operations include creation, destruction, duplication and copying of data among the objects, maintaining access to objects, and in particular specifying the storage pool used to create copies. There is no common subset of functions supported by all pools; however, in a preferred embodiment, primary pools may be performance-optimized, i.e. lower latency, whereas backup or replication pools may be capacity-optimized, supporting larger quantities of data and being content-addressable. The pools may be remote or local. The storage pools are classified according to various criteria, including means by which a user may make a business decision, e.g. cost per gigabyte of storage.
First, the particular storage device from which the storage is drawn may be a consideration, as equipment is allocated for different business purposes, along with associated cost and other practical considerations. Some devices may not even be actual hardware but capacity provided as a service, and such a resource may be selected for practical business purposes.
Second, the network topological "proximity" is considered, as near storage is typically connected by low-latency, inexpensive network resources, while distant storage may be connected by high-latency, bandwidth-limited, expensive network resources; conversely, the distance of a storage pool from the source may be beneficial when geographic diversity protects against a physical disaster affecting local resources.
Third, storage optimization characteristics are considered. Some storage is optimized for space-efficient storage but requires computation time and resources to analyze or transform the data before it can be stored, while other storage is "performance-optimized", taking more storage resources by comparison but using comparatively little computation time or resources to transform the data, if any.
Fourth, "speed of access" characteristics are considered. Some resources intrinsic to a storage computer platform are readily and quickly made available to the user's application, e.g. as a virtual SCSI block device, while others can only be used indirectly. The ease and speed of recovery are often governed by the kind of storage used, which allows the storage to be classified accordingly.
Fifth, the amount of storage used and the amount available in a given pool are considered, as there may be a benefit to either concentrating or spreading the storage capacity used.
The Service Level Policy Engine, described below, combines the SLA provided by the user with these classification criteria to determine how and when to maintain the application data, and from which storage pools to draw the resources needed to meet the service level agreement (SLA).
The Object Manager 501 creates, maintains and employs a history mechanism to track the series of operations performed on a data object within the performance pools, and to correlate those operations with others that move the object to other storage pools, in particular capacity-optimized ones. This series of records for each data object is maintained at the Object Manager for all data objects in the primary pool, correlated first by primary data object and then by operation order: a time line for each object, and a list of all such time lines. Each operation performed exploits underlying virtualization primitives to capture the state of the data object at a given point in time.
Additionally, the underlying storage virtualization appliance may be modified to expose and allow retrieval of internal data structures, such as bitmaps, that indicate which portions of the data within a data object have been modified. These data structures are exploited to capture the state of a data object at a point in time (e.g. a snapshot of the data object) and to provide the differences between snapshots taken at specific times, thereby enabling optimal backup and restore. Although the particular implementations and data structures may vary among appliances from different vendors, in each case a data structure is employed to track changes to the data object, and storage is employed to retain the original state of those portions of the object that have changed: indications in the data structure correspond to data retained in the storage. When a snapshot is accessed, the data structure is consulted, and for portions that have changed the preserved data is accessed rather than the current data, because the data object has been modified in the areas so indicated. A typical data structure employed is a bitmap, in which each bit corresponds to a section of the data object. A set bit indicates that the section has been modified after the point in time of the snapshot operation. The underlying snapshot primitive mechanism maintains this bitmap for as long as the snapshot object exists.
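The copy-on-write bookkeeping described above can be illustrated with a minimal sketch. It is not taken from any vendor's appliance; the `Volume` and `Snapshot` classes and their methods are hypothetical names, and a data object is modeled simply as a list of sections.

```python
class Volume:
    """A data object modeled as a list of fixed-size sections (hypothetical)."""

    def __init__(self, sections):
        self.sections = list(sections)
        self.snapshots = []

    def snapshot(self):
        """Create a lightweight point-in-time snapshot; no data is copied yet."""
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, index, value):
        """Overwrite a section, first preserving its old state for snapshots."""
        for snap in self.snapshots:
            snap.preserve(index)
        self.sections[index] = value


class Snapshot:
    """Tracks modified sections with a bitmap and retains their original data."""

    def __init__(self, volume):
        self.volume = volume
        self.bitmap = [False] * len(volume.sections)  # one bit per section
        self.preserved = {}                           # original section contents

    def preserve(self, index):
        # Only the first write after the snapshot needs to save the old data.
        if not self.bitmap[index]:
            self.preserved[index] = self.volume.sections[index]
            self.bitmap[index] = True

    def read(self, index):
        # Set bit: return the preserved data; clear bit: current data is still valid.
        if self.bitmap[index]:
            return self.preserved[index]
        return self.volume.sections[index]
```

Reading through the snapshot returns the state at the time the snapshot was taken, while the bitmap alone is enough to tell which sections have changed since then.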
The time line described above maintains a list of the snapshot operations performed against a given primary data object, including the time an operation was started, the time it was stopped (if at all), a reference to the snapshot object, and a reference to the internal data structure (e.g. bitmaps or extent lists), so that it can be obtained from the underlying system. Also maintained is a reference to the result of copying the state of the data object at any given point in time into another pool. As an example, copying the state of a data object into the capacity-optimized pool 407 using content addressing results in an object handle. That object handle corresponds to a given snapshot and is stored with the snapshot operation in the time line. This correlation is used to identify suitable starting points.
Optimal backup and restore consult the list of operations from a desired starting point to an end point. A time-ordered list of operations and their corresponding data structures (bitmaps) is constructed such that a continuous time series from start to finish is realized: there is no gap between the start times of the operations in the series. This ensures that all changes to the data object are represented by the corresponding bitmap data structures. It is not necessary to retrieve every operation from start to finish, because simultaneously existing data objects and underlying snapshots overlap in time; it is only necessary that there be no gap in time in which an untracked change might have occurred. Because a bitmap indicates that a certain block of storage has changed but not what the change is, bitmaps can be added or composed together to obtain the set of all changes that occurred in the time interval. Rather than using this data structure to access the state at a point in time, the system exploits the fact that the data structure represents the data modified as time moves forward. The end state of the data object is accessed at the indicated areas, which returns the set of changes to the given data object from the given start time to the end time.
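As an illustration of composing bitmaps along a time line, the following sketch ORs together the bitmaps of a gap-free series of snapshot operations and then reads only the indicated sections from the end state. The `SnapshotOp` record and both function names are assumptions made for this example, and each bitmap is assumed to track changes made between the operation's start and stop times.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SnapshotOp:
    start: int          # time the snapshot operation started
    stop: int           # time it stopped tracking changes
    bitmap: List[bool]  # True where a section changed in [start, stop)


def compose_changes(timeline, start, end):
    """OR together the bitmaps of a gap-free chain of operations in [start, end)."""
    chain = sorted(
        (op for op in timeline if start <= op.start < end),
        key=lambda op: op.start,
    )
    if not chain or chain[0].start != start:
        raise ValueError("no snapshot operation begins at the requested start")
    covered_until = chain[0].start
    combined = [False] * len(chain[0].bitmap)
    for op in chain:
        if op.start > covered_until:
            raise ValueError("gap in the time line: untracked changes possible")
        combined = [a or b for a, b in zip(combined, op.bitmap)]
        covered_until = max(covered_until, op.stop)
    if covered_until < end:
        raise ValueError("time line does not reach the requested end")
    # An operation that runs past `end` may over-report changes; that is safe,
    # it only causes more sections to be copied than strictly necessary.
    return combined


def read_changes(end_state, combined):
    """Return the end-state contents of every section marked as changed."""
    return {i: end_state[i] for i, changed in enumerate(combined) if changed}
```

Overlapping operations are accepted, since overlap cannot hide a change; only a gap between the end of tracking and the next start is rejected.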
The backup operation exploits this time line, the correlated references, and access to the internal data structures. Similarly, the restore operation uses the system in a complementary fashion. The specific steps are described below in the section on "Optimal Backup/Restore".
Storage pool types
Fig. 5 illustrates several representative storage pool types. Although one primary storage pool and two secondary storage pools are depicted in the figure, many more may be configured in some embodiments.
Primary Storage Pool 507: contains the storage resources used to create the data objects in which the user application stores its data. This is in contrast to the other storage pools, which exist primarily to fulfill the operation of the Data Management Virtualization Engine.
Performance Optimized Pool 508: a storage pool able to provide high-performance backup (i.e. point-in-time duplication, described below) as well as rapid access to the backup image by the user application.
Capacity Optimized Pool 509: a storage pool that chiefly provides storage of a data object in a highly space-efficient manner by use of the deduplication techniques described below. The storage pool provides access to the copy of the data object, but does not do so with high performance as its chief aim, in contrast to the Performance Optimized Pool above.
The initial deployments contain the storage pools described above as a minimal operational set. The design fully anticipates multiple pools of a variety of types, representing various combinations of the criteria described above, and multiple Pool Managers, as is convenient to represent all of the storage in future deployments. The tradeoffs illustrated above are typical of computer data storage systems.
From a practical point of view, these three pools represent a preferred embodiment that addresses most users' requirements in a very simple way. Most users will find that if they have one pool of storage for urgent restore needs, which affords quick recovery, and one other pool that is low cost, so that a large number of images can be retained for a long period of time, almost all of the business requirements for data protection can be met with little compromise.
The format of the data in each pool is dictated by the objectives of the pool and the technology used within it. For example, the quick recovery pool is maintained in a form very similar to the original data, to minimize the translation required and to improve the speed of recovery. The long-term storage pool, on the other hand, uses deduplication and compression to reduce the size of the data and thus the cost of storage.
Object management operations 505
The Object Manager 501 creates and maintains instances of data storage objects 503 from the storage pools 418 according to the instructions sent to it by the Service Level Policy Engine 406. The Object Manager provides data object operations in five major areas: point-in-time duplication or copying (commonly referred to as "snapshots"), standard copying, object maintenance, mapping and access maintenance, and collections.
Object management operations also include a series of resource discovery operations for maintaining the storage pools themselves and retrieving information about them. The Pool Manager 504 ultimately supplies the functionality for these.
Point-in-time copy ("snapshot") operations
Snapshot operations create a data object instance representing an initial object instance at a specific point in time. More specifically, a snapshot operation creates a complete virtual copy of the members of a collection using the resources of a specified storage pool. This is called a data storage object. Multiple states of a data storage object are maintained over time, so that the state of a data storage object as it existed at a point in time is available. As described above, a virtual copy is a copy implemented using an underlying storage virtualization API that allows a copy to be created in a lightweight fashion, using copy-on-write or other in-band technologies instead of copying and storing all bits of the duplicated data to disk. In some embodiments, this may be implemented using software modules written to access the capabilities of an off-the-shelf underlying storage virtualization system, such as those provided by EMC, VMware or IBM. Where such underlying virtualization is not available, the described system may provide its own virtualization layer for interfacing with unintelligent hardware.
Snapshot operations require the application to freeze the state of the data at a specific point so that the image data is coherent, and so that the snapshot may later be used to restore the state of the application as of the time of the snapshot. Other preparatory steps may also be required. These are handled by the Application Specific Module 302, which is described in a subsequent section. For live applications, therefore, the most lightweight operations are desired.
Snapshot operations are used as the data primitive for all higher-level operations in the system. In effect, they provide access to the state of the data at a particular point in time. In addition, because snapshots are typically implemented using copy-on-write techniques that distinguish what has changed from what is resident on disk, these snapshots provide differences that can also be composed or added together to copy data efficiently throughout the system. The format of the snapshot may be the format of the data that is copied by the Data Mover 502, which is described below.
Standard copy operations
When a copy operation is not a snapshot, it may be considered a standard copy operation. A standard copy operation copies all or a subset of a source data object in one storage pool to a data object in another storage pool. The result is two distinct objects. One type of standard copy operation that may be used is an initial "baseline" copy. This is typically done when data is initially copied from one storage pool into another, such as from a performance-optimized pool to a capacity-optimized pool. Another type of standard copy operation may be used in which only the changed data or differences are copied to the target storage pool to update the target object. This would occur after an initial baseline copy has been performed.
A complete, exhaustive version of an object need not be preserved in the system each time a copy is made, even though a baseline copy is needed when the Data Virtualization System is first initialized. This is because each virtual copy provides access to a complete copy. Any delta or difference can be expressed in relation to a virtual copy instead of in relation to a baseline. This has the positive side effect of virtually eliminating the common step of walking through a series of change lists.
Standard copy operations are initiated by a series of instructions or requests supplied by the Pool Manager and received by the Data Mover, to cause the movement of data among the data storage objects and to maintain the data storage objects themselves. The copy operations allow copies of the specified data storage objects to be created using the resources of a specified storage pool. The result is a copy of the source data object in a target data object in the storage pool.
The snapshot and copy operations are each structured with a preparation operation and an activation operation. The two steps of prepare and activate allow the long-running resource allocation operations, which are typical of the prepare phase, to be decoupled from the activation. This is required by applications that can be paused only for a short while to fulfill the point-in-time characteristics of a snapshot operation, which in reality takes a finite but non-zero amount of time to accomplish. Likewise, for both copy and snapshot operations, this two-step prepare and activate structure allows the Policy Engine to proceed with an operation only if resources for all of the collection members can be allocated.
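A minimal sketch of the prepare/activate split follows, assuming a hypothetical `Pool` with a simple capacity counter. Preparation reserves resources for every member and rolls back if any reservation fails; activation is a short step that stamps all members with the same instant. The names and the capacity model are illustrative only.

```python
class Pool:
    """Hypothetical storage pool with a simple free-capacity counter."""

    def __init__(self, capacity):
        self.free = capacity

    def reserve(self, size):
        if size > self.free:
            raise RuntimeError("insufficient capacity")
        self.free -= size

    def release(self, size):
        self.free += size


def prepare(pool, members):
    """Long-running phase: reserve space for every member, or for none."""
    reserved = []
    try:
        for size in members.values():
            pool.reserve(size)
            reserved.append(size)
    except RuntimeError:
        for size in reserved:  # roll back the partial allocation
            pool.release(size)
        return None
    return {"pool": pool, "members": dict(members)}


def activate(prepared, clock):
    """Short phase: capture every member at one and the same instant."""
    instant = clock()
    return {name: instant for name in prepared["members"]}
```

In this sketch the application would need to be paused only around `activate`, since all allocation work has already happened in `prepare`.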
Object maintenance
Object maintenance operations are a series of operations for maintaining data objects, including creation, destruction and duplication. The Object Manager and Data Mover use functionality provided by a Pool Request Broker (described further below) to implement these operations. The data objects may be maintained at a global level, at each storage pool, or preferably both.
Collections
Collection operations are auxiliary functions. Collections are abstract software concepts: lists maintained in memory by the Object Manager. They allow the Policy Engine 206 to request a series of operations over all of the members of a collection, so that a request is applied consistently to all members. The use of collections allows simultaneous activation of a point-in-time snapshot, so that multiple data storage objects are all captured at precisely the same point in time, as is typically required by the application for a logically correct restore. The use of collections also allows a copy operation to be requested conveniently across all members of a collection, where an application uses multiple storage objects as a logical whole.
The resource discovering operation
Object Manager is found storage pool by pool manager 504, sending Object Management group operation 505, and use the information obtained about each pond to select to meet the pond of the required standard of given request, or in the situation that do not mate, the fault pond is selected, and Object Manager can then use the resource from selected storage pool to create data storage object.
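Pool selection with a fallback to a default pool could be sketched as follows. The `PoolInfo` fields and the matching criteria (pool kind and free capacity) are assumptions chosen for illustration; the description does not prescribe a particular set of attributes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolInfo:
    """Hypothetical summary of what discovery returns for one pool."""
    name: str
    kind: str      # e.g. "performance" or "capacity"
    remote: bool
    free_gb: int


def select_pool(pools, kind, min_free_gb, default):
    """Return the first pool meeting the criteria, else the default pool."""
    for pool in pools:
        if pool.kind == kind and pool.free_gb >= min_free_gb:
            return pool
    return default
```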
Mapping and access
The Object Manager also provides sets of object management operations to allow and maintain the availability of these objects to external applications. The first set consists of operations for registering and unregistering the computers on which the user's applications reside. The computers are registered by the identities typical of the storage network in use (e.g. Fibre Channel WWPN, iSCSI identity, etc.). The second set consists of "mapping" operations: when permitted by the storage pool from which an object was created, the data storage object can be "mapped", that is, made available for use to a computer on which a user application resides.
This availability takes a form appropriate to the storage, e.g. a block device presented on a SAN as a Fibre Channel disk or as an iSCSI device on a network, a file system on a file-sharing network, etc., and is usable by the operating system on the application computer. Similarly, an "unmapping" operation reverses the availability of the virtual storage device on the network to a user application. In this way, data stored for one application, i.e. a backup, can be made available to another application on another computer at a later time, i.e. a restore.
502 Data Mover
The Data Mover 502 is a software component within the Object Manager and Data Movement Engine that reads and writes data among the various data storage objects 503 according to instructions received from the Object Manager for snapshot (point-in-time) copy requests and standard copy requests. The Data Mover provides operations for reading and writing data among instances of data objects throughout the system. The Data Mover also provides operations that allow the Object Manager to query and maintain the state of the long-running operations it has requested the Data Mover to perform.
The Data Mover uses functionality from the Pool Functionality Providers (see Fig. 6) to accomplish its operations. The Snapshot functionality provider 608 allows creation of a data object instance representing an initial object instance at a specific point in time. The Difference Engine functionality provider 614 is used to request a description of the differences between two data objects that are related in a temporal chain. For data objects stored on content-addressable pools, a special functionality is provided that can supply the differences between any two arbitrary data objects. This functionality is also provided for performance-optimized pools, in some cases by an underlying storage virtualization system and in other cases by a module that implements it on top of commodity storage. The Data Mover 502 uses the information about the differences to select the set of data that it copies between instances of data objects 503.
For a given pool, the Difference Engine provider supplies a specific representation of the differences between two states of a data storage object over time. For a Snapshot provider, the changes between two points in time are recorded as writes to a given part of the data storage object. In one embodiment, the difference is represented as a bitmap in which each bit corresponds to one entry in an ordered list of the data object's areas, starting with the first and ascending in order to the last, and a set bit indicates a modified area. This bitmap is derived from the copy-on-write bitmaps used by the underlying storage virtualization system. In another embodiment, the difference may be represented as a list of extents corresponding to the changed areas of data. For a Content Addressable storage provider 610, the representation is described below and is used to determine efficiently the parts in which two content-addressable data objects differ.
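The two difference representations mentioned above carry the same information. As a small illustration, the hypothetical helper below converts a bitmap into a list of extents, each given as a `(start, length)` pair of area indices; the pair layout is an assumption made for this sketch.

```python
def bitmap_to_extents(bitmap):
    """Convert a per-area change bitmap into (start, length) extents."""
    extents = []
    run_start = None
    for index, bit in enumerate(bitmap):
        if bit and run_start is None:
            run_start = index                       # a run of changed areas begins
        elif not bit and run_start is not None:
            extents.append((run_start, index - run_start))
            run_start = None                        # the run has ended
    if run_start is not None:                       # run extends to the last area
        extents.append((run_start, len(bitmap) - run_start))
    return extents
```

An extent list tends to be more compact when changes are clustered, whereas a bitmap has a fixed size regardless of how changes are distributed.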
The Data Mover uses this information to copy only those sections that differ, so that a new version of a data object can be created from an existing version by first duplicating it, obtaining the list of differences, and then moving only the data corresponding to the differences in that list. The Data Mover 502 traverses the list of differences, moving the indicated areas from the source data object to the target data object. (See "Optimal Way for Data Backup and Restore".)
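The duplicate-then-patch approach can be sketched in a few lines. Data objects are modeled here as Python lists of areas and differences as `(start, length)` extents; both choices, and the function name, are assumptions for illustration rather than the system's actual interfaces.

```python
def make_new_version(existing_target, source, extents):
    """Duplicate the existing target version, then move only the changed areas."""
    new_version = list(existing_target)  # step 1: duplicate the prior version
    moved = 0
    for start, length in extents:        # steps 2 and 3: walk the difference list
        new_version[start:start + length] = source[start:start + length]
        moved += length
    return new_version, moved
```

The second return value counts how many areas were actually transferred, which shows how much less data is moved than in a full copy.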
506 Copy operation: request translation and instructions
The Object Manager 501 instructs the Data Mover 502 through a series of operations to copy data among the data objects in the storage pools 418. The procedure comprises the following steps, starting when the instructions are received:
First, a collection is created by a create collection request. A name for the collection is returned.
Second, an object is added to the collection. The collection name from above is supplied, along with the name of the source data object to be copied and the names of two antecedents: a data object in the source storage resource pool against which differences are to be taken, and a corresponding data object in the target storage resource pool. This step is repeated for each source data object to be operated on in this collection.
Third, the copy request is prepared. The collection name is supplied, as well as a storage resource pool to act as the target. The prepare command instructs the Object Manager to contact the Storage Pool Manager to create the necessary target data objects, one corresponding to each of the sources in the collection. The prepare command also supplies the corresponding data object in the target storage resource pool that is to be duplicated, so that the provider can duplicate the supplied object and use the duplicate as the target object. A reference name for the copy request is returned.
Fourth, the copy request is activated. The reference name for the copy request returned above is supplied. The Data Mover is instructed to copy each given source object to its corresponding target object. Each request includes the reference name, a sequence number describing the overall job (the entire set of source-target pairs), and a sequence number describing each individual source-target pair. In addition to the source-target pair, the names of the corresponding antecedents are supplied as part of the copy instruction.
Fifth, the copy engine uses the name of the data object in the source pool to obtain, from the Difference Engine at the source, the differences between the antecedent and the source. The indicated differences are then transmitted from the source to the target. In one embodiment, these differences are transmitted as bitmaps and data. In another embodiment, these differences are transmitted as extent lists and data.
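The first four steps above can be sketched as a small request sequence. The `CopyCoordinator` class, its method names, the naming scheme for collections, copy requests and targets, and the dictionary layout of the resulting instructions are all hypothetical; the sketch only shows how names and sequence numbers flow from one step to the next, and leaves the actual data transfer of the fifth step out.

```python
import itertools


class CopyCoordinator:
    """Hypothetical stand-in for the request sequence between the components."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.collections = {}
        self.requests = {}

    def create_collection(self):
        """Step 1: create a collection and return its name."""
        name = f"collection-{next(self._ids)}"
        self.collections[name] = []
        return name

    def add_object(self, collection, source, source_antecedent, target_antecedent):
        """Step 2: add a source object together with its two antecedents."""
        self.collections[collection].append(
            (source, source_antecedent, target_antecedent)
        )

    def prepare_copy(self, collection, target_pool):
        """Step 3: name one target per source and return a reference name."""
        reference = f"copy-{next(self._ids)}"
        pairs = []
        for source, src_ante, tgt_ante in self.collections[collection]:
            # In the described system the target is made by duplicating tgt_ante.
            target = f"{target_pool}/{source}"
            pairs.append((source, target, src_ante, tgt_ante))
        self.requests[reference] = pairs
        return reference

    def activate_copy(self, reference, job_sequence):
        """Step 4: emit one copy instruction per source-target pair."""
        return [
            {
                "reference": reference,
                "job": job_sequence,
                "pair": pair_sequence,
                "source": source,
                "target": target,
                "source_antecedent": src_ante,
                "target_antecedent": tgt_ante,
            }
            for pair_sequence, (source, target, src_ante, tgt_ante)
            in enumerate(self.requests[reference], start=1)
        ]
```

Each emitted instruction carries everything a data mover would need for the fifth step: the pair to copy and the antecedents against which differences are taken.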
503 Data storage objects
Data storage objects are software constructs that permit the storage and retrieval of application data using idioms and methods familiar to computer data processing equipment and software. In practice, these currently take the form of a SCSI block device on a storage network, e.g. a SCSI LUN, or of a content-addressable container, in which a designator for the content is constructed from the data it contains and uniquely identifies that data. Data storage objects are created and maintained by issuing instructions to the Pool Manager. The actual storage for persisting the application data is drawn from the storage pool from which the data storage object is created.
The structure of a data storage object varies depending on the storage pool from which it is created. For objects that take the form of a block device on a storage network, the data structure for a given block device data object implements a mapping between the logical block address (LBA) of each block within the data object and the device identifier and LBA of the actual storage location. The identifier of the data object is used to identify the set of mappings to be used. The current embodiment relies on the services provided by the underlying physical computer platform to implement this mapping, and relies on its internal data structures, such as bitmaps or extent lists.
For objects that take the form of a content-addressable container, the content signature is used as the identifier, and the data object is stored as described below in the section on deduplication.
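The idea of an identifier constructed from the content itself can be shown with a short sketch. The use of SHA-256 from Python's standard `hashlib` module is an assumption made for illustration; this section does not name a hash function, and the container class is hypothetical.

```python
import hashlib


class ContentAddressableContainer:
    """Hypothetical container whose object handles are derived from content."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The handle is computed from the data, so identical content
        # always maps to the same handle and is stored only once.
        handle = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(handle, data)
        return handle

    def get(self, handle: str) -> bytes:
        return self._blobs[handle]

    def __len__(self):
        return len(self._blobs)
```

Because equal content yields equal handles, comparing two handles is enough to tell whether two stored objects hold the same data.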
504 Pool Manager
A Pool Manager 504 is a software component for managing virtual storage resources and the associated functionality and characteristics, as described below. The Object Manager 501 and Data Movement Engine 502 communicate with one or more Pool Managers 504 to maintain data storage objects 503.
510 Virtual storage resources
Virtual storage resources 510 are the various kinds of storage made available to the Pool Manager for implementing storage pool functions, as described below. In this embodiment, a storage virtualizer is used to present various external Fibre Channel or iSCSI storage LUNs to the Pool Manager 504 as virtualized storage.
The storage pool manager
Fig. 6 further illustrates the storage pool manager 504. The purpose of the storage pool manager is to present underlying virtual storage resources to the object manager/data mover as storage resource pools, which are abstractions of storage and data management functionality with common interfaces that are used by other components of the system. These common interfaces typically include a mechanism for identifying and addressing data objects associated with a specific temporal state, and a mechanism for producing the differences between data objects in the form of bitmaps or extents. In this embodiment, the pool manager presents a primary storage pool, a performance-optimized pool, and a capacity-optimized pool. The common interfaces allow the object manager to create and delete data storage objects in these pools, either as copies of other data storage objects or as new objects, and the data mover can move data between data storage objects and can use the results of data object differencing operations.
The storage pool manager has a typical architecture for implementing a common interface over diverse implementations of similar functionality, where some functionality is provided by "smart" underlying resources, and other functionality must be implemented on top of less capable underlying resources.
Pool request broker 602 and pool functionality providers 604 are software modules executing either in the same process as the object manager/data mover, or in another process communicating via a local or network protocol such as TCP. In this embodiment, the providers comprise a primary storage provider 606, a snapshot provider 608, a content addressable provider 610, and a difference engine provider 614, and these are further described below. In another embodiment, the set of providers may be a superset of those shown here.
Virtual storage resources 510 are the different kinds of storage made available to the pool manager for implementing storage pool functions. In this embodiment, the virtual storage resources comprise sets of SCSI logical units from a storage virtualization system that runs on the same hardware as the pool manager and is accessible (for both data and management operations) through a programmatic interface. In addition to standard block storage functionality, further capabilities are available, including creating and deleting snapshots and tracking changed portions of volumes. In another embodiment, the virtual resources can come from an external storage system that exposes similar capabilities, or may differ in interface (for example, accessed through a file system, or through a network interface such as CIFS, iSCSI or CDMI), in capability (for example, whether the resource supports an operation to make a copy-on-write snapshot), or in non-functional aspects (for example, high-speed/limited-capacity storage such as solid state disk versus low-speed/high-capacity storage such as SATA disk). The capabilities and interfaces available determine which providers can consume the virtual storage resources, and which pool functionality needs to be implemented within the pool manager by one or more providers. For example, this implementation of a content addressable storage provider only requires "dumb" storage, and the implementation is entirely within content addressable provider 610; an underlying content addressable virtual storage resource could be used instead with a simpler "pass-through" provider. Conversely, this implementation of a snapshot provider is mostly "pass-through" and requires storage that exposes a quick point-in-time copy operation.
Pool request broker 602 is a simple software component that services requests for storage pool specific functions by executing an appropriate set of pool functionality providers against the configured virtual storage resource 510. The requests that can be serviced include, but are not limited to: creating an object in a pool; deleting an object from a pool; writing data to an object; reading data from an object; copying an object within a pool; copying an object between pools; and requesting a summary of the differences between two objects in a pool.
Primary storage provider 606 enables management interfaces (for example, creating and deleting snapshots, and tracking changed portions of files) to a virtual storage resource that is also exposed directly to applications via an interface such as Fibre Channel, iSCSI, NFS or CIFS.
Snapshot provider 608 implements the function of making a point-in-time copy of data from a primary storage pool. This creates the abstraction of another resource pool populated with snapshots. As implemented, the point-in-time copy is a copy-on-write snapshot of the object from the primary storage pool, consuming a second virtual storage resource to accommodate the copy-on-write copies, since this management functionality is exposed by the virtual storage resources used for primary storage and for the snapshot provider.
Difference engine provider 614 can satisfy a request to compare two objects in a pool that are connected in a temporal chain. The sections that differ between the two objects are identified and summarized in a provider-specific way, e.g. using bitmaps or extents. For example, the differing sections might be represented as a bitmap where each set bit denotes a fixed-size region in which the two objects differ; or the differences might be represented procedurally as a series of function calls or callbacks.
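As a rough illustration of the bitmap and extent representations mentioned above, the following sketch compares two byte strings region by region. The function names and the 4-byte region size are assumptions made for the example; a real provider would normally read change records kept by the storage layer rather than compare data directly.

```python
from typing import List, Tuple

REGION_SIZE = 4  # bytes per fixed-size region; an assumed value for illustration


def diff_bitmap(old: bytes, new: bytes, region_size: int = REGION_SIZE) -> List[bool]:
    """Return one flag per fixed-size region; True means the region differs."""
    length = max(len(old), len(new))
    regions = (length + region_size - 1) // region_size
    bitmap = []
    for i in range(regions):
        start, end = i * region_size, (i + 1) * region_size
        bitmap.append(old[start:end] != new[start:end])
    return bitmap


def bitmap_to_extents(bitmap: List[bool],
                      region_size: int = REGION_SIZE) -> List[Tuple[int, int]]:
    """Collapse runs of set bits into (byte offset, byte length) extents."""
    extents = []
    run_start = None
    for i, changed in enumerate(bitmap + [False]):  # sentinel closes a trailing run
        if changed and run_start is None:
            run_start = i
        elif not changed and run_start is not None:
            extents.append((run_start * region_size, (i - run_start) * region_size))
            run_start = None
    return extents


bitmap = diff_bitmap(b"aaaabbbbccccdddd", b"aaaaXbbbccccYYYY")
print(bitmap, bitmap_to_extents(bitmap))
```

The bitmap form has a fixed size per object, while the extent form is more compact when changes are clustered.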
Depending on the virtual storage resource on which the pool is based, or on other providers implementing the pool, a difference engine may produce a result efficiently in various ways. As implemented, a difference engine acting on a pool implemented via a snapshot provider uses the copy-on-write nature of the snapshot provider to track changes to objects that have had snapshots made. Consecutive snapshots of a single changing primary object thus have a record of their differences that is stored alongside them by the snapshot provider, and the difference engine for snapshot pools simply retrieves this record of change. Also as implemented, a difference engine acting on a pool implemented via a content addressable provider uses the efficient tree structure of the content addressable implementation (see below, Fig. 12) to do rapid comparisons between objects on demand.
Content addressable provider 610 implements a write-once content addressable interface to the virtual storage resource it consumes. It satisfies read, write, duplicate and delete operations. Each written or copied object is identified by a unique handle that is derived from its content. The content addressable provider is described further below (Fig. 11).
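The idea of a handle derived from content can be sketched as follows. The use of SHA-256 and the class name `ContentAddressableStore` are assumptions for illustration only; this passage does not specify the hash function or the storage layout of provider 610.

```python
import hashlib


class ContentAddressableStore:
    """Hypothetical write-once store whose handles are derived from content."""

    def __init__(self):
        self._blobs = {}

    def write(self, data: bytes) -> str:
        # SHA-256 is an assumed choice of hash for this sketch.
        handle = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same handle, so it is stored only once.
        self._blobs.setdefault(handle, data)
        return handle

    def read(self, handle: str) -> bytes:
        return self._blobs[handle]

    def delete(self, handle: str) -> None:
        self._blobs.pop(handle, None)

    def __len__(self) -> int:
        return len(self._blobs)


store = ContentAddressableStore()
first = store.write(b"application data block")
second = store.write(b"application data block")
print(first == second, len(store))
```

Because the handle depends only on the content, writing the same data twice consumes no additional storage, which is the property that deduplication relies on.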
Pool manager operations
In operation, the pool request broker 502 accepts requests for data manipulation operations, such as copy, snapshot, or delete, on a pool or object. The request broker determines which provider code from pool 504 to execute by looking at the name of, or reference to, the pool or object. The broker then translates the incoming service request into a form that can be handled by the specific pool functionality provider, and invokes the appropriate sequence of provider operations.
For example, an incoming request could ask for a snapshot of a volume in the primary storage pool to be made in a snapshot pool. The incoming request identifies the object (volume) in the primary storage pool by name, and the combination of name and operation (snapshot) determines that the snapshot provider should be invoked, which can make point-in-time snapshots from the primary pool using the underlying snapshot capability. The snapshot provider translates the request into the exact form required by the native copy-on-write function performed by the underlying storage virtualization appliance, such as bitmaps or extents, and translates the result of the native copy-on-write function into a storage volume handle that can be returned to the object manager and used in future requests to the pool manager.
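The dispatch just described can be sketched as a small routing table. All class and method names here (`PoolRequestBroker`, `register`, `handle`, `SnapshotProvider.snapshot`) are hypothetical, and the returned handle format is invented for the example.

```python
class PoolRequestBroker:
    """Hypothetical broker: routes (pool, operation) requests to provider callables."""

    def __init__(self):
        self._routes = {}

    def register(self, pool_name, operation, handler):
        self._routes[(pool_name, operation)] = handler

    def handle(self, pool_name, operation, object_name):
        # The pool name plus the operation decide which provider code runs.
        handler = self._routes.get((pool_name, operation))
        if handler is None:
            raise LookupError(f"no provider for {operation!r} on pool {pool_name!r}")
        return handler(object_name)


class SnapshotProvider:
    """Stand-in for a provider that wraps a native point-in-time copy function."""

    def __init__(self):
        self._count = 0

    def snapshot(self, volume_name):
        self._count += 1
        # A real provider would call the storage virtualizer here and
        # translate its result into a volume handle.
        return f"{volume_name}@snap{self._count}"


broker = PoolRequestBroker()
broker.register("primary", "snapshot", SnapshotProvider().snapshot)
print(broker.handle("primary", "snapshot", "vol1"))
```

Keeping the routing table separate from the providers means a pool backed by "smart" storage can register a thin pass-through handler while another pool registers a full software implementation.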
Optimal way for data backup using the object manager and data mover
The optimal way for data backup is a series of operations that makes successive versions of application data objects over time, while minimizing the amount of data that must be copied by using bitmaps, extents and other temporal difference information stored at the object mover. It stores the application data in a data storage object and associates with it metadata that relates the various changes to the application data over time, so that changes over time can be readily identified.
In a preferred embodiment, the procedure comprises the following steps:
1. The mechanism provides the initial reference state, e.g. T0, of the application data within a data storage object.
2. Subsequent instances (versions) of the data storage object are created on demand, over time, in a storage pool that has a difference engine provider.
3. Each successive version, e.g. T4, T5, uses the difference engine provider of the storage pool to obtain the difference between it and the instance created prior to it, so that T5 is stored as a reference to T4 and a set of differences between T5 and T4.
4. The copy engine receives a request to copy data from one data object (the source) to another data object (the destination).
5. If the storage pool in which the destination object will be created contains no other objects created from prior versions of the source data object, a new object is created in the destination storage pool and the entire contents of the source data object are copied to the destination object; the procedure is complete. Otherwise the next steps are followed.
6. If the storage pool in which the destination object will be created contains objects created from prior versions of the source data object, the most recently created prior version in the destination storage pool is selected for which a corresponding prior version exists in the storage pool of the source data object. For example, if a copy of T5 is initiated from a snapshot pool, and an object created at time T3 is the most recent version available at the target, T3 is selected as the prior version.
7. A time-ordered list of the versions of the source data object is constructed, beginning with the initial version identified in the previous step and ending with the source data object that is about to be copied. In the above example, all states of the object are available at the snapshot pool, but only the states including and following T3 are of interest: T3, T4, T5.
8. A corresponding list of the differences between each successive version in the list is constructed, such that all of the differences from the beginning version of the list to the end are represented. The differences both identify which portions of the data have changed and include the new data for the corresponding time. This creates a set of differences from the target version to the source version, e.g. the difference between T3 and T5.
9. The destination object is created by duplicating, in the destination storage pool, the prior version of the object identified in Step 6, e.g. object T3 in the target store.
10. The set of differences identified in the list created in Step 8 is copied from the source data object to the destination object; the procedure is complete.
Each data object within the destination storage pool is complete; that is, it represents the entire data object and allows access to all of the application data at that point in time without requiring external reference to state or representations at other points in time. The object is accessible without replaying all deltas from a baseline state to the present state. Furthermore, duplicating the initial and subsequent versions of the data object in the destination storage pool does not require exhaustive duplication of the application data contents therein. Finally, arriving at second and subsequent states requires only the transmission of the changes tracked and maintained, as described above, without exhaustive traversal, transmission or replication of the contents of the data storage object.
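Steps 5 through 10 above can be sketched as follows, under simplifying assumptions: each version is a list of equal-sized blocks, pools are plain dictionaries, and the difference between versions is computed by comparison. The function names are hypothetical, and a real difference engine provider would retrieve recorded changes rather than compare blocks.

```python
def changed_blocks(older, newer):
    """Indices of blocks that differ between two versions of equal length."""
    return {i for i, (a, b) in enumerate(zip(older, newer)) if a != b}


def copy_version(source_pool, order, dest_pool, target):
    """Copy `target` into dest_pool; return how many blocks were transferred.

    source_pool / dest_pool: dict of version name -> list of blocks.
    order: time-ordered list of version names in the source pool.
    """
    idx = order.index(target)
    # Step 6: most recent prior version that exists in both pools.
    prior = next((v for v in reversed(order[:idx])
                  if v in dest_pool and v in source_pool), None)
    if prior is None:
        # Step 5: no common prior version, so copy everything.
        dest_pool[target] = list(source_pool[target])
        return len(source_pool[target])
    # Step 7: time-ordered chain from the prior version to the target.
    chain = order[order.index(prior): idx + 1]
    # Step 8: union of the differences between successive versions.
    changed = set()
    for older, newer in zip(chain, chain[1:]):
        changed |= changed_blocks(source_pool[older], source_pool[newer])
    # Step 9: duplicate the prior version inside the destination pool.
    new_object = list(dest_pool[prior])
    # Step 10: transfer only the changed blocks from the source object.
    for i in changed:
        new_object[i] = source_pool[target][i]
    dest_pool[target] = new_object
    return len(changed)


source = {"T3": ["a", "b", "c", "d"],
          "T4": ["a", "B", "c", "d"],
          "T5": ["a", "B", "c", "D"]}
destination = {"T3": ["a", "b", "c", "d"]}
moved = copy_version(source, ["T3", "T4", "T5"], destination, "T5")
print(moved, destination["T5"])
```

The destination ends up with a complete T5 object even though only the changed blocks crossed between pools, which is the property the paragraph above emphasizes.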
Optimal way for data restore using the object manager and data mover
Intuitively, the optimal way for data restore is the converse of the optimal way for data backup. The procedure to recreate the desired state of a data object in a destination storage pool at a given point in time comprises the following steps:
1. Identify a version of the data object, in another storage pool that has a difference engine provider, corresponding to the desired state to be recreated. This is the source data object in the source storage pool.
2. Identify a preceding version of the data object to be recreated in the destination storage pool.
3. If no version of the data object is identified in Step 2, create a new destination object in the destination storage pool and copy the data from the source data object to the destination data object. The procedure is complete. Otherwise, proceed with the following steps.
4. If a version of the data object is identified in Step 2, identify a data object in the source storage pool corresponding to the data object identified in Step 2.
5. If no data object is identified in Step 4, create a new destination object in the destination storage pool and copy the data from the source data object to the destination data object. The procedure is complete. Otherwise, proceed with the following steps.
6. Create a new destination data object in the destination storage pool by duplicating the data object identified in Step 2.
7. Use the difference engine provider of the source storage pool to obtain the set of differences between the data object identified in Step 1 and the data object identified in Step 4.
8. Copy the data identified by the list created in Step 7 from the source data object to the destination data object. The procedure is complete.
Access to the desired state is complete: it does not require external reference to other containers or other states. Establishing the desired state given a reference state requires neither exhaustive traversal nor exhaustive transmission, only the retrieved changes indicated by the representations provided within the source storage pool.
Service level agreement
Fig. 7 illustrates the service level agreement. The service level agreement captures the detailed business requirements with respect to secondary copies of the application data. In the simplest description, the business requirements define when and how often copies are created, how long they are retained, and in what type of storage pools these copies reside. This simplistic description does not capture several aspects of the business requirements. The frequency of copy creation for a given type of pool may not be uniform across all hours of the day or across all days of a week. Certain hours of the day, or certain days of a week or month, may represent more (or less) critical periods for the application data, and thus may call for more (or less) frequent copies. Similarly, all copies of application data in a particular pool may not need to be retained for the same length of time. For example, a copy of the application data created at the end of monthly processing may need to be retained for a longer period than a copy in the same storage pool created in the middle of a month.
The service level agreement 304 of certain embodiments is designed to represent all of these complexities that exist in the business requirements. The service level agreement has four primary parts: the name, the description, the housekeeping attributes, and a collection of service level policies. As mentioned above, there is one SLA per application.
The name attribute 701 allows each service level agreement to have a unique name.
The description attribute 702 is where the user can assign a helpful description to the service level agreement.
The service level agreement also has a number of housekeeping attributes 703 that enable it to be maintained and revised. These attributes include, but are not limited to, the owner's identity; the dates and times of creation, modification and access; the priority; and enable/disable flags.
The service level agreement also contains a plurality of service level policies 705. Some service level agreements may have just a single service level policy. More typically, a single SLA may contain tens of policies.
In certain embodiments, each service level policy consists of at least the following: the source storage pool location 706 and type 708; the target storage pool location 710 and type 712; the frequency for the creation of copies 714, expressed as a period of time; the length of retention of the copy 716, expressed as a period of time; the hours of operation 718 during the day for this particular service level policy; and the days of the week, month or year 720 on which this service level policy applies.
Each service level policy specifies a source and target storage pool, and the frequency of copies of application data that are desired between those storage pools. Furthermore, the service level policy specifies its hours of operation and the days on which it is applicable. Each service level policy is the representation of one single statement in the business requirements for the protection of application data. For example, if a particular application has a business requirement for an archive copy to be created each month after the monthly close and retained for three years, this might translate to a service level policy that requires a copy from the local backup storage pool into the long-term archive storage pool at midnight on the last day of the month, with a retention of three years.
All of the service level policies with a particular combination of source and destination pool and location, for example source primary storage pool and destination local snapshot pool, when taken together, specify the business requirements for creating copies into that particular destination pool. Business requirements may dictate, for example, that snapshot copies be created every hour during regular working hours, but only once every four hours outside of these times. Two service level policies with the same source and target storage pools will effectively capture these requirements in a form that can be put into practice by the service policy engine.
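One way to picture these attributes is as a pair of record types. The field names below, the 09:00 to 17:00 definition of working hours, and the helper `effective_frequency` are all assumptions made for illustration; only the attributes themselves (pools, frequency, retention, hours and days of operation, name, description, priority, enable flag) come from the description above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import FrozenSet, List, Optional

ALL_DAYS = frozenset(range(7))  # Monday=0 .. Sunday=6


@dataclass(frozen=True)
class ServiceLevelPolicy:
    source_pool: str
    target_pool: str
    frequency: timedelta
    retention: timedelta
    start_hour: int = 0        # hours of operation, [start_hour, end_hour)
    end_hour: int = 24
    days: FrozenSet[int] = ALL_DAYS

    def applies_at(self, when: datetime) -> bool:
        if when.weekday() not in self.days:
            return False
        if self.start_hour <= self.end_hour:
            return self.start_hour <= when.hour < self.end_hour
        # A window such as 17 -> 9 wraps past midnight.
        return when.hour >= self.start_hour or when.hour < self.end_hour


@dataclass
class ServiceLevelAgreement:
    name: str
    description: str = ""
    priority: int = 0
    enabled: bool = True
    policies: List[ServiceLevelPolicy] = field(default_factory=list)


def effective_frequency(sla, source, target, when) -> Optional[timedelta]:
    """Shortest frequency among the policies applicable to this pool pair now."""
    active = [p.frequency for p in sla.policies
              if (p.source_pool, p.target_pool) == (source, target)
              and p.applies_at(when)]
    return min(active) if active else None


sla = ServiceLevelAgreement("orders-db", "Snapshots for the orders database", policies=[
    ServiceLevelPolicy("primary", "snapshot", timedelta(hours=1), timedelta(hours=8), 9, 17),
    ServiceLevelPolicy("primary", "snapshot", timedelta(hours=4), timedelta(hours=8), 17, 9),
])
print(effective_frequency(sla, "primary", "snapshot", datetime(2024, 1, 8, 10)))
print(effective_frequency(sla, "primary", "snapshot", datetime(2024, 1, 8, 22)))
```

Two policies with the same source and target pools are enough to express the "hourly during working hours, every four hours otherwise" requirement from the text.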
This form of service level agreement allows the schedule of daily, weekly and monthly business activities to be represented, and thus captures business requirements for protecting and managing application data much more accurately than traditional RPO-based schemes. By allowing hours of operation and days, weeks, and months of the year to be specified, scheduling can occur on a "calendar basis."
Taken together, all of the service level policies with one particular combination of source and destination, for example "source: local primary and destination: local performance optimized," capture the non-uniform data protection requirements for one type of storage. A single RPO number, on the other hand, forces a single uniform frequency of data protection across all times of day and all days. For example, a combination of service level policies may require a large number of snapshots to be preserved for a short time, such as 10 minutes, and a smaller number of snapshots to be preserved for a longer time, such as 8 hours. This allows a small amount of information that has been accidentally deleted to be reverted to a state not more than 10 minutes earlier, while still providing substantial data protection at longer time horizons without the storage overhead of keeping all snapshots taken every ten minutes. As another example, the backup data protection function may be given one policy that operates with one frequency during the work week and another frequency during the weekend.
When the service level policies for all of the different classes of source and destination storage are included, the service level agreement fully captures all of the data protection requirements for the entire application, including local snapshots, local long-duration stores, off-site storage, archives, and so on. A collection of policies within an SLA can express when a given function should be performed, and can express multiple data management functions that should be performed on a given source of data.
Service level agreements are created and modified by the user through a user interface on a management workstation. These agreements are electronic documents stored by the service policy engine in a structured SQL database or other repository that it manages. The policies are retrieved, electronically analyzed, and acted upon by the service policy engine through its normal scheduling algorithm, as described below.
Fig. 8 illustrates the application specific module 402. The application specific module runs close to the application 300 (as described above), and interacts with the application and its operating environment to gather metadata and to query and control the application as required for data management operations.
The application specific module interacts with various components of the application and its operating environment, including application service processes and daemons 801, application configuration data 802, operating system storage services 803 (such as VSS and VDS on Windows), logical volume management and file system services 804, and operating environment drivers and modules 805.
The application specific module performs these operations in response to control commands from the service policy engine 406. There are two purposes for these interactions with the application: metadata collection and application consistency.
Metadata collection is the process by which the application specific module collects metadata about the application. In some embodiments, metadata includes information such as: configuration parameters for the application; state and status of the application; control files and startup/shutdown scripts for the application; location of the data files, journal and transaction logs for the application; and symbolic links, file system mount points, logical volume names, and other such entities that can affect access to application data.
Metadata is collected and saved along with the application data and SLA information. This guarantees that each copy of application data within the system is self-contained and includes all of the details required to rebuild the application data.
Application consistency is the set of actions that ensure that when a copy of the application data is created, the copy is valid and can be restored into a valid instance of the application. This is critical when the business requirements dictate that the application be protected while it is live, in its online, operational state. The application may have interdependent data relations within its data stores, and if these are not copied in a consistent state, the copy will not provide a valid restorable image.
The exact process of achieving application consistency varies from application to application. Some applications have a simple flush command that forces cached data to disk. Some applications support a hot backup mode, in which the application ensures that its operations are journaled in a manner that guarantees consistency even as application data is changing. Some applications require interactions with operating system storage services such as VSS and VDS to ensure consistency. The application specific module is purpose-built to work with a particular application and to ensure the consistency of that application. The application specific module interacts with the underlying storage virtualization device and the object manager to provide consistent snapshots of application data.
For efficiency, the preferred embodiment of the application specific module 402 runs on the same server as application 300. This assures minimum latency in the interactions with the application, and provides access to storage services and file systems on the application host. The application host is typically considered primary storage, which is then snapshotted to a performance-optimized store.
In order to minimize interruption of a running application, including minimizing preparatory steps, the application specific module is triggered to make a snapshot only when access to application data is required at a specific time and a snapshot for that time does not exist elsewhere in the system, as tracked by the object manager. By tracking the times at which snapshots have been made, the object manager is able to fulfill subsequent data requests from the performance-optimized data store, including satisfying multiple requests for backup and replication that may issue from secondary, capacity-optimized pools. The object manager may provide object handles to the snapshot in the performance-optimized store, and may direct the performance-optimized store in a native format that is specific to the format of the snapshot, which depends on the underlying storage appliance. In some embodiments this format may be application data combined with one or more LUN bitmaps indicating which blocks have changed; in other embodiments it may be specific extents. The format used for data transfer is thus able to transfer only the delta, or difference, between two snapshots using bitmaps or extents.
Metadata, such as the version number of the application, may also be stored for each application along with the snapshot. When an SLA policy is executed, application metadata is read and used for the policy. This metadata is stored along with the data objects. For each SLA, application metadata is read only once during the lightweight snapshot operation, and preparatory operations that occur at that time, such as flushing caches, are performed only once during the lightweight snapshot operation, even though this copy of application data, along with its metadata, may be used for multiple data management functions.
The service policy engine
Fig. 9 illustrates the service policy engine 406. The service policy engine includes the service policy scheduler 902, which examines all of the service level agreements configured by the user and makes scheduling decisions to satisfy them. It relies on several data stores to capture information and persist it over time, including, in some embodiments: an SLA store 904, where configured service level agreements are persisted and updated; a resource profile store 906, storing resource profiles that provide a mapping between logical storage pool names and actual storage pools; a protection catalog store 908, where information is cataloged about previous successful copies, created in various pools, that have not yet expired; and a centralized history store 910.
History store 910 is where historical information about past activities is saved for use by all data management applications, including the timestamp, order and hierarchy of previous copies of each application into the various storage pools. For example, a snapshot copy from a primary data store to a capacity-optimized data store that is initiated at 1 P.M. and is scheduled to expire at 9 P.M. will be recorded in history store 910 in a temporal data store that also includes linked object data for snapshots of the same source and target that took place at 11 A.M. and 12 P.M.
These stores are managed by the service policy engine. For example, when the user, through the management workstation, creates a service level agreement or modifies one of the policies within it, it is the service policy engine that persists this new SLA in its store and reacts to the modification by scheduling copies as dictated by the SLA. Similarly, when the service policy engine successfully completes a data movement job that results in a new copy of an application in a storage pool, the service policy engine updates the history store, so that this copy will be factored into future decisions.
The preferred embodiment of the various stores used by the service policy engine is in the form of tables in a relational database management system in close proximity to the service policy engine. This ensures consistent transactional semantics when querying and updating the stores, and allows flexibility in retrieving interdependent data.
The scheduling algorithm for the service policy scheduler 902 is illustrated in Fig. 10. When the service policy scheduler decides that it needs to make a copy of application data from one storage pool to another, it initiates a data movement requestor and monitor task 912. These tasks are not recurring tasks and terminate when they are completed. Depending on the way the service level policies are specified, a plurality of these requestors might be operational at the same time.
The service policy scheduler considers the priorities of service level agreements when determining which additional tasks to undertake. For example, if one service level agreement has a high priority because it specifies the protection for a mission-critical application, whereas another SLA has a lower priority because it specifies the protection for a test database, the service policy engine may choose to run only the protection for the mission-critical application, and may postpone or even entirely skip the protection for the lower priority application. This is accomplished by the service policy engine scheduling a higher priority SLA ahead of a lower priority SLA. In the preferred embodiment, in such a situation, the service policy engine will also trigger a notification event to the management workstation for auditing purposes.
The policy scheduling algorithm
Fig. 10 illustrates the flowchart of the policy schedule engine. The policy schedule engine continuously cycles through all of the SLAs defined. When it reaches the end of all of the SLAs, it sleeps for a short while, e.g. 10 seconds, and then resumes looking through the SLAs again. Each SLA encapsulates the complete data protection business requirements for one application; thus all of the SLAs together represent all of the applications.
For each SLA, the schedule engine, whose process starts at 1000, iterates to the next SLA in the set of SLAs at 1002 and collects together all of the service level policies that have the same source pool and destination pool at 1004. Taken together, this subset of the service level policies represents all of the requirements for a copy from that source storage pool to that particular destination storage pool.
In the middle of this subset of service level strategy, the service strategy scheduler abandons and not can be applicable to today or the strategy outside its running time.In the middle of the strategy stayed, find and have the strategy (1006) of short frequency, and based on historical data with in historical storage storehouse 910, find the strategy with the longest maintenance that needs then operation (1008).
Then, a series of inspection 1010-1014 are arranged, it stops the new copy of making application data at this moment, because new copy does not also expire, because copy is underway or because do not have new data to copy.If any in these conditions is applicable, the service strategy scheduler moves to the new combination in source and destination pond 1004.If in these conditions, neither one is applicable, new copy is initiated.That in copy as the respective service level strategy in this SLA1016, stipulates is performed.
The scheduler then moves to the next source and destination pool combination for the same service level agreement 1018. If there are no more distinct combinations, the scheduler moves on to the next service level agreement 1020.
After the service policy scheduler has been through all source/destination pool combinations of all service level agreements, it pauses for a short period and then resumes the cycle.
A simple example system with a snapshot store and a backup store, and with only two policies defined, would interact with the service policy scheduler as follows. Given two policies, one stating "back up every hour, and keep the backup for 4 hours" and the other stating "back up every 2 hours, and keep the backup for 8 hours", the result is a single snapshot taken each hour. Each snapshot is copied to the backup store, but snapshots are retained for different lengths of time in both the snapshot store and the backup store. The system administrator schedules the "back up every 2 hours" policy to go into effect at 12:00 noon.
At 4:00 in the afternoon, when the service policy scheduler begins operating at step 1000, it finds the two policies at step 1002. (Both policies apply because a multiple of two hours has elapsed since 12:00 noon.) There is only one source and destination pool combination at step 1004. There are two frequencies at step 1006, and the system selects the 1-hour frequency because it is shorter than the 2-hour frequency. There are two operations with different retention periods at step 1008, and the system selects the operation with the 8-hour retention period because it has the longer retention value. Instead of making one copy to satisfy the 4-hour requirement and another copy to satisfy the 8-hour requirement, the two requirements are coalesced into the longer 8-hour requirement and are satisfied by a single snapshot copy operation. The system determines at step 1010 that a copy is due, and checks the relevant objects in the history store 910 to determine whether the copy has already been made at the target (at step 912) and at the source (at step 914). If these checks pass, the system initiates the copy at step 916, and in the process triggers a snapshot to be made and saved in the snapshot store. The snapshot is then copied from the snapshot store to the backup store. The system then goes to sleep (1022) and wakes up again after a short period, for example 10 seconds. The result is a copy at the backup store and a copy at the snapshot store, where every even-hour snapshot lasts for 8 hours and every odd-hour snapshot lasts for 4 hours. The even-hour snapshots at the backup store and the snapshot store are both tagged with the 8-hour retention period, and will be automatically deleted from the system by another process at that time.
Note that there is no reason to take two snapshots or make two backup copies at 2 o'clock, even though both policies apply, because both policies are satisfied by a single copy. Combining and coalescing these snapshots reduces unneeded operations while retaining the flexibility of multiple separate policies. It may also be helpful to have two policies with different retention periods active at the same time for the same target. In the example given, more hourly copies are kept than two-hourly copies, which gives finer granularity for restores to times close to the present. For example, in the system above, if damage from earlier in the afternoon is discovered at 7:30 in the afternoon, a backup is available for each of the past four hours: 4, 5, 6 and 7 o'clock. Two more backups will also have been retained, from 2:00 in the afternoon and 12:00 noon.
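The coalescing in this worked example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the scheduler of Figure 10: the names `Policy` and `coalesce` are hypothetical, the hourly policy is assumed to start at noon like the two-hourly one, and the pool grouping and history-store checks are omitted.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Policy:
    frequency_hours: int  # how often a copy is required
    retention_hours: int  # how long each copy is kept
    start_hour: int       # hour of day at which the policy takes effect (assumed field)


def coalesce(policies: List[Policy], hour: int) -> Optional[Tuple[int, int]]:
    """Return (frequency, retention) of the single copy to make at `hour`,
    or None when no policy is due."""
    due = [
        p for p in policies
        if hour >= p.start_hour and (hour - p.start_hour) % p.frequency_hours == 0
    ]
    if not due:
        return None
    # Shortest frequency decides when to copy; longest retention decides how
    # long the one shared copy is kept.
    return (min(p.frequency_hours for p in due),
            max(p.retention_hours for p in due))


hourly = Policy(frequency_hours=1, retention_hours=4, start_hour=12)
two_hourly = Policy(frequency_hours=2, retention_hours=8, start_hour=12)

# One coalesced copy per hour from 12:00 to 19:00 (24-hour clock).
schedule = {hour: coalesce([hourly, two_hourly], hour) for hour in range(12, 20)}

# Copies still retained at 19:30, that is, 7:30 in the afternoon.
available = [hour for hour, (_, keep) in schedule.items() if hour + keep > 19.5]
```

Under these assumptions, `available` lists the copies from 12, 14, 16, 17, 18 and 19 o'clock, which matches the six backups described in the example.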
Content addressable store
Figure 11 is a block diagram of the modules that implement the content addressable store for the content addressable provider 510.
The implementation of the content addressable store 510 provides a storage resource pool that is optimized for capacity rather than for copy-in or copy-out speed, as would be the case for the performance-optimized pool implemented through snapshots described earlier, and it is therefore typically used for offline backup, replication and remote backup. Content addressable storage provides a way of storing common subsets of different objects only once, where those common subsets may be of varying sizes but are typically as small as 4 kilobytes. The storage overhead of a content addressable store is low compared with a snapshot store, although the access time is usually higher. Generally, objects in a content addressable store have no intrinsic relationship to one another, even though they may share a large percentage of their content. In this implementation, however, a history relationship is also maintained, and it enables the various optimizations described below. This contrasts with a snapshot store, where snapshots intrinsically form a chain, each storing only the deltas from a previous snapshot or baseline copy. In particular, the content addressable store stores only one copy of a data subset that is repeated multiple times within a single object, whereas a snapshot-based store stores at least one full copy of any object.
The content addressable store 510 is a software module that executes on the same system as the pool manager, either in the same process or in a separate process that communicates via a local transport such as TCP. In this embodiment, the content addressable store module runs in a separate process so as to minimize the impact of software failures in different components.
The purpose of this module is to allow data storage objects 403 to be stored in a highly space-efficient manner by deduplicating content (that is, by ensuring that content repeated within a single data object or across multiple data objects is stored only once).
The content addressable store module provides services to the pool manager via a programmatic API. These services comprise the following:
Object-to-handle mapping 1102: an object can be created by writing data into the store via the API; once the data has been written completely, the API returns an object handle determined by the content of the object. Conversely, data can be read as a stream of bytes from an offset within an object by providing the handle. Details of how the handle is constructed are explained in the description of Figure 12.
Temporal tree management 1104 tracks parent/child relationships between stored data objects. When a data object is written into the store 510, the API allows it to be linked as a child to a parent object already in the store. This indicates to the content addressable store that the child object is a modification of the parent. A single parent may have multiple children with different modifications, as might happen, for example, if an application's data were saved into the store regularly for some time, and an early copy were then restored and used as a new starting point for subsequent modifications. Temporal tree management operations and the data model are described in more detail below.
The difference engine 1106 can generate a summary of the regions that differ between any two objects in the store. The differencing operation is invoked via an API call that specifies the handles of the two objects to be compared, and the difference summary takes the form of a sequence of callbacks carrying the offset and size of successive differing sections. The difference is calculated by comparing the hashed representations of the two objects in parallel.
The garbage collector 1108 is a service that analyzes the store to find saved data that is not referenced by any object handle, and reclaims the storage space committed to that data. It is the nature of a content addressable store that much data is referenced by multiple object handles, that is, the data is shared between data objects; some data is referenced by a single object handle; but data that is referenced by no object handle (as may be the case if an object handle has been deleted from the content addressable system) can safely be overwritten by new data.
The object replicator 1110 is a service that duplicates data objects between two different content addressable stores. Multiple content addressable stores may be used to satisfy additional business requirements, such as offline backup or remote backup.
These services are implemented using the functional modules shown in Figure 11. The data hash module 1112 generates fixed-length keys for data chunks up to a fixed size limit. For example, in this embodiment the maximum size of a chunk for which the hash generator will produce a key is 64 KiB. The fixed-length key is either a hash, tagged to indicate the hashing scheme used, or a lossless algorithmic encoding. The hashing scheme used in this embodiment is SHA-1, which generates a cryptographic hash with a uniform distribution and a probability of hash collision close enough to zero that no facility needs to be incorporated into this system to detect and handle collisions.
The data handle cache 1114 is a software module that manages an in-memory database providing ephemeral storage for data and for handle-to-data mappings.
The persistent handle management index 1104 is a reliable persistent database of CAH (content addressable handle) to data mappings. In this embodiment it is implemented as a B-tree that maps hashes from the hash generator to the pages in the persistent data store 1118 that contain the data for each hash. Because the full B-tree cannot be held in memory at one time, this embodiment also uses an in-memory Bloom filter, for efficiency, to avoid expensive B-tree searches for hashes that are known not to be present.
The persistent data storage module 1118 stores data and handles to long-term persistent storage and returns a token indicating where the data is stored. The handle/token pair is subsequently used to retrieve the data. As data is written to persistent storage, it passes through a layer of lossless data compression 1120, implemented in this embodiment using zlib, and an optional layer of reversible encryption 1122, which is not enabled in this embodiment.
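The role of the in-memory filter in front of the persistent handle management index can be sketched as follows. This is an illustrative sketch only: the class names `BloomFilter` and `HandleIndex`, the filter size and the number of hash functions are assumptions, and a Python dict stands in for the on-disk B-tree.

```python
import hashlib
from typing import Dict, Iterator, Optional


class BloomFilter:
    """Small Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits: int = 1 << 16, num_hashes: int = 4) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, key: bytes) -> Iterator[int]:
        # Derive one bit position per salted SHA-1 digest of the key.
        for salt in range(self.num_hashes):
            digest = hashlib.sha1(bytes([salt]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: bytes) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


class HandleIndex:
    """Hash-to-page index; the dict is a stand-in for the persistent B-tree."""

    def __init__(self) -> None:
        self._btree: Dict[bytes, int] = {}
        self._filter = BloomFilter()
        self.btree_lookups = 0  # counts the "expensive" searches actually made

    def put(self, key: bytes, page: int) -> None:
        self._btree[key] = page
        self._filter.add(key)

    def get(self, key: bytes) -> Optional[int]:
        if not self._filter.might_contain(key):
            return None  # definitely absent: skip the B-tree search
        self.btree_lookups += 1
        return self._btree.get(key)
```

A lookup for a hash that was never inserted usually returns without touching the B-tree stand-in; a false positive in the filter costs only one wasted search and never a wrong answer.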
For example, copying a data object into the content addressable store is an operation provided by the object/handle mapper service, since an incoming object is stored and a handle is returned to the requester. The object/handle mapper reads the incoming object, requests hashes from the data hash generator, stores the data to persistent data storage and stores the handle to the persistent handle management index. The data handle cache is kept updated for fast future lookups of the data for a handle. Data stored to persistent data storage is compressed and (optionally) encrypted before being written to disk. Typically, a request to copy in a data object also invokes the temporal tree management service to create a history record for the object, and this record is likewise persisted via persistent data storage.
As another example, copying a data object out of the content addressable store, given its handle, is another operation provided by the object/handle mapper service. The handle is looked up in the data handle cache to locate the corresponding data; if the data is missing from the cache, the persistent index is used; once the data has been located on disk, it is retrieved via the persistent data storage module (which decrypts and decompresses the data on disk) and then reconstituted for return to the requester.
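The copy-in and copy-out paths described in these two examples can be sketched with the standard `hashlib` and `zlib` modules. This is a simplified sketch under assumptions: `TinyStore`, its methods and the 4 KiB chunk size are hypothetical, dicts stand in for the cache and the persistent storage, the "handle" is just a flat list of chunk hashes rather than the structure of Figure 12, and encryption and the temporal tree record are omitted.

```python
import hashlib
import zlib
from typing import Dict, List

CHUNK_SIZE = 4096  # assumed chunk size for this sketch


class TinyStore:
    def __init__(self) -> None:
        self._cache: Dict[bytes, bytes] = {}       # data handle cache stand-in
        self._persistent: Dict[bytes, bytes] = {}  # compressed chunks "on disk"

    def copy_in(self, data: bytes) -> List[bytes]:
        """Store an object and return the list of chunk hashes that identifies it."""
        hashes = []
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            key = hashlib.sha1(chunk).digest()
            if key not in self._persistent:
                # Only previously unseen content is compressed and stored.
                self._persistent[key] = zlib.compress(chunk)
            self._cache[key] = chunk
            hashes.append(key)
        return hashes

    def copy_out(self, hashes: List[bytes]) -> bytes:
        """Reconstitute an object: cache first, persistent storage on a miss."""
        parts = []
        for key in hashes:
            chunk = self._cache.get(key)
            if chunk is None:
                chunk = zlib.decompress(self._persistent[key])
            parts.append(chunk)
        return b"".join(parts)

    def stored_chunks(self) -> int:
        return len(self._persistent)
```

For an object made of three identical chunks followed by one different chunk, `copy_in` returns four hashes while only two chunks are actually stored.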
Content addressable store handle
Figure 12 shows how the handle for a content addressed object is generated. The data object manager references all content addressable objects by a content addressable handle. This handle consists of three parts. The first part 1201 is the size of the underlying data object that the handle points to directly. The second part 1202 is the depth of the object it points to. The third part 1203 is a hash of the object it points to. Field 1203 optionally includes a tag indicating that the hash is a lossless encoding of the underlying data. The tag indicates the encoding scheme used, for example a form of run-length encoding (RLE) used as an algorithmic encoding when the data chunk can be fully represented as a sufficiently short RLE. If the underlying data object is too large to be represented as a lossless encoding, a mapping from the hash to a pointer or reference to the data is stored separately in the persistent handle management index 1104.
The data of a content addressable object is divided into chunks 1204. The size of each chunk must be addressable by one content addressable handle 1205. The data is hashed by the data hash module 1102, and the hash of the chunk is used to produce the handle. If the data of the object fits in one chunk, the handle created is the final handle of the object. If not, the handles themselves are grouped together into chunks 1206 and a hash is generated for each group of handles. This grouping of handles continues (1207) until only one handle 1208 is produced, which is then the handle of the object.
When object will be rebuild from the content handle (memory resource pool copy out operation), top content handle by dereference to obtain the list of next stage content handle.These again by dereference to obtain the other list of content handle, until the degree of depth 0 handle is obtained.These expand to data by the handle of searching in handle management index or cache memory, or (at the algorithm Hash for example in the situation of length of stroke programming) expands to full content definitely.
Temporal tree management
Figure 13 illustrates the temporal tree relationship created for data objects stored in the content addressable store. This particular data structure is used only within the content addressable store. The temporal tree management module maintains data structures 1302 in the persistent store that associate each content addressed data object with a parent (which may be null, to indicate the first in a sequence of revisions). Each individual node of the tree contains a single hash value. This hash value references a chunk of data if the hash is a depth-0 hash, or a list of other hashes if the hash is a depth-1 or higher hash. The references mapped to a hash value are contained in the persistent handle management index 1104. In some embodiments the edges of the tree may have weights or lengths, which may be used in an algorithm for finding neighbors.
This is a standard tree data structure, and the module supports the standard manipulation operations, in particular: 1310 Add: adding a leaf below a parent, which changes the tree from the initial state 1302 to the after-add state 1304; and 1312 Remove: removing a node (and reparenting its children to its parent), which changes the tree from the after-add state 1304 to the after-remove state 1306.
The "Add" operation is used whenever an object is copied into the CAS from an external pool. If the copy-in is performed via the optimal way for data backup, or if the object originates in a different CAS pool, a predecessor object must be specified, and the Add operation is invoked to record this predecessor/successor relationship.
The "Remove" operation is invoked by the object manager when the policy manager determines that an object's retention period has expired. This may leave data stored in the CAS with no object in the temporal tree referring to it, so that a subsequent garbage collection pass can free the storage space of that data for reuse.
Note that a single predecessor may have multiple successors, or child nodes. For example, this may occur if an object is originally created at time T1 and modified at time T2, the modifications are then rolled back via a restore operation, and subsequent modifications are made at time T3. In this example, state T1 has two children, state T2 and state T3.
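The Add and Remove operations, including the reparenting of children on removal, can be sketched as below. The class name `TemporalTree` and the use of plain strings as node identifiers are assumptions made for illustration; in the described system each node would carry an object hash and the structure would live in the persistent store.

```python
from typing import Dict, List, Optional


class TemporalTree:
    def __init__(self) -> None:
        self.parent: Dict[str, Optional[str]] = {}  # node -> parent (None = first revision)
        self.children: Dict[str, List[str]] = {}    # node -> child nodes

    def add(self, node: str, parent: Optional[str] = None) -> None:
        """Add a leaf below `parent` (operation 1310)."""
        if parent is not None and parent not in self.parent:
            raise KeyError(f"unknown parent: {parent}")
        self.parent[node] = parent
        self.children[node] = []
        if parent is not None:
            self.children[parent].append(node)

    def remove(self, node: str) -> None:
        """Remove a node and reparent its children to its parent (operation 1312)."""
        parent = self.parent.pop(node)
        orphans = self.children.pop(node)
        if parent is not None:
            self.children[parent].remove(node)
        for child in orphans:
            self.parent[child] = parent
            if parent is not None:
                self.children[parent].append(child)
```

In the T1/T2/T3 example, adding T2 and T3 below T1 gives T1 two children; if a later revision T4 hangs below T2 and T2 expires, removing T2 leaves T4 attached directly to T1.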
Different CAS pools may be used to accomplish different business objectives, such as providing disaster recovery at a remote location. When copying from one CAS to another CAS, the copy may be sent as hashes and offsets, to take advantage of the native deduplication capabilities of the target CAS. The underlying data pointed to by any new hashes is also sent on an as-needed basis.
The temporal tree structure is read or navigated as part of the implementation of various services:
● Garbage collection navigates the tree in order to reduce the cost of the "mark" phase, as described below.
● Replication to a different CAS pool finds a set of near neighbors in the temporal tree that are also known to have already been transferred to the other CAS pool, so that only a small set of differences needs to be transferred additionally.
● The optimal way for data restore uses the temporal tree to find a predecessor that can serve as the basis for the restore operation.
In the CAS temporal tree data structure, children are subsequent versions, for example as dictated by the archive policy. Multiple children are supported on the same parent node; this case may arise when a parent node is changed, then used as the basis for a restore, and subsequently changed again.
CAS difference engine
The CAS difference engine 1106 compares two objects identified by hash values or handles, as in Figures 11 and 12, and produces a sequence of offsets and extents within the objects where the object data is known to differ. This sequence is obtained by traversing the two object trees in parallel in the hash data structure of Figure 12. The tree traversal is a standard depth-first or breadth-first traversal. During the traversal, the hashes at the current depth are compared. Where the hash of a node is identical on both sides, there is no need to descend the tree further, so the traversal can be pruned. If the hash of a node is not identical, the traversal continues by descending into the next lower level of the tree. If the traversal reaches a depth-0 hash that is not identical to its counterpart, the absolute offset into the data objects being compared at which the non-identical data occurs, together with the data length, is emitted into the output sequence. If one object is smaller than the other, its traversal completes earlier, and all subsequent offsets encountered in the traversal of the other object are emitted as differences.
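The pruned parallel traversal can be sketched as follows. This is a sketch under assumptions rather than the engine itself: `Node`, `build` and `diff` are hypothetical names, the chunk size and fan-out are tiny so the example stays readable, both trees are built to the same depth by the caller, and a region that exists in only one object is reported as a single extent instead of chunk by chunk.

```python
import hashlib
from dataclasses import dataclass
from typing import Iterator, Optional, Tuple

CHUNK = 4   # assumed depth-0 chunk size, tiny for demonstration
FANOUT = 2  # assumed number of child hashes per higher-depth node


@dataclass(frozen=True)
class Node:
    digest: bytes
    size: int
    children: tuple = ()  # empty for depth-0 nodes


def build(data: bytes, depth: int) -> Optional[Node]:
    """Hash tree with exactly `depth` levels above the depth-0 chunks.
    The caller must pick `depth` so that len(data) <= CHUNK * FANOUT ** depth."""
    if not data:
        return None
    if depth == 0:
        return Node(hashlib.sha1(data).digest(), len(data))
    span = CHUNK * FANOUT ** (depth - 1)  # bytes covered by each child
    kids = tuple(build(data[i:i + span], depth - 1)
                 for i in range(0, len(data), span))
    digest = hashlib.sha1(b"".join(k.digest for k in kids)).digest()
    return Node(digest, len(data), kids)


def diff(a: Optional[Node], b: Optional[Node], depth: int,
         offset: int = 0) -> Iterator[Tuple[int, int]]:
    """Yield (absolute offset, length) for regions where the objects differ."""
    if a is None and b is None:
        return
    if a is None or b is None:
        present = a if a is not None else b
        yield offset, present.size  # one object ended earlier
        return
    if a.digest == b.digest:
        return  # identical subtree: prune the traversal here
    if depth == 0:
        yield offset, max(a.size, b.size)
        return
    span = CHUNK * FANOUT ** (depth - 1)
    for i in range(max(len(a.children), len(b.children))):
        kid_a = a.children[i] if i < len(a.children) else None
        kid_b = b.children[i] if i < len(b.children) else None
        yield from diff(kid_a, kid_b, depth - 1, offset + i * span)
```

Comparing two 16-byte objects that differ only in their second 4-byte chunk yields the single extent (4, 4); the subtree covering the last eight bytes is never descended into.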
Garbage collection via differencing
As described under Figure 11, the garbage collector is a service that analyzes a particular CAS store to find saved data that is not referenced by any object handle in the temporal data structure of the CAS store, and reclaims the storage space committed to that data. Garbage collection uses a standard "mark and sweep" approach. Because the "mark" phase may be quite expensive, the algorithm used for the mark phase attempts to minimize marking the same data multiple times, even though that data may be referenced many times. The mark phase must nevertheless be complete, ensuring that no referenced data is left unmarked, because this would result in data loss from the store: after the sweep phase, unmarked data would later be overwritten by new data.
The algorithm for marking referenced data relies on the fact that objects in the CAS are arranged in graphs with temporal relationships, using the data structure depicted in Figure 13. Objects that share an edge in these graphs are likely to differ in only a small subset of their data, and it is also rare for a new data chunk that appears when an object is created from its predecessor to appear again between any two other objects. The mark phase of garbage collection therefore processes each connected component of the temporal graph.
Figure 14 is an example of garbage collection using temporal relationships in certain embodiments. A depth-first search, represented by the arrows 1402, is made of the data structure containing the temporal relationships. A starting node 1404 is taken, and the tree traversal begins from it. Node 1404 is the tree root and references no object. Node 1406 contains references to objects H1 and H2, denoting a hash value for object 1 and a hash value for object 2. All depth-0, depth-1 and higher data objects referenced by node 1406 (here H1 and H2) are enumerated and marked as referenced.
Next, node 1408 is processed. Because it shares an edge with node 1406, which has already been marked, the difference engine is applied to the difference between the object referenced by 1406 and the object referenced by 1408, yielding the set of depth-0, depth-1 and higher hashes that exist in the unmarked object but not in the marked object. In the figure, the hash that exists in node 1408 but not in node 1406 is H3, so H3 is marked as referenced. This procedure continues until all edges are exhausted.
A comparison of the results produced by the prior art algorithm 1418 and by the present embodiment 1420 shows that, when node 1408 is processed by the prior art algorithm, the previously seen hashes H1 and H2 are emitted into the output stream together with the new hash H3. The present embodiment 1420 does not emit previously seen hashes into the output stream, so that only the new hashes H3, H4, H5, H6 and H7 are emitted, with a corresponding improvement in performance. Note that this method does not guarantee that data will not be marked more than once. For example, if hash value H4 occurs independently in node 1416, it will be independently marked a second time.
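The mark phase described above can be sketched as a depth-first walk that emits, for each node, only the hashes absent from its already-marked parent. The sketch is illustrative: the function name `mark`, the node names and the hash sets below are hypothetical and do not reproduce Figure 14, and a plain set difference stands in for the difference engine.

```python
from typing import Dict, FrozenSet, List, Set, Tuple


def mark(tree: Dict[str, List[str]],
         objects: Dict[str, FrozenSet[str]],
         root: str) -> Tuple[Set[str], List[str]]:
    """Return (marked hashes, sequence of hashes emitted while marking)."""
    marked: Set[str] = set()
    emitted: List[str] = []
    stack = [(root, None)]
    while stack:
        node, parent = stack.pop()
        current = objects.get(node, frozenset())
        parent_hashes = objects.get(parent, frozenset())
        # Only hashes that the parent does not reference need to be marked;
        # everything else was marked when the parent was processed.
        for digest in sorted(current - parent_hashes):
            emitted.append(digest)
            marked.add(digest)
        for child in reversed(tree.get(node, [])):
            stack.append((child, node))
    return marked, emitted


# Hypothetical temporal tree: the root references no object.
tree = {"root": ["n1"], "n1": ["n2", "n3"], "n2": ["n5"], "n3": ["n4"]}
objects = {
    "n1": frozenset({"H1", "H2"}),
    "n2": frozenset({"H1", "H2", "H3"}),
    "n5": frozenset({"H1", "H2", "H3", "H4"}),
    "n3": frozenset({"H1", "H4"}),
    "n4": frozenset({"H4", "H5"}),
}
marked, emitted = mark(tree, objects, "root")
naive_emissions = sum(len(hashes) for hashes in objects.values())
```

In this hypothetical tree the walk emits six hashes where enumerating every object in full would emit thirteen. H4 is still emitted twice, because it appears independently on two branches, which is the situation the text notes the method does not rule out.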
Copying an object into the CAS
Copying an object from another pool into the CAS uses the software modules described in Figure 11 to produce a data structure referenced by an object handle as in Figure 12. The input to the process is (a) a sequence of chunks of data at specified offsets, sized appropriately for making depth-0 handles, and optionally (b) a previous version of the same object. Implicitly, the new object is identical to the previous version except where input data is provided and itself differs from the previous version. The algorithm for the copy-in operation is illustrated in the flowchart of Figure 15.
If a previous version (b) is provided, the sequence (a) may be a sparse set of changes from (b). When the object to be copied is known to differ from a previous object at only a few points, this can greatly reduce the amount of data that needs to be copied in, and therefore reduce the computation and I/O activity required. This is the case, for example, when the object is copied in via the optimal way for data backup described previously.
Even if the sequence (a) includes sections that are largely unchanged from a predecessor, identifying the predecessor (b) allows the copy-in procedure to check quickly whether the data has indeed changed, and therefore to avoid duplicating data at a finer level of granularity than may be possible for the difference engine of some other storage pool that provides input to the CAS.
Implicitly, then, the new object is identical to the previous version except where input data is provided and itself differs from the previous version. The algorithm for the copy-in operation is illustrated in the flowchart of Figure 15.
The process starts at step 1500, when an arbitrarily sized data object in the temporal store is provided, and proceeds to step 1502, which enumerates any and all hashes (depth-0 through the highest level) referenced by the hash value of the predecessor object, if one is provided. This enumeration is used as a quick check to avoid storing data that is already contained in the predecessor.
At step 1504, if a predecessor has been input, a reference to a clone of it is created in the temporal data structure of the content addressable data store. This clone will be updated to become the new object. The new object thus becomes a copy of the predecessor, modified by the differences copied into the CAS from the source pool of the copy.
At steps 1506 and 1508, the data mover 502 pushes the data into the CAS. The data is accompanied by an object reference and an offset, which is the target location of the data. The data may be sparse, because only the differences from the predecessor need to be moved into the new object. At this point the incoming data is broken into depth-0 chunks sized small enough that each can be represented by a single depth-0 hash.
At step 1510, the data hash module generates a hash for each depth-0 chunk.
At step 1512, the predecessor hash at the same offset is read. If the hash of the data matches the hash of the predecessor at the same offset, no data needs to be stored, and the depth-1 and higher objects do not need to be updated for this depth-0 chunk. In this case, the process returns to accept the next depth-0 chunk of data. This achieves temporal deduplication without an expensive global lookup. Even though the source system ideally sends only the differences from the data previously stored in the CAS, this check may be necessary if the source system performs differencing at a different level of granularity, or if the data is marked as changed but has been changed back to its previously stored value. Differencing may be performed at a different level of granularity if, for example, the source system is a snapshot pool that creates deltas on 32 KiB boundaries while the CAS store creates hashes on 4 KiB chunks.
If no match is found, the data can be hashed and stored. Data is written starting at the provided offset and ending once the new data has been exhausted. Once the data has been stored, at step 1516, if the offset is still contained within the same depth-1 object, the depth-1, depth-2 and all higher objects 1518 are updated, generating new hashes at each level, and the depth-0, depth-1 and all higher objects are stored in a local cache at step 1514.
However, at step 1520, if the amount of data to be stored exceeds the depth-1 chunk size and the offset is to be contained in a new depth-1 object, the current depth-1 object must be flushed to the store, unless it is determined to be stored there already. It is first looked up in the global index 1116. If it is found there, the depth-1 object and all associated depth-0 objects are removed from the local cache, and processing proceeds with the new chunk 1522.
At step 1524, as a quick check to avoid visiting the global index, the hash of each depth-0, depth-1 and higher object in the local cache is looked up in the local store established at step 1502. Any that match are discarded.
At step 1526, the hash of each depth-0, depth-1 and higher object in the local cache is looked up in the global index 1116. Any that match are discarded. This ensures that data is deduplicated globally.
At step 1528, all remaining content from the local cache is stored into the persistent store, and processing then continues with the new chunk.
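The order of the deduplication checks in the copy-in procedure can be sketched as below. This is a heavily simplified sketch and not the flowchart of Figure 15: objects are represented as flat lists of depth-0 hashes (no depth-1 or higher objects and no local cache flushing), a dict stands in for the global index and persistent store, and the function name `copy_in`, the chunk size and the statistics keys are assumptions. Step numbers in the comments only indicate which described step each check loosely corresponds to.

```python
import hashlib
from typing import Dict, Iterable, List, Optional, Tuple

CHUNK = 4096  # assumed depth-0 chunk size


def copy_in(changes: Iterable[Tuple[int, bytes]],
            predecessor: Optional[List[bytes]],
            global_index: Dict[bytes, bytes]) -> Tuple[List[bytes], Dict[str, int]]:
    """Apply (offset, data) changes on top of a predecessor's depth-0 hash list."""
    new_object = list(predecessor or [])  # step 1504: clone the predecessor
    predecessor_hashes = set(new_object)  # step 1502: local quick-check set
    stats = {"same_offset": 0, "in_predecessor": 0, "in_global": 0, "stored": 0}
    for offset, data in changes:          # steps 1506/1508: pushed data
        if offset % CHUNK:
            raise ValueError("offsets must be chunk-aligned in this sketch")
        for start in range(0, len(data), CHUNK):
            chunk = data[start:start + CHUNK]
            slot = (offset + start) // CHUNK
            if slot > len(new_object):
                raise ValueError("this sketch does not support holes")
            digest = hashlib.sha1(chunk).digest()      # step 1510
            if slot < len(new_object) and new_object[slot] == digest:
                stats["same_offset"] += 1              # step 1512: unchanged
                continue
            if digest in predecessor_hashes:           # step 1524: local check
                stats["in_predecessor"] += 1
            elif digest in global_index:               # step 1526: global check
                stats["in_global"] += 1
            else:
                global_index[digest] = chunk           # step 1528: store
                stats["stored"] += 1
            if slot < len(new_object):
                new_object[slot] = digest
            else:
                new_object.append(digest)
    return new_object, stats
```

When a three-chunk object is copied in and a second version then supplies one changed chunk and one unchanged chunk, only the changed chunk is stored; the unchanged chunk is caught by the same-offset comparison before any index is consulted.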
Reading an object out of the CAS is a simpler process, and is common to many implementations of CAS. The handle of the object is mapped to a persistent data object via the global index, and the required offset is read from within this persistent data. In some cases it may be necessary to recurse through several depths of the object handle tree.
CAS object network replication
As described under Figure 11, the replicator 1110 is a service that duplicates data objects between two different content addressable stores. Replication could be achieved by reading out of one store and writing back into another, but this architecture allows more efficient replication over a limited-bandwidth connection such as a local area or wide area network.
A replicating system operating on each CAS store uses the difference engine service described above together with the temporal relationship structure shown in Figure 13, and additionally stores, on a per-object basis in the temporal data structure used by the CAS store, a record of which remote stores the object has been replicated to. This provides definitive knowledge of the presence of an object at a given data store.
Using the temporal data structure, the system can determine which objects exist on which data stores. The data mover and the difference engine use this information to determine the minimal subset of data that must be sent over the network during a copy operation to bring a target data store up to date. For example, if data object O was copied at time T3 from a server in Boston to a remote server in Seattle, the protection catalog store 908 will record that object O at time T3 exists in both Boston and Seattle. At time T5, during a subsequent copy from Boston to Seattle, the temporal data structure is consulted to determine the previous state of object O in Seattle that should be used for differencing on the source server in Boston. The Boston server then takes the difference between T5 and T3 and sends that difference to the Seattle server.
The process of replicating an object A is then as follows. Identify an object A0 that is recorded as having already been replicated to the target store and that is a near neighbor of A in the local store. If no such object A0 exists, send A to the remote store and record it locally as having been sent. To send a local object to the remote store, a typical method, as embodied here, is: send all the hashes and offsets of the data chunks within the object; query the remote store as to which hashes represent data that is not present remotely; and send the required data to the remote store (in this embodiment, sending the data and hashes is implemented by encapsulating them in a TCP data stream).
Conversely, if A0 is identified, run the difference engine to identify the data chunks that are in A but not in A0. This should be a superset of the data that needs to be sent to the remote store. Send the hashes and offsets of the chunks that are in A but not in A0. Query the remote store as to which hashes represent data that is not present remotely, and send the required data to the remote store.
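The two replication paths can be sketched in memory as follows. The sketch rests on assumptions: `Store` and `replicate` are hypothetical names, objects are flat lists of chunk hashes, a position-by-position comparison with the basis object stands in for the difference engine, a Python set stands in for the per-object replication record, and direct dict access stands in for the TCP stream.

```python
import hashlib
from typing import Dict, List, Optional, Set, Tuple

CHUNK = 4096  # assumed chunk size


class Store:
    def __init__(self) -> None:
        self.chunks: Dict[bytes, bytes] = {}        # hash -> chunk data
        self.objects: Dict[str, List[bytes]] = {}   # object name -> chunk hashes

    def put(self, name: str, data: bytes) -> None:
        hashes = []
        for start in range(0, len(data), CHUNK):
            chunk = data[start:start + CHUNK]
            digest = hashlib.sha1(chunk).digest()
            self.chunks[digest] = chunk
            hashes.append(digest)
        self.objects[name] = hashes

    def read(self, name: str) -> bytes:
        return b"".join(self.chunks[h] for h in self.objects[name])

    def missing(self, hashes: List[bytes]) -> List[bytes]:
        """Answer the query: which of these hashes have no data here?"""
        return [h for h in hashes if h not in self.chunks]


def replicate(name: str, local: Store, remote: Store, replicated: Set[str],
              basis: Optional[str] = None) -> Tuple[int, int]:
    """Replicate `name`; return (hashes sent, bytes of chunk data sent)."""
    hashes = local.objects[name]
    if basis is not None and basis in replicated:
        base = local.objects[basis]
        # Send only hashes/offsets of chunks that differ from the basis.
        changed = [(i, h) for i, h in enumerate(hashes)
                   if i >= len(base) or base[i] != h]
        rebuilt = list(remote.objects[basis])[:len(hashes)]
    else:
        changed = list(enumerate(hashes))
        rebuilt = []
    wanted = set(remote.missing([h for _, h in changed]))
    bytes_sent = 0
    for digest in wanted:
        remote.chunks[digest] = local.chunks[digest]
        bytes_sent += len(local.chunks[digest])
    for i, digest in changed:
        if i < len(rebuilt):
            rebuilt[i] = digest
        else:
            rebuilt.append(digest)
    remote.objects[name] = rebuilt
    replicated.add(name)  # record locally that the object has been sent
    return len(changed), bytes_sent
```

Replicating a first three-chunk object sends three hashes and three chunks; replicating a successor that changes one chunk, with the first object as basis, sends one hash and one chunk.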
Sample deployment architecture
Figure 16 shows the software and hardware components of one embodiment of the data management virtualization (DMV) system. The software of the system executes as three distributed components:
The host agent software 1602a, 1602b, 1602c implements some of the application-specific modules described above. It executes on the same servers 1610a, 1610b, 1610c as the application whose data is under management.
The DMV server software 1604a, 1604b implements the remainder of the system as described here. It runs on a set of Linux servers 1612, 1614 that also provide highly available virtualized storage services.
The system is controlled by the management client software 1606, which runs on a desktop or laptop computer 1620.
These software components communicate with one another via network connections over an IP network 1628. Data management virtualization systems communicate with one another between the primary site 1622 and the data replication (DR) site 1624 over an IP network such as a public internet backbone.
The DMV systems at the primary and DR sites access one or more SAN storage systems 1616, 1618 via a fibre channel network 1626. The servers running the primary applications access the storage virtualized by the DMV systems via fibre channel over the fibre channel network or via iSCSI over the IP network. The DMV system at the remote DR site runs a parallel instance of the DMV server software 1604c on Linux server 1628. Linux server 1628 may also be an Amazon Web Services EC2 instance or another similar cloud computing resource.
Figure 17 is a diagram depicting the various components of a computerized system according to certain embodiments of the invention, on which certain elements may be implemented. The logical modules described may be implemented on a host computer 1701 that contains volatile memory 1702, a persistent storage device (for example, a hard disk drive 1708), a processor 1703, and a network interface 1704. Using the network interface, the system computer can interact with storage pools 1705, 1706 over a SAN or Fibre Channel device, among other embodiments. Although Figure 17 illustrates a system in which the system computer is separate from the various storage pools, some or all of the storage pools may be housed within the host computer, eliminating the need for a network interface. The programmatic processes may be executed on a single host, as shown in Figure 17, or they may be distributed across multiple hosts.
The host computer shown in Figure 17 may serve as an administrative workstation, may implement the application and the application-specific agent 402, may implement any and all logical modules described in this specification, including the data virtualization system itself, or may serve as a storage controller for exposing storage pools of physical media to the system. Workstations may be connected to a graphical display device 1707 and to input devices (for example, a mouse 1709 and a keyboard 1710). Alternatively, the active user's workstation may include a handheld device.
Throughout this specification we refer to software components, but all references to software components are intended to apply to software running on hardware. Likewise, the objects and data structures referred to in this specification are intended to apply to data structures actually stored in memory, whether volatile or non-volatile. Similarly, servers are intended to apply to software, and engines are intended to apply to software, all running on hardware such as the computer system described in Figure 17.
The foregoing has outlined some of the more pertinent features of the subject matter. These features should be construed as merely illustrative. Many other beneficial results can be attained by applying the disclosed subject matter in a different manner or by modifying the subject matter as will be described. For example, the disclosed garbage collection algorithm may incorporate other garbage collection optimization techniques, such as tri-color marking.

Claims (49)

1. A system for performing a plurality of prescribed data management functions in a manner that reduces redundant access operations to primary storage, the system comprising:
a data management engine for performing data management functions, including at least a snapshot function operable to create point-in-time images of primary storage data at secondary storage, and at least one backup function operable to create at least one backup copy of data, the data management engine being responsive to an electronic service level agreement (SLA) that specifies a schedule for performing the data management functions,
wherein a point-in-time image of data includes a baseline full image of the data at a specific point in time and references to difference data indicating changes to the data after the specific point in time, and
wherein, in response to the schedule requiring at least some data management functions to be performed concurrently, the data management engine creates a point-in-time image of the primary storage data and communicates the difference information of that point-in-time image to the secondary storage to update at least one of the backup copies of the primary data, such that the primary storage is accessed only once for all of the corresponding update sets to the secondary storage.
2. The system of claim 1, wherein the point-in-time image of primary storage data at the secondary storage is stored on performance-optimized secondary storage.
3. The system of claim 1, wherein the backup copy of the point-in-time image of primary storage data is stored on remote storage.
4. The system of claim 1, wherein the backup copy of the point-in-time image of primary storage data is stored at capacity-optimized storage.
5. The system of claim 4, wherein the backup copy of the point-in-time image of primary storage data is stored as a deduplicated image on the capacity-optimized storage.
6. The system of claim 1, wherein the difference data includes bitmap information, each bit of the bitmap corresponding to a portion of primary storage data, and includes new data provided for those portions for which the bitmap indicates that data has changed.
7. The system of claim 1, wherein the difference data includes extent information.
8. The system of claim 1, wherein the data management engine includes logic to invoke the primary storage to provide a point-in-time image of data and logic to retrieve the point-in-time image from the primary storage.
9. A system for managing data in accordance with service level agreements (SLAs) that specify a schedule for performing prescribed data management functions on a calendar basis, and for reducing redundancy among functions, the system comprising:
a data management engine for performing data management functions, including at least one snapshot function and at least one backup function, the data management engine including a service level policy engine that receives SLAs in electronic form and controls the scheduling of the data management functions in accordance with them,
wherein each electronic SLA is associated with a corresponding application that uses data, and wherein each SLA specifies at least one service level policy, each policy specifying a source pool of data, a destination pool in which a copy of the data in the source pool should be made, a copy frequency indicating how often the policy should run, a retention period indicating how long a given copy should be retained before being allowed to expire, and schedule information indicating the hours and days on which the policy is eligible to run, such that the set of policies in an SLA can express a non-uniform schedule for when given functions should be performed and can express multiple data management functions that should be performed on a given source of data, and
wherein the data management engine is operable to perform a preparation operation using the application and using the source pool, such that the source pool of data has a coherent image of the data to be copied, and wherein the preparation operation is performed once, even if the SLA specifies multiple data management functions to be performed on that source pool at the current time.
10. The system of claim 9, wherein, if two or more copy operations are scheduled to occur at the same time between the same source pool and destination pool, only one of the two or more copy operations is performed by the data management engine, and that copy is associated with the maximum retention time among the two or more scheduled copy operations.
11. The system of claim 9, wherein the preparation operation includes the data management engine collecting metadata about the application for storage in conjunction with the application data.
12. The system of claim 9, wherein the preparation operation includes a quiesce operation of the application.
13. The system of claim 12, wherein the quiesce operation of the application includes freezing the application so that it does not further update the application data.
14. The system of claim 12, wherein the quiesce operation of the application includes flushing application data from the I/O cache of the application server.
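The policy fields recited in claim 9 and the merging rule of claim 10 can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the `Policy` class and `due_copies` function are hypothetical names, time is reduced to an integer hour counter, and the eligibility window (hours and days on which a policy may run) is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """One service level policy; field names are illustrative, not from the source."""
    source_pool: str
    dest_pool: str
    frequency_hours: int   # how often the policy runs
    retention_days: int    # how long a resulting copy is kept


def due_copies(policies, hour):
    """Return one copy job per (source, destination) pair that is due at `hour`.

    When several policies between the same pools fire at the same time, a single
    copy is made and it inherits the longest retention among them.
    """
    jobs = {}
    for p in policies:
        if hour % p.frequency_hours != 0:
            continue  # this policy is not due at this hour
        key = (p.source_pool, p.dest_pool)
        jobs[key] = max(jobs.get(key, 0), p.retention_days)
    return jobs
```

For example, a daily and a weekly policy between the same two pools coincide every 168 hours; at that hour only one copy is scheduled and it carries the weekly policy's longer retention.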
15. A system for backing up data from a first storage pool to a second storage pool using difference information between time states, the system comprising:
a data management engine for performing data management functions, including at least one backup function to create a backup copy of data,
the data management engine being operable to perform a sequence of snapshot operations on the first storage pool to create point-in-time images of application data, each successive point-in-time image corresponding to a specific, successive time state of the application data, and each snapshot operation creating difference information that indicates which application data has changed and the content of the changed application data for the corresponding time state;
the data management engine being operable to perform at least one backup function on the application data, wherein the backup operations are scheduled to be performed at discrete time states,
wherein the data management engine is operable to maintain history information having time state information, the time state information indicating, for a corresponding backup copy of data, the time state of the last backup function performed on the application data; and
wherein the data management engine is operable to create composite difference information from the difference information for each time state between the time state of the last backup function performed on the application data and the time state of the currently scheduled backup function to be performed on the application data, and wherein the data management engine is operable to send the composite difference information to the second storage pool to be compiled with the backup copy of data of the last time state to create a backup copy of data for the current time state.
16. The system of claim 15, wherein the difference information includes bitmap information, each bit of the bitmap corresponding to a portion of primary storage data, and includes new data provided for those portions for which the bitmap indicates that data has changed.
17. The system of claim 15, wherein the difference information includes extent information.
18. The system of claim 15, wherein multiple backup functions are scheduled to occur concurrently, each backup function having a different gap of non-consecutive time states, and each backup function having different composite difference information produced to correspond to the different gap.
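The composite difference information of claim 15 can be illustrated with a minimal Python sketch. The representation is an assumption made for the example: time states are consecutive integers, and each snapshot's difference information is a dictionary mapping a block index to the new block content. The function names are hypothetical.

```python
def composite_difference(diffs, last_backup_state, current_state):
    """Merge per-snapshot differences for every state after the last backup,
    up to and including the current state. Later changes to a block override
    earlier ones, so each changed block appears once in the result."""
    composite = {}
    for state in range(last_backup_state + 1, current_state + 1):
        composite.update(diffs[state])
    return composite


def apply_difference(image, diff):
    """Compile a backup copy with difference information to produce the new copy."""
    updated = dict(image)
    updated.update(diff)
    return updated
```

Two backup functions with different gaps (as in claim 18) simply call `composite_difference` with different `last_backup_state` values; each receives difference information sized to its own gap, while the snapshots themselves are taken only once.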
19. A system for restoring data of a storage pool from a backup copy of data using difference information between time states, the system comprising:
a data management engine, wherein the data management engine is operable to maintain history information indicating the time states for which the storage pool has point-in-time images of application data; and
wherein the data management engine includes logic for restoring the application data of the storage pool to a point-in-time image of the data for a specified time state;
the data management engine being operable to identify the existence, at the storage pool, of a point-in-time image of the data for a time state earlier than the specified time state, and to send difference information from the backup copy of the data to the storage pool, the difference information indicating which application data has changed, and the content of the changed application data, between the specified time state and the earlier time state.
20. The system of claim 19, wherein the difference information includes bitmap information, each bit of the bitmap corresponding to a portion of primary storage data, and includes backup data provided for those portions for which the bitmap indicates that data has changed.
21. The system of claim 19, wherein the difference information includes extent information.
22. The system of claim 19, wherein the earlier time state and the specified time state are discrete time states.
23. A method of forming a deduplicated image of a data object that changes over time, using difference information between time states of the data object, the method comprising:
organizing the content of the data object for a first time state as a plurality of content segments and storing the content segments in a data store;
creating an organized arrangement of hash structures to represent the data object in its first time state, wherein, for a subset of the hash structures, each structure includes a hash signature of a corresponding content segment and is associated with a reference to the corresponding content segment, and wherein the logical organization of the arrangement represents the organization of the content segments as they are represented within the data object;
receiving difference information for the data object, the difference information indicating the content that has changed for a second time state of the data object relative to the first time state, and the difference information indicating the position of the changed content within the data object;
forming at least one hash signature of the changed content;
storing the changed content that is unique within the data store as a content segment;
modifying the organized arrangement of hash structures to incorporate a new structure with the at least one hash signature of the changed content, the new structure being incorporated at a position in the organized arrangement that corresponds to the position of the changed content within the data object as indicated in the difference information, and associating the hash signature of the new structure with a reference to the content segment corresponding to the changed content; and
associating the new structure with the second time state, whereby a deduplicated image of the data object for the second time state is stored without needing to receive a full image of the data object for the second time state.
24. The method of claim 23, wherein, after the at least one hash signature of the changed content is formed, the formed signature is compared with at least one hash signature in the organized arrangement of hash structures to determine whether the formed structure already exists in the organized arrangement.
25. The method of claim 23, wherein the comparison first occurs with the hash structure at the position in the organized arrangement that corresponds to the position of the changed content as indicated in the difference information.
26. The method of claim 23, wherein the organized arrangement of hash structures is an organized tree structure.
27. The method of claim 23, wherein an organized arrangement of temporal structures is maintained, each temporal structure being associated with a time state, and each temporal structure including information indicating the hash structures that correspond to the associated time state.
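The ingest of difference information recited in claim 23 can be sketched in Python as follows. This is a deliberately flattened illustration, not the claimed structure: a plain list of hash signatures stands in for the organized arrangement (claim 26 mentions a tree), a dictionary keyed by time state stands in for the temporal structures, changes are assumed to fall on existing segment positions, and the `DedupStore` class and its method names are hypothetical.

```python
import hashlib


class DedupStore:
    """Toy deduplicating store: unique segments plus a per-time-state layout."""

    def __init__(self):
        self.segments = {}  # hash signature -> content segment (stored once)
        self.states = {}    # time state -> hash signatures in object order

    def _store(self, content: bytes) -> str:
        sig = hashlib.sha1(content).hexdigest()
        self.segments.setdefault(sig, content)  # keep only content that is unique
        return sig

    def ingest_full(self, state, segments):
        """Record the first time state from a full image."""
        self.states[state] = [self._store(s) for s in segments]

    def ingest_diff(self, prev_state, new_state, diff):
        """Record a new time state from difference information only.

        `diff` maps a segment position to its changed content. Unchanged
        positions reuse the signatures of the previous state.
        """
        layout = list(self.states[prev_state])
        for position, content in diff.items():
            layout[position] = self._store(content)
        self.states[new_state] = layout

    def read(self, state) -> bytes:
        """Rebuild the object for a time state from its signatures."""
        return b"".join(self.segments[sig] for sig in self.states[state])
```

Only the changed segments are hashed and stored for the second time state, yet both time states remain readable in full, which is the effect the final step of claim 23 describes.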
28. A method of managing deduplicated images of data objects that change over time, the method comprising:
organizing the unique content of each data object as a plurality of content segments and storing the content segments in a data store;
for each data object, creating an organized arrangement of hash structures, wherein, for a subset of the hash structures, each structure includes a hash signature of a corresponding content segment and is associated with a reference to the corresponding content segment, wherein the logical organization of the arrangement represents the logical organization of the content segments as they are represented within the data object, and wherein another subset of the hash structures includes a hierarchy of hash signatures of the hash signatures of corresponding content segments, such that the organized arrangement can be traversed to determine whether content is represented by the organized arrangement of hash structures; and
for each data object, maintaining an organized arrangement of temporal structures to represent the corresponding data object as it changes over time, wherein each structure is associated with a time state of the data object, wherein the logical arrangement of the structures indicates the changing time states of the data object, and wherein each time state is associated with the hash structures that represent the content of the data object during that time state.
29. The method of claim 28, wherein the temporal structure of a data object for a given time state is associated with the hash structures of the data object content that has changed relative to an earlier time state of the data object.
30. The method of claim 28, wherein the hash structures of the changed data content are organized as a graph separate from the organized arrangement of hash structures of the earlier time state.
31. The method of claim 28, wherein the difference in content of a data object from one time state to another time state is determined by referencing the organized arrangement of temporal structures for the other time state and for all other time states between the other time state and the one time state, such that differences can be determined across multiple time states.
32. A method of storing a deduplicated image in which a portion of the image is stored in encoded form directly in a hash table, the method comprising:
organizing the unique content of each data object as a plurality of content segments and storing the content segments in a data store;
for each data object, creating an organized arrangement of hash structures, wherein, for a subset of the hash structures, each structure includes a field to contain a hash signature of a corresponding content segment and is associated with a reference to the corresponding content segment, wherein the logical organization of the arrangement represents the logical organization of the content segments as they are represented within the data object;
receiving content to be included in the deduplicated image of the data object;
determining whether the received content can be encoded using a predefined lossless encoding technique such that the encoded value will fit within the field used to contain a hash signature;
if so, placing the encoding in the field and marking the hash structure to indicate that the field contains encoded content of the deduplicated image;
if not, generating a hash signature of the received content, placing the hash signature in the field, and placing the received content in a corresponding content segment in the data store, provided that the received content is unique.
33. The method of claim 32, wherein the lossless encoding is run-length encoding.
34. The method of claim 32, wherein the hash signature of each data object is created by the SHA-1 cryptographic hash function.
35. The method of claim 32, further comprising subsequently reconstructing the content from the encoded content.
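The decision recited in claim 32 can be illustrated with a short Python sketch. It assumes a 20-byte field (the size of a SHA-1 digest, per claim 34) and a very restricted run-length encoding (per claim 33) that handles only a segment made of a single repeated byte. The function names and the tuple layout of the table entries are hypothetical.

```python
import hashlib
import struct

FIELD_SIZE = 20  # a SHA-1 digest occupies 20 bytes


def encode_segment(content: bytes):
    """Return (is_inline, field).

    A segment consisting of one repeated byte is run-length encoded as
    (byte value, run length) and padded to the field size; anything else
    is represented by its SHA-1 digest.
    """
    if content and content == content[:1] * len(content):
        field = struct.pack(">BQ", content[0], len(content))  # 1 + 8 bytes
        return True, field.ljust(FIELD_SIZE, b"\0")
    return False, hashlib.sha1(content).digest()


def decode_inline(field: bytes) -> bytes:
    """Rebuild the original content from an inline run-length field."""
    value, count = struct.unpack(">BQ", field[:9])
    return bytes([value]) * count


def put(table, data_store, content: bytes) -> None:
    """Append a hash-table entry; store the segment only when it is not inline
    and not already present in the data store."""
    inline, field = encode_segment(content)
    if not inline:
        data_store.setdefault(field, content)
    table.append((inline, field))
```

A block of zeros, common in sparse disk images, therefore costs one table entry and no content segment, and it can be rebuilt later from the field alone (claim 35).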
36. A method of updating a second deduplicating store with information from a first deduplicating store that represents how data objects have changed over time, the method comprising:
at the first deduplicating store, organizing the unique content of each data object as a plurality of content segments and storing the content segments in a data store;
at the first deduplicating store, for each data object, creating an organized arrangement of hash structures, wherein, for a subset of the hash structures, each structure includes a hash signature of a corresponding content segment and is associated with a reference to the corresponding content segment, wherein the logical organization of the arrangement represents the logical organization of the content segments as they are represented within the data object;
at the first deduplicating store, for each data object, maintaining an organized arrangement of temporal structures to represent the corresponding data object as it changes over time, wherein each structure is associated with a time state of the data object, wherein the logical arrangement of the structures indicates the changing time states of the data object, and wherein each time state is associated with the hash structures that represent the content of the data object that has changed relative to an earlier time state;
at the second deduplicating store, organizing the unique content of each data object as a plurality of content segments and storing the content segments in a data store;
at the second deduplicating store, for each data object, maintaining an organized arrangement of hash structures that is at least a subset of the hash structures at the first deduplicating store;
at the second deduplicating store, for each data object, maintaining an organized arrangement of temporal structures to represent the corresponding data object as it changes over time, wherein the organized arrangement of temporal structures is at least a subset of the temporal structures at the first deduplicating store, thereby representing a subset of the time states;
in response to a request to update the second deduplicating store with information from the first deduplicating store, finding a time state that is common to the first deduplicating store and the second deduplicating store and that is near the current state of the second deduplicating store; and
compiling a set of hash signatures of the content that has changed from the common state to the current time state of the first deduplicating store, and sending the set of hash signatures to the second deduplicating store, so that it can update its organized arrangement of hash structures to represent the content of the data object as of the current time state of the first deduplicating store.
37. The method of claim 36, further comprising maintaining a history of the hash signatures contained in each deduplicating store and, for hash signatures in the set of hash signatures that are new to the second deduplicating store, sending the corresponding content segments from the first deduplicating store, so that the second deduplicating store can update its data store with the new content.
38. The method of claim 36, wherein the time state near the current state of the second deduplicating store is a nearest-neighbor state of the current state.
39. The method of claim 36, wherein the time state near the current state of the second deduplicating store is an ancestor state of the current state.
40. The method of claim 36, wherein the time state near the current state of the second deduplicating store is a child state of the current state.
41. The method of claim 38, wherein the nearest-neighbor state is the state connected by a set of edges whose sum is lower than the sum of any other set of edges.
42. The method of claim 36, wherein the logical arrangement of the structures includes branches.
43. The method of claim 37, further comprising recording, at the current state, to which states the content segments corresponding to the current time state have been sent.
44. A method of performing garbage collection to identify content segments that are no longer referenced in a deduplicating storage system, wherein redundant mark operations of a mark-and-sweep technique are avoided, the method comprising:
organizing the unique content of each data object as a plurality of content segments in the deduplicating storage system;
for each data object, creating an organized arrangement of hash structures, wherein, for a subset of the hash structures, each structure includes a hash signature of a corresponding content segment and is associated with a reference to the corresponding content segment, wherein the logical organization of the arrangement represents the logical organization of the content segments as they are represented within the data object, and wherein another subset of the hash structures includes a hierarchy of hash signatures of the hash signatures of corresponding content segments, such that the organized arrangement can be traversed to determine whether content is represented by the organized arrangement of hash structures;
for each data object, maintaining an organized arrangement of temporal structures to represent the corresponding data object as it changes over time, wherein each structure is associated with a time state of the data object, wherein the logical arrangement of the structures indicates the changing time states of the data object, and wherein each time state is associated with the hash structures that represent the content of the data object that has changed relative to the previous time state of the data object;
for each content segment in the deduplicating storage system, clearing its corresponding garbage collection state;
iterating over the temporal structures and, for each temporal structure, marking the garbage collection state of the associated content segments only for those content segments that have changed relative to the previous time state of the data object; and
after the iterating step, returning any content segment that has a cleared garbage collection state to a free pool.
45. The method of claim 44, further comprising iterating over the temporal structures using a depth-first search.
46. The method of claim 44, further comprising repeating the method at periodic intervals.
47. The method of claim 44, further comprising performing the method after a new time state of a data object is added.
48. The method of claim 44, further comprising performing the method after a time state of a data object is removed.
49. The method of claim 44, further comprising maintaining a global reference list of all content segments that have been allocated in the deduplicating storage system.
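The clear, mark, and sweep steps recited in claim 44 can be sketched in Python as follows. The data layout is an assumption made for the example: content segments live in a dictionary keyed by hash signature, and each retained time state is represented only by the set of signatures that changed relative to the previous retained state (the first set holds the baseline). The function name is hypothetical.

```python
def collect_garbage(segments, temporal_states):
    """Free content segments that no retained time state references.

    segments: dict mapping hash signature -> content segment.
    temporal_states: iterable of sets; each set holds the signatures that
        changed in that time state relative to the previous one.
    Returns the sorted list of freed signatures.
    """
    # Clear phase: reset the garbage collection state of every segment.
    marked = {sig: False for sig in segments}

    # Mark phase: visit each time state once and mark only what changed there.
    # Unchanged segments were already marked when the state that introduced
    # them was visited, so they are not marked again.
    for changed in temporal_states:
        for sig in changed:
            if sig in marked:
                marked[sig] = True

    # Sweep phase: anything still unmarked goes back to the free pool.
    freed = sorted(sig for sig, live in marked.items() if not live)
    for sig in freed:
        del segments[sig]
    return freed
```

Because every time state contributes only its changed segments, the marking work grows with the amount of change across the history rather than with the full size of every image, which is how the redundant mark operations are avoided in this sketch.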
CN201180061716.7A 2010-11-16 2011-11-11 Systems and methods for data management virtualization Active CN103415842B (en)

Applications Claiming Priority (17)

Application Number Priority Date Filing Date Title
US12/947,438 2010-11-16
US12/947,375 2010-11-16
US12/947,375 US8843489B2 (en) 2010-11-16 2010-11-16 System and method for managing deduplicated copies of data using temporal relationships among copies
US12/947,418 2010-11-16
US12/947,513 2010-11-16
US12/947,385 2010-11-16
US12/947,393 2010-11-16
US12/947,385 US9858155B2 (en) 2010-11-16 2010-11-16 System and method for managing data with service level agreements that may specify non-uniform copying of data
US12/947,393 US8788769B2 (en) 2010-11-16 2010-11-16 System and method for performing backup or restore operations utilizing difference information and timeline state information
US12/947,418 US8402004B2 (en) 2010-11-16 2010-11-16 System and method for creating deduplicated copies of data by tracking temporal relationships among copies and by ingesting difference data
US12/947,438 US8299944B2 (en) 2010-11-16 2010-11-16 System and method for creating deduplicated copies of data storing non-lossy encodings of data directly in a content addressable store
US12/947,436 US8904126B2 (en) 2010-11-16 2010-11-16 System and method for performing a plurality of prescribed data management functions in a manner that reduces redundant access operations to primary storage
US12/947,513 US8417674B2 (en) 2010-11-16 2010-11-16 System and method for creating deduplicated copies of data by sending difference data between near-neighbor temporal states
US12/947,383 US8396905B2 (en) 2010-11-16 2010-11-16 System and method for improved garbage collection operations in a deduplicated store by tracking temporal relationships among copies
US12/947,383 2010-11-16
US12/947,436 2010-11-16
PCT/US2011/060417 WO2012067964A1 (en) 2010-11-16 2011-11-11 Systems and methods for data management virtualization

Publications (2)

Publication Number Publication Date
CN103415842A true CN103415842A (en) 2013-11-27
CN103415842B CN103415842B (en) 2016-02-03

Family

ID=46084354

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201180061716.7A Active CN103415842B (en) 2010-11-16 2011-11-11 Systems and methods for data management virtualization

Country Status (8)

Country Link
EP (1) EP2643760A4 (en)
JP (1) JP2013543198A (en)
KR (1) KR20140051107A (en)
CN (1) CN103415842B (en)
AU (1) AU2011329232A1 (en)
BR (1) BR112013012134A2 (en)
CA (1) CA2817592A1 (en)
WO (1) WO2012067964A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020049778A1 (en) * 2000-03-31 2002-04-25 Bell Peter W. System and method of information outsourcing
US20030131207A1 (en) * 2002-01-09 2003-07-10 Hiroshi Arakawa Virtualized volume snapshot formation method
US20080034016A1 (en) * 2006-08-04 2008-02-07 Pavel Cisler Consistent back up of electronic information
CN101710323A (en) * 2008-09-11 2010-05-19 威睿公司 Computer storage deduplication
US20100138827A1 (en) * 2008-11-30 2010-06-03 Shahar Frank Hashing storage images of a virtual machine

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7734603B1 (en) * 2006-01-26 2010-06-08 Netapp, Inc. Content addressable storage array element
JP5275692B2 (en) * 2007-10-24 2013-08-28 株式会社日立製作所 Storage system group
US7797279B1 (en) * 2007-12-31 2010-09-14 Emc Corporation Merging of incremental data streams with prior backed-up data
JP5313600B2 (en) * 2008-09-16 2013-10-09 株式会社日立製作所 Storage system and storage system operation method

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104679665A (en) * 2013-12-02 2015-06-03 中兴通讯股份有限公司 Method and system for achieving block storage of distributed file system
CN104750538A (en) * 2013-12-27 2015-07-01 伊姆西公司 Virtual storage pool providing method and system for target application
CN104361069A (en) * 2014-11-07 2015-02-18 广东电子工业研究院有限公司 Local file system integrated cloud storage service method
CN105068767A (en) * 2015-08-19 2015-11-18 山东超越数控电子有限公司 Full virtualization storage method based on consistent hashing algorithm
CN105938457A (en) * 2016-03-31 2016-09-14 华为技术有限公司 Data filtering method, device, and data reading system
CN105938457B (en) * 2016-03-31 2018-10-02 华为技术有限公司 Data filtering method, device, and data reading system
US10785302B2 (en) 2016-10-31 2020-09-22 Huawei Technologies Co., Ltd. Systems and methods for unified data management in a communication network
CN109792596A (en) * 2016-10-31 2019-05-21 华为技术有限公司 System and method for uniform data management in communication network
CN109906439A (en) * 2016-11-16 2019-06-18 国际商业机器公司 Point-in-time backup to an object storage cloud by a storage controller
CN110268379A (en) * 2017-01-06 2019-09-20 甲骨文国际公司 Cloud migration of file system data hierarchy
US11755535B2 (en) 2017-01-06 2023-09-12 Oracle International Corporation Consistent file system semantics with cloud object storage
CN110268379B (en) * 2017-01-06 2023-08-18 甲骨文国际公司 Cloud migration of file system data hierarchy
US11714784B2 (en) 2017-01-06 2023-08-01 Oracle International Corporation Low-latency direct cloud access with file system hierarchies and semantics
CN109801659B (en) * 2017-11-16 2023-04-14 国际商业机器公司 DRAM bank activation management
CN109801659A (en) * 2017-11-16 2019-05-24 国际商业机器公司 DRAM bank activation management
CN110019097A (en) * 2017-12-29 2019-07-16 中国移动通信集团四川有限公司 Virtual logical copy management method, device, equipment and medium
CN110019097B (en) * 2017-12-29 2021-09-28 中国移动通信集团四川有限公司 Virtual logic copy management method, device, equipment and medium
CN110737542A (en) * 2018-07-19 2020-01-31 慧与发展有限责任合伙企业 Freezing and unfreezing upstream and downstream volumes
CN113728303A (en) * 2019-04-26 2021-11-30 Emc Ip控股有限公司 Garbage collection for deduplication cloud layering
CN113728303B (en) * 2019-04-26 2024-03-15 Emc Ip控股有限公司 Garbage collection for deduplication cloud layering
CN112699097A (en) * 2020-12-31 2021-04-23 北京浩瀚深度信息技术股份有限公司 Multi-policy mirror image implementation method and device and storage medium
CN112699097B (en) * 2020-12-31 2024-03-08 北京浩瀚深度信息技术股份有限公司 Method, device and storage medium for realizing multi-element policy mirror image
CN114218014B (en) * 2021-12-21 2022-07-19 中科豪联(杭州)技术有限公司 Virtual server backup and restoration method based on storage volume level
CN114218014A (en) * 2021-12-21 2022-03-22 中科豪联(杭州)技术有限公司 Virtual server backup and restoration method based on storage volume level
CN116088770A (en) * 2023-03-20 2023-05-09 苏州浪潮智能科技有限公司 Data management method, device, system, electronic equipment and storage medium

Also Published As

Publication number Publication date
CA2817592A1 (en) 2012-05-24
KR20140051107A (en) 2014-04-30
AU2011329232A1 (en) 2013-06-06
EP2643760A1 (en) 2013-10-02
EP2643760A4 (en) 2015-09-30
JP2013543198A (en) 2013-11-28
BR112013012134A2 (en) 2016-09-27
WO2012067964A1 (en) 2012-05-24
CN103415842B (en) 2016-02-03

Similar Documents

Publication Publication Date Title
CN103415842B (en) Systems and methods for data management virtualization
US11573859B2 (en) Content-independent and database management system-independent synthetic full backup of a database based on snapshot technology
CN104769555A (en) Enhanced data management virtualization system
US9372758B2 (en) System and method for performing a plurality of prescribed data management functions in a manner that reduces redundant access operations to primary storage
US10275474B2 (en) System and method for managing deduplicated copies of data using temporal relationships among copies
US9372866B2 (en) System and method for creating deduplicated copies of data by sending difference data between near-neighbor temporal states
US8299944B2 (en) System and method for creating deduplicated copies of data storing non-lossy encodings of data directly in a content addressable store
US9384207B2 (en) System and method for creating deduplicated copies of data by tracking temporal relationships among copies using higher-level hash structures
US8396905B2 (en) System and method for improved garbage collection operations in a deduplicated store by tracking temporal relationships among copies
US8788769B2 (en) System and method for performing backup or restore operations utilizing difference information and timeline state information
US9858155B2 (en) System and method for managing data with service level agreements that may specify non-uniform copying of data
US9244967B2 (en) Incremental copy performance between data stores
US20150227602A1 (en) Virtual data backup
JP2016524220A (en) Efficient data replication and garbage collection prediction

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant