CN103365926A - Method and device for storing snapshot in file system - Google Patents


Info

Publication number
CN103365926A
Authority
CN
China
Prior art keywords: data, batch, carry out, time, data block
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012101031281A
Other languages
Chinese (zh)
Other versions
CN103365926B (en)
Inventor
赵军平
谢纲
杨加林
齐巍
胡风华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
EMC Corp
Original Assignee
EMC Corp
Application filed by EMC Corp
Priority to CN201210103128.1A
Publication of CN103365926A
Application granted
Publication of CN103365926B
Legal status: Active

Abstract

Embodiments of the invention relate to a method and a device for saving a snapshot in a file system. The method may comprise: reading, in a batch, the data in a plurality of data blocks to be updated; allocating storage space for the data read in the batch; and storing, in a batch, the data of those blocks among the data read that are to be updated for the first time into the allocated storage space. With this method, a data snapshot can, for example, be saved with a single batch I/O (input/output) operation, which improves the efficiency of the file system.

Description

Method and apparatus for saving snapshots in a file system
Technical Field
The present invention relates generally to the field of file systems and, more specifically, to a method and apparatus for saving snapshots in a file system.
Background
A snapshot usually refers to a complete, usable copy of a specified data set of a production file system (hereinafter PFS). The copy contains an image of the corresponding data at a certain point in time (the time at which the copy began), and it allows fast data recovery when a logical error or file corruption occurs on the storage device, for example by restoring the data to its state at an available point in time. Snapshots are widely used: as a source for backup, as a source for data mining, as a checkpoint that preserves application state, or simply as a means of replicating data. There are also many techniques for creating snapshots, including split mirror, pointer remapping, and log files; among them, copy on first write is a classic snapshot technique for protecting file systems.
Copy on first write (hereinafter COFW) means that, before the first write operation is performed on a data block, the original content is copied and stored in a dedicated storage area, and a tracking table is updated to maintain the corresponding mapping. This technique is generally implemented at the block level. As described in detail below, in existing COFW techniques each data block updated (written) for the first time must go through a complete COFW cycle of its own: reading the original content, allocating a new storage area, writing the original content, and finally updating the tracking table (which is usually persisted on the hard disk). As explained further below, this generates a large number of small disk I/Os and consumes considerable computing resources, which imposes a heavy performance burden and ultimately has a significant impact on normal PFS operation.
There is therefore a strong need in the art for a more efficient snapshot creation scheme with lower overhead.
Summary of the Invention
To mitigate the above drawbacks of COFW snapshots in the prior art, embodiments of the present invention provide an improved, efficient method and apparatus for saving snapshots in a file system.
According to one embodiment of the invention, a method for saving a snapshot in a file system is provided. The method may comprise: reading, in a batch, the data in a plurality of data blocks to be updated; allocating storage space for the data read in the batch; and storing, in a batch, the data of the data blocks to be updated for the first time, among the data read in the batch, into the allocated storage space.
In an optional embodiment of the invention, the method may further comprise: performing the method for saving a snapshot when the number of data blocks to be updated for the first time reaches a first value and/or the number of adjacent data blocks to be updated for the first time reaches a second value, wherein the first value and the second value are predetermined or can be adjusted at run time.
In an optional embodiment of the invention, the first value is the value at which the ratio of the number of data blocks to be updated for the first time to the number of the plurality of data blocks to be updated reaches a first threshold; the second value is the value at which the ratio of the number of adjacent data blocks to be updated for the first time to the number of the plurality of data blocks to be updated reaches a second threshold; and the first threshold and the second threshold are predetermined or can be adjusted at run time.
In an optional embodiment of the invention, the method may further comprise: updating a mapping table in a batch, the mapping table recording the correspondence between the data blocks to be updated for the first time and the positions, in the storage space, of the data in those data blocks.
In an optional embodiment of the invention, updating the mapping table comprises updating the mapping table with only a single read/write operation.
In an optional embodiment of the invention, the method further comprises obtaining information about the data blocks to be updated for the first time by looking up a bitmap stored in memory, wherein the information comprises one or more of the number, the distribution, and the first and last positions of the data blocks to be updated for the first time.
In an optional embodiment of the invention, the plurality of data blocks to be updated are contiguous, and reading the data in the plurality of data blocks in a batch comprises reading the data in the contiguous data blocks with a single read operation.
In an optional embodiment of the invention, the start position of the batch read is the first data block to be updated for the first time among the plurality of data blocks to be updated; and/or the end position of the batch read is the last data block to be updated for the first time among the plurality of data blocks to be updated.
In an optional embodiment of the invention, the storage space allocated for the data read in the batch comprises contiguous storage space allocated in one operation, and the batch storing comprises storing with only a single write operation.
In an optional embodiment of the invention, the size of the storage space allocated for the data read in the batch corresponds to the storage size required to store the data in the data blocks to be updated for the first time, and the method for saving a snapshot further comprises: storing only the data of the data blocks to be updated for the first time among the data read in the batch.
In an optional embodiment of the invention, the batch storing uses full-stripe writes.
In an optional embodiment of the invention, the storage space comprises dedicated storage space on a disk, and the data stored in the batch is a snapshot.
According to another embodiment of the invention, a method for saving a snapshot in a file system is provided, wherein the file system has one or more segments of contiguous data blocks to be updated for the first time. The method may comprise: reading, in a batch, the data in each segment of the contiguous data blocks; allocating storage space for the data read in the batch; and storing the data read in the batch into the allocated storage space in a batch.
In an optional embodiment of the invention, the reading step, the allocating step, and the storing step are performed in parallel for the one or more segments of contiguous data blocks to be updated for the first time.
According to another embodiment of the invention, a device for saving a snapshot in a file system is provided, comprising: reading means for reading, in a batch, the data in a plurality of data blocks to be updated; allocating means for allocating storage space for the data read in the batch; and storage control means for storing, in a batch, the data of the data blocks to be updated for the first time, among the data read in the batch, into the allocated storage space.
In an optional embodiment of the invention, the device further comprises: triggering means for triggering execution of the device for saving a snapshot when the number of data blocks to be updated for the first time reaches a first value and/or the number of adjacent data blocks to be updated for the first time reaches a second value, wherein the first value and the second value are predetermined or can be adjusted at run time.
In an optional embodiment of the invention, the first value is the value at which the ratio of the number of data blocks to be updated for the first time to the number of the plurality of data blocks to be updated reaches a first threshold; the second value is the value at which the ratio of the number of adjacent data blocks to be updated for the first time to the number of the plurality of data blocks to be updated reaches a second threshold; and the first threshold and the second threshold are predetermined or can be adjusted at run time.
In an optional embodiment of the invention, the device further comprises: updating means for updating a mapping table in a batch, the mapping table recording the correspondence between the data blocks to be updated for the first time and the positions, in the storage space, of the data in those data blocks.
In an optional embodiment of the invention, updating the mapping table comprises updating the mapping table with only a single read/write operation.
In an optional embodiment of the invention, the device further comprises information obtaining means for obtaining information about the data blocks to be updated for the first time by looking up a bitmap stored in memory, wherein the information comprises one or more of the number, the distribution, and the first and last positions of the data blocks to be updated for the first time.
In an optional embodiment of the invention, the plurality of data blocks to be updated are contiguous, and reading the data in the plurality of data blocks in a batch comprises reading the data in the contiguous data blocks with a single read operation.
In an optional embodiment of the invention, the start position of the batch read is the first data block to be updated for the first time among the plurality of data blocks to be updated; and/or the end position of the batch read is the last data block to be updated for the first time among the plurality of data blocks to be updated.
In an optional embodiment of the invention, the storage space allocated for the data read in the batch comprises contiguous storage space allocated in one operation, and the batch storing comprises storing with only a single write operation.
In an optional embodiment of the invention, the size of the storage space allocated for the data read in the batch corresponds to the storage size required to store the data in the data blocks to be updated for the first time, and the storage control means is further configured to store only the data of the data blocks to be updated for the first time among the data read in the batch.
In an optional embodiment of the invention, the batch storing uses full-stripe writes.
In an optional embodiment of the invention, the storage space comprises dedicated storage space on a disk, and the data stored in the batch is a snapshot.
According to another embodiment of the invention, a device for saving a snapshot in a file system is provided, wherein the file system has one or more segments of contiguous data blocks to be updated for the first time. The device may comprise: reading means for reading, in a batch, the data in each segment of the contiguous data blocks; allocating means for allocating storage space for the data read in the batch; and storage control means for storing the data read in the batch into the allocated storage space in a batch.
In an optional embodiment of the invention, the reading means, the allocating means, and the storage control means are started in parallel for the one or more segments of contiguous data blocks to be updated for the first time.
Brief Description of the Drawings
The above and other objects, features, and advantages of embodiments of the invention will become apparent from the following detailed description read with reference to the accompanying drawings. The drawings show several embodiments of the invention by way of example and not limitation, and identical reference numerals denote identical or similar elements.
Fig. 1 shows an exemplary diagram of a virtual file system architecture in which the invention can be implemented;
Fig. 2 shows a schematic diagram of a COFW snapshot process according to the prior art;
Figs. 3A and 3B schematically illustrate the drawbacks of the COFW snapshot process according to the prior art;
Fig. 4 shows a flowchart of a method for saving a data snapshot in a file system according to one embodiment of the invention;
Fig. 5 shows a flowchart of a method for saving a data snapshot in a file system according to another embodiment of the invention;
Figs. 6A-6D show concrete examples of saving a data snapshot in a file system according to preferred embodiments of the invention;
Fig. 7 shows a block diagram of a device for saving a data snapshot in a file system according to another embodiment of the invention;
Fig. 8 shows a block diagram of a device for saving a data snapshot in a file system according to another embodiment of the invention.
Detailed Description
For a better understanding of the invention, the terms it may use are briefly explained here. Note that these explanations are given only to aid understanding and do not limit any aspect of the invention. The "production file system" (PFS) referred to in the invention is a file system that is used in a production environment and is protected by snapshots; it is both readable and writable. Although the invention shows a PFS for illustrative purposes, those skilled in the art will understand that the methods and apparatus according to the various aspects of the invention can also be applied to other types of file systems.
Fig. 1 shows an exemplary diagram 100 of a virtual file system architecture in which the invention can be implemented. As shown, the virtual file system 101 generally comprises a PFS 102 part and a snapshot 106 part. The PFS 102 is usually built from PFS volumes 103 organized on hard disks 104. The snapshot 106 is stored in snapshot storage space 105, which may be dedicated storage for holding snapshots, such as SavVol. The COFW snapshot process takes place at the level between the PFS volumes 103 and the snapshot storage space 105.
As is known, the file system illustrated in Fig. 1 is usually mounted in "cached" mode, which supports page cache and buffer cache mechanisms to obtain low latency and high bandwidth. In cached mode, dirty data blocks are flushed by file system I/O, for example asynchronously (async). The term "file system I/O" here includes, but is not limited to, non-blocking I/O such as List IO; List IO is sorted by FSBN and is a form of I/O processing that allows other processing to continue before the transfer completes. During the flush, contiguous dirty data blocks are merged into, for example, an extent sorted by FSBN. Those skilled in the art will understand that the term "extent" refers to a contiguous range of dirty data blocks sorted by FSBN in the file system; the number of data blocks in it is variable and is generally determined by the maximum number of data blocks that the particular file system can access in one I/O. Typically, an extent may contain, for example, 32 data blocks. Those skilled in the art will also understand that the "extent" referred to here is given only to better describe the invention and does not limit the file system in any way; a file system used to implement the various aspects of the invention need not include such an extent at all.
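The merging of dirty blocks into extents can be sketched as follows. This is only an illustration, not the file system's actual flush code: the function name `merge_into_extents`, the `max_blocks` parameter, and the use of plain integers as FSBNs (read here as file system block numbers) are assumptions made for this example; only the example limit of 32 blocks comes from the text.

```python
from typing import List

MAX_EXTENT_BLOCKS = 32  # example extent size mentioned in the text


def merge_into_extents(dirty_fsbns: List[int],
                       max_blocks: int = MAX_EXTENT_BLOCKS) -> List[List[int]]:
    """Group dirty block numbers into FSBN-sorted runs of contiguous blocks.

    A new extent starts whenever the next block is not adjacent to the
    previous one or the current extent already holds max_blocks blocks.
    """
    extents: List[List[int]] = []
    for fsbn in sorted(set(dirty_fsbns)):
        current = extents[-1] if extents else None
        if current and fsbn == current[-1] + 1 and len(current) < max_blocks:
            current.append(fsbn)
        else:
            extents.append([fsbn])
    return extents
```

For instance, dirty blocks 5, 3, 4, 10, 11 and 20 would be grouped into three extents: 3 to 5, 10 to 11, and 20 alone.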
To make the advantages of the invention easier to appreciate below, a typical data update process in the prior art is described here. An update (write) request for a file is received; afterwards, if a predetermined condition is met (for example, 32 dirty data blocks have accumulated, or a memory watermark is reached), the production file system begins to process the write operation. At this point the data blocks are merged and sorted by FSBN, and the contiguous dirty data blocks are then flushed by, for example, List IO; if necessary, journal logging is performed during the flush. During this process, the snapshot monitors every write operation on the PFS and, for each data block in the List IO, performs the prior-art COFW snapshot process 200 shown in Fig. 2:
In step S201, it is first checked whether the data block is being updated for the first time. If so, step S202 reads the original content from the PFS volume. Next, step S203 allocates storage space for the original content that was read; for example, a slot may be allocated from dedicated snapshot storage such as SavVol to store the block being snapshotted. Step S204 then writes the original content into the allocated slot, for example in SavVol. Next, step S205 updates the mapping table to maintain the mapping; the mapping table records, for example, the mapping between a snapshotted block and its position in the snapshot storage space such as SavVol. As an example, the mapping table is typically a block map (BlockMap) organized as a B+ tree (B-tree). Also as an example, the mapping table may itself be stored in the snapshot storage space such as SavVol. After this, the PFS write operation continues. Finally, step S206 periodically flushes the mapping table to the hard disk.
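The per-block cycle of steps S201 to S205 can be modelled with a small in-memory sketch that counts operations. Everything named here (`PerBlockCofw`, `write_block`, the `ops` counters, the list standing in for SavVol, the dictionary standing in for the block map) is hypothetical and chosen for illustration; the periodic flush of step S206 is left out.

```python
from typing import Dict, List


class PerBlockCofw:
    """Toy model of the prior-art COFW cycle; all names are hypothetical."""

    def __init__(self, pfs: Dict[int, bytes]) -> None:
        self.pfs = pfs                        # PFS volume: block number -> content
        self.savvol: List[bytes] = []         # snapshot storage, one slot per entry
        self.block_map: Dict[int, int] = {}   # snapshotted block -> slot index
        self.ops = {"read": 0, "alloc": 0, "write": 0, "map_update": 0}

    def write_block(self, block_no: int, data: bytes) -> None:
        if block_no not in self.block_map:    # S201: is this the first update?
            original = self.pfs[block_no]     # S202: read the original content
            self.ops["read"] += 1
            slot = len(self.savvol)           # S203: allocate one slot
            self.savvol.append(b"")
            self.ops["alloc"] += 1
            self.savvol[slot] = original      # S204: write the original content
            self.ops["write"] += 1
            self.block_map[block_no] = slot   # S205: update the mapping table
            self.ops["map_update"] += 1
        self.pfs[block_no] = data             # the PFS write then proceeds


def demo_per_block() -> Dict[str, int]:
    """Update 32 blocks once each and return the operation counts."""
    cofw = PerBlockCofw({n: b"old" for n in range(32)})
    for n in range(32):
        cofw.write_block(n, b"new")
    return cofw.ops
```

In this model, updating 32 blocks for the first time costs 32 reads, 32 allocations, 32 writes, and 32 mapping updates, which is the per-block overhead the following paragraphs criticize.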
As described above, the existing data update process organizes the dirty data blocks into a variable-length vector within an extent and provides a starting block address (assumed here to be N), and the snapshot filter splits these contiguous blocks apart block by block (for example, 8 KB each), which can cause performance problems. For example, Figs. 3A and 3B illustrate in detail the drawbacks of the prior-art COFW snapshot process. As shown in Fig. 3A, suppose one List IO of the PFS involves a total of 32 dirty data blocks numbered N to N+31. Referring to Fig. 3B, the prior-art COFW snapshot process suffers the following serious performance problems, especially under write-intensive and snapshot-heavy workloads:
1. A large number of small (for example, 8 KB) block read and block write I/O operations occur, which puts heavy pressure on the hard disk or SAN storage. Small I/Os are also usually hard to optimize: write operations, for example, require more resources for merging or staging, and because the storage is generally shared, prefetching may not work effectively.
2. Space for storing snapshots, such as slots, is allocated frequently, which causes frequent interruptions.
3. The mapping table (such as the block map) is updated frequently, and it is well known that B-tree updates are expensive because of the internal locking used for data consistency.
4. There is lock contention among multiple snapshot streams (or threads), because the system has synchronization points for data integrity and fast recovery; concurrency therefore degrades.
In view of this, the invention proposes an improved method and apparatus for saving a data snapshot in a file system. Fig. 4 shows a flowchart 400 of a method for saving a data snapshot in a file system according to one embodiment of the invention. To make the invention easier to understand, the flowchart of Fig. 4 is described in detail below in conjunction with Figs. 6A-6D, which show concrete examples of saving a data snapshot in a file system according to preferred embodiments of the invention. In Figs. 6A-6D, the length of the extent is assumed to be 10. Those skilled in the art will understand that these concrete examples are given only for clarity and do not limit the invention in any way.
As shown in method 400, after the process starts, step S402 reads, in a batch, the data in a plurality of data blocks to be updated. The read typically comprises a batch read from the PFS through an I/O operation such as List IO.
According to a preferred embodiment of the invention, the plurality of data blocks to be updated are contiguous, and reading the data in these data blocks in a batch comprises reading the data in the contiguous data blocks (for example, 32 contiguous data blocks) with a single read operation.
According to a preferred embodiment of the invention, the start position of the batch read is the first data block to be updated for the first time among the plurality of data blocks to be updated, that is, the first data block to be snapshotted. In the example of Fig. 6B, the start position of the batch read may be, for example, block #2 in the PFS. According to a preferred embodiment of the invention, the end position of the batch read is the last data block to be updated for the first time among the plurality of data blocks to be updated, that is, the last data block to be snapshotted. This is, for example, block #10 in the PFS in the example of Fig. 6B, and block #9 in the PFS in the example of Fig. 6C. In these preferred embodiments, the number of data blocks read may be greater than the number of data blocks to be updated for the first time. For example, in the example of Fig. 6B the data blocks read may preferably be block #2 through block #10 of the PFS, and in the example of Fig. 6C they may preferably be block #2 through block #9 of the PFS.
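Choosing the start and end of the batch read can be sketched as below. The function name `batch_read_range` and its arguments are assumptions made for this illustration, and the sample block numbers in the usage note are hypothetical rather than taken from Figs. 6A-6D.

```python
from typing import Optional, Sequence, Set, Tuple


def batch_read_range(extent_blocks: Sequence[int],
                     first_update: Set[int]) -> Optional[Tuple[int, int]]:
    """Return (start, end) block numbers of a single batch read.

    extent_blocks is assumed to be in ascending order. The read starts at
    the first block that needs a first-time update and ends at the last
    such block, so it may also cover blocks that need no snapshot.
    """
    pending = [b for b in extent_blocks if b in first_update]
    if not pending:
        return None  # nothing in this extent needs to be snapshotted
    return pending[0], pending[-1]
```

With a hypothetical extent of blocks 1 to 10 in which blocks 2, 3, 5 and 9 are to be updated for the first time, the single read would span blocks 2 to 9: eight blocks are read although only four need to be snapshotted.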
Next, the process proceeds to step S404, in which storage space is allocated for the data blocks read in the batch. According to a preferred embodiment of the invention, the storage space allocated for the data read in the batch comprises contiguous storage space allocated in one operation. In addition, according to a preferred embodiment of the invention, the storage space comprises dedicated storage space on a disk, such as slots in SavVol. According to another preferred embodiment of the invention, the size of the storage space allocated for the data read in the batch corresponds to (for example, equals) the storage size required to store the data in the data blocks to be updated for the first time. For example, 8 slots need to be allocated in the example of Fig. 6A, and 7 slots in the example of Fig. 6B. The addresses of the allocated storage space (or slots) may, for example, be kept in a storage space table (for example, a slot table) for subsequent reference. The storage space table (for example, the slot table) may be implemented, for example, as a linked list.
Then the process proceeds to step S406, in which the data of the data blocks to be updated for the first time, among the data blocks read in the batch, is stored in a batch into the allocated storage space. According to a preferred embodiment of the invention, the batch storing in step S406 comprises storing with only a single write operation. For example, the write may be performed through an I/O operation such as List IO. Note that, as mentioned above for step S402, the number of data blocks read can sometimes be greater than the number of data blocks updated for the first time. In that case, according to a preferred embodiment of the invention, the batch storing in step S406 stores only the data of the data blocks to be updated for the first time among the data read in the batch, and the remaining data is not stored. For example, in the example of Fig. 6A the data corresponding to block #2 and block #9 is not written. In addition, considering that back-end storage is usually configured as RAID, the batch storing may, according to a preferred embodiment of the invention, use full-stripe writes. According to another preferred embodiment of the invention, the data stored in the batch is a snapshot. The process then ends.
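Steps S402 to S406, together with a batch mapping update, can be put together in a small in-memory sketch. The class `BatchCofw`, its `snapshot_extent` method, the Python list standing in for the PFS volume and for SavVol, and the `ops` counters are all assumptions for illustration; a real implementation would issue disk I/O rather than slice lists, and this sketch says nothing about RAID stripes.

```python
from typing import Dict, List


class BatchCofw:
    """Toy model of the batch snapshot method; all names are hypothetical."""

    def __init__(self, pfs: List[bytes]) -> None:
        self.pfs = pfs                        # PFS volume indexed by block number
        self.savvol: List[bytes] = []         # snapshot storage (slots)
        self.block_map: Dict[int, int] = {}   # snapshotted block -> slot index
        self.ops = {"read": 0, "alloc": 0, "write": 0, "map_update": 0}

    def snapshot_extent(self, start: int, length: int) -> None:
        # Blocks of the extent that have not been snapshotted yet.
        first = [b for b in range(start, start + length)
                 if b not in self.block_map]
        if not first:
            return
        lo, hi = first[0], first[-1]
        data = self.pfs[lo:hi + 1]            # S402: one read covering lo..hi
        self.ops["read"] += 1
        base = len(self.savvol)               # S404: one contiguous allocation,
        self.savvol.extend([b""] * len(first))  # sized for first-update blocks only
        self.ops["alloc"] += 1
        # S406: one write that stores only the first-update blocks.
        self.savvol[base:base + len(first)] = [data[b - lo] for b in first]
        self.ops["write"] += 1
        # The mapping table is updated once for the whole batch.
        self.block_map.update({b: base + i for i, b in enumerate(first)})
        self.ops["map_update"] += 1
```

If blocks #2 and #9 of a ten-block extent (#1 to #10) were already snapshotted, one call to `snapshot_extent(1, 10)` reads blocks #1 to #10 once, allocates 8 slots once, and writes 8 blocks once, skipping #2 and #9, which matches the behaviour the text describes for Fig. 6A.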
It should be noted that the method for saving a snapshot in a file system according to the various aspects of the invention is compatible with prior-art methods for saving snapshots; the two can coexist and be switched at run time. Therefore, if the trade-off between snapshot performance and storage overhead is also considered, it is preferable, before performing method 400 shown in Fig. 4, to check the state of the blocks that need to be snapshotted and to judge on that basis whether the method of the invention would deliver the desired performance, thereby deciding whether to use the method according to the invention or a conventional method for saving snapshots. How to check the state and make this judgment is described in detail below with reference to Figs. 6A and 6B. Those skilled in the art should appreciate that the state check and the judgment described below are optional examples for further optimizing performance; they are not required. According to a preferred embodiment of the invention, information about the data blocks to be updated for the first time can be obtained by looking up a bitmap stored in memory; the information comprises one or more of the number, the distribution, and the first and last positions of the data blocks to be updated for the first time (for example, within an extent). Because the bitmap usually resides in memory, the lookup is very fast. According to another preferred embodiment of the invention, the judgment comprises determining, before step S402, that the process of steps S402 to S406 of method 400 in Fig. 4 is to be performed when the number of data blocks to be updated for the first time reaches a first value and/or the number of adjacent data blocks to be updated for the first time reaches a second value. According to a preferred embodiment of the invention, the first value and the second value are predetermined or can be adjusted at run time.
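The bitmap lookup can be sketched as follows. The encoding (an integer in which bit i marks block i of the extent as needing a first-time update), the function name `first_update_info`, and the returned field names are assumptions made for this example; the text does not specify how the bitmap is laid out.

```python
from typing import Dict, Optional


def first_update_info(bitmap: int, extent_len: int) -> Dict[str, Optional[int]]:
    """Summarize the first-update blocks of one extent from an in-memory bitmap.

    Returns the number of flagged blocks, the length of the longest run of
    adjacent flagged blocks, and the first and last flagged positions.
    """
    positions = [i for i in range(extent_len) if (bitmap >> i) & 1]
    longest = run = 0
    for i in range(extent_len):
        run = run + 1 if (bitmap >> i) & 1 else 0  # extend or reset the run
        longest = max(longest, run)
    return {
        "count": len(positions),
        "longest_run": longest,
        "first": positions[0] if positions else None,
        "last": positions[-1] if positions else None,
    }
```

The count and the longest run are the two quantities that the first and second values in the text would be compared against.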
As a further supplementary preferred embodiment, the first value is the value at which the ratio of the number of data blocks to be updated for the first time to the number of the plurality of data blocks to be updated reaches a first threshold; the second value is the value at which the ratio of the number of data blocks to be updated for the first time to the number of the plurality of data blocks to be updated reaches a second threshold. According to a preferred embodiment of the present invention, the first threshold and the second threshold are predetermined or can be adjusted at run time.
In a preferred embodiment of the present invention, the first threshold can be 80%. In another preferred embodiment of the present invention, the second threshold can be 50%. For example, in the example of Fig. 6A there are 8 data blocks to be updated for the first time in the extent, which account for 80% of all the data blocks in the extent; under the policy of one aspect of the invention, the method of preserving a data snapshot in a file system according to the present invention can then be used. Conversely, if only 20% of the data blocks are to be updated for the first time, a traditional method of preserving a data snapshot can be used as an option. In the example of Fig. 6B there are 7 data blocks to be updated for the first time in the extent. Under the policy of one aspect of the invention, namely the policy that considers the proportion of the extent occupied by data blocks to be updated for the first time, this proportion does not reach 80%, so a traditional method of preserving a data snapshot has to be used. Under the policy of another aspect of the invention, however, namely the policy that considers the proportion of the extent occupied by adjacent data blocks to be updated for the first time, this proportion reaches 60%, so the method of preserving a data snapshot in a file system according to the present invention can be used.
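The two policies above can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the function names, the 10-block extent, the reading of "adjacent" as the longest contiguous run, and the block layout used for the Fig. 6B-like case are all assumptions, since the figures themselves are not reproduced here.

```python
from typing import Iterable, List


def bitmap_from_blocks(blocks: Iterable[int], extent_size: int = 10) -> List[bool]:
    """Build an in-memory bitmap; True marks a block awaiting its first update."""
    marked = set(blocks)
    return [block_no in marked for block_no in range(1, extent_size + 1)]


def longest_adjacent_run(bitmap: List[bool]) -> int:
    """Length of the longest run of neighbouring first-update blocks (assumed metric)."""
    best = current = 0
    for needs_snapshot in bitmap:
        current = current + 1 if needs_snapshot else 0
        best = max(best, current)
    return best


def meets_first_policy(bitmap: List[bool], first_threshold_pct: int = 80) -> bool:
    """Policy 1: share of first-update blocks in the extent reaches the first threshold."""
    return bool(bitmap) and sum(bitmap) * 100 >= first_threshold_pct * len(bitmap)


def meets_second_policy(bitmap: List[bool], second_threshold_pct: int = 50) -> bool:
    """Policy 2: share of adjacent first-update blocks reaches the second threshold."""
    return bool(bitmap) and longest_adjacent_run(bitmap) * 100 >= second_threshold_pct * len(bitmap)


def use_batch_snapshot(bitmap: List[bool]) -> bool:
    """Choose the batch method if either policy is satisfied, else the traditional one."""
    return meets_first_policy(bitmap) or meets_second_policy(bitmap)


fig_6a = bitmap_from_blocks([1, 3, 4, 5, 6, 7, 8, 10])   # 8 of 10 blocks
fig_6b_like = bitmap_from_blocks([1, 3, 4, 5, 6, 7, 8])  # 7 of 10, hypothetical layout
sparse = bitmap_from_blocks([2, 9])                       # 2 of 10 blocks

print(use_batch_snapshot(fig_6a), use_batch_snapshot(fig_6b_like), use_batch_snapshot(sparse))
```

Integer percentages are used for the thresholds so that a ratio of exactly 80% compares cleanly without floating-point rounding.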
Once it has been decided to use the method of preserving a file snapshot according to the present invention, the present invention can also preferably build a snapshot table for later reference. The snapshot table can, for example, take the form of a linked list and record which data blocks (for example, in the extent) need to be snapshotted; this information can be, for example, PFS block numbers. According to a further preferred embodiment of the present invention, the PFS block numbers in the snapshot table can be arranged in ascending order (though they need not be adjacent) according to the nodes of a B-tree. For the example shown in Fig. 6A, the snapshot table can mark block #1, blocks #3 to #8 and block #10 as the blocks to be snapshotted.
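As a rough illustration of the snapshot table, the sketch below derives an ascending list of block numbers from a bitmap. A plain Python list stands in for the linked list mentioned in the text, and the function name and 1-based block numbering are assumptions made for the example.

```python
from typing import List


def build_snapshot_table(bitmap: List[bool]) -> List[int]:
    """Return the block numbers (1-based, ascending) that must be snapshotted."""
    return [block_no for block_no, needs_snapshot in enumerate(bitmap, start=1) if needs_snapshot]


# Bitmap for the Fig. 6A example: blocks #1, #3-#8 and #10 await their first update.
fig_6a_bitmap = [True, False, True, True, True, True, True, True, False, True]
snapshot_table = build_snapshot_table(fig_6a_bitmap)
print(snapshot_table)
```

Scanning the bitmap in order yields the ascending arrangement for free, which is convenient for the later range read and for matching entries against allocated slots.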
According to a further preferred embodiment of the present invention, after step S406 of method 400 shown in Fig. 4, the method can further comprise a step of batch-updating a mapping table. The mapping table records the correspondence between the data blocks to be updated for the first time and the positions, in the storage space, of the data in those data blocks. As a further preferred embodiment of the present invention, the mapping table can be updated based on the snapshot table and the storage space table (slot table).
According to a preferred embodiment of the present invention, updating the mapping table comprises updating the mapping table with only one read-write operation. According to another preferred embodiment of the present invention, the update of the mapping table can follow the existing lock mechanism. It should be noted that the mappings or keys in the mapping table are still kept at block level, so the method of preserving a file snapshot according to aspects of the present invention is compatible with the current design.
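The batch update of the mapping table can be pictured as pairing each entry of the snapshot table with an entry of the slot table and applying all pairs in one locked operation. The class below is a hypothetical in-memory stand-in: the name `MappingTable`, the dictionary representation and the use of `threading.Lock` as the "existing lock mechanism" are assumptions for illustration only.

```python
import threading
from typing import Dict, List, Optional


class MappingTable:
    """Block-level map from PFS block number to slot position in the snapshot store."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._entries: Dict[int, int] = {}
        self.update_calls = 0  # counts how many update operations were issued

    def batch_update(self, snapshot_table: List[int], slot_table: List[int]) -> None:
        """Record all block-to-slot pairs in a single locked update."""
        if len(snapshot_table) != len(slot_table):
            raise ValueError("snapshot table and slot table must have the same length")
        new_entries = dict(zip(snapshot_table, slot_table))
        with self._lock:
            self._entries.update(new_entries)
            self.update_calls += 1

    def lookup(self, block_no: int) -> Optional[int]:
        """Return the slot holding the preserved copy of a block, if any."""
        with self._lock:
            return self._entries.get(block_no)


mapping = MappingTable()
# Fig. 6A blocks paired with eight hypothetical contiguous slots 100..107.
mapping.batch_update([1, 3, 4, 5, 6, 7, 8, 10], list(range(100, 108)))
print(mapping.lookup(3), mapping.update_calls)
```

The keys remain individual block numbers, mirroring the remark that the mapping stays at block level even though the update itself is batched.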
Fig. 5 shows a flowchart 500 of a method of preserving a data snapshot in a file system according to another embodiment of the present invention.
The difference from method 400 illustrated in Fig. 4 is that method 500 shown in Fig. 5 is directed to one or more segments of contiguous data blocks that are to be updated for the first time in the file system. For these data blocks, after the method starts, step S502 reads the data in each segment of contiguous data blocks in a batch. The batch read is performed, for example, as described above for step S402 of method 400 in Fig. 4; the read operation typically comprises reading from the PFS through a batch IO operation such as List IO (for example, a single IO operation). Taking Fig. 6D as an example, method 500 shown in Fig. 5 according to one embodiment of the present invention is carried out separately for the contiguous data blocks #2 to #4 and for blocks #7 to #10 that are to be updated for the first time (for example, in the extent).
Next, the method proceeds to step S504, which allocates storage space for the data read in the batch. As with the allocation in step S404 of method 400 in Fig. 4, the storage space allocated in step S504 comprises dedicated storage space on disk, such as slots in the SavVol.
Then the method proceeds to step S506, which stores the data read in the batch into the allocated storage space in a batch. As with step S406 above, the batch storing in step S506 also comprises storing with only one write operation (for example, one List IO). The process then ends.
It should be noted that, according to a preferred embodiment of the present invention, the method shown in Fig. 5 can be carried out in parallel for the one or more segments of contiguous data blocks to be updated for the first time.
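Steps S502 to S506 and their parallel execution per segment can be sketched as follows. This is a simplified in-memory model under stated assumptions: `InMemorySavVol`, `contiguous_segments`, `snapshot_segment` and `snapshot_all` are hypothetical names, a dictionary stands in for the PFS, and a list comprehension stands in for what would be a single batch read (for example a List IO) on a real system.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterable, List, Tuple

Segment = Tuple[int, int]  # (first block number, last block number), inclusive


class InMemorySavVol:
    """Toy stand-in for the dedicated snapshot storage area and its slots."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.slots: List[object] = []

    def allocate(self, count: int) -> int:
        """Reserve `count` contiguous slots and return the index of the first one."""
        with self._lock:
            base = len(self.slots)
            self.slots.extend([None] * count)
            return base

    def write(self, base: int, data: List[object]) -> None:
        """Store a whole batch into previously allocated slots in one call."""
        with self._lock:
            self.slots[base:base + len(data)] = data


def contiguous_segments(block_numbers: Iterable[int]) -> List[Segment]:
    """Group block numbers into maximal runs of consecutive blocks."""
    segments: List[Segment] = []
    for block_no in sorted(block_numbers):
        if segments and block_no == segments[-1][1] + 1:
            segments[-1] = (segments[-1][0], block_no)
        else:
            segments.append((block_no, block_no))
    return segments


def snapshot_segment(pfs: Dict[int, object], savvol: InMemorySavVol, segment: Segment) -> int:
    first, last = segment
    data = [pfs[b] for b in range(first, last + 1)]  # S502: models one batch read
    base = savvol.allocate(len(data))                # S504: allocate storage space
    savvol.write(base, data)                         # S506: one batch write
    return base


def snapshot_all(pfs: Dict[int, object], savvol: InMemorySavVol,
                 block_numbers: Iterable[int]) -> Dict[Segment, int]:
    """Run the three steps for every segment in parallel; map each segment to its base slot."""
    segments = contiguous_segments(block_numbers)
    with ThreadPoolExecutor() as pool:
        bases = list(pool.map(lambda seg: snapshot_segment(pfs, savvol, seg), segments))
    return dict(zip(segments, bases))


pfs = {block_no: f"data-{block_no}" for block_no in range(1, 11)}
savvol = InMemorySavVol()
placement = snapshot_all(pfs, savvol, [2, 3, 4, 7, 8, 9, 10])  # blocks from the Fig. 6D example
```

Because the segments run concurrently, the base slot assigned to each segment may differ between runs; the returned placement records where each segment actually landed.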
Those skilled in the art will understand that each block of the flowcharts described above, and combinations of blocks in the flowcharts, can be implemented by computer program instructions. These program instructions can be provided to a processor to produce a machine, such that the instructions, when executed on the processor, create means for implementing the operations specified in one or more flowchart blocks. The computer program instructions can be executed by a processor to cause a series of operational steps to be performed by the processor to produce a computer-implemented process, such that the instructions executed on the processor provide means for implementing the operations specified in one or more flowchart blocks. The computer program instructions can also cause at least some of the operational steps shown in the flowchart blocks to be performed in parallel. Moreover, some of the steps can be performed across multiple processors, as might occur in a multiprocessor computer system. In addition, one or more blocks or combinations of blocks in the flowchart illustrations can be performed concurrently with other blocks or combinations of blocks, or even in an order different from that illustrated, without departing from the scope or spirit of the present invention.
Fig. 7 shows a block diagram of a device for preserving a data snapshot in a file system according to one embodiment of the present invention.
As shown, device 700 comprises: a reading device 701 configured to read, in a batch, the data in a plurality of data blocks to be updated; an allocating device 702 configured to allocate storage space for the data read in the batch; and a storage control device 703 configured to store, in a batch, into the allocated storage space the data that, among the data read in the batch, is in data blocks to be updated for the first time.
In an optional embodiment of the present invention, device 700 further comprises a trigger device 704 configured to trigger execution of the device for preserving a snapshot when the number of data blocks to be updated for the first time reaches a first value and/or the number of adjacent data blocks to be updated for the first time reaches a second value. The first value and the second value are predetermined or can be adjusted at run time.
In an optional embodiment of the present invention, the first value is the value at which the ratio of the number of data blocks to be updated for the first time to the number of the plurality of data blocks to be updated reaches a first threshold; the second value is the value at which the ratio of the number of data blocks to be updated for the first time to the number of the plurality of data blocks to be updated reaches a second threshold; and the first threshold and the second threshold are predetermined or can be adjusted at run time.
In an optional embodiment of the present invention, device 700 further comprises an updating device 705 configured to batch-update a mapping table, which records the correspondence between the data blocks to be updated for the first time and the positions, in the storage space, of the data in those data blocks.
In an optional embodiment of the present invention, updating the mapping table comprises updating the mapping table with only one read-write operation.
In an optional embodiment of the present invention, device 700 further comprises an information obtaining device 706 configured to obtain information on the data blocks to be updated for the first time by looking up a bitmap stored in memory, wherein the information comprises one or more of the number, the distribution, and the start and end positions of the data blocks to be updated for the first time.
In an optional embodiment of the present invention, the plurality of data blocks to be updated are contiguously distributed, and reading the data in the plurality of data blocks to be updated in a batch comprises reading the data in the contiguously distributed data blocks with one read operation.
In an optional embodiment of the present invention, the start position of the batch read is the first data block to be updated for the first time among the plurality of data blocks to be updated; and/or the end position of the batch read is the last data block to be updated for the first time among the plurality of data blocks to be updated.
In an optional embodiment of the present invention, the storage space allocated for the data read in the batch comprises contiguous storage space allocated at one time, and the batch storing comprises storing with only one write operation.
In an optional embodiment of the present invention, the size of the storage space allocated for the data read in the batch corresponds to the size of the storage space required to store the data in the data blocks to be updated for the first time, and the storage control device is further configured to store only the data that, among the data read in the batch, is in the data blocks to be updated for the first time.
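The embodiments above (one read spanning the first to the last first-update block, a contiguous allocation sized only for the first-update blocks, and a single write of just those blocks) can be combined in a small sketch. It is a simplified in-memory model, not the device itself: `batch_cofw` is a hypothetical name, a list slice models the single range read, and appending to a list models the one-time contiguous allocation and the single write.

```python
from typing import Dict, List


def batch_cofw(pfs_blocks: List[str], snapshot_table: List[int],
               savvol_slots: List[str]) -> Dict[int, int]:
    """Preserve first-update blocks in one read and one write.

    pfs_blocks holds block #n at index n - 1; snapshot_table lists the block
    numbers to preserve in ascending order. Returns block number -> slot index.
    """
    if not snapshot_table:
        return {}
    first, last = snapshot_table[0], snapshot_table[-1]
    # One read from the first to the last first-update block; blocks in between
    # that do not need a snapshot are read too, but discarded below.
    batch = pfs_blocks[first - 1:last]
    wanted = set(snapshot_table)
    to_store = [data for block_no, data in zip(range(first, last + 1), batch)
                if block_no in wanted]
    # Contiguous space sized only for the first-update blocks, filled by one write.
    base = len(savvol_slots)
    savvol_slots.extend(to_store)
    return {block_no: base + offset for offset, block_no in enumerate(snapshot_table)}


pfs_blocks = [f"block-{n}" for n in range(1, 11)]
savvol_slots: List[str] = []
slot_of = batch_cofw(pfs_blocks, [1, 3, 4, 5, 6, 7, 8, 10], savvol_slots)  # Fig. 6A blocks
print(len(savvol_slots), slot_of[10])
```

In this model, blocks #2 and #9 are read as part of the range but never stored, which is the trade-off between a single large read and the storage consumed.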
In an optional embodiment of the present invention, the batch storing uses full-stripe writes.
In an optional embodiment of the present invention, the storage space comprises dedicated storage space on a disk, and the data stored in the batch is a snapshot.
Fig. 8 shows a block diagram of a device for preserving a data snapshot in a file system according to another embodiment of the present invention.
As shown, device 800 is configured for one or more segments of contiguous data blocks to be updated for the first time in the file system and comprises: a reading device 801 configured to read, in a batch, the data in each segment of contiguous data blocks; an allocating device 802 configured to allocate storage space for the data read in the batch; and a storage control device 803 configured to store the data read in the batch into the allocated storage space in a batch.
In an optional embodiment of the present invention, the reading device 801, the allocating device 802 and the storage control device 803 are started in parallel for the one or more segments of contiguous data blocks to be updated for the first time.
It should be noted that, although several devices or sub-devices of the apparatus are mentioned in the detailed description above, this division is merely exemplary and not mandatory. In fact, according to embodiments of the present invention, the features and functions of two or more of the devices described above can be embodied in one device. Conversely, the features and functions of one device described above can be further divided so as to be embodied by a plurality of devices.
In particular, besides hardware implementations, embodiments of the present invention can also be realized in the form of a computer program product. For example, methods 400 and 500 described with reference to Fig. 4 and Fig. 5 can be realized by a computer program product. The computer program product can, for example, be stored in RAM, ROM, a hard disk and/or any suitable storage medium, or be downloaded to a computer system from a suitable location over a network. The computer program product can comprise a computer code portion that includes program instructions executable by a suitable processing device (for example, a central processing unit, CPU). The program instructions can comprise at least: instructions for reading, in a batch, the data in a plurality of data blocks to be updated; instructions for allocating storage space for the data read in the batch; and instructions for storing, in a batch, into the allocated storage space the data that, among the data read in the batch, is in data blocks to be updated for the first time.
The spirit and principles of the present invention have been explained above in conjunction with several embodiments. The advantages of the method of preserving a snapshot in a file system according to the various embodiments of the present invention are described below with reference to the characteristics of COFW.
Because COFW usually works with cache mode and List IO, because the PFS is by default updated in cache mode (cached writes are generally the default mode of a NAS server), and because data blocks are often contiguous on the hard disk in cache mode, the various embodiments of the present invention combine readily with batch I/O operations and obtain very good performance. Likewise, owing to the locality principle of COFW, namely that adjacent blocks are usually allocated and modified locally, cache mode also helps with random updates.
More specifically, according to embodiments of the present invention, the PFS reads and the SavVol writes are each merged into a (for example, single) batch IO operation. This significantly reduces the number of IOs read from or written to the hard disk (typically by 32:1, depending on the file system settings, for example on the number of data blocks that one IO of the file system can access), and it improves IO performance through appropriate support for read-ahead and full-stripe writes. At the same time, the function calls for snapshot storage allocation and mapping table updates are significantly reduced (by up to 32:1). Lock contention among multiple snapshot service threads is also reduced, which supports multiple write streams (that is, writes to different files) running in parallel. In particular, the present invention can obtain better performance for sequential writes and for multiple snapshots. Meanwhile, a policy that detects the update pattern and automatically switches the snapshot method allows a more flexible balance between performance and storage consumption. Moreover, the present invention is compatible with current snapshots; in fact, the snapshot mode of the present invention and prior-art snapshot modes can coexist and be switched at run time. In addition, since almost all the changes occur in in-memory structures and logic, the present invention is also easy to implement.
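The 32:1 figure can be reproduced with simple arithmetic. The sketch below assumes, purely for illustration, that a per-block copy-on-first-write issues one read and one write for every block, that a batched operation issues one read and one write per group of blocks, and that one IO can cover 32 blocks; the function name and these parameters are not taken from the source.

```python
from typing import Tuple


def io_counts(num_blocks: int, blocks_per_io: int = 32) -> Tuple[int, int]:
    """Return (per-block IO count, batched IO count) for preserving num_blocks blocks."""
    per_block = 2 * num_blocks                 # one read plus one write per block
    batches = -(-num_blocks // blocks_per_io)  # ceiling division
    batched = 2 * batches                      # one read plus one write per batch
    return per_block, batched


per_block, batched = io_counts(32)
print(per_block, batched, per_block // batched)
```

Under these assumptions, 32 blocks cost 64 IOs when handled one at a time and 2 IOs when batched, which is the 32:1 reduction; with fewer blocks per IO the ratio shrinks accordingly.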
Although the present invention has been described with reference to several embodiments, it should be understood that the present invention is not limited to the disclosed embodiments. The present invention is intended to cover the various modifications and equivalent arrangements included within the spirit and scope of the appended claims. The scope of the claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.

Claims (28)

1. A method of preserving a snapshot in a file system, comprising:
reading, in a batch, data in a plurality of data blocks to be updated;
allocating storage space for the data read in the batch; and
storing, in a batch, into the allocated storage space the data that, among the data read in the batch, is in data blocks to be updated for the first time.
2. The method according to claim 1, further comprising:
carrying out the method of preserving a snapshot when the number of data blocks to be updated for the first time reaches a first value and/or the number of adjacent data blocks to be updated for the first time reaches a second value,
wherein the first value and the second value are predetermined or can be adjusted at run time.
3. The method according to claim 2, wherein the first value is the value at which the ratio of the number of data blocks to be updated for the first time to the number of the plurality of data blocks to be updated reaches a first threshold; the second value is the value at which the ratio of the number of data blocks to be updated for the first time to the number of the plurality of data blocks to be updated reaches a second threshold; and the first threshold and the second threshold are predetermined or can be adjusted at run time.
4. The method according to any one of claims 1-3, further comprising:
batch-updating a mapping table, wherein the mapping table records the correspondence between the data blocks to be updated for the first time and the positions, in the storage space, of the data in the data blocks to be updated for the first time.
5. The method according to claim 4, wherein updating the mapping table comprises updating the mapping table with only one read-write operation.
6. The method according to any one of claims 1-3, further comprising obtaining information on the data blocks to be updated for the first time by looking up a bitmap stored in memory, wherein the information comprises one or more of the number, the distribution, and the start and end positions of the data blocks to be updated for the first time.
7. The method according to any one of claims 1-3, wherein the plurality of data blocks to be updated are contiguously distributed, and reading the data in the plurality of data blocks to be updated in a batch comprises reading the data in the contiguously distributed data blocks with one read operation.
8. The method according to any one of claims 1-3, wherein the start position of the batch read is the first data block to be updated for the first time among the plurality of data blocks to be updated; and/or the end position of the batch read is the last data block to be updated for the first time among the plurality of data blocks to be updated.
9. The method according to any one of claims 1-3, wherein the storage space allocated for the data read in the batch comprises contiguous storage space allocated at one time, and the batch storing comprises storing with only one write operation.
10. The method according to any one of claims 1-3, wherein the size of the storage space allocated for the data read in the batch corresponds to the size of the storage space required to store the data in the data blocks to be updated for the first time, and the method of preserving a snapshot further comprises:
storing only the data that, among the data read in the batch, is in the data blocks to be updated for the first time.
11. The method according to any one of claims 1-3, wherein the batch storing uses full-stripe writes.
12. The method according to any one of claims 1-3, wherein the storage space comprises dedicated storage space on a disk, and the data stored in the batch is a snapshot.
13. A method of preserving a snapshot in a file system, wherein the file system has one or more segments of contiguous data blocks to be updated for the first time, the method comprising:
reading, in a batch, the data in each segment of contiguous data blocks;
allocating storage space for the data read in the batch; and
storing the data read in the batch into the allocated storage space in a batch.
14. The method according to claim 13, wherein the reading step, the allocating step and the storing step are carried out in parallel for the one or more segments of contiguous data blocks to be updated for the first time.
15. A device for preserving a snapshot in a file system, comprising:
a reading device for reading, in a batch, data in a plurality of data blocks to be updated;
an allocating device for allocating storage space for the data read in the batch; and
a storage control device for storing, in a batch, into the allocated storage space the data that, among the data read in the batch, is in data blocks to be updated for the first time.
16. The device according to claim 15, further comprising:
a trigger device for triggering execution of the device for preserving a snapshot when the number of data blocks to be updated for the first time reaches a first value and/or the number of adjacent data blocks to be updated for the first time reaches a second value,
wherein the first value and the second value are predetermined or can be adjusted at run time.
17. The device according to claim 16, wherein the first value is the value at which the ratio of the number of data blocks to be updated for the first time to the number of the plurality of data blocks to be updated reaches a first threshold; the second value is the value at which the ratio of the number of data blocks to be updated for the first time to the number of the plurality of data blocks to be updated reaches a second threshold; and the first threshold and the second threshold are predetermined or can be adjusted at run time.
18. The device according to any one of claims 15-17, further comprising:
an updating device for batch-updating a mapping table, wherein the mapping table records the correspondence between the data blocks to be updated for the first time and the positions, in the storage space, of the data in the data blocks to be updated for the first time.
19. The device according to claim 18, wherein updating the mapping table comprises updating the mapping table with only one read-write operation.
20. The device according to any one of claims 15-17, further comprising an information obtaining device for obtaining information on the data blocks to be updated for the first time by looking up a bitmap stored in memory, wherein the information comprises one or more of the number, the distribution, and the start and end positions of the data blocks to be updated for the first time.
21. The device according to any one of claims 15-17, wherein the plurality of data blocks to be updated are contiguously distributed, and reading the data in the plurality of data blocks to be updated in a batch comprises reading the data in the contiguously distributed data blocks with one read operation.
22. The device according to any one of claims 15-17, wherein the start position of the batch read is the first data block to be updated for the first time among the plurality of data blocks to be updated; and/or the end position of the batch read is the last data block to be updated for the first time among the plurality of data blocks to be updated.
23. The device according to any one of claims 15-17, wherein the storage space allocated for the data read in the batch comprises contiguous storage space allocated at one time, and the batch storing comprises storing with only one write operation.
24. The device according to any one of claims 15-17, wherein the size of the storage space allocated for the data read in the batch corresponds to the size of the storage space required to store the data in the data blocks to be updated for the first time, and the storage control device is further used for:
storing only the data that, among the data read in the batch, is in the data blocks to be updated for the first time.
25. The device according to any one of claims 15-17, wherein the batch storing uses full-stripe writes.
26. The device according to any one of claims 15-17, wherein the storage space comprises dedicated storage space on a disk, and the data stored in the batch is a snapshot.
27. A device for preserving a snapshot in a file system, wherein the file system has one or more segments of contiguous data blocks to be updated for the first time, the device comprising:
a reading device for reading, in a batch, the data in each segment of contiguous data blocks;
an allocating device for allocating storage space for the data read in the batch; and
a storage control device for storing the data read in the batch into the allocated storage space in a batch.
28. The device according to claim 27, wherein the reading device, the allocating device and the storage control device are started in parallel for the one or more segments of contiguous data blocks to be updated for the first time.
CN201210103128.1A 2012-03-30 2012-03-30 Method and apparatus for preserving snapshots in a file system Active CN103365926B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210103128.1A CN103365926B (en) 2012-03-30 2012-03-30 Method and apparatus for preserving snapshots in a file system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210103128.1A CN103365926B (en) 2012-03-30 2012-03-30 Method and apparatus for preserving snapshots in a file system

Publications (2)

Publication Number Publication Date
CN103365926A true CN103365926A (en) 2013-10-23
CN103365926B CN103365926B (en) 2017-10-24

Family

ID=49367288

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210103128.1A Active CN103365926B (en) 2012-03-30 2012-03-30 Method and apparatus for preserving snapshots in a file system

Country Status (1)

Country Link
CN (1) CN103365926B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106202350A (en) * 2016-07-05 2016-12-07 浪潮(北京)电子信息产业有限公司 A kind of distributed file system simplifies the method and system of configuration automatically
CN106537375A (en) * 2014-06-26 2017-03-22 英特尔公司 Memcached systems having local caches
CN109324929A (en) * 2018-09-17 2019-02-12 郑州云海信息技术有限公司 A kind of snapshot creation method, device, equipment and readable storage medium storing program for executing
CN109462651A (en) * 2018-11-19 2019-03-12 郑州云海信息技术有限公司 Method, apparatus, system and the readable storage medium storing program for executing of cloud in a kind of mirrored volume data
CN109491961A (en) * 2018-10-22 2019-03-19 郑州云海信息技术有限公司 A kind of method and Snapshot Devices of file system snapshot
CN111563053A (en) * 2020-07-10 2020-08-21 阿里云计算有限公司 Method and device for processing Bitmap data
CN114461456A (en) * 2022-04-11 2022-05-10 成都云祺科技有限公司 CDP backup method, system, storage medium and recovery method based on continuous writing

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040168034A1 (en) * 2003-02-26 2004-08-26 Hitachi, Ltd. Storage apparatus and its management method
US20060010227A1 (en) * 2004-06-01 2006-01-12 Rajeev Atluri Methods and apparatus for accessing data from a primary data storage system for secondary storage
CN101661415A (en) * 2009-09-21 2010-03-03 中兴通讯股份有限公司 Method for memorizing snapshot data and system for memorizing snapshot
CN101777016A (en) * 2010-02-08 2010-07-14 北京同有飞骥科技有限公司 Snapshot storage and data recovery method of continuous data protection system
US20130339301A1 (en) * 2012-06-04 2013-12-19 Google Inc. Efficient snapshot read of a database in a distributed storage system

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106537375A (en) * 2014-06-26 2017-03-22 英特尔公司 Memcached systems having local caches
CN106202350A (en) * 2016-07-05 2016-12-07 浪潮(北京)电子信息产业有限公司 A kind of distributed file system simplifies the method and system of configuration automatically
CN109324929A (en) * 2018-09-17 2019-02-12 郑州云海信息技术有限公司 A kind of snapshot creation method, device, equipment and readable storage medium storing program for executing
CN109324929B (en) * 2018-09-17 2021-11-09 郑州云海信息技术有限公司 Snapshot creating method, device and equipment and readable storage medium
CN109491961A (en) * 2018-10-22 2019-03-19 郑州云海信息技术有限公司 A kind of method and Snapshot Devices of file system snapshot
CN109491961B (en) * 2018-10-22 2022-02-18 郑州云海信息技术有限公司 File system snapshot method and snapshot device
CN109462651A (en) * 2018-11-19 2019-03-12 郑州云海信息技术有限公司 Method, device and system for cloud-up of mirror volume data and readable storage medium
CN109462651B (en) * 2018-11-19 2021-11-19 郑州云海信息技术有限公司 Method, device and system for cloud-up of mirror volume data and readable storage medium
CN111563053A (en) * 2020-07-10 2020-08-21 阿里云计算有限公司 Method and device for processing Bitmap data
WO2022007937A1 (en) * 2020-07-10 2022-01-13 阿里云计算有限公司 Method and device for processing bitmap data
CN114461456A (en) * 2022-04-11 2022-05-10 成都云祺科技有限公司 CDP backup method, system, storage medium and recovery method based on continuous writing
CN114461456B (en) * 2022-04-11 2022-06-21 成都云祺科技有限公司 CDP backup method, system, storage medium and recovery method based on continuous writing

Also Published As

Publication number Publication date
CN103365926B (en) 2017-10-24

Similar Documents

Publication Publication Date Title
CN105718548B (en) System and method for scalable reference management in a deduplication-based storage system
CN103365926A (en) Method and device for storing snapshot in file system
CN107728937B (en) Key value pair persistent storage method and system using nonvolatile memory medium
Skourtis et al. Flash on rails: Consistent flash performance through redundancy
CN103019888B (en) Backup method and device
CN106687911B (en) Online data movement without compromising data integrity
US11249664B2 (en) File system metadata decoding for optimizing flash translation layer operations
KR101491626B1 (en) Memory storage apparatus, memory system and transaction function support method for database
CN105718217B (en) A kind of method and device of simplify configuration storage pool data sign processing
CN103034566B (en) Method and device for restoring virtual machine
CN104111897A (en) Data processing method, data processing device and computer system
CN109725840A (en) Throttling writes using asynchronous flushing
CN108701048A (en) Data load method and device
CN102023809A (en) Storage system, method for reading data from storage system and method for writing data to storage system
CN108694231A (en) Write-ahead logging using NVM and multiple log recording buffers
CN110188108A (en) Data storage method, device, system, computer equipment and storage medium
CN103500089A (en) Small file storage system suitable for MapReduce calculation model
CN103761053A (en) Data and method for data processing
CN103198088A (en) Shadow paging based log segment directory
EP3494493B1 (en) Repartitioning data in a distributed computing system
CN102169460A (en) Method and device for managing variable length data
US9471366B2 (en) Virtual machine disk image backup using block allocation area
CN112181736A (en) Distributed storage system and configuration method thereof
CN103034591A (en) Memory sharing method and device for virtual machine
KR101529651B1 (en) Memory storage apparatus, memory system and transaction function support method for database

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 2020-04-14
Address after: Massachusetts, USA
Patentee after: EMC IP Holding Company LLC
Address before: Massachusetts, USA
Patentee before: EMC Corp.