Publication number: CN100337233 C
Publication type: Grant
Application number: CN 01808063
PCT number: PCT/US2001/008486
Publication date: 12 Sep 2007
Filing date: 16 Mar 2001
Priority date: 30 Mar 2000
Also published as: CN1449530A, CN1746892A, CN1746893A, CN1746893B, CN100445998C, EP1269353A2, US6856993, US7257595, US7418463, US7512636, US7613698, US8010559, US8510336, US20050120036, US20050120059, US20050138085, US20050149525, US20100042626, US20110276611, US20130325830, WO2001077908A2, WO2001077908A3
Inventors: S. Verma, T. J. Miller, R. G. Atkinson
Applicant: Microsoft Corporation
Transactional file system
CN 100337233 C
Claims (42) (translated from Chinese)
1. A method for maintaining files, comprising: receiving, at a file system, a first request to perform a first file system operation thereon, the first request being associated with a transaction; performing the first file system operation; maintaining information, accessible by the file system, indicating that the first file system operation is associated with the transaction; and, if the transaction commits, committing the first file system operation, and, if the transaction does not commit, undoing the first file system operation.
2. The method of claim 1, further comprising receiving, at the file system, information indicating that the transaction has committed.
3. The method of claim 1, wherein the first file system operation comprises creating a file.
4. The method of claim 1, wherein the first file system operation comprises deleting a file.
5. The method of claim 1, wherein the first file system operation comprises renaming a file.
6. The method of claim 1, further comprising recording information corresponding to the first file system operation in at least one log.
7. The method of claim 6, wherein the step of undoing the first file system operation if the transaction does not commit comprises accessing the at least one log and undoing the first file system operation according to the information in the at least one log.
8. The method of claim 1, further comprising receiving, at the file system, information indicating that the transaction has aborted and, in response, determining that the transaction did not commit.
9. The method of claim 1, wherein committing the first file system operation comprises modifying the information indicating that the first file system operation is associated with the transaction.
10. The method of claim 1, further comprising: receiving, at the file system, a second request to perform a second file system operation thereon, the second request being associated with the transaction; performing the second file system operation; maintaining information, accessible by the file system, indicating that the second file system operation is associated with the transaction; and, if the transaction commits, committing the second file system operation.
11. The method of claim 1, wherein the first file system operation comprises opening a file.
12. The method of claim 11, further comprising maintaining a version of the file associated with the transaction.
13. The method of claim 1, wherein the information accessible by the file system and indicating that the first file system operation is associated with the transaction comprises a flag in a request packet.
14. The method of claim 1, wherein the information accessible by the file system and indicating that the first file system operation is associated with the transaction comprises a context, and wherein the file system associates the first request with the transaction through data in the context.
15. The method of claim 14, wherein the context is associated with a thread.
16. The method of claim 14, wherein the context is associated with a process.
17. The method of claim 1, wherein the information accessible by the file system and indicating that the first file system operation is associated with the transaction comprises a context associated with a file handle of a first file.
18. The method of claim 1, wherein the first file system operation is performed on a first network computing device, and another file system operation associated with the transaction is performed on another network computing device.
19. A method in a computing environment, comprising: receiving a first request at a file system, the first request corresponding to a first file and to a requested operation to be performed with respect to the first file, the first request being associated with data indicating to the file system that the first request is associated with a transaction; identifying, at the file system, that the first file is associated with the transaction; attempting to perform the requested operation on the first file; receiving a second request at the file system, the second request corresponding to a second file and to a requested operation to be performed with respect to the second file, the second request being associated with data indicating to the file system that the second request is associated with a transaction; identifying, at the file system, that the second file is associated with the transaction; attempting to perform the requested operation on the second file; and, if the requested operation attempted on the first file fails, or the requested operation attempted on the second file fails, or both operations fail, undoing, at the file system, any successfully performed file system operations associated with the transaction.
20. The method of claim 19, wherein the requested operation on the first file and the requested operation on the second file are performed successfully, and further comprising: receiving, at the file system, information indicating that the transaction has been committed and, in response, committing the requested operation on the first file and the requested operation on the second file.
21. The method of claim 19, wherein the requested operation on the first file and the requested operation on the second file are performed successfully, and further comprising: receiving, at the file system, information indicating that the transaction has been aborted and, in response, undoing the requested operation on the first file and the requested operation on the second file.
22. The method of claim 19, wherein the requested operation on the first file comprises creating a file, and the step of identifying, at the file system, that the first file is associated with the transaction comprises evaluating a flag.
23. The method of claim 19, wherein the step of identifying, at the file system, that the first file is associated with the transaction comprises evaluating a context associated with a file handle corresponding to the first file.
24. The method of claim 19, further comprising recording, in a log, information corresponding to the requested operation on the first file.
25. The method of claim 24, wherein the requested operation on the first file is performed successfully, and the step of undoing the requested operation on the first file comprises reading the information from the log.
26. A method in a computing environment, comprising: receiving a first request at a file system, the first request corresponding to a first file and to a requested operation to be performed with respect to the first file, the first request having associated with it data indicating that the first request is associated with a transaction; identifying, at the file system, that the first file is associated with the transaction; performing the requested operation on the first file; receiving a second request at the file system, the second request corresponding to a second file and to a requested operation to be performed with respect to the second file, the second request having associated with it data indicating that the second request is associated with a transaction; identifying, at the file system, that the second file is associated with the transaction, wherein the transaction is also associated with the first file; performing the requested operation on the second file; and receiving, at the file system, information as to whether the transaction was successfully committed, and, if the transaction was successfully committed, committing the requested operation on the first file and the requested operation on the second file, and, if the transaction was not successfully committed, undoing the requested operation on the first file and the requested operation on the second file.
27. The method of claim 26, wherein the first file and the second file are created in a common directory, and wherein the step of identifying that the first and second files are associated with the transaction comprises identifying data associated with the common directory.
28. The method of claim 26, wherein the first file and the second file are created by a common thread, and wherein the step of identifying that the first and second files are associated with the transaction comprises identifying data associated with the common thread.
29. The method of claim 26, wherein the first file and the second file are created by a common process, and wherein the step of identifying that the first and second files are associated with the transaction comprises identifying data associated with the common process.
30. The method of claim 26, wherein the first file is created by a function that indicates to the file system, through data associated with the first request, that the first file is to be associated with the transaction.
31. The method of claim 26, wherein the step of identifying, at the file system, that the first file is associated with the transaction comprises evaluating a context associated with a file handle corresponding to the first file.
32. A method in a computing environment, comprising: receiving, at a file system, a first request corresponding to an operation requested to be performed, the first request being associated with data indicating to the file system that the first request is associated with a transaction; identifying, at the file system, that the first request is associated with the transaction; performing the operation corresponding to the first request; receiving, at the file system, a second request corresponding to an operation requested to be performed, the second request being associated with data indicating to the file system that the second request is associated with a transaction, the transaction also being associated with the first request; identifying, at the file system, that the second request is associated with the transaction; performing the operation corresponding to the second request; and, if the transaction is aborted, undoing, at the file system, any performed file system operations associated with the transaction.
33. The method of claim 32, further comprising receiving, at the file system, information indicating that the transaction has been committed and, in response, committing the operation corresponding to the first request and the operation corresponding to the second request.
34. The method of claim 32, wherein the operation corresponding to the first request comprises creating a first file, and wherein the step of identifying, at the file system, that the first request is associated with the transaction comprises evaluating a flag.
35. The method of claim 34, wherein the step of identifying, at the file system, that the first request is associated with the transaction comprises evaluating a context associated with a file handle corresponding to the first file.
36. The method of claim 32, further comprising recording, in a log, information corresponding to the operation corresponding to the first request.
37. The method of claim 36, wherein the step of undoing the operation corresponding to the first request comprises reading the information from the log.
38. The method of claim 32, wherein the first request corresponds to a first file and the second request corresponds to a second file, the first file and the second file are created in a common directory, and the step of identifying that the first and second requests are associated with the transaction comprises identifying data associated with the common directory.
39. The method of claim 32, wherein the first request corresponds to a first file and the second request corresponds to a second file, the first file and the second file are created by a common thread, and the step of identifying that the first and second requests are associated with the transaction comprises identifying data associated with the common thread.
40. The method of claim 32, wherein the first request corresponds to a first file and the second request corresponds to a second file, the first file and the second file are created by a common process, and the step of identifying that the first and second requests are associated with the transaction comprises identifying data associated with the common process.
41. The method of claim 32, wherein the first request corresponds to a first file and the second request corresponds to a second file, and wherein the first file is created by a function that indicates to the file system, through data associated with the first request, that the first file is to be associated with the transaction.
42. The method of claim 32, wherein the first request corresponds to a first file and the second request corresponds to a second file, and wherein the step of identifying, at the file system, that the first request is associated with the transaction comprises evaluating a context associated with a file handle corresponding to the first file.
Description (translated from Chinese)
Transactional file system

FIELD

The present invention relates generally to computers and file systems.

BACKGROUND

A typical file system provides mechanisms for manipulating a file hierarchy, including creating new files or directories, deleting or renaming files or directories, and operating on file contents. Some file systems provide certain guarantees about the integrity of a single low-level operation (i.e., a primitive). For example, a primitive that creates a new file either completes successfully, or any partial effects of the create operation are undone by the system.

However, multiple file system operations at the user level are not tied together within the file system. For example, there is presently no way for a file system to create four files, delete three others and rename another, and, if any one of these operations fails, undo all of the others. As a result, a higher-level (user-level) process such as an application is used to manage such multiple operations, i.e., to specify to the file system which actions apply to which files and/or directories.

This solution has its own drawbacks, however. Consider an example in which a web site has twenty web pages linked to each other in a way that gives the site a consistent look and feel. If the system fails while the site is being updated, the site can be left in an inconsistent state. For example, the application performing the update may have deleted some files but, at the time of the failure, not yet removed the links in other files that point to them. A user viewing the site will see some of the pages, but will receive an error message when clicking a link to a deleted page.

To guard against being left in an inconsistent state, the entire file hierarchy of the web pages is typically copied before any file in the hierarchy is changed. If a failure occurs, the saved hierarchy is copied back. However, such copying of files is slow; it is relatively clumsy, because the copy program needs to know in advance which parts of the system are going to be updated; and it is error-prone, because any file that is inadvertently not copied cannot be recovered.
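The copy-then-restore workaround described above can be sketched as follows. This is only an illustration of the prior-art approach, not code from the patent: the function name `update_with_backup` and the `update` callback are hypothetical, and the sketch only reacts to an exception raised inside the running program, whereas the text is concerned with system failures as well.

```python
import shutil
import tempfile
from pathlib import Path


def update_with_backup(site_root: Path, update) -> bool:
    """Copy the whole hierarchy, run `update`, and copy it back on failure.

    `update` is a hypothetical callback that modifies files under site_root.
    """
    backup_parent = Path(tempfile.mkdtemp())
    backup = backup_parent / "backup"
    shutil.copytree(site_root, backup)      # slow: every file is copied up front
    try:
        update(site_root)
        return True
    except Exception:
        shutil.rmtree(site_root)            # discard the half-updated hierarchy
        shutil.copytree(backup, site_root)  # restore the saved copy
        return False
    finally:
        shutil.rmtree(backup_parent, ignore_errors=True)
```

Even in this small form the drawbacks named in the text are visible: the cost of the copy grows with the size of the hierarchy rather than with the size of the change, and anything outside `site_root` that the update touches is not protected.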

If the files are changed in place when a higher-level process is used to update them, any changes in progress are visible to users viewing the site. For example, with the web site described above, any change is visible to existing users of the system while the files (and the name hierarchy) are being changed by the application. Because the system state is typically inconsistent until all of the changes have been made, users may see the inconsistency. For example, when the application has deleted a page but has not yet removed the link pointing to it, an existing user may see a link (URL) on a page, click it, and end up at a page that has been deleted.

Web page updates aside, other programs are similarly limited in their ability to save information consistently. For example, a typical word processing or spreadsheet application performs all saves through rename and delete operations, using temporary files to avoid an inconsistent state resulting from a subsequent system failure. Such applications may also need to distribute information across different data stores. For example, an application may wish to store tabular data in SQL Server and files on a file server and/or an Internet server, where such files may include word processing documents, presentation graphics and/or web pages. However, no mechanism presently exists to support saving such information in a coordinated, unified manner. For example, if the system fails while such information is being saved, some of the information will be saved and the rest will not, again leading to an inconsistent state.

SUMMARY

Briefly, the present invention provides a system and method by which multiple file system operations may be performed as part of a single user-level transaction. The transactional file system of the present invention enables a user to selectively control the scope (context) and duration of a transaction within the file system.

During a file open or create, the application specifies whether operations on that instance of the file open should be handled as part of a transaction. In addition, the system has the ability to persistently mark files that can only be operated on transactionally, with the application specifying the transaction via a globally unique ID (GUID) at open/create time. For new file creation, the parent directory is marked as transacted, and the application can associate a transaction with a thread/process, whereby file operations by such threads/processes are transacted in the context of the specified transaction. Further, the application may choose to instruct the system (e.g., via an API) that child threads/processes inherit the transaction context, enabling applications to make use of transactions without any significant modification to the application source code.
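The ways of associating a file open with a transaction described above (an explicit GUID at open time, a transaction attached to the calling thread, and inheritance by child threads) can be modeled in a few lines. This is a minimal sketch under stated assumptions: `set_thread_transaction`, `open_file` and `spawn_inheriting` are hypothetical names invented for illustration and are not the API of the patented system or of any real file system.

```python
import threading
import uuid
from typing import Callable, Optional

_context = threading.local()  # per-thread transaction association


def set_thread_transaction(txn_id: Optional[uuid.UUID]) -> None:
    """Associate (or clear) a transaction for the calling thread."""
    _context.txn_id = txn_id


def current_transaction() -> Optional[uuid.UUID]:
    return getattr(_context, "txn_id", None)


def open_file(name: str, txn_id: Optional[uuid.UUID] = None) -> dict:
    """Hypothetical open: an explicit GUID wins, else the thread's transaction."""
    effective = txn_id if txn_id is not None else current_transaction()
    return {"name": name, "txn_id": effective}


def spawn_inheriting(target: Callable[[], None]) -> threading.Thread:
    """Start a child thread that inherits the parent's transaction context."""
    parent_txn = current_transaction()

    def runner() -> None:
        set_thread_transaction(parent_txn)
        target()

    thread = threading.Thread(target=runner)
    thread.start()
    return thread
```

In this model, code running in the child thread calls `open_file` exactly as it would without transactions, which mirrors the point in the text that inheritance lets existing application code participate in a transaction largely unchanged.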

Once a file has been opened transactionally, the system automatically includes operations on the file handle, such as read, write, delete or rename, as part of the transaction. As a result, applications can call existing file system APIs, continue to see the existing per-operation semantics, and still have the operations included as part of a transaction. An application is free to use as many transactions as it wishes, to share a transaction with other applications, to have as many threads/processes as it wishes share a transaction, and so on. The transaction may also be specified on a file open for a file that resides on a different machine.

Other aspects of the invention are directed to logging that enables recovery from a failed transaction. Under a transaction, the changes made by the system are undone if the transaction fails for any reason, including system failure and application failure, and, if the system successfully commits the transaction on behalf of the application, the changes made by the system for that transaction are guaranteed to survive system failures (such as a power outage). This is accomplished by a multiple-level logging mechanism together with another mechanism that determines whether lower-level logged operations were successfully committed, and thereby whether the higher-level logged operations actually occurred.
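The undo behavior described above, which also corresponds to the log recited in claims 6 and 7, can be illustrated with a small in-memory model. This is a sketch under stated assumptions rather than the patented mechanism: the `Transaction` class and `run_transaction` helper are hypothetical, the "file system" is a plain dictionary, and durability across power loss and the multiple-level logging are not modeled at all.

```python
class Transaction:
    """Groups create/delete/rename operations and records how to undo them."""

    def __init__(self, files: dict):
        self.files = files  # name -> contents; stands in for the file system
        self.log = []       # undo records, oldest first

    def create(self, name, data=b""):
        if name in self.files:
            raise FileExistsError(name)
        self.files[name] = data
        self.log.append(("create", name))

    def delete(self, name):
        old = self.files.pop(name)  # raises KeyError if the file is missing
        self.log.append(("delete", name, old))

    def rename(self, old, new):
        if new in self.files:
            raise FileExistsError(new)
        self.files[new] = self.files.pop(old)
        self.log.append(("rename", old, new))

    def commit(self):
        self.log.clear()  # changes stay; undo records are no longer needed

    def abort(self):
        while self.log:   # undo in reverse order of execution
            record = self.log.pop()
            if record[0] == "create":
                del self.files[record[1]]
            elif record[0] == "delete":
                self.files[record[1]] = record[2]
            else:
                _, old, new = record
                self.files[old] = self.files.pop(new)


def run_transaction(files: dict, operations) -> bool:
    """Apply every operation, or undo all of them if any one fails."""
    txn = Transaction(files)
    try:
        for operation in operations:
            operation(txn)
    except Exception:
        txn.abort()
        return False
    txn.commit()
    return True
```

For example, a batch that creates one file, deletes another, renames a third and then tries to delete a nonexistent file leaves the dictionary exactly as it started, because the first three operations are undone from the log when the fourth fails.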

Changes to data are logged by separating the operational events into one log and the actual data-write details of the transaction into another. A mechanism writes a signature and later compares it with the logged record and the data to determine whether a logged record is synchronized with its corresponding data page, eliminating the requirement that the log be written to disk in a particular order relative to the data.
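One way to picture the signature comparison is shown below. This passage does not say what the signature is or where it is stored, so the sketch simply assumes an integer stamped on both the log record and the data page; `LogRecord`, `DataPage` and `is_synchronized` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class LogRecord:
    page_no: int
    signature: int  # value the data page carries once this write has reached disk


@dataclass
class DataPage:
    signature: int
    payload: bytes


def is_synchronized(record: LogRecord, pages: Dict[int, DataPage]) -> bool:
    """Return True if the page on disk carries the signature the record expects."""
    page = pages.get(record.page_no)
    return page is not None and page.signature == record.signature


# The log record for page 3 reached disk, but the page still holds older data.
on_disk = {3: DataPage(signature=6, payload=b"old contents")}
record = LogRecord(page_no=3, signature=7)
stale = not is_synchronized(record, on_disk)
```

Because recovery can detect a mismatch after the fact, neither the log write nor the page write has to wait for the other, which is the ordering freedom the paragraph describes.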

Further aspects of the invention include providing isolation of the namespace and of file data between transactions and other file system operations. Namespace isolation is accomplished through the use of isolation directories that track which names belong to which transaction. As a result, changes made by the system for a given transaction are not visible to other transactions while the modifying transaction is still active, and become visible only after the modifying transaction successfully commits. Transaction-unaware file handles see the changes as they happen. Thus, a file deleted during a first transaction will no longer be seen by that first transaction or by non-transacted accesses, but remains visible to other transactions until the first transaction completes.

To accomplish such namespace isolation, isolation directories are created that are linked to the original NTFS directories, and the appropriate file names are added to the isolation directory instead of the normal NTFS parent directory. For example, for a delete operation, the name of the deleted file is added to the isolation directory at the same time that it is removed from the NTFS parent directory. Prior to commit, a subsequent access to this file by a different transaction is serviced using the isolation directory, whereby the file is found and is considered not deleted. Similarly, if a transaction creates a file, the name is added to the NTFS directory as well as to an isolation directory linked to the parent NTFS directory. The transaction that created the file can see it, but for other transactions the name is filtered out for purposes of opening the file and listing the parent NTFS directory. Isolation directory entries are removed from the isolation directory when the transaction commits or aborts.
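The visibility rules above can be modeled with a set of names standing in for the NTFS parent directory and a side table standing in for the linked isolation directory. This is a simplified sketch, not the patented data structure: the `Directory` class and its methods are hypothetical, a name is assumed to be touched by at most one transaction at a time, and restoring the directory on abort is an assumption based on the undo behavior described earlier.

```python
class Directory:
    def __init__(self, names):
        self.names = set(names)  # stands in for the NTFS parent directory
        self.isolation = {}      # name -> (owning transaction, "created" | "deleted")

    def delete(self, name, txn):
        self.names.remove(name)
        self.isolation[name] = (txn, "deleted")

    def create(self, name, txn):
        self.names.add(name)
        self.isolation[name] = (txn, "created")

    def listing(self, viewer=None):
        """Names visible to `viewer`; None means a transaction-unaware access."""
        if viewer is None:
            return sorted(self.names)  # sees changes as they happen
        visible = set(self.names)
        for name, (owner, kind) in self.isolation.items():
            if owner == viewer:
                continue               # a transaction sees its own changes
            if kind == "deleted":
                visible.add(name)      # still present for other transactions
            else:
                visible.discard(name)  # filtered out for other transactions
        return sorted(visible)

    def finish(self, txn, committed):
        """Drop the transaction's isolation entries; on abort, undo its changes."""
        for name, (owner, kind) in list(self.isolation.items()):
            if owner != txn:
                continue
            del self.isolation[name]
            if not committed:
                if kind == "deleted":
                    self.names.add(name)
                else:
                    self.names.discard(name)
```

With a directory holding `a.htm` and `b.htm`, a transaction T1 that deletes `a.htm` and creates `c.htm` sees `b.htm` and `c.htm`, while a concurrent transaction T2 continues to see `a.htm` and `b.htm` until T1 commits.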

Thus, the present invention incorporates a transaction mechanism into the file system, enabling applications to easily perform multiple transactional operations on one or more files and overcoming the problems associated with external transaction mechanisms. In this manner, multiple file system operations are tied together transactionally within the file system, such that the operations are either committed together or any partial actions are undone. Moreover, the operations and data changes of one transaction are isolated from the operations and data of another transaction.

Other advantages will become apparent from the following detailed description when taken in conjunction with the drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram representing a computer system into which the present invention may be incorporated;
FIG. 2 is a block diagram representing a general architecture for implementing a transactional file system in accordance with one aspect of the present invention;
FIG. 3 is a block diagram representing a request to create/open a transacted file in accordance with one aspect of the present invention;
FIG. 4 is a block diagram representing a request to perform a file system operation on an open transacted file in accordance with one aspect of the present invention;
FIG. 5 is a block diagram representing isolation of transacted files over a period of time in accordance with one aspect of the present invention;
FIG. 6 is a block diagram representing data structures for tracking file versions in accordance with one aspect of the present invention;
FIG. 7 is a block diagram representing multiple file versions maintained over time in accordance with one aspect of the present invention;
FIG. 8 is a block diagram representing data pages of a file opened transactionally for write;
FIGS. 9-10 are block diagrams representing relationships between data structures for supporting isolation of files opened for read and write in transactions in accordance with one aspect of the present invention;
FIG. 11 is a block diagram representing a two-level logging mechanism and a mechanism for verifying whether the logs are synchronized in accordance with one aspect of the present invention;
FIG. 12 is a block diagram representing page data being logged and a mechanism for verifying whether the page data is synchronized with the log in accordance with one aspect of the present invention;
FIG. 13 is a flow diagram representing actions taken based on whether page data is synchronized with the logged records in accordance with one aspect of the present invention;
FIG. 14 is a block diagram representing multiple file versions maintained over time in an alternative versioning scheme in accordance with one aspect of the present invention;
FIG. 15 is a block diagram representing transactional file system operations over a network in accordance with one aspect of the present invention;
FIGS. 16-18 are block diagrams representing hierarchical file structures and the use of isolation directories to provide namespace isolation in accordance with one aspect of the present invention;
FIGS. 19-22 are flow diagrams representing general rules for using isolation directories to provide namespace isolation in accordance with one aspect of the present invention; and
FIG. 23 is a block diagram representing the floating of a memory-mapped section in accordance with one aspect of the present invention.

DETAILED DESCRIPTION

Figure 1 and the following discussion are intended to provide a brief description of a suitable computing environment in which the invention may be implemented. Although not required, the invention is described in the general context of computer-executable instructions, such as program modules, being executed by a personal computer. Generally, program modules include routines, programs, objects, components, data structures and the like that perform particular tasks or implement particular abstract data types.

In addition, those skilled in the art will appreciate that the invention may be practiced with other computer system configurations, including hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, network PCs, minicomputers, mainframe computers and the like. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.

With reference to Figure 1, an exemplary system for implementing the invention includes a general purpose computing device in the form of a conventional personal computer 20 or the like, including a processing unit 21, a system memory 22, and a system bus 23 that couples various system components, including the system memory, to the processing unit 21. The system bus 23 may be any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. The system memory includes read-only memory (ROM) 24 and random access memory (RAM) 25. A basic input/output system 26 (BIOS), containing the basic routines that help to transfer information between elements within the personal computer 20, such as during start-up, is stored in ROM 24. The personal computer 20 may further include a hard disk drive 27 for reading from and writing to a hard disk (not shown), a magnetic disk drive 28 for reading from or writing to a removable magnetic disk 29, and an optical disk drive 30 for reading from or writing to a removable optical disk 31 such as a CD-ROM or other optical media. The hard disk drive 27, magnetic disk drive 28 and optical disk drive 30 are connected to the system bus 23 by a hard disk drive interface 32, a magnetic disk drive interface 33 and an optical drive interface 34, respectively. The drives and their associated computer-readable media provide non-volatile storage of computer-readable instructions, data structures, program modules and other data for the personal computer 20. Although the exemplary environment described herein employs a hard disk, a removable magnetic disk 29 and a removable optical disk 31, those skilled in the art will appreciate that other types of computer-readable media that can store data accessible by a computer may also be used in the exemplary operating environment, such as magnetic cassettes, flash memory cards, digital video disks, Bernoulli cartridges, random access memories (RAMs), read-only memories (ROMs) and the like.

A number of program modules may be stored on the hard disk, magnetic disk 29, optical disk 31, ROM 24 or RAM 25, including an operating system 35 (preferably Microsoft Corporation's Windows 2000, formerly Windows NT). The computer 20 includes a file system 36 associated with or included within the operating system 35, such as the Windows NT File System (NTFS), one or more application programs 37, other program modules 38 and program data 39. A user may enter commands and information into the personal computer 20 through input devices such as a keyboard 40 and a pointing device 42. Other input devices (not shown) may include a microphone, joystick, game pad, satellite dish, scanner or the like. These and other input devices are often connected to the processing unit 21 through a serial port interface 46 that is coupled to the system bus, but may be connected by other interfaces, such as a parallel port, game port or universal serial bus (USB). A monitor 47 or other type of display device is also connected to the system bus 23 via an interface, such as a video adapter 48. In addition to the monitor 47, personal computers typically include other peripheral output devices (not shown), such as speakers and printers.

The personal computer 20 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 49. The remote computer 49 may be another personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the personal computer 20, although only a memory storage device 50 is illustrated in Figure 1. The logical connections depicted in Figure 1 include a local area network (LAN) 51 and a wide area network (WAN) 52. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets and the Internet.

When used in a LAN networking environment, the personal computer 20 is connected to the local network 51 through a network interface or adapter 53. When used in a WAN networking environment, the personal computer 20 typically includes a modem 54 or other means for establishing communications over the wide area network 52, such as the Internet. The modem 54, which may be internal or external, is connected to the system bus 23 via the serial port interface 46. In a networked environment, program modules described relative to the personal computer 20, or portions thereof, may be stored in the remote memory storage device. It will be appreciated that the network connections shown are exemplary and that other means of establishing a communications link between the computers may be used.

Although the present invention is described with respect to the Windows 2000 operating system and the Microsoft Windows NT File System (NTFS), those skilled in the art will appreciate that other operating systems and/or file systems may also be used with, and benefit from, the present invention.

General Architecture of the Transactional File System

In general, the terms "transaction," "transactional" and so forth as used herein refer to operations that share certain common properties, applied in the present invention to multiple file system operations. The transactional properties are often referred to as the "ACID" properties, standing for atomicity, consistency, isolation and durability. As will be understood below, the present invention realizes these properties in connection with a file system, providing numerous benefits to applications and to computing in general.

As generally shown in Figure 2, file system requests 58 issued from an application 60 or the like to a transaction-enabled file system 62 (as described herein with respect to the present invention), such as the Microsoft Windows NT File System (NTFS) 36 (Figure 1), reach an NTFS component 64 via a dispatch mechanism 66. As is known for conventional file systems, the application 60 may make application programming interface (API) calls to generate these requests, which may, for example, result in I/O request packets (IRPs) being sent to the file system by an I/O manager. In accordance with the present invention and as described below, some of the file system requests 58 are associated with a transaction, while others are not.

In the absence of a transaction, file system requests 58 are dispatched to and handled directly by the NTFS component 64, essentially as they were before the present invention. Similarly, requests 58 that are issued by a transaction, or that are directed to a file or directory modified by an open transaction as described below, also continue to be dispatched normally to and from the NTFS component 64. Such transactional requests, however, result in callouts (callbacks) to a TxF component 70, e.g., implemented inside the file system 62, at strategic points during the otherwise normal processing.

As shown in Figure 2 and described below, the TxF component 70 includes interfaces to external transaction services 72 and a logging service 74, and works together with the NTFS component 64 to handle transactional requests. The external transaction services 72 may include Microsoft Corporation's Distributed Transaction Coordinator (MS DTC, or simply MTC or DTC), in which a client (e.g., the application 60) makes a call to begin a transaction and a subsequent call to commit or abort it. DTC is well documented and is described here only briefly, to the extent that it relates to TxF 70.

As generally shown in Figure 3, in MS DTC an application such as the application 60 begins a transaction, via COM/OLE, by calling a method (i.e., the BeginTransaction method) of a transaction coordinator 76 (Figure 3). The transaction coordinator 76 may be a transaction server in a network or a local proxy thereof. This call creates a transaction object/context 78 that represents the transaction. The application 60 then calls one or more resource managers to do the work of the transaction. In the present invention, the TxF component 70 acts as the resource manager for transactional file system operations. As shown in Figure 3 and described below, API calls to the file system 62 (e.g., CreateFileEx 80 and other file system operations) generate callouts to the TxF component 70.

The application's first call to the file system 62 identifies a file, a directory, or the application's current thread/process, any of which may have the transaction context 78 associated with it. If a transaction context is associated, the file system 62 calls out to TxF 70. When TxF 70 first performs work on behalf of a transaction, it enlists in the transaction by calling the transaction coordinator 76, thereby informing the transaction coordinator 76 that TxF 70 is a resource manager used in this transaction. Note that other resource managers 84 (e.g., the resource manager of a database component) may similarly enlist in this transaction, so that database operations and file system operations can be committed (or aborted) together within the same transaction.

To determine when TxF 70 needs to enlist in a transaction, as shown in Figure 3, a transaction manager 82 layer of the TxF component 70 takes the transaction identifier (ID) that comes in with the ITransaction object 78 and checks it against the known enlisted transactions maintained in a transaction table 86 of transaction IDs. If the transaction is already listed, the transaction ID and a transaction reference are recorded in the I/O request packet (IRP), and the IRP continues. The use of IRPs in NTFS is well documented and, for simplicity, is not described further below. If, however, the transaction is not listed in the table 86, TxF notifies the transaction coordinator 76 that TxF 70 is a resource manager that needs to be associated with this transaction, and stores the transaction identifier in the transaction table 86.
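The table check described above can be sketched as follows. This is a minimal illustration in Python, not the patented implementation: the class names, the `on_request` method and the dictionary standing in for an IRP are all hypothetical.

```python
class Coordinator:
    """Hypothetical stand-in for the transaction coordinator 76."""

    def __init__(self):
        self.enlisted = []  # (transaction ID, resource manager name) pairs

    def enlist(self, tx_id, rm_name):
        self.enlisted.append((tx_id, rm_name))


class TxfManager:
    """Hypothetical stand-in for the TxF transaction manager 82."""

    def __init__(self, coordinator):
        self.coordinator = coordinator
        self.tx_table = set()  # models the transaction table 86

    def on_request(self, tx_id, irp):
        # Enlist only the first time this transaction ID is seen.
        if tx_id not in self.tx_table:
            self.coordinator.enlist(tx_id, "TxF")
            self.tx_table.add(tx_id)
        # Record the transaction ID in the request and let it continue.
        irp["tx_id"] = tx_id
        return irp


coord = Coordinator()
txf = TxfManager(coord)
txf.on_request("T1", {})
txf.on_request("T1", {})  # already in the table: no second enlistment
txf.on_request("T2", {})
```

Each transaction is enlisted exactly once, however many requests arrive on its behalf.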

More specifically, when the transaction is a new transaction not listed in the table 86, enlistment with the transaction coordinator 76 is needed. To this end, the TxF manager 82 uses a proxy 88 to communicate with the transaction coordinator 76 using OLE Transactions or another protocol. Alternative protocols and the like suitable for use with the present invention include (X/Open's) XA, TIP (Transaction Internet Protocol) and/or intrinsic transaction control within the operating system. The CreateFileEx 80 request marshals the ITransaction object 78 (e.g., via the DTC ITransactionTransmitter method) into a flat collection of bytes. If enlistment is needed, these bytes are sent to the proxy 88, which in turn calls the DTC ITransactionReceiver method to get back the ITransaction object 78 needed for enlistment. The proxy 88 maintains the DTC objects ITransactionResourceAsync and ITransactionEnlistmentAsync. ITransactionResourceAsync implements the TxF callback routines that the transaction coordinator 76 calls to drive the two-phase commit, and is supplied with the enlist call. ITransactionEnlistmentAsync is returned by IResourceManager::Enlist() and contains the methods that TxF 70 calls to acknowledge two-phase commit controls. The proxy 88 acts as the intermediary between the methods of ITransactionResourceAsync and ITransactionEnlistmentAsync and the file system control (FSCTL)-based remote procedure calls (RPCs) used for communication between the TxF component 82 and the proxy 88.

Note that it is feasible to have the TxF coordinator proxy run in the same process as the DTC coordinator, and it is also feasible to move the transaction manager into the kernel to eliminate the overhead of process switching. The DTC proxy stub may also be moved into the kernel, so that TxF does no work to build a user-mode proxy and the switch from the TxF proxy to the transaction manager is eliminated as well. Running the TxF proxy in the same process as the DTC coordinator requires work by the TxF proxy, but involves the same number of process switches as the previous solution.

Once enlisted, and as the transaction progresses, the transaction coordinator 76 keeps track of each resource manager enlisted in the transaction, including TxF 70 (and possibly other resource managers 84, such as other TxF or database resource managers). Note that this enables other information (e.g., database information) to be committed as part of a transaction that also commits file system information, and enables files of multiple transaction-enabled file systems (e.g., on remote machines) to be committed as part of the same transaction.

Typically, the application 60 completes the transaction by calling (via COM) a commit-transaction method of the transaction coordinator 76 to commit the transaction. The transaction coordinator 76 then goes through a two-phase commit protocol to get each enlisted resource manager to commit. The two-phase commit protocol ensures that all of the resource managers commit the transaction, or that all abort it. In the first phase, the transaction coordinator 76 asks each resource manager, including the TxF component 70, whether it is prepared to commit. If the resource managers respond affirmatively, then in the second phase the transaction coordinator 76 broadcasts a commit message to them. If any resource manager responds negatively or fails to respond to the prepare request, and/or any part of the transaction fails, the transaction coordinator 76 informs the resource managers that the transaction is aborted. Likewise, if the application is unable to complete, the application 60 calls an abort-transaction method. If the application fails, the transaction coordinator 76 aborts the transaction on the application's behalf. As described below, the various resource managers, including TxF 70, then undo any partial actions.
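The two-phase commit flow described above can be illustrated with a short Python sketch. It is a simplified model under stated assumptions: `ResourceManager`, `prepare`, `commit`, `abort` and `two_phase_commit` are hypothetical names, everything runs in one process, and a resource manager that fails to respond is modeled as one that raises an exception.

```python
class ResourceManager:
    """Hypothetical resource manager, e.g. TxF or a database component."""

    def __init__(self, name, vote=True, fail=False):
        self.name = name
        self.vote = vote      # answer given to the prepare request
        self.fail = fail      # if True, prepare raises (models no response)
        self.state = "active"

    def prepare(self):
        if self.fail:
            raise RuntimeError(self.name + " did not respond")
        return self.vote

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"  # undo any partial actions


def two_phase_commit(resource_managers):
    # Phase 1: ask every enlisted resource manager whether it can commit.
    for rm in resource_managers:
        try:
            ready = rm.prepare()
        except Exception:
            ready = False
        if not ready:
            # Any negative or missing answer aborts the whole transaction.
            for other in resource_managers:
                other.abort()
            return "aborted"
    # Phase 2: everyone voted yes, so broadcast the commit.
    for rm in resource_managers:
        rm.commit()
    return "committed"


good = [ResourceManager("TxF"), ResourceManager("database")]
mixed = [ResourceManager("TxF"), ResourceManager("database", vote=False)]
silent = [ResourceManager("TxF"), ResourceManager("database", fail=True)]
results = [two_phase_commit(g) for g in (good, mixed, silent)]
```

The point of the protocol is the all-or-nothing outcome: no resource manager ends up committed while another has aborted.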

The TxF component 70 thus acts as a resource manager in the context of standard transaction services (such as DTC), whereby true user-defined transaction support is extended to the file system. Note that, as described below, NTFS allows TxF to link transient per-file and per-stream transaction state to the normal NTFS structures.

In accordance with one aspect of the present invention, the application 60 may choose to include file system operations in a transaction. This may be accomplished per file, such that each file is marked as transacted and operations on it are performed transactionally, or per thread/process, wherein the thread/process is marked as transacted and operations performed by that thread/process are done transactionally.

To include a file in a transaction, a transacted mode flag (i.e., a bit) is defined that can be used with a CreateFileEx application programming interface (API) call (described below), a variation of the CreateFile Win32 API. When the flag is set, the system of the present invention automatically includes the file in the context of a transaction. To this end, as represented in Figure 3, when a create request 80 comes in to the file system (NTFS) 62 via an I/O request packet (IRP), an existing transaction context 78 may be attached to the request by passing a pointer to that context 78, whereby the file can be created/opened as part of the existing transaction context 78. Alternatively, if the pointer to the ITransaction pointer is NULL in the CreateFileEx API call, the context is picked up automatically off the thread, as in the Microsoft Transaction Server (MTS)/Component Object Model (COM) model. The file handle 90 returned in response to a successful create/open request 80 includes a pointer to the transaction context 78. Thereafter, calls made with that handle 90 are recognized via the pointer as having a transaction context associated with them, from which the relevant transaction is identified, and file system operations using that handle are performed on behalf of the transaction until the transaction ends.

The CreateFileEx API is a proper superset of the existing CreateFile Win32 API, and adds a "dwAdditionalFlags" DWORD parameter to take the flag "FILE_FLAG_TRANSACTED" that sets the transacted mode. Also defined is a parameter that points to a transaction context object (LPUNKNOWN punkTransaction); as described above, if this parameter is NULL, the object is picked up from the current MTS/COM context.
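How a handle comes to carry its transaction context can be modeled with a small Python sketch. This is not the Win32 API: `create_file_ex`, `read_file`, `Handle` and the `ambient` dictionary (standing in for the MTS/COM context of the calling thread) are hypothetical names chosen for illustration.

```python
# Hypothetical stand-in for the ambient MTS/COM context of the caller.
ambient = {"tx_context": None}


class Handle:
    """Models file handle 90: it remembers the transaction context, if any."""

    def __init__(self, path, tx_context):
        self.path = path
        self.tx_context = tx_context


def create_file_ex(path, transacted=False, tx_context=None):
    if not transacted:
        # No transacted mode flag: an ordinary, non-transacted open.
        return Handle(path, None)
    if tx_context is None:
        # Null context pointer: pick the context up from the ambient state.
        tx_context = ambient["tx_context"]
    return Handle(path, tx_context)


def read_file(handle):
    # Later operations find the transaction through the handle alone.
    if handle.tx_context is not None:
        return ("transacted", handle.tx_context)
    return ("plain", None)


ambient["tx_context"] = "T-ambient"
h_explicit = create_file_ex("a.txt", transacted=True, tx_context="T-explicit")
h_ambient = create_file_ex("b.txt", transacted=True)
h_plain = create_file_ex("c.txt")
```

In this model the caller never passes the transaction again after the open; every later operation recovers it from the handle.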

To mark a thread/process as transacted, a SetTransactedFiles API is provided, which effectively treats a set of CreateFile/CreateFileEx calls as if they had the transacted mode flag set. If a particular CreateFileEx call specifies a non-null ITransaction object pointer, that object is used as the transaction context 78; otherwise the MTS transaction object is picked up off the thread.

The SetTransactedFiles API is used to mark the thread/process as transacted, whereby any file system access via that thread/process is transacted. Three different flags can be set: one that, when set, causes any file system access from the current thread to be transacted; one that, when set, causes any file system access from every thread in the current process to be transacted; and one that, when set, causes child processes spawned from the current process to have the second and third of these flags set. It is thus possible to mark a thread/process in such a way that spawned processes inherit the mode, which is a very powerful mechanism because it allows existing applications to make use of transacted NTFS. In addition, it allows applications to perform file system operations that have no transacted mode bit, such as deleting and copying files. The feature can also be used to allow transacted command-line batch scripts. The SetTransactedFiles API is described below:

SetTransactedFiles(
    [in]  DWORD  dwMode,        // Zero or more values from the TxFILEMODE
                                // enumeration. Contains the new settings for
                                // the flags, which are set as indicated by
                                // the dwMask parameter.
    [in]  DWORD  dwMask,        // Zero or more values from the TxFILEMODE
                                // enumeration. Only the flag values present
                                // in this mask are affected by the
                                // SetTransactedFiles call.
    [out] DWORD* pdwPrevMode    // Optional. If provided, the previous mode
                                // is returned to the caller through it.
);

The legal flag values are as follows:

enum TxFILEMODE
{
    TxFILEMODE_THISTHREAD     = 0x00000001,  // for the current thread
    TxFILEMODE_ALLTHREADS     = 0x00000002,  // for all threads in the process
    TxFILEMODE_CHILDPROCESSES = 0x00000004,  // when this mode is set, all child
                                             // processes spawned from the current
                                             // process automatically have
                                             // _ALLTHREADS and _CHILDPROCESSES set
    TxFILEMODE_ALL            = 0xFFFFFFFF
};
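The interplay of dwMode and dwMask can be illustrated with a short Python sketch. The flag values are those listed above; the function `set_transacted_files` is a hypothetical model that takes the current mode as an argument and returns the new and previous modes, rather than the real API, which would keep that state per thread or process.

```python
# Flag values taken from the TxFILEMODE enumeration above.
TXFILEMODE_THISTHREAD = 0x00000001
TXFILEMODE_ALLTHREADS = 0x00000002
TXFILEMODE_CHILDPROCESSES = 0x00000004


def set_transacted_files(current, dw_mode, dw_mask):
    """Return (new_mode, previous_mode) for a 32-bit mode word.

    Only the bits selected by dw_mask change; each selected bit takes
    its new value from dw_mode. Bits outside the mask are left alone.
    """
    previous = current
    kept = current & ~dw_mask & 0xFFFFFFFF   # bits the call must not touch
    changed = dw_mode & dw_mask              # new values for the masked bits
    return kept | changed, previous


mode = TXFILEMODE_THISTHREAD
# Turn on the all-threads flag without disturbing the per-thread flag.
mode, prev1 = set_transacted_files(mode, TXFILEMODE_ALLTHREADS, TXFILEMODE_ALLTHREADS)
after_set = mode
# Clear the per-thread flag: the mask selects it, the mode supplies zero.
mode, prev2 = set_transacted_files(mode, 0, TXFILEMODE_THISTHREAD)
after_clear = mode
```

Separating the mask from the mode lets a caller change one flag without first having to read the others.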

As shown in Figure 4, for file operations other than create/open, the application 60 provides the handle 90 to the file system, e.g., via an API call 92 requesting a file read operation, so that the file system can locate the transaction context via the transaction context pointer. Note that, as described above, TxF 70 must be enlisted in the transaction. Because the transaction context is indicated in the file handle 90, the file system knows that the operation is included in a transaction, and knows the identifier of the particular transaction involved. Including a file in a transaction context means that operations on the file, including reads, writes, file creation and deletion, will be transacted. Any number of file system requests may be grouped within a transaction and committed or aborted atomically and durably. Moreover, any number of transactions may be in progress at any moment, each isolated from the others.

Transactional Access: Read and Write Isolation

As described above, files may be opened or created for transacted access. At present, to provide straightforward, safe and predictable behavior, the system limits the number of updater (writer) transactions on a file at any given time to one, i.e., if multiple transactions attempt to open the file for read/write (RW) access at the same time, an error is returned at file open time. This restriction is placed at the file level (as opposed to the stream level), and it stays with the file until the later commit or abort.

It is alternatively feasible, however, to implement the system with finer granularity, e.g., a file may be opened by multiple writers, but none may overwrite a (dirty) page in the file that was written by another, i.e., once a page has been written, that page is locked. A "last-writer-wins" type of access may also be implemented. Note that these types of file "write" access are not mutually exclusive in a given system, since it is feasible for one API to open a file for write access that locks the entire file, for another API to open a file (one that is not simultaneously locked as a whole) for write access with per-page (or other file section) locking, and/or for yet another API to provide last-writer-wins write access. For simplicity herein, however, the present invention is described with the entire file able to be opened only once at a given time by a transaction for read/write access (i.e., other opens are serialized). Non-transactional updaters of the file are also serialized with a transactional open for write. Note that this does not prevent multiple threads belonging to the same transaction from opening the file for writing at the same time. No strict limit is placed on the number of readers that open a file for read-only access.
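The simple whole-file rule described above (one writer transaction per file, any number of readers, multiple opens by the same transaction allowed) can be sketched in Python. `FileWriterLock`, `SharingError` and the method names are hypothetical, and the sketch leaves out the serialization of non-transactional updaters.

```python
class SharingError(Exception):
    """Raised when a second transaction tries to open the file for write."""


class FileWriterLock:
    """Tracks which transaction, if any, currently has the file open for write."""

    def __init__(self):
        self.writer_tx = None

    def open(self, tx_id, write=False):
        if write:
            if self.writer_tx not in (None, tx_id):
                # Another transaction already holds the file for read/write.
                raise SharingError("file is open for write by " + self.writer_tx)
            self.writer_tx = tx_id
        # Read-only opens are never refused in this model.
        return (tx_id, "rw" if write else "r")

    def end_transaction(self, tx_id):
        # The restriction stays with the file until commit or abort.
        if self.writer_tx == tx_id:
            self.writer_tx = None


lock = FileWriterLock()
first = lock.open("T1", write=True)
second = lock.open("T1", write=True)   # same transaction: allowed
reader = lock.open("T2")               # readers are not limited
try:
    lock.open("T2", write=True)
    refused = False
except SharingError:
    refused = True
lock.end_transaction("T1")
later = lock.open("T2", write=True)    # allowed once T1 has ended
```

The error is raised at open time rather than at write time, matching the behavior described for concurrent read/write opens.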

In further keeping with the present invention, a file opened for read access by a transaction is isolated from concurrent changes made to that file by another transaction, regardless of whether the writer opened the file before or after the reader. This isolation lasts until the end of the read-only transaction's access, regardless of whether the transaction that changed the file commits. For example, as shown in Figure 5, consider a transacted reader X that opens version V0 of a file of pages for read-only access, represented in Figure 5 by X/R0 at the start of the timeline. Note that the capital letter "O" in each page of the file represents the original data at the time of the open. If a writer Y later opens version V1 of the file for read/write access (Y/RW) in another transaction and then makes changes (Y/Writes), the transacted reader X will continue to see the original data in file V0, not writer Y's changes in V1. Note that a non-transactional reader will see the changes to the file as they occur.

As described below, to accomplish transaction isolation, a "version" V0 of the file is retained for the reader for (at least) the time that reader X has the file open. This remains true even when the transactional writer Y commits. Note that, as described in detail below, writer Y makes its changes to the file itself, and the version seen by reader X is a per-page copy of the original data made before the changes are written; the converse is also feasible, however, i.e., keeping the original data intact for reader X and keeping a version of the changed pages for writer Y. Note also that the terms "version," "versioned," "versioning" and the like as used herein refer to a point-in-time snapshot (and should not be confused with the persistent versions of a source code control system). Note further that, for ease of implementation, a transacted reader may be serialized with a non-transacted writer. Alternatively, a non-transacted writer may be included in a "system-owned" transaction solely for the purpose of isolation. Predictable transacted read semantics are thus provided, whereby a transacted reader can rely on a "frozen" image of the file as of a given point in time.

Returning to FIG. 5, once the transactional writer Y commits file V1, a transactional writer Z may open the file as V2 (unchanged from V1) for read-write access (Z/RW), as shown in FIG. 9. Writer Z can see writer Y's committed changes and can make further changes (Z writes). Note, however, that at this time reader X still sees the original file pages that X saw when the file was first opened by X, and not any of Y's committed changes. Reader X can see writer Y's changes only if reader X closes the file and then reopens it. As shown in FIG. 5, reader X can also see writer Z's committed changes, provided that reader X closes the file and reopens it, as V2, after Z commits. In other words, if reader X closes the file and reopens it before Z commits, reader X sees version V1, but if reader X closes the file and reopens it after Z commits, reader X sees file version V2. Note that, as described below, it is also feasible to maintain and open versions that are older than the most recently committed version at the time of the open.
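The open-time binding described for reader X and writers Y and Z can be illustrated with a small in-memory model. This is only a sketch under simplifying assumptions: the class `VersionedFile` and its methods are hypothetical names, whole-file copies stand in for the per-page copies described above, and only one writer is active at a time.

```python
class VersionedFile:
    """Toy model: a reader keeps seeing the version that was committed when it opened."""

    def __init__(self, pages):
        self.committed = [list(pages)]  # committed[0] plays the role of V0
        self.pending = None             # the single uncommitted version, if any

    def open_read(self):
        # A read handle is bound to the latest committed version at open time.
        return len(self.committed) - 1

    def read(self, handle, page_no):
        return self.committed[handle][page_no]

    def open_write(self):
        # The writer starts from the latest committed version.
        self.pending = list(self.committed[-1])

    def write(self, page_no, data):
        self.pending[page_no] = data

    def commit(self):
        self.committed.append(self.pending)
        self.pending = None


f = VersionedFile(["O", "O", "O"])
x = f.open_read()                     # reader X binds to V0
f.open_write()                        # writer Y opens for read/write
f.write(1, "Y")
f.commit()                            # Y commits V1
seen_by_x = f.read(x, 1)              # X's original handle still reads V0
x_reopened = f.open_read()            # X closes and reopens
seen_after_reopen = f.read(x_reopened, 1)
```

In this model the old handle keeps returning the original page, and only the reopened handle observes Y's committed change, which mirrors the close-and-reopen rule stated above.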

It should be noted that these semantics cannot be expressed using any existing file-sharing modes. The transactional isolation semantics described here isolate the effects of transactions from one another, in contrast to file-sharing modes, which isolate handles from one another. The existing file-sharing modes are unchanged and may be used for additional serialization. For example, where a file is opened for transactional update by two different threads of the same transaction that specify the "deny-write" file-sharing mode, the second open is denied with a sharing violation. This allows a distributed application to allocate a transaction's workload across multiple threads, processes or machines, while protecting the changes made by the transaction from other transactions or non-transactional workers. Moreover, these semantics guarantee a predictable, versioned read in which each reader can rely on the file's contents remaining stable while it is kept open.

In the compatibility matrix set forth below, "Yes" means compatible with respect to the additional transactional restrictions:

Thus, an updating transaction sees the latest version of the file, including its own changes, whereas a transactional reader gets a committed version of the file. One alternative (the one generally described above) is to provide the most recently committed version of the file at the time of the open and, while the file is open for transactional read, not allow the version to change as further changes are made and committed. The benefit is that the reader sees a transactionally consistent view of the data for the duration of the open.

In a second alternative, the version seen by a reader may be the version as of the first file system access, or as of some earlier time (i.e., an earlier point in the TxF log). This provides the most recently committed version as of the time the reader started. The start time may be the time at which the transaction first accesses any NTFS object in the system, or it may be defined by other APIs in an integrated scenario (e.g., by using a log sequence number, or LSN). The advantage of this feature is that a transaction obtains a point-in-time snapshot across multiple files, which is useful when there are dependencies and links among multiple files (e.g., HTML or XML files). Note that in this alternative, multiple opens of a file within the same transaction obtain the version chosen at the first open within that transaction. As can be appreciated, however, the amount of version history that the system must maintain increases with this second alternative.

The term "version window" describes the period during which the set of previously committed versions is maintained to support the chosen versioning scheme. For the first alternative described above, the version window varies per file and is the time between the oldest still-active open of the file and the current time. For the second scheme, the window is defined as the time between the start LSN of the oldest transaction in the system and the current time. One or both of these schemes may be supported, and the work done by TxF 70 to maintain versions is essentially the same. For simplicity, the invention is primarily described herein with respect to the first scheme, in which the version seen by a reader is the most recently committed version of the file at the time of the first open in the transaction. Thus, in this first scheme, because the stream version is decided at open time, an application that wants the most recently committed data must close and reopen the handle. This is particularly relevant in a web server scenario, in which a website may be updated online transactionally; readers need to close and reopen their handles to see the newly committed state.
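One way to read the version window is as a retention rule for committed versions. The sketch below assumes that every version is identified by its commit LSN and that the oldest active reader is described by the LSN at which it opened the file; the function name `versions_to_keep` is hypothetical, and the rule is inferred from the description rather than taken from it.

```python
def versions_to_keep(commit_lsns, oldest_read_lsn):
    """Return the commit LSNs that still fall inside the version window.

    commit_lsns must be sorted in ascending order. The oldest active reader
    sees the last version committed before it opened, so that version and
    every later one must be retained.
    """
    older = [lsn for lsn in commit_lsns if lsn < oldest_read_lsn]
    floor = older[-1] if older else None
    return [lsn for lsn in commit_lsns if floor is None or lsn >= floor]


window = versions_to_keep([10, 20, 30, 40], oldest_read_lsn=25)
```

With the first scheme, `oldest_read_lsn` would be tracked per file; with the second scheme, it would be the start LSN of the oldest transaction in the system, which is why more history is retained.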

In one implementation, writes to a file go to the actual file, since it is presumed that the changes will ultimately be committed by the writer. As described below, if the writer fails to commit, any changes are undone via undo information captured in the log. Thus, to provide version isolation, each write directed to a page first results in the old page being preserved for transactional readers. Note, however, that it is feasible to do this the other way around, i.e., to leave the original file intact until the changes are committed, whereby the writer (rather than the readers) would have new pages created for it.

在使用微软Windows2000(或NT)操作系统的较佳实现中,从高速缓存管理器和虚拟存储管理器或VMM的观点呈现各自内部存储流,而不是在盘上为老版本建立各自文件。 Using Microsoft Windows2000 (or NT) operating system preferred implementation, the cache manager, and Virtual Storage Manager or VMM views showing respective internal storage flow, rather than build their own files for the old version on the disc . 高速缓存管理器、VMM及它们与非事务处理的NTFS的关系在下列参考文献中作详细描述:Helen Custer的“Inside WindowsNT”,Microsoft Press(1993);Helen Custer的”Inside the Windows NTFileSystem”,Microsoft Press(1994)和David A.Solomon的“Inside WindowsNT,Second Edition””Microsoft Press(1998),通过引入作为参考。 Relationship Cache Manager, VMM and their non-NTFS transaction described in detail in the following references: Helen Custer's "Inside WindowsNT", Microsoft Press (1993); Helen Custer's "Inside the Windows NTFileSystem ", Microsoft Press (1994) and David A.Solomon of" Inside WindowsNT, Second Edition "" Microsoft Press (1998), incorporated herein by reference.

With respect to maintaining versions for transactional readers, reading an older version of a file is managed, from the perspective of the virtual memory manager and the cache manager, like reading a different file. This allows applications to simply map older versions into their address spaces, and allows clients that access data using memory descriptor lists (e.g., the redirector) to operate transparently. Note that this is possible because in the Windows 2000 operating system, the VMM and the cache manager participate in file system input/output (I/O). File systems (except for files opened for non-cached access) use the cache manager to map data into system memory, and the cache manager in turn uses the VMM to initiate I/O. Writes of dirty pages generally occur lazily, in background threads. As a result of this architecture, files mapped directly into an application's address space share pages with the cache manager, which therefore provides a consistent view of the data regardless of which system service is used to obtain it. Note that, as a result, the network redirector (described below) caches data locally and obtains data at the server that is consistent with that seen by other clients.

To accomplish isolation, multiple versions are maintained, from the oldest committed version that can still be read to the latest version, which is being updated. Each version has data structures associated with it that track the changes in that version relative to the latest version. If a page that has not been changed is read, the page is read from the file (it may be in the cache or written to disk), whereas if a page that has been changed is read, it is read from the changed-page data (which may also be in the cache). Note that some versions may not have any transactions reading them, yet their in-memory structures may be maintained because they fall within the version window and may receive open requests in the future. Versions that are never opened do not occupy any memory for storing data pages. The latest version corresponds to the base file stream and may be updated.

As shown in FIG. 6, each version is described by a TxF "Version Stream Control Block" (TxFVSCB). The version stream control blocks for a file are linked in a list in time order, and every version other than the latest is committed/aborted and read-only. The latest version may or may not be committed.

Each TxFVSCB (e.g., 94) includes a version log sequence number 96 (Version LSN), which stores the commit LSN of the transaction as recorded in the TxF log. In one implementation, for the (latest) uncommitted version, this LSN is a TxF-defined "MAX_LSN," to facilitate finding the highest LSN that is less than the current point in time. A reader that wishes to read committed data earlier than this version can access it by using the entries (e.g., 98₁) in a change table 98, an in-memory table that records the page numbers changed by the version pointed to by the TxFVSCB. Each TxFVSCB, such as the TxFVSCB 94, also includes a Section Object Pointers (SOP) structure 100 corresponding to this version, which is used by the cache manager and the virtual memory manager and represents an in-memory stream. State flags 102 are also provided, one of which indicates whether the version is committed. Note that only the latest version can be uncommitted. Also included are a VersionLength data field 104 and a Change Table Pointer field 106, which contains a pointer to the change table 98 that records the page numbers changed by the version.

As shown in FIG. 6, in the change table (e.g., 98₁), a disk address may be stored in association with a page number, so that the previous version of the page can be found on disk, provided the page has been written at least once in this version. Note that, as shown in FIG. 6, primarily to save memory, a range of pages may be stored in a single entry where the pages in that range are stored contiguously on disk. FIG. 7 shows the change tables 94₀-94₃ for multiple versions of a file. An efficient search structure, such as a tree, may be used to organize the change table.
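A change table with range entries can be sketched as a sorted list searched by bisection, which stands in here for the tree mentioned above. The class `ChangeTable`, its method names, and the convention that disk addresses are counted in whole pages are assumptions made for illustration only.

```python
import bisect


class ChangeTable:
    """Maps changed page numbers to the disk address of the saved earlier page."""

    def __init__(self):
        self._starts = []   # first page number of each entry, kept sorted
        self._entries = []  # (first_page, page_count, disk_address), same order

    def add_range(self, first_page, page_count, disk_address):
        # One entry covers a run of pages stored contiguously on disk.
        i = bisect.bisect_left(self._starts, first_page)
        self._starts.insert(i, first_page)
        self._entries.insert(i, (first_page, page_count, disk_address))

    def lookup(self, page_no):
        # Find the last entry starting at or before page_no, then check its extent.
        i = bisect.bisect_right(self._starts, page_no) - 1
        if i < 0:
            return None
        first, count, address = self._entries[i]
        if page_no < first + count:
            return address + (page_no - first)
        return None  # page was not changed by this version


table = ChangeTable()
table.add_range(first_page=10, page_count=4, disk_address=500)
table.add_range(first_page=2, page_count=1, disk_address=90)
```

A `None` result means the page was not changed by this version, so a reader would fall back to the base file stream for that page.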

If a file is opened in a transaction for read-only access, the appropriate committed version is chosen, with the version identified by a "readLSN." As described above, the readLSN is either the current LSN or an earlier LSN, depending on which type of versioning is in use. The version chosen is the most recently committed version prior to the readLSN. If the version does not exist, e.g., because it is too old, the open fails. If the file does not have any TxFSCB associated with it, a new TxFVSCB is created with an empty change table and marked uncommitted. The default in-memory stream is used, so that existing cached data can be used for the read. For write access, if the latest version is uncommitted, it is used as is; otherwise, if it is not marked uncommitted, a new VSCB is created and marked uncommitted.
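The selection rules above can be sketched as two small functions over a time-ordered list of version blocks. The names `VersionBlock`, `open_for_read` and `open_for_write` are hypothetical, LSNs are modeled as plain integers, and "prior to the readLSN" is assumed to mean strictly less than.

```python
from dataclasses import dataclass, field

MAX_LSN = 2**64 - 1  # stand-in for the TxF-defined MAX_LSN of an uncommitted version


@dataclass
class VersionBlock:
    version_lsn: int
    committed: bool = True
    change_table: dict = field(default_factory=dict)


def open_for_read(versions, read_lsn):
    """Pick the most recently committed version prior to read_lsn, or fail."""
    candidates = [v for v in versions if v.committed and v.version_lsn < read_lsn]
    if not candidates:
        raise LookupError("no committed version before readLSN; the open fails")
    return max(candidates, key=lambda v: v.version_lsn)


def open_for_write(versions):
    """Reuse the latest version if it is uncommitted, else create a new one."""
    if versions and not versions[-1].committed:
        return versions[-1]
    new_version = VersionBlock(MAX_LSN, committed=False)
    versions.append(new_version)
    return new_version


versions = [VersionBlock(100), VersionBlock(200)]
reader_version = open_for_read(versions, read_lsn=150)
writer_version = open_for_write(versions)
```

Giving the uncommitted version the largest possible LSN keeps the list sorted, so a reader's search for the highest committed LSN below its readLSN never has to special-case the in-progress version.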

To facilitate isolation when writing to a file, whenever a page of data (e.g., in a user-level buffer 108) is changed by a transaction, the page is essentially edited in place, i.e., in the cache 110 (see FIG. 8). At appropriate times, the cache 110 is then written to disk (or other suitable non-volatile storage medium) 114 by the cache manager and/or VMM 112. In general, as described above, data can be changed either by mapping the file into memory or by using write APIs. When write APIs are used, the cache manager 112 is typically used to copy the changes into a memory-resident page 116. Note that the cache manager 112 is used to map the file into system memory. When memory mapping is used, the application makes changes directly to system memory pages (e.g., the page 116), which are the same pages that the cache manager 112 maps. Changes are recorded via a "dirty" bit which, in the case of memory-mapped I/O, resides in a process-private page table entry (PTE) 118. Typically, these bits are propagated to a shared page frame number (PFN) structure 120 when the memory manager trims pages from the working set of a process. They can also be propagated explicitly by the application 60, using a system API to flush the mapped section. Note that dirty pages may also be written out periodically.

To ensure that memory-mapped changes are included in a transaction, at commit time the system flushes the virtual address range of each application-mapped section. The flush is initiated from within the context of the application that mapped them. The transactional semantics may be defined such that only pages flushed explicitly by the application are included in the transaction (e.g., the flush is made transactional, rather than the individual modifications to bytes in the user section). Alternatively, this may be accomplished by a system thread that attaches (KeAttachProcess) to the process that has the mapped section and performs the flush. The list of sections is maintained in the corresponding transaction table entry. Note that changes made via the file APIs also need to be flushed to disk at commit time. This is because, at paging-write time, it is not possible to distinguish between a dirty page write left over from a previous transaction and a change made via memory mapping in the current transaction.

TxF thus supports both read-only and read/write file opens by transactions. When a transaction opens a file for read-only access and the file is not currently open by any other transaction, the semantics on the file are the same as for a non-transactional open. If a transaction opens a file for read/write, TxF requires one structure for the file, one per stream, and one for the stream version in which to store its per-transaction context, as shown in FIG. 9. The data structures for this open are shown in FIG. 9, where "File Object" is the object mapped by the user's file handle, "FCB" is the NTFS file control block, "SCB" is the NTFS stream control block for the particular stream opened, "NP SCB" is the non-paged stream control block used primarily to hold the section object pointers for file mapping, and "CCB" is the per-file-object context structure. Note that a flag in the TxFFO indicates when the file is opened for read by a transaction.

In FIG. 9, the TxFCB structure is an anchor for the undo data for per-file changes maintained by TxF, and also includes a reference to the transaction. The TxFSCB is the anchor for the stream versions, and the TxFVSCB is the anchor for the undo data for a particular version of the stream. The TxFO structure describes a particular transaction's access to a version of the stream, and it captures pointers to the relevant shared TxF structures for that version.

As shown in FIG. 10, if a second transaction t3 opens the file for read/write before a previous read-only transaction has finished, the old version of the file essentially shifts (to the right in FIG. 10), making room for the structures that represent the new version. FIG. 10 thus shows a read/write open by transaction t3, which is modifying the current version of the file; a read-only open by transaction t2, which is accessing the most recently committed version of the file; and another read-only open by transaction t1, which is accessing an earlier committed version. Note that, for simplicity, each FileObject points to the same SCB, and NTFS is unaware of file versions. Further, each FileObject has its own set of section object pointers in a unique non-paged SCB. Note that the section object pointers for read-only transactions are typically not used unless the user actually maps the stream. Cached accesses are serviced from the current stream for unmodified pages and from the log file for modified pages. The TxFO for each file object effectively captures which version of the file the transaction is accessing.

In general, because TxF transactions have lifetimes that are independent of NTFS handles, the TxF structures have lifetimes that are independent of NTFS handles. When both are present, they are linked together as shown in FIGS. 9-10, with the unidirectional links set up on both sides using well-defined interfaces. For example, when a transactional access to a file occurs, the FCB link to the TxFCB is checked. If it is NULL, it is set up using a TxF routine. If the TxFCB already exists, however, it is looked up by TxF in the TxF file table using the File-Id; otherwise, a new one is allocated. Similarly, when an FCB is deallocated and its TxFCB link is non-NULL, a TxF routine is called to delete the unidirectional (NTFS to TxF) link.

The TxF structures for a file are deallocated when no transactional reader has the file open or can open that version of the file in the future. A directory is maintained as long as there is namespace isolation information in its TxFSCB structure, even though the NTFS directory may be gone because the directory itself was deleted (as happens in a recursive delete). The lifetime of TxF structures is managed by reference counting.

Logging Service

In accordance with another aspect of the present invention and as described below, for logging and recovery of persistent state, TxF 70 uses a logging service 74 (FIG. 2) that allows multi-level logging, rather than relying solely on normal NTFS logging, to support long-running transactions. As will become apparent below, this provides a number of benefits. For example, a typical NTFS log is about four megabytes in size, which is adequate for present short-lived metadata logging, whereas a typical user-defined transaction would quickly overwhelm such a log. Also, there are likely to be a large number of logged NTFS operations relative to the number of logged TxF transactional operations. Moreover, NTFS metadata commit operations lock up directories, whereby long-running TxF transactions would (in a hypothetical single-level logging scheme) adversely impact file system performance.

Conventional NTFS logging is well documented and is therefore not described in detail herein, except for a brief summary and to the extent that it is used in conjunction with the transactional file system of the present invention. NTFS provides abort/crash recovery of file system operations by writing undo and/or redo records for those operations to an NTFS log before NTFS makes the change. The NTFS log is a per-volume file that records operations affecting that NTFS volume, including operations that change NTFS data structures, e.g., create file, rename and so on. Note that it is metadata that is logged, not user file data such as the bytes written. The log is maintained as a file and is accessed to recover from a system failure, i.e., if the system crashes, partially completed operations can be undone or redone using known techniques. NTFS does not provide durability, i.e., NTFS does not force its log on commit.

In accordance with one aspect of the present invention, TxF transaction and recovery management is layered on top of NTFS in a multi-level recovery mechanism. As described above, TxF treats NTFS operations as low-level components from which user-level transactions are built. For recovery, TxF maintains a higher-level log and treats the logged NTFS operations as the "data" with respect to that higher-level log, in the sense that TxF forces its own TxF log ahead of the "data." The "data" in this case is the NTFS log, which is itself a recoverable store.

As shown in FIG. 11, multi-level logging is accomplished by forcing the higher-level TxF log 124 ahead of the lower-level NTFS log 126, by coordinating the LSNs of each log (referred to herein as the TxF LSN and the NTFS LSN) in a manner that leverages the recoverability already available in NTFS 64. As described below, for data that is not managed by NTFS transactions (i.e., the stream bytes themselves), TxF 70 essentially manages recoverability in its entirety.

To ensure that the higher-level TxF log 124 is forced ahead of its "data" (i.e., the records in the NTFS log 126) without inefficiently forcing the TxF log before every NTFS operation, a TxF callback is provided that NTFS 64 calls whenever NTFS 64 is about to force data in its log 126. In this call, NTFS 64 indicates the highest NTFS LSN that is about to be flushed. Meanwhile, TxF 70 maintains a map 128 of the recent NTFS transactions used by TxF, to map NTFS commit LSNs to the corresponding TxF LSNs. Note that namespace-modifying operations are designed such that TxF knows the NTFS commit LSN. The NTFS log is not durable, in that it is flushed to disk relatively infrequently. As a result, a reasonable number of TxF records are likely to be present in the log cache, to be flushed to disk together in a single I/O operation.

In response to the callback, TxF 70 forces the TxF log 124 up to the TxF record that corresponds to the highest NTFS Commit-LSN being forced in the NTFS log. It should be noted, however, that flushing the TxF log 124 only up to that highest record is merely an optimization; other ways of ensuring that the higher-level log is flushed first (e.g., flushing all new TxF records whenever NTFS is about to flush its log) would also suffice. During recovery, NTFS completes its recovery before TxF begins its own recovery.
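The ordering between the two logs can be modeled with a callback that the lower-level log invokes before it forces its own records. The sketch below is a simplification under stated assumptions: `TxfLog`, `NtfsLog` and their methods are hypothetical names, LSNs are integers, and "forcing" is reduced to advancing a flushed-LSN counter.

```python
class TxfLog:
    """Higher-level log; must reach disk before the NTFS records it describes."""

    def __init__(self):
        self.flushed_lsn = 0
        self.ntfs_to_txf = {}  # NTFS commit LSN -> TxF LSN (the map 128 in the text)

    def note_commit(self, ntfs_commit_lsn, txf_lsn):
        self.ntfs_to_txf[ntfs_commit_lsn] = txf_lsn

    def on_ntfs_force(self, highest_ntfs_lsn):
        # Force only up to the TxF record matching the highest NTFS commit
        # LSN that is about to be flushed (the optimization described above).
        covered = [txf for ntfs, txf in self.ntfs_to_txf.items()
                   if ntfs <= highest_ntfs_lsn]
        if covered:
            self.flushed_lsn = max(self.flushed_lsn, max(covered))


class NtfsLog:
    """Lower-level log that calls back before forcing its own records."""

    def __init__(self, before_force):
        self.flushed_lsn = 0
        self.before_force = before_force

    def force(self, up_to_lsn):
        self.before_force(up_to_lsn)  # the higher-level log goes first
        self.flushed_lsn = max(self.flushed_lsn, up_to_lsn)


txf_log = TxfLog()
ntfs_log = NtfsLog(before_force=txf_log.on_ntfs_force)
txf_log.note_commit(ntfs_commit_lsn=10, txf_lsn=3)
txf_log.note_commit(ntfs_commit_lsn=20, txf_lsn=7)
ntfs_log.force(15)
txf_flushed_after_first_force = txf_log.flushed_lsn
ntfs_log.force(25)
```

The simpler alternative mentioned above, flushing every new TxF record on each callback, would replace the filtered `covered` list with the highest TxF LSN written so far.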

Although this ensures that the TxF log 124 is flushed ahead of the NTFS log 126, some log records near the end of the TxF log may describe NTFS operations that were performed but not committed by NTFS, and such records may be intermixed with records whose operations were committed. It is important to distinguish the TxF log records whose corresponding NTFS operations were committed from those whose operations were not, because this determines whether a TxF log record is applied during recovery.

As will be appreciated, this is important because it is incorrect to repeat an operation during redo, or to undo an operation that never took place. As an example, consider the following situation that may be recorded in the TxF log:

In the above situation, it is not possible to know whether it is correct to reverse (undo) the rename operation. Simply performing the reversal every time is incorrect, because if the rename never actually took place in NTFS, Y would be renamed to X, replacing it. In that case, the attempt to open the system link may also fail, since the link does not exist because the NTFS operation did not occur. File X would be lost, with Y renamed to X. If TxF 70 is able to ascertain whether the rename took place, however, it can determine precisely whether to apply the undo operation.

To determine whether an operation actually took place, i.e., whether it was committed by NTFS 64, TxF writes a corresponding record to its log 124 before requesting the operation. TxF then receives the TxF LSN, which it provides to NTFS 66 along with the requested operation on a given file. Although it is feasible to have NTFS 66 put the TxF LSN into its corresponding NTFS log record (or records) upon commit, this is inefficient. Instead, when NTFS commits the operation, NTFS writes the TxF LSN, as part of the commit, into the record that is maintained on the NTFS volume for that file. In NTFS, a record is already maintained for each file (and directory) on the volume, in a structure known as the master file table. Thus, as shown in FIG. 11, the TxF LSN is written into a field (e.g., 132₃) of the record for this file (e.g., File3). Note that another data structure may be used instead; the per-file record, however, is already available on each NTFS volume.

Thereafter, following a crash, during recovery and after TxF has allowed NTFS to fully complete its own recovery, TxF first checks whether an operation logged in the TxF log before the crash made it to disk (by calling NTFS via an NtfsTxFGetTxFLSN(file-id, *TxFLsn) call). If the NTFS operation on the file was committed and persisted to disk before the crash, the TxF LSN of the record in the TxF log 124 will be less than or equal to the TxF LSN in the file record field, since NTFS recovery guarantees that the file record will be restored. If the TxF LSN in the file record is less than the LSN of the TxF log record (or is not present in the file record for that file), it is known that the NTFS operation was not committed, and the corresponding TxF log record must not be applied for undo.
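The recovery-time comparison can be written as a small predicate. In this sketch a dictionary stands in for the per-file records of the master file table, a missing key models a file record with no stored TxF LSN, and the function names are hypothetical; the real check is made through the NtfsTxFGetTxFLSN call named above.

```python
def ntfs_operation_committed(log_record_lsn, file_record_lsn):
    """True if the logged operation reached disk according to the file record."""
    if file_record_lsn is None:
        return False  # no TxF LSN stored for this file
    return log_record_lsn <= file_record_lsn


def undo_candidates(log_records, file_records):
    """Keep only the TxF log records whose NTFS operation was committed.

    log_records: list of (txf_lsn, file_id) pairs from the TxF log.
    file_records: dict mapping file_id to the TxF LSN stored in its file record.
    """
    return [(lsn, file_id) for lsn, file_id in log_records
            if ntfs_operation_committed(lsn, file_records.get(file_id))]


file_records = {"a": 7, "b": 4}
log_records = [(5, "a"), (6, "b"), (9, "c")]
to_undo = undo_candidates(log_records, file_records)
```

Here only the record for file "a" qualifies: the record for "b" is newer than the LSN stored in its file record, and "c" has no file record at all.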

Note, however, that to ensure correct recovery, if an object is deleted during the recovery window, TxF delays deletion of the file record (and thereby preserves the file identifier, file-id) until after the deletion log record has been forgotten in the log. This is done by creating a system link to the file. Further, if a new file is created, the TxF log record is not written until NTFS has determined the file-id that it will use for the creation. This ensures that the actual file-id is recorded in the TxF log. Note that for non-transactional creates as well, NTFS writes the current TxF LSN into the file record, thereby handling the case in which a file-id (including the sequence number) is reused during the recovery window, and causing TxF to skip log records from before the create.

Thus, if the NtfsTxFGetTxFLSN call finds that the file identifier does not exist at recovery time, then either the file was deleted non-transactionally after the transaction committed but before the crash, or the crash occurred immediately after the create operation. Note that in the first case TxF was not involved, and the file record was deleted during the recovery window. In the second case, the TxF create log record reached the TxF log on disk, but the NTFS commit of the create did not persist. The second case can be detected only when a create log record is being processed.

Because undo records are used to abort incomplete transactions, records for which NtfsTxFGetTxFLSN reports that the file identifier does not exist can simply be ignored.

It should be noted that during abort, crash recovery, and roll-forward recovery, the log-driven redo and undo actions are initiated at the top of the filter driver stack in the NTFS filter-driver model, which allows any intermediate filter drivers to see these actions. The IRPs corresponding to redo and undo actions are specially marked so that filter drivers can choose to ignore them. These IRPs include the usual transaction state, and the file object generally points to the transaction object. However, because the transaction is in a special state, TxF knows that they need special handling. For example, TxF does not attempt to include these actions in a transaction, or to treat them as non-transactional.

In addition to logging namespace operations, the TxF component 70 works together with the logging service 74 to record page changes, among other operations. As described above, in order to maintain versions and to support undo on abort, before a change is actually made to an in-memory page through an API, the corresponding undo record is written (not forced) to the TxF log 126. As shown in FIG. 12, the entire page is then written (typically to an in-memory and on-disk page stream called the TOPS stream 134, described below), which allows a versioned reader to read the page in a single I/O operation. After the log write, the change table 98 for the file is marked with the log sequence number (the TxF LSN and the offset in the TOPS stream 134), and the change is then applied to the page.

For pages changed by paging I/O, such as pages modified through a user-mapped section and/or pages modified by an earlier call to a write API, a paging write is performed. This paging write may occur in a background thread or may be part of a flush at commit time. In either case, TxF 70 first checks the change table 98 (FIG. 6) to see whether the undo has already been captured in the TxF log 126. If so, the system forces the TxF log 126 up to the TxF LSN marked in the table 98, which in most cases returns without any I/O. If the change table 98 is not marked, the undo version of the page is obtained and written to the TOPS stream 134 and the TxF log 126. Multi-page I/O is common, because the background thread attempts to group pages together in file-offset order. In these cases, multiple undos are written in a single, large I/O, and the undos are likewise read in a single, large I/O.

After the prepare record has been forced to the TxF log 126, the undo images are on disk in the TxF log 126 and the TOPS stream 134, and the modified file pages are in their locations in the file. Commit is therefore the simple operation of writing a commit record to the log 126. Abort is accomplished by running through the undo records in reverse order and applying them to the base file, then flushing the file, then force-writing an abort record. If an abort record is present in the log 126, these undo records are ignored at recovery time. Note that by flushing the file during an infrequent operation (abort), large (page-sized) compensation log records (CLRs) need not be written as redo records, which saves significant space.
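The abort sequence above (apply undo records in reverse order, flush, then write an abort record) can be illustrated with a small in-memory model. The data layout here is an assumption made for the sketch: the base file is a dictionary of page number to bytes, each undo record is a (page number, old image) pair, and the flush step is only marked by a comment.

```python
from typing import Dict, List, Tuple

UndoRecord = Tuple[int, bytes]  # (page number, page image before the change)


def abort_transaction(base_file: Dict[int, bytes],
                      undo_records: List[UndoRecord],
                      log: List[str]) -> None:
    """Roll a transaction back using its undo records."""
    # Newest change first, so that the oldest image of a page that was
    # modified several times is the one left in place.
    for page_no, old_image in reversed(undo_records):
        base_file[page_no] = old_image
    # A real implementation would flush the file to disk here, before
    # the abort record is forced to the log.
    log.append("ABORT")


def undo_needed_at_recovery(log: List[str]) -> bool:
    """Undo records are ignored at recovery if an abort record exists."""
    return "ABORT" not in log
```

Applying the records in forward order instead would leave page 1 in the example below at its intermediate image rather than its original one, which is why the reverse order matters:

```python
base = {1: b"second edit", 2: b"edited"}
undo = [(1, b"original"), (2, b"original-2"), (1, b"first edit")]
log = []
abort_transaction(base, undo, log)
```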

Obtaining an undo image is the same as obtaining the previously committed version of a page; that is, the undo image is first searched for in the previous versions of the file. If the image is resident in memory, the undo image is taken from memory. Otherwise, the image is read from disk with a non-cached I/O operation, because dirty bits are handled privately and are not necessarily known, so it cannot be determined whether the image currently resident in memory is dirty.

As described above, whenever a page is changed by a transaction that has the file open for writing, the page is edited in place (i.e., in the cache). The cache is then written to disk at various times (FIG. 8). When page data is changed, however, the old page data needs to be preserved, so that if the transaction aborts or the system fails, the old page data can be restored. To this end, the old page is copied to the TOPS stream 134, and the change is logged in the TxF log 126. As shown in FIG. 12, the log record (e.g., X2) includes the offset to this page, so the log 126 need not hold the data, only a record corresponding to it. Thus, to restore pages, TxF uses the change log, which records changes in chronological order. Note that for versioning, the offsets into the TOPS stream 134 held in the change table 98 are used for efficiency, rather than accessing the TxF log 126. In the event of a system failure, however, the version stream control blocks, which are in-memory structures, do not exist at recovery time. Moreover, any file version that exists only in memory is not recoverable. For recovery, therefore, the records in the log are used to abort transactions that were in progress at the time of the failure, and to durably complete transactions that committed before the failure. The sequential nature of the log entries (log records) preserves the order of the changes.

In the present invention, for performance and other reasons, the log record for a page write is split into two parts. The part that is inline with the main log preserves its ordering relative to other log records, while the other part comprises the (relatively larger number of) bytes that provide the details of the operation, i.e., the changed page data. Thus, in accordance with one aspect of the present invention, and as shown in FIG. 12, whenever a page is changed by a transaction, the old page data is copied to the (contiguous) TOPS stream 134, and the change is logged in the TxF log 126. As described above, after the tables have been adjusted to map transactional readers to the copied page, the page may then be changed. As shown in FIG. 12, the log record (e.g., X2) includes the offset to this page in the stream of copied pages, so the main log need not hold the data, only a record with the corresponding offset.

For performance reasons, however, these logs are flushed to disk differently. As a result, the page and the log 126 may not both be durable at a given time; for example, the system may fail before the log 126 has been flushed to disk and/or before the page has been flushed to disk. A simple way to guarantee that page data is not lost is to impose an ordering between the two, i.e., always flush the page to disk before flushing the log record to disk. Then, when the log is used during recovery, if the log record is present, the correct page version corresponding to that record is also known to have persisted. This ordering dependency has been found to degrade system performance significantly, however, because flush operations on the different logs run more efficiently when driven by many unrelated factors. For example, to improve performance, pages are typically flushed in groups, such as sixteen pages at a time using a lazy-write algorithm, whereas the log is flushed when full, or at various other times by a background process.

In accordance with another aspect of the present invention, a system and method are provided that allow the pages and the log to be flushed to persistent storage in arbitrary order relative to one another, in a manner that still ensures the correct page can be restored in the event of a failure. This is accomplished by adding information to the log 126 and to the page data that effectively links the two pieces of information to one another in a consistent state (e.g., in time). More specifically, a cycle count 136 is maintained (e.g., in a byte, although a word or larger unit may alternatively be used) that represents the current state of the page; for example, the cycle count is incremented each time the pointer into the TOPS stream 134 wraps back to the start, and the cycle count is synchronized with the log record.

As shown in FIG. 12, in accordance with one aspect of the present invention, synchronization is accomplished by maintaining the cycle count value in the log record associated with the page copied to the TOPS stream 134. This is represented in FIG. 12 by the box labeled 138, which provides an expanded representation of some of the record's data fields. As also shown, the last portion (e.g., byte) of each sector is copied to the log record for preservation there. Note that a page comprises eight sectors of 512 bytes each, as described herein, although it is understood that other page and/or sector sizes are possible. Further, the last portion of each sector in the stream data is replaced with the cycle count, as represented in FIG. 12 by the box labeled 140, which provides an expanded representation of the page data with the cycle count substituted in the last portion of each sector. As shown in FIG. 12, if both the page and the log have been flushed, the cycle count value at the end of each sector will match the cycle count value in the record, i.e., both have a matching signature.

If only the page data (the external part) was written to disk, the system will not find the inline (log) record, and therefore will not find the page, so there is nothing to restore. This state is considered consistent.

If the record is present in the log, however, the page may or may not have been flushed before the system crash. FIG. 13 generally shows how it is determined, when the record is reached during rollback, whether a page and its log record were both flushed to disk. First, at step 1300, the record is accessed to find the page via the offset into the stream 134 stored in the record. Then, at step 1302, the page is read and the last portion of each sector is extracted from it and, at step 1304, compared with the cycle count stored in the log record. If only the inline (log) record was written to disk, then after the crash the unique signature (the cycle count) in each sector of the external part (the page data) will not match the cycle count stored in the inline record data. In this case, as represented by step 1306, the system concludes that because the old page was not written to disk, the new page was not written either (it is flushed only once both logs have been flushed). The page is therefore known to be in its previous, old state.

Conversely, if at step 1304 the cycle count in the log matches the cycle count in the last portion of each sector of the corresponding page, both the log and the page are known to have been flushed successfully. The copied page is thus known to have persisted, and at step 1308 the last portions of each sector, which were saved in the log record, are restored to the copied page.

At this point the copied page can be accessed by readers and provides the appropriate version. Any logged changes made to the current page can be used (step 1310) so that new readers and/or writers see them. In this case, it is known that the old data was correctly captured and must be restored to the file page as part of the abort. Note that despite the abort, existing transactional readers will continue to read the old data from the TOPS stream 134.

It should be noted that the use of a unique signature at the end of each sector also detects torn (partial) writes, in which some, but not all, of a page was copied. Note that disk hardware guarantees that a sector will be written in full, but does not guarantee that a page of data (e.g., eight sectors) will be written as a unit. In such a case, the cycle counts will be some mix of "n" and (presumably) "n-1" values, and the signature will not match the logged signature information. This case is handled as if the entire page had not persisted.
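The per-sector signature described above (stamp the cycle count into the last byte of every sector, keep the displaced bytes in the log record, and compare on recovery) can be sketched as below. This is a simplified Python model under stated assumptions: a page is eight 512-byte sectors, the cycle count fits in one byte, and the function names are hypothetical rather than taken from TxF.

```python
from typing import Optional, Tuple

SECTOR = 512
SECTORS_PER_PAGE = 8
SECTOR_ENDS = [(s + 1) * SECTOR - 1 for s in range(SECTORS_PER_PAGE)]


def stamp_page(page: bytes, cycle_count: int) -> Tuple[bytes, bytes]:
    """Return (stamped page for the page stream, saved bytes for the log).

    The last byte of every sector is saved and then overwritten with
    the cycle count, which acts as the signature.
    """
    assert len(page) == SECTOR * SECTORS_PER_PAGE
    buf = bytearray(page)
    saved = bytes(buf[e] for e in SECTOR_ENDS)
    for e in SECTOR_ENDS:
        buf[e] = cycle_count & 0xFF
    return bytes(buf), saved


def recover_page(on_disk: bytes, logged_cycle: int,
                 saved: bytes) -> Optional[bytes]:
    """Return the original page, or None if the copy did not fully persist.

    A mismatch in any sector (including a torn write, where sectors
    carry a mix of cycle counts) is treated as "page not persisted".
    """
    buf = bytearray(on_disk)
    if any(buf[e] != (logged_cycle & 0xFF) for e in SECTOR_ENDS):
        return None
    for e, original in zip(SECTOR_ENDS, saved):
        buf[e] = original
    return bytes(buf)
```

A torn write can be simulated by splicing the first four sectors stamped with cycle count 7 onto four sectors still carrying cycle count 6; recover_page then returns None, matching the "as if the entire page had not persisted" handling.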

Note that when the cycle count itself wraps around, it may match a count that forms the signature on an existing page (e.g., one that has been in memory for a relatively long time), which makes a partial write undetectable. For example, if a wrapped cycle count is used and it matches the existing cycle count stored on the page, the signature will be the same regardless of whether all or only some of the page was copied. As can be appreciated, the signature check in this case will indicate that the entire page data persisted when in fact it did not.

This problem can be solved in a number of ways. One solution is to read the page once after each cycle wrap event to verify that there is no match. If there is a match, one of the two cycle counts can be adjusted to avoid it. To ensure that this happens only once per cycle wrap (i.e., each time the cycle count returns to zero), a separate verification bitmap 141 may be used to maintain the "verified" state of each page, i.e., each bit is in one state after a wrap and is toggled when the page is first checked for a cycle count match. Note that a free-space bitmap is used to track whether a page is free or in use, and for efficiency the above solution adds an additional bitmap to track the verification state.

An alternative solution (to the read-and-compare operation above) again tracks the verification state, but when the "verified" state is set at the time the page is used, the cycle count is written to the page as described above and the write is forced. If the write succeeds, then the write was not partial. For a large TOPS stream this alternative scales well because it requires fewer I/O operations, since the case in which the cycle count matches a page is likely to be relatively rare.

Yet another alternative combines the first two, based on a check of page residency: if the page is resident in high-speed memory, the first (read) alternative is performed, since no actual disk read is needed; otherwise the second (write) alternative is performed.

Deferred Redo Alternative

The recovery mechanism described above writes a file's dirty pages to disk at commit time, which prevents page writes from being batched across transactions. To batch page writes across multiple transactions, an alternative "deferred redo" scheme may be provided, which works in the opposite way with respect to recovery. This scheme writes redo records to the log, and applies old committed transactions to the base file once no reader is still reading it. To support reads of old committed versions, changes are not made in place; they are applied in place to the file only when the existing in-place version of the page is no longer needed.

The deferred-redo scheme shares many of the principles used by the in-place update scheme; for example, it supports versioning in a very similar way, with version control blocks and multiple in-memory streams. However, the change table holds the LSNs of redo pages rather than undo LSNs. As generally shown in FIG. 14, the oldest on-disk version is always the base file, and newer versions build incremental changes on top of it. Older versions are merged into the base file as readers go away. To exploit the main advantage of this scheme, multiple versions can be merged into the base file at the same time, thereby gaining I/O efficiency. Another advantage of merging multiple versions at once is that the log can be read efficiently in large read operations.

However, the log may fill with pages that back memory for (possibly many) active files, essentially turning the sequential log into a file that is both a recovery log and a random page file, which can become a bottleneck in the system.

As in the in-place update scheme, the latest version is updatable. A version control block (TxFVSCB) is associated with each version, and each TxFVSCB points to a change table, an in-memory table that records the page numbers changed by that version. A disk address may be stored with each page number to locate the page on disk, provided it has been written at least once (the redo image). The absence of a disk address means that the page has never been written to disk. To save memory, a range of pages can be stored in a single entry when the pages in the range are contiguous on disk.

The version LSN is the LSN of the commit record of the transaction that committed the version. The currently updatable version has no such LSN. The SOP pointer is a pointer to the section object pointers structure corresponding to the version. Using this pointer, the in-memory pages can be found. A version length is similarly provided.

The version control blocks are linked together in a list in chronological order. The oldest version is the base stream, and its change table does not contain any entries.

At open time, as in the alternative described above, a file handle is given to one of the versions. The in-memory stream for the latest version is partially backed by the log (not entirely by the base file). Changes to the stream are thus written to the log. Reads are satisfied from the base file if the page has not been changed in any version in the version window; otherwise they are satisfied from the log.

At read time, the change table corresponding to the version is consulted to determine whether the page was modified in that version. If so, the I/O is directed to the appropriate location in the log to fault the page in. If not, the next previous version is consulted for the page; this continues until the most recently committed copy of the page is found. If multiple versions include a copy of the page, their memory residency is checked with a VMM call. If a memory-resident page is found, it is copied; otherwise it is read from the log using the LSN of the latest version. Note that it does not matter if the page is trimmed from system memory between the time residency is checked and the time the copy is made, because a recursive fault is generated and the page is copied afterwards. To obtain system addresses for copying these pages, the cache manager is used to map them into the system address space.

FIG. 14 shows four versions, V0-V3 (although other numbers are possible), in which pages marked with an "X" represent changes in a version. The change tables 1420-1423 show the LSNs of pages that have been written. Some pages in the latest (updatable) version have not yet been written. Given this, consider an example in which FileObjectB accesses page fifty (50). The change table 1421 for file version V1 shows that this page was not changed in that version. The fault is therefore handled by checking the residency of the page in file version V0 and, if it is resident (without faulting), copying it. If file version V0 does not have the page resident, it is read from disk (in this case, from the base file).

As another example, if FileObjectB accesses page two hundred (200) and the page is in memory, the access simply completes. If it is not, a page fault is generated and the read is satisfied by reading the page from the log at LSN 2500.

As a further example, consider FileObjectC accessing page one hundred (100). Because this page was not changed in version V2, version V1 is checked, and the read is satisfied either from the memory image (if resident) or by reading the log at LSN 2000.
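The version walk in these examples can be sketched as a lookup through per-version change tables. The sketch below is a Python illustration only: it leaves out the memory-residency check, assumes FileObjectB reads version V1 and FileObjectC reads version V2 as the examples imply, and the table contents other than the LSNs 2000 and 2500 named in the text (the page 300 entry) are invented for the example.

```python
from typing import Dict, Tuple

ChangeTable = Dict[int, int]  # page number -> LSN of the redo image


def locate_page(page_no: int, version: int,
                change_tables: Dict[int, ChangeTable]) -> Tuple[str, int]:
    """Find where the copy of page_no visible to `version` lives.

    Walks from the requested version back toward the base file (V0)
    and returns ("log", lsn) for the first version that changed the
    page, or ("base", page_no) if no version in the window did.
    """
    for v in range(version, 0, -1):
        lsn = change_tables[v].get(page_no)
        if lsn is not None:
            return ("log", lsn)
    return ("base", page_no)


# Change tables loosely modeled on the FIG. 14 discussion; V0 is the
# base file and has no entries.
tables = {
    0: {},
    1: {100: 2000, 200: 2500},
    2: {300: 3000},  # hypothetical entry, not from the text
    3: {},
}
```

With these tables, a V1 reader asking for page 50 is sent to the base file, a V1 reader asking for page 200 is sent to the log at LSN 2500, and a V2 reader asking for page 100 falls back to V1 and is sent to the log at LSN 2000.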

For file writes, at page-out time the page is written to the log in the form of a redo record, which also describes the stream offset and stream name. At that time, the LSN is marked in the page slot of the change table for that version. Page writes are performed in the background by a system thread and are generally written in sequential page order. At commit time, the dirty pages in that version are written to the log, followed by a commit record. If a page is written out multiple times during a transaction, multiple log writes are performed. These writes go to the end of the log, and the change table entry is changed to point to the new location. If a new write transaction starts after the commit without any intervening read transaction, the main memory stream is reused by the new transaction. Otherwise, it is claimed by the readers, and the writer transaction creates a new stream to work with. Note that in the deferred-redo scheme, changed pages can be written to the TOPS stream (as in the in-place update scheme) to obtain the associated benefits.
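The write path above (append a redo record at page-out, point the change table entry at the newest log location, and finish with a commit record) can be modeled in a few lines. This is a toy Python sketch: the RedoLog class, the record tuples, and the use of a list index as the LSN are all assumptions made for the illustration.

```python
from typing import Dict, List, Tuple


class RedoLog:
    """Append-only log; the LSN is simply the index of the record."""

    def __init__(self) -> None:
        self.records: List[Tuple] = []

    def append(self, record: Tuple) -> int:
        self.records.append(record)
        return len(self.records) - 1


def page_out(log: RedoLog, change_table: Dict[int, int],
             stream: str, page_no: int, data: bytes) -> int:
    """Write one page as a redo record and mark its LSN in the table."""
    lsn = log.append(("redo", stream, page_no, data))
    # A repeated page-out goes to the end of the log, and the change
    # table entry moves to the new location.
    change_table[page_no] = lsn
    return lsn


def commit(log: RedoLog, change_table: Dict[int, int],
           stream: str, dirty_pages: Dict[int, bytes]) -> int:
    """Write remaining dirty pages, then the commit record."""
    for page_no, data in dirty_pages.items():
        page_out(log, change_table, stream, page_no, data)
    return log.append(("commit", stream))
```

If page 7 is paged out twice during a transaction, the log holds both redo records, but the change table refers only to the later one.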

After a system crash, recovery is relatively straightforward, because the redo information for committed transactions is in the log and can simply be applied to the main data stream. Note that the version control blocks are in-memory structures and therefore do not exist at recovery time.

As old readers end their transactions, old versions no longer need to be kept. At that point, versions are merged into the main stream one version at a time, starting with the oldest. As versions are merged, they are removed from the linked list of versions. Merging takes place one page at a time, by copying the changed pages in that version (the page numbers looked up in the change table) to the base stream and forcing them to disk. This copy operation reads the log for pages that are not currently resident. Where possible, large I/O operations are performed to capture ranges of pages from the log. In FIG. 14, for example, if version V0 is no longer needed to support versioning, version V1 is merged into version V0. This merge can take place without locking version V1, because a copy of each page exists in both version V1 and version V0 while the merge is in progress, and the change table for V1 is unchanged throughout.

After the merge completes, if version V1 is not in the version window, the version control block for V1 is simply removed from the list of versions. In general, merging is delayed until multiple versions have been released by readers. In this example, V0, V1 and V2 can be merged together into the base file when they leave the version window. For a multi-version merge, the change tables are merged first, such that when the same entry has been modified in several tables, the LSN from the highest version number is chosen. This essentially batches writes across transactions, which is one advantage of this scheme. After a version has been merged, its log records are eligible for removal from the active log.
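The multi-version change table merge described above (for a page modified in several versions, keep the LSN from the highest version) amounts to a dictionary merge in version order. The Python sketch below assumes each change table is a mapping of page number to LSN and that the tables are passed oldest first; the LSN values other than 2000 and 2500 are invented for the example.

```python
from typing import Dict, List


def merge_change_tables(tables_oldest_first: List[Dict[int, int]]) -> Dict[int, int]:
    """Combine change tables so the newest version's LSN wins per page."""
    merged: Dict[int, int] = {}
    for table in tables_oldest_first:
        # Later (newer) tables overwrite entries from older ones.
        merged.update(table)
    return merged


v0: Dict[int, int] = {}                # base file, no entries
v1 = {100: 2000, 200: 2500}
v2 = {200: 3100, 300: 3000}            # hypothetical LSNs
merged = merge_change_tables([v0, v1, v2])
```

Page 200 was changed in both V1 and V2, so only its V2 image (LSN 3100 in this made-up table) needs to be copied into the base file, which is where the write batching across transactions comes from.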

In general, merging should be done as early as possible. Each time a writer goes away, the version window moves forward. At that time, some versions are marked as eligible for merging. When multiple versions have been marked, a work item is placed on a system thread to perform the merge.

The in-place update scheme and the deferred-redo scheme perform roughly the same amount of I/O. The in-place update scheme may read the undo synchronously (although it can sometimes be found in memory, e.g., if concurrent readers have read it recently). The in-place update scheme writes pages out to the base file, and also writes the undo sequentially to the log. In contrast, the deferred-redo scheme needs to write the redo in large random I/Os, and needs to read the log randomly to merge a version. In addition, the deferred-redo scheme needs to write the file pages to disk, although it minimizes writes across versions. The chance of finding these log pages in memory is thus very low, given how long the merge may be delayed.

There are qualitative differences in when and where the I/O is done. In the deferred-redo scheme, the latest memory stream is backed by the log rather than by the base file. This is likely to be the most frequently used stream, because it handles the update work, which places a comparatively heavier load on the log. For versioned readers, both schemes use the log as a paging device.

The deferred-redo scheme does less work synchronously with commit processing, because much of the transaction's work is done in the background, but it does not appear any faster to a writer on each write API call or memory update, since these are completed in the cache. Instead, the flush at commit time is where the difference in commit responsiveness shows up. For larger update transactions, the background system thread is likely to schedule asynchronous writes, which somewhat reduces the difference in responsiveness. Similarly, the in-place update scheme can reduce the burden at commit time by doing undo work for the file APIs in the background, but this is not feasible for changes made through user-mapped sections.

The in-place update scheme is simpler than the deferred-redo scheme, because it does not need to deal with scheduling asynchronous merge operations. Nor does the in-place update scheme need to deal with the speed mismatch between foreground and background activity, which can at times tie up log space and create resource acquisition problems.

Finally, with the lazy redo scheme, archiving and roll-forward are possible without changing the normal logging algorithm, because redo records are available in the log. However, because there are no undo records, some forward scanning of the log is needed to find a transaction's commit status before applying any of that transaction's redo.
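
The forward scan described above can be sketched as follows. This is a minimal illustration only, assuming a log that holds just redo and commit records; the `LogRecord` type, its fields, and `roll_forward` are hypothetical names and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class LogRecord:
    tid: int            # transaction ID
    kind: str           # "redo" or "commit" (hypothetical record kinds)
    page: int = 0       # page number the redo applies to
    data: bytes = b""   # new page contents


def roll_forward(log, pages):
    """Apply redo records only for transactions that have a commit record."""
    # Forward scan: there are no undo records, so the commit status must be
    # known before any redo of that transaction is applied.
    committed = {rec.tid for rec in log if rec.kind == "commit"}
    for rec in log:
        if rec.kind == "redo" and rec.tid in committed:
            pages[rec.page] = rec.data
    return pages


log = [
    LogRecord(tid=1, kind="redo", page=0, data=b"A"),
    LogRecord(tid=2, kind="redo", page=1, data=b"B"),  # never commits
    LogRecord(tid=1, kind="commit"),
]
restored = roll_forward(log, {})
```

In this sketch the redo of transaction 2 is skipped because no commit record for it is found during the scan.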

File System Transactions over a Network

Remote files are typically accessed, as shown in Fig. 15, through an internal kernel-to-kernel "redirector" protocol (such as the SMB protocol). This protocol mirrors, on a remote machine 148 such as a file server, the file system operations performed on the client machine 146. Of course, other protocols and mechanisms (such as WebDAV, NFS and so on) can achieve equivalent results. Thus, as with non-transactional file system access, the remote file is identified and the IRPs are directed to the redirector file system driver 150 on the client machine 146. As is known, this driver 150 interfaces with the client's cache to read and write data. Requests, such as file system requests from an application 152 directed to the file system 154 of the remote machine (for example, to access a file G:/Fname on a remote disk 156), are intercepted by the redirector driver 150 and sent to the remote machine 148, where an agent 158 (daemon) thread translates them into file system operations at the top level of the driver stack.

For remote transactional file system operations, to open a file the client redirector can, for example, use COM/OLE to marshal the DTC transaction object 160c into a flat byte stream that is sent with the open request to the server 148. As can be appreciated, other mechanisms can achieve equivalent functionality and/or results, and although COM/OLE operations are described here, this aspect of the invention is not limited to COM/OLE. In the COM/OLE example, the transaction object 160c is attached to the client thread requesting the open. Note that the server machine 148 does not care where the transaction originated, as long as it can hold a copy of the DTC transaction object 160s in its kernel space. Similarly, the server 148 does not care which thread or process works on behalf of the transaction. Instead, the agent 158 at the server 148 converts the flat byte stream back into a usable object that is available in the kernel. At this point the server treats the request as a local transaction 160s and enlists it with the counterpart DTC proxy 162s on the server, essentially telling DTC to contact the server 148 for subsequent transaction work (with the TxF component 164 acting as the resource manager here). Note that this is appropriate because the server owns the transaction object 160s. Because transaction IDs live in a distributed namespace, a transaction can originate anywhere, but the correct file synchronization based on the transaction ID takes place on the server 148.

The server essentially treats the file system requests as if they were local requests, with the local TxF component 164 handling the transactional file system requests. However, the server 148 must remember that the corresponding file object is for a file opened by the client 146, and that the client has cached pages. Therefore, at commit time, the server 148 notifies the client 146 (via the redirector protocol) to flush its cache to the server, and to flush any mapped sections that may be open on the client (the client keeps track of its mapped sections). Data normally arrives at the server 148 in a somewhat lazy fashion, that is, whenever it is paged out of the client's cache/memory. When the data arrives, it overwrites the cached copy on the server. Note that this is similar to the previous file system model, in which multiple open handles or mapped sections overwrite one another.

For redirector-based file create operations, the above concept of marshaling an ITransaction over the network can also be used, with CreateFileEx in user mode marshaling the ITransaction object (for example via the DTC ITransactionTransmitter method) into a flat collection of bytes. Because the ITransactionTransmitter call requires no communication with the transaction manager, it is relatively inexpensive and can therefore be done for every create. However, the receive call (described above) does require communication with the transaction coordinator (or its proxy), which in the redirector-based case is on the remote machine 148. Nevertheless, because ITransactionReceiver is done only once per transaction across the entire network (on the server 148), the overhead of communicating with the transaction coordinator 162s is not significant.

In this manner, transactional remote file access is supported transparently; that is, by using remote file access and by creating application proxies on multiple machines, an application can effectively access files anywhere on the network directly. Consequently, the same transaction can involve one or more local processes and remote processes at the same time.

The redirector protocol is generally optimized for the case in which a single client has a file open for remote access. In this case, a large amount of network traffic is avoided by keeping a local disk cache of the file. Changes are flushed only when needed (that is, when the file is closed). However, this arrangement no longer holds as soon as another client opens the same file concurrently. Opportunistic locks (oplocks, essentially tokens indicating ownership) make this work, so that the changes to the above "update on close" scheme are minimal. In particular, at commit time the client is normally asked to flush its changes to the server. At abort time, the client is asked to mark the client handle as "doomed", so that the changes are simply discarded once the handle is closed. Note that the redirector protocol could be enhanced to allow the server to invalidate client mapped sections in certain circumstances, as in the local case.
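
The commit-time flush and the "doomed" handle can be illustrated with a small model. This is only a sketch under simplifying assumptions: a Python dictionary stands in for the server, and the `ClientHandle` class and its method names are hypothetical, not part of any redirector protocol.

```python
class ClientHandle:
    """Client-side handle that caches writes locally (hypothetical model)."""

    def __init__(self, server_files, name):
        self.server_files = server_files  # stands in for the remote server
        self.name = name
        self.cached = None                # unflushed local change, if any
        self.doomed = False

    def write(self, data):
        self.cached = data                # stays in the local cache

    def flush(self):
        # A doomed handle never sends its changes; they are dropped instead.
        if self.cached is not None and not self.doomed:
            self.server_files[self.name] = self.cached
        self.cached = None

    def on_commit(self):
        self.flush()                      # client is asked to flush at commit

    def on_abort(self):
        self.doomed = True                # changes are discarded at close

    def close(self):
        self.flush()


server = {"Fname": b"old"}

h1 = ClientHandle(server, "Fname")
h1.write(b"new")
h1.on_abort()
h1.close()
after_abort = server["Fname"]

h2 = ClientHandle(server, "Fname")
h2.write(b"new")
h2.on_commit()
after_commit = server["Fname"]
```

The aborted handle leaves the server copy untouched, while the committed handle pushes its cached change before the transaction completes.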

Namespace Isolation

Isolating one transaction's changes from other transactions is a key property of transactions. In a transactional file system, isolation applies not only to the data stored in files (as described above), but also to the hierarchy of file names and directory names under which files are organized. In accordance with another aspect of the present invention, a technique is provided for implementing namespace isolation within the file/directory name hierarchy. The technique does not require locking names or directories for the duration of a transaction, and it works with non-transactional operations attempted on files that are in use by a transaction.

As an example, consider a file created by a transaction that has not yet committed. Note that a directory rather than a file may be created, but for simplicity the invention is discussed primarily with respect to files. It should be understood, however, that files and directories are generally treated equivalently with respect to the namespace operations described below. A file (or directory) created by a transaction should be available to the creating transaction without restriction, but should not be visible to other transactions (for example, one that tries to open it or to list its parent directory). Only when the creating transaction commits does the file become visible to other transactions; if the transaction aborts, the file is never visible to anyone. Non-transactions (for example, a request to enumerate the parent directory) will see such a file; alternatively, it is also possible to make such a file invisible to non-transactions until it is committed.

Similarly, if a file (or directory) is deleted by a transaction that has not yet committed, the deleted file needs to remain accessible to other transactions until commit time, as if the deletion had never occurred. The deleting transaction, however, sees the effect of the deletion and is able to create a different file with the same name in its place. After commit, the deleted file is removed. Non-transactions see the effect of the deletion, that is, they do not see the deleted file, but a non-transaction cannot create a new file with the same name as a file deleted by an uncommitted transaction. This avoids a conflict in the event that the deleting transaction aborts and the deletion is undone. It is also possible to treat a non-transaction as if it were a different transaction, so that it continues to see files deleted by a transaction, but this is less desirable.

Further, if a file (or directory) is renamed by a transaction, the file remains available to other transactions under its original name in its original directory, and the new name is not visible to other transactions. The renaming transaction sees the effect of the rename and can use the old name to create a different file. Note that a rename is essentially a combination of creating a new link and deleting the old link.

To accomplish namespace isolation for the cases above, the present invention preserves the state of the namespace for use by other transactions for the duration of the transaction. To this end, as shown in Figs. 16-18, directories referred to as isolation directories 1701-1704 are created and linked to the corresponding NTFS directories changed by the transaction performing the namespace operation. In particular, each isolation directory (for example 1701) comprises a search structure (for example a binary search tree) associated with the TxFSCB structure of the parent directory (for example directory D3). In addition, the isolation directory search structure and its associated manipulation routines include a generic interface that supports adding an entry and quickly looking up an entry by name, and that also supports the directory enumeration algorithm.

These isolation directories contain the names affected by the transactions that made namespace changes, and they are main-memory structures only. Each entry in the structure also contains the transaction ID (Tid) associated with the name, and a visibility disposition with two flags: visible to the transaction Tid, and visible to other transactions. One or both of these visibility flags can be set independently. The isolation directory structure also contains a short-name/long-name flag, wherein if a pairing is available, the structure contains a pointer to the structure corresponding to the paired name. Also provided are a flag indicating that the name is reserved by the Tid so that other transactions cannot claim it; a Fid (used to redirect Create() for deleted and renamed names); and other information, namely NTFS duplicated information such as timestamps and similar data used for directory enumeration. For space efficiency, the structure can be split into the name, a pointer to the information, a pointer to the other name, and the other information. This results in a single set of other information being shared by the two names.
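
The entry layout and the generic interface described above might look roughly like the following. This is a minimal sketch: the class and field names are hypothetical, and a Python dictionary is used in place of the binary search tree mentioned in the text, purely to keep the example short.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class IsolationEntry:
    name: str
    tid: int                           # transaction that changed the name
    visible_to_tid: bool               # first visibility flag
    visible_to_others: bool            # second flag, set independently
    reserved: bool = True              # other transactions cannot claim the name
    fid: Optional[int] = None          # redirect target for deleted/renamed names
    paired_name: Optional[str] = None  # short/long name counterpart, if any


class IsolationDirectory:
    """In-memory name index attached to one parent directory."""

    def __init__(self) -> None:
        self._entries: Dict[str, IsolationEntry] = {}

    def add(self, entry: IsolationEntry) -> None:
        self._entries[entry.name] = entry

    def lookup(self, name: str) -> Optional[IsolationEntry]:
        return self._entries.get(name)

    def enumerate(self) -> List[IsolationEntry]:
        # Return entries in name order, as a directory listing would.
        return [self._entries[n] for n in sorted(self._entries)]


iso_d3 = IsolationDirectory()
iso_d3.add(IsolationEntry("F3", tid=1, visible_to_tid=False,
                          visible_to_others=True, fid=3))
```

Here the single entry records that transaction 1 deleted F3: the name is hidden from transaction 1 but still visible to other transactions through the redirect Fid.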

As an example of how the isolation directories are used, as shown in Fig. 16, if file F3 is deleted by transaction T1, the name and various information of file F3 are added to the isolation directory 1701 at (almost) the same time that the name is removed from the NTFS directory D3. Note that to delete a file in NTFS, the open file is marked for deletion, the file system closes the file while maintaining a count of open handles, and the deletion is performed when no handles remain open. Note also that the isolation directory 1701 may already exist because of an earlier operation by this transaction T1 or by another transaction (for example T2), or it is created when needed to support this delete operation. The delete operation is described further below with reference to the flow diagram of Fig. 19.

The isolation directory 1701 is used to handle subsequent accesses to the file F3 by a different transaction (for example T2), so that transaction T2 continues to see the file F3. However, if the same transaction T1 that deleted file F3 (or a non-transaction) looks for file F3, it will not find the file. To handle these cases, as described above, the following are maintained for the file: its name, its visibility disposition, the ID of the transaction that deleted it, the redirect file ID, the $TxF file identifier (for example, a monotonically increasing sequence number) and the duplicated information (timestamps, size, attributes).

Fig. 19 provides a general logical representation of handling a request to delete an open file. Note that Fig. 19 and similar flow diagrams are simplified to provide an understanding of how the isolation directories are used, and should not be taken as an exact representation of the underlying code; they do not include special cases, error handling and so on. In any event, beginning at step 1900, a distinction is made between transactional and non-transactional requesting entities, because a transactional user results in different delete operations than a non-transactional user. If a non-transaction requests deletion of a file (identified by its handle), the deletion is performed in the otherwise normal manner, that is, at step 1902 the specified file is deleted from the disk. The deletion starts when the last handle is closed.

If a transaction (for example Tid1) requests the deletion at step 1900, step 1904 is performed, which essentially renames the file. For example, as shown in Fig. 16, a link with an arbitrary name (for example "0") is added to a hidden directory 168 ($TxF), and this link points to the file record in the master file table 130 (Fig. 11). At the same time, the link to the deleted file F3 is removed from the parent directory D3.

Then, at step 1906, the delete information is logged in a delete record, namely the file name F3, a reference to the original parent directory, and the new link information. If the system crashes before the transaction that deleted the file commits, the transaction is aborted, and the log correctly restores the file by simply renaming it as described above, that is, by restoring the previous link (the $TxF directory will be gone, because it is an in-memory structure).

In accordance with the present invention, the file information is then added to the isolation directory tree 1701 linked to the normal directory D3. The isolation directory tree 1701 may already exist in association with the normal directory, but if not, it is created. Step 1910 is performed to adjust the visibility disposition flags appropriately to indicate that transaction Tid1 has requested deletion of this file, so that the file remains visible to other transactions but is not visible to Tid1. At step 1912, the arbitrarily named link is added to a list of files that will later be deleted from the disk (after the transaction commits).

When the transaction ends, the name entries corresponding to the transaction are removed from the isolation directory, and when no entries remain in an isolation directory, the isolation directory is deleted. Note that if the system crashes, the isolation directories, being in-memory structures, are lost. However, because a system crash aborts the uncommitted transactions, the isolation directories are no longer needed for isolation, and unrolling the log file appropriately resets the state of the files.
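
The delete path of Fig. 19 and the cleanup at transaction end can be sketched as below. This is a simplified model, not the patent's code: the volume is a plain dictionary, all function and key names are hypothetical, and the delete record in the log doubles as the step 1912 list of files to delete after commit.

```python
def new_volume(dirs, files):
    """Build a toy volume: directories, file IDs on disk, $TxF, isolation dirs."""
    return {"dirs": dirs, "files": set(files), "txf": {}, "iso": {},
            "log": [], "next_link": 0}


def delete_file(vol, parent, name, tid=None):
    fid = vol["dirs"][parent].pop(name)        # unlink from the parent directory
    if tid is None:                            # step 1902: ordinary delete
        vol["files"].discard(fid)
        return
    link = str(vol["next_link"])               # step 1904: arbitrary name in $TxF
    vol["next_link"] += 1
    vol["txf"][link] = fid
    vol["log"].append((tid, name, parent, link))   # step 1906: delete record
    vol["iso"].setdefault(parent, {})[name] = {    # step 1910: visibility flags
        "tid": tid, "visible_to_tid": False, "visible_to_others": True,
        "fid": fid}


def end_transaction(vol, tid, committed):
    for rec in [r for r in vol["log"] if r[0] == tid]:
        _, name, parent, link = rec
        fid = vol["txf"].pop(link)
        if committed:
            vol["files"].discard(fid)          # remove from disk after commit
        else:
            vol["dirs"][parent][name] = fid    # abort: restore the old link
        vol["log"].remove(rec)
    for parent in list(vol["iso"]):
        iso = vol["iso"][parent]
        for name in [n for n, e in iso.items() if e["tid"] == tid]:
            del iso[name]
        if not iso:                            # drop an empty isolation directory
            del vol["iso"][parent]


vol = new_volume({"D3": {"F3": 3}}, [3])
delete_file(vol, "D3", "F3", tid=1)
```

After the call, F3 is gone from D3 but still reachable through the isolation entry; calling `end_transaction(vol, 1, committed=False)` would put the original link back, while a commit would remove the file from disk.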

Creating a file is somewhat similar to deleting one, in that when a file is created in a directory by a transaction (for example Tid2), the name is actually added to the isolation directory linked to the (parent) NTFS directory. For other transactions, the file name is filtered out, by virtue of the visibility flag settings, when they try to open the file or list its parent NTFS directory before the transaction commits. For Tid2 and for non-transactions, the created file is visible before it is committed.

A named entry can be modified by the transaction after the entry has been added. For example, if a file is deleted and another file is created using the same name, the create modifies the state of the entry, so that other transactions continue to see the file that existed before the deletion, while this transaction sees the file it has just created. Note that no transaction-level locks are held on the NTFS directories or the isolation directories. This allows the system to retain the concurrency of the base file system.

As shown in Fig. 18, if transaction Tid2 creates file F6 (requesting the create in a normal parent directory), F6 is created in directory D4, and an entry is accordingly added to the isolation directory 1702 associated with the parent directory D4. The isolation directory 1702 is created if needed. The flags are adjusted appropriately to reflect the created-by-Tid2 status (that is, visible to Tid2 but not visible to other transactions) and to indicate that the name is reserved for Tid2. Note that transaction Tid2 can also delete the newly created file F6 before committing, in which case the file cannot be seen by Tid2 or by other transactions. One way to handle such a create-then-delete operation is to remove the entry from directory D4 and from the isolation directory 1702. Another way is to leave the entry in the isolation directory 1702, with its disposition flags set so that it is invisible both to Tid2, which created the file, and to other transactions, thereby preventing the file name from being used by other transactions until Tid2 commits or aborts.

Returning to the typical case, in which F6 is created by transaction Tid2 and is not deleted, when (and if) transaction Tid2 commits or aborts, the isolation entry is removed from the isolation directory 1702, so that in the commit case the created file F6 becomes visible to all transactions. If transaction T2 aborts, the file is deleted from the normal NTFS directory D4. Each isolation entry remains until the transaction associated with it ends, and is removed at commit or abort. To facilitate removal, each transaction maintains a list of pointers to the TxFSCBs in which it has at least one such entry. The transaction also increments and decrements the reference count on each TxFSCB as appropriate, so that a TxFSCB is held by the transactions that are using it.
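
The bookkeeping that lets a transaction find and remove its entries at commit or abort can be sketched as follows. The class and method names are hypothetical, and the reference count here is a plain integer rather than whatever mechanism the actual TxFSCB uses.

```python
class TxFSCB:
    """Toy per-directory control block holding an isolation directory."""

    def __init__(self, path):
        self.path = path
        self.iso = {}        # name -> entry dict
        self.refcount = 0    # number of transactions holding this block


class Transaction:
    def __init__(self, tid):
        self.tid = tid
        self.scbs = []       # TxFSCBs in which this transaction has entries

    def add_entry(self, scb, name, visible_to_tid, visible_to_others):
        if not any(s is scb for s in self.scbs):
            self.scbs.append(scb)
            scb.refcount += 1            # hold the block while it is in use
        scb.iso[name] = {"tid": self.tid,
                         "visible_to_tid": visible_to_tid,
                         "visible_to_others": visible_to_others}

    def finish(self):
        """Called at commit or abort: remove this transaction's entries."""
        for scb in self.scbs:
            for name in [n for n, e in scb.iso.items() if e["tid"] == self.tid]:
                del scb.iso[name]
            scb.refcount -= 1            # release the block
        self.scbs.clear()


d4 = TxFSCB("\\D2\\D3\\D4")
t1, t2 = Transaction(1), Transaction(2)
t1.add_entry(d4, "F7", visible_to_tid=True, visible_to_others=False)
t2.add_entry(d4, "F6", visible_to_tid=True, visible_to_others=False)
```

With both transactions active the block's count is 2; when one of them finishes, only its own entries disappear and the count drops to 1.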

Fig. 20 provides a general logical representation of handling a request to create a file, where the request is a New_File_Create (a type of request in which the create is not allowed if a file with the same name already exists). Beginning at step 2000, a test is performed to determine whether the file name (for example F6 of Fig. 17) is already present in the normal parent directory, for example parent directory D4. If so, the file cannot be created, and step 2000 branches to step 2002, where an error is returned. If file F6 is not found in the parent directory D4, it is still possible that the file name is already in use by a transaction. To test for this, step 2000 branches to step 2004, where the isolation directory 1702 associated with D4 is searched for the file name. If no entry exists for this file F6 (or there is no isolation directory), step 2004 branches to step 2006, where a determination is made as to whether a transaction or a non-transaction is requesting the create. If a non-transaction is requesting it, step 2006 branches to step 2018, where the file is created in the normal directory D4. Otherwise, a transaction (for example Tid2) is requesting the create, and step 2010 is performed to add an entry to the isolation directory 1702 (after first creating the isolation directory 1702 if none exists for the parent directory). Step 2014 then represents setting the appropriate flags, obtaining the other information for this entry, and so on.
Step 2014 then continues to step 2018, where the file F6 is actually created in the normal directory D4. Note that in NTFS, at create time the file is allocated, a file record is created for the file in the master file table, and a create record is added to the log.

At step 2004, if the name is found in the isolation directory 1702, the create is not allowed unless the specified file was deleted by the same Tid (for example Tid2) that is now requesting the create. In this way, a transaction can create a file that it deleted, but no other transaction or non-transaction can use the file name until the transaction that created and/or deleted the file commits or aborts. If the name is found, step 2012 is performed to test the flag state and determine whether the same transaction is requesting the create. If so, step 2012 branches to step 2014 to change the flag state for this entry, which now essentially indicates "created by Tid2" (visible to Tid2, invisible to others) rather than "deleted by Tid2" (invisible to Tid2, possibly visible to others). If another transaction or a non-transaction is requesting the create, step 2012 branches to step 2016, which returns an error indicating that a transaction has reserved this file name.
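
The create logic of Fig. 20 can be summarized in a short sketch. The function, the `NameReserved` exception and the dictionary-based directories are hypothetical stand-ins; the flag values set at step 2014 follow the description above literally, and details such as logging and file record allocation are omitted.

```python
class NameReserved(Exception):
    """Raised when an uncommitted transaction holds the file name."""


def create_new_file(normal_dir, iso_dir, name, fid, tid=None):
    """Model of New_File_Create; tid=None means a non-transactional caller."""
    if name in normal_dir:                      # step 2000 -> 2002
        raise FileExistsError(name)
    entry = iso_dir.get(name)                   # step 2004
    if entry is not None and (tid is None or entry["tid"] != tid):
        raise NameReserved(name)                # step 2012 -> 2016
    if tid is not None:
        if entry is None:                       # step 2006 -> 2010
            entry = iso_dir[name] = {"tid": tid}
        entry["visible_to_tid"] = True          # step 2014: "created by tid"
        entry["visible_to_others"] = False
    normal_dir[name] = fid                      # step 2018
    return fid


# Transaction 2 creates F6 in D4.
d4, iso_d4 = {}, {}
create_new_file(d4, iso_d4, "F6", fid=6, tid=2)

# F3 was deleted from D3 by transaction 1, so only transaction 1 may reuse it.
d3 = {}
iso_d3 = {"F3": {"tid": 1, "visible_to_tid": False, "visible_to_others": True}}
try:
    create_new_file(d3, iso_d3, "F3", fid=9, tid=2)
    other_tid_blocked = False
except NameReserved:
    other_tid_blocked = True
create_new_file(d3, iso_d3, "F3", fid=9, tid=1)
```

A non-transactional create with no isolation entry falls straight through to step 2018 and touches only the normal directory.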

Fig. 18 represents a transactional file rename operation, which is essentially a combination of a create-link request and a delete-link request. Thus, if transaction T1 renames the file "\D2\D3\F2" to "\D2\D3\D4\F7", the link F2 is deleted from directory D3 and a link F7 is created in directory D4. However, because a transaction is involved in the rename, these operations are reflected in the appropriate isolation directories 1703 and 1704. Note that a file can be renamed within the same parent directory, or renamed so that it has the same file name in a different directory.

In accordance with the present invention, for a transactional rename of a file, an isolation directory is provided for each parent directory involved in the rename, for example one indicating the transaction's delete operation and one indicating the transaction's create operation. Note that a rename within the same parent directory needs only one isolation directory, with one entry for the deletion of the old file and one for the creation of the new file. As can be understood from Figs. 19 (delete) and 20 (create) described above, other transactions still see the file as if it had not been renamed, and do not see the renamed file until the transaction commits. If the transaction aborts, other transactions see no sign that a rename ever took place, except possibly that the file names involved were temporarily reserved for the lifetime of the transaction.
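
A rename expressed as a delete-link plus a create-link, with one isolation entry for each, can be sketched like this. The `rename` function and the dictionary layout are hypothetical and ignore logging and error handling.

```python
def rename(dirs, iso, tid, src_dir, src_name, dst_dir, dst_name):
    """Move a link and record both halves in the isolation directories."""
    fid = dirs[src_dir].pop(src_name)           # delete the old link
    dirs[dst_dir][dst_name] = fid               # create the new link
    # Old name: hidden from the renaming transaction, still seen by others.
    iso.setdefault(src_dir, {})[src_name] = {
        "tid": tid, "visible_to_tid": False, "visible_to_others": True,
        "fid": fid}
    # New name: seen only by the renaming transaction until it commits.
    iso.setdefault(dst_dir, {})[dst_name] = {
        "tid": tid, "visible_to_tid": True, "visible_to_others": False,
        "fid": fid}


# Rename \D2\D3\F2 to \D2\D3\D4\F7 under transaction 1.
dirs = {"D3": {"F2": 2}, "D4": {}}
iso = {}
rename(dirs, iso, 1, "D3", "F2", "D4", "F7")

# A rename within one parent directory needs only one isolation directory.
dirs_same = {"D3": {"F2": 2}}
iso_same = {}
rename(dirs_same, iso_same, 1, "D3", "F2", "D3", "F8")
```

In the first case two isolation directories are populated (one per parent); in the second, a single isolation directory holds both the "deleted" and the "created" entry.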

Finally, Figs. 21-22 represent whether a transaction will see a specified file, for example when trying to open the file or obtain its file information (such as part of an enumeration), depending on the state of the file. Step 2100 represents a test of whether the file is in the normal directory. If so, the isolation directory (if one exists) needs to be searched for an entry for the file, to determine whether the file is visible to the requester (step 2102). If the file is not in the normal directory, it may have been deleted from the normal directory by an ongoing transaction, which is handled in Fig. 22, described below.

If the file is in the normal directory (step 2100) and there is no entry for the file in the isolation directory (step 2102), then it is an ordinarily accessible file, that is, it was not created by a transaction that has not yet committed. In that case, the file system operates as it did before transactions (represented by step 2104): a file handle can be returned (for example, in the case of a file open request), or the file information can be returned from the information in the master file table (for example, in the case of an enumeration request).

If an entry for the file is in the isolation directory tree, the file must have been created by an ongoing transaction, and step 2102 branches to step 2106, where a test is performed to determine whether the transaction that created the file is the same transaction that is now requesting access or information. If so, step 2106 branches to step 2108, which tests the visibility disposition flag (whether the file is visible to this Tid). If it is visible, a file handle (or the file information) is returned to the requesting transaction (step 2110). Note that in the present implementation there should be no case in which a file is in the normal directory and has an entry in the isolation directory (because it was created by a transaction), and yet the flags indicate that the file should be invisible to the transaction that created it. Thus, in the present implementation, the test at step 2108 is essentially unnecessary, except for detecting corruption of the normal and/or isolation directories and the like.

If an entry for the file is in both the normal directory (step 2100) and the isolation directory tree (step 2102), but step 2106 determines that the request was not made by the same transaction, then in the present implementation the file may or may not be visible at step 2114. If it is not visible, step 2116 treats the file as not found, unless part of the other transaction's request is to use the file name, in which case an error is returned indicating that the file is in use by another transaction. For example, an open request of the type that tries to create a new file if the specified file is not found will fail, because the name is in use. If the file is visible to other transactions at step 2114 (the file was created after a deletion), the deleted file is opened from the $TxF directory using the redirect Fid (step 2118).

FIG. 22 handles the case in which the file is not in the normal directory. If a transaction that has not yet committed or aborted has deleted a file, an entry for that file will be in the isolation directory; the deleting transaction cannot see the file, but other transactions can. Step 2200 tests whether an entry for the file is absent from the isolation directory (the file is already known, from step 2100 of FIG. 21, not to be in the normal directory). If it is absent, the file is not found and is handled accordingly at step 2202.

If instead the name is present in the isolation directory at step 2200, then a transaction has deleted it. Step 2204 tests whether the same transaction that deleted the file is requesting access to the file (or information about it). If so, at step 2212 the file is not visible to the transaction that deleted it, and a not-found state therefore exists (step 2206). Note that if for some reason the file were visible to that transaction, an error would exist.

If at step 2204 a transaction other than the one that deleted the file is requesting access to the file (or information about it), and the file is visible to other transactions as tested at the subsequent step 2212, then step 2214 returns a handle to the file, or the file information (from the saved file ID, or Fid, as described below, including the duplicated information).

Another possibility is that an ongoing transaction created and then deleted a file, so that the file is not in the normal directory. As described above, the filename can either be treated as available to other transactions or be reserved for the ongoing transaction until that transaction commits or aborts. The former can be implemented by simply removing the file's entry from both the normal directory and the isolation directory when the transaction that created the file deletes it; note that if such an entry is removed from the isolation directory, step 2212 will not be reached. The latter can be implemented by removing the file from the normal directory at deletion time while leaving its entry in the isolation directory, with the flags set to indicate that it is visible to no transaction. As can be appreciated, this is possible because the visibility disposition flags are set independently (i.e., they are not mutually exclusive). However, if the file is left in the isolation directory and marked as invisible to other transactions (and to the transaction that created it), a file-not-found state exists at step 2216, but the filename remains reserved for the ongoing transaction.
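The lookup flow of FIGS. 21 and 22 described above can be summarized as a decision over three inputs: whether the name is in the normal directory, whether an isolation entry exists, and who is asking. The following C sketch is only a simplified model of that flow under stated assumptions: the names `IsoEntry`, `LookupResult` and `resolve_name` are hypothetical, Tid 0 is assumed to mean "no transaction", and a name found only in the normal directory is assumed to be opened normally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned long Tid;          /* hypothetical transaction id type */

/* Hypothetical, reduced view of an isolation directory entry. */
typedef struct {
    Tid  tid;                       /* transaction that owns the entry */
    bool visible_to_tid;            /* visibility disposition flags,   */
    bool visible_to_others;         /* set independently of each other */
} IsoEntry;

typedef enum {
    LOOKUP_OPEN_NORMAL,             /* return handle/info (step 2110)          */
    LOOKUP_OPEN_REDIRECT,           /* open via redirect Fid (step 2118)       */
    LOOKUP_OPEN_SAVED_FID,          /* open via saved Fid (step 2214)          */
    LOOKUP_NOT_FOUND,               /* plain not-found                         */
    LOOKUP_NAME_RESERVED,           /* not visible, but name is in use         */
    LOOKUP_INCONSISTENT             /* flag state the text calls an error      */
} LookupResult;

/* Resolve a name for a transactional requester. */
LookupResult resolve_name(bool in_normal_dir, const IsoEntry *iso, Tid requester)
{
    assert(requester != 0);         /* assumption: 0 means "no transaction" */

    if (iso == NULL)                /* no isolation entry at all */
        return in_normal_dir ? LOOKUP_OPEN_NORMAL : LOOKUP_NOT_FOUND;

    bool same_tid = (iso->tid == requester);

    if (in_normal_dir) {            /* FIG. 21: file created by a transaction */
        if (same_tid)
            return iso->visible_to_tid ? LOOKUP_OPEN_NORMAL
                                       : LOOKUP_INCONSISTENT;
        return iso->visible_to_others ? LOOKUP_OPEN_REDIRECT
                                      : LOOKUP_NAME_RESERVED;
    }

    /* FIG. 22: name only in the isolation directory (deleted file). */
    if (same_tid)
        return iso->visible_to_tid ? LOOKUP_INCONSISTENT
                                   : LOOKUP_NOT_FOUND;
    return iso->visible_to_others ? LOOKUP_OPEN_SAVED_FID
                                  : LOOKUP_NAME_RESERVED;
}
```

In this sketch `LOOKUP_NAME_RESERVED` stands for the case where the requester cannot see the file but an attempt to create a file under that name would still fail because the name is in use.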

In this manner, the present invention facilitates collated searching, e.g., finding the next name in collation order using NTFS collation rules and NTFS routines. The present invention is space efficient and allows concurrent read/write access.

Note that for purposes of what it can and cannot see, a non-transaction sees only what is in the normal directory. For purposes of using an existing filename, however, a non-transaction cannot use a filename that is reserved for a transaction. To this end, when a non-transaction attempts to create a file with a name that does not exist in the normal directory, the isolation directory is checked as described above.
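The name-reservation rule above can be illustrated with a small check that runs before a create. This is a sketch under assumptions, not TxF code: `IsoName`, `may_create` and `NO_TRANSACTION` are hypothetical names, the isolation directory is modeled as a flat array, and names are compared case-sensitively for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NO_TRANSACTION 0UL          /* assumption: marks a non-transactional caller */

/* Hypothetical, reduced isolation entry: just the name reservation. */
typedef struct {
    const char   *name;
    unsigned long tid;              /* transaction holding the reservation */
    bool          reserved;
} IsoName;

/* Linear search of the (modeled) isolation directory. */
static const IsoName *find_iso(const IsoName *iso, size_t count, const char *name)
{
    for (size_t i = 0; i < count; i++)
        if (strcmp(iso[i].name, name) == 0)
            return &iso[i];
    return NULL;
}

/* May `requester` create a new file called `name`? */
bool may_create(bool exists_in_normal_dir, const IsoName *iso, size_t count,
                const char *name, unsigned long requester)
{
    assert(name != NULL);

    if (exists_in_normal_dir)
        return false;               /* name already taken in the normal directory */

    const IsoName *entry = find_iso(iso, count, name);
    if (entry == NULL || !entry->reserved)
        return true;                /* nobody has reserved the name */

    /* Reserved: only the reserving transaction itself may reuse the name. */
    return requester != NO_TRANSACTION && entry->tid == requester;
}
```

A non-transactional caller therefore fails on a reserved name even though the name is absent from the normal directory, which is the only directory it can see.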

In view of the above examples and description, the following examples show how transactions use and modify entries in the isolation directory. First, consider a transaction Tid10 that creates a new file named YisAVeryLongName in directory X, i.e., creates X\YisAVeryLongName. The isolation directory adds the following two entries:

Name: YisAVeryLongName;
Tid: 10;
(Visible to Tid: TRUE, Visible to others: FALSE);
LongName: TRUE;
pairedNamePtr: Ptr to the short-name entry;
Reserved: TRUE;
Fid: INVALID_ID;
Other duplicated information.

Name: YisAVery;
LongName: FALSE;
pairedNamePtr: Ptr to the long-name entry.

This ensures that a subsequent directory enumeration of X will not return these names if it is performed by a transaction other than Tid10, whereas non-transactions will see both names. In addition, if another transaction Tid20 attempts to create or open either of the two names, that transaction gets a "File-already-exist-but-sharing-violation" error, detected from the isolation structures above.
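To make the layout of the two entries in this first example concrete, the following C sketch models an isolation entry and fills in the long-name/short-name pair for a create. The type and function names (`IsoEntry`, `iso_record_create`) and the value chosen for `INVALID_ID` are assumptions made for illustration; the fields mirror the ones listed in the example, and the duplicated information is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define INVALID_ID 0ULL             /* hypothetical value for "no Fid saved" */

/* Hypothetical in-memory form of an isolation directory entry. */
typedef struct IsoEntry {
    const char         *name;
    unsigned long       tid;                /* owning transaction          */
    bool                visible_to_tid;
    bool                visible_to_others;
    bool                long_name;          /* TRUE for the long-name link */
    struct IsoEntry    *paired_name;        /* the other name of the pair  */
    bool                reserved;           /* name reserved for `tid`     */
    unsigned long long  fid;                /* saved file id, if any       */
} IsoEntry;

/* Record the create of a new file by transaction `tid` (first example). */
void iso_record_create(IsoEntry *long_entry, IsoEntry *short_entry,
                       const char *long_name, const char *short_name,
                       unsigned long tid)
{
    assert(long_entry != NULL && short_entry != NULL);

    *long_entry = (IsoEntry){
        .name = long_name,
        .tid = tid,
        .visible_to_tid = true,             /* the creator sees its own file */
        .visible_to_others = false,         /* other transactions do not     */
        .long_name = true,
        .paired_name = short_entry,
        .reserved = true,
        .fid = INVALID_ID,                  /* nothing was displaced         */
    };

    /* The example lists only name, LongName and the back pointer for the
       short-name entry; the remaining fields are left zeroed here. */
    *short_entry = (IsoEntry){
        .name = short_name,
        .long_name = false,
        .paired_name = long_entry,
    };
}
```

For the example above, the call would be `iso_record_create(&l, &s, "YisAVeryLongName", "YisAVery", 10)`.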

If a non-transactional thread opens either of these names, it gets a sharing violation if it opens the file for writing, for deletion, or for any kind of modification; such a non-transaction can open the file read-only. This is due to TxF's file-locking semantics, which are enforced separately as described above.

Consider a second example, the transactional deletion of an existing file YisAVeryLongName from its parent directory X. In this example there is also a short-name link for this name in directory X (the name-pair case, as opposed to the link-deletion case). Again, the transaction has the identifier Tid10, and the isolation directory has the following two entries added:

Name: YisAVeryLongName;
Tid: 10;
(Visible to Tid: FALSE, Visible to others: TRUE);
LongName: TRUE;
pairedNamePtr: Ptr to the short-name entry;
Reserved: TRUE;
Fid: File Id;
Other duplicated information.

Name: YisAVery;
LongName: FALSE;
pairedNamePtr: Ptr to the long-name entry.

The two links are also deleted from the index SCB of directory X, but for now it can be assumed that TxF ensures the file is not physically removed, because TxF adds a system-owned link to the file before the deletion. As a result, neither name can be used to create a new file or link by any transaction other than Tid10. This is because Tid10 may decide to abort and reuse the names. Also, the names are not visible to Tid10 in directory enumerations or creates, which allows Tid10 to create new links/files using either of the two names. The names are visible to other transactions, which means that those transactions can open them using the file ID (Fid). Non-transactional users cannot see these files, nor can they use the names to create new files.

In a third example, assume that the first example has already taken place, i.e., the file has been created. Because the name is visible to transaction Tid10, Tid10 is free to open the file and delete it. If Tid10 opens the file for writing and then deletes it, the isolation entries after the delete are as follows:

Name: YisAVeryLongName;
Tid: 10;
(Visible to Tid: FALSE, Visible to others: FALSE);
LongName: TRUE;
pairedNamePtr: Ptr to the short-name entry;
Reserved: TRUE;
Fid: INVALID_ID;
No duplicated information.

Name: YisAVery;
LongName: FALSE;
pairedNamePtr: Ptr to the long-name entry.

These entries reserve the names for the transaction but make them invisible to every transaction. Note that the reservation is performed so that rollback can work.
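The second and third examples differ only in how the flags and the saved Fid end up after the delete. The sketch below captures those two outcomes as state transitions on a reduced entry. It is an illustration under assumptions: `IsoState`, `iso_delete_existing`, `iso_delete_own_create` and the value of `INVALID_ID` are hypothetical, and the name-pair pointers are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define INVALID_ID 0ULL             /* hypothetical value for "no Fid saved" */

/* Hypothetical reduced isolation entry: flags and saved file id only. */
typedef struct {
    unsigned long      tid;
    bool               visible_to_tid;
    bool               visible_to_others;
    bool               reserved;
    unsigned long long fid;
    bool               has_dup_info;
} IsoState;

/* Second example: `tid` deletes a file that existed before the transaction.
   The deleter no longer sees it; others still do, through the saved Fid. */
IsoState iso_delete_existing(unsigned long tid, unsigned long long saved_fid)
{
    assert(saved_fid != INVALID_ID);
    IsoState entry = {
        .tid = tid,
        .visible_to_tid = false,
        .visible_to_others = true,
        .reserved = true,
        .fid = saved_fid,
        .has_dup_info = true,
    };
    return entry;
}

/* Third example: the transaction deletes a file it created itself.
   The entry must be in the state left by the create (first example). */
void iso_delete_own_create(IsoState *entry)
{
    assert(entry != NULL);
    assert(entry->visible_to_tid && !entry->visible_to_others);

    entry->visible_to_tid = false;  /* now invisible to every transaction    */
    entry->reserved = true;         /* name stays reserved so rollback works */
    entry->fid = INVALID_ID;        /* there is no older file to redirect to */
    entry->has_dup_info = false;
}
```

Keeping the two visibility flags as separate booleans is what allows the "invisible to everyone, yet reserved" state of the third example.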

Floating memory-mapped sections

Another aspect of the present invention is directed to solving a problem in which an application performs memory mapping on one or more files opened for write access and is unaware that the transaction of which the application is a part has aborted (or committed). This can occur, for example, when a distributed transaction aborts on another network node. The application may also be poorly behaved or malicious at that time.

When an application performs memory mapping on a file opened for write access and is unaware that its associated transaction has aborted (or committed), and/or is poorly behaved or malicious, another writer can open the still memory-mapped file for write access. A conflict over the file data can then occur, because multiple simultaneous writers exist. More particularly, memory mapping, when performed by an application, refers to using a section object (a block of shared memory) to map a file into a process address space. If the application modifies a page, the memory manager may write the change back to the file on disk during normal paging operations, or the application may directly cause a flush. Although this is undesirable in a transactional environment, applications are permitted to perform memory mapping, and can thus cause writes to a file that has been opened for write access by another transactional application.

The file system, which knows when a transaction commits or aborts and which, for example, cleans up the data structures affected by that transaction, can query the memory manager to determine whether the transaction's application process (or processes) is memory mapped, i.e., whether a section handle has been created. If any such application exists, the file system, which does not know the application's operating state, cannot directly shut down the application or ensure that it will not continue writing to the mapped section.

FIG. 23 shows one way in which the file system 62 prevents an application 180 (which is no longer part of a transaction) from writing to a mapped file that is opened for write access by another application 182. To this end, the file system adjusts the section control block (SCB) 188 so that the file objects 184, 186 of the respective applications 180, 182 point to distinct section object pointers 190, 192. The section object pointer 190 of the invalid transactional application 1 (180) is empty, while the section object pointer 192 of the valid transactional application 2 (182) holds a pointer to the memory 196 for that application 182. This makes the memory section 194 floating.

The invalid transactional application 180 can continue to read from or write to the floating memory section, but the section no longer corresponds to the file. Meanwhile, whenever a page is faulted in by the cache/memory manager 114 via the file system 62 on behalf of the valid application 182, the appropriate virtual memory page 198 (and thus the memory 196 used by the application 182) is filled with data from the transactionally correct file (e.g., the correct page as maintained in the TOPS stream version), or with data from the file on disk. Similarly, when instructed by the memory manager 114, the file system 62 writes pages changed by the valid application to the disk 112.

However, for pages in the section mapped to the invalid application 180, any write request from the memory manager 114 that reaches the file system 62 for the memory section 194 is accepted by the file system 62 but is not actually written to disk. The mapped memory is thus a floating section: writes to memory are allowed, but the changes are never flushed to disk. Requests by the memory manager 114 to fault in pages from the disk 112 result in zeros being returned. This version of the section 194 is therefore no longer backed by the file on disk. In this manner, the file data of the valid transactional application is isolated from data changes made to the mapped file by the invalid application.
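The floating-section behavior described above comes down to two paging-I/O rules: writes from a floating section are accepted and dropped, and page-ins for it return zeros. The following C sketch models just those two rules. Everything in it is an assumption made for illustration (the `Section` type, the `fs_page_write`/`fs_page_read` names, an in-memory array standing in for the disk, and a 4096-byte page); the TOPS stream lookup for the valid application is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096       /* assumed page size for the sketch */

/* Hypothetical view of a mapped section as seen by the file system. */
typedef struct {
    bool           floating;        /* detached from the on-disk file      */
    unsigned char *disk;            /* stand-in for the file's disk blocks */
    size_t         disk_pages;
} Section;

/* Paging write arriving from the memory manager.
   Always reports success; for a floating section nothing reaches disk. */
bool fs_page_write(Section *section, size_t page, const unsigned char *data)
{
    assert(section != NULL && data != NULL && page < section->disk_pages);

    if (!section->floating)
        memcpy(section->disk + page * SKETCH_PAGE_SIZE, data, SKETCH_PAGE_SIZE);
    return true;                    /* accepted either way */
}

/* Page-in request from the memory manager.
   A floating section is no longer backed by the file, so it gets zeros. */
void fs_page_read(const Section *section, size_t page, unsigned char *out)
{
    assert(section != NULL && out != NULL && page < section->disk_pages);

    if (section->floating)
        memset(out, 0, SKETCH_PAGE_SIZE);
    else
        memcpy(out, section->disk + page * SKETCH_PAGE_SIZE, SKETCH_PAGE_SIZE);
}
```

Reporting success for the dropped write is the point of the design: the invalid application keeps running against its private memory, while the file data seen by the valid application is untouched.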

Alternatively, the mapped section of memory can be changed to no-access or read-only with respect to the invalid application, so that writes by the invalid application cause access violations. If reads are allowed, the invalid application may see changes made by the valid application whenever such changes are faulted into the section 194.

Note that any of the above solutions may cause the invalid application 180 to crash, while the data of the valid application 182 remains properly isolated. To avoid crashing the invalid application 180, the changes it makes could be written to another file on disk; at present, however, supporting such post-transaction versions is considered an unnecessary added overhead for such applications.

TxF日志记录格式<pre listing-type="program-listing"><![CDATA[ // log record types that are known to the recovery manager. typedef enum{ TxfLogRecTypeRedo, TxfLogRecTypeUndo, TxfLogRecTypePrepare, TxfLogRecTypeAbort, TxfLogRecTypeCommit, } TXF_LOGREC_TYPE; typedef enum { TxfLogRecActionCreateFile, TxfLogRecActionDeleteFile, TxfLogRecActionWriteFile, TxfLogRecActionOverwriteFile, TxfLogRecActionFcbInfoUpdateFile, TxfLogRecActionTemporaryBitChangeFile, TxfLogRecActionUpdateDupInfo, TxfLogRecActionTruncateFile, TxfLogRecActionRestoreFileSizes, TxfLogRecActionCancelRecord, TxfLogRecActionTestPrint]]></pre> <pre listing-type="program-listing"><![CDATA[ } TXF_LOGREC_ACTION; typedef struct { TXF_LOGREC_TYPE Type; TXF_LOGREC_ACTION Action; TXF_TRANS_ID TransId; } TXF_LOGREC,*PTXF_LOGREC; /* typedef struct { TXF_LOGREC_HDR header; char data[1]; } TXF_LOGREC,*PTXF_LOGREC;*/ // // Delete File log record. // // // The Long name and the short name are laid out // immediately after the record. // typedef struct_TXF_DELETE_FILE_UNDO_LOGREC { TXF_LOGREC Header; // // See below for flag values // USHORT Flags; // // ShortNameLength is 0 if there's no short name.]]></pre> <pre listing-type="program-listing"><![CDATA[ // The short name begins right after the // FileName.FileName ends. // It's at PWCHAR FileName.FileName + // FileName.FileNameLength. // ShortNameLength is in unicode chars. // USHORT ShortNameLength; // // MungedFileNumber to which the rename happened. // ULONG MungedFileNumber; // // The Txf subdirectory to which the rename happened. // ULONG SubDirNumber; // // The long/combined name with valid dup info,parent // directory,length // etc. // FILE_NAME FileName; // // Don't add any fields after this. //]]></pre> <pre listing-type="program-listing"><![CDATA[ } *PTXF_DELETE_FILE_UNDO_LOGREC, TXF_DELETE_FILE_UNDO_LOGREC; // // TRUE if the file is a directory. // #define TXF_DELETE_FILE_UNDO_FLAGS_DIRECTORY 0x01 // // TRUE if this delete operation had stored the Fid flags. 
// #define TXF_DELETE_FILE_UNDO_FLAGS_FID_STORED 0x02 // // IgnoreCase flag for the CCB that opened the name for // delete. // #define TXF_DELETE_FILE_UNDO_FLAGS_IGNORE_CASE 0x04 // // Create-File undo log record. // // The Long name and the short name are laid out // immediately after the record. // typedef struct_TXF_CREATE_FILE_UNDO_LOGREC{ TXF_LOGREC Header; FILE_REFERENCE ParentFid;]]></pre> <pre listing-type="program-listing"><![CDATA[ // // LongNameLength is in unicode characters. // USHORT LongNameLength; // // LongNameOffset=sizeof(struct // _TXF_CREATE_FILE_UNDO_LOGREC) // // // See below for flag values // USHORT Flags; // // ShortNameLength is 0 if there's no short name. // Length is in unicode chars. // USHORT ShortNameLength; // // ShortNameOffset is sizeof(struct // _TXF_CREATE_FILE_UNDO_LOGREC)+ // LongNameLength*sizeof(WCHAR) // USHORT Reserved1; ULONG Reserved2;]]></pre> <pre listing-type="program-listing"><![CDATA[ } *PTXF_CREATE_FILE_UNDO_LOGREC, TXF_CREATE_FILE_UNDO_LOGREC; // // TRUE if the file is a directory. // #define TXF_CREATE_FILE_UNDO_FLAGS_DIRECTORY 0x01 // // IgnoreCase flag for the CCB that created the name. // #define TXF_CREATE_FILE_UNDO_FLAGS_IGNORE_CASE 0x02 // // Overwrite-File undo log record. // typedef struct_TXF_OVERWRITE_FILE_UNDO_LOGREC { TXF_LOGREC Header; // // File reference of the file that was overwritten // FILE_REFERENCE Fid; // // File reference of the TxF file that was created in // the TxF directory.]]></pre> <pre listing-type="program-listing"><![CDATA[ // FILE_REFERENCE TxfFileFid; // // MungedFileNumber of the TxF file that was created in // the TxF directory. // ULONG MungedFileNumber; // // The Txf subdirectory in which the TxF file was // created. // ULONG SubDirNumber; USHORT Flags; USHORT Reserved1; ULONG Reserved2; } *PTXF_OVERWRITE_FILE_UNDO_LOGREC, TXF_OVERWRITE_FILE_UNDO_LOGREC; // // FcbInfoUpdate undo log record.It is undone // unconditionally without checking the TxfLsn in the // standard-info. 
// typedef struct_TXF_FCB_INFO_UPDATE_UNDO_LOGREC {]]></pre> <pre listing-type="program-listing"><![CDATA[ TXF_LOGREC Header; // // File reference of the file that was overwritten // FILE_REFERENCE Fid; // // Fcb Info to be restored on undo. // DUPLICATED_INFORMATION FcbInfo; } *PTXF_FCB_INFO_UPDATE_UNDO_LOGREC, TXF_FCB_INFO_UPDATE_UNDO_LOGREC; // // FcbInfoUpdate undo log record.It is undone // unconditionally without checking the TxfLsn in the // standard-info. // typedef struct_TXF_TEMPORARY_BIT_CHANGE_UNDO_LOGREC { TXF_LOGREC Header; // // File reference of the file that was overwritten //]]></pre> <pre listing-type="program-listing"><![CDATA[ FILE_REFERENCE Fid; ULONG PrevicusBitValue; // // Attribute name length is 0 if this is the default // data stream. // Length is in unicode chars. // Attribute name follows the log record,if present. // USHORT AttrNameLength; WCHAR AttrName[1]; } *PTXF_TEMPORARY_BIT_CHANGE_UNDO_LOGREC, TXF_TEMPORARY_BIT_CHANGE_UNDO_LOGREC; // // UpdateDupInfo undo log record. // // The Long name is laid out immediately after the record. // typedef struct_TXF_UPDATE_DUPINFO_UNDO_LOGREC { TXF_LOGREC Header; // // Fid of the parent directory. // FILE_REFERENCE ParentFid;]]></pre> <pre listing-type="program-listing"><![CDATA[ // // LongNameLength is in unicode characters. // USHORT LongNameLength; // // See below for flags. // USHORT Flags; // // Duplicated information. // DUPLICATED_INFORMATION DupInfo; WCHAR LongName[1]; } *PTXF_UPDATE_DUPINFO_UNDO_LOGREC, TXF_UPDATE_DUPINFO_UNDO_LOGREC; #define TXF_UPDATE_DUPINFO_UNDO_FLAGS_DIRECTORY 0x0001 // // Truncate undo log record. // // The attribute name is laid out immediately after the // record. // typedef struct_TXF_TRUNCATION_UNDO_LOGREC {]]></pre> <pre listing-type="program-listing"><![CDATA[ TXF_LOGREC Header; // // Fid of the file. // FILE_REFERENCE Fid; LONGLONG ValidDataLength; LONGLONG FileSize; // // Attribute name length is 0 if this is the default // data stream. 
// Length is in unicode chars. // Attribute name fc lows the log record,if present. // USHORT AttrNameLength; WCHAR AttrName[1]; } *PTXF_TRUNCATION_UNDO_LOGREC,TXF_TRUNCATION_UNDO_LOGREC; // // Restore file sizes undo log record. // // The attribute name is laid out immediately after the // record. // typedef struct_TXF_RESTORE_FILE_SIZES_UNDO_LOGREc { TXF_LOGREC Header;]]></pre> <pre listing-type="program-listing"><![CDATA[ // // Fid of the file. // FILE_REFERENCE Fid; LONGLONG ValidDataLength; LONGLONG FileSize; // // Attribute name length is 0 if this is the default // data stream. // Length is in unicode chars. // Attribute name follows the log record,if present. // USHORT AttrNameLength; WCHAR AttrName[1]; } *PTXF_RESTORE_FILE_SIZES_UNDO_LOGREC, TXF_RESTORE_FILE_SIZES_UNDO_LOGREC; // // Define the format of the Change Table entries,and some // related contents. // #define TOPS_SECTOR_SIZE (512) #define TOPS_PAGE_SIZE (4096) #define TOPS_PAGE_SHIFT (12)]]></pre> <pre listing-type="program-listing"><![CDATA[ #define TOPS_SECTORS_PER_PAGE (TOPS_PAGE_SIZE/ TOPS_SECTOR_SIZE) #define TOPS_MAXIMUM_FLUSH_SIZE (0x10000) typedef struct_CHANGE_ENTRY { // // These two fields describe the virtual address of the // displaced range of the stream. // ULONGLONG VirtualPageNumber; ULONG NumberPages; // // This is the starting page number in the Tops stream // to where the old pages were written. // ULONG TopsPageNumber; // // This is the Lsn of the log record describing this // change. // CLFS_LSN Lsn; // // SequenceNumber being written into all bytes of the // undo pages covered // by this change. //]]></pre> <pre listing-type="program-listing"><![CDATA[ UCHAR SequenceNumber; // // May as well reserve bytes here for alignment,since // the size will always round to quad word anyway. // UCHAR Reserved[7]; // // Finally,these are the displaced bytes of data, // allowing torn write detection in the Tops stream. 
// Enough are allocated here for one page,yet // additional bytes will be allocated if NumberPages is // greater than one. // UCHAR DisplacedBytes[TOPS_SECTORS_PER_PAGE]; } CHANGE_ENTRY,*PCHANGE_ENTRY; // // Create-File undo log record. // // The Long name and the short name are laid out // immediately after the record. // typedef struct_TXF_WRITE_FILE_UNDO_LOGREC { TXF_LOGREC Header;]]></pre> <pre listing-type="program-listing"><![CDATA[ // // File Reference for file undo data was captured from. // FILE_REFERENCE FileReference; // // Describe where the undo data was written and store // the displaced bytes which were replaced by a // sequence number. // CHANGE_ENTRY ChangeEntry; } TXF_WRITE_FILE_UNDO_LOGREC,*PTXF_WRITE_FILE_UNDO_LOGREC;]]></pre>如上面详细描述可见,提供了一个事务处理文件系统及方法,使得应用程序能容易地对一个或多个文件执行多重事务操作。 TxF logging format <pre listing-type = "program-listing"> <[CDATA [// log record types that are known to the recovery manager typedef enum {TxfLogRecTypeRedo, TxfLogRecTypeUndo, TxfLogRecTypePrepare, TxfLogRecTypeAbort, TxfLogRecTypeCommit,} TXF_LOGREC_TYPE!.; typedef enum {TxfLogRecActionCreateFile, TxfLogRecActionDeleteFile, TxfLogRecActionWriteFile, TxfLogRecActionOverwriteFile, TxfLogRecActionFcbInfoUpdateFile, TxfLogRecActionTemporaryBitChangeFile, TxfLogRecActionUpdateDupInfo, TxfLogRecActionTruncateFile, TxfLogRecActionRestoreFileSizes, TxfLogRecActionCancelRecord, TxfLogRecActionTestPrint]]> </ pre> <pre listing-type = "program-listing"> <! [CDATA [} TXF_LOGREC_ACTION ; typedef struct {TXF_LOGREC_TYPE Type; TXF_LOGREC_ACTION Action; TXF_TRANS_ID TransId;} TXF_LOGREC, * PTXF_LOGREC; / * typedef struct {TXF_LOGREC_HDR header; char data [1];} TXF_LOGREC, * PTXF_LOGREC; * / // // Delete File log record. // // // The Long name and the short name are laid out // immediately after the record // typedef struct_TXF_DELETE_FILE_UNDO_LOGREC {TXF_LOGREC Header;. 
// // See below for flag values // USHORT Flags; // // ShortNameLength is 0 if there's no short name.]]> </ pre> <pre listing-type = "program-listing"> <! [CDATA [// The short name begins right after the // FileName.FileName ends. // It's at PWCHAR FileName.FileName + // FileName.FileNameLength // ShortNameLength is in unicode chars // USHORT ShortNameLength;.. // // MungedFileNumber to which the rename happened // ULONG MungedFileNumber;. // // The Txf subdirectory to . which the rename happened // ULONG SubDirNumber; // // The long / combined name with valid dup info, parent // directory, length // etc. // FILE_NAME FileName; // // Do not add any fields after . this //]]> </ pre> <pre listing-type = "program-listing"> <[CDATA [} * PTXF_DELETE_FILE_UNDO_LOGREC, TXF_DELETE_FILE_UNDO_LOGREC;! // // TRUE if the file is a directory // #define. TXF_DELETE_FILE_UNDO_FLAGS_DIRECTORY 0x01 // // TRUE if this delete operation had stored the Fid flags. // #define TXF_DELETE_FILE_UNDO_FLAGS_FID_STORED 0x02 // // IgnoreCase flag for the CCB that opened the name for // delete. // #define TXF_DELETE_FILE_UNDO_FLAGS_IGNORE_CASE 0x04 // / . / Create-File undo log record // // The Long name and the short name are laid out // immediately after the record // typedef struct_TXF_CREATE_FILE_UNDO_LOGREC {TXF_LOGREC Header;. FILE_REFERENCE ParentFid;]]> </ pre> <pre listing -type = "program-listing"> <[CDATA [// // LongNameLength is in unicode characters // USHORT LongNameLength;!. // // LongNameOffset = sizeof (struct // _TXF_CREATE_FILE_UNDO_LOGREC) // // // See below for flag values // USHORT Flags; // // ShortNameLength is 0 if there's no short name // Length is in unicode chars // USHORT ShortNameLength;.. // // ShortNameOffset is sizeof (struct // _TXF_CREATE_FILE_UNDO_LOGREC) + // LongNameLength * sizeof (WCHAR) // USHORT Reserved1; ULONG Reserved2;]]> </ pre> <pre listing-type = "program-listing"> <[CDATA [} * PTXF_CREATE_FILE_UNDO_LOGREC, TXF_CREATE_FILE_UNDO_LOGREC;! 
// // TRUE if the file is a directory. // #define TXF_CREATE_FILE_UNDO_FLAGS_DIRECTORY 0x01 // // IgnoreCase flag for the CCB that created the name. // #define TXF_CREATE_FILE_UNDO_FLAGS_IGNORE_CASE 0x02 // // Overwrite-File undo log record. // typedef struct_TXF_OVERWRITE_FILE_UNDO_LOGREC {TXF_LOGREC Header ; // // File reference of the file that was overwritten // FILE_REFERENCE Fid; // // File reference of the TxF file that was created in // the TxF directory]]> </ pre> <pre listing-type. = "program-listing"> <[CDATA [// FILE_REFERENCE TxfFileFid;! // // MungedFileNumber of the TxF file that was created in // the TxF directory // ULONG MungedFileNumber;. // // The Txf subdirectory in which the TxF file was // created // ULONG SubDirNumber;. USHORT Flags; USHORT Reserved1; ULONG Reserved2;} * PTXF_OVERWRITE_FILE_UNDO_LOGREC, TXF_OVERWRITE_FILE_UNDO_LOGREC; // // FcbInfoUpdate undo log record.It is undone // unconditionally without checking the TxfLsn in the / / standard-info // typedef struct_TXF_FCB_INFO_UPDATE_UNDO_LOGREC {]]> </ pre> <pre listing-type = "program-listing"> <[CDATA [TXF_LOGREC Header;.! // // File reference of the file that was overwritten / / FILE_REFERENCE Fid; // // Fcb Info to be restored on undo // DUPLICATED_INFORMATION FcbInfo;.} * PTXF_FCB_INFO_UPDATE_UNDO_LOGREC, TXF_FCB_INFO_UPDATE_UNDO_LOGREC; // // FcbInfoUpdate undo log record.It is undone // unconditionally without checking the TxfLsn in the // . standard-info // typedef struct_TXF_TEMPORARY_BIT_CHANGE_UNDO_LOGREC {TXF_LOGREC Header; // // File reference of the file that was overwritten //]]> </ pre> <pre listing-type = "program-listing"> <[CDATA [! 
FILE_REFERENCE Fid; ULONG PrevicusBitValue; // // Attribute name length is 0 if this is the default // data stream // Length is in unicode chars // Attribute name follows the log record, if present // USHORT AttrNameLength...; WCHAR AttrName [1];} * PTXF_TEMPORARY_BIT_CHANGE_UNDO_LOGREC, TXF_TEMPORARY_BIT_CHANGE_UNDO_LOGREC; // // UpdateDupInfo undo log record // // The Long name is laid out immediately after the record // typedef struct_TXF_UPDATE_DUPINFO_UNDO_LOGREC {TXF_LOGREC Header;.. // // Fid of . the parent directory // FILE_REFERENCE ParentFid;]]> </ pre> <pre listing-type = "program-listing"> <[CDATA [// // LongNameLength is in unicode characters // USHORT LongNameLength;!. // // See below for flags // USHORT Flags;. // // Duplicated information // DUPLICATED_INFORMATION DupInfo;. WCHAR LongName [1];} * PTXF_UPDATE_DUPINFO_UNDO_LOGREC, TXF_UPDATE_DUPINFO_UNDO_LOGREC; #define TXF_UPDATE_DUPINFO_UNDO_FLAGS_DIRECTORY 0x0001 // // Truncate undo log record /. / // The attribute name is laid out immediately after the // record // typedef struct_TXF_TRUNCATION_UNDO_LOGREC {]]> </ pre> <pre listing-type = "program-listing"> <[CDATA [TXF_LOGREC Header;.! // . // Fid of the file // FILE_REFERENCE Fid; LONGLONG ValidDataLength; LONGLONG FileSize; // // Attribute name length is 0 if this is the default // data stream // Length is in unicode chars // Attribute name fc.. lows the log record, if present // USHORT AttrNameLength;. WCHAR AttrName [1];} * PTXF_TRUNCATION_UNDO_LOGREC, TXF_TRUNCATION_UNDO_LOGREC; // // Restore file sizes undo log record // // The attribute name is laid out immediately after the /. / record // typedef struct_TXF_RESTORE_FILE_SIZES_UNDO_LOGREc {TXF_LOGREC Header;.]]> </ pre> <pre listing-type = "program-listing"> <[CDATA [// // Fid of the file // FILE_REFERENCE Fid;!. 
LONGLONG ValidDataLength; LONGLONG FileSize; // // Attribute name length is 0 if this is the default // data stream // Length is in unicode chars // Attribute name follows the log record, if present // USHORT AttrNameLength;... WCHAR AttrName [1];} * PTXF_RESTORE_FILE_SIZES_UNDO_LOGREC, TXF_RESTORE_FILE_SIZES_UNDO_LOGREC; // // Define the format of the Change Table entries, and some // related contents // #define TOPS_SECTOR_SIZE (512) #define TOPS_PAGE_SIZE (4096) #define TOPS_PAGE_SHIFT (12. )]]> </ pre> <pre listing-type = "program-listing"> <! [CDATA [#define TOPS_SECTORS_PER_PAGE (TOPS_PAGE_SIZE / TOPS_SECTOR_SIZE) #define TOPS_MAXIMUM_FLUSH_SIZE (0x10000) typedef struct_CHANGE_ENTRY {// // These two fields describe the virtual address of the // displaced range of the stream // ULONGLONG VirtualPageNumber;. ULONG NumberPages; // // This is the starting page number in the Tops stream // to where the old pages were written // ULONG TopsPageNumber.; // // This is the Lsn of the log record describing this // change // CLFS_LSN Lsn;.. // // SequenceNumber being written into all bytes of the // undo pages covered // by this change //]] > </ pre> <pre listing-type = "program-listing"> <[CDATA [UCHAR SequenceNumber;! // // May as well reserve bytes here for alignment, since // the size will always round to quad word anyway . // UCHAR Reserved [7];. // // Finally, these are the displaced bytes of data, // allowing torn write detection in the Tops stream // Enough are allocated here for one page, yet // additional bytes will be allocated if NumberPages is // greater than one // UCHAR DisplacedBytes [TOPS_SECTORS_PER_PAGE];.} CHANGE_ENTRY, * PCHANGE_ENTRY; // // Create-File undo log record // // The Long name and the short name are laid out. . // immediately after the record // typedef struct_TXF_WRITE_FILE_UNDO_LOGREC {TXF_LOGREC Header;]]> </ pre> <pre listing-type = "program-listing"> <[CDATA [// // File Reference for file undo data was! 
    // captured from
    //
    FILE_REFERENCE FileReference;
    //
    // Describe where the undo data was written and store
    // the displaced bytes which were replaced by a
    // sequence number
    //
    CHANGE_ENTRY ChangeEntry;
} TXF_WRITE_FILE_UNDO_LOGREC, *PTXF_WRITE_FILE_UNDO_LOGREC;
]]> </ pre>

As can be seen from the foregoing detailed description, there is provided a transactional file system and method that allow applications to easily perform multiple transactional operations on one or more files. Multiple file system operations are tied together within the file system in a transactional manner, so that either the operations are committed together or any partial actions are undone. Moreover, the operations and data changes of one transaction are isolated from the operations and data changes of other transactions. Thus, for example, the present invention can update a web site as a single transaction handled by the file system component, in a fast, efficient and safe manner. At the same time, in-progress changes remain isolated from one another until the transaction commits.
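
The comments on `CHANGE_ENTRY` in the listing above describe a sequence number stamped into the undo pages and a `DisplacedBytes` array sized at one byte per sector, which together allow torn writes to be detected. The following is only a minimal sketch of how such a scheme might work, not the patent's implementation. It assumes that the stamped byte is the last byte of each 512-byte sector (the listing does not say which byte is displaced), and the names `STAMP_OFFSET`, `stamp_page` and `verify_and_restore_page` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOPS_SECTOR_SIZE (512)
#define TOPS_PAGE_SIZE (4096)
#define TOPS_SECTORS_PER_PAGE (TOPS_PAGE_SIZE / TOPS_SECTOR_SIZE)

/* Assumption: the stamp occupies the last byte of every sector. */
#define STAMP_OFFSET (TOPS_SECTOR_SIZE - 1)

/*
 * Before a page is written out, save one byte from each sector into
 * displaced[] and overwrite that byte with the sequence number.
 */
static void stamp_page(uint8_t *page, uint8_t seq,
                       uint8_t displaced[TOPS_SECTORS_PER_PAGE])
{
    assert(page != NULL && displaced != NULL);
    for (int s = 0; s < TOPS_SECTORS_PER_PAGE; s++) {
        uint8_t *slot = page + (size_t)s * TOPS_SECTOR_SIZE + STAMP_OFFSET;
        displaced[s] = *slot;   /* remember the original data byte */
        *slot = seq;            /* stamp the sector */
    }
}

/*
 * After a page is read back, check that every sector carries the
 * expected sequence number. If so, put the original bytes back and
 * return 1. If any sector has a different stamp, the write was torn:
 * return 0 and leave the page untouched.
 */
static int verify_and_restore_page(uint8_t *page, uint8_t seq,
                                   const uint8_t displaced[TOPS_SECTORS_PER_PAGE])
{
    assert(page != NULL && displaced != NULL);
    for (int s = 0; s < TOPS_SECTORS_PER_PAGE; s++) {
        if (page[(size_t)s * TOPS_SECTOR_SIZE + STAMP_OFFSET] != seq) {
            return 0;           /* at least one sector is stale */
        }
    }
    for (int s = 0; s < TOPS_SECTORS_PER_PAGE; s++) {
        page[(size_t)s * TOPS_SECTOR_SIZE + STAMP_OFFSET] = displaced[s];
    }
    return 1;
}
```

In this sketch a caller would stamp each undo page before writing it, keep the eight displaced bytes in the change entry, and call `verify_and_restore_page` with the same sequence number when the page is read back during undo. A one-byte stamp cannot catch a stale sector whose old stamp happens to equal the new sequence number, so this is a probabilistic check rather than a guarantee.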

While the invention is susceptible to various modifications and alternative constructions, certain illustrated embodiments thereof are shown in the drawings and have been described above in detail. It should be understood, however, that there is no intention to limit the invention to the specific form or forms disclosed; on the contrary, the invention is intended to cover all modifications, alternative constructions and equivalents falling within the spirit and scope of the invention.

Classifications
International Classification: G06F9/46, G06F12/00, G06F17/30
Cooperative Classification: Y10S707/99931, Y10S707/99952, Y10S707/99937, Y10S707/959, Y10S707/99932, Y10S707/99942, Y10S707/99938, Y10S707/99944, Y10S707/99953, G06F17/30227
European Classification: G06F17/30F8T
Legal Events
Date         Code  Event                                                            Description
15 Oct 2003  C06   Publication
17 Dec 2003  C10   Request of examination as to substance
12 Sep 2007  C14   Granted
20 May 2015  ASS   Succession or assignment of patent right                         Owner name: MICROSOFT TECHNOLOGY LICENSING LLC; Free format text: FORMER OWNER: MICROSOFT CORP.; Effective date: 20150504
20 May 2015  C41   Transfer of the right of patent application or the patent right