US20070233710A1 - Systems and methods for notifying listeners of events - Google Patents

Systems and methods for notifying listeners of events

Info

Publication number
US20070233710A1
US20070233710A1 (Application US11/396,282)
Authority
US
United States
Prior art keywords
listening
inode
event
file
files
Prior art date
Legal status
Granted
Application number
US11/396,282
Other versions
US7756898B2
Inventor
Aaron Passey
Neal Fachan
Current Assignee
EMC Corp
Isilon Systems LLC
Original Assignee
Individual
Priority date
Filing date
Publication date
Application filed by Individual
Priority to US11/396,282 (US7756898B2)
Assigned to ISILON SYSTEMS, INC.: ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: FACHAN, NEAL T.; PASSEY, AARON J.
Publication of US20070233710A1
Priority to US12/789,393 (US8005865B2)
Application granted
Publication of US7756898B2
Assigned to ISILON SYSTEMS LLC: MERGER (SEE DOCUMENT FOR DETAILS). Assignors: ISILON SYSTEMS, INC.
Assigned to IVY HOLDING, INC.: ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: ISILON SYSTEMS LLC
Assigned to EMC CORPORATION: ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: IVY HOLDING, INC.
Assigned to THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT: SECURITY AGREEMENT. Assignors: ASAP SOFTWARE EXPRESS, INC., AVENTAIL LLC, CREDANT TECHNOLOGIES, INC., DELL INTERNATIONAL L.L.C., DELL MARKETING L.P., DELL PRODUCTS L.P., DELL SOFTWARE INC., DELL SYSTEMS CORPORATION, DELL USA L.P., EMC CORPORATION, EMC IP Holding Company LLC, FORCE10 NETWORKS, INC., MAGINATICS LLC, MOZY, INC., SCALEIO LLC, SPANNING CLOUD APPS LLC, WYSE TECHNOLOGY L.L.C.
Assigned to CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH, AS COLLATERAL AGENT: SECURITY AGREEMENT. Assignors: ASAP SOFTWARE EXPRESS, INC., AVENTAIL LLC, CREDANT TECHNOLOGIES, INC., DELL INTERNATIONAL L.L.C., DELL MARKETING L.P., DELL PRODUCTS L.P., DELL SOFTWARE INC., DELL SYSTEMS CORPORATION, DELL USA L.P., EMC CORPORATION, EMC IP Holding Company LLC, FORCE10 NETWORKS, INC., MAGINATICS LLC, MOZY, INC., SCALEIO LLC, SPANNING CLOUD APPS LLC, WYSE TECHNOLOGY L.L.C.
Assigned to EMC IP Holding Company LLC: ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: EMC CORPORATION
Assigned to THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A.: SECURITY AGREEMENT. Assignors: CREDANT TECHNOLOGIES, INC., DELL INTERNATIONAL L.L.C., DELL MARKETING L.P., DELL PRODUCTS L.P., DELL USA L.P., EMC CORPORATION, EMC IP Holding Company LLC, FORCE10 NETWORKS, INC., WYSE TECHNOLOGY L.L.C.
Assigned to THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A.: SECURITY AGREEMENT. Assignors: CREDANT TECHNOLOGIES INC., DELL INTERNATIONAL L.L.C., DELL MARKETING L.P., DELL PRODUCTS L.P., DELL USA L.P., EMC CORPORATION, EMC IP Holding Company LLC, FORCE10 NETWORKS, INC., WYSE TECHNOLOGY L.L.C.
Assigned to MAGINATICS LLC, DELL SYSTEMS CORPORATION, EMC IP Holding Company LLC, MOZY, INC., ASAP SOFTWARE EXPRESS, INC., DELL PRODUCTS L.P., DELL USA L.P., EMC CORPORATION, DELL SOFTWARE INC., CREDANT TECHNOLOGIES, INC., DELL MARKETING L.P., DELL INTERNATIONAL, L.L.C., FORCE10 NETWORKS, INC., AVENTAIL LLC, WYSE TECHNOLOGY L.L.C., SCALEIO LLC: RELEASE BY SECURED PARTY (SEE DOCUMENT FOR DETAILS). Assignors: CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH
Assigned to DELL INTERNATIONAL L.L.C., DELL MARKETING CORPORATION (SUCCESSOR-IN-INTEREST TO FORCE10 NETWORKS, INC. AND WYSE TECHNOLOGY L.L.C.), EMC CORPORATION (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO MAGINATICS LLC), DELL PRODUCTS L.P., SCALEIO LLC, DELL MARKETING L.P. (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO CREDANT TECHNOLOGIES, INC.), DELL MARKETING CORPORATION (SUCCESSOR-IN-INTEREST TO ASAP SOFTWARE EXPRESS, INC.), DELL USA L.P., EMC IP HOLDING COMPANY LLC (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO MOZY, INC.): RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001). Assignors: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT
Assigned to DELL INTERNATIONAL L.L.C., DELL USA L.P., EMC CORPORATION (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO MAGINATICS LLC), EMC IP HOLDING COMPANY LLC (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO MOZY, INC.), DELL MARKETING L.P. (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO CREDANT TECHNOLOGIES, INC.), DELL MARKETING CORPORATION (SUCCESSOR-IN-INTEREST TO ASAP SOFTWARE EXPRESS, INC.), DELL PRODUCTS L.P., DELL MARKETING CORPORATION (SUCCESSOR-IN-INTEREST TO FORCE10 NETWORKS, INC. AND WYSE TECHNOLOGY L.L.C.), SCALEIO LLC: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001). Assignors: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT
Status: Active


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/54: Interprogram communication
    • G06F 9/542: Event management; Broadcasting; Multicasting; Notifications
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/10: File systems; File servers
    • G06F 16/17: Details of further file system functions
    • G06F 16/1734: Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/10: File systems; File servers
    • G06F 16/18: File system types
    • G06F 16/182: Distributed file systems

Definitions

  • This invention relates generally to systems and methods of notifying listeners of events.
  • The distributed architecture allows multiple nodes to process incoming requests. Accordingly, different process requests may be handled by different nodes. Problems may occur, however, when one of the nodes modifies information that affects other nodes.
  • the systems and methods generally relate to notifying listeners of events.
  • an event listening system may include a file system including a plurality of files, the plurality of files logically stored in a tree; for each of the plurality of files, a first data structure configured to track a set of listening files that are listening for events that affect the corresponding file; a plurality of processes that each listen for events that affect at least one of the plurality of files; a second data structure configured to track, for each of the plurality of files, which of the plurality of processes are listening to each of the files; a listening module configured to receive an identifier for a first file of the plurality of files and to determine whether the first file is relevant to any of the plurality of processes using the first data structure and the second data structure; a traverse module configured to traverse a first set of first data structures that correspond to a subset of the plurality of files that represent one branch of the tree; and an update module configured to update at least one of the corresponding first data structures of the file in at least one traversed level by reviewing a scope of at least one of the listening files of the first data structure that corresponds to the file's parent.
  • a method for listening for events may include logically storing a plurality of files in a tree; for each of the plurality of files, tracking a set of listening files that are listening for events that affect the corresponding file; storing a plurality of processes that each listen for events that affect at least one of the plurality of files; for each of the plurality of files, tracking which of the plurality of processes are listening to each of the files; receiving an identifier for a first file of the plurality of files; determining whether the first file is relevant to any of the plurality of processes using the first data structure and the second data structure; traversing a first set of first data structures that correspond to a subset of the plurality of files that represent one branch of the tree; and updating at least one of the corresponding first data structures of the file in at least one traversed level, wherein updating includes reviewing a scope of at least one of the listening files of the first data structure that corresponds to the file's parent.
  • a system for listening for events may include a file structure comprising a plurality of files that are logically stored in a tree; for each of the plurality of files, a data structure corresponding to each file, the data structure comprising: a set of identifiers of the plurality of files that are listening for events that affect the corresponding file; and an indication of the currentness of the data structure.
  • a method for listening for events may include logically storing a plurality of files in a tree; and, for each of the plurality of files, storing a data structure corresponding to each file, the data structure comprising a set of identifiers of the plurality of files that are listening for events that affect the corresponding file and an indication of the currentness of the data structure.
  • a system for queuing event messages in a file system may include a plurality of processes that each listen for events that affect at least one of a plurality of files; a first data structure configured to determine, for each of the plurality of processes, a set of listening files to which each of the plurality of processes is listening; and a message module configured to receive an event message related to a first file of the plurality of files, the event message including an indication of a minimum scope that would have generated the event message, to search the first data structure to determine a first subset of the plurality of processes that listen for files that are affected by the event using the sets of listening files, to determine a second subset of the first subset by removing from the first subset processes whose scope is less than the minimum scope of the event message, and to inform the second subset of the event message.
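
As an editorial aid, the data structures recited above can be pictured with a short sketch. This is only a loose illustration of the claim language, not the patented implementation; the names (ListeningSet, listening_sets, listener_registry, is_relevant) and the use of plain Python sets and dicts are assumptions.

```python
from dataclasses import dataclass, field

# First data structure (one per file): the set of "listening files" whose
# listeners' scope covers this file, plus an indication of how current it is.
@dataclass
class ListeningSet:
    listening_files: set[int] = field(default_factory=set)  # identifiers of listening files
    generation: int = 0                                      # currentness indicator

listening_sets: dict[int, ListeningSet] = {}   # first data structure, keyed by file identifier
listener_registry: dict[int, set[int]] = {}    # second data structure: listening file id -> process ids

def is_relevant(file_id: int) -> bool:
    """Sketch of the 'listening module' check: a file is relevant if any
    registered process listens to one of the files in its listening set."""
    ls = listening_sets.get(file_id, ListeningSet())
    return any(listener_registry.get(lf) for lf in ls.listening_files)
```
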
  • FIGS. 1A and 1B illustrate, respectively, one embodiment of physical and logical connections of one embodiment of nodes in a system.
  • FIG. 2 illustrates one embodiment of the elements of an inode data structure in a distributed file system.
  • FIGS. 3A, 3B, and 3C illustrate one embodiment of the respective scope of single, children, and recursive listeners.
  • FIGS. 4A, 4B, and 4C illustrate one embodiment of initiator hash tables.
  • FIG. 5 illustrates one embodiment of a participant hash table.
  • FIGS. 6A and 6B illustrate one embodiment of the scope of listeners from the perspective of processes and nodes, respectively.
  • FIG. 7 illustrates one embodiment of a flowchart of operations to add an additional listener to an embodiment of the system.
  • FIGS. 8A and 8B illustrate one embodiment of the scope of exemplary listeners (from the perspective of nodes) following the addition of another listener.
  • FIG. 9 illustrates one embodiment of a top-level flowchart of operations for notifying listeners of an event in an embodiment of the system.
  • FIG. 10 illustrates one embodiment of a flowchart of operations to validate the event cache of an inode.
  • FIG. 11 illustrates one embodiment of a flowchart of operations to update the cache of a child inode with the cache of the parent inode.
  • FIGS. 12A and 12B illustrate one embodiment of the status of caches following a “size change” and a “create” event, respectively.
  • FIG. 13 illustrates one embodiment of a flowchart of operations of the participant module to send event messages to listening nodes.
  • FIG. 14 illustrates one embodiment of two event messages.
  • FIG. 15 illustrates one embodiment of a flowchart of operations to determine the minimum scope.
  • FIG. 16 illustrates one embodiment of a flowchart of operations of the initiator module to receive an event message and to notify listening processes accordingly.
  • FIG. 17 illustrates one embodiment of a flowchart of operations to update initiator and participant hash tables following the addition of a node to the system.
  • For purposes of illustration, some embodiments will be described in the context of a distributed file system.
  • The present invention is not limited by the type of environment in which the systems and methods are used, however, and the systems and methods may be used in other environments, such as, for example, other file systems, other distributed systems, the Internet, the World Wide Web, a private network for a hospital, a broadcast network for a government agency, an internal network of a corporate enterprise, an intranet, a local area network, a wide area network, a wired network, a wireless network, and so forth.
  • Some of the figures and descriptions relate to an embodiment of the invention wherein the environment is that of a distributed file system.
  • systems and methods are provided for tracking events in a distributed file system.
  • an event system monitors certain areas of a file system. When an event occurs in one area of the distributed file system, the event system notifies the processes listening to that area of the distributed file system of the event.
  • One example of a listening application is a directory management application. When the directory management application opens a window on a particular directory, it may instantiate a listener on that directory. When another application, such as a word processor, creates a new file in that directory, the event system notifies the listening application, which can then immediately update the window to show the new file.
  • Another example of a listening application is an indexing service which listens to a subdirectory recursively.
  • An indexing service may, for example, store an index for words and phrases appearing within a certain group of documents. The index may be used to enhance document searching functionality. Whenever the service is notified of an event, it may re-index the file or files corresponding to that event.
  • An event system may also be used internally by the distributed file system to monitor configuration files and to take appropriate actions when they change.
  • A listening process, which includes an executed instantiation of an application, may refer to the client process that requests a listener on the distributed file system, and the listener may refer to the data structures initiated by the event system to monitor and report events to the listening process.
  • For the event system illustrated in FIGS. 1 through 17, there are three general areas that the event system implements: (1) maintaining a cluster-wide set of listeners; (2) determining whether a specified file is being listened to; and (3) notifying listeners of those files of the events.
  • FIG. 1A illustrates the connections of elements in one embodiment of a distributed system 100 .
  • Client processes access the distributed system 100 through the network 104 , using, for example, client machines 106 .
  • While there is a single network 104 connecting both nodes 102 and client machines 106, in other embodiments there may be separate networks.
  • FIG. 1B illustrates one possible logical connection of three nodes 102 , forming a cluster 108 .
  • the nodes 102 in cluster 108 are connected in a fully connected topology.
  • a fully connected topology is a network where each of the nodes in the network is connected to every other node in the network.
  • In the illustrated embodiment, the nodes 102 are arranged in a fully connected network topology.
  • the cluster 108 of nodes 102 may be arranged in other topologies, including, but not limited to, the following topologies: ring, mesh, star, line, tree, bus topologies, and so forth. It will be appreciated by one skilled in the art that various network topologies may be used to implement different embodiments of the invention.
  • The nodes 102 may be connected directly, indirectly, or a combination of the two, and all of the nodes may be connected using the same type of connection or one or more different types of connections. It is also recognized that in other embodiments, a different number of nodes may be included in the cluster, such as, for example, 2, 16, 83, 6, 883, 10,000, and so forth.
  • the nodes 102 are interconnected through a bi-directional communication link where messages are received in the order they are sent.
  • the link comprises a “keep-alive” mechanism that quickly detects when nodes or other network components fail, and the nodes are notified when a link goes up or down.
  • the link includes a TCP connection.
  • the link includes an SDP connection over Infiniband, a wireless network, a wired network, a serial connection, IP over FibreChannel, proprietary communication links, connection based datagrams or streams, and/or connection based protocols.
  • An event system may be implemented for a distributed file system, notifying listening processes of certain events on files and directories within the file system.
  • Metadata structures, also referred to as inodes, are used to monitor and manipulate the files and directories within the system.
  • An inode is a data structure that describes a file or directory and may be stored in a variety of locations including on disk and/or in memory.
  • The in-memory inode may include a copy of the on-disk data plus additional data used by the system, including the fields associated with the data structure and/or information about the event system.
  • The nodes of a distributed system, such as nodes 102, may implement an inode cache.
  • Such a cache may be implemented as a global hash table that may be configured to store the most recently used inodes.
  • In one embodiment, the inode cache may store more than 150,000 inodes, and an inode may be around 1 KB of data in memory, though it is recognized that a variety of different implementations may be used with caches of different sizes and inodes of different sizes.
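
The text describes the inode cache only as a global hash table holding the most recently used inodes. The sketch below shows one plausible reading, an LRU table keyed by LIN; the class name, the capacity handling, and the eviction policy are assumptions rather than details taken from the patent.

```python
from collections import OrderedDict

class InodeCache:
    """Hypothetical sketch of a node-wide inode cache keyed by LIN,
    evicting the least recently used entry once a capacity is reached."""
    def __init__(self, capacity: int = 150_000):
        self.capacity = capacity
        self._table = OrderedDict()          # LIN -> in-memory inode

    def get(self, lin: int):
        inode = self._table.get(lin)
        if inode is not None:
            self._table.move_to_end(lin)     # mark as most recently used
        return inode

    def put(self, lin: int, inode) -> None:
        self._table[lin] = inode
        self._table.move_to_end(lin)
        if len(self._table) > self.capacity:
            self._table.popitem(last=False)  # evict the least recently used inode
```
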
  • Information for an event system may include information regarding those listeners that are monitoring certain events on the file or directory corresponding to a particular inode. In one embodiment of an event system, this information is referred to as the event cache.
  • FIG. 2 illustrates one embodiment of an in-memory inode 200 of a distributed file system.
  • the inode 200 includes several fields.
  • the inode 200 includes a mode field 202 , which indicates, for example, either a file or directory.
  • a file is a collection of data stored in one unit under a filename.
  • A directory, similar to a file, is a collection of data stored in one unit under a directory name.
  • a directory is a specialized collection of data regarding elements in a file system.
  • a file system is organized in a tree-like structure. Directories are organized like the branches of trees. Directories may begin with a root directory and/or may include other branching directories. Files resemble the leaves or the fruit of the tree.
  • Files typically do not include other elements in the file system, such as files and directories. In other words, files do not typically branch.
  • While an inode in the illustrated embodiment represents either a file or a directory, in other embodiments an inode may include metadata for other elements in a distributed file system, in other distributed systems, or in other file systems.
  • the exemplary inode 200 also includes a LIN field 204 .
  • The LIN, or Logical Inode Number, is a unique identifier for the file or directory. It uniquely refers to the on-disk data structures for the file or directory. It may also be used as the index for the in-memory inodes, such as the index for a cache of in-memory inodes stored on nodes 102.
  • In the illustrated embodiment, the LIN is 10. Accordingly, the exemplary inode 200 would be referred to as "inode 10."
  • the exemplary inode 200 also includes fields to implement an event cache, including a listening set field 206 and a cache generation number field 208 .
  • the listening set provides information about which other inodes are listening to this particular inode.
  • An event system may use an inode's listening set to help notify listeners of particular events.
  • the listening set of an inode may include a set of LINs, referring to a set of inodes, including perhaps the inode itself.
  • If, for example, inodes 12, 13, and 16 were listening to inode 10, the listening set field 206 would include inodes 12, 13, and 16. In the illustrated embodiment, however, the listening set field 206 is empty, indicating that there are no listeners whose scope includes inode 10.
  • The scope of listeners in an exemplary directory system with inode 10 is illustrated by FIGS. 3A through 3C.
  • the exemplary inode 200 also includes a cache generation number field 208 .
  • an event system may use the cache generation number to identify whether the listening set of an inode, such as inode 10 , is up-to-date.
  • the exemplary inode 200 may also include other fields 210 and/or a subset of the fields discussed above.
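
For illustration, the in-memory inode fields described above (mode 202, LIN 204, listening set 206, cache generation number 208, and other fields 210) might be sketched as a simple record; the type names and field layout below are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum

class Mode(Enum):
    FILE = "file"
    DIRECTORY = "directory"

@dataclass
class Inode:
    mode: Mode                                             # mode field 202: file or directory
    lin: int                                               # LIN field 204: unique logical inode number
    listening_set: set[int] = field(default_factory=set)   # field 206: LINs listening to this inode
    cache_generation: int = 0                              # field 208: event-cache generation number
    other: dict = field(default_factory=dict)              # placeholder for the remaining fields 210

# e.g., the exemplary "inode 10" with an empty listening set:
inode_10 = Inode(mode=Mode.DIRECTORY, lin=10)
```
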
  • a listener is a logical construction of data and operations that monitors events on a particular resource or data structure.
  • listeners may be assigned to a particular file or directory.
  • a single listener may listen for events on more than just one resource, or more than one file or directory.
  • A listener may be defined with a particular scope. Events on a file or directory within the scope of a particular listener may be monitored along with the other files and/or directories within the scope of that listener.
  • FIGS. 3A, 3B, and 3C illustrate one embodiment of the respective scope of single, children, and recursive listeners on an inode tree 300.
  • An inode tree, such as inode tree 300, corresponds to a file directory system.
  • Each inode in the tree corresponds to a file or directory in the file directory system.
  • circles are used to denote directories, and squares are used to denote files.
  • FIG. 3A illustrates one embodiment of the scope of a single listener.
  • a process requesting a single listener is requesting notification of events on only the specified inode.
  • a listening process has requested event messages for events that occur on the directory corresponding to the inode 12 .
  • FIG. 3B illustrates one embodiment of the scope of a children listener.
  • a listening process requesting a children listener is requesting notification of events on the specified inode and its children inodes.
  • Inode 12 (a directory) has three immediate descendents, or children: inode 13 (directory), inode 14 (file), and inode 15 (directory).
  • a process listening to inode 12 with children scope listens for events that occur on inode 12 and all of the immediate descendents, or children, of inode 12 .
  • A variety of listening scopes may be defined in accordance with embodiments of the invention.
  • Another embodiment of the invention may define a grandchildren scope, which listens to the events on the specified inode and its children and grandchildren; an only-grandchildren scope, which listens to the events on the specified inode and its grandchildren; or a parent scope, which listens to an inode and its parent.
  • A grandchildren listener on inode 12 would listen to events on inodes 12, 13, 14, 15, 16, 17, and 18;
  • an only-grandchildren listener on inode 12 would listen to events on inodes 12, 16, 17, and 18; and
  • a parent listener on inode 12 would listen to events on inodes 12 and 10.
  • a listening scope may be defined that includes only files or only directories. Other possible listening scopes may also be defined.
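
A minimal sketch of the scope test implied by the three main scopes is shown below. The function name, the Scope enumeration, and the parent map are assumptions; the parent map is deliberately partial and contains only the parent/child links stated in the text for inode tree 300.

```python
from enum import IntEnum

class Scope(IntEnum):      # ordered from narrowest to broadest
    SINGLE = 1
    CHILDREN = 2
    RECURSIVE = 3

# Partial, hypothetical parent map (child LIN -> parent LIN) for inode tree 300.
PARENT = {12: 10, 13: 12, 14: 12, 15: 12, 16: 13, 17: 13, 20: 17, 21: 18}

def in_scope(listening_inode: int, scope: Scope, target: int) -> bool:
    """Is `target` covered by a listener of the given scope on `listening_inode`?"""
    if target == listening_inode:          # every scope includes the inode itself
        return True
    if scope == Scope.CHILDREN:
        return PARENT.get(target) == listening_inode
    if scope == Scope.RECURSIVE:
        node = target
        while node in PARENT:              # walk up toward the root
            node = PARENT[node]
            if node == listening_inode:
                return True
    return False

# e.g., a recursive listener on inode 12 covers inode 20, a children listener does not:
assert in_scope(12, Scope.RECURSIVE, 20) and not in_scope(12, Scope.CHILDREN, 20)
```
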
  • As set forth above, the event system implements three general areas: (1) maintaining a cluster-wide set of listeners; (2) deciding if a specified file is being listened to; and (3) notifying listeners of the events on files that the listeners are listening for.
  • Each of these three areas is described in further detail below.
  • FIGS. 4A, 4B, 4C, and 5 illustrate one embodiment of data structures that an event system may employ to maintain a cluster-wide set of listeners in a distributed system.
  • In an event system, there are two logical entities that implement the listeners for a distributed file system: initiators and participants.
  • each listener is instantiated on one particular node 102 of cluster 108 . This is the initiator node for that listener.
  • a node 102 may be the initiator node for multiple listeners.
  • each node 102 keeps track of the instantiated listeners on that node in a single initiator hash table, which also keeps a queue, for each listener, of the listened-for events.
  • Each node 102 may also execute certain operations to maintain the instantiated listeners and to notify nodes 102 of cluster 108 of any changes to the group of instantiated listeners, including additional listeners.
  • the term “initiator” may be used to refer to the node upon which a listener is instantiated, the data structure that stores relevant events for that listener, and/or a module that executes operations related to the instantiated listeners.
  • As used herein, the word "module" refers to logic embodied in hardware or firmware, or to a collection of software instructions, possibly having entry and exit points, written in a programming language, such as, for example, C or C++.
  • a software module may be compiled and linked into an executable program, installed in a dynamic link library, or may be written in an interpreted programming language such as, for example, BASIC, Perl, or Python. It will be appreciated that software modules may be callable from other modules or from themselves, and/or may be invoked in response to detected events or interrupts.
  • Software instructions may be embedded in firmware, such as an EPROM.
  • hardware modules may be comprised of connected logic units, such as gates and flip-flops, and/or may be comprised of programmable units, such as programmable gate arrays or processors.
  • the modules described herein are preferably implemented as software modules, but may be represented in hardware or firmware.
  • The listening inode is the point of reference from which to calculate the scope of the listener. For instance, with respect again to FIG. 3C, if the listening inode is inode 12 and the scope is recursive, then the listener listens for events that occur on inode 12 and its descendents, inodes 13, 14, 15, 16, 17, 18, 19, 20, and 21. If, alternatively, the listening inode is 18 and the scope is similarly recursive, then the listener listens for events that occur on inode 18 and its lone descendent, inode 21.
  • In addition to receiving requests for listeners, the initiator module also stores the requested listeners in a hash table, also referred to as the initiator hash table, and sends messages to participant modules regarding additions, deletions, and/or changes to the listeners stored in its hash table. This is discussed in more detail with reference to FIG. 7.
  • the nodes 102 of the cluster 108 include an initiator hash table. As discussed in more detail below with reference to FIG. 4B , the hash table may not include any listeners.
  • The initiator module also communicates the contents of its hash table to the participant modules when a new node 102 is added to the cluster 108. (This is discussed in more detail below with reference to FIG. 17.)
  • the initiator module may also communicate the contents of its hash table to the participant modules when a node 102 is removed from the cluster.
  • the initiator module may also be configured to receive event messages from participant modules, signifying that the participant node has processed an event that affects an inode for which the initiator node is listening.
  • the initiator module determines from these event messages which, if any, listeners are listening for the event.
  • the initiator module queues those events for which listeners are listening and notifies the listening process of the event.
  • In some instances, an event message arrives at an initiator module even though there are no listeners in the initiator's hash table listening for that event. Receiving event messages is discussed in more detail below with reference to FIG. 16.
  • the initiator hash table 400 includes an index 402 , initiator structures 404 , process structures 406 , and event queues 408 . As mentioned above, in the illustrated embodiment, there is one initiator hash table 400 per node 102 . In the illustrated embodiment, the initiator hash table 400 maps the LINs of listening inodes to initiator structures 404 . As mentioned above, in the illustrated embodiment, a listening inode is the inode to which a listener is directed. The LIN is the key for the initiator hash table 400 . The respective LIN is put through a hash function which maps it to another number. That number is then divided by the size of the table, and the remainder is then used as the index into an array of linked list heads or buckets.
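
The bucket-selection step described above (hash the LIN, divide by the table size, and use the remainder as the index into an array of bucket heads) can be sketched as follows; the bucket count, the use of Python's built-in hash, and the list-of-lists layout are placeholders.

```python
NUM_BUCKETS = 64                                   # hypothetical table size
buckets = [[] for _ in range(NUM_BUCKETS)]         # array of bucket lists ("linked list heads")

def bucket_for(lin: int) -> list:
    """Map a LIN to its bucket: hash it, then use the remainder after
    dividing by the table size as the index into the bucket array."""
    return buckets[hash(lin) % NUM_BUCKETS]

# e.g., filing an initiator structure for listening inode 12 into its bucket:
bucket_for(12).append({"lin": 12, "scope": "recursive", "event_mask": {"create"}})
```
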
  • the initiator structure 404 stores the composite scope and event mask of the listeners (initiated on the initiator node) that listen to the respective listening inode.
  • the initiator structure 404 includes a field for the LIN of the respective listening inode, the composite scope of the listeners listening to the listening inode, and the composite event mask of the listeners listening to the listening inode.
  • the composite scope is the broadest scope of those listeners listening to the listening inode
  • the composite event mask is the union of the event masks of the listeners listening to the listening inode.
  • the process structures 406 represent the individual listeners initiated by the initiator.
  • the process structures 406 correspond to associated listening processes 410 , which request individual listeners.
  • At times, "listening processes" may refer to the listening processes 410 and, at other times, to the corresponding process structures 406.
  • the process structures 406 include three fields.
  • the first field is a process identifier, which uniquely identifies a process in data communication with the node 102 that has requested a listener on the listening inode.
  • The second field is the scope of the listener that the respective listening process 410 requests for the listening inode. The third field is the event mask of the listener, indicating the events for which the respective listening process 410 requests notification.
  • the process structures 406 are associated with respective event queues 408 .
  • the event queues 408 store event messages from participants. Event messages include information regarding events on inodes within an inode tree, such as inode tree 300 , that fall within the scope and event mask of the listener. Event messages are stored in the event queues 408 until a process is ready to process the event.
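
A compact sketch of the initiator-side structures described above is given below: one process structure per requested listener (with its event queue 408) and one initiator structure per listening inode, whose composite scope is the broadest requested scope and whose composite event mask is the union of the requested event masks. All class and field names are hypothetical, and the event mask is modeled as a set of strings rather than a bit mask. The example at the end mirrors the two listeners on inode 12 from FIG. 4A.

```python
from collections import deque
from dataclasses import dataclass, field
from enum import IntEnum

class Scope(IntEnum):                 # ordered so that max() yields the broadest scope
    SINGLE = 1
    CHILDREN = 2
    RECURSIVE = 3

@dataclass
class ProcessStruct:                  # one per listener requested by a listening process
    pid: int                          # identifier of the requesting process
    scope: Scope                      # requested scope
    event_mask: set                   # requested events, e.g. {"create"}
    queue: deque = field(default_factory=deque)   # event queue 408 for this listener

@dataclass
class InitiatorStruct:                # one per listening inode on this node
    lin: int
    processes: list = field(default_factory=list)

    @property
    def composite_scope(self) -> Scope:
        return max(p.scope for p in self.processes)        # broadest scope

    @property
    def composite_event_mask(self) -> set:
        mask = set()
        for p in self.processes:
            mask |= p.event_mask                            # union of event masks
        return mask

entry = InitiatorStruct(lin=12)
entry.processes.append(ProcessStruct(3000, Scope.RECURSIVE, {"create"}))
entry.processes.append(ProcessStruct(3001, Scope.CHILDREN, {"size change"}))
assert entry.composite_scope is Scope.RECURSIVE
assert entry.composite_event_mask == {"create", "size change"}
```
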
  • FIG. 4A illustrates an exemplary initiator hash table 400 on Node 1 .
  • the listening processes 410 have requested listeners on the same inode, inode 12 .
  • Process 3000 has requested notification of all “create” events on inode 12 and the descendents of inode 12 .
  • Process 3000 has requested a listener on inode 12 with recursive scope and an event mask of “create.”
  • Process 3001 has requested notification of all “size change” events on inode 12 and its immediate descendents, or children.
  • Process 3001 has requested a listener on inode 12 with children scope and an event mask of “size change.” It will be appreciated that there are many different ways in which listening processes 410 may communicate listening parameters to nodes 102 . In some embodiments, listening processes 410 may communicate listening parameters via a system call, subroutine call, etc.
  • For each of the listening processes 410, there is a corresponding process structure 406 in the initiator hash table 400 of a node 102, for example Node 1.
  • The two listening processes 410, Processes 3000 and 3001, have corresponding process structures 406 stored on Node 1.
  • the respective scopes and event masks of the process structures 406 match the respective scopes and event masks of listening processes 410 .
  • listening processes 410 specify the point-of-reference inode (the listening inode), and process structures 406 include a field with a unique identifier for the respective listening process 410 that requested the listener.
  • FIG. 4B illustrates one embodiment of the initiator hash table 400 for Node 2. Because there are no processes requesting listeners through the network 104 on Node 2, there are no initiator structures 404. Thus, the initiator hash table 400 is initialized, but the entries in the hash table 400 are empty. In other embodiments, the initiator hash table 400 is not initialized until there are initiator structures 404.
  • FIG. 4C illustrates the initiator hash table 400 for Node 3 .
  • Three of the listening processes 410 specify inode 13 as the inode to which they are listening.
  • the remaining two listening processes 410 specify inodes 12 and 16 , respectively, as the inodes to which they are listening.
  • Processes 2000 , 2001 , and 2002 all request listeners on inode 13 .
  • Process 2003 requests a listener on inode 16 .
  • Process 2004 requests a listener on inode 12 . Because the five processes collectively request listeners on three different inodes, there are three initiator structures 404 , corresponding to each one of the specified listening inodes. These initiator structures 404 are indexed by their respective LINs.
  • the process structures 406 have corresponding event queues 408 .
  • When events occur within the scope and event mask of a listener, the events or messages about the events are queued in the event queue 408 of the corresponding process structure 406.
  • FIG. 5 illustrates one embodiment of the participant data structures.
  • participant data structures include a participant hash table 500 and a node generation number 502 .
  • the node structures 508 indicate the composite scope and event masks for all of the listeners for a particular listening inode initiated on a particular node 102 .
  • the scope and event masks of the node structures 508 correspond to the initiator structures 404 for the respective listening inode.
  • the participant structure 506 represents the composite scope and event masks of the node structures 508 corresponding to the respective listening inodes in the participant hash table 500 .
  • the participant structure 506 corresponding to inode 12 includes a composite scope and composite event mask representing those listeners for inode 12 that are initiated, in this embodiment, on all of the nodes 102 .
  • the scope of the participant structure 506 corresponding to inode 12 is recursive, indicating the broadest scope of the two node structures 508 corresponding to inode 12 .
  • The participant hash table 500 is the same for Nodes 1, 2, and 3.
  • the purpose of the participant structures is to process events that may occur on any given inode in the distributed file system, and that may occur, for instance, on any one of the nodes 102 in the cluster 108 . It is recognized that in some embodiments one or more of the participant hash tables may be different.
  • the participant data structures also include a node generation number 502 .
  • the node generation number 502 is used to verify that a particular inode's cache is up-to-date, as discussed further below with reference to FIGS. 8A and 8B .
  • the node generation number 502 may be incremented every time there is a significant change to the participant hash table 500 . Changes to the participant hash tables 500 correspond to changes to the respective initiator hash tables 400 .
  • the node generation number 502 for each respective node 102 need not be the same.
  • Because nodes 102 that may have been disconnected from the cluster 108 may not have been involved in a change to the participant hash tables 500 of the other nodes 102, the generation numbers for the nodes 102 may be different.
  • the participant hash tables 500 are the same on every node 102 .
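
The participant-side state described above might be sketched as follows: a participant hash table keyed by listening LIN, node structures carrying a per-node composite scope and event mask, and a node generation number that is incremented on each significant change. The class names and the shape of apply_update are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class NodeStruct:                     # composite listener info for one (listening inode, node) pair
    node_id: int
    scope: str                        # e.g. "recursive", "children", "single"
    event_mask: set = field(default_factory=set)

@dataclass
class ParticipantStruct:              # composite of the node structures for one listening inode
    lin: int
    nodes: list = field(default_factory=list)

class Participant:
    """Hypothetical participant-side state: one hash table plus a node
    generation number bumped on every significant change to the table."""
    def __init__(self):
        self.table: dict[int, ParticipantStruct] = {}     # keyed by listening LIN
        self.node_generation = 0

    def apply_update(self, lin: int, node_struct: NodeStruct) -> None:
        entry = self.table.setdefault(lin, ParticipantStruct(lin))
        entry.nodes.append(node_struct)
        self.node_generation += 1     # lets inodes detect a stale event cache
```
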
  • FIGS. 6A and 6B illustrate one embodiment of the different perspectives of listening scope.
  • FIG. 6A illustrates the scope of listeners from the perspective of each individual process.
  • FIG. 6B illustrates the scope of listeners with respect to the participant structures 506 in the participant hash tables 500 .
  • the listeners illustrated in FIGS. 6A and 6B correspond to the listeners described in FIGS. 4A, 4B , 4 C, and 5 .
  • FIG. 6A illustrates the listeners from the perspective of each process that has requested a listener on one of the inodes in the inode tree 300.
  • There are seven listeners represented in FIG. 6A.
  • As illustrated, a listener with single scope 302, a listener with children scope 304, and a listener with recursive scope 306 are attached to inode 12.
  • These three listeners correspond to the three listeners requested for inode 12 , as illustrated in FIGS. 4A and 4C .
  • FIG. 4A illustrates two processes, Processes 3000 and 3001 , which are listening to inode 12 .
  • the recursive listening scope 306 corresponds to the listener requested by Process 3000 , which has a recursive scope.
  • the children listening scope 304 corresponds to the listener requested by Process 3001 , which also has a scope of children.
  • the single listening scope 302 corresponds to the listener requested by Process 2004 , as illustrated in FIG. 4C , which is a single listener attached to inode 12 .
  • In FIG. 6A, there are three listening scopes for inode 13. Two of these listening scopes are single scopes, 308 and 310, and the last listening scope is a children scope 312. These three listening scopes correspond to the three listeners illustrated in FIG. 4C.
  • Process 2000 requests a listener for inode 13 with children scope, which corresponds to children listening scope 312 .
  • Processes 2001 and 2002 request listeners on inode 13 , each with single scope, which correspond to single listening scopes 308 and 310 .
  • inode 16 has a listening scope 314 attached to it, which corresponds to the listener requested by Process 2003 .
  • Although the scope of the listener attached to inode 16, as illustrated in FIG. 6A, appears to be that of a single listener, it is in fact a recursive listener. Because inode 16 has no descendents, the recursive listener appears as if it were a single listener.
  • FIG. 6B illustrates the same set of listeners whose scope is illustrated in FIG. 6A , but does so from the perspective of the participant structures 506 .
  • the three scopes illustrated in FIG. 6B correspond to the three scopes of the participant structures 506 , as illustrated in FIG. 5 .
  • these scopes may typically be different than the scopes of the initiator structures 404 on the initiator hash tables 400 , even though in the exemplary embodiment they are the same.
  • In the illustrated embodiment, there is a recursive listening scope 316 defined for inode 12, a children listening scope defined for inode 13, and a single listening scope defined for inode 16.
  • FIG. 6B illustrates the composite scope of the listeners for a particular listening inode.
  • the scope defined for inode 12 represents the composite scope of the listeners attached to inode 12 across all the nodes 102 .
  • the composite scope for inode 12 is recursive, the recursive scope 316 .
  • the initiator hash tables 400 and the participant hash tables 500 are updated when a change, or a certain type of change, is made to the set of listeners. For example, a listening process 410 may terminate and no longer require listeners. Alternatively, in some embodiments, the scope or event mask of a previously initiated listener may be altered. Thus, in one embodiment, there may be a need to update on a consistent basis the initiator hash tables 400 and the participant hash tables 500 .
  • FIG. 7 illustrates one embodiment of a flowchart for the operations to update the initiator hash tables 400 and the participant hash tables 500 .
  • The node 102 receiving the request for a change to one of the listeners, including adding or deleting a listener, gets the exclusive event lock.
  • an exclusive event lock prevents other nodes 102 from reading from or writing to the distributed system.
  • an exclusive event lock is obtained in order to prevent other nodes from reading or writing to the initiator hash tables 400 or the participant hash tables 500 during the update.
  • the illustrated embodiment also implements a shared event lock, which prevents other nodes 102 from gaining access to an exclusive event lock.
  • In other embodiments, a finer-grained locking scheme may be used.
  • An initiator process updates the initiator hash table 400 corresponding to the node 102.
  • The initiator process refers to an executable portion of the initiator module. Although in the illustrated embodiment the operations described in FIG. 7 are executed by the initiator module, in other embodiments the same operations may be executed by other modules, such as the participant module.
  • the initiator process sends messages to the participant modules signifying that there has been an update to an initiator hash table 400 , and subsequently delivers the updated information for its hash table 400 , which is described in state 706 .
  • the nodes 102 include both an initiator and a participant module.
  • the participants update their participant hash tables 500 .
  • A participant process, which may include an executable portion of the participant module, indexes the appropriate listening inode. If necessary, changes are made to the node structures 508 and the corresponding participant structures 506 to represent the updated listener information.
  • the participant process increments the node generation number 502 in state 708 .
  • the node generation number 502 is simply incremented.
  • the node generation number 502 may correspond to some other identifier that participant nodes recognize as the updated status of the participant hash tables 500 .
  • the respective initiator process releases the exclusive event lock.
  • the initiator process described in FIG. 7 pertains to the initiator module and the participant process pertains to the participant module. In other embodiments, the initiator process and/or the participant process reside in other and/or additional modules.
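The FIG. 7 flow described above can be condensed into a short sketch: serialize the change under the exclusive event lock, update the local initiator hash table, deliver the update to the participant modules (state 706), and let each participant revise its table and increment its node generation number (state 708) before the lock is released. A single in-process lock stands in for the cluster-wide exclusive event lock, apply_local_update is a hypothetical initiator-side helper, and apply_update matches the participant sketch above.

```python
import threading

event_lock = threading.Lock()      # stand-in for the cluster-wide exclusive event lock

def change_listeners(initiator, participants, lin, node_struct) -> None:
    """Sketch of the FIG. 7 flow for adding, deleting, or changing a listener."""
    with event_lock:                                    # acquire the exclusive event lock
        initiator.apply_local_update(lin, node_struct)  # update the initiator hash table 400
        for participant in participants:                # deliver the update (state 706)
            participant.apply_update(lin, node_struct)  # update table 500, bump generation (state 708)
    # the exclusive event lock is released on exit
```
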
  • FIGS. 8A and 8B illustrate one embodiment of a change to the listeners of the inode tree 300 .
  • FIG. 8A illustrates the state of the inode tree 300 before Processes 3000 and 3001 , illustrated in FIG. 4A , have requested listeners on inode 12 .
  • FIG. 8A illustrates one embodiment of the state of the inode tree 300 with listeners requested by Processes 2000 , 2001 , 2002 , 2003 , and 2004 . Only three scopes are illustrated because two of the listeners fall within the scope of another listener. Specifically, the single scope listeners requested by Processes 2001 and 2002 fall within the scope of the children scope listener requested by Process 2000 .
  • each individual inode includes an event cache.
  • the event cache includes a listening set 804 and a cache generation number 806 .
  • the listening set 804 of a particular inode includes the LINs of the listening inodes (in the participant hash tables 500 ) whose scope encompasses that particular inode.
  • For inode 16, the listening set 804 includes listening inodes 13 and 16. This means that there is a listener associated with inode 13 whose scope is broad enough to include inode 16. Similarly, there is a listener associated with inode 16 whose scope is broad enough to include inode 16, namely, the listener associated with inode 16.
  • each inode cache includes a cache generation number 806 . If the cache generation number 806 of an inode matches the node generation number 502 , then the event cache of the inode is up-to-date.
  • FIG. 8A illustrates an inode tree 300 wherein the event cache of every inode is up-to-date. The event caches are up-to-date because each cache generation number 806 matches the node generation number 502 .
  • FIG. 8B illustrates one embodiment of the state of the inode tree 300 following the addition of two additional listeners.
  • these listeners correspond to the listeners requested by Processes 3000 and 3001 , as illustrated in FIG. 4A .
  • When Process 3000 requests a listener on inode 12 with recursive scope, the broadest scope on inode 12 becomes the recursive scope.
  • the only listener previously attached to inode 12 is the listener requested by Process 2004 , which has single scope.
  • Thus, the broadest scope of a listener on inode 12, following the addition of the listener corresponding to Process 3000, is the recursive scope, as illustrated in FIG. 8B.
  • This scope corresponds to the scope of the participant structure 506 corresponding to inode 12 following the addition of the listener corresponding to Process 3000.
  • When Process 3001 attaches an additional listener of children scope to inode 12, the broadest scope does not change because the children scope is less than or equal to the previous scope, recursive.
  • The addition of each listener, first 3000 and then 3001, caused the node generation number 502 to increment by one (not illustrated).
  • successive changes to the listeners may be grouped together.
  • FIG. 8B also illustrates how up-to-date event caches would appear following the addition of the two listeners.
  • For inode 16, the listening set is 12, 13, and 16. This means that there are listeners attached to inodes 12, 13, and 16 whose scope is broad enough to include inode 16.
  • Inode 16 is within the scope of listening inode 12 because the broadest listener attached to inode 12 is a recursive listener. Inode 16 is within the scope of listening inode 13 because the broadest scope of a listener attached to inode 13 is the children scope and inode 16 is an immediate descendent, or child, of inode 13 . Finally, inode 16 is within the scope of listening inode 16 because inode 16 is inode 16 ; thus, inode 16 is within the scope of any listener attached to inode 16 because even the smallest scope, in the exemplary embodiment the single scope, includes the inode itself.
  • the transition of event caches from FIG. 8A to FIG. 8B does not happen automatically; the event caches of each inode in the inode tree 300 are not updated automatically. Instead, the event caches for each inode are updated as needed. In other embodiments, some or all of the updating is automatic.
  • One embodiment of an updating process is described in detail further below in FIGS. 10 through 12 .
  • FIG. 9 illustrates one embodiment of a flowchart of the top-level events for processing an event on the cluster 108 .
  • the respective node 102 where the event occurs determines whether event messages are sent to listeners.
  • the execution of operations depicted in FIG. 9 is referred to collectively as the process.
  • the initiator module executes the process. In other embodiments, the process may be executed by other modules, such as the participant module.
  • the process decides if the relevant inode, referred to as the inode on which the event occurs, is being listened to. This is one of the functions of one embodiment of an event system described herein, and this function is described in more detail in the third section, with reference to FIGS. 10 through 12 .
  • Notifying listeners of events is the third primary function of the exemplary event system described herein, and it is described in more detail in the fourth section, with reference to FIGS. 13 through 16.
  • The node 102 on which the event occurs acquires a shared event lock (state 902).
  • a shared lock prevents other nodes 102 from obtaining an exclusive event lock. Other nodes 102 may continue to read the contents of the system while one node 102 has the shared event lock.
  • The process validates the event cache of the relevant inode (state 904).
  • the relevant inode is the inode that the event affects.
  • the event affects the file or directory corresponding to the relevant inode. Because the data stored in the relevant inode may also change, in some embodiments, the event is referred to as occurring to the inode.
  • these events may include, without limitation, attribute change, creation, deletion, size change, remove, content change, sizing increase, attribute change, link count change, rename, access revoke, create, rename from here, rename to here, rename within same directory, event occurred on file, event occurred on directory, size change, permission change, and/or other events.
  • the node 102 After performing the operation on the relevant inode, the node 102 sends event messages to the listeners, in state 908 . Various embodiments of this state are described in more detail below with reference to FIGS. 13 through 16 . After sending event messages to listeners, the node 102 releases the shared event lock, in state 910 . As mentioned above, in one embodiment, the process described by FIG. 9 is executed by the participant module, though in other embodiments, the process may be executed by other modules.
  • FIG. 10 illustrates one embodiment of a flowchart of operations to validate an event cache of a relevant inode.
  • the operations described in FIG. 10 are collectively referred to as the process.
  • the participant executes the process, though in other embodiments, other modules, such as the initiator module, may execute the process.
  • the process determines if the cache generation number 806 of the relevant inode matches the node generation number 502 of the relevant node.
  • the relevant node is the node upon which the relevant event occurs, and the relevant event is the current event being processed. If there is a match, then the event cache of the relevant inode is up-to-date and it has been validated.
  • If there is no match, the participant module proceeds to state 1004, and the process determines whether the relevant inode is the root of the inode tree 300. If the relevant inode is the root, then the participant module proceeds to state 1010, and there is no need to update the cache of the relevant inode with the cache of the parent because the root has no parent.
  • the cache of the relevant inode is updated with the cache of the parent. Before doing this, however, the cache of the parent is validated. In other words, in one embodiment, the cache of the relevant inode may not be updated with the cache of the parent until the cache of the parent is up-to-date itself. This step demonstrates the recursive nature of one embodiment of the algorithm. In one embodiment, the recursion occurs all the way until the relevant inode is the root or the relevant inode has a valid cache.
  • the relevant inode is the inode being updated, not the inode to which the event originally occurred, as is used in other parts of the description.
  • While the relevant inode generally refers to the inode upon which the relevant, or current, event occurs, during the validation stage the relevant inode refers to whichever inode is being validated.
  • each inode along the way becomes the relevant inode for purposes of validating the event caches.
  • the relevant inode generally refers to the inode to which the relevant, or current, event occurred.
  • the participant module proceeds to validate the event cache of the parent, which, with respect to the flowchart depicted in FIG. 10 , returns the participant module to state 1002 .
  • the process then proceeds through the same flowchart operations, with the parent of the relevant inode of the previous pass becoming the relevant inode for the successive pass. This is the recursive element of one embodiment of the event cache validating algorithm.
  • the process After validating the event cache of the parent, the process updates the cache of the relevant inode with the cache of the parent 1008 . This is the first operation taken after returning from each successive call to validate the event cache of the “relevant” inode. In one embodiment, “relevant” is relative because as the process works up the tree, the parent of the relevant inode becomes the relevant inode. State 1008 is described in more detail below with reference to FIG. 11 .
  • the process proceeds to state 1010 . As set forth above, the process also progresses to state 1010 if it is determined, in state 1004 , that the relevant inode is the root.
  • In state 1010, the process determines whether or not the relevant inode is itself a listening inode by looking, for example, in the participant hash table 500. If the relevant inode indexes a participant structure 506, then it is a listening inode. If the relevant inode is a listening inode, then the process proceeds to state 1012. In state 1012, the relevant inode is added to the listening set 804 of the relevant inode. If, on the other hand, the relevant inode is not a listening inode, then the process proceeds to state 1014.
  • the process proceeds to state 1014 , where the cache generation number 806 of the relevant inode is updated with the current value of the node generation number 502 .
  • the participant module executes the process, though in other embodiments, other modules, such as the initiator module, may execute the process.
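
The recursive validation of FIG. 10 can be sketched as below. Here parent_of maps an inode to its parent (None for the root), composite_scope maps each listening LIN to its broadest scope, and update_from_parent is the FIG. 11 step, passed in as a callable (a compatible sketch appears a few paragraphs further down); all of these names are assumptions, and the inode objects are assumed to carry lin, listening_set, and cache_generation fields as in the earlier inode sketch.

```python
def validate_event_cache(inode, node_generation, composite_scope, parent_of, update_from_parent):
    """Sketch of the FIG. 10 recursion for validating an inode's event cache."""
    if inode.cache_generation == node_generation:      # state 1002: cache already up to date
        return
    parent = parent_of(inode)
    if parent is not None:                             # state 1004: the inode is not the root
        validate_event_cache(parent, node_generation,  # recurse: validate the parent first
                             composite_scope, parent_of, update_from_parent)
        update_from_parent(inode, parent, composite_scope)   # state 1008
    if inode.lin in composite_scope:                   # state 1010: is this a listening inode?
        inode.listening_set.add(inode.lin)             # state 1012
    inode.cache_generation = node_generation           # state 1014: mark the cache current
```
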
  • FIG. 11 illustrates one embodiment of state 1008 in more detail and illustrates the operations for updating the cache of the relevant inode with the cache of the parent.
  • the states in between state 1102 and 1112 repeat for each listening inode in the listening set 804 of the parent of the relevant inode.
  • the respective listening inode in the listening set 804 is the corresponding listening inode specified in the flowchart. For example, if the listening set 804 of the parent of the relevant inode includes two listening inodes, then the loop would repeat two times.
  • The process determines whether the scope of the respective listening inode is recursive. If the scope of the respective listening inode is recursive, then the relevant inode is within the scope of the respective listening inode, and the process proceeds to state 1110, where the respective listening inode is added to the listening set 804 of the relevant inode. If, on the other hand, the scope of the respective listening inode is not recursive, then the process determines whether the scope of the listening inode is children (state 1106).
  • If the scope of the listening inode is neither recursive nor children, it is single, and the relevant inode is not within the scope of the listening inode because the listening inode is not the relevant inode. If the scope of the listening inode is children, then the process proceeds to state 1108. In state 1108, the participant module determines whether the respective listening inode is the parent of the relevant inode.
  • the process proceeds to state 1110 , where the respective listening inode is added to the listening set 804 of the relevant inode. If, on the other hand, the respective listening inode is not the parent of the relevant inode, then the relevant inode is not within the scope of the listening inode. In that case, the process proceeds to state 1112 , ending the corresponding loop of instructions for that respective listening inode.
  • the operations between states 1102 and 1112 execute for each respective listening inode in the listening set 804 of the parent of the relevant inode.
  • the participant module executes the process, though in other embodiments, other modules, such as the initiator module, may execute the process.
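  • The scope checks of states 1102 through 1112 might be sketched, purely for illustration, as the following self-contained C++ fragment. The names (Scope, scope_of, and so forth) are assumptions, and the small main mirrors the inode 17 update that is walked through with FIG. 12A below.

        // Sketch of the FIG. 11 update step (assumed names and data layout).
        #include <cstdint>
        #include <cstdio>
        #include <set>
        #include <unordered_map>

        enum Scope { SINGLE, CHILDREN, RECURSIVE };

        // Composite scope recorded for each listening inode (participant structures "506").
        std::unordered_map<uint64_t, Scope> scope_of = { {12, RECURSIVE}, {13, CHILDREN} };

        void update_cache_from_parent(uint64_t parent_lin,
                                      const std::set<uint64_t>& parent_set,
                                      std::set<uint64_t>& child_set) {
            for (uint64_t lin : parent_set) {                    // states 1102-1112
                Scope s = scope_of.at(lin);
                if (s == RECURSIVE)                              // state 1104
                    child_set.insert(lin);                       // state 1110
                else if (s == CHILDREN && lin == parent_lin)     // states 1106 and 1108
                    child_set.insert(lin);                       // state 1110
                // a single-scope listening inode is never inherited by a descendent
            }
        }

        int main() {
            // Inode 17's parent is inode 13, whose cache holds inode 12 (recursive)
            // and inode 13 (children); both qualify for inode 17.
            std::set<uint64_t> parent_set = {12, 13}, child_set;
            update_cache_from_parent(13, parent_set, child_set);
            for (uint64_t lin : child_set)
                std::printf("listening set of inode 17 includes inode %llu\n",
                            (unsigned long long)lin);
            return 0;
        }
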
  • FIGS. 12A and 12B illustrate one embodiment of validating the event caches of the inode tree 300 following certain events.
  • FIG. 12A illustrates stages of validating event caches in the inode tree 300 following a “size change” event on the file corresponding to inode 20 .
  • FIG. 8A illustrates the event caches of the inode tree 300 before the "size change" event.
  • the process first attempts to validate the event cache of inode 20 . Because the cache generation number 806 of inode 20 does not match the node generation number 502 , the process proceeds to determine whether inode 20 is the root. Because inode 20 is not the root, the process proceeds to validate the event cache of the parent, inode 17 .
  • the process proceeds to validate the event cache of the parent, inode 13 . (This is the second recursive call.) Because the cache generation number 806 of inode 13 does not match the node generation number 502 , and because inode 13 is not the root, the process attempts to validate the event cache of the parent, inode 12 .
  • Because inode 10 is not a listening inode, the process proceeds to state 1014, where the cache generation number 806 of inode 10 is updated to the value of the node generation number 502. Having terminated the fourth and final recursive call, the process begins to unwind.
  • the process proceeds to update the cache of the relevant inode with the cache of the parent of the relevant inode.
  • the relevant inode is inode 12 .
  • the process updates the event cache of inode 12 with the event cache of inode 10 .
  • Because the listening set 804 of inode 10 is empty, the process proceeds from state 1102 to state 1112 and returns to state 1010.
  • Because inode 12 is a listening inode, the process proceeds to state 1012, where inode 12 is added to the listening set 804 of inode 12.
  • the cache generation number 806 of inode 12 is then updated, and the algorithm unwinds down the recursive call stack, returning to state 1008 in the second nested call.
  • the relevant inode is inode 13 .
  • the process then updates the event cache of inode 13 with the cache of the parent, which is inode 12 . Because there is one listening inode in the listening set 804 of inode 12 , the operations between states 1102 and 1112 execute once. Because the scope of listening inode 12 is recursive, the process adds inode 12 to the listening set 804 of inode 13 and returns to state 1010 . Because inode 13 is a listening inode, inode 13 is added to the listening set 804 of inode 13 , which now includes 12 and 13 . The cache generation number 806 of inode 13 is then updated, and the recursive call stack unwinds another level to the first recursive call.
  • the relevant inode is 17 .
  • the process then updates the event cache of inode 17 with the event cache of the parent, which is inode 13 . Because there are two listening inodes in the listening set 804 of inode 13 , the operations between 1102 and 1112 are executed twice, once for inode 12 and then once for inode 13 . Because the scope of inode 12 is recursive, inode 12 is added to the listening set 804 of inode 17 , and the process begins the next loop with inode 13 as the respective listening inode. Because the scope of inode 13 is children and because inode 13 is the parent of inode 17 , inode 13 is added to the listening set 804 of inode 17 . After finishing both loops, the process returns to state 1010 . Because inode 17 is not a listening inode, the process proceeds to update the cache generation number 806 of inode 17 and then to return to the original call state.
  • the relevant inode is now the original relevant inode, which is inode 20 .
  • the process then updates the event cache of inode 20 with the event cache of the parent, inode 17. Because inode 17 includes two listening inodes in its listening set 804, the operations between states 1102 and 1112 are executed twice. Because the scope of the first listening inode, inode 12, is recursive, inode 12 is added to the listening set 804 of inode 20. Because the scope of listening inode 13 is not recursive and because listening inode 13 is not the parent of inode 20, the process returns to state 1010 without adding inode 13 to the listening set 804 of inode 20. Because inode 20 is not a listening inode, the process updates the cache generation number 806 of inode 20, which validates inode 20, the relevant inode.
  • FIG. 12A illustrates the state of each event cache in the inode tree 300 following the execution of the “size change” event on inode 20 .
  • inodes 10, 12, 13, 17, and 20 include up-to-date caches.
  • the remaining inodes, however, include out-of-date event caches.
  • FIG. 12B illustrates the up-to-date status of the event cache of each inode in the inode tree 300 following the execution of a “create” inode 22 event.
  • the event system first validates the parent directory, and then the new child inherits the event cache of the up-to-date parent. Thus, the process first attempts to validate the parent directory of inode 22 , which is inode 18 . Because the cache generation number 806 of inode 18 does not match the node generation number 502 , and because inode 18 is not the root, the process proceeds to validate the event cache of the parent, inode 15 .
  • At each level of the recursion, the respective relevant inode is updated with the up-to-date event cache of the parent, none of the respective relevant inodes are added to their own listening sets 804 (because there are no listening inodes in this branch of the tree), and the cache generation number 806 of each respective relevant inode is updated.
  • inode 22 inherits the up-to-date event cache of inode 18 .
  • the inodes 10 , 12 , 13 , 15 , 17 , 18 , 20 , and 22 have up-to-date caches, and the remaining inodes still have caches that are not up-to-date.
  • FIG. 8B illustrates the event caches of the inode tree 300 after all the event caches have been validated.
  • FIGS. 13 through 16 illustrate additional embodiments of the operation of state 908 illustrated in FIG. 9 .
  • Sending event messages to listening processes 410 includes two principal sets of operations, depicted in FIGS. 13 and 16 , respectively.
  • FIG. 13 illustrates one embodiment of a flowchart of operations executed by participant modules.
  • FIG. 16 illustrates one embodiment of a flowchart of operations executed by initiator modules upon receiving event messages from the participant modules.
  • the processes depicted in FIGS. 13 and 16 are executed by the participant and initiator modules, respectively, in other embodiments, these processes may be executed by other modules and/or executed by the same module.
  • FIG. 13 illustrates one embodiment of the flowchart of operations to send event messages to the listening nodes.
  • the operations between states 1302 and 1316 are executed for as many respective listening inodes as are in the listening set 804 of the relevant inode, where the relevant event is the event on the relevant inode being processed. If, in state 1304 , it is determined that the relevant event is within the event mask of the respective listening inode, the process proceeds to state 1306 . If, on the other hand, it is determined that the relevant event is not within the event mask of the respective listening inode, the process terminates the respective iteration.
  • the operations between states 1306 and 1314 are executed as many times as there are listening nodes for the respective listening inode. Thus, if there are two nodes 102 in the cluster 108 that are listening for the respective listening inode, the operations between state 1306 and state 1314 execute twice.
  • In state 1308, the process determines whether the relevant event is within the event mask of the respective listening node. If the relevant event is not within the event mask of the respective listening node, then the process terminates the respective iteration. If, on the other hand, the relevant event is within the event mask of the respective listening node, then the process proceeds to state 1310, where it determines whether the relevant inode is within the scope of the respective listening node.
  • the process terminates the respective iteration. If, on the other hand, the relevant inode is within the scope of the listening node, then the process proceeds to state 1312 , where the process creates and sends an event message to the respective listening node. As described above, in one embodiment, the participant module executes the process, though in other embodiments, other modules may execute the process.
  • FIG. 13 illustrates one embodiment of operations for sending event messages from the participant modules to the respective initiator modules, which correspond to the respective listening nodes. These operations are accomplished in a two-step process. First, it is determined whether the relevant event falls within the event mask of any of the listening inodes within the listening set 804 of the relevant inode. Second, it is determined, for any of the qualifying listening inodes, whether the relevant event falls within the event masks of listening nodes corresponding to the respective listening inode and whether the relevant inode is also within the scope of any of the listening nodes corresponding to the respective listening inode. It is recognized that in other embodiments, the process could first check the scope and then check the event mask.
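  • The two-level filter of FIG. 13 can be pictured as a pair of nested loops, as in the hedged C++ sketch below. The event-mask bits, the node structures, and the scope test are assumptions; the scope test relies on the fact that every entry in a listening set is the relevant inode itself or one of its ancestors, and the small main mirrors the first example of Table 1 discussed below.

        // Sketch of the participant-side send loop of FIG. 13 (assumed data model).
        #include <cstdint>
        #include <cstdio>
        #include <unordered_map>
        #include <vector>

        enum Scope { SINGLE, CHILDREN, RECURSIVE };
        using EventMask = uint32_t;
        enum : EventMask { EV_CREATE = 1u << 0, EV_SIZE_CHANGE = 1u << 1, EV_REMOVE = 1u << 2 };

        struct NodeStruct  { int node_id; Scope scope; EventMask mask; };                 // node structures ("508")
        struct Participant { EventMask composite_mask; std::vector<NodeStruct> nodes; };  // "506"

        std::unordered_map<uint64_t, uint64_t>    parent_of;         // child LIN -> parent LIN (assumed index)
        std::unordered_map<uint64_t, Participant> participant_table; // listening inode LIN -> entry ("500")

        // True when `relevant` falls inside scope `s` of `listening`, assuming `listening`
        // is the relevant inode itself or one of its ancestors (it came from a listening set).
        bool within_scope(uint64_t relevant, uint64_t listening, Scope s) {
            if (relevant == listening) return true;                        // covered by any scope
            if (s == CHILDREN) {
                auto it = parent_of.find(relevant);
                return it != parent_of.end() && it->second == listening;   // immediate child only
            }
            return s == RECURSIVE;                                         // any other descendent
        }

        void send_event_messages(uint64_t relevant_lin, EventMask event,
                                 const std::vector<uint64_t>& listening_set) {
            for (uint64_t listening_lin : listening_set) {                 // states 1302-1316
                const Participant& p = participant_table.at(listening_lin);
                if ((event & p.composite_mask) == 0) continue;             // state 1304
                for (const NodeStruct& n : p.nodes) {                      // states 1306-1314
                    if ((event & n.mask) == 0) continue;                   // state 1308
                    if (!within_scope(relevant_lin, listening_lin, n.scope)) continue;  // state 1310
                    std::printf("event message: listening inode %llu, node %d\n",       // state 1312
                                (unsigned long long)listening_lin, n.node_id);
                }
            }
        }

        int main() {
            // Mirrors the first example of Table 1 below: a "create" event on new inode 22,
            // whose listening set holds only inode 12; one node listens recursively for "create",
            // while the other node's listener does not include "create" in its event mask.
            parent_of = { {22, 18}, {18, 15}, {15, 12} };
            participant_table[12] = { EV_CREATE | EV_SIZE_CHANGE,
                                      { {1, RECURSIVE, EV_CREATE | EV_SIZE_CHANGE},
                                        {3, SINGLE,    EV_SIZE_CHANGE} } };
            send_event_messages(22, EV_CREATE, {12});   // prints one message, directed to Node 1
            return 0;
        }
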
  • FIG. 14 illustrates one embodiment of event messages.
  • participant modules send event messages 1400 to initiator modules to apprise them of events on relevant inodes for which respective listening processes 410 may be monitoring.
  • An exemplary event message 1400 may include several fields.
  • An event message may include a listening inode field 1402 . This field apprises the initiator module of the listening inode that triggered the event message.
  • An event message 1400 may also include a listening node field 1404.
  • the listening node is the node 102 that initiated at least one listener on the listening inode specified in the listening inode field 1402 . In some embodiments, there may be no field for the listening node.
  • the event message is merely directed to the appropriate listening node, and the event message 1400 does not identify the node to which it was directed.
  • the event message 1400 may also include a relevant inode field 1406 .
  • the relevant inode field 1406 identifies the inode upon which the event occurred that triggered the event message.
  • An event message 1400 may also include a relevant event field 1408 .
  • the relevant event field 1408 identifies the type of event that triggered the event message 1400 .
  • An event message 1400 may also include a minimum scope field 1410 .
  • the minimum scope identifies the minimum scope necessary to trigger the event message 1400 .
  • the minimum scope is the minimum scope of the listening inode that would have included the relevant inode for purposes of determining whether to send an event message. For instance, with regards to FIGS. 12A and 12B , if the listening inode is 13 and the relevant inode is 17 , then the minimum scope for triggering an event message would be the children scope. If, on the other hand, the listening inode is inode 12 and the relevant inode is inode 17 , then the minimum scope to trigger an event message would be the recursive scope. If, in yet another example, the listening inode were 13 and the relevant inode were also 13 , then the minimum scope for triggering the event message would be the single scope.
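  • For illustration, the event message fields described above might be collected into a structure such as the following C++ sketch; the field types are assumptions, and the structure is not intended as an actual wire format.

        // Sketch of an event message 1400 (assumed field types).
        #include <cstdint>

        enum Scope : uint8_t { SINGLE, CHILDREN, RECURSIVE };

        struct EventMessage {
            uint64_t listening_inode;   // field 1402: listening inode that triggered the message
            int      listening_node;    // field 1404: may be omitted when the message is simply
                                        //             directed to the appropriate listening node
            uint64_t relevant_inode;    // field 1406: inode on which the event occurred
            uint32_t relevant_event;    // field 1408: type of event that triggered the message
            Scope    minimum_scope;     // field 1410: minimum scope that would have triggered it
        };
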
  • FIG. 15 illustrates one embodiment of a flowchart of the operations to determine the minimum scope.
  • the process determines whether the relevant inode is the listening inode. If the relevant inode is the listening inode, then the participant module sets the minimum scope to single 1504 . If, on the other hand, the relevant inode is not the listening inode, then the process determines whether the relevant inode is the immediate child of the listening inode 1506 . If the relevant inode is the immediate child of the listening inode, then the process sets the minimum scope to children 1508 . If, on the other hand, the relevant inode is not the immediate child of the listening inode, then the process sets the minimum scope to recursive 1510 . In one embodiment, the participant module executes the process to determine minimum scope, though in other embodiments, other modules, such as the initiator module, may execute this process to determine the minimum scope.
  • For example, if the listening inode is inode 12 and the relevant inode is also inode 12, then the minimum scope is single. In other words, the minimum scope necessary for a listener on inode 12 to cause an event message to be sent for an event on inode 12 is the single scope.
  • If the listening inode is inode 12 and the relevant inode is inode 15, then the minimum scope is children. In other words, the minimum scope necessary for an event on the relevant inode 15 to trigger an event message to the listener attached to inode 12 would be the children scope.
  • If the listening inode is inode 12 and the relevant inode is inode 18, then the minimum scope would be recursive. In other words, the minimum scope necessary for an event on inode 18 to trigger an event message being sent to the listener attached to inode 12 would be the recursive scope.
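  • The minimum-scope determination of FIG. 15 reduces to two comparisons, as in the following illustrative, self-contained C++ sketch; the parent_of index is an assumption, and the small main replays the three inode 12 examples above.

        // Sketch of the FIG. 15 minimum-scope computation (assumed names).
        #include <cstdint>
        #include <cstdio>
        #include <unordered_map>

        enum Scope { SINGLE, CHILDREN, RECURSIVE };

        std::unordered_map<uint64_t, uint64_t> parent_of = { {15, 12}, {18, 15} }; // child -> parent

        Scope minimum_scope(uint64_t relevant, uint64_t listening) {
            if (relevant == listening) return SINGLE;                   // states 1502 and 1504
            auto it = parent_of.find(relevant);
            if (it != parent_of.end() && it->second == listening)
                return CHILDREN;                                        // states 1506 and 1508
            return RECURSIVE;                                           // state 1510
        }

        int main() {
            const char* names[] = { "single", "children", "recursive" };
            std::printf("listening 12, relevant 12: %s\n", names[minimum_scope(12, 12)]); // single
            std::printf("listening 12, relevant 15: %s\n", names[minimum_scope(15, 12)]); // children
            std::printf("listening 12, relevant 18: %s\n", names[minimum_scope(18, 12)]); // recursive
            return 0;
        }
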
  • FIG. 16 illustrates one embodiment of a flowchart of the operations to notify the listening processes 410 of relevant event messages.
  • relevant event messages are those event messages corresponding to relevant events for which listeners on the respective listening node are listening.
  • Relevant event messages are those event messages that are queued in an event queue of a listener. In the illustrated embodiment, not all relevant events result in relevant event messages. In other words, relevant events may trigger an event message that is never queued. This scenario is discussed in more detail below.
  • the initiator module receives an event message from the participant module. In some embodiments, the participant module and the initiator module may reside on the same physical node 102 , even though they are different logical modules.
  • the operations between states 1604 and 1614 are repeated for as many times as there are process structures 406 for the listening inode.
  • the process determines the listening inode from the event message 1400 . By consulting the respective initiator hash table 400 , the process determines which listening processes 410 are listening to the listening inode. For example, with reference to FIG. 4A , there are two processes, Process 3000 and Process 3001 , listening to inode 12 . Thus, in this example, the operations between 1604 and 1614 would be executed twice, once each time for each listening process (or, in other words, each process structure 406 corresponding to a particular listening process 410 ).
  • In state 1606, the process determines whether the relevant event, delivered by the event message 1400, is within the event mask of the listening process. If it is not within the event mask of the listening process, then the process proceeds to state 1614, and the respective iteration terminates. If, on the other hand, the relevant event is within the event mask of the listening process, then the process determines, in state 1608, whether the minimum scope that could have generated the event message is less than or equal to the scope of the listening process. In the illustrated embodiment, the single scope is less than the children scope, and the children scope is less than the recursive scope. State 1606 tests whether the listener is listening for the event; state 1608 tests whether the listener is listening to the relevant inode.
  • event messages 1400 may be sent to an initiator module without the event message 1400 being queued in one of the event queues 408 of a corresponding listening structure 406 . This is due to the fact that a participant evaluated the composite scope and the composite event mask of all listeners for a particular listening inode. Some listeners, however, may be listening for different events within different scopes. Therefore, sometimes event messages 1400 will be routed to a respective initiator module without being added to the event queue 408 of any process structure 406 .
  • the relevant event is added to the event queue 408 of the respective process structure 406.
  • the respective event queue 408 may also be coalesced in some embodiments.
  • the process determines whether the event message 1400 is repetitive of other event messages. If the event message 1400 is repetitive, then it is not added to the respective event queue 408 .
  • the listening process is woken up and notified there are events available in the respective event queue 408 .
  • the initiator module executes the process illustrated in FIG. 16 (as distinguished from the listening process), though in other embodiments, other modules, such as the participant module, may execute the process.
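  • For illustration, the initiator-side filtering of FIG. 16 might be sketched in C++ as follows. The data model is assumed, the ordering single < children < recursive is encoded directly in the enum, and the coalescing of repetitive event messages is reduced to a simple duplicate check; a real implementation could coalesce more aggressively.

        // Sketch of the initiator-side handling of FIG. 16 (assumed data model).
        #include <cstdint>
        #include <deque>
        #include <unordered_map>
        #include <vector>

        enum Scope { SINGLE = 0, CHILDREN = 1, RECURSIVE = 2 };   // single < children < recursive

        struct EventMessage {
            uint64_t listening_inode;
            uint64_t relevant_inode;
            uint32_t relevant_event;
            Scope    minimum_scope;
        };

        struct ProcessStruct {                       // process structure ("406")
            int      pid;
            Scope    scope;
            uint32_t event_mask;
            std::deque<EventMessage> event_queue;    // event queue ("408")
            bool     needs_wakeup = false;
        };

        // Initiator hash table ("400"): listening inode LIN -> listening processes.
        std::unordered_map<uint64_t, std::vector<ProcessStruct>> initiator_table;

        void receive_event_message(const EventMessage& msg) {                    // state 1602
            for (ProcessStruct& proc : initiator_table[msg.listening_inode]) {   // states 1604-1614
                if ((msg.relevant_event & proc.event_mask) == 0) continue;       // state 1606
                if (msg.minimum_scope > proc.scope) continue;                    // state 1608
                bool repetitive = !proc.event_queue.empty() &&                   // simple coalescing:
                    proc.event_queue.back().relevant_inode == msg.relevant_inode &&
                    proc.event_queue.back().relevant_event == msg.relevant_event;
                if (!repetitive)
                    proc.event_queue.push_back(msg);                             // queue the event
                proc.needs_wakeup = true;          // wake the listening process and notify it
            }
        }
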
  • Table 1 illustrates one embodiment of results of events on the particular inodes within the inode tree 300 .
  • the first example in Table 1 is a “create” event for a new inode, inode 22 .
  • the up-to-date listening set 804 of inode 22 includes only listening inode 12.
  • the operations between states 1302 and 1316 would execute once, for the single listening inode within the listening set 804 of the relevant inode 22 . Because the “create” event is within the event mask of listening inode 12 , as illustrated in FIG. 5 , the process progresses from state 1304 to state 1306 .
  • the operations between state 1306 and state 1314 execute twice, once for each of the two listening nodes.
  • the process progresses from state 1308 to state 1310 because the “create” event is within the event mask of listening Node 1 , as illustrated in FIG. 5 .
  • the relevant inode 22 is within the scope of the listening node, and, in state 1312 , the process creates and sends an event message to Node 1 .
  • For the second listening node, Node 3, the process determines that the "create" event is not within the event mask of the listening node, which causes the iteration to terminate without sending an event message to Node 3 with regard to the relevant event.
  • FIG. 16 illustrates one embodiment of a flowchart of operations that the initiator module executes upon receiving an event message 1400 from a participant node, in state 1602 .
  • the operations between states 1604 and 1614 execute twice.
  • the respective listening process is Process 3000 .
  • Because the "create" event is within the event mask of the listener requested by Process 3000, the process proceeds from state 1606 to state 1608.
  • Because the minimum scope that would have generated the event message is equal to the scope requested by Process 3000, as both scopes are recursive, the event message 1400 is added to the event queue 408 of the process structure 406 corresponding to Process 3000.
  • the respective event queue 408 may also be coalesced in some embodiments.
  • the process determines whether the event message 1400 is repetitive of other event messages. If the event message 1400 is repetitive, then it is not added to the respective event queue 408 .
  • the listening process, Process 3000, is woken up and notified that there are events available in the respective event queue 408.
  • the listening process is Process 3001 . Because the “create” event is not within the event mask of the listener requested by Process 3001 , the initiator module ends.
  • the second example in Table 1 is a “size change” to inode 12 .
  • the up-to-date listening set 804 of inode 12 comprises only listening inode 12 .
  • both Nodes 1 and 3 listen for “size change” events.
  • inode 12 is within the scope of the respective node structure 508 for both nodes 1 and 3 because inode 12 is the listening inode.
  • the participant module sends event messages 1400 to both Nodes 1 and 3 . Because both Processes 2004 and 3001 listen for the “size change” event, the event messages 1400 sent to Nodes 1 and 3 are placed into the corresponding event queues 408 of the respective listener structures 406 .
  • the third example in Table 1 is a “size change” to inode 16 .
  • the up-to-date listening set 804 for inode 16 includes listening inodes 12 , 13 , and 16 .
  • the participant module sends an event message 1400 only to Node 1 because the listeners on Node 3 attached to listening inode 12 specify the single scope, and inode 16 is not inode 12 .
  • the participant module sends an event message 1400 to Node 3 , as the listeners for inode 13 on Node 3 listen for “size change” events and have a composite scope of children, and inode 16 is a child of inode 13 .
  • the participant module sends an event message 1400 to Node 3 , as the listeners for inode 16 on Node 3 listen for “size change” events and have a composite scope of recursive.
  • With respect to the event message 1400 specifying inode 12 as the listening inode, the event message 1400 is not queued because the scope of Process 3001 is only the children scope, and the minimum scope necessary to trigger an event message for an event on inode 16 based on listening inode 12 is the recursive scope, which is greater than the children scope.
  • Although Process 3000 has a recursive scope, it listens only for the "create" event, not the "size change" event. Thus, this event message 1400 reaches the respective initiator module, but is never queued.
  • the event message 1400 directed to Node 3 with respect to listening inode 13 is also not queued.
  • This is because Process 2000 does not listen for the "size change" event, and because Processes 2001 and 2002 have single scope while the minimum scope required to trigger an event message from inode 16 arising from listening inode 13 is the children scope, which is greater than the single scope.
  • the event message 1400 sent to Node 3 with respect to listening inode 16 is placed on the corresponding event queue 408 of the respective process structure 406 .
  • Process 2003 listens to events within the recursive scope of inode 16 , and listens for “size change” events. Because a “size change” event on inode 16 is within the scope and the event mask of the listener attached by Process 2003 , the respective event message 1400 is queued in the event queue 408 corresponding to Process 2003 .
  • Example 4 in Table 1 illustrates a “size change” event to inode 13 .
  • the listening set 804 of inode 13 includes inodes 12 and 13 .
  • Node 1 listens for all events within the recursive scope of inode 12 and also listens for the "size change" event. Therefore, an event message 1400 is sent to Node 1.
  • With respect to listening inode 12, however, no event message is sent to Node 3.
  • With respect to listening inode 13, because the listeners on Node 3 listen for the "size change" event, and because inode 13 is within the scope of listening inode 13, an event message 1400 is sent to Node 3.
  • the event message 1400 sent to Node 1 is queued in the event queue 408 corresponding to Process 3001 because inode 13 is within the children scope of inode 12 and because Process 3001 listens for the “size change” event.
  • the same event is not queued in the event queue 408 corresponding to Process 3000 because that listener listens only for “create” events.
  • As for the event message 1400 sent to Node 3 with respect to inode 13, the event message 1400 is queued in the event queues 408 corresponding to Processes 2001 and 2002 because inode 13 is within the single scope of inode 13 and because Processes 2001 and 2002 listen for the "size change" event.
  • the fifth example in Table 1 is a “remove” event on inode 13 .
  • the up-to-date listening set 804 of inode 13 comprises listening inodes 12 and 13 .
  • none of the nodes listening to the listening inode 12 listens for the “remove” event.
  • This is illustrated in the participant structure 506 for inode 12 .
  • the corresponding event mask does not include the “remove” event.
  • the participant structure 506 for inode 13 does include the “remove” event.
  • An event message 1400 is created and sent to Node 3 because inode 13 is within the children scope of inode 13 and because the “remove” event is within the event mask of the node structure 508 corresponding to node 3 .
  • Only Process 2000 listens for the "remove" event. Because inode 13 is within the children scope of listening inode 13, the "remove" event on inode 13 is queued in the event queue 408 corresponding to Process 2000.
  • FIG. 17 illustrates one embodiment of a flowchart of the operations to add a node 102 to a cluster 108 , and, accordingly, to update the participant hash tables 500 .
  • the process acquires an exclusive event lock, preventing other nodes from reading from or writing to the system.
  • FIG. 17 illustrates a group of operations, referred to collectively as a process.
  • the nodes 102 within the cluster 108 send the contents of their respective initiator hash tables 400 to the other nodes 102.
  • the process sends messages to the participant modules to build new participant hash tables 500 based on the sent initiator hash tables 400 .
  • If all of the sends were successful, the process instructs the participant modules to swap in the new participant hash tables, in state 1710. If all of the sends were not successful, then the process sends an error message to the Group Management Protocol, in state 1712. After completing state 1710 or state 1712, the process releases the exclusive lock in state 1714.
  • the initiator module executes the process illustrated in FIG. 17 , though in other embodiments, other modules, such as the participant module, may execute the process.
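  • A highly simplified sketch of this rebuild-and-swap sequence appears below. The lock handling, messaging, and Group Management Protocol calls are stubs introduced purely for illustration and do not reflect the actual cluster interfaces.

        // Sketch of the FIG. 17 rebuild-and-swap sequence (stubbed helpers, assumed names).
        #include <cstdio>
        #include <vector>

        bool send_initiator_table_to(int node_id)          { (void)node_id; return true; }  // stub
        void build_new_participant_table_on(int node_id)   { (void)node_id; }               // stub
        void swap_in_new_participant_table_on(int node_id) { (void)node_id; }               // stub
        void report_error_to_group_management_protocol()   { std::puts("GMP error"); }      // stub

        void add_node_to_cluster(const std::vector<int>& nodes) {
            // Acquire the exclusive event lock (not modeled here).
            bool all_sends_ok = true;
            for (int n : nodes) all_sends_ok = send_initiator_table_to(n) && all_sends_ok;
            for (int n : nodes) build_new_participant_table_on(n);         // rebuild from the sent tables
            if (all_sends_ok)
                for (int n : nodes) swap_in_new_participant_table_on(n);   // state 1710
            else
                report_error_to_group_management_protocol();               // state 1712
            // Release the exclusive event lock (state 1714).
        }
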
  • Although the data structures described herein have been addressed to a distributed system, some embodiments of the invention may be used in a single file system. In such a system, there may be only an initiator hash table, and the processes described above may all reference it. Additionally or alternatively, the data structures may also be organized such that the queue of events appears on the participant side, rather than the initiator side. Moreover, the event system described above explained that some event messages may arrive at the initiator, but may never be queued. In other embodiments, the data structures could be changed to track listener processes on the participant side. The above-mentioned alternatives are examples of other embodiments, and they do not limit the scope of the invention. It is recognized that a variety of data structures with various fields and data sets may be used. In addition, other embodiments of the flow charts may be used.

Abstract

In one embodiment, systems and methods are provided for tracking events wherein an event system monitors certain areas of a system. When an event occurs in one area of the system, the event system notifies the processes listening to that area of the system of the event.

Description

    LIMITED COPYRIGHT AUTHORIZATION
  • A portion of the disclosure of this patent document includes material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyrights whatsoever.
  • FIELD OF THE INVENTION
  • This invention relates generally to systems and methods of notifying listeners of events.
  • BACKGROUND
  • The increase in processing power of computer systems has ushered in a new era in which information is accessed on a constant basis. One response has been to distribute processing requests across multiple nodes or devices. A distributed architecture allows for more flexible configurations with respect to factors such as speed, bandwidth management, and other performance and reliability parameters.
  • The distributed architecture allows multiple nodes to process incoming requests. Accordingly, different process requests may be handled by different nodes. Problems may occur, however, when one of the nodes modifies information that affects other nodes.
  • Because of the foregoing challenges and limitations, there is an ongoing need to improve the manner in which nodes of a distributed architecture process events.
  • SUMMARY OF THE INVENTION
  • The systems and methods generally relate to notifying listeners of events.
  • In one embodiment, an event listening system is provided. The event listening system may include a file system including a plurality of files, the plurality of files logically stored in a tree; for each of the plurality of files, a first data structure configured to track a set of listening files that are listening for events that affect the corresponding file; a plurality of processes that each listen for events that affect at least one of the plurality of files; a second data structure configured to track, for each of the plurality of files, which of the plurality of processes are listening to each of the files; a listening module configured to receive an identifier for a first file of the plurality of files and to determine whether the first file is relevant to any of the plurality of processes using the first data structure and the second data structure; a traverse module configured to traverse a first set of first data structures that correspond to a subset of the plurality of files that represent one branch of the tree; and an update module configured to update at least one of the corresponding first data structures of the file in at least one traversed level by reviewing a scope of at least one of the listening files of the first data structure that corresponds to the file's parent.
  • In a further embodiment, a method for listening for events is provided. The method may include logically storing a plurality of files in a tree; for each of the plurality of files, tracking a set of listening files that are listening for events that affect the corresponding file; storing a plurality of processes that each listen for events that affect at least one of the plurality of files; for each of the plurality of files, tracking which of the plurality of processes are listening to each of the files; receiving an identifier for a first file of the plurality of files; determining whether the first file is relevant to any of the plurality of processes using the first data structure and the second data structure; traversing a first set of first data structures that correspond to a subset of the plurality of files that represent one branch of the tree; and updating at least one of the corresponding first data structures of the file in at least one traversed level, wherein updating includes reviewing a scope of at least one of the listening files of the first data structure that corresponds to the file's parent.
  • In an additional embodiment, a system for listening for events is provided. The system may include a file structure comprising a plurality of files that are logically stored in a tree; for each of the plurality of files, a data structure corresponding to each file, the data structure comprising: a set of identifiers of the plurality of files that are listening for events that affect the corresponding file; and an indication of the currentness of the data structure.
  • In a further embodiment, a method for listening for events is provided. The method may include logically storing a plurality of files in a tree; and for each of the plurality of files, storing a data structure corresponding to each file, the data structure comprising a set of identifiers of the plurality of files that are listening for events that affect the corresponding file and an indication of the currentness of the data structure.
  • In an additional embodiment, a system for queuing event messages in a file system is provided. The system may include a plurality of processes that each listen for events that affect at least one of a plurality of files; a first data structure configured to determine, for each of the plurality of processes, a set of listening files to which each of the plurality of processes is listening; and a message module configured to receive an event message related to a first file of the plurality of files, the event message including an indication of a minimum scope that would have generated the event message, to search the first data structure to determine a first subset of the plurality of processes that listen for files that are affected by the event using the sets of listening files, to determine a second subset of the first subset by removing, from the first subset, processes whose scope is less than the minimum scope of the event message, and to inform the second subset of the event message.
  • For purposes of summarizing this invention, certain aspects, advantages, and novel features of the invention have been described herein. It is to be understood that not necessarily all such advantages may be achieved in accordance with any particular embodiment of the invention. Thus, the invention may be embodied or carried out in a manner that achieves or optimizes one advantage or group of advantages as taught herein without necessarily achieving other advantages as may be taught or suggested herein.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIGS. 1A and 1B illustrate, respectively, one embodiment of physical and logical connections of one embodiment of nodes in a system.
  • FIG. 2 illustrates one embodiment of the elements of an inode data structure in a distributed file system.
  • FIGS. 3A, 3B, and 3C illustrate one embodiment of the respective scope of single, children, and recursive listeners.
  • FIGS. 4A, 4B and 4C illustrate one embodiment of initiator hash tables.
  • FIG. 5 illustrates one embodiment of a participant hash table.
  • FIGS. 6A and 6B illustrate one embodiment of the scope of listeners from the perspective of processes and nodes, respectively.
  • FIG. 7 illustrates one embodiment of a flowchart of operations to add an additional listener to an embodiment of the system.
  • FIGS. 8A and 8B illustrate one embodiment of the scope of exemplary listeners (from the perspective of nodes) following the addition of another listener.
  • FIG. 9 illustrates one embodiment of a top-level flowchart of operations for notifying listeners of an event in an embodiment of the system.
  • FIG. 10 illustrates one embodiment of a flowchart of operations to validate the event cache of an inode.
  • FIG. 11 illustrates one embodiment of a flowchart of operations to update the cache of a child inode with the cache of the parent inode.
  • FIGS. 12A and 12B illustrate one embodiment of the status of caches following a “size change” and a “create” event, respectively.
  • FIG. 13 illustrates one embodiment of a flowchart of operations of the participant module to send event messages to listening nodes.
  • FIG. 14 illustrates one embodiment of two event messages.
  • FIG. 15 illustrates one embodiment of a flowchart of operations to determine the minimum scope.
  • FIG. 16 illustrates one embodiment of a flowchart of operations of the initiator module to receive an event message and to notify listening processes accordingly.
  • FIG. 17 illustrates one embodiment of a flowchart of operations to update initiator and participant hash tables following the addition of a node to the system.
  • These and other features will now be described with reference to the drawings summarized above. The drawings and the associated descriptions are provided to illustrate embodiments of the invention and not to limit the scope of the invention. Throughout the drawings, reference numbers may be re-used to indicate correspondence between referenced elements. In addition, the first digit of each reference number generally indicates the figure in which the element first appears.
  • DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
  • Systems and methods which represent one embodiment of an example application of the invention will now be described with reference to the drawings. Variations to the systems and methods which represent other embodiments will also be described.
  • For purposes of illustration, some embodiments will be described in a context of a distributed file system. The present invention is not limited by the type of environment in which the systems and methods are used, however, and the systems and methods may be used in other environments, such as, for example, other file systems, other distributed systems, the Internet, the World Wide Web, a private network for a hospital, a broadcast network for a government agency, an internal network of a corporate enterprise, an intranet, a local area network, a wide area network, a wired network, a wireless network, and so forth. Some of the figures and descriptions, however, relate to an embodiment of the invention wherein the environment is that of a distributed file system. It is also recognized that in other embodiments, the systems and methods may be implemented as a single module and/or implemented in conjunction with a variety of other modules and the like. Moreover, the specific implementations described herein are set forth in order to illustrate, and not to limit, the invention. The scope of the invention is defined by the appended claims.
  • I. OVERVIEW
  • In one embodiment, systems and methods are provided for tracking events in a distributed file system. In this embodiment, an event system monitors certain areas of a file system. When an event occurs in one area of the distributed file system, the event system notifies the processes listening to that area of the distributed file system of the event. One example of a listening application is a directory management application. When the directory management application opens a window on a particular directory, it may instantiate a listener on that directory. When another application, such as a word processor, creates a new file in that directory, the event system notifies the listening application, which can then immediately update the window to show the new file. Another example of a listening application is an indexing service which listens to a subdirectory recursively. An indexing service may, for example, store an index for words and phrases appearing within a certain group of documents. The index may be used to enhance document searching functionality. Whenever the service is notified of an event, it may re-index the file or files corresponding to that event. An event system may also be used internally by the distributed file system to monitor configuration files and to take appropriate actions when they change. In general, a listening process, which includes an executed instantiation of an application, may refer to the client process that requests a listener on the distributed file system, and the listener may refer to the data structures initiated by the event system to monitor and report events to the listening process.
  • In one embodiment of the event system illustrated in FIGS. 1 through 17, there are three general areas that the event system implements: (1) maintaining a cluster-wide set of listeners; (2) determining whether a specified file is being listened to; and (3) notifying listeners of those files of the events. Before describing these areas in more detail, some preliminary background will be provided regarding the components and connections of the exemplary distributed network, a metadata element of the exemplary file system, and event listeners maintained by the event system.
  • A. Components and Connections
  • An event system may be designed for a distributed network architecture. FIG. 1A illustrates the connections of elements in one embodiment of a distributed system 100. In the illustrated embodiment, there are three nodes 102. These nodes are connected through a network 104. Client processes access the distributed system 100 through the network 104, using, for example, client machines 106. Although in the illustrated embodiment there is a single network 104 connecting both nodes 102 and client machines 106, in other embodiments there may be separate networks. For instance, there may be a front-end network connecting the nodes 102 to the client machines 106 and a back-end network for inter-node communication.
  • FIG. 1B illustrates one possible logical connection of three nodes 102, forming a cluster 108. In the illustrated embodiment, the nodes 102 in cluster 108 are connected in a fully connected topology. A fully connected topology is a network where each of the nodes in the network is connected to every other node in the network. Although in the illustrated embodiment the nodes 102 are arranged in a fully connected network topology, in other embodiments of the invention, the cluster 108 of nodes 102 may be arranged in other topologies, including, but not limited to, the following topologies: ring, mesh, star, line, tree, bus topologies, and so forth. It will be appreciated by one skilled in the art that various network topologies may be used to implement different embodiments of the invention. In addition, it is recognized that the nodes 102 may be connected directly, indirectly, or a combination of the two, and that all of the nodes may be connected using the same type of connection or one or more different types of connections. It is also recognized that in other embodiments, a different number of nodes may be included in the cluster, such as, for example, 2, 16, 83, 6, 883, 10,000, and so forth.
  • In one embodiment, the nodes 102 are interconnected through a bi-directional communication link where messages are received in the order they are sent. In one embodiment, the link comprises a “keep-alive” mechanism that quickly detects when nodes or other network components fail, and the nodes are notified when a link goes up or down. In one embodiment, the link includes a TCP connection. In other embodiments, the link includes an SDP connection over Infiniband, a wireless network, a wired network, a serial connection, IP over FibreChannel, proprietary communication links, connection based datagrams or streams, and/or connection based protocols.
  • B. Distributed File System
  • One example implementation of a distributed architecture is a distributed file system. An event system may be implemented for a distributed file system, notifying listening processes of certain events on files and directories within the file system. In one embodiment of a distributed file system, metadata structures, also referred to as inodes, are used to monitor and manipulate the files and directories within the system. An inode is a data structure that describes a file or directory and may be stored in a variety of locations including on disk and/or in memory. The inode in-memory may include a copy of the on-disk data plus additional data used by the system, including the fields associated with the data structure and/or information about the event system. The nodes of a distributed system, such as nodes 102, may implement an inode cache. Such a cache may be implemented as a global hash table that may be configured to store the most recently used inodes. In one implementation, the inode cache may store more than 150,000 inodes and the inode may be around 1 KB of data in memory though it is recognized that a variety of different implementations may be used with caches of different sizes and inodes of different sizes. Information for an event system may include information regarding those listeners that are monitoring certain events on the file or directory corresponding to a particular inode. In one embodiment of an event system, this information is referred to as the event cache.
  • FIG. 2 illustrates one embodiment of an in-memory inode 200 of a distributed file system. In the illustrated embodiment, the inode 200 includes several fields. The inode 200 includes a mode field 202, which indicates, for example, either a file or directory. A file is a collection of data stored in one unit under a filename. A directory, similar to a file, is a collection of data stored in one unit under a directory name. A directory, however, is a specialized collection of data regarding elements in a file system. In one embodiment, a file system is organized in a tree-like structure. Directories are organized like the branches of trees. Directories may begin with a root directory and/or may include other branching directories. Files resemble the leaves or the fruit of the tree. Files, typically, do not include other elements in the file system, such as files and directories. In other words, files do not typically branch. Although in the illustrated embodiment an inode represents either a file or a directory, in other embodiments, an inode may include metadata for other elements in a distributed file system, in other distributed systems, or in other file systems.
  • The exemplary inode 200 also includes a LIN field 204. In one embodiment of a distributed file system, the LIN, or Logical Inode Number, is a unique identifier for the file or directory. It uniquely refers to the on-disk data structures for the file or directory. It may also be used as the index for the in-memory inodes, such as the index for a cache of in-memory inodes stored on nodes 102. In the exemplary inode 200, the LIN is 10. Accordingly, the exemplary inode 200 would be referred to as “inode 10.”
  • The exemplary inode 200 also includes fields to implement an event cache, including a listening set field 206 and a cache generation number field 208. The listening set provides information about which other inodes are listening to this particular inode. An event system may use an inode's listening set to help notify listeners of particular events. The listening set of an inode may include a set of LINs, referring to a set of inodes, including perhaps the inode itself. If, for example, inodes 12, 13, and 16, or otherwise stated, the inodes whose LINs are 12, 13, and 16, respectively, are inodes being listened to by listeners whose scope is broad enough to include inode 10, then the listening set field 206 would include inodes 12, 13, and 16. In the illustrated embodiment, however, the listening set field 206 is empty, indicating that there are no listeners whose scope includes inode 10. The scope of listeners in an exemplary directory system with inode 10 is illustrated by FIG. 3.
  • Another element of the exemplary event cache described herein is the cache generation number. The exemplary inode 200, therefore, also includes a cache generation number field 208. As will be discussed in further detail below with reference to FIGS. 8A and 8B, an event system may use the cache generation number to identify whether the listening set of an inode, such as inode 10, is up-to-date. The exemplary inode 200 may also include other fields 210 and/or a subset of the fields discussed above.
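  • For illustration, the in-memory inode fields described above might be sketched in C++ as follows; the types are assumptions, and the structure omits the many other fields 210 that a real inode would carry.

        // Sketch of an in-memory inode 200 with its event cache fields (assumed types).
        #include <cstdint>
        #include <set>

        enum class Mode { REGULAR_FILE, DIRECTORY };

        struct InMemoryInode {
            Mode     mode;                      // field 202: file or directory
            uint64_t lin;                       // field 204: Logical Inode Number
            std::set<uint64_t> listening_set;   // field 206: LINs of inodes listening to this inode
            uint64_t cache_generation;          // field 208: cache generation number
            // field 210: other fields are omitted from this sketch
        };
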
  • One example of a distributed file system, in which embodiments of event systems and methods described herein may be implemented, is described in U.S. patent application Ser. No. 10/007,003 entitled “Systems and Methods for Providing a Distributed File System Utilizing Metadata to Track Information About Data Stored Throughout the System,” filed Nov. 9, 2001 which claims priority to Application No. 60/309,803 filed Aug. 3, 2001, U.S. patent application Ser. No. 10/281,467 entitled “Systems and Methods for Providing A Distributed File System Incorporating a Virtual Hot Spare,” filed Oct. 25, 2002, and U.S. patent application Ser. No. 10/714,326 entitled “Systems And Methods For Restriping Files In A Distributed File System,” filed Nov. 14, 2003, which claims priority to Application No. 60/426,464, filed Nov. 14, 2002, all of which are hereby incorporated by reference herein in their entirety.
  • C. Event Listeners
  • In one embodiment, a listener is a logical construction of data and operations that monitors events on a particular resource or data structure. In a file system, listeners may be assigned to a particular file or directory. A single listener, however, may listen for events on more than just one resource, or more than one file or directory. Thus, in one embodiment, a listener may be defined with a particular scope. Events on a file or directory within the scope of a particular listener may be monitored with the other files and/or directories within the scope of that listener.
  • FIGS. 3A, 3B, and 3C illustrate one embodiment of the respective scope of single, children, and recursive listeners on an inode tree 300. In the illustrated embodiment, an inode tree, such as inode tree 300, corresponds to a file directory system. Each inode in the tree corresponds to a file or directory in the file directory system. Throughout the drawings representing features of a file system, circles are used to denote directories, and squares are used to denote files.
  • FIG. 3A illustrates one embodiment of the scope of a single listener. A process requesting a single listener is requesting notification of events on only the specified inode. In the illustrated embodiment, a listening process has requested event messages for events that occur on the directory corresponding to the inode 12.
  • FIG. 3B illustrates one embodiment of the scope of a children listener. A listening process requesting a children listener is requesting notification of events on the specified inode and its children inodes. In the illustrated embodiment, inode 12 (directory) has three immediate descendents, or children: 13 (directory), 14 (file), and 15 (directory). A process listening to inode 12 with children scope listens for events that occur on inode 12 and all of the immediate descendents, or children, of inode 12.
  • FIG. 3C illustrates one embodiment of the scope of a recursive listener. A listening process requesting a recursive listener is requesting notification of events on the specified inode and its descendents, regardless of the immediacy. In the illustrated embodiment, inode 12 is being listened to recursively. A process listening to inode 12 recursively listens for events that occur on inode 12 and its descendents in the inode tree 300.
  • Although in the illustrated embodiments, only single, children, and recursive listeners are identified, one skilled in the art will appreciate that many different types of listening scopes may be defined in accordance with embodiments of the invention. For instance, another embodiment of the invention may define a grandchildren scope which listens to the events on the specified inode and its children and grandchildren, an only grandchild scope which listens to the events on the specified inode and its grandchildren, or a parent scope which listens to an inode and its parent. (Grandchildren inodes would be those inodes that are two-generation descendents of the originating inode.) In the illustrated inode tree 300, a grandchildren listener on inode 12 would listen to events on inodes 12, 13, 14, 15, 16, 17, and 18, an only grandchildren listener on inode 12 would listen to events on inodes 12, 16, 17, and 18, and a parent listener on inode 12 would listen to events on inodes 12 and 10. Alternatively or additionally, a listening scope may be defined that includes only files or only directories. Other possible listening scopes may also be defined.
  • As mentioned above, in one embodiment of the event system described herein, there are three main areas that the event system implements: (1) maintaining a cluster-wide set of listeners; (2) deciding if a specified file is being listened to; and (3) notifying listeners of those events on files that the listeners are listening for. Each of these three areas is described in further detail below.
  • II. MAINTAINING A CLUSTER-WIDE SET OF LISTENERS
  • FIGS. 4A, 4B, 4C, and 5 illustrate one embodiment of data structures that an event system may employ to maintain a cluster-wide set of listeners in a distributed system. In one embodiment of an event system, there are two logical entities that implement the listeners for a distributed file system: initiators and participants. In the exemplary embodiment of the event system, each listener is instantiated on one particular node 102 of cluster 108. This is the initiator node for that listener. A node 102 may be the initiator node for multiple listeners. In one embodiment, each node 102 keeps track of the instantiated listeners on that node in a single initiator hash table, which also keeps a queue, for each listener, of the listened-for events. Each node 102 may also execute certain operations to maintain the instantiated listeners and to notify nodes 102 of cluster 108 of any changes to the group of instantiated listeners, including additional listeners. Thus, the term “initiator” may be used to refer to the node upon which a listener is instantiated, the data structure that stores relevant events for that listener, and/or a module that executes operations related to the instantiated listeners.
  • In the exemplary event system, each node 102 of cluster 108 is a participant node for all of the instantiated listeners. The participants monitor each event to determine whether a particular node 102 is listening to the relevant inode for that event. In the exemplary embodiment, the relevant inode is the inode that is affected by the current event, and the particular node 102 listening to the relevant inode is called a listening node. Participants notify listening nodes of listened-for events with an event message. Although the exemplary event system contemplates nodes 102 acting as both initiators and participants, in other embodiments, certain nodes 102 within the cluster may be defined exclusively as initiators or participants. FIG. 1B illustrates one embodiment of a node that includes a participant module and an initiator module, though it is recognized that in some embodiments one or more of the nodes may include a participant module or may instead include an initiator module.
  • As used herein, the word module refers to logic embodied in hardware or firmware, or to a collection of software instructions, possibly having entry and exit points, written in a programming language, such as, for example, C or C++. A software module may be compiled and linked into an executable program, installed in a dynamic link library, or may be written in an interpreted programming language such as, for example, BASIC, Perl, or Python. It will be appreciated that software modules may be callable from other modules or from themselves, and/or may be invoked in response to detected events or interrupts. Software instructions may be embedded in firmware, such as an EPROM. It will be further appreciated that hardware modules may be comprised of connected logic units, such as gates and flip-flops, and/or may be comprised of programmable units, such as programmable gate arrays or processors. The modules described herein are preferably implemented as software modules, but may be represented in hardware or firmware.
  • A. Initiator Data Structures
  • FIGS. 4A, 4B, and 4C illustrate one embodiment of the initiator data structures for each node 102 in the cluster 108. In general, the initiator module receives requests from processes to instantiate certain listeners. In the exemplary embodiment, a listener is defined by a three-element tuple comprising an identifier for the listening inode, (for example, the inode's LIN), the listener's scope, and the list of the type of events to be listened for. The listening inode is the inode to which the listener is directed. As explained above with reference to FIGS. 3B and 3C, a single listener may listen for events to more than just one inode. Therefore, the listening inode is the point of reference from which to calculate the scope of the listener. For instance, with respect again to FIG. 3C, if the listening inode is inode 12 and the scope is recursive, then the listener listens for events that occur on inode 12 and its descendents, inodes 13, 14, 15, 16, 17, 18, 19, 20, and 21. If, alternatively, the listening inode is 18 and the scope is similarly recursive, then the listener listens for events that occur on inode 18 and its lone descendent, inode 21. In one embodiment, listeners may not listen for every event, choosing instead to filter the events they listen for with a list of listening events, collectively referred to as an event mask. It is recognized that in other embodiments, other points of reference, scopes, and event masks may be used.
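  • For illustration only, the three-element tuple that defines a listener might be rendered in C++ as follows; the names and event-mask bits are assumptions rather than the actual interface.

        // Sketch of a listener described by the (listening inode, scope, event mask) tuple.
        #include <cstdint>

        enum Scope { SINGLE, CHILDREN, RECURSIVE };
        enum : uint32_t { EV_CREATE = 1u << 0, EV_REMOVE = 1u << 1, EV_SIZE_CHANGE = 1u << 2 };

        struct Listener {
            uint64_t listening_lin;   // LIN of the listening inode
            Scope    scope;           // single, children, or recursive
            uint32_t event_mask;      // the set of listened-for event types
        };

        // A hypothetical recursive listener on inode 12 for "create" and "size change" events.
        const Listener example_listener = { 12, RECURSIVE, EV_CREATE | EV_SIZE_CHANGE };
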
  • In addition to receiving requests for listeners, the initiator module also stores the requested listeners in a hash table, also referred to as the initiator hash table, and sends messages to participant modules regarding additions, deletions, and/or changes to the listeners stored in its hash table. This is discussed in more detail with reference to FIG. 7. In one embodiment, the nodes 102 of the cluster 108 include an initiator hash table. As discussed in more detail below with reference to FIG. 4B, the hash table may not include any listeners. The initiator module also communicates the contents of its hash table to the participant modules when a new node 102 is added to the cluster 108. (This is discussed in more detail below with reference to FIG. 17.) In other embodiments, the initiator module may also communicate the contents of its hash table to the participant modules when a node 102 is removed from the cluster. The initiator module may also be configured to receive event messages from participant modules, signifying that the participant node has processed an event that affects an inode for which the initiator node is listening. The initiator module determines from these event messages which, if any, listeners are listening for the event. The initiator module queues those events for which listeners are listening and notifies the listening process of the event. In some embodiments, an event message arrives at an initiator module even though there are no listeners in the initiator's hash table listening for that event. Receiving event messages is discussed in more detail below with reference to FIG. 16.
  • The initiator hash table 400 includes an index 402, initiator structures 404, process structures 406, and event queues 408. As mentioned above, in the illustrated embodiment, there is one initiator hash table 400 per node 102. In the illustrated embodiment, the initiator hash table 400 maps the LINs of listening inodes to initiator structures 404. As mentioned above, in the illustrated embodiment, a listening inode is the inode to which a listener is directed. The LIN is the key for the initiator hash table 400. The respective LIN is put through a hash function, which maps it to another number. That number is then divided by the size of the table, and the remainder is used as the index into an array of linked-list heads, or buckets. The buckets hold or point to the initiator structures 404. If multiple LINs hash to the same bucket, then there are multiple initiator structures 404 held in that bucket. Although in the illustrated embodiment a standard hash table is used to map the LIN of a listening inode to an initiator structure 404, there are many suitable data structures that may be used, including, without limitation, an array, a skip list, a red-black tree, a B-tree, a splay tree, an AVL tree, and so forth.
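  • The following is a simplified sketch, for purposes of illustration only, of the bucket-selection step described above. The hash function and table size are assumptions made for this sketch; any reasonable hash function and table size may be substituted.
    #include <stdint.h>
    #include <stdio.h>

    #define INITIATOR_TABLE_SIZE 64  /* hypothetical number of buckets */

    /* Maps the LIN to another number. */
    static uint64_t hash_lin(uint64_t lin)
    {
      lin ^= lin >> 33;
      lin *= 0xff51afd7ed558ccdULL;
      lin ^= lin >> 33;
      return lin;
    }

    /* The hashed value is divided by the table size, and the remainder is
     * used as the index into the array of bucket heads. */
    static size_t bucket_index(uint64_t lin)
    {
      return (size_t)(hash_lin(lin) % INITIATOR_TABLE_SIZE);
    }

    int main(void)
    {
      /* Listening inodes whose LINs hash to the same bucket share that
       * bucket's linked list of initiator structures. */
      printf("LIN 12 -> bucket %zu\n", bucket_index(12));
      printf("LIN 13 -> bucket %zu\n", bucket_index(13));
      return 0;
    }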
  • Structures are collections of associated data elements, such as a group or set of variables or parameters. In one embodiment, a structure may be implemented as a C-language “struct.” One skilled in the art will appreciate that many suitable data structures may be used. As described above, the initiator hash table 400 maps the LINs of listening inodes to initiator structures 404. In the illustrated embodiment, the initiator structure 404 stores the composite scope and event mask of the listeners (initiated on the initiator node) that listen to the respective listening inode. The initiator structure 404 includes a field for the LIN of the respective listening inode, the composite scope of the listeners listening to the listening inode, and the composite event mask of the listeners listening to the listening inode. In one embodiment, the composite scope is the broadest scope of those listeners listening to the listening inode, and the composite event mask is the union of the event masks of the listeners listening to the listening inode.
  • In the illustrated embodiment, the process structures 406 represent the individual listeners initiated by the initiator. The process structures 406 correspond to associated listening processes 410, which request individual listeners. As used herein, listening processes may refer both to the listening processes 410 and, at other times, to the corresponding process structures 406. In the illustrated embodiment, the process structures 406 include three fields. The first field is a process identifier, which uniquely identifies a process in data communication with the node 102 that has requested a listener on the listening inode. The second field is the scope of the listener that the respective listening process 410 requests for the listening inode. As mentioned with reference to FIGS. 3A, 3B, and 3C above, in the illustrated embodiment, listening scope may be one of three scopes: single, children, or recursive. In the illustrated embodiment, S denotes single listeners, C denotes children listeners, and R denotes recursive listeners. The third field is the event mask of the listener that the respective listening process 410 requests for the listening inode. The listened-for events may include, without limitation, attribute change, create, remove, content change, size change, size increase, link count change, rename, rename from here, rename to here, rename within same directory, access revoke, event occurred on file, event occurred on directory, permission change, and/or other events. The event mask is a list of all the events the listener is listening for.
  • In the illustrated embodiment, the process structures 406 are associated with respective event queues 408. The event queues 408 store event messages from participants. Event messages include information regarding events on inodes within an inode tree, such as inode tree 300, that fall within the scope and event mask of the listener. Event messages are stored in the event queues 408 until a process is ready to process the event.
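  • The following is a simplified sketch, for purposes of illustration only, of how the initiator structures 404, process structures 406, and event queues 408 described above might be chained together within a bucket of the initiator hash table 400. The identifiers are assumptions made for this sketch.
    #include <stdint.h>

    enum listen_scope { SCOPE_SINGLE, SCOPE_CHILDREN, SCOPE_RECURSIVE };

    /* One queued event message awaiting consumption by a listening process. */
    struct queued_event {
      uint64_t relevant_lin;               /* inode on which the event occurred */
      uint32_t event_type;                 /* which event occurred              */
      struct queued_event *next;
    };

    /* Process structure 406: one per listener requested by a listening
     * process, holding the process identifier, scope, and event mask. */
    struct process_struct {
      int               pid;               /* identifies the listening process  */
      enum listen_scope scope;             /* S, C, or R                        */
      uint32_t          event_mask;        /* events this listener listens for  */
      struct queued_event   *event_queue;  /* event queue 408                   */
      struct process_struct *next;         /* next listener on the same inode   */
    };

    /* Initiator structure 404: the composite of all listeners initiated on
     * this node for one listening inode. */
    struct initiator_struct {
      uint64_t          listening_lin;     /* key: LIN of the listening inode   */
      enum listen_scope composite_scope;   /* broadest scope of the listeners   */
      uint32_t          composite_mask;    /* union of the listeners' masks     */
      struct process_struct   *listeners;  /* the individual process structures */
      struct initiator_struct *hash_next;  /* next entry in the same bucket     */
    };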
  • FIG. 4A illustrates an exemplary initiator hash table 400 on Node 1. As illustrated, there are two listening processes 410 that have requested listeners on a particular inode. In the case of Node 1, the listening processes 410 have requested listeners on the same inode, inode 12. Process 3000 has requested notification of all “create” events on inode 12 and the descendents of inode 12. In other words, Process 3000 has requested a listener on inode 12 with recursive scope and an event mask of “create.” Process 3001 has requested notification of all “size change” events on inode 12 and its immediate descendents, or children. In other words, Process 3001 has requested a listener on inode 12 with children scope and an event mask of “size change.” It will be appreciated that there are many different ways in which listening processes 410 may communicate listening parameters to nodes 102. In some embodiments, listening processes 410 may communicate listening parameters via a system call, subroutine call, etc.
  • In the illustrated embodiment, each listening process 410 has a corresponding process structure 406 in the initiator hash table 400 of a node 102, for example Node 1. In the illustrated embodiment, the two listening processes 410, Processes 3000 and 3001, have corresponding process structures 406 stored on Node 1. As illustrated, the respective scopes and event masks of the process structures 406 match the respective scopes and event masks of the listening processes 410. In the illustrated embodiment, listening processes 410 specify the point-of-reference inode (the listening inode), and process structures 406 include a field with a unique identifier for the respective listening process 410 that requested the listener.
  • Initiator structures 404 store global parameters for process structures 406 for respective inodes. Thus, the scope of a given initiator structure 404 is the broadest scope of any of the process structures 406 listening to the respective listening inode, which, in the case of inode 12 on Node 1, is recursive. The event mask of a given initiator structure 404 is a composite, for example, the union, of the event masks of the process structures 406 listening to the respective listening inode. The initiator structures 404 are indexed by the LIN of the listening inode in the initiator hash table index 402. In the illustrated embodiment, the only listeners that have been instantiated on Node 1 are the two listeners on inode 12. Therefore, there is only one corresponding initiator structure 404. The remaining entries in the initiator hash table 400 are empty.
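  • The following is a simplified sketch, for purposes of illustration only, of how the composite scope and composite event mask of an initiator structure 404 might be recomputed from its process structures 406. The names and the numeric ordering of the scope values are assumptions made for this sketch.
    #include <stdint.h>

    /* Ordered so that a numerically larger value is a broader scope. */
    enum listen_scope { SCOPE_SINGLE = 0, SCOPE_CHILDREN = 1, SCOPE_RECURSIVE = 2 };

    struct proc_listener {
      enum listen_scope scope;
      uint32_t          event_mask;
    };

    /* The composite scope is the broadest scope requested, and the composite
     * event mask is the union (bitwise OR) of the individual event masks. */
    void compute_composite(const struct proc_listener *listeners, int n,
                           enum listen_scope *out_scope, uint32_t *out_mask)
    {
      enum listen_scope scope = SCOPE_SINGLE;
      uint32_t mask = 0;
      for (int i = 0; i < n; i++) {
        if (listeners[i].scope > scope)
          scope = listeners[i].scope;
        mask |= listeners[i].event_mask;
      }
      *out_scope = scope;
      *out_mask = mask;
    }
  • Applied to the two listeners on inode 12 in FIG. 4A, such a computation would yield a recursive composite scope and a composite event mask containing both the “create” and “size change” events.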
  • FIG. 4B illustrates one embodiment of the initiator hash table 400 for Node 2. Because there are no processes requesting listeners through the network 104 on Node 2, there are no initiator structures 404. Thus, the initiator hash table 400 is initialized, but the entries in the hash table 400 are empty. In other embodiments, the initiator hash table 400 is not initialized until there are initiator structures 404.
  • FIG. 4C illustrates the initiator hash table 400 for Node 3. In the illustrated embodiment, there are five processes requesting listeners. Three of the listening processes 410 specify inode 13 as the inode to which they are listening. The remaining two listening processes 410 specify inodes 12 and 16, respectively, as the inodes to which they are listening. Processes 2000, 2001, and 2002 all request listeners on inode 13. Process 2003 requests a listener on inode 16. Finally, Process 2004 requests a listener on inode 12. Because the five processes collectively request listeners on three different inodes, there are three initiator structures 404, one corresponding to each of the specified listening inodes. These initiator structures 404 are indexed by their respective LINs. Each of the five listening processes 410 has a corresponding process structure 406 in the initiator hash table 400 for Node 3. Because three of the listening processes 410 request listeners for inode 13, there are three process structures 406 linked to the initiator structure 404 corresponding to inode 13. Thus, there is a separate process structure 406 for Processes 2000, 2001, and 2002. These process structures 406 have different scopes and event masks, corresponding to the individual scope and event mask specified by the corresponding listening process. Thus, the process structure 406 corresponding to Process 2000 listens to every “remove” event on inode 13 and its immediate children inodes. The process structure 406 corresponding to Process 2001 listens to every “size change” event on inode 13. The process structure 406 corresponding to Process 2002 likewise listens to every “size change” event on inode 13.
  • Process 2004 requests a listener on inode 12. This listener listens for “size change” events on inode 12. Because there is only one process structure 406 for inode 12, the initiator structure 404 corresponding to inode 12 matches the process structure 406 for Process 2004, with respect to the scope and the event mask. Process 2003 requests a listener for inode 16. The listener listens for “size change” events on inode 16 and its descendents. Similar to the initiator structure 404 for inode 12, the initiator structure 404 for inode 16 matches the process structure 406.
  • As described above with reference to FIG. 4A, the process structures 406 have corresponding event queues 408. When events occur within the scope and event mask of the listener, the events or the messages about the events are queued in the event queue 408 of the corresponding process structure 406.
  • B. Participant Data Structures
  • FIG. 5 illustrates one embodiment of the participant data structures. In the exemplary embodiment, participant data structures include a participant hash table 500 and a node generation number 502. In the exemplary embodiment, there are listeners listening to three different inodes. These inodes are 12, 13, and 16, respectively. The node structures 508 indicate the composite scope and event masks for all of the listeners for a particular listening inode initiated on a particular node 102. The scope and event masks of the node structures 508 correspond to the initiator structures 404 for the respective listening inode. By way of example, the node structure 508 for Node 1 that is associated with inode 12 corresponds to the initiator structure 404 for Node 1 that is associated with inode 12, as illustrated in FIG. 4A. In the illustrated embodiment, there are listeners for inode 12 that were initiated by both Nodes 1 and 3. In other words, certain listening processes 410 communicated to Nodes 1 and 3, respectively, the parameters for process structures 406 for inode 12. Each node structure 508 may represent multiple listeners, just as initiator structures 404 may represent multiple listeners. For instance, the node structure 508 for inode 13 represents the three listeners initiated on Node 3, corresponding to Processes 2000, 2001, and 2002 and their respective process structures 406.
  • In the illustrated embodiment, the participant structure 506 represents the composite scope and event masks of the node structures 508 corresponding to the respective listening inodes in the participant hash table 500. For example, the participant structure 506 corresponding to inode 12 includes a composite scope and composite event mask representing those listeners for inode 12 that are initiated, in this embodiment, on all of the nodes 102. Thus, the scope of the participant structure 506 corresponding to inode 12 is recursive, indicating the broadest scope of the two node structures 508 corresponding to inode 12. The event mask for participant structure 506 corresponding to inode 12 includes the “create” and “size change” events, which is the union of the event masks of node structure 508 for Node 1 and of node structure 508 for Node 3. Each participant structure 506 is indexed in the participant hash table index 504 by the LIN of the respective listening inode. Because the node structures 508 corresponding to listening inodes 13 and 16, respectively, are the only node structures 508 for their respective listening inodes, the respective participant structures 506 have the same scope and event mask as their respective node structures 508.
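  • The following is a simplified sketch, for purposes of illustration only, of how the node structures 508 and participant structures 506 described above might be represented. The identifiers are assumptions made for this sketch.
    #include <stdint.h>

    enum listen_scope { SCOPE_SINGLE, SCOPE_CHILDREN, SCOPE_RECURSIVE };

    /* Node structure 508: the composite of all listeners on one listening
     * inode that were initiated by one particular node. */
    struct node_struct {
      int               node_id;           /* node that initiated the listeners */
      enum listen_scope scope;             /* broadest scope from that node     */
      uint32_t          event_mask;        /* union of masks from that node     */
      struct node_struct *next;
    };

    /* Participant structure 506: the composite across all nodes for one
     * listening inode, indexed by that inode's LIN. */
    struct participant_struct {
      uint64_t          listening_lin;     /* key: LIN of the listening inode   */
      enum listen_scope composite_scope;   /* broadest scope of node structures */
      uint32_t          composite_mask;    /* union of node structure masks     */
      struct node_struct        *nodes;    /* the per-node breakdown            */
      struct participant_struct *hash_next;/* next entry in the same bucket     */
    };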
  • In the illustrated embodiment, the participant hash table 500 is the same for Nodes 1, 2, and 3. The purpose of the participant structures is to process events that may occur on any given inode in the distributed file system, and that may occur, for instance, on any one of the nodes 102 in the cluster 108. It is recognized that in some embodiments one or more of the participant hash tables may be different.
  • In the illustrated embodiment, the participant data structures also include a node generation number 502. The node generation number 502 is used to verify that a particular inode's cache is up-to-date, as discussed further below with reference to FIGS. 8A and 8B. The node generation number 502 may be incremented every time there is a significant change to the participant hash table 500. Changes to the participant hash tables 500 correspond to changes to the respective initiator hash tables 400. The node generation number 502 for each respective node 102, however, need not be the same. Because nodes 102 that may have been disconnected from the cluster 108 may not have been involved during a change to the participant hash tables 500 of the other nodes 102, the generation numbers for the nodes 102 may be different. The participant hash tables 500, however, are the same on every node 102.
  • FIGS. 6A and 6B illustrate one embodiment of the different perspectives of listening scope. FIG. 6A illustrates the scope of listeners from the perspective of each individual process. FIG. 6B illustrates the scope of listeners with respect to the participant structures 506 in the participant hash tables 500. The listeners illustrated in FIGS. 6A and 6B correspond to the listeners described in FIGS. 4A, 4B, 4C, and 5. FIG. 6A, in other words, illustrates the listeners from the perspective of each process that has requested a listener on one of the inodes in the inode tree 300.
  • There are seven listeners represented in FIG. 6A. A listener with single scope 302 is attached to inode 12. Additionally, there is a listener with children scope 304 attached to inode 12. Finally, there is a listener with recursive scope 306 attached to inode 12. These three listeners correspond to the three listeners requested for inode 12, as illustrated in FIGS. 4A and 4C. Thus, FIG. 4A illustrates two processes, Processes 3000 and 3001, which are listening to inode 12. The recursive listening scope 306 corresponds to the listener requested by Process 3000, which has a recursive scope. Similarly, the children listening scope 304 corresponds to the listener requested by Process 3001, which has children scope. Finally, the single listening scope 302 corresponds to the listener requested by Process 2004, as illustrated in FIG. 4C, which is a single listener attached to inode 12.
  • As illustrated in FIG. 6A, there are three listening scopes for inode 13. Two of these listening scopes are single scopes 308 and 310, and the last listening scope is a children scope 312. These three listening scopes correspond to the three listeners illustrated in FIG. 4C. Thus, Process 2000 requests a listener for inode 13 with children scope, which corresponds to children listening scope 312. Processes 2001 and 2002 request listeners on inode 13, each with single scope, which correspond to single listening scopes 308 and 310. Finally, as illustrated in FIG. 6A, inode 16 has a listening scope 314 attached to it, which corresponds to the listener requested by Process 2003. Although the scope of the listener attached to inode 16 as illustrated in FIG. 6A appears to be a single listener, it is in fact a recursive listener. Because inode 16 has no descendents, the recursive listener appears as if it were a single listener.
  • FIG. 6B illustrates the same set of listeners whose scope is illustrated in FIG. 6A, but does so from the perspective of the participant structures 506. The three scopes illustrated in FIG. 6B correspond to the three scopes of the participant structures 506, as illustrated in FIG. 5. In one embodiment, these scopes may typically be different from the scopes of the initiator structures 404 on the initiator hash tables 400, even though in the exemplary embodiment they are the same. Thus, there is a recursive listening scope 316 defined for inode 12, a children listening scope 318 defined for inode 13, and an apparently single listening scope 320 defined for inode 16. These scopes do not necessarily represent individual listeners, but rather represent the scope of the listeners for each particular listening inode across all nodes 102. Thus, FIG. 6B illustrates the composite scope of the listeners for a particular listening inode. There are three scopes defined, corresponding to the three inodes, as illustrated in FIG. 5, with listeners attached to them. Thus, the scope defined for inode 12 represents the composite scope of the listeners attached to inode 12 across all the nodes 102. Because the scope of one of the listeners attached to inode 12 is recursive, the composite scope for inode 12 is recursive, the recursive scope 316. In other words, the scopes in FIG. 6B describe the broadest scope of any one of the listeners for a particular listening inode. The broadest scope of any listener attached to inode 13 is children. For this reason, the listening scope for inode 13 is the children scope 318. The scope for the listeners attached to inode 16 does not appear to extend beyond inode 16. Although this appears to be a single listening scope, in fact, it is recursive listening scope 320, corresponding to the listener requested by Process 2003, which specifies recursive scope.
  • C. Update Process
  • As mentioned above with respect to FIG. 5, in one embodiment, the initiator hash tables 400 and the participant hash tables 500 are updated when a change, or a certain type of change, is made to the set of listeners. For example, a listening process 410 may terminate and no longer require listeners. Alternatively, in some embodiments, the scope or event mask of a previously initiated listener may be altered. Thus, in one embodiment, there may be a need to update the initiator hash tables 400 and the participant hash tables 500 in a consistent manner.
  • FIG. 7 illustrates one embodiment of a flowchart for the operations to update the initiator hash tables 400 and the participant hash tables 500. In state 702, the node 102 receiving the request for a change to one of the listeners, including adding or deleting a listener, acquires the exclusive event lock. In one embodiment of a distributed system, an exclusive event lock prevents other nodes 102 from reading from or writing to the distributed system. In the illustrated embodiment of the event system, an exclusive event lock is obtained in order to prevent other nodes from reading from or writing to the initiator hash tables 400 or the participant hash tables 500 during the update. As described below with reference to FIG. 9, the illustrated embodiment also implements a shared event lock, which prevents other nodes 102 from gaining access to an exclusive event lock. In other embodiments, a finer-grained locking scheme may be used.
  • In state 704, an initiator process for the node 102 updates its respective initiator hash table 400 corresponding to the node 102. As used with reference to FIG. 7, the initiator process describes an executable portion of the initiator module. Although in the illustrated embodiment, the operations described in FIG. 7 are executed by the initiator module, in other embodiments, the same operations may be executed by other modules, such as the participant module.
  • Once the respective initiator hash table 400 has been updated, the initiator process sends messages to the participant modules signifying that there has been an update to an initiator hash table 400, and subsequently delivers the updated information for its hash table 400, which is described in state 706. As mentioned above, in the illustrated embodiment of the invention, the nodes 102 include both an initiator and a participant module. The participants update their participant hash tables 500. In one embodiment, to update the participant hash table 500, a participant process, which may include an executable portion of the participant module, indexes the appropriate listening inode. If necessary, changes are made to the node structures 508 and the corresponding participant structures 506, to represent the updated listener information. Once the participant hash tables 500 have been updated, the participant process increments the node generation number 502 in state 708. In some embodiments, the node generation number 502 is simply incremented. In other embodiments, the node generation number 502 may correspond to some other identifier that participant nodes recognize as the updated status of the participant hash tables 500. In state 710, the respective initiator process releases the exclusive event lock. As described above, in one embodiment, the initiator process described in FIG. 7 pertains to the initiator module and the participant process pertains to the participant module. In other embodiments, the initiator process and/or the participant process reside in other and/or additional modules.
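  • The following is a simplified sketch, for purposes of illustration only, of the update sequence of FIG. 7. The locking, table, and messaging helpers shown here are hypothetical stubs standing in for the machinery described above; they are not part of the embodiments described herein.
    #include <stdio.h>

    /* Hypothetical placeholder for a requested addition, deletion, or change. */
    struct listener_update { int unused; };

    static int node_generation;  /* node generation number 502 */

    /* Stubs standing in for the cluster's locking, table, and messaging
     * machinery; in a real system these would do the actual work. */
    static void event_lock_exclusive(void)   { puts("exclusive event lock acquired"); }
    static void event_unlock_exclusive(void) { puts("exclusive event lock released"); }
    static void apply_update_to_initiator_table(const struct listener_update *u)   { (void)u; }
    static void send_update_to_participants(const struct listener_update *u)       { (void)u; }
    static void apply_update_to_participant_table(const struct listener_update *u) { (void)u; }

    /* Participant side, run on each node when the update message arrives. */
    static void participant_handle_update(const struct listener_update *u)
    {
      apply_update_to_participant_table(u);  /* update structures 506 and 508  */
      node_generation++;                     /* state 708: bump the generation */
    }

    int main(void)
    {
      struct listener_update u = { 0 };
      event_lock_exclusive();                /* state 702 */
      apply_update_to_initiator_table(&u);   /* state 704 */
      send_update_to_participants(&u);       /* state 706 */
      participant_handle_update(&u);         /* state 708, normally on each node */
      event_unlock_exclusive();              /* state 710 */
      printf("node generation number is now %d\n", node_generation);
      return 0;
    }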
  • D. Example Change to Listeners
  • FIGS. 8A and 8B illustrate one embodiment of a change to the listeners of the inode tree 300. FIG. 8A illustrates the state of the inode tree 300 before Processes 3000 and 3001, illustrated in FIG. 4A, have requested listeners on inode 12. Thus, FIG. 8A illustrates one embodiment of the state of the inode tree 300 with listeners requested by Processes 2000, 2001, 2002, 2003, and 2004. Only three scopes are illustrated because two of the listeners fall within the scope of another listener. Specifically, the single scope listeners requested by Processes 2001 and 2002 fall within the scope of the children scope listener requested by Process 2000. It is important to note, however, that, in the exemplary embodiment, the overall scope does not define the scope for particular events. Thus, even though Process 2000 listens only for “remove” events and Processes 2001 and 2002 listen only for “size change” events, these three listeners are represented by only one scope, which is the children scope because it is the broadest scope.
  • As discussed in greater detail below with reference to FIGS. 8A, 8B, and 10 through 12, in one embodiment, each individual inode includes an event cache. (See also the description of FIG. 2 above.) The event cache includes a listening set 804 and a cache generation number 806. The listening set 804 of a particular inode includes the LINs of the listening inodes (in the participant hash tables 500) whose scope encompasses that particular inode. For example, with respect to inode 16, the listening set 804, as illustrated in FIG. 8A, includes listening inodes 13 and 16. This means that there is a listener associated with inode 13 whose scope is broad enough to include inode 16. Similarly, there is a listener associated with inode 16 itself whose scope is broad enough to include inode 16.
  • In addition to the listening set 804, each inode cache includes a cache generation number 806. If the cache generation number 806 of an inode matches the node generation number 502, then the event cache of the inode is up-to-date. FIG. 8A illustrates an inode tree 300 wherein the event cache of every inode is up-to-date. The event caches are up-to-date because each cache generation number 806 matches the node generation number 502.
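  • The following is a simplified sketch, for purposes of illustration only, of the per-inode event cache described above. The field names and the fixed-size listening set are assumptions made for this sketch.
    #include <stdbool.h>
    #include <stdint.h>

    #define MAX_LISTENING_LINS 16  /* hypothetical bound for this sketch */

    /* Event cache stored with each inode. */
    struct event_cache {
      uint64_t listening_lins[MAX_LISTENING_LINS];  /* listening set 804            */
      int      num_listening_lins;
      uint64_t cache_generation;                    /* cache generation number 806  */
    };

    /* The cache is up-to-date only if its cache generation number matches
     * the node generation number 502. */
    bool event_cache_is_valid(const struct event_cache *cache,
                              uint64_t node_generation)
    {
      return cache->cache_generation == node_generation;
    }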
  • FIG. 8B illustrates one embodiment of the state of the inode tree 300 following the addition of two additional listeners. In the exemplary embodiment, these listeners correspond to the listeners requested by Processes 3000 and 3001, as illustrated in FIG. 4A. When Process 3000 requests a listener on inode 12, the broadest scope on inode 12 becomes the recursive scope. The only listener previously attached to inode 12 is the listener requested by Process 2004, which has single scope. The broadest scope of a listener on inode 12, following the addition of the listener corresponding to Process 3000, is the recursive scope, as illustrated in FIG. 8B. This scope corresponds to the scope of the participant structure 506 corresponding to inode 12 following the addition of the listener corresponding to Process 3000. When Process 3001 attaches an additional listener of children scope to inode 12, the broadest scope does not change because the children scope is less than or equal to the previous scope, recursive.
  • In the exemplary embodiment, the addition of each listener, first Process 3000 and then Process 3001, caused the node generation number 502 to increment by one (not illustrated). In some embodiments, successive changes to the listeners may be grouped together. One skilled in the art will appreciate that there are many possible ways and times to adjust the node generation number 502 to reflect the change in the status of listeners. FIG. 8B also illustrates how up-to-date event caches would appear following the addition of the two listeners. Thus, with respect to inode 16, the listening set is 12, 13, and 16. This means that there are listeners attached to inodes 12, 13, and 16 whose scope is broad enough to include inode 16. Inode 16 is within the scope of listening inode 12 because the broadest listener attached to inode 12 is a recursive listener. Inode 16 is within the scope of listening inode 13 because the broadest scope of a listener attached to inode 13 is the children scope and inode 16 is an immediate descendent, or child, of inode 13. Finally, inode 16 is within the scope of listening inode 16 because inode 16 is inode 16; thus, inode 16 is within the scope of any listener attached to inode 16 because even the smallest scope, in the exemplary embodiment the single scope, includes the inode itself.
  • In one embodiment, the transition of event caches from FIG. 8A to FIG. 8B does not happen automatically; the event caches of each inode in the inode tree 300 are not updated automatically. Instead, the event caches for each inode are updated as needed. In other embodiments, some or all of the updating is automatic. One embodiment of an updating process is described in detail further below in FIGS. 10 through 12.
  • E. Processing An Event
  • FIG. 9 illustrates one embodiment of a flowchart of the top-level operations for processing an event on the cluster 108. The respective node 102 where the event occurs determines whether event messages are sent to listeners. The execution of the operations depicted in FIG. 9 is referred to collectively as the process. In the illustrated embodiment, the participant module executes the process. In other embodiments, the process may be executed by other modules, such as the initiator module. Before the node 102 processes the event, the process decides if the relevant inode, referred to as the inode on which the event occurs, is being listened to. This is one of the functions of one embodiment of an event system described herein, and this function is described in more detail in the third section, with reference to FIGS. 10 through 12. If the relevant inode is being listened to, then the process sends event messages to the corresponding listening nodes, referred to as those nodes listening for events on the relevant inode, and the respective initiators determine whether to place the event message in the event queues 408 of any process structures 406. In one embodiment, this function is the third primary function of the exemplary event system described herein, and it is described in more detail in the fourth section, with reference to FIGS. 13 through 16.
  • With respect to the flowchart illustrated in FIG. 9, the node 102 on which the event occurs acquires a shared event lock, in state 902. In one embodiment, a shared lock prevents other nodes 102 from obtaining an exclusive event lock. Other nodes 102 may continue to read the contents of the system while one node 102 has the shared event lock. After acquiring the shared event lock, the process validates the event cache of the relevant inode, in state 904. The relevant inode is the inode that the event affects. In one embodiment, the event affects the file or directory corresponding to the relevant inode. Because the data stored in the relevant inode may also change, in some embodiments, the event is referred to as occurring to the inode. Various embodiments of validation of the event cache of the relevant inode are discussed in further detail below with reference to FIG. 10. After validating the event cache of the relevant inode, the node 102 executes the operation, in state 906. As mentioned above, with reference to FIGS. 4A, 4B, 4C, and 5, these events may include, without limitation, attribute change, create, remove, content change, size change, size increase, link count change, rename, rename from here, rename to here, rename within same directory, access revoke, event occurred on file, event occurred on directory, permission change, and/or other events. After performing the operation on the relevant inode, the node 102 sends event messages to the listeners, in state 908. Various embodiments of this state are described in more detail below with reference to FIGS. 13 through 16. After sending event messages to listeners, the node 102 releases the shared event lock, in state 910. As mentioned above, in one embodiment, the process described by FIG. 9 is executed by the participant module, though in other embodiments, the process may be executed by other modules.
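  • The following is a simplified sketch, for purposes of illustration only, of the top-level flow of FIG. 9. The helper functions are hypothetical stubs standing in for the machinery described with reference to FIGS. 10 through 16.
    struct inode;  /* opaque for this sketch */

    static void event_lock_shared(void)    { /* state 902: block exclusive lockers */ }
    static void event_unlock_shared(void)  { /* state 910 */ }
    static void validate_event_cache(struct inode *ino)
    { (void)ino; /* state 904: see FIG. 10 */ }
    static void execute_operation(struct inode *ino, unsigned event)
    { (void)ino; (void)event; /* state 906 */ }
    static void send_event_messages(struct inode *ino, unsigned event)
    { (void)ino; (void)event; /* state 908: see FIGS. 13 through 16 */ }

    /* Run on the node 102 where the event occurs. */
    void process_event(struct inode *relevant_inode, unsigned event)
    {
      event_lock_shared();                         /* state 902 */
      validate_event_cache(relevant_inode);        /* state 904 */
      execute_operation(relevant_inode, event);    /* state 906 */
      send_event_messages(relevant_inode, event);  /* state 908 */
      event_unlock_shared();                       /* state 910 */
    }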
  • III. DECIDING WHETHER ANYONE IS LISTENING TO A FILE
  • A. Validating An Event Cache
  • FIG. 10 illustrates one embodiment of a flowchart of operations to validate an event cache of a relevant inode. The operations described in FIG. 10 are collectively referred to as the process. In one embodiment, the participant executes the process, though in other embodiments, other modules, such as the initiator module, may execute the process. In state 1002, the process determines if the cache generation number 806 of the relevant inode matches the node generation number 502 of the relevant node. In one embodiment, the relevant node is the node upon which the relevant event occurs, and the relevant event is the current event being processed. If there is a match, then the event cache of the relevant inode is up-to-date and it has been validated. If, on the other hand, the cache generation number 806 of the relevant inode does not match the node generation number 502, then the participant module proceeds to state 1004, and the process determines whether the relevant inode is the root of the inode tree 300. If the relevant inode is the root, then the participant module proceeds to state 1010, and there is no need to update the cache of the relevant inode with the cache of the parent because the root has no parent.
  • If, on the other hand, the relevant inode is not the root, then the cache of the relevant inode is updated with the cache of the parent. Before doing this, however, the cache of the parent is validated. In other words, in one embodiment, the cache of the relevant inode may not be updated with the cache of the parent until the cache of the parent is up-to-date itself. This step demonstrates the recursive nature of one embodiment of the algorithm. In one embodiment, the recursion occurs all the way until the relevant inode is the root or the relevant inode has a valid cache. (As used during the recursive stage, the relevant inode is the inode being updated, not the inode to which the event originally occurred, as is used in other parts of the description.) Although in general the relevant inode refers to the inode upon which the relevant, or current, event occurs, during the validation stage, the relevant inode refers to whichever inode is being validated. Thus, as the process proceeds up the tree, each inode along the way becomes the relevant inode for purposes of validating the event caches. Once this process is finished, the relevant inode generally refers to the inode to which the relevant, or current, event occurred. Thus, if the relevant inode is not the root then, in state 1006, the participant module proceeds to validate the event cache of the parent, which, with respect to the flowchart depicted in FIG. 10, returns the participant module to state 1002. The process then proceeds through the same flowchart operations, with the parent of the relevant inode of the previous pass becoming the relevant inode for the successive pass. This is the recursive element of one embodiment of the event cache validating algorithm.
  • After validating the event cache of the parent, the process updates the cache of the relevant inode with the cache of the parent, in state 1008. This is the first operation taken after returning from each successive call to validate the event cache of the “relevant” inode. In one embodiment, “relevant” is relative because as the process works up the tree, the parent of the relevant inode becomes the relevant inode. State 1008 is described in more detail below with reference to FIG. 11. Once the cache of the relevant inode has been updated with the cache of the parent, the process proceeds to state 1010. As set forth above, the process also progresses to state 1010 if it is determined, in state 1004, that the relevant inode is the root. In state 1010, the process determines whether or not the relevant inode is itself a listening inode, by looking, for example, in the participant hash table 500. If the relevant inode indexes a participant structure 506, then it is a listening inode. If the relevant inode is a listening inode, then the process proceeds to state 1012. In state 1012, the relevant inode is added to the listening set 804 of the relevant inode. If, on the other hand, the relevant inode is not a listening inode, then the process proceeds to state 1014. Similarly, after adding the relevant inode to the listening set 804 of the relevant inode, in state 1012, the process proceeds to state 1014, where the cache generation number 806 of the relevant inode is updated with the current value of the node generation number 502. As described above, in one embodiment, the participant module executes the process, though in other embodiments, other modules, such as the initiator module, may execute the process.
  • B. Updating The Cache
  • FIG. 11 illustrates one embodiment of state 1008 in more detail, showing the operations for updating the cache of the relevant inode with the cache of the parent. In one embodiment, the states between state 1102 and state 1112 repeat for each listening inode in the listening set 804 of the parent of the relevant inode. During each loop, the respective listening inode in the listening set 804 is the corresponding listening inode specified in the flowchart. For example, if the listening set 804 of the parent of the relevant inode includes two listening inodes, then the loop would repeat two times.
  • In state 1104, the process determines whether the scope of the respective listening inode is recursive. If the scope of the respective listening inode is recursive, then the relevant inode is within the scope of the respective listening inode, and the process proceeds to state 1110, where the respective listening inode is added to the listening set 804 of the relevant inode. If, on the other hand, the scope of the respective listening inode is not recursive, then the process determines whether the scope of the listening inode is children, in state 1106. If the scope of the respective listening inode is not children, then the scope of the listening inode must be single, and if the scope of the listening inode is single, then the relevant inode is not within the scope of the listening inode because the listening inode is not the relevant inode. If the scope of the listening inode is children, then the process proceeds to state 1108. In state 1108, the participant module determines whether the respective listening inode is the parent of the relevant inode. If the respective listening inode is the parent of the relevant inode, then the relevant inode is within the scope of the respective listening inode because the scope of the respective listening inode is children, and the process proceeds to state 1110, where the respective listening inode is added to the listening set 804 of the relevant inode. If, on the other hand, the respective listening inode is not the parent of the relevant inode, then the relevant inode is not within the scope of the listening inode. In that case, the process proceeds to state 1112, ending the corresponding loop of instructions for that respective listening inode. As explained above, in one embodiment, the operations between states 1102 and 1112 execute for each respective listening inode in the listening set 804 of the parent of the relevant inode. As described above, in one embodiment, the participant module executes the process, though in other embodiments, other modules, such as the initiator module, may execute the process.
  • C. Examples of Validating the Event Cache
  • FIGS. 12A and 12B illustrate one embodiment of validating the event caches of the inode tree 300 following certain events. FIG. 12A illustrates stages of validating event caches in the inode tree 300 following a “size change” event on the file corresponding to inode 20. FIG. 8A illustrates the event caches of the inode tree 300 before the “size change” event. Beginning with the relevant inode as inode 20, the process first attempts to validate the event cache of inode 20. Because the cache generation number 806 of inode 20 does not match the node generation number 502, the process proceeds to determine whether inode 20 is the root. Because inode 20 is not the root, the process proceeds to validate the event cache of the parent, inode 17. (This is the first recursive call.) Because the cache generation number 806 of inode 17, currently the relevant inode, does not match the node generation number 502, and because inode 17 is not the root, the process proceeds to validate the event cache of the parent, inode 13. (This is the second recursive call.) Because the cache generation number 806 of inode 13 does not match the node generation number 502, and because inode 13 is not the root, the process attempts to validate the event cache of the parent, inode 12. (This is the third recursive call.) Because the cache generation number 806 of inode 12 does not match the node generation number 502, and because inode 12 is not the root, the process attempts to validate the event cache of the parent, inode 10. (This is the fourth recursive call.) Although the cache generation number 806 of inode 10 does not match the node generation number 502, inode 10 is the root, so the process does not make another recursive call to validate the event cache of the parent, and the process proceeds to state 1010 (still in the fourth nested call). Because inode 10 is not a listening inode, the process proceeds to state 1014, where the cache generation number 806 of inode 10 is updated to the value of the node generation number 502. Having terminated the fourth and final recursive call, the process begins to unwind.
  • Starting with state 1008 in the third nested call, the process proceeds to update the cache of the relevant inode with the cache of the parent of the relevant inode. At this point, the relevant inode is inode 12. The process updates the event cache of inode 12 with the event cache of inode 10. Because the listening set 804 of inode 10 is empty, the process proceeds from state 1102 to 1112 and returns to state 1010. Because inode 12 is a listening inode, the process proceeds to state 1012, where inode 12 is added to the listening set 804 of inode 12. The cache generation number 806 of inode 12 is then updated, and the algorithm unwinds down the recursive call stack, returning to state 1008 in the second nested call.
  • At this point, the relevant inode is inode 13. The process then updates the event cache of inode 13 with the cache of the parent, which is inode 12. Because there is one listening inode in the listening set 804 of inode 12, the operations between states 1102 and 1112 execute once. Because the scope of listening inode 12 is recursive, the process adds inode 12 to the listening set 804 of inode 13 and returns to state 1010. Because inode 13 is a listening inode, inode 13 is added to the listening set 804 of inode 13, which now includes 12 and 13. The cache generation number 806 of inode 13 is then updated, and the recursive call stack unwinds another level to the first recursive call.
  • At this point, the relevant inode is inode 17. The process then updates the event cache of inode 17 with the event cache of the parent, which is inode 13. Because there are two listening inodes in the listening set 804 of inode 13, the operations between states 1102 and 1112 are executed twice, once for inode 12 and then once for inode 13. Because the scope of inode 12 is recursive, inode 12 is added to the listening set 804 of inode 17, and the process begins the next loop with inode 13 as the respective listening inode. Because the scope of inode 13 is children and because inode 13 is the parent of inode 17, inode 13 is added to the listening set 804 of inode 17. After finishing both loops, the process returns to state 1010. Because inode 17 is not a listening inode, the process proceeds to update the cache generation number 806 of inode 17 and then returns to the original call.
  • The relevant inode is now the original relevant inode, which is inode 20. The process then updates the event cache of inode 20 with the event cache of the parent, inode 17. Because inode 17 includes two listening inodes in its listening set 804, the operations between states 1102 and 1112 are executed twice. Because the scope of the first listening inode, inode 12, is recursive, inode 12 is added to the listening set 804 of inode 20. Because the scope of listening inode 13 is not recursive and because listening inode 13 is not the parent of inode 20, the process returns to state 1010 without adding inode 13 to the listening set 804 of inode 20. Because inode 20 is not a listening inode, the process updates the cache generation number 806 of inode 20, which validates inode 20, the relevant inode.
  • FIG. 12A illustrates the state of each event cache in the inode tree 300 following the execution of the “size change” event on inode 20. Thus, inodes 10, 12, 13, 17, and 20 include up-to-date caches. The remaining inodes, however, include out-of-date event caches.
  • FIG. 12B illustrates the up-to-date status of the event cache of each inode in the inode tree 300 following the execution of a “create” inode 22 event. In the case of a “create” event, the event system first validates the parent directory, and then the new child inherits the event cache of the up-to-date parent. Thus, the process first attempts to validate the parent directory of inode 22, which is inode 18. Because the cache generation number 806 of inode 18 does not match the node generation number 502, and because inode 18 is not the root, the process proceeds to validate the event cache of the parent, inode 15. (This is the first recursive call.) Because the cache generation number 806 of inode 15 does not match the node generation number 502, and because inode 15 is not the root, the process proceeds to validate the event cache of the parent, inode 12. (This is the second recursive call.) Because the cache generation number 806 of inode 12 matches the node generation number 502, the process terminates the last recursive call, returning to the first recursive call. At this point, the process executes in a similar manner as it did for the previous “size change” event on inode 20. Starting with inode 15, the respective relevant inode is updated with the up-to-date event cache of the parent, none of the respective relevant inodes are added to their own listening sets 804 (because there are no listening inodes in this branch of the tree), and the cache generation number 806 of each respective relevant inode is updated. Once inode 22 has been created, it inherits the up-to-date event cache of inode 18. At this point, the inodes 10, 12, 13, 15, 17, 18, 20, and 22 have up-to-date caches, and the remaining inodes still have caches that are not up-to-date. FIG. 8B illustrates the event caches of the inode tree 300 after all the event caches have been validated.
  • The following is one embodiment of exemplary code for implementing the validate event cache algorithm:
    update_cache_from_parent(inode, parent) {
      /* Take all lins from the parent that are relevant to us */
      for each <lin> in parent->event_lins {
        if (scope_is_recursive(lin))
          lin_set_add(inode->event_lins, lin);
        else if (scope_is_children(lin) and lin == parent->lin)
          lin_set_add(inode->event_lins, lin);
      }
    }
    update_cache(inode) {
      /* If we're up to date, we're done. */
      if (inode->gen == global_gen)
        return;
      if (is_not_root(inode)) {
        /* Make sure our parent is up to date */
        update_cache(inode->parent);
        /* Update our cache from our parent */
        update_cache_from_parent(inode, inode->parent);
      }
      /* See if we have an entry in the event hash */
      if (in_event_hash(inode))
        lin_set_add(inode->event_lins, inode->lin);
      /* Update our generation number to the latest global gen */
      inode->gen = global_gen;
    }
  • IV. NOTIFYING LISTENERS OF EVENTS
  • FIGS. 13 through 16 illustrate additional embodiments of the operation of state 908 illustrated in FIG. 9. Sending event messages to listening processes 410 includes two principal sets of operations, depicted in FIGS. 13 and 16, respectively. FIG. 13 illustrates one embodiment of a flowchart of operations executed by participant modules. FIG. 16 illustrates one embodiment of a flowchart of operations executed by initiator modules upon receiving event messages from the participant modules. Although in the illustrated embodiment the processes depicted in FIGS. 13 and 16 are executed by the participant and initiator modules, respectively, in other embodiments, these processes may be executed by other modules and/or executed by the same module.
  • A. Sending Event Messages
  • FIG. 13 illustrates one embodiment of the flowchart of operations to send event messages to the listening nodes. The operations between states 1302 and 1316 are executed for as many respective listening inodes as are in the listening set 804 of the relevant inode, where the relevant event is the event on the relevant inode being processed. If, in state 1304, it is determined that the relevant event is within the event mask of the respective listening inode, the process proceeds to state 1306. If, on the other hand, it is determined that the relevant event is not within the event mask of the respective listening inode, the process terminates the respective iteration.
  • The operations between states 1306 and 1314 are executed as many times as there are listening nodes for the listening inode. Thus, if there are two nodes 102 in the cluster 108 that are listening to the respective listening inode, the operations between state 1306 and state 1314 execute twice. In state 1308, the process determines whether the relevant event is within the event mask of the respective listening node. If the relevant event is not within the event mask of the respective listening node, then the process terminates the respective iteration. If, on the other hand, the relevant event is within the event mask of the respective listening node, then the process proceeds to state 1310, where it determines whether the relevant inode is within the scope of the respective listening node. If the relevant inode is not within the scope of the respective listening node, the process terminates the respective iteration. If, on the other hand, the relevant inode is within the scope of the listening node, then the process proceeds to state 1312, where the process creates and sends an event message to the respective listening node. As described above, in one embodiment, the participant module executes the process, though in other embodiments, other modules may execute the process.
  • FIG. 13 illustrates one embodiment of operations for sending event messages from the participant modules to the respective initiator modules, which correspond to the respective listening nodes. These operations are accomplished in a two-step process. First, it is determined whether the relevant event falls within the event mask of any of the listening inodes within the listening set 804 of the relevant inode. Second, it is determined, for any of the qualifying listening inodes, whether the relevant event falls within the event masks of listening nodes corresponding to the respective listening inode and whether the relevant inode is also within the scope of any of the listening nodes corresponding to the respective listening inode. It is recognized that in other embodiments, the process could first check the scope and then check the event mask.
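  • The following is a simplified sketch, for purposes of illustration only, of the two-step check of FIG. 13 for a single listening inode in the relevant inode's listening set 804. The data structures, the scope test, and the send routine are assumptions made for this sketch.
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    enum listen_scope { SCOPE_SINGLE, SCOPE_CHILDREN, SCOPE_RECURSIVE };

    struct listening_node {                 /* node structure 508       */
      int               node_id;
      enum listen_scope scope;
      uint32_t          event_mask;
    };

    struct participant_entry {              /* participant structure 506 */
      uint64_t               listening_lin;
      uint32_t               composite_mask;
      struct listening_node *nodes;
      int                    num_nodes;
    };

    /* Placeholder: a real implementation would walk the inode tree to test
     * whether relevant_lin falls within 'scope' of listening_lin. */
    static bool within_scope(uint64_t listening_lin, enum listen_scope scope,
                             uint64_t relevant_lin)
    {
      (void)listening_lin; (void)scope; (void)relevant_lin;
      return true;
    }

    static void send_event_message(int node_id, uint64_t listening_lin,
                                   uint64_t relevant_lin, uint32_t event)
    {
      printf("event 0x%x on inode %llu -> node %d (listening inode %llu)\n",
             (unsigned)event, (unsigned long long)relevant_lin, node_id,
             (unsigned long long)listening_lin);
    }

    /* One pass of the loop of states 1302-1316, i.e., one listening inode
     * from the relevant inode's listening set 804. */
    void notify_listening_nodes(const struct participant_entry *entry,
                                uint64_t relevant_lin, uint32_t event)
    {
      if (!(event & entry->composite_mask))           /* state 1304 */
        return;
      for (int i = 0; i < entry->num_nodes; i++) {    /* states 1306-1314 */
        const struct listening_node *n = &entry->nodes[i];
        if (!(event & n->event_mask))                 /* state 1308 */
          continue;
        if (!within_scope(entry->listening_lin, n->scope, relevant_lin))
          continue;                                   /* state 1310 */
        send_event_message(n->node_id, entry->listening_lin,
                           relevant_lin, event);      /* state 1312 */
      }
    }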
  • FIG. 14 illustrates one embodiment of event messages. In one embodiment, participant modules send event messages 1400 to initiator modules to apprise them of events on relevant inodes that respective listening processes 410 may be monitoring. An exemplary event message 1400 may include several fields. An event message may include a listening inode field 1402. This field apprises the initiator module of the listening inode that triggered the event message. An event message 1400 may also include a listening node field 1404. The listening node is the node 102 that initiated at least one listener on the listening inode specified in the listening inode field 1402. In some embodiments, there may be no field for the listening node. In these embodiments, the event message is merely directed to the appropriate listening node, and the event message 1400 does not identify the node to which it was directed. The event message 1400 may also include a relevant inode field 1406. The relevant inode field 1406 identifies the inode upon which the event occurred that triggered the event message. An event message 1400 may also include a relevant event field 1408. The relevant event field 1408 identifies the type of event that triggered the event message 1400.
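  • The following is a simplified sketch, for purposes of illustration only, of the event message fields 1402 through 1410 described above and below; the field names are assumptions made for this sketch.
    #include <stdint.h>

    enum listen_scope { SCOPE_SINGLE, SCOPE_CHILDREN, SCOPE_RECURSIVE };

    struct event_message {
      uint64_t          listening_lin;   /* listening inode field 1402 */
      int               listening_node;  /* listening node field 1404  */
      uint64_t          relevant_lin;    /* relevant inode field 1406  */
      uint32_t          relevant_event;  /* relevant event field 1408  */
      enum listen_scope minimum_scope;   /* minimum scope field 1410   */
    };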
  • An event message 1400 may also include a minimum scope field 1410. The minimum scope identifies the minimum scope necessary to trigger the event message 1400. In other words, the minimum scope is the minimum scope of the listening inode that would have included the relevant inode for purposes of determining whether to send an event message. For instance, with regard to FIGS. 12A and 12B, if the listening inode is 13 and the relevant inode is 17, then the minimum scope for triggering an event message would be the children scope. If, on the other hand, the listening inode is inode 12 and the relevant inode is inode 17, then the minimum scope to trigger an event message would be the recursive scope. If, in yet another example, the listening inode were 13 and the relevant inode were also 13, then the minimum scope for triggering the event message would be the single scope.
  • B. Determining Minimum Scope
  • FIG. 15 illustrates one embodiment of a flowchart of the operations to determine the minimum scope. In state 1502, the process determines whether the relevant inode is the listening inode. If the relevant inode is the listening inode, then the process sets the minimum scope to single, in state 1504. If, on the other hand, the relevant inode is not the listening inode, then the process determines whether the relevant inode is the immediate child of the listening inode, in state 1506. If the relevant inode is the immediate child of the listening inode, then the process sets the minimum scope to children, in state 1508. If, on the other hand, the relevant inode is not the immediate child of the listening inode, then the process sets the minimum scope to recursive, in state 1510. In one embodiment, the participant module executes the process to determine minimum scope, though in other embodiments, other modules, such as the initiator module, may execute this process to determine the minimum scope.
  • For example, with regard to FIGS. 12A and 12B, if the listening inode is 12, then the following are the minimum scopes for the respective relevant inodes. If the relevant inode is inode 12, then the minimum scope is single. In other words, the minimum scope necessary for a listener on inode 12 to cause an event message to be sent from inode 12 is the single scope. If the relevant inode is 15, then the minimum scope is children. In other words, the minimum scope necessary for an event on the relevant inode 15 to trigger an event message to the listener attached to inode 12 would be the children scope. If the relevant inode is 18, then the minimum scope would be recursive. In other words, the minimum scope necessary for an event on inode 18 to trigger an event message being sent to the listener attached to inode 12 would be the recursive scope.
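  • The following is a simplified sketch, for purposes of illustration only, of the minimum-scope determination of FIG. 15. The parent lookup is a stub that only knows the handful of parent-child relationships used in the example above.
    #include <stdint.h>
    #include <stdio.h>

    enum listen_scope { SCOPE_SINGLE, SCOPE_CHILDREN, SCOPE_RECURSIVE };

    /* Stub standing in for an inode tree lookup. */
    static uint64_t parent_lin(uint64_t lin)
    {
      switch (lin) {
      case 15: return 12;
      case 18: return 15;
      case 12: return 10;
      default: return 0;
      }
    }

    static enum listen_scope minimum_scope(uint64_t listening_lin,
                                           uint64_t relevant_lin)
    {
      if (relevant_lin == listening_lin)              /* state 1502 */
        return SCOPE_SINGLE;                          /* state 1504 */
      if (parent_lin(relevant_lin) == listening_lin)  /* state 1506 */
        return SCOPE_CHILDREN;                        /* state 1508 */
      return SCOPE_RECURSIVE;                         /* state 1510 */
    }

    int main(void)
    {
      static const char *names[] = { "single", "children", "recursive" };
      /* For listening inode 12: relevant inode 12 -> single, 15 -> children,
       * 18 -> recursive, matching the examples in the text above. */
      printf("%s %s %s\n",
             names[minimum_scope(12, 12)],
             names[minimum_scope(12, 15)],
             names[minimum_scope(12, 18)]);
      return 0;
    }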
  • C. Notifying Processes
  • FIG. 16 illustrates one embodiment of a flowchart of the operations to notify the listening processes 410 of relevant event messages. In one embodiment, relevant event messages are those event messages corresponding to relevant events for which listeners on the respective listening node are listening. Relevant event messages are those event messages that are queued in an event queue of a listener. In the illustrated embodiment, not all relevant events result in relevant event messages. In other words, relevant events may trigger an event message that is never queued. This scenario is discussed in more detail below. In state 1602, the initiator module receives an event message from the participant module. In some embodiments, the participant module and the initiator module may reside on the same physical node 102, even though they are different logical modules.
  • In one embodiment, the operations between states 1604 and 1614 are repeated as many times as there are process structures 406 for the listening inode. The process determines the listening inode from the event message 1400. By consulting the respective initiator hash table 400, the process determines which listening processes 410 are listening to the listening inode. For example, with reference to FIG. 4A, there are two processes, Process 3000 and Process 3001, listening to inode 12. Thus, in this example, the operations between states 1604 and 1614 would be executed twice, once for each listening process (or, in other words, each process structure 406 corresponding to a particular listening process 410). In state 1606, the process determines whether the relevant event, delivered by the event message 1400, is within the event mask of the listening process. If it is not within the event mask of the listening process, then the process proceeds to state 1614, and the respective iteration terminates. If, on the other hand, the relevant event is within the event mask of the listening process, then the process determines, in state 1608, whether the minimum scope that could have generated the event message is less than or equal to the scope of the listening process. In the illustrated embodiment, the single scope is less than the children scope and the children scope is less than the recursive scope. State 1606 tests whether the listener is listening for the event. State 1608 tests whether the listener listens for the relevant inode. If either of these conditions fails, then the event is not an event being listened for and this iteration of instructions terminates. Thus, in some embodiments, event messages 1400 may be sent to an initiator module without the event message 1400 being queued in one of the event queues 408 of a corresponding process structure 406. This is because a participant evaluates the composite scope and the composite event mask of all listeners for a particular listening inode. Some listeners, however, may be listening for different events within different scopes. Therefore, sometimes event messages 1400 will be routed to a respective initiator module without being added to the event queue 408 of any process structure 406.
  • In state 1610, the relevant event is added to the event queue 408 of the respective process structure 406. In state 1610, the respective event queue 408 may also be coalesced in some embodiments. In some embodiments, the process determines whether the event message 1400 is repetitive of other event messages; if the event message 1400 is repetitive, then it is not added to the respective event queue 408. In state 1612, the listening process is woken up and notified that there are events available in the respective event queue 408. As described above, in one embodiment, the initiator module (as distinguished from the listening process) executes the process illustrated in FIG. 16, though in other embodiments, other modules, such as the participant module, may execute the process.
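  • A minimal sketch of the initiator-side flow of FIG. 16 (states 1604 through 1614) follows. The initiator hash table is modeled here as a mapping from a listening inode to its process structures, and the event_mask, scope, event_queue, and wake() names are assumptions made for this sketch, not definitions taken from the specification.
      def notify_listening_processes(msg, initiator_hash_table):
          # One pass per process structure for the listening inode (states 1604-1614).
          for proc in initiator_hash_table.get(msg.listening_inode, []):
              if msg.event_type not in proc.event_mask:   # state 1606: is the listener listening for this event?
                  continue
              if msg.minimum_scope > proc.scope:          # state 1608: does the listener's scope cover the relevant inode?
                  continue
              if msg not in proc.event_queue:             # optional coalescing of repetitive messages
                  proc.event_queue.append(msg)            # state 1610: queue the relevant event
              proc.wake()                                 # state 1612: wake and notify the listening process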
    TABLE 1
    Relevant Inode | Event  | Listening Set | Listening Nodes | Listening Processes
    22             | Create | {12}          | 1               | 3000
    12             | Size Δ | {12}          | 1, 3            | 3001, 2004
    16             | Size Δ | {12, 13, 16}  | 1, 3            | 2003
    13             | Size Δ | {12, 13}      | 1, 3            | 3001, 2001, 2002
    13             | Remove | {12, 13}      | 3               | 2000
  • Table 1 illustrates one embodiment of results of events on the particular inodes within the inode tree 300. The first example in Table 1 is a “create” event for a new inode, inode 22. As illustrated in FIG. 12B, the up-to-date listening set 804 of inode 22 includes only listening inode 12. With reference to FIG. 13, the operations between states 1302 and 1316 would execute once, for the single listening inode within the listening set 804 of the relevant inode 22. Because the “create” event is within the event mask of listening inode 12, as illustrated in FIG. 5, the process progresses from state 1304 to state 1306. Because there are two listening nodes for listening inode 12, the operations between state 1306 and state 1312 execute twice. On the first pass, with respect to Node 1, the process progresses from state 1308 to state 1310 because the “create” event is within the event mask of listening Node 1, as illustrated in FIG. 5. Because the scope of the node structure 506 of Node 1 is recursive, the relevant inode 22 is within the scope of the listening node, and, in state 1312, the process creates and sends an event message to Node 1. On the second pass, with respect to Node 3, the process determines that the “create” event is not within the event mask of the listening node, which causes the iteration to terminate without sending an event message to Node 3 with regard to the relevant event.
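  • The participant-side dispatch walked through above can be sketched roughly as follows. This is an illustrative Python rendering of the FIG. 13 flow under stated assumptions: the participant hash table is modeled as a mapping from a listening inode to an entry holding a composite event mask and per-node structures, and minimum_scope(), parent_of(), and send() are placeholder callables, not interfaces from the specification.
      def dispatch_event(relevant_inode, event_type, listening_set,
                         participant_hash_table, minimum_scope, parent_of, send):
          for listening_inode in listening_set:                    # states 1302-1316: one pass per listening inode
              entry = participant_hash_table[listening_inode]
              if event_type not in entry.event_mask:               # state 1304: composite event mask for the listening inode
                  continue
              needed = minimum_scope(relevant_inode, listening_inode, parent_of)
              for node in entry.nodes:                             # states 1306-1312: one pass per listening node
                  if event_type not in node.event_mask:            # state 1308: per-node event mask
                      continue
                  if needed > node.scope:                          # state 1310: is the relevant inode within the node's scope?
                      continue
                  send(node, listening_inode, event_type, needed)  # state 1312: create and send the event message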
  • As described above, FIG. 16 illustrates one embodiment of a flowchart of operations that the initiator module executes upon receiving an event message 1400 from a participant node, in state 1602. Because there are two listening processes for the listening inode 12, the operations between states 1604 and 1614 execute twice. During the first pass, the respective listening process is Process 3000. Because the “create” event is within the event mask of the listener requested by Process 3000, the process proceeds from state 1606 to state 1608. Because the minimum scope that would have generated the event message is equal to the scope requested by Process 3000, as both scopes are recursive, the event message 1400 is added to the event queue 408 of the process structure 406 corresponding to Process 3000. In state 1610, the respective event queue 408 may also be coalesced in some embodiments; if the process determines that the event message 1400 is repetitive of other event messages, it is not added to the respective event queue 408. In state 1612, the listening process, Process 3000, is woken up and notified that there are events available in the respective event queue 408. In the second pass of the instructions between states 1604 and 1614, the listening process is Process 3001. Because the “create” event is not within the event mask of the listener requested by Process 3001, the event message 1400 is not queued for Process 3001, and the process illustrated in FIG. 16 ends.
  • The second example in Table 1 is a “size change” to inode 12. As illustrated in FIG. 8B, the up-to-date listening set 804 of inode 12 comprises only listening inode 12. As illustrated in FIG. 5, both Nodes 1 and 3 listen for “size change” events. Furthermore, inode 12 is within the scope of the respective node structure 508 for both Nodes 1 and 3 because inode 12 is the listening inode. Thus, the participant module sends event messages 1400 to both Nodes 1 and 3. Because both Processes 2004 and 3001 listen for the “size change” event, the event messages 1400 sent to Nodes 1 and 3 are placed into the corresponding event queues 408 of the respective listener structures 406.
  • The third example in Table 1 is a “size change” to inode 16. As illustrated in FIG. 8B, the up-to-date listening set 804 for inode 16 includes listening inodes 12, 13, and 16. With respect to listening inode 12, the participant module sends an event message 1400 only to Node 1 because the listeners on Node 3 attached to listening inode 12 specify the single scope, and inode 16 is not inode 12. With respect to listening inode 13, the participant module sends an event message 1400 to Node 3, as the listeners for inode 13 on Node 3 listen for “size change” events and have a composite scope of children, and inode 16 is a child of inode 13. Similarly, with respect to inode 16, the participant module sends an event message 1400 to Node 3, as the listeners for inode 16 on Node 3 listen for “size change” events and have a composite scope of recursive.
  • Although three event messages 1400 were sent, only one of the event messages is placed into a corresponding event queue 408 of a respective process structure 406. With regard to the event message 1400 specifying inode 12 as the listening inode, the event message 1400 is not queued because the scope of Process 3001 is only the children scope, and the minimum scope necessary to trigger an event on inode 16 based on listening inode 12 is the recursive scope, which is greater than the children scope. Although Process 3000 has a recursive scope, it only listens for the “create” event, not the “size change” event. Thus, this event message 1400 reaches the respective initiator module, but is never queued. Similarly, the event message 1400 directed to Node 3 with respect to listening inode 13 is also not queued: Process 2000 does not listen for the “size change” event, and Processes 2001 and 2002 have single scope, while the minimum scope required to trigger an event message from inode 16 relative to listening inode 13 is the children scope, which is greater than the single scope. In contrast, the event message 1400 sent to Node 3 with respect to listening inode 16 is placed on the corresponding event queue 408 of the respective process structure 406. Process 2003 listens to events within the recursive scope of inode 16, and listens for “size change” events. Because a “size change” event on inode 16 is within the scope and the event mask of the listener attached by Process 2003, the respective event message 1400 is queued in the event queue 408 corresponding to Process 2003.
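  • To make the filtering in this example concrete, the short script below replays the “size change” on inode 16 against the listeners described above. The scopes, event masks, and tree fragment are values inferred from this walkthrough and from the description of FIGS. 4A-4C and 5; they are illustrative assumptions, not a verbatim copy of those figures.
      SINGLE, CHILDREN, RECURSIVE = 1, 2, 3

      def parent_of(inode):
          # Fragment of the inode tree 300 assumed for this example: 16 -> 13 -> 12.
          return {16: 13, 13: 12}.get(inode)

      def minimum_scope(relevant, listening):
          if relevant == listening:
              return SINGLE
          if parent_of(relevant) == listening:
              return CHILDREN
          return RECURSIVE

      # (process, node, listening inode, scope, events listened for) -- inferred values
      listeners = [
          (3000, 1, 12, RECURSIVE, {"create"}),
          (3001, 1, 12, CHILDREN,  {"size_change"}),
          (2004, 3, 12, SINGLE,    {"size_change"}),
          (2000, 3, 13, CHILDREN,  {"remove"}),
          (2001, 3, 13, SINGLE,    {"size_change"}),
          (2002, 3, 13, SINGLE,    {"size_change"}),
          (2003, 3, 16, RECURSIVE, {"size_change"}),
      ]

      for process, node, listening_inode, scope, events in listeners:
          needed = minimum_scope(16, listening_inode)
          queued = "size_change" in events and needed <= scope
          print(f"Process {process}: {'queued' if queued else 'not queued'}")
      # Only Process 2003 ends up with the event in its queue, matching Table 1.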
  • Example 4 in Table 1 illustrates a “size change” event to inode 13. As illustrated in FIG. 8B, the listening set 804 of inode 13 includes inodes 12 and 13. As illustrated in FIG. 5, with respect to listening inode 12, Node 1 listens for all events within the recursive scope of inode 12 and also listens for the “size change” event. Therefore, an event message 1400 is sent to Node 1. Still with respect to listening inode 12, because the listeners on Node 3 only listen to events within the single scope of inode 12, no event message is sent to Node 3. With respect to listening inode 13, because the listeners on Node 3 listen for the “size change” event, and because inode 13 is within the scope of listening inode 13, an event message 1400 is sent to Node 3.
  • The event message 1400 sent to Node 1, with respect to listening inode 12, is queued in the event queue 408 corresponding to Process 3001 because inode 13 is within the children scope of inode 12 and because Process 3001 listens for the “size change” event. The same event is not queued in the event queue 408 corresponding to Process 3000 because that listener listens only for “create” events. The event message 1400 sent to Node 3, with respect to listening inode 13, is queued in the event queues 408 corresponding to Processes 2001 and 2002 because inode 13 is within the single scope of inode 13 and because Processes 2001 and 2002 listen for the “size change” event.
  • The fifth example in Table 1 is a “remove” event on inode 13. As illustrated in FIG. 8B, the up-to-date listening set 804 of inode 13 comprises listening inodes 12 and 13. As illustrated in FIG. 5, none of the nodes listening to listening inode 12 listens for the “remove” event; the corresponding event mask in the participant structure 506 for inode 12 does not include the “remove” event. The participant structure 506 for inode 13, however, does include the “remove” event. An event message 1400 is created and sent to Node 3 because inode 13 is within the children scope of inode 13 and because the “remove” event is within the event mask of the node structure 508 corresponding to Node 3. With respect to FIG. 4C, only Process 2000 listens for the “remove” event. Because inode 13 is within the children scope of listening inode 13, the “remove” event on inode 13 is queued in the event queue 408 corresponding to Process 2000.
  • V. UPDATING HASH TABLES FOLLOWING A GROUP CHANGE
  • FIG. 17 illustrates one embodiment of a flowchart of the operations to add a node 102 to a cluster 108 and, accordingly, to update the participant hash tables 500; the group of operations is referred to collectively as a process. In state 1702, the process acquires an exclusive event lock, preventing other nodes from reading from or writing to the system. In state 1704, the nodes 102 within the cluster 108 send the contents of their respective initiator hash tables 400 to the other nodes 102. In state 1706, the process sends messages to the participant modules to build new participant hash tables 500 based on the sent initiator hash tables 400. If all of the sends were successful, as determined in state 1708, then the process instructs the participant modules to swap in the new participant hash tables, in state 1710. If all of the sends were not successful, then the process sends an error message to the Group Management Protocol, in state 1712. After completing state 1710 or state 1712, the process releases the exclusive lock in state 1714. In one embodiment, the initiator module executes the process illustrated in FIG. 17, though in other embodiments, other modules, such as the participant module, may execute the process.
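  • A minimal sketch of the FIG. 17 flow follows. The lock, messaging, and Group Management Protocol interfaces are placeholders introduced for illustration; the specification does not define these signatures, and each per-node call is assumed to return True on success.
      def add_node_to_cluster(nodes, group_management_protocol, event_lock):
          with event_lock.exclusive():                               # state 1702; released on exit (state 1714)
              sends_ok = True
              for node in nodes:                                     # state 1704: exchange initiator hash tables 400
                  sends_ok &= node.send_initiator_hash_table_to_peers()
              for node in nodes:                                     # state 1706: build new participant hash tables 500
                  sends_ok &= node.build_new_participant_hash_table()
              if sends_ok:                                           # state 1708: were all sends successful?
                  for node in nodes:
                      node.swap_in_new_participant_hash_table()      # state 1710
              else:
                  group_management_protocol.report_error()           # state 1712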
  • VI. CONCLUSION
  • While certain embodiments of the invention have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the present invention. Accordingly, the breadth and scope of the present invention should be defined in accordance with the following claims and their equivalents.
  • By way of example, the following alternatives are also contemplated, though not described in detail. Although the data structures described herein have been addressed to a distributed system, some embodiments of the invention may be used in a single file system. In such a system, there may be only an initiator hash table, and the processes described above may all reference it. Additionally or alternatively, the data structures may be organized such that the queue of events appears on the participant side, rather than the initiator side. Moreover, as explained above, some event messages may arrive at the initiator but may never be queued; in other embodiments, the data structures could be changed to track listener processes on the participant side. The above-mentioned alternatives are examples of other embodiments, and they do not limit the scope of the invention. It is recognized that a variety of data structures with various fields and data sets may be used. In addition, other embodiments of the flow charts may be used.

Claims (26)

1. An event listening system, the event listening system comprising:
a file system including a plurality of files, the plurality of files logically stored in a tree;
for each of the plurality of files, a first data structure configured to track a set of listening files that are listening for events that affect the corresponding file;
a plurality of processes that each listen for events that affect at least one of the plurality of files;
a second data structure configured to track, for each of the plurality of files, which of the plurality of processes are listening to each of the files;
a listening module configured to receive an identifier for a first file of the plurality of files and to determine whether the first file is relevant to any of the plurality of processes using the first data structure and the second data structure;
a traverse module configured to traverse a first set of first data structures that correspond to a subset of the plurality of files that represent one branch of the tree; and
an update module configured to update at least one of the corresponding first data structures of the file in at least one traversed level by reviewing a scope of at least one of the listening files of the first data structure that corresponds to the file's parent.
2. The event listening system of claim 1, wherein the file system is a distributed file system.
3. The event listening system of claim 1, wherein the files include files and directories.
4. The event listening system of claim 1, wherein the second data structure comprises:
for each of the plurality of files, a set of processes listening to the corresponding file.
5. The event listening system of claim 4, wherein the second data structure further comprises:
for each of the processes, a scope of files related to the corresponding file for which the corresponding process is listening.
6. The event listening system of claim 5, wherein the listening module is further configured to determine whether the processes listening to the file include a scope that encompasses the first file.
7. The event listening system of claim 4, wherein the second data structure further comprises:
for each of the processes, a set of events for which the corresponding process is listening.
8. The event listening system of claim 1, wherein the listening module is further configured to review the first data structure corresponding to the first file; for each listening file in the set of listening files, to use the second data structure to determine whether there are any processes listening to the file.
9. The event listening system of claim 1 further comprising:
an event message module configured to send an event message to the plurality of processes that are relevant to the first file.
10. The event listening system of claim 1 further comprising:
a plurality of message queues corresponding to the plurality of processes configured to receive and queue the event messages.
11. A method for listening for events, the method comprising:
logically storing a plurality of files in a tree;
for each of the plurality of files, tracking a set of listening files that are listening for events that affect the corresponding file;
storing a plurality of processes that each listen for events that affect at least one of the plurality of files;
for each of the plurality of files, tracking which of the plurality of processes are listening to each of the files;
receiving an identifier for a first file of the plurality of files;
determining whether the first file is relevant to any of the plurality of processes using the first data structure and the second data structure;
traversing a first set of first data structures that correspond to a subset of the plurality of files that represent one branch of the tree; and
updating at least one of the corresponding first data structures of the file in at least one traversed level, wherein updating includes reviewing a scope of at least one of the listening files of the first data structure that corresponds to the file's parent.
12. The method of claim 11, wherein the file system is a distributed file system.
13. The method of claim 11, wherein the files include files and directories.
14. A system for listening for events, the system comprising:
a file structure comprising a plurality of files that are logically stored in a tree;
for each of the plurality of files, a data structure corresponding to each file, the data structure comprising:
a set of identifiers of the plurality of files that are listening for events that affect the corresponding file; and
an indication of the currentness of the data structure.
15. The system of claim 14 wherein the indication of the currentness of the data structure is a generation number.
16. The system of claim 14, further comprising:
an update module configured to update the data structures that correspond to a subset of the plurality of files that represent one branch of the tree.
17. The system of claim 16, wherein the update module is configured:
to begin with the data structure that corresponds to the leaf node file of the one branch;
to climb the one branch, reviewing the corresponding data structure of the file at each climbed level in the one branch, until reaching a data structure whose indication indicates that the data structure is current; and
from that level, to traverse down the one branch to the leaf node file updating the corresponding data structure of the file at each traversed level in the one branch.
18. The system of claim 17, wherein the update module is further configured:
to update the corresponding data structure of the file at each traversed level in the one branch using the corresponding data structure of the file's parent.
19. A method for listening for events, the method comprising:
logically storing a plurality of files in a tree; and
for each of the plurality of files, storing a data structure corresponding to each file, the data structure comprising a set of identifiers of the plurality of files that are listening for events that affect the corresponding file and an indication of the currentness of the data structure.
20. The method of claim 19, further comprising:
updating the data structures that correspond to a subset of the plurality of files that represent one branch of the tree.
21. The method of claim 20, further comprising:
starting with the data structure that corresponds to the leaf node file of the one branch;
climbing the one branch;
reviewing the corresponding data structure of the file at each climbed level in the one branch, until reaching a data structure whose indication indicates that the data structure is current; and
from that level, traversing down the one branch to the leaf node file and updating the corresponding data structure of the file at each traversed level in the one branch.
22. The event listening method of claim 21 further comprising:
updating the corresponding data structure of the file at each traversed level in the one branch using the corresponding data structure of the file's parent.
23. A storage medium having a computer program stored thereon for causing a suitably programmed system to process the computer program by performing the method of claim 19 when such computer program is executed on the system.
24. A system for queuing event messages in a file system, the system comprising:
a plurality of processes that each listen for events that affect at least one of a plurality of files;
a first data structure configured to determine, for each of the plurality of processes, a set of listening files to which each of the plurality of processes is listening; and
a message module configured:
to receive an event message related to a first file of the plurality of files, the event message including an indication of a minimum scope that would have generated the event message;
to search the first data structure to determine a first subset of the plurality of processes that listen for files that are affected by the event using the sets of listening files;
to determine a second subset of the first subset by removing, from the first subset, processes whose scope is less than the minimum scope of the event message; and
to inform the second subset of the event message.
25. The system of claim 24 wherein each of the plurality of processes includes an event queue and the message module is further configured to send the event message to the event queues of the second subset.
26. The system of claim 24, wherein the minimum scope is determined by a scope module configured to:
receive a first identifier corresponding to the first file;
receive a second identifier corresponding to a second file;
determine whether the first file matches the second file;
if the first file matches the second file, returning a minimum scope of single; and
if the first file does not match the second file,
determining if the first file is the immediate child of the second file;
if the first file is the immediate child of the second file, returning a minimum scope of children; and
if the first file is not the immediate child of the second file, returning a minimum scope of recursive.
US11/396,282 2006-03-31 2006-03-31 Systems and methods for notifying listeners of events Active 2027-01-26 US7756898B2 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US11/396,282 US7756898B2 (en) 2006-03-31 2006-03-31 Systems and methods for notifying listeners of events
US12/789,393 US8005865B2 (en) 2006-03-31 2010-05-27 Systems and methods for notifying listeners of events

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US11/396,282 US7756898B2 (en) 2006-03-31 2006-03-31 Systems and methods for notifying listeners of events

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US12/789,393 Continuation US8005865B2 (en) 2006-03-31 2010-05-27 Systems and methods for notifying listeners of events

Publications (2)

Publication Number Publication Date
US20070233710A1 true US20070233710A1 (en) 2007-10-04
US7756898B2 US7756898B2 (en) 2010-07-13

Family

ID=38560640

Family Applications (2)

Application Number Title Priority Date Filing Date
US11/396,282 Active 2027-01-26 US7756898B2 (en) 2006-03-31 2006-03-31 Systems and methods for notifying listeners of events
US12/789,393 Active US8005865B2 (en) 2006-03-31 2010-05-27 Systems and methods for notifying listeners of events

Family Applications After (1)

Application Number Title Priority Date Filing Date
US12/789,393 Active US8005865B2 (en) 2006-03-31 2010-05-27 Systems and methods for notifying listeners of events

Country Status (1)

Country Link
US (2) US7756898B2 (en)

Cited By (48)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080046445A1 (en) * 2006-08-18 2008-02-21 Passey Aaron J Systems and methods of reverse lookup
US7676691B2 (en) 2006-08-18 2010-03-09 Isilon Systems, Inc. Systems and methods for providing nonlinear journaling
US7680836B2 (en) 2006-08-18 2010-03-16 Isilon Systems, Inc. Systems and methods for a snapshot of data
US7680842B2 (en) 2006-08-18 2010-03-16 Isilon Systems, Inc. Systems and methods for a snapshot of data
US7685126B2 (en) 2001-08-03 2010-03-23 Isilon Systems, Inc. System and methods for providing a distributed file system utilizing metadata to track information about data stored throughout the system
US7752402B2 (en) 2006-08-18 2010-07-06 Isilon Systems, Inc. Systems and methods for allowing incremental journaling
US7756898B2 (en) 2006-03-31 2010-07-13 Isilon Systems, Inc. Systems and methods for notifying listeners of events
US7779048B2 (en) 2007-04-13 2010-08-17 Isilon Systems, Inc. Systems and methods of providing possible value ranges
US7788303B2 (en) 2005-10-21 2010-08-31 Isilon Systems, Inc. Systems and methods for distributed system scanning
US7797283B2 (en) 2005-10-21 2010-09-14 Isilon Systems, Inc. Systems and methods for maintaining distributed data
US7822932B2 (en) 2006-08-18 2010-10-26 Isilon Systems, Inc. Systems and methods for providing nonlinear journaling
US7844617B2 (en) 2006-12-22 2010-11-30 Isilon Systems, Inc. Systems and methods of directory entry encodings
US7848261B2 (en) 2006-02-17 2010-12-07 Isilon Systems, Inc. Systems and methods for providing a quiescing protocol
US7870345B2 (en) 2008-03-27 2011-01-11 Isilon Systems, Inc. Systems and methods for managing stalled storage devices
US7882068B2 (en) 2007-08-21 2011-02-01 Isilon Systems, Inc. Systems and methods for adaptive copy on write
US7882071B2 (en) 2006-08-18 2011-02-01 Isilon Systems, Inc. Systems and methods for a snapshot of data
US7900015B2 (en) 2007-04-13 2011-03-01 Isilon Systems, Inc. Systems and methods of quota accounting
US7899800B2 (en) 2006-08-18 2011-03-01 Isilon Systems, Inc. Systems and methods for providing nonlinear journaling
US20110066796A1 (en) * 2009-09-11 2011-03-17 Sean Eilert Autonomous subsystem architecture
US7917474B2 (en) 2005-10-21 2011-03-29 Isilon Systems, Inc. Systems and methods for accessing and updating distributed data
US7937421B2 (en) 2002-11-14 2011-05-03 Emc Corporation Systems and methods for restriping files in a distributed file system
US7949692B2 (en) 2007-08-21 2011-05-24 Emc Corporation Systems and methods for portals into snapshot data
US7949636B2 (en) 2008-03-27 2011-05-24 Emc Corporation Systems and methods for a read only mode for a portion of a storage system
US7953704B2 (en) 2006-08-18 2011-05-31 Emc Corporation Systems and methods for a snapshot of data
US7953709B2 (en) 2008-03-27 2011-05-31 Emc Corporation Systems and methods for a read only mode for a portion of a storage system
US7962779B2 (en) 2001-08-03 2011-06-14 Emc Corporation Systems and methods for a distributed file system with data recovery
US7966289B2 (en) 2007-08-21 2011-06-21 Emc Corporation Systems and methods for reading objects in a file system
US7984324B2 (en) 2008-03-27 2011-07-19 Emc Corporation Systems and methods for managing stalled storage devices
US8051425B2 (en) 2004-10-29 2011-11-01 Emc Corporation Distributed system with asynchronous execution systems and methods
US8055711B2 (en) 2004-10-29 2011-11-08 Emc Corporation Non-blocking commit protocol systems and methods
US8054765B2 (en) 2005-10-21 2011-11-08 Emc Corporation Systems and methods for providing variable protection
US8082379B2 (en) 2007-01-05 2011-12-20 Emc Corporation Systems and methods for managing semantic locks
US8238350B2 (en) 2004-10-29 2012-08-07 Emc Corporation Message batching with checkpoints systems and methods
US8286029B2 (en) 2006-12-21 2012-10-09 Emc Corporation Systems and methods for managing unavailable storage devices
US8539056B2 (en) 2006-08-02 2013-09-17 Emc Corporation Systems and methods for configuring multiple network interfaces
US20130311523A1 (en) * 2009-09-02 2013-11-21 Microsoft Corporation Extending file system namespace types
US20140282585A1 (en) * 2013-03-13 2014-09-18 Barracuda Networks, Inc. Organizing File Events by Their Hierarchical Paths for Multi-Threaded Synch and Parallel Access System, Apparatus, and Method of Operation
US20140359232A1 (en) * 2013-05-10 2014-12-04 Hugh W. Holbrook System and method of a shared memory hash table with notifications
US8966080B2 (en) 2007-04-13 2015-02-24 Emc Corporation Systems and methods of managing resource utilization on a threaded computer system
US8972345B1 (en) * 2006-09-27 2015-03-03 Hewlett-Packard Development Company, L.P. Modifying data structures in distributed file systems
US20150347526A1 (en) * 2011-03-14 2015-12-03 Splunk Inc. Display for a number of unique values for an event field
US10003675B2 (en) 2013-12-02 2018-06-19 Micron Technology, Inc. Packet processor receiving packets containing instructions, data, and starting location and generating packets containing instructions and data
CN109445966A (en) * 2018-11-06 2019-03-08 网易传媒科技(北京)有限公司 Event-handling method, device, medium and calculating equipment
CN109525466A (en) * 2019-01-03 2019-03-26 杭州云英网络科技有限公司 Back end monitor method and device
US10769097B2 (en) 2009-09-11 2020-09-08 Micron Technologies, Inc. Autonomous memory architecture
US11068469B2 (en) 2015-09-04 2021-07-20 Arista Networks, Inc. System and method of a dynamic shared memory hash table with notifications
US20220050819A1 (en) * 2020-08-13 2022-02-17 Red Hat, Inc. Automated pinning of file system subtrees
US11757755B1 (en) * 2020-09-28 2023-09-12 Cyral Inc. Techniques for in-band topology connections in a proxy

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010039966A (en) * 2008-08-08 2010-02-18 Hitachi Ltd Data management method
US9904681B2 (en) * 2009-01-12 2018-02-27 Sri International Method and apparatus for assembling a set of documents related to a triggering item
US20110040666A1 (en) * 2009-08-17 2011-02-17 Jason Crabtree Dynamic pricing system and method for complex energy securities
US20100217550A1 (en) * 2009-02-26 2010-08-26 Jason Crabtree System and method for electric grid utilization and optimization
US8196116B2 (en) * 2009-03-31 2012-06-05 International Business Systems Corporation Tracing objects in object-oriented programming model
US9122786B2 (en) * 2012-09-14 2015-09-01 Software Ag Systems and/or methods for statistical online analysis of large and potentially heterogeneous data sets
US9680692B2 (en) * 2013-01-23 2017-06-13 Facebook, Inc. Method and system for using a recursive event listener on a node in hierarchical data structure
US10514985B1 (en) 2013-09-30 2019-12-24 EMC IP Holding Company LLC Summary file change log for faster forever incremental backup
US9418097B1 (en) * 2013-11-15 2016-08-16 Emc Corporation Listener event consistency points
US9501487B1 (en) * 2014-06-30 2016-11-22 Emc Corporation Change tree incremental backup
US9953070B1 (en) 2015-04-05 2018-04-24 Simply Data Now Inc. Enterprise resource planning (ERP) system data extraction, loading, and directing
WO2018032519A1 (en) * 2016-08-19 2018-02-22 华为技术有限公司 Resource allocation method and device, and numa system
CN108984544B (en) * 2017-05-31 2021-04-30 北京京东尚科信息技术有限公司 Method and device for modifying configuration information of distributed system
US11520826B2 (en) * 2019-02-20 2022-12-06 Bank Of America Corporation Data extraction using a distributed indexing architecture for databases
CN111488101B (en) * 2020-04-10 2021-09-10 得到(天津)文化传播有限公司 Event monitoring response method, device, equipment and storage medium

Citations (97)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5181162A (en) * 1989-12-06 1993-01-19 Eastman Kodak Company Document management and production system
US5212784A (en) * 1990-10-22 1993-05-18 Delphi Data, A Division Of Sparks Industries, Inc. Automated concurrent data backup system
US5403639A (en) * 1992-09-02 1995-04-04 Storage Technology Corporation File server having snapshot application data groups
US5596709A (en) * 1990-06-21 1997-01-21 International Business Machines Corporation Method and apparatus for recovering parity protected data
US5612865A (en) * 1995-06-01 1997-03-18 Ncr Corporation Dynamic hashing method for optimal distribution of locks within a clustered system
US5734826A (en) * 1991-03-29 1998-03-31 International Business Machines Corporation Variable cyclic redundancy coding method and apparatus for use in a multistage network
US5761659A (en) * 1996-02-29 1998-06-02 Sun Microsystems, Inc. Method, product, and structure for flexible range locking of read and write requests using shared and exclusive locks, flags, sub-locks, and counters
US5774643A (en) * 1995-10-13 1998-06-30 Digital Equipment Corporation Enhanced raid write hole protection and recovery
US5862312A (en) * 1995-10-24 1999-01-19 Seachange Technology, Inc. Loosely coupled mass storage computer cluster
US5870563A (en) * 1992-09-19 1999-02-09 International Business Machines Corporation Method and apparatus for optimizing message transmission
US5878414A (en) * 1997-06-06 1999-03-02 International Business Machines Corp. Constructing a transaction serialization order based on parallel or distributed database log files
US5878410A (en) * 1996-09-13 1999-03-02 Microsoft Corporation File system sort order indexes
US5884098A (en) * 1996-04-18 1999-03-16 Emc Corporation RAID controller system utilizing front end and back end caching systems including communication path connecting two caching systems and synchronizing allocation of blocks in caching systems
US5884046A (en) * 1996-10-23 1999-03-16 Pluris, Inc. Apparatus and method for sharing data and routing messages between a plurality of workstations in a local area network
US5884303A (en) * 1996-03-15 1999-03-16 International Computers Limited Parallel searching technique
US5890147A (en) * 1997-03-07 1999-03-30 Microsoft Corporation Scope testing of documents in a search engine using document to folder mapping
US6021414A (en) * 1995-09-11 2000-02-01 Sun Microsystems, Inc. Single transaction technique for a journaling file system of a computer operating system
US6029168A (en) * 1998-01-23 2000-02-22 Tricord Systems, Inc. Decentralized file mapping in a striped network file system in a distributed computing environment
US6038570A (en) * 1993-06-03 2000-03-14 Network Appliance, Inc. Method for allocating files in a file system integrated with a RAID disk sub-system
US6044367A (en) * 1996-08-02 2000-03-28 Hewlett-Packard Company Distributed I/O store
US6055543A (en) * 1997-11-21 2000-04-25 Verano File wrapper containing cataloging information for content searching across multiple platforms
US6070172A (en) * 1997-03-06 2000-05-30 Oracle Corporation On-line free space defragmentation of a contiguous-file file system
US6081833A (en) * 1995-07-06 2000-06-27 Kabushiki Kaisha Toshiba Memory space management method, data transfer method, and computer device for distributed computer system
US6081883A (en) * 1997-12-05 2000-06-27 Auspex Systems, Incorporated Processing system with dynamically allocatable buffer memory
US6173374B1 (en) * 1998-02-11 2001-01-09 Lsi Logic Corporation System and method for peer-to-peer accelerated I/O shipping between host bus adapters in clustered computer network
US6209059B1 (en) * 1997-09-25 2001-03-27 Emc Corporation Method and apparatus for the on-line reconfiguration of the logical volumes of a data storage system
US6219693B1 (en) * 1997-11-04 2001-04-17 Adaptec, Inc. File array storage architecture having file system distributed across a data processing platform
US20020010696A1 (en) * 2000-06-01 2002-01-24 Tadanori Izumi Automatic aggregation method, automatic aggregation apparatus, and recording medium having automatic aggregation program
US6353823B1 (en) * 1999-03-08 2002-03-05 Intel Corporation Method and system for using associative metadata
US20020035668A1 (en) * 1998-05-27 2002-03-21 Yasuhiko Nakano Information storage system for redistributing information to information storage devices when a structure of the information storage devices is changed
US6384626B2 (en) * 2000-07-19 2002-05-07 Micro-Star Int'l Co., Ltd. Programmable apparatus and method for programming a programmable device
US6385626B1 (en) * 1998-11-19 2002-05-07 Emc Corporation Method and apparatus for identifying changes to a logical object based on changes to the logical object at physical level
US20020055940A1 (en) * 2000-11-07 2002-05-09 Charles Elkan Method and system for selecting documents by measuring document quality
US6397311B1 (en) * 1990-01-19 2002-05-28 Texas Instruments Incorporated System and method for defragmenting a file system
US6405219B2 (en) * 1999-06-22 2002-06-11 F5 Networks, Inc. Method and system for automatically updating the version of a set of files stored on content servers
US20020072974A1 (en) * 2000-04-03 2002-06-13 Pugliese Anthony V. System and method for displaying and selling goods and services in a retail environment employing electronic shopper aids
US6408313B1 (en) * 1998-12-16 2002-06-18 Microsoft Corporation Dynamic memory allocation based on free memory size
US20020078180A1 (en) * 2000-12-18 2002-06-20 Kizna Corporation Information collection server, information collection method, and recording medium
US20020075870A1 (en) * 2000-08-25 2002-06-20 De Azevedo Marcelo Method and apparatus for discovering computer systems in a distributed multi-system cluster
US20020083078A1 (en) * 2000-11-02 2002-06-27 Guy Pardon Decentralized, distributed internet data management
US20030005159A1 (en) * 2001-06-07 2003-01-02 International Business Machines Corporation Method and system for generating and serving multilingual web pages
US20030014391A1 (en) * 2000-03-07 2003-01-16 Evans Paul A Data distribution
US20030033308A1 (en) * 2001-08-03 2003-02-13 Patel Sujal M. System and methods for providing a distributed file system utilizing metadata to track information about data stored throughout the system
US6523130B1 (en) * 1999-03-11 2003-02-18 Microsoft Corporation Storage system having error detection and recovery
US6526478B1 (en) * 2000-02-02 2003-02-25 Lsi Logic Corporation Raid LUN creation using proportional disk mapping
US6549513B1 (en) * 1999-10-12 2003-04-15 Alcatel Method and apparatus for fast distributed restoration of a communication network
US6557114B2 (en) * 1995-10-24 2003-04-29 Seachange Technology, Inc. Loosely coupled mass storage computer cluster
US6567926B2 (en) * 1995-10-24 2003-05-20 Seachange International, Inc. Loosely coupled mass storage computer cluster
US6567894B1 (en) * 1999-12-08 2003-05-20 International Business Machines Corporation Method and apparatus to prefetch sequential pages in a multi-stream environment
US6571244B1 (en) * 1999-10-28 2003-05-27 Microsoft Corporation Run formation in large scale sorting using batched replacement selection
US20030109253A1 (en) * 2000-01-18 2003-06-12 Fenton Shaun Richard Digital communications system
US20040003053A1 (en) * 2002-03-25 2004-01-01 Williams Michael John System
US20040024963A1 (en) * 2002-08-05 2004-02-05 Nisha Talagala Method and system for striping data to accommodate integrity metadata
US20040078812A1 (en) * 2001-01-04 2004-04-22 Calvert Kerry Wayne Method and apparatus for acquiring media services available from content aggregators
US6732125B1 (en) * 2000-09-08 2004-05-04 Storage Technology Corporation Self archiving log structured volume with intrinsic data protection
US20050005266A1 (en) * 1997-05-01 2005-01-06 Datig William E. Method of and apparatus for realizing synthetic knowledge processes in devices for useful applications
US6848029B2 (en) * 2000-01-03 2005-01-25 Dirk Coldewey Method and apparatus for prefetching recursive data structures
US20050066095A1 (en) * 2003-09-23 2005-03-24 Sachin Mullick Multi-threaded write interface and methods for increasing the single file read and write throughput of a file server
US20050114609A1 (en) * 2003-11-26 2005-05-26 Shorb Charles S. Computer-implemented system and method for lock handling
US20050114402A1 (en) * 2003-11-20 2005-05-26 Zetta Systems, Inc. Block level data snapshot system and method
US20060004760A1 (en) * 2004-06-21 2006-01-05 Microsoft Corporation Method, system, and apparatus for managing access to a data object
US6990604B2 (en) * 2001-12-28 2006-01-24 Storage Technology Corporation Virtual storage status coalescing with a plurality of physical storage devices
US20060041894A1 (en) * 2004-08-03 2006-02-23 Tu-An Cheng Apparatus, system, and method for isolating a storage application from a network interface driver
US7007097B1 (en) * 2000-07-20 2006-02-28 Silicon Graphics, Inc. Method and system for covering multiple resourcces with a single credit in a computer system
US7007044B1 (en) * 2002-12-26 2006-02-28 Storage Technology Corporation Storage backup system for backing up data written to a primary storage device to multiple virtual mirrors using a reconciliation process that reflects the changing state of the primary storage device over time
US20060059467A1 (en) * 2004-09-16 2006-03-16 International Business Machines Corporation Fast source file to line number table association
US7017003B2 (en) * 2004-02-16 2006-03-21 Hitachi, Ltd. Disk array apparatus and disk array apparatus control method
US20060074922A1 (en) * 2002-11-25 2006-04-06 Kozo Nishimura File management device, file management method, file management program and recording medium
US20060083177A1 (en) * 2004-10-18 2006-04-20 Nokia Corporation Listener mechanism in a distributed network system
US20060095438A1 (en) * 2004-10-29 2006-05-04 Fachan Neal T Non-blocking commit protocol systems and methods
US20060101062A1 (en) * 2004-10-29 2006-05-11 Godman Peter J Distributed system with asynchronous execution systems and methods
US7177295B1 (en) * 2002-03-08 2007-02-13 Scientific Research Corporation Wireless routing protocol for ad-hoc networks
US7184421B1 (en) * 2001-12-21 2007-02-27 Itt Manufacturing Enterprises, Inc. Method and apparatus for on demand multicast and unicast using controlled flood multicast communications
US20070091790A1 (en) * 2005-10-21 2007-04-26 Passey Aaron J Systems and methods for providing variable protection
US20070094452A1 (en) * 2005-10-21 2007-04-26 Fachan Neal T Systems and methods for using excitement values to predict future access to resources
US20070094277A1 (en) * 2005-10-21 2007-04-26 Fachan Neal T Systems and methods for maintaining distributed data
US20070094431A1 (en) * 2005-10-21 2007-04-26 Fachan Neal T Systems and methods for managing concurrent access requests to a shared resource
US20070094310A1 (en) * 2005-10-21 2007-04-26 Passey Aaron J Systems and methods for accessing and updating distributed data
US20070094269A1 (en) * 2005-10-21 2007-04-26 Mikesell Paul A Systems and methods for distributed system scanning
US20080005145A1 (en) * 2006-06-30 2008-01-03 Data Equation Limited Data processing
US7318134B1 (en) * 2004-03-16 2008-01-08 Emc Corporation Continuous data backup using distributed journaling
US20080010507A1 (en) * 2006-05-30 2008-01-10 Oracle International Corporation Selecting optimal repair strategy for mirrored files
US20080031238A1 (en) * 2006-08-02 2008-02-07 Shai Harmelin Systems and methods for configuring multiple network interfaces
US20080034004A1 (en) * 2006-08-04 2008-02-07 Pavel Cisler System for electronic backup
US20080046445A1 (en) * 2006-08-18 2008-02-21 Passey Aaron J Systems and methods of reverse lookup
US20080046476A1 (en) * 2006-08-18 2008-02-21 Anderson Robert J Systems and methods for a snapshot of data
US20080046432A1 (en) * 2006-08-18 2008-02-21 Anderson Robert J Systems and methods for a snapshot of data
US20080046475A1 (en) * 2006-08-18 2008-02-21 Anderson Robert J Systems and methods for a snapshot of data
US20080046443A1 (en) * 2006-08-18 2008-02-21 Fachan Neal T Systems and methods for providing nonlinear journaling
US20080044016A1 (en) * 2006-08-04 2008-02-21 Henzinger Monika H Detecting duplicate and near-duplicate files
US20080046667A1 (en) * 2006-08-18 2008-02-21 Fachan Neal T Systems and methods for allowing incremental journaling
US20080046444A1 (en) * 2006-08-18 2008-02-21 Fachan Neal T Systems and methods for providing nonlinear journaling
US20080059541A1 (en) * 2006-08-18 2008-03-06 Fachan Neal T Systems and methods for a snapshot of data
US7373426B2 (en) * 2002-03-29 2008-05-13 Kabushiki Kaisha Toshiba Network system using name server with pseudo host name and pseudo IP address generation function
US20080126365A1 (en) * 2006-08-18 2008-05-29 Fachan Neal T Systems and methods for providing nonlinear journaling
US7509448B2 (en) * 2007-01-05 2009-03-24 Isilon Systems, Inc. Systems and methods for managing semantic locks
US7533298B2 (en) * 2005-09-07 2009-05-12 Lsi Corporation Write journaling using battery backed cache

Family Cites Families (235)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5163131A (en) 1989-09-08 1992-11-10 Auspex Systems, Inc. Parallel i/o network file server architecture
US5230047A (en) 1990-04-16 1993-07-20 International Business Machines Corporation Method for balancing of distributed tree file structures in parallel computing systems to enable recovery after a failure
DE69017198T2 (en) * 1990-05-15 1995-08-17 Ibm Hybrid switching system for a communication node.
US5329626A (en) 1990-10-23 1994-07-12 Digital Equipment Corporation System for distributed computation processing includes dynamic assignment of predicates to define interdependencies
AU8683991A (en) 1990-11-09 1992-05-14 Array Technology Corporation Logical partitioning of a redundant array storage system
US5258984A (en) 1991-06-13 1993-11-02 International Business Machines Corporation Method and means for distributed sparing in DASD arrays
JP3160106B2 (en) 1991-12-23 2001-04-23 ヒュンダイ エレクトロニクス アメリカ How to sort disk arrays
US5359594A (en) 1992-05-15 1994-10-25 International Business Machines Corporation Power-saving full duplex nodal communications systems
US5469562A (en) * 1992-06-26 1995-11-21 Digital Equipment Corporation Durable atomic storage update manager
EP0595453B1 (en) 1992-10-24 1998-11-11 International Computers Limited Distributed data processing system
US5649200A (en) 1993-01-08 1997-07-15 Atria Software, Inc. Dynamic rule-based version control system
US5548724A (en) 1993-03-22 1996-08-20 Hitachi, Ltd. File server system and file access control method of the same
US5963962A (en) 1995-05-31 1999-10-05 Network Appliance, Inc. Write anywhere file-system layout
US6138126A (en) 1995-05-31 2000-10-24 Network Appliance, Inc. Method for allocating files in a file system integrated with a raid disk sub-system
US5548795A (en) 1994-03-28 1996-08-20 Quantum Corporation Method for determining command execution dependencies within command queue reordering process
DE69429983T2 (en) * 1994-05-25 2002-10-17 Ibm Data transmission network and method for operating the network
US5657439A (en) 1994-08-23 1997-08-12 International Business Machines Corporation Distributed subsystem sparing
US5694593A (en) 1994-10-05 1997-12-02 Northeastern University Distributed computer database system and method
EP0709779B1 (en) 1994-10-31 2001-05-30 International Business Machines Corporation Virtual shared disks with application-transparent recovery
US6108759A (en) 1995-02-23 2000-08-22 Powerquest Corporation Manipulation of partitions holding advanced file systems
JP3358687B2 (en) 1995-03-13 2002-12-24 株式会社日立製作所 Disk array device
US5696895A (en) 1995-05-19 1997-12-09 Compaq Computer Corporation Fault tolerant multiple network servers
US5680621A (en) 1995-06-07 1997-10-21 International Business Machines Corporation System and method for domained incremental changes storage and retrieval
US5787267A (en) 1995-06-07 1998-07-28 Monolithic System Technology, Inc. Caching method and circuit for a memory system with circuit module architecture
US5875456A (en) 1995-08-17 1999-02-23 Nstor Corporation Storage device array and methods for striping and unstriping data and for adding and removing disks online to/from a raid storage array
US5778395A (en) * 1995-10-23 1998-07-07 Stac, Inc. System for backing up files from disk volumes on multiple nodes of a computer network
US5805578A (en) 1995-10-27 1998-09-08 International Business Machines Corporation Automatic reconfiguration of multipoint communication channels
US5799305A (en) 1995-11-02 1998-08-25 Informix Software, Inc. Method of commitment in a distributed database transaction
JPH103421A (en) 1995-11-20 1998-01-06 Matsushita Electric Ind Co Ltd Virtual file management system
US6117181A (en) * 1996-03-22 2000-09-12 Sun Microsystems, Inc. Synchronization mechanism for distributed hardware simulation
US5806065A (en) 1996-05-06 1998-09-08 Microsoft Corporation Data system with distributed tree indexes and method for maintaining the indexes
US5917998A (en) 1996-07-26 1999-06-29 International Business Machines Corporation Method and apparatus for establishing and maintaining the status of membership sets used in mirrored read and write input/output without logging
US5805900A (en) 1996-09-26 1998-09-08 International Business Machines Corporation Method and apparatus for serializing resource access requests in a multisystem complex
US6202085B1 (en) * 1996-12-06 2001-03-13 Microsoft Corportion System and method for incremental change synchronization between multiple copies of data
US5822790A (en) 1997-02-07 1998-10-13 Sun Microsystems, Inc. Voting data prefetch engine
US5943690A (en) 1997-04-07 1999-08-24 Sony Corporation Data storage apparatus and method allocating sets of data
US6393483B1 (en) 1997-06-30 2002-05-21 Adaptec, Inc. Method and apparatus for network interface card load balancing and port aggregation
US5963963A (en) 1997-07-11 1999-10-05 International Business Machines Corporation Parallel file system and buffer management arbitration
US6014669A (en) * 1997-10-01 2000-01-11 Sun Microsystems, Inc. Highly-available distributed cluster configuration database
US5933834A (en) 1997-10-16 1999-08-03 International Business Machines Incorporated System and method for re-striping a set of objects onto an exploded array of storage units in a computer system
US6442533B1 (en) 1997-10-29 2002-08-27 William H. Hinkle Multi-processing financial transaction processing system
US5966707A (en) 1997-12-02 1999-10-12 International Business Machines Corporation Method for managing a plurality of data processes residing in heterogeneous data repositories
US6226377B1 (en) 1998-03-06 2001-05-01 Avaya Technology Corp. Prioritized transaction server allocation
US6055564A (en) 1998-03-11 2000-04-25 Hewlett Packard Company Admission control where priority indicator is used to discriminate between messages
US6421781B1 (en) 1998-04-30 2002-07-16 Openwave Systems Inc. Method and apparatus for maintaining security in a push server
US6122754A (en) 1998-05-22 2000-09-19 International Business Machines Corporation Method and system for data recovery using a distributed and scalable data structure
US6463442B1 (en) 1998-06-30 2002-10-08 Microsoft Corporation Container independent data binding system
US6631411B1 (en) 1998-10-12 2003-10-07 Freshwater Software, Inc. Apparatus and method for monitoring a chain of electronic transactions
US6862635B1 (en) 1998-11-13 2005-03-01 Cray Inc. Synchronization techniques in a multithreaded environment
US6279007B1 (en) 1998-11-30 2001-08-21 Microsoft Corporation Architecture for managing query friendly hierarchical values
US6434574B1 (en) 1998-12-17 2002-08-13 Apple Computer, Inc. System and method for storing and retrieving filenames and files in computer memory using multiple encodings
US6457139B1 (en) 1998-12-30 2002-09-24 Emc Corporation Method and apparatus for providing a host computer with information relating to the mapping of logical volumes within an intelligent storage system
US6922708B1 (en) 1999-02-18 2005-07-26 Oracle International Corporation File system that supports transactions
US6334168B1 (en) 1999-02-19 2001-12-25 International Business Machines Corporation Method and system for updating data in a data storage system
US6321345B1 (en) 1999-03-01 2001-11-20 Seachange Systems, Inc. Slow response in redundant arrays of inexpensive disks
TW418360B (en) 1999-03-02 2001-01-11 Via Tech Inc Memory access controller
US6725392B1 (en) 1999-03-03 2004-04-20 Adaptec, Inc. Controller fault recovery system for a distributed file system
US6502174B1 (en) 1999-03-03 2002-12-31 International Business Machines Corporation Method and system for managing meta data
US6658554B1 (en) 1999-03-09 2003-12-02 Wisconsin Alumni Res Found Electronic processor providing direct data transfer between linked data consuming instructions
US6671704B1 (en) 1999-03-11 2003-12-30 Hewlett-Packard Development Company, L.P. Method and apparatus for handling failures of resource managers in a clustered environment
US6907011B1 (en) 1999-03-30 2005-06-14 International Business Machines Corporation Quiescent reconfiguration of a routing network
US6801949B1 (en) 1999-04-12 2004-10-05 Rainfinity, Inc. Distributed server cluster with graphical user interface
US6496842B1 (en) 1999-05-28 2002-12-17 Survol Interactive Technologies Navigating heirarchically organized information
US6453389B1 (en) 1999-06-25 2002-09-17 Hewlett-Packard Company Optimizing computer performance by using data compression principles to minimize a loss function
US6415259B1 (en) 1999-07-15 2002-07-02 American Management Systems, Inc. Automatic work progress tracking and optimizing engine for a telecommunications customer care and billing system
US7290056B1 (en) 1999-09-09 2007-10-30 Oracle International Corporation Monitoring latency of a network to manage termination of distributed transactions
US7206805B1 (en) 1999-09-09 2007-04-17 Oracle International Corporation Asynchronous transcription object management system
US6895482B1 (en) 1999-09-10 2005-05-17 International Business Machines Corporation Reordering and flushing commands in a computer memory subsystem
US20020029200A1 (en) * 1999-09-10 2002-03-07 Charles Dulin System and method for providing certificate validation and other services
US6662184B1 (en) 1999-09-23 2003-12-09 International Business Machines Corporation Lock-free wild card search data structure and method
US7069320B1 (en) 1999-10-04 2006-06-27 International Business Machines Corporation Reconfiguring a network by utilizing a predetermined length quiescent state
US6359594B1 (en) * 1999-12-01 2002-03-19 Logitech Europe S.A. Loop antenna parasitics reduction technique
US6584581B1 (en) 1999-12-06 2003-06-24 Ab Initio Software Corporation Continuous flow checkpointing data processing
US6546443B1 (en) 1999-12-15 2003-04-08 Microsoft Corporation Concurrency-safe reader-writer lock with time out support
US6748429B1 (en) 2000-01-10 2004-06-08 Sun Microsystems, Inc. Method to dynamically change cluster or distributed system configuration
US7213063B2 (en) 2000-01-18 2007-05-01 Lucent Technologies Inc. Method, apparatus and system for maintaining connections between computers using connection-oriented protocols
US6594660B1 (en) 2000-01-24 2003-07-15 Microsoft Corporation Share latch clearing
US20020091855A1 (en) 2000-02-02 2002-07-11 Yechiam Yemini Method and apparatus for dynamically addressing and routing in a data network
US6917942B1 (en) 2000-03-15 2005-07-12 International Business Machines Corporation System for dynamically evaluating locks in a distributed data storage system
US20020049778A1 (en) 2000-03-31 2002-04-25 Bell Peter W. System and method of information outsourcing
US6598174B1 (en) 2000-04-26 2003-07-22 Dell Products L.P. Method and apparatus for storage unit replacement in non-redundant array
US6735678B2 (en) 2000-05-24 2004-05-11 Seagate Technology Llc Method and apparatus for disc drive defragmentation
US6922696B1 (en) 2000-05-31 2005-07-26 Sri International Lattice-based security classification system and method
US6742020B1 (en) 2000-06-08 2004-05-25 Hewlett-Packard Development Company, L.P. System and method for managing data flow and measuring service in a storage network
US6618798B1 (en) 2000-07-11 2003-09-09 International Business Machines Corporation Method, system, program, and data structures for mapping logical units to a storage space comprises of at least one array of storage units
US6898607B2 (en) 2000-07-11 2005-05-24 Sony Corporation Proposed syntax for a synchronized commands execution
US6671772B1 (en) 2000-09-20 2003-12-30 Robert E. Cousins Hierarchical file system structure for enhancing disk transfer efficiency
JP2002108573A (en) * 2000-09-28 2002-04-12 Nec Corp Disk array device and method for controlling its error and recording medium with its control program recorded thereon
US6970939B2 (en) 2000-10-26 2005-11-29 Intel Corporation Method and apparatus for large payload distribution in a network
US6687805B1 (en) * 2000-10-30 2004-02-03 Hewlett-Packard Development Company, L.P. Method and system for logical-object-to-physical-location translation and physical separation of logical objects
US7313614B2 (en) 2000-11-02 2007-12-25 Sun Microsystems, Inc. Switching system
US6499091B1 (en) 2000-11-13 2002-12-24 Lsi Logic Corporation System and method for synchronizing data mirrored by storage subsystems
US6594744B1 (en) 2000-12-11 2003-07-15 Lsi Logic Corporation Managing a snapshot volume or one or more checkpoint volumes with multiple point-in-time images in a single repository
US6856591B1 (en) * 2000-12-15 2005-02-15 Cisco Technology, Inc. Method and system for high reliability cluster management
US20020078161A1 (en) 2000-12-19 2002-06-20 Philips Electronics North America Corporation UPnP enabling device for heterogeneous networks of slave devices
US6785678B2 (en) * 2000-12-21 2004-08-31 Emc Corporation Method of improving the availability of a computer clustering system through the use of a network medium link state function
US6990611B2 (en) * 2000-12-29 2006-01-24 Dot Hill Systems Corp. Recovering data from arrays of storage devices after certain failures
US20020087366A1 (en) 2000-12-30 2002-07-04 Collier Timothy R. Tentative-hold-based protocol for distributed transaction processing
US6594655B2 (en) 2001-01-04 2003-07-15 Ezchip Technologies Ltd. Wildcards in radix- search tree structures
US6907520B2 (en) 2001-01-11 2005-06-14 Sun Microsystems, Inc. Threshold-based load address prediction and new thread identification in a multithreaded microprocessor
US7054927B2 (en) 2001-01-29 2006-05-30 Adaptec, Inc. File system metadata describing server directory information
US20020138559A1 (en) 2001-01-29 2002-09-26 Ulrich Thomas R. Dynamically distributed file system
US20020124137A1 (en) 2001-01-29 2002-09-05 Ulrich Thomas R. Enhancing disk array performance via variable parity based load balancing
US20020169827A1 (en) 2001-01-29 2002-11-14 Ulrich Thomas R. Hot adding file system processors
US6754773B2 (en) 2001-01-29 2004-06-22 Snap Appliance, Inc. Data engine with metadata processor
US20020191311A1 (en) 2001-01-29 2002-12-19 Ulrich Thomas R. Dynamically scalable disk array
US6990547B2 (en) 2001-01-29 2006-01-24 Adaptec, Inc. Replacing file system processors by hot swapping
US6990667B2 (en) 2001-01-29 2006-01-24 Adaptec, Inc. Server-independent object positioning for load balancing drives and servers
US6862692B2 (en) 2001-01-29 2005-03-01 Adaptec, Inc. Dynamic redistribution of parity groups
FR2820843B1 (en) 2001-02-09 2003-05-30 Thomson Csf PROTECTION SYSTEM AGAINST THE COPY OF INFORMATION FOR THE CREATION OF A PROTECTED OPTICAL DISK AND CORRESPONDING PROTECTION METHOD
WO2002071183A2 (en) 2001-02-28 2002-09-12 Wily Technology, Inc. Detecting a stalled routine
US6895534B2 (en) 2001-04-23 2005-05-17 Hewlett-Packard Development Company, L.P. Systems and methods for providing automated diagnostic services for a cluster computer system
US20020158900A1 (en) 2001-04-30 2002-10-31 Hsieh Vivian G. Graphical user interfaces for network management automated provisioning environment
US20040189682A1 (en) 2001-12-26 2004-09-30 Lidror Troyansky Method and a system for embedding textual forensic information
US7295755B2 (en) 2001-06-22 2007-11-13 Thomson Licensing Method and apparatus for simplifying the access of metadata
US7181746B2 (en) * 2001-06-29 2007-02-20 Intel Corporation Initialization, reconfiguration, and shut down of a module function
US6877107B2 (en) 2001-07-05 2005-04-05 Softwired Ag Method for ensuring operation during node failures and network partitions in a clustered message passing server
US7546354B1 (en) 2001-07-06 2009-06-09 Emc Corporation Dynamic network based storage with high availability
US7146524B2 (en) 2001-08-03 2006-12-05 Isilon Systems, Inc. Systems and methods for providing a distributed file system incorporating a virtual hot spare
US6929013B2 (en) 2001-08-14 2005-08-16 R. J. Reynolds Tobacco Company Wrapping materials for smoking articles
GB0120744D0 (en) * 2001-08-28 2001-10-17 Hewlett Packard Co Quantum computation with coherent optical pulses
JP2003094926A (en) * 2001-09-20 2003-04-03 Denso Corp Air conditioner for vehicle
US20030061491A1 (en) 2001-09-21 2003-03-27 Sun Microsystems, Inc. System and method for the allocation of network storage
US6920494B2 (en) 2001-10-05 2005-07-19 International Business Machines Corporation Storage area network methods and apparatus with virtual SAN recognition
US6954877B2 (en) 2001-11-29 2005-10-11 Agami Systems, Inc. Fault tolerance using logical checkpointing in computing systems
US7055058B2 (en) 2001-12-26 2006-05-30 Boon Storage Technologies, Inc. Self-healing log-structured RAID
US7433948B2 (en) 2002-01-23 2008-10-07 Cisco Technology, Inc. Methods and apparatus for implementing virtualization of storage within a storage area network
US6859696B2 (en) 2001-12-27 2005-02-22 Caterpillar Inc System and method for monitoring machine status
US7073115B2 (en) 2001-12-28 2006-07-04 Network Appliance, Inc. Correcting multiple block data loss in a storage array using a combination of a single diagonal parity group and multiple row parity groups
US6898587B2 (en) 2002-01-18 2005-05-24 Bea Systems, Inc. System and method for performing commutative operations in data access systems
US7742504B2 (en) 2002-01-24 2010-06-22 University Of Southern California Continuous media system
US20030149750A1 (en) 2002-02-07 2003-08-07 Franzenburg Alan M. Distributed storage array
US6829617B2 (en) 2002-02-15 2004-12-07 International Business Machines Corporation Providing a snapshot of a subset of a file system
US7216135B2 (en) 2002-02-15 2007-05-08 International Business Machines Corporation File system for providing access to a snapshot dataset where disk address in the inode is equal to a ditto address for indicating that the disk address is invalid disk address
US6940966B2 (en) 2002-02-21 2005-09-06 Vtech Telecommunications, Ltd. Method and apparatus for detection of a telephone CPE alerting signal
ATE526784T1 (en) 2002-02-27 2011-10-15 Opentv Inc METHOD AND DEVICE FOR PROVIDING A HIERARCHICAL SECURITY PROFILE OBJECT
US7240235B2 (en) 2002-03-13 2007-07-03 Intel Corporation Journaling technique for write transactions to mass storage
US7143307B1 (en) * 2002-03-15 2006-11-28 Network Appliance, Inc. Remote disaster recovery and data migration using virtual appliance migration
US7225204B2 (en) 2002-03-19 2007-05-29 Network Appliance, Inc. System and method for asynchronous mirroring of snapshots at a destination using a purgatory directory and inode mapping
US7010553B2 (en) 2002-03-19 2006-03-07 Network Appliance, Inc. System and method for redirecting access to a remote mirrored snapshot
US7043485B2 (en) 2002-03-19 2006-05-09 Network Appliance, Inc. System and method for storage of snapshot metadata in a remote file
CN1286012C (en) 2002-03-20 2006-11-22 联想(北京)有限公司 Method for recovering and backing up information in hard disc of computer
US6934878B2 (en) 2002-03-22 2005-08-23 Intel Corporation Failure detection and failure handling in cluster controller networks
US7631066B1 (en) 2002-03-25 2009-12-08 Symantec Operating Corporation System and method for preventing data corruption in computer system clusters
US6904430B1 (en) * 2002-04-26 2005-06-07 Microsoft Corporation Method and system for efficiently identifying differences between large files
US6954435B2 (en) 2002-04-29 2005-10-11 Harris Corporation Determining quality of service (QoS) routing for mobile ad hoc networks
US7249118B2 (en) 2002-05-17 2007-07-24 Aleri, Inc. Database system and methods
US8447963B2 (en) 2002-06-12 2013-05-21 Bladelogic Inc. Method and system for simplifying distributed server management
US7043567B2 (en) 2002-07-22 2006-05-09 Seagate Technology Llc Method and apparatus for determining the order of execution of queued commands in a data storage system
US7047243B2 (en) * 2002-08-05 2006-05-16 Microsoft Corporation Coordinating transactional web services
US7370064B2 (en) 2002-08-06 2008-05-06 Yousefi Zadeh Homayoun Database remote replication for back-end tier of multi-tier computer systems
US7877425B2 (en) 2002-09-05 2011-01-25 Hiroyuki Yasoshima Method for managing file using network structure, operation object display limiting program, and recording medium
US7698338B2 (en) 2002-09-18 2010-04-13 Netezza Corporation Field oriented pipeline architecture for a programmable data streaming processor
US7103597B2 (en) 2002-10-03 2006-09-05 Mcgoveran David O Adaptive transaction manager for complex transactions and business process
US7111305B2 (en) 2002-10-31 2006-09-19 Sun Microsystems, Inc. Facilitating event notification through use of an inverse mapping structure for subset determination
AU2003291014A1 (en) 2002-11-14 2004-06-15 Isilon Systems, Inc. Systems and methods for restriping files in a distributed file system
US7412433B2 (en) 2002-11-19 2008-08-12 International Business Machines Corporation Hierarchical storage management using dynamic tables of contents and sets of tables of contents
GB2411030B (en) 2002-11-20 2006-03-22 Filesx Ltd Fast backup storage and fast recovery of data (FBSRD)
US7552445B2 (en) * 2002-12-13 2009-06-23 Savvis Communications Corporation Systems and methods for monitoring events from multiple brokers
US8250202B2 (en) * 2003-01-04 2012-08-21 International Business Machines Corporation Distributed notification and action mechanism for mirroring-related events
US7512701B2 (en) * 2003-01-16 2009-03-31 Hewlett-Packard Development Company, L.P. System and method for efficiently replicating a file among a plurality of recipients in a reliable manner
JP4268969B2 (en) 2003-01-20 2009-05-27 エスケーテレコム株式会社 Media message upload control method via wireless communication network
JP4077329B2 (en) 2003-01-31 2008-04-16 株式会社東芝 Transaction processing system, parallel control method, and program
WO2004072816A2 (en) 2003-02-07 2004-08-26 Lammina Systems Corporation Method and apparatus for online transaction processing
US7509378B2 (en) 2003-03-11 2009-03-24 Bea Systems, Inc. System and method for message ordering in a message oriented network
US7337290B2 (en) 2003-04-03 2008-02-26 Oracle International Corporation Deadlock resolution through lock requeing
US7228299B1 (en) 2003-05-02 2007-06-05 Veritas Operating Corporation System and method for performing file lookups based on tags
JP3973597B2 (en) 2003-05-14 2007-09-12 株式会社ソニー・コンピュータエンタテインメント Prefetch instruction control method, prefetch instruction control device, cache memory control device, object code generation method and device
US7673307B2 (en) 2003-05-29 2010-03-02 International Business Machines Corporation Managing transactions in a messaging system
US7152182B2 (en) 2003-06-06 2006-12-19 Hewlett-Packard Development Company, L.P. Data redundancy system and method
US20050010592A1 (en) * 2003-07-08 2005-01-13 John Guthrie Method and system for taking a data snapshot
US7831693B2 (en) * 2003-08-18 2010-11-09 Oracle America, Inc. Structured methodology and design patterns for web services
US7257257B2 (en) 2003-08-19 2007-08-14 Intel Corporation Method and apparatus for differential, bandwidth-efficient and storage-efficient backups
US7409587B2 (en) * 2004-08-24 2008-08-05 Symantec Operating Corporation Recovering from storage transaction failures using checkpoints
US7269588B1 (en) 2003-09-24 2007-09-11 Oracle International Corporation Neighborhood locking technique for increasing concurrency among transactions
US7194487B1 (en) * 2003-10-16 2007-03-20 Veritas Operating Corporation System and method for recording the order of a change caused by restoring a primary volume during ongoing replication of the primary volume
DE10350715A1 (en) * 2003-10-30 2005-06-02 Bayerische Motoren Werke Ag Method and device for setting user-dependent parameter values
CA2452251C (en) 2003-12-04 2010-02-09 Timothy R. Jewell Data backup system and method
US20050125456A1 (en) 2003-12-09 2005-06-09 Junichi Hara File migration method based on access history
US8244903B2 (en) 2003-12-22 2012-08-14 Emc Corporation Data streaming and backup systems having multiple concurrent read threads for improved small file performance
US7181556B2 (en) 2003-12-23 2007-02-20 Arm Limited Transaction request servicing mechanism
US7287076B2 (en) * 2003-12-29 2007-10-23 Microsoft Corporation Performing threshold based connection status responses
JP2005196467A (en) 2004-01-07 2005-07-21 Hitachi Ltd Storage system, control method for storage system, and storage controller
US7265692B2 (en) * 2004-01-29 2007-09-04 Hewlett-Packard Development Company, L.P. Data compression system based on tree models
US7383276B2 (en) * 2004-01-30 2008-06-03 Microsoft Corporation Concurrency control for B-trees with node deletion
US7296139B1 (en) 2004-01-30 2007-11-13 Nvidia Corporation In-memory table structure for virtual address translation system with translation units of variable range size
US7440966B2 (en) 2004-02-12 2008-10-21 International Business Machines Corporation Method and apparatus for file system snapshot persistence
CA2564967C (en) * 2004-04-30 2014-09-30 Commvault Systems, Inc. Hierarchical systems and methods for providing a unified view of storage information
US7707195B2 (en) 2004-06-29 2010-04-27 Microsoft Corporation Allocation locks and their use
US7472129B2 (en) 2004-06-29 2008-12-30 Microsoft Corporation Lossless recovery for computer systems with map assisted state transfer
US20060047713A1 (en) * 2004-08-03 2006-03-02 Wisdomforce Technologies, Inc. System and method for database replication by interception of in memory transactional change records
US7716262B2 (en) 2004-09-30 2010-05-11 Emc Corporation Index processing
JP2006107151A (en) 2004-10-06 2006-04-20 Hitachi Ltd Storage system and communication path control method for storage system
WO2006038111A1 (en) * 2004-10-07 2006-04-13 Pfizer Products Inc. Benzoimidazole derivatives useful as antiproliferative agents
US8238350B2 (en) 2004-10-29 2012-08-07 Emc Corporation Message batching with checkpoints systems and methods
US7921076B2 (en) * 2004-12-15 2011-04-05 Oracle International Corporation Performing an action in response to a file system event
US7770150B2 (en) * 2004-12-15 2010-08-03 International Business Machines Corporation Apparatus, system, and method for sharing and accessing data by scopes
US7877466B2 (en) 2005-01-11 2011-01-25 Cisco Technology, Inc. Network topology based storage allocation for virtualization
US7603502B2 (en) 2005-04-12 2009-10-13 Microsoft Corporation Resource accessing with locking
US7562188B2 (en) 2005-06-17 2009-07-14 Intel Corporation RAID power safe apparatus, systems, and methods
US7540027B2 (en) 2005-06-23 2009-05-26 International Business Machines Corporation Method/system to speed up antivirus scans using a journal file system
US7577258B2 (en) 2005-06-30 2009-08-18 Intel Corporation Apparatus and method for group session key and establishment using a certified migration key
US7716171B2 (en) 2005-08-18 2010-05-11 Emc Corporation Snapshot indexing
US7707193B2 (en) 2005-09-22 2010-04-27 Netapp, Inc. System and method for verifying and restoring the consistency of inode to pathname mappings in a filesystem
US7356643B2 (en) 2005-10-26 2008-04-08 International Business Machines Corporation System, method and program for managing storage
US7665123B1 (en) * 2005-12-01 2010-02-16 Symantec Corporation Method and apparatus for detecting hidden rootkits
US7546412B2 (en) 2005-12-02 2009-06-09 International Business Machines Corporation Apparatus, system, and method for global metadata copy repair
US7734603B1 (en) 2006-01-26 2010-06-08 Netapp, Inc. Content addressable storage array element
JP4800046B2 (en) 2006-01-31 2011-10-26 株式会社日立製作所 Storage system
US7848261B2 (en) 2006-02-17 2010-12-07 Isilon Systems, Inc. Systems and methods for providing a quiescing protocol
US7756898B2 (en) 2006-03-31 2010-07-13 Isilon Systems, Inc. Systems and methods for notifying listeners of events
US20070244877A1 (en) 2006-04-12 2007-10-18 Battelle Memorial Institute Tracking methods for computer-readable files
US8548947B2 (en) 2006-04-28 2013-10-01 Hewlett-Packard Development Company, L.P. Systems and methods for file maintenance
US7689597B1 (en) * 2006-05-02 2010-03-30 Emc Corporation Mirrored storage architecture using continuous data protection techniques
JP4890160B2 (en) * 2006-09-06 2012-03-07 株式会社日立製作所 Storage system and backup / recovery method
UA98128C2 (en) 2006-11-22 2012-04-25 Басф Се Agrochemical formulation, a method for controlling harmful pests and/or phytopathogenous fungi, a method for controlling undesirable vegetation and seeds
EP2115563A2 (en) 2006-12-06 2009-11-11 Fusion Multisystems, Inc. Apparatus, system, and method for managing data in a storage device with an empty data token directive
US20080155191A1 (en) 2006-12-21 2008-06-26 Anderson Robert J Systems and methods for providing heterogeneous storage systems
US8286029B2 (en) 2006-12-21 2012-10-09 Emc Corporation Systems and methods for managing unavailable storage devices
US7593938B2 (en) 2006-12-22 2009-09-22 Isilon Systems, Inc. Systems and methods of directory entry encodings
EP2114165A2 (en) * 2007-02-01 2009-11-11 Basf Se Pesticidal mixtures
US7900015B2 (en) 2007-04-13 2011-03-01 Isilon Systems, Inc. Systems and methods of quota accounting
US8966080B2 (en) 2007-04-13 2015-02-24 Emc Corporation Systems and methods of managing resource utilization on a threaded computer system
US7779048B2 (en) 2007-04-13 2010-08-17 Isilon Systems, Inc. Systems and methods of providing possible value ranges
US7949692B2 (en) * 2007-08-21 2011-05-24 Emc Corporation Systems and methods for portals into snapshot data
US7882068B2 (en) * 2007-08-21 2011-02-01 Isilon Systems, Inc. Systems and methods for adaptive copy on write
US7966289B2 (en) * 2007-08-21 2011-06-21 Emc Corporation Systems and methods for reading objects in a file system
US7783666B1 (en) 2007-09-26 2010-08-24 Netapp, Inc. Controlling access to storage resources by using access pattern based quotas
US7783601B2 (en) 2007-11-08 2010-08-24 Oracle International Corporation Replicating and sharing data between heterogeneous data systems
US7840536B1 (en) 2007-12-26 2010-11-23 Emc (Benelux) B.V., S.A.R.L. Methods and apparatus for dynamic journal expansion
US7870345B2 (en) * 2008-03-27 2011-01-11 Isilon Systems, Inc. Systems and methods for managing stalled storage devices
US7953709B2 (en) 2008-03-27 2011-05-31 Emc Corporation Systems and methods for a read only mode for a portion of a storage system
US7984324B2 (en) 2008-03-27 2011-07-19 Emc Corporation Systems and methods for managing stalled storage devices
US7949636B2 (en) 2008-03-27 2011-05-24 Emc Corporation Systems and methods for a read only mode for a portion of a storage system
US8527726B2 (en) 2008-11-13 2013-09-03 International Business Machines Corporation Tiled storage array with systolic move-to-front reorganization

Patent Citations (99)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5181162A (en) * 1989-12-06 1993-01-19 Eastman Kodak Company Document management and production system
US6397311B1 (en) * 1990-01-19 2002-05-28 Texas Instruments Incorporated System and method for defragmenting a file system
US5596709A (en) * 1990-06-21 1997-01-21 International Business Machines Corporation Method and apparatus for recovering parity protected data
US5212784A (en) * 1990-10-22 1993-05-18 Delphi Data, A Division Of Sparks Industries, Inc. Automated concurrent data backup system
US5734826A (en) * 1991-03-29 1998-03-31 International Business Machines Corporation Variable cyclic redundancy coding method and apparatus for use in a multistage network
US5403639A (en) * 1992-09-02 1995-04-04 Storage Technology Corporation File server having snapshot application data groups
US5870563A (en) * 1992-09-19 1999-02-09 International Business Machines Corporation Method and apparatus for optimizing message transmission
US6038570A (en) * 1993-06-03 2000-03-14 Network Appliance, Inc. Method for allocating files in a file system integrated with a RAID disk sub-system
US5612865A (en) * 1995-06-01 1997-03-18 Ncr Corporation Dynamic hashing method for optimal distribution of locks within a clustered system
US6081833A (en) * 1995-07-06 2000-06-27 Kabushiki Kaisha Toshiba Memory space management method, data transfer method, and computer device for distributed computer system
US6021414A (en) * 1995-09-11 2000-02-01 Sun Microsystems, Inc. Single transaction technique for a journaling file system of a computer operating system
US5774643A (en) * 1995-10-13 1998-06-30 Digital Equipment Corporation Enhanced raid write hole protection and recovery
US5862312A (en) * 1995-10-24 1999-01-19 Seachange Technology, Inc. Loosely coupled mass storage computer cluster
US6571349B1 (en) * 1995-10-24 2003-05-27 Seachange Technology, Inc. Loosely coupled mass storage computer cluster
US6567926B2 (en) * 1995-10-24 2003-05-20 Seachange International, Inc. Loosely coupled mass storage computer cluster
US6557114B2 (en) * 1995-10-24 2003-04-29 Seachange Technology, Inc. Loosely coupled mass storage computer cluster
US5761659A (en) * 1996-02-29 1998-06-02 Sun Microsystems, Inc. Method, product, and structure for flexible range locking of read and write requests using shared and exclusive locks, flags, sub-locks, and counters
US5884303A (en) * 1996-03-15 1999-03-16 International Computers Limited Parallel searching technique
US5884098A (en) * 1996-04-18 1999-03-16 Emc Corporation RAID controller system utilizing front end and back end caching systems including communication path connecting two caching systems and synchronizing allocation of blocks in caching systems
US6044367A (en) * 1996-08-02 2000-03-28 Hewlett-Packard Company Distributed I/O store
US5878410A (en) * 1996-09-13 1999-03-02 Microsoft Corporation File system sort order indexes
US5884046A (en) * 1996-10-23 1999-03-16 Pluris, Inc. Apparatus and method for sharing data and routing messages between a plurality of workstations in a local area network
US6070172A (en) * 1997-03-06 2000-05-30 Oracle Corporation On-line free space defragmentation of a contiguous-file file system
US5890147A (en) * 1997-03-07 1999-03-30 Microsoft Corporation Scope testing of documents in a search engine using document to folder mapping
US20050005266A1 (en) * 1997-05-01 2005-01-06 Datig William E. Method of and apparatus for realizing synthetic knowledge processes in devices for useful applications
US5878414A (en) * 1997-06-06 1999-03-02 International Business Machines Corp. Constructing a transaction serialization order based on parallel or distributed database log files
US6209059B1 (en) * 1997-09-25 2001-03-27 Emc Corporation Method and apparatus for the on-line reconfiguration of the logical volumes of a data storage system
US6219693B1 (en) * 1997-11-04 2001-04-17 Adaptec, Inc. File array storage architecture having file system distributed across a data processing platform
US6055543A (en) * 1997-11-21 2000-04-25 Verano File wrapper containing cataloging information for content searching across multiple platforms
US6081883A (en) * 1997-12-05 2000-06-27 Auspex Systems, Incorporated Processing system with dynamically allocatable buffer memory
US6029168A (en) * 1998-01-23 2000-02-22 Tricord Systems, Inc. Decentralized file mapping in a striped network file system in a distributed computing environment
US6173374B1 (en) * 1998-02-11 2001-01-09 Lsi Logic Corporation System and method for peer-to-peer accelerated I/O shipping between host bus adapters in clustered computer network
US20020035668A1 (en) * 1998-05-27 2002-03-21 Yasuhiko Nakano Information storage system for redistributing information to information storage devices when a structure of the information storage devices is changed
US6385626B1 (en) * 1998-11-19 2002-05-07 Emc Corporation Method and apparatus for identifying changes to a logical object based on changes to the logical object at physical level
US6408313B1 (en) * 1998-12-16 2002-06-18 Microsoft Corporation Dynamic memory allocation based on free memory size
US6353823B1 (en) * 1999-03-08 2002-03-05 Intel Corporation Method and system for using associative metadata
US6523130B1 (en) * 1999-03-11 2003-02-18 Microsoft Corporation Storage system having error detection and recovery
US6405219B2 (en) * 1999-06-22 2002-06-11 F5 Networks, Inc. Method and system for automatically updating the version of a set of files stored on content servers
US6549513B1 (en) * 1999-10-12 2003-04-15 Alcatel Method and apparatus for fast distributed restoration of a communication network
US6571244B1 (en) * 1999-10-28 2003-05-27 Microsoft Corporation Run formation in large scale sorting using batched replacement selection
US6567894B1 (en) * 1999-12-08 2003-05-20 International Business Machines Corporation Method and apparatus to prefetch sequential pages in a multi-stream environment
US6848029B2 (en) * 2000-01-03 2005-01-25 Dirk Coldewey Method and apparatus for prefetching recursive data structures
US20030109253A1 (en) * 2000-01-18 2003-06-12 Fenton Shaun Richard Digital communications system
US6526478B1 (en) * 2000-02-02 2003-02-25 Lsi Logic Corporation Raid LUN creation using proportional disk mapping
US20030014391A1 (en) * 2000-03-07 2003-01-16 Evans Paul A Data distribution
US20020072974A1 (en) * 2000-04-03 2002-06-13 Pugliese Anthony V. System and method for displaying and selling goods and services in a retail environment employing electronic shopper aids
US20020010696A1 (en) * 2000-06-01 2002-01-24 Tadanori Izumi Automatic aggregation method, automatic aggregation apparatus, and recording medium having automatic aggregation program
US6384626B2 (en) * 2000-07-19 2002-05-07 Micro-Star Int'l Co., Ltd. Programmable apparatus and method for programming a programmable device
US7007097B1 (en) * 2000-07-20 2006-02-28 Silicon Graphics, Inc. Method and system for covering multiple resourcces with a single credit in a computer system
US20020075870A1 (en) * 2000-08-25 2002-06-20 De Azevedo Marcelo Method and apparatus for discovering computer systems in a distributed multi-system cluster
US6732125B1 (en) * 2000-09-08 2004-05-04 Storage Technology Corporation Self archiving log structured volume with intrinsic data protection
US20020083078A1 (en) * 2000-11-02 2002-06-27 Guy Pardon Decentralized, distributed internet data management
US20020055940A1 (en) * 2000-11-07 2002-05-09 Charles Elkan Method and system for selecting documents by measuring document quality
US20020078180A1 (en) * 2000-12-18 2002-06-20 Kizna Corporation Information collection server, information collection method, and recording medium
US20040078812A1 (en) * 2001-01-04 2004-04-22 Calvert Kerry Wayne Method and apparatus for acquiring media services available from content aggregators
US20030005159A1 (en) * 2001-06-07 2003-01-02 International Business Machines Corporation Method and system for generating and serving multilingual web pages
US20080021907A1 (en) * 2001-08-03 2008-01-24 Patel Sujal M Systems and methods for providing a distributed file system utilizing metadata to track information about data stored throughout the system
US20030033308A1 (en) * 2001-08-03 2003-02-13 Patel Sujal M. System and methods for providing a distributed file system utilizing metadata to track information about data stored throughout the system
US7184421B1 (en) * 2001-12-21 2007-02-27 Itt Manufacturing Enterprises, Inc. Method and apparatus for on demand multicast and unicast using controlled flood multicast communications
US6990604B2 (en) * 2001-12-28 2006-01-24 Storage Technology Corporation Virtual storage status coalescing with a plurality of physical storage devices
US7177295B1 (en) * 2002-03-08 2007-02-13 Scientific Research Corporation Wireless routing protocol for ad-hoc networks
US20040003053A1 (en) * 2002-03-25 2004-01-01 Williams Michael John System
US7373426B2 (en) * 2002-03-29 2008-05-13 Kabushiki Kaisha Toshiba Network system using name server with pseudo host name and pseudo IP address generation function
US20040024963A1 (en) * 2002-08-05 2004-02-05 Nisha Talagala Method and system for striping data to accommodate integrity metadata
US20060074922A1 (en) * 2002-11-25 2006-04-06 Kozo Nishimura File management device, file management method, file management program and recording medium
US7007044B1 (en) * 2002-12-26 2006-02-28 Storage Technology Corporation Storage backup system for backing up data written to a primary storage device to multiple virtual mirrors using a reconciliation process that reflects the changing state of the primary storage device over time
US20050066095A1 (en) * 2003-09-23 2005-03-24 Sachin Mullick Multi-threaded write interface and methods for increasing the single file read and write throughput of a file server
US20050114402A1 (en) * 2003-11-20 2005-05-26 Zetta Systems, Inc. Block level data snapshot system and method
US20050114609A1 (en) * 2003-11-26 2005-05-26 Shorb Charles S. Computer-implemented system and method for lock handling
US7017003B2 (en) * 2004-02-16 2006-03-21 Hitachi, Ltd. Disk array apparatus and disk array apparatus control method
US7318134B1 (en) * 2004-03-16 2008-01-08 Emc Corporation Continuous data backup using distributed journaling
US20060004760A1 (en) * 2004-06-21 2006-01-05 Microsoft Corporation Method, system, and apparatus for managing access to a data object
US20060041894A1 (en) * 2004-08-03 2006-02-23 Tu-An Cheng Apparatus, system, and method for isolating a storage application from a network interface driver
US20060059467A1 (en) * 2004-09-16 2006-03-16 International Business Machines Corporation Fast source file to line number table association
US20060083177A1 (en) * 2004-10-18 2006-04-20 Nokia Corporation Listener mechanism in a distributed network system
US20060095438A1 (en) * 2004-10-29 2006-05-04 Fachan Neal T Non-blocking commit protocol systems and methods
US20060101062A1 (en) * 2004-10-29 2006-05-11 Godman Peter J Distributed system with asynchronous execution systems and methods
US7533298B2 (en) * 2005-09-07 2009-05-12 Lsi Corporation Write journaling using battery backed cache
US20070091790A1 (en) * 2005-10-21 2007-04-26 Passey Aaron J Systems and methods for providing variable protection
US20070094269A1 (en) * 2005-10-21 2007-04-26 Mikesell Paul A Systems and methods for distributed system scanning
US20070094310A1 (en) * 2005-10-21 2007-04-26 Passey Aaron J Systems and methods for accessing and updating distributed data
US20070094431A1 (en) * 2005-10-21 2007-04-26 Fachan Neal T Systems and methods for managing concurrent access requests to a shared resource
US20070094277A1 (en) * 2005-10-21 2007-04-26 Fachan Neal T Systems and methods for maintaining distributed data
US20070094452A1 (en) * 2005-10-21 2007-04-26 Fachan Neal T Systems and methods for using excitement values to predict future access to resources
US20080010507A1 (en) * 2006-05-30 2008-01-10 Oracle International Corporation Selecting optimal repair strategy for mirrored files
US20080005145A1 (en) * 2006-06-30 2008-01-03 Data Equation Limited Data processing
US20080031238A1 (en) * 2006-08-02 2008-02-07 Shai Harmelin Systems and methods for configuring multiple network interfaces
US20080044016A1 (en) * 2006-08-04 2008-02-21 Henzinger Monika H Detecting duplicate and near-duplicate files
US20080034004A1 (en) * 2006-08-04 2008-02-07 Pavel Cisler System for electronic backup
US20080046432A1 (en) * 2006-08-18 2008-02-21 Anderson Robert J Systems and methods for a snapshot of data
US20080046443A1 (en) * 2006-08-18 2008-02-21 Fachan Neal T Systems and methods for providing nonlinear journaling
US20080046475A1 (en) * 2006-08-18 2008-02-21 Anderson Robert J Systems and methods for a snapshot of data
US20080046667A1 (en) * 2006-08-18 2008-02-21 Fachan Neal T Systems and methods for allowing incremental journaling
US20080046444A1 (en) * 2006-08-18 2008-02-21 Fachan Neal T Systems and methods for providing nonlinear journaling
US20080059541A1 (en) * 2006-08-18 2008-03-06 Fachan Neal T Systems and methods for a snapshot of data
US20080046476A1 (en) * 2006-08-18 2008-02-21 Anderson Robert J Systems and methods for a snapshot of data
US20080126365A1 (en) * 2006-08-18 2008-05-29 Fachan Neal T Systems and methods for providing nonlinear journaling
US20080046445A1 (en) * 2006-08-18 2008-02-21 Passey Aaron J Systems and methods of reverse lookup
US7509448B2 (en) * 2007-01-05 2009-03-24 Isilon Systems, Inc. Systems and methods for managing semantic locks

Cited By (86)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8112395B2 (en) 2001-08-03 2012-02-07 Emc Corporation Systems and methods for providing a distributed file system utilizing metadata to track information about data stored throughout the system
US7685126B2 (en) 2001-08-03 2010-03-23 Isilon Systems, Inc. System and methods for providing a distributed file system utilizing metadata to track information about data stored throughout the system
US7743033B2 (en) 2001-08-03 2010-06-22 Isilon Systems, Inc. Systems and methods for providing a distributed file system utilizing metadata to track information about data stored throughout the system
US7962779B2 (en) 2001-08-03 2011-06-14 Emc Corporation Systems and methods for a distributed file system with data recovery
US7937421B2 (en) 2002-11-14 2011-05-03 Emc Corporation Systems and methods for restriping files in a distributed file system
US8238350B2 (en) 2004-10-29 2012-08-07 Emc Corporation Message batching with checkpoints systems and methods
US8140623B2 (en) 2004-10-29 2012-03-20 Emc Corporation Non-blocking commit protocol systems and methods
US8055711B2 (en) 2004-10-29 2011-11-08 Emc Corporation Non-blocking commit protocol systems and methods
US8051425B2 (en) 2004-10-29 2011-11-01 Emc Corporation Distributed system with asynchronous execution systems and methods
US8176013B2 (en) 2005-10-21 2012-05-08 Emc Corporation Systems and methods for accessing and updating distributed data
US7788303B2 (en) 2005-10-21 2010-08-31 Isilon Systems, Inc. Systems and methods for distributed system scanning
US7797283B2 (en) 2005-10-21 2010-09-14 Isilon Systems, Inc. Systems and methods for maintaining distributed data
US8214334B2 (en) 2005-10-21 2012-07-03 Emc Corporation Systems and methods for distributed system scanning
US8214400B2 (en) 2005-10-21 2012-07-03 Emc Corporation Systems and methods for maintaining distributed data
US8054765B2 (en) 2005-10-21 2011-11-08 Emc Corporation Systems and methods for providing variable protection
US7917474B2 (en) 2005-10-21 2011-03-29 Isilon Systems, Inc. Systems and methods for accessing and updating distributed data
US8625464B2 (en) 2006-02-17 2014-01-07 Emc Corporation Systems and methods for providing a quiescing protocol
US7848261B2 (en) 2006-02-17 2010-12-07 Isilon Systems, Inc. Systems and methods for providing a quiescing protocol
US7756898B2 (en) 2006-03-31 2010-07-13 Isilon Systems, Inc. Systems and methods for notifying listeners of events
US8005865B2 (en) 2006-03-31 2011-08-23 Emc Corporation Systems and methods for notifying listeners of events
US8539056B2 (en) 2006-08-02 2013-09-17 Emc Corporation Systems and methods for configuring multiple network interfaces
US8015156B2 (en) 2006-08-18 2011-09-06 Emc Corporation Systems and methods for a snapshot of data
US7676691B2 (en) 2006-08-18 2010-03-09 Isilon Systems, Inc. Systems and methods for providing nonlinear journaling
US7680836B2 (en) 2006-08-18 2010-03-16 Isilon Systems, Inc. Systems and methods for a snapshot of data
US7953704B2 (en) 2006-08-18 2011-05-31 Emc Corporation Systems and methods for a snapshot of data
US7822932B2 (en) 2006-08-18 2010-10-26 Isilon Systems, Inc. Systems and methods for providing nonlinear journaling
US8380689B2 (en) 2006-08-18 2013-02-19 Emc Corporation Systems and methods for providing nonlinear journaling
US8027984B2 (en) 2006-08-18 2011-09-27 Emc Corporation Systems and methods of reverse lookup
US7752402B2 (en) 2006-08-18 2010-07-06 Isilon Systems, Inc. Systems and methods for allowing incremental journaling
US7882071B2 (en) 2006-08-18 2011-02-01 Isilon Systems, Inc. Systems and methods for a snapshot of data
US7680842B2 (en) 2006-08-18 2010-03-16 Isilon Systems, Inc. Systems and methods for a snapshot of data
US8010493B2 (en) 2006-08-18 2011-08-30 Emc Corporation Systems and methods for a snapshot of data
US8181065B2 (en) 2006-08-18 2012-05-15 Emc Corporation Systems and methods for providing nonlinear journaling
US20080046445A1 (en) * 2006-08-18 2008-02-21 Passey Aaron J Systems and methods of reverse lookup
US8356150B2 (en) 2006-08-18 2013-01-15 Emc Corporation Systems and methods for providing nonlinear journaling
US7899800B2 (en) 2006-08-18 2011-03-01 Isilon Systems, Inc. Systems and methods for providing nonlinear journaling
US8972345B1 (en) * 2006-09-27 2015-03-03 Hewlett-Packard Development Company, L.P. Modifying data structures in distributed file systems
US8286029B2 (en) 2006-12-21 2012-10-09 Emc Corporation Systems and methods for managing unavailable storage devices
US8060521B2 (en) 2006-12-22 2011-11-15 Emc Corporation Systems and methods of directory entry encodings
US7844617B2 (en) 2006-12-22 2010-11-30 Isilon Systems, Inc. Systems and methods of directory entry encodings
US8082379B2 (en) 2007-01-05 2011-12-20 Emc Corporation Systems and methods for managing semantic locks
US8966080B2 (en) 2007-04-13 2015-02-24 Emc Corporation Systems and methods of managing resource utilization on a threaded computer system
US7900015B2 (en) 2007-04-13 2011-03-01 Isilon Systems, Inc. Systems and methods of quota accounting
US8015216B2 (en) 2007-04-13 2011-09-06 Emc Corporation Systems and methods of providing possible value ranges
US7779048B2 (en) 2007-04-13 2010-08-17 Isilon Systems, Inc. Systems and methods of providing possible value ranges
US7966289B2 (en) 2007-08-21 2011-06-21 Emc Corporation Systems and methods for reading objects in a file system
US8200632B2 (en) 2007-08-21 2012-06-12 Emc Corporation Systems and methods for adaptive copy on write
US7882068B2 (en) 2007-08-21 2011-02-01 Isilon Systems, Inc. Systems and methods for adaptive copy on write
US7949692B2 (en) 2007-08-21 2011-05-24 Emc Corporation Systems and methods for portals into snapshot data
US7984324B2 (en) 2008-03-27 2011-07-19 Emc Corporation Systems and methods for managing stalled storage devices
US7870345B2 (en) 2008-03-27 2011-01-11 Isilon Systems, Inc. Systems and methods for managing stalled storage devices
US7971021B2 (en) 2008-03-27 2011-06-28 Emc Corporation Systems and methods for managing stalled storage devices
US7953709B2 (en) 2008-03-27 2011-05-31 Emc Corporation Systems and methods for a read only mode for a portion of a storage system
US7949636B2 (en) 2008-03-27 2011-05-24 Emc Corporation Systems and methods for a read only mode for a portion of a storage system
US20130311523A1 (en) * 2009-09-02 2013-11-21 Microsoft Corporation Extending file system namespace types
US10067941B2 (en) * 2009-09-02 2018-09-04 Microsoft Technology Licensing, Llc Extending file system namespace types
US10769097B2 (en) 2009-09-11 2020-09-08 Micron Technology, Inc. Autonomous memory architecture
US9612750B2 (en) 2009-09-11 2017-04-04 Micron Technology, Inc. Autonomous memory subsystem architecture
US9015440B2 (en) * 2009-09-11 2015-04-21 Micron Technology, Inc. Autonomous memory subsystem architecture
US20110066796A1 (en) * 2009-09-11 2011-03-17 Sean Eilert Autonomous subsystem architecture
US11003675B2 (en) 2011-03-14 2021-05-11 Splunk Inc. Interactive display of search result information
US10339149B2 (en) 2011-03-14 2019-07-02 Splunk Inc. Determining and providing quantity of unique values existing for a field
US9430574B2 (en) * 2011-03-14 2016-08-30 Splunk Inc. Display for a number of unique values for an event field
US20150347526A1 (en) * 2011-03-14 2015-12-03 Splunk Inc. Display for a number of unique values for an event field
US11860881B1 (en) 2011-03-14 2024-01-02 Splunk Inc. Tracking event records across multiple search sessions
US11176146B2 (en) 2011-03-14 2021-11-16 Splunk Inc. Determining indications of unique values for fields
US10061821B2 (en) 2011-03-14 2018-08-28 Splunk Inc. Extracting unique field values from event fields
US10860592B2 (en) 2011-03-14 2020-12-08 Splunk Inc. Providing interactive search results from a distributed search system
US10162863B2 (en) 2011-03-14 2018-12-25 Splunk Inc. Interactive display of aggregated search result information
US10860591B2 (en) 2011-03-14 2020-12-08 Splunk Inc. Server-side interactive search results
US10380122B2 (en) 2011-03-14 2019-08-13 Splunk Inc. Interactive display of search result information
US10318535B2 (en) 2011-03-14 2019-06-11 Splunk Inc. Displaying drill-down event information using event identifiers
US20140282585A1 (en) * 2013-03-13 2014-09-18 Barracuda Networks, Inc. Organizing File Events by Their Hierarchical Paths for Multi-Threaded Synch and Parallel Access System, Apparatus, and Method of Operation
US9152466B2 (en) * 2013-03-13 2015-10-06 Barracuda Networks, Inc. Organizing file events by their hierarchical paths for multi-threaded synch and parallel access system, apparatus, and method of operation
US9996263B2 (en) 2013-05-10 2018-06-12 Arista Networks, Inc. System and method of a shared memory hash table with notifications
US9367251B2 (en) * 2013-05-10 2016-06-14 Arista Networks, Inc. System and method of a shared memory hash table with notifications
US20140359232A1 (en) * 2013-05-10 2014-12-04 Hugh W. Holbrook System and method of a shared memory hash table with notifications
US10003675B2 (en) 2013-12-02 2018-06-19 Micron Technology, Inc. Packet processor receiving packets containing instructions, data, and starting location and generating packets containing instructions and data
US10778815B2 (en) 2013-12-02 2020-09-15 Micron Technology, Inc. Methods and systems for parsing and executing instructions to retrieve data using autonomous memory
US11068469B2 (en) 2015-09-04 2021-07-20 Arista Networks, Inc. System and method of a dynamic shared memory hash table with notifications
US11860861B2 (en) 2015-09-04 2024-01-02 Arista Networks, Inc. Growing dynamic shared memory hash table
CN109445966A (en) * 2018-11-06 2019-03-08 网易传媒科技(北京)有限公司 Event-handling method, device, medium and calculating equipment
CN109525466A (en) * 2019-01-03 2019-03-26 杭州云英网络科技有限公司 Back end monitor method and device
US20220050819A1 (en) * 2020-08-13 2022-02-17 Red Hat, Inc. Automated pinning of file system subtrees
US11645266B2 (en) * 2020-08-13 2023-05-09 Red Hat, Inc. Automated pinning of file system subtrees
US11757755B1 (en) * 2020-09-28 2023-09-12 Cyral Inc. Techniques for in-band topology connections in a proxy

Also Published As

Publication number Publication date
US8005865B2 (en) 2011-08-23
US7756898B2 (en) 2010-07-13
US20100306786A1 (en) 2010-12-02

Similar Documents

Publication Publication Date Title
US7756898B2 (en) Systems and methods for notifying listeners of events
US8370583B2 (en) Network memory architecture for providing data based on local accessibility
JP6560308B2 (en) System and method for implementing a data storage service
US8027984B2 (en) Systems and methods of reverse lookup
US7822711B1 (en) Conflict resolution for a distributed file sharing system
Rhea et al. Probabilistic location and routing
JP3167893B2 (en) Method and apparatus for reducing network resource location traffic
JP2017216010A (en) Check point avoidance of whole system for distributed database system
US8307115B1 (en) Network memory mirroring
KR101038358B1 (en) Consistency unit replication in application-defined systems
US8489562B1 (en) Deferred data storage
US7805416B1 (en) File system query and method of use
US6061735A (en) Network restoration plan regeneration responsive to network topology changes
US8321528B2 (en) Method of processing event notifications and event subscriptions
US20210200446A1 (en) System and method for providing a committed throughput level in a data store
US20100131564A1 (en) Index data structure for a peer-to-peer network
US20030120666A1 (en) Real-time monitoring of service performance through the use of relational database calculation clusters
US8135763B1 (en) Apparatus and method for maintaining a file system index
US20080270596A1 (en) System and method for validating directory replication
US20090077075A1 (en) Management of logical statements in a distributed database environment
US20110264782A1 (en) Systems and methods for improved multisite management of converged communication systems and computer systems
CN108566449A (en) Domain name mapping data managing method, system and storage system based on block chain
JPH11513517A (en) Communication management method and system with redundancy
CN109639773A (en) A kind of the distributed data cluster control system and its method of dynamic construction
CN107465706B (en) Distributed data object storage device based on wireless communication network

Legal Events

Date Code Title Description
AS Assignment

Owner name: ISILON SYSTEMS, INC., WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:PASSEY, AARON J.;FACHAN, NEAL T.;REEL/FRAME:018054/0850

Effective date: 20060627

STCF Information on status: patent grant

Free format text: PATENTED CASE

AS Assignment

Owner name: ISILON SYSTEMS LLC, WASHINGTON

Free format text: MERGER;ASSIGNOR:ISILON SYSTEMS, INC.;REEL/FRAME:026066/0785

Effective date: 20101229

AS Assignment

Owner name: IVY HOLDING, INC., MASSACHUSETTS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:ISILON SYSTEMS LLC;REEL/FRAME:026069/0925

Effective date: 20101229

AS Assignment

Owner name: EMC CORPORATION, MASSACHUSETTS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:IVY HOLDING, INC.;REEL/FRAME:026083/0036

Effective date: 20101231

FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

FPAY Fee payment

Year of fee payment: 4

AS Assignment

Owner name: CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH, AS COLLATERAL AGENT, NORTH CAROLINA

Free format text: SECURITY AGREEMENT;ASSIGNORS:ASAP SOFTWARE EXPRESS, INC.;AVENTAIL LLC;CREDANT TECHNOLOGIES, INC.;AND OTHERS;REEL/FRAME:040134/0001

Effective date: 20160907

Owner name: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT, TEXAS

Free format text: SECURITY AGREEMENT;ASSIGNORS:ASAP SOFTWARE EXPRESS, INC.;AVENTAIL LLC;CREDANT TECHNOLOGIES, INC.;AND OTHERS;REEL/FRAME:040136/0001

Effective date: 20160907

AS Assignment

Owner name: EMC IP HOLDING COMPANY LLC, MASSACHUSETTS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:EMC CORPORATION;REEL/FRAME:040203/0001

Effective date: 20160906

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552)

Year of fee payment: 8

AS Assignment

Owner name: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., TEXAS

Free format text: SECURITY AGREEMENT;ASSIGNORS:CREDANT TECHNOLOGIES, INC.;DELL INTERNATIONAL L.L.C.;DELL MARKETING L.P.;AND OTHERS;REEL/FRAME:049452/0223

Effective date: 20190320

AS Assignment

Owner name: THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., TEXAS

Free format text: SECURITY AGREEMENT;ASSIGNORS:CREDANT TECHNOLOGIES INC.;DELL INTERNATIONAL L.L.C.;DELL MARKETING L.P.;AND OTHERS;REEL/FRAME:053546/0001

Effective date: 20200409

AS Assignment

Owner name: WYSE TECHNOLOGY L.L.C., CALIFORNIA

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: SCALEIO LLC, MASSACHUSETTS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: MOZY, INC., WASHINGTON

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: MAGINATICS LLC, CALIFORNIA

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: FORCE10 NETWORKS, INC., CALIFORNIA

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: EMC IP HOLDING COMPANY LLC, TEXAS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: EMC CORPORATION, MASSACHUSETTS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: DELL SYSTEMS CORPORATION, TEXAS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: DELL SOFTWARE INC., CALIFORNIA

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: DELL PRODUCTS L.P., TEXAS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: DELL MARKETING L.P., TEXAS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: DELL INTERNATIONAL, L.L.C., TEXAS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: DELL USA L.P., TEXAS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: CREDANT TECHNOLOGIES, INC., TEXAS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: AVENTAIL LLC, CALIFORNIA

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

Owner name: ASAP SOFTWARE EXPRESS, INC., ILLINOIS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CREDIT SUISSE AG, CAYMAN ISLANDS BRANCH;REEL/FRAME:058216/0001

Effective date: 20211101

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 12TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1553); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 12

AS Assignment

Owner name: SCALEIO LLC, MASSACHUSETTS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061324/0001

Effective date: 20220329

Owner name: EMC IP HOLDING COMPANY LLC (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO MOZY, INC.), TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061324/0001

Effective date: 20220329

Owner name: EMC CORPORATION (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO MAGINATICS LLC), MASSACHUSETTS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061324/0001

Effective date: 20220329

Owner name: DELL MARKETING CORPORATION (SUCCESSOR-IN-INTEREST TO FORCE10 NETWORKS, INC. AND WYSE TECHNOLOGY L.L.C.), TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061324/0001

Effective date: 20220329

Owner name: DELL PRODUCTS L.P., TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061324/0001

Effective date: 20220329

Owner name: DELL INTERNATIONAL L.L.C., TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061324/0001

Effective date: 20220329

Owner name: DELL USA L.P., TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061324/0001

Effective date: 20220329

Owner name: DELL MARKETING L.P. (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO CREDANT TECHNOLOGIES, INC.), TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061324/0001

Effective date: 20220329

Owner name: DELL MARKETING CORPORATION (SUCCESSOR-IN-INTEREST TO ASAP SOFTWARE EXPRESS, INC.), TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (040136/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061324/0001

Effective date: 20220329

AS Assignment

Owner name: SCALEIO LLC, MASSACHUSETTS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061753/0001

Effective date: 20220329

Owner name: EMC IP HOLDING COMPANY LLC (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO MOZY, INC.), TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061753/0001

Effective date: 20220329

Owner name: EMC CORPORATION (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO MAGINATICS LLC), MASSACHUSETTS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061753/0001

Effective date: 20220329

Owner name: DELL MARKETING CORPORATION (SUCCESSOR-IN-INTEREST TO FORCE10 NETWORKS, INC. AND WYSE TECHNOLOGY L.L.C.), TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061753/0001

Effective date: 20220329

Owner name: DELL PRODUCTS L.P., TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061753/0001

Effective date: 20220329

Owner name: DELL INTERNATIONAL L.L.C., TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061753/0001

Effective date: 20220329

Owner name: DELL USA L.P., TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061753/0001

Effective date: 20220329

Owner name: DELL MARKETING L.P. (ON BEHALF OF ITSELF AND AS SUCCESSOR-IN-INTEREST TO CREDANT TECHNOLOGIES, INC.), TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061753/0001

Effective date: 20220329

Owner name: DELL MARKETING CORPORATION (SUCCESSOR-IN-INTEREST TO ASAP SOFTWARE EXPRESS, INC.), TEXAS

Free format text: RELEASE OF SECURITY INTEREST IN PATENTS PREVIOUSLY RECORDED AT REEL/FRAME (045455/0001);ASSIGNOR:THE BANK OF NEW YORK MELLON TRUST COMPANY, N.A., AS NOTES COLLATERAL AGENT;REEL/FRAME:061753/0001

Effective date: 20220329