WO2001038992A2 - Quorum resource arbiter within a storage network - Google Patents

Quorum resource arbiter within a storage network

Info

Publication number
WO2001038992A2
WO2001038992A2 (PCT/US2000/031936)
Authority
WO
WIPO (PCT)
Prior art keywords
quorum
volume
logical
computer
ownership
Application number
PCT/US2000/031936
Other languages
French (fr)
Other versions
WO2001038992A3 (en)
Inventor
Catherine Van Ingen
Norbert P. Kusters
Rod N. Gamache
Robert D. Rinne
Original Assignee
Microsoft Corporation
Application filed by Microsoft Corporation
Priority to JP2001540586A (patent JP5185483B2)
Priority to EP00980603A (patent EP1234240A2)
Publication of WO2001038992A2
Publication of WO2001038992A3

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 - Arrangements for program control, e.g. control units
    • G06F 9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 - Multiprogramming arrangements
    • G06F 9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 - Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5011 - Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 - Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 - Interfaces specially adapted for storage systems
    • G06F 3/0602 - Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F 3/0604 - Improving or facilitating administration, e.g. storage management
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 - Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 - Interfaces specially adapted for storage systems
    • G06F 3/0628 - Interfaces specially adapted for storage systems making use of a particular technique
    • G06F 3/0629 - Configuration or reconfiguration of storage systems
    • G06F 3/0631 - Configuration or reconfiguration of storage systems by allocating resources to storage systems
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 - Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 - Interfaces specially adapted for storage systems
    • G06F 3/0668 - Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F 3/067 - Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 - Arrangements for program control, e.g. control units
    • G06F 9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 - Multiprogramming arrangements
    • G06F 9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5061 - Partitioning or combining of resources
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2209/00 - Indexing scheme relating to G06F9/00
    • G06F 2209/50 - Indexing scheme relating to G06F9/50
    • G06F 2209/505 - Clust

Abstract

The invention provides a method and system for arbitrating for ownership of a logical quorum resource, such as a logical quorum volume, comprising one or more physical quorum resources so as to form a storage network having a plurality of storage devices. Arbitration and volume management responsibilities are cleanly divided between cluster management software and volume management software. The cluster management software handles the arbitration process without knowing the details of how the logical quorum resource is formed. The volume management software handles the formation and management of the logical quorum volume without having details of the arbitration process.

Description

QUORUM RESOURCE ARBITER WITHIN A STORAGE NETWORK
RELATED APPLICATIONS This application is related to the following applications, all of which are filed on the same day and assigned to the same assignee as the present application:
"Storage Management System Having Common Volume Manager" - serial no. 09/449,577 [Attorney docket 777.245US1], "Storage Management System Having Abstracted Volume Providers" - serial no. 09/450,364 [Attorney docket 777.246US1],
"Volume Stacking Model" - serial no. 09/451,219 [Attorney docket 777.247US1],
"Volume Configuration Data Administration" - serial no. 09/450,300 [Attorney docket 777.248US1], and
"Volume Migration Between Volume Groups" - serial no. 0/451,220 [Attorney docket 777.249US1].
FIELD OF THE INVENTION This invention relates generally to data storage devices, and more particularly to an arbitration mechanism for logical quorum resources within a storage network.
COPYRIGHT NOTICE/PERMISSION A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever. The following notice applies to the software and data as described below and in the drawings hereto: Copyright © 1999, Microsoft Corporation, All Rights Reserved. BACKGROUND OF THE INVENTION
As computer systems have evolved so has the availability and configuration of data storage devices, such as magnetic or optical disks. For example, these storage devices can be connected to the computer system via a bus, or they can be connected to the computer system via a wired or wireless network. In addition, the storage devices can be separate or co-located in a single cabinet.
A storage network is a collection of interconnected computing systems, referred to as nodes, operating as a single storage resource. A storage network allows a system to continue to operate during hardware or software failures, increases scalability by allowing nodes to be easily added and simplifies management by allowing an administrator to manage the nodes as a single system.
Cluster software exists on each node and manages all cluster-specific activity of a storage network. The cluster software often executes automatically upon startup of the node. At this time the cluster software configures and mounts local, non-shared devices. The cluster software also uses a 'discovery' process to determine whether other members of the storage network are operational. When the cluster software discovers an existing cluster, it attempts to join the cluster by performing an authentication sequence. A cluster master of the existing cluster authenticates the newcomer and returns a status of success if the joining node is authenticated. If the node is not recognized as a member then the request to join is refused.
If a cluster is not found during the discovery process, the node will attempt to form its own cluster. This process is repeated any time a node cannot communicate with the cluster to which it belongs. In conventional computing systems, nodes arbitrate for a physical "quorum resource", such as a disk, in order to form a storage network. In more recent systems, a quorum resource can be a logical resource, such as a volume, that includes one or more physical quorum resources. For example, a volume is a logical storage unit that can be a fraction of a disk, a whole disk, fractions of multiple disks or even multiple disks.
In conventional systems the responsibility and intelligence for determining ownership of a cluster, i.e., implementing the arbitration process, is often distributed among several components and/or software modules. The responsibility for configuring and managing the underlying storage devices is often similarly distributed. This lack of clean division in responsibility creates difficulties when a given component or software module changes. Thus, there is a need in the art for a system that more cleanly separates the responsibilities of cluster arbitration and cluster management from the responsibility of volume management and the underlying storage devices.
SUMMARY OF THE INVENTION The above-mentioned shortcomings, disadvantages and problems are addressed by the present invention. Inventive cluster management software and volume management software execute on the nodes of a storage network and operate in cooperation with the underlying operating system in order to arbitrate for logical quorum resources such as a quorum volume. According to the invention, the cluster management software arbitrates for logical quorum resources and forms a storage network without having knowledge of the underlying physical quorum resources. In this fashion, the cluster management software is not hardware specific. In addition, the cluster management software need not be aware of how the logical quorum resource is formed from the underlying physical quorum resources. For example, the volume management software is solely responsible for forming and mounting the logical quorum volume. The volume management software performs volume management without having detailed knowledge of the arbitration process and the determination of ownership.
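To make this division of responsibility concrete, the following is a minimal sketch of the separation just described. All names here (ClusterManager, VolumeManager, try_reserve, and so on) are hypothetical illustrations, not the patent's API, and for simplicity the sketch requires full ownership before mounting; the method of FIG. 4 relaxes that rule.

```python
class VolumeManager:
    """Knows how the logical quorum volume maps onto physical resources."""

    def quorum_devices(self) -> set[str]:
        # e.g. the extents of the quorum volume live on these two disks
        return {"disk108", "disk109"}

    def mount_quorum_volume(self, owned: set[str]) -> bool:
        # Only the volume manager decides whether ownership suffices.
        return owned == self.quorum_devices()


class ClusterManager:
    """Arbitrates for the quorum volume without knowing its layout."""

    def __init__(self, volumes: VolumeManager) -> None:
        self.volumes = volumes

    def arbitrate(self, try_reserve) -> bool:
        # Ask which physical devices matter; never interpret the set.
        devices = self.volumes.quorum_devices()
        owned = {d for d in devices if try_reserve(d)}
        # Hand ownership back; mounting is the volume manager's job.
        return self.volumes.mount_quorum_volume(owned)


# Usage: a node whose reservations all succeed becomes cluster master.
node = ClusterManager(VolumeManager())
print(node.arbitrate(lambda device: True))   # True
```

Note that neither class reaches into the other: the cluster manager sees only an opaque set of device names, and the volume manager never learns how reservations were won.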
BRIEF DESCRIPTION OF THE DRAWINGS FIG. 1 shows a diagram of the hardware and operating environment in conjunction with which embodiments of the invention can be practiced; FIG. 2 is a block diagram illustrating a system-level overview of a storage network having two computing systems and a variety of storage devices;
FIG. 3 is a block diagram illustrating one embodiment of a software system having cooperating software components that cleanly separates the responsibilities of cluster arbitration from the management of volumes and the underlying storage devices; and
FIG. 4 is a flowchart illustrating one mode of operation of the software system of FIG. 3 in which the system arbitrates for logical quorum resources according to the invention. DETAILED DESCRIPTION OF THE INVENTION
In the following detailed description of exemplary embodiments of the invention, reference is made to the accompanying drawings that form a part hereof, and in which is shown by way of illustration specific exemplary embodiments in which the invention can be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention, and it is to be understood that other embodiments can be utilized and that changes can be made without departing from the scope of the present invention. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present invention is defined only by the claims.
The detailed description is divided into four sections. In the first section, a glossary of terms is provided. In the second section, the hardware and the operating environment in conjunction with which embodiments of the invention can be practiced are described. In the third section, a system level overview of the invention is presented. Finally, in the fourth section, a conclusion of the detailed description is provided.
Definitions Compromised - a status indicating that a fault tolerant volume is missing one or more disk or volume extents; for example, a mirror set with only one mirror currently available.
Configuration data - describes the mapping of physical resources to logical volumes.
Directed configuration - provider is explicitly provided with rules for choosing logical block remapping.
Disk platter - a subset of a diskpack, used for exporting or importing volumes from a diskpack.
Diskpack - a collection of logical volumes and underlying disks. A diskpack is the unit of transitive closure for a volume. Export - Move a disk platter and all volumes contained on that platter out of one diskpack.
Exposed - a volume is exposed to an operating system when the volume has an associated volume name (drive letter) or mount point. The volume can be made available to a file system or other data store. Free agent drive - a disk drive which is not a member of a disk pack. Free agent drives cannot contain logical volumes that are exposed.
Health - volume fault management status. A volume can be initializing, healthy, compromised, unhealthy, or rebuilding.
Healthy - containing or able to contain valid data. Hot-spotting - temporary plexing of a volume or collection of volume extents.
Import - Move a disk platter and all volumes contained on that platter into one diskpack.
Initializing - a status indicating that a volume is rediscovering volume configuration. LBN - logical block number.
Logical block mapping - relationship between the logical blocks exposed to the logical volume provider and those exposed by the same provider. Logical quorum resource - a logical resource that is necessary to form a storage network. The logical quorum resource, such as a logical volume, comprises one or more physical quorum resources, such as a disk.
Logical volume - a logical storage unit that can be a fraction of a disk, a whole disk, a fraction of multiple disks or even multiple disks.
Logical volume provider - software which exposes logical volumes. A provider includes runtime services, configuration data, and management services.
Management service - software that executes only infrequently to perform volume configuration, monitoring or fault handling. Mapped volume - a simple linear logical block mapping which concatenates volumes to expose a single larger volume.
Mirrored volume - logical volume which maintains two or more identical data copies. Also termed RAID 1.
Parity striped volume - logical volume which maintains parity check information as well as data. The exact mapping and protection scheme is vendor- specific. Includes RAID 3, 4, 5, 6.
Plexed volume - dynamic mirror volume. Plexing is used to create a copy of a volume rather than to provide fault tolerance. The mirror is added to the volume with the intent of removal after the contents have been synchronized. RAID - Redundant Array of Independent Disks.
Rebuilding - a status indicating that a previously compromised fault tolerant volume is resynchronizing all volume extent data.
Runtime service - software that executes on a per-IO request basis.
SCSI - Small-Computer Systems Interface. Stacked volume - a volume that has been constructed by more than one logical block mapping operation. An example is a stripe set of mirror volumes. Stacking includes striping, mapping, and plexing.
Striped volume - a logical block mapping which distributes contiguous logical volume extents across multiple volumes. Also termed RAID 0.
Unhealthy - a status indicating that a non-fault-tolerant volume is missing one or more disk or volume extents; data contained on unhealthy volumes must not be accessed. Volume configuration stability - whether volume logical-to-physical mapping is undergoing change. A volume may be stable, extending, shrinking, plexing, or remapping.
Volume extent - a contiguous range of logical blocks contained on a volume. Volume extents are the smallest managed logical volume unit. Volume status - current use of a volume by the system. A volume may be unused, hot spare, mapped, used, or unknown.
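The glossary above implies a simple data model: a logical volume is built from extents, each extent is a contiguous run of logical blocks on one disk, and each volume carries a fault-management status. A minimal sketch follows; the type names and fields are assumptions for illustration, not structures the patent defines.

```python
from dataclasses import dataclass, field
from enum import Enum


class Health(Enum):
    """'Health - volume fault management status' (see glossary)."""
    INITIALIZING = "initializing"
    HEALTHY = "healthy"
    COMPROMISED = "compromised"
    UNHEALTHY = "unhealthy"
    REBUILDING = "rebuilding"


@dataclass
class VolumeExtent:
    """A contiguous range of logical blocks contained on a volume."""
    device: str      # the disk holding this extent
    start_lbn: int   # LBN - logical block number
    length: int      # number of logical blocks


@dataclass
class LogicalVolume:
    """A logical storage unit: a fraction of one disk up to many disks."""
    name: str
    extents: list[VolumeExtent] = field(default_factory=list)
    health: Health = Health.INITIALIZING

    def spans_multiple_disks(self) -> bool:
        return len({e.device for e in self.extents}) > 1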
Hardware and Operating Environment FIG. 1 is a diagram of the hardware and operating environment in conjunction with which embodiments of the invention may be practiced. The description of FIG. 1 is intended to provide a brief, general description of suitable computer hardware and a suitable computing environment in conjunction with which the invention may be implemented. Although not required, the invention is described in the general context of computer- executable instructions, such as program modules, being executed by a computer, such as a personal computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types.
Moreover, those skilled in the art will appreciate that the invention may be practiced with other computer system configurations, including hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, and the like. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules can be located in both local and remote memory storage devices.
The exemplary hardware and operating environment of FIG. 1 for implementing the invention includes a general purpose computing device in the form of a computer 20, including a processing unit 21, a system memory 22, and a system bus 23 that operatively couples various system components, including the system memory 22, to the processing unit 21. There may be only one or there may be more than one processing unit 21, such that the processor of computer 20 comprises a single central-processing unit (CPU), or a plurality of processing units, commonly referred to as a parallel processing environment.
The computer 20 can be a conventional computer, a distributed computer, or any other type of computer; the invention is not so limited.
The system bus 23 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. The system memory may also be referred to as simply the memory, and includes read only memory (ROM) 24 and random access memory (RAM) 25. A basic input/output system (BIOS) 26, containing the basic routines that help to transfer information between elements within the computer 20, such as during start-up, is stored in ROM 24. The computer 20 further includes a hard disk drive 27 for reading from and writing to a hard disk, not shown, a magnetic disk drive 28 for reading from or writing to a removable magnetic disk 29, and an optical disk drive 30 for reading from or writing to a removable optical disk 31 such as a CD ROM or other optical media. The hard disk drive 27, magnetic disk drive 28, and optical disk drive 30 are connected to the system bus 23 by a hard disk drive interface 32, a magnetic disk drive interface 33, and an optical disk drive interface 34, respectively. The drives and their associated computer-readable media provide nonvolatile storage of computer-readable instructions, data structures, program modules and other data for the computer 20. It should be appreciated by those skilled in the art that any type of computer-readable media which can store data that is accessible by a computer, such as magnetic cassettes, flash memory cards, digital video disks, Bernoulli cartridges, random access memories (RAMs), read only memories (ROMs), and the like, may be used in the exemplary operating environment. A number of program modules may be stored on the hard disk 27, magnetic disk 29, optical disk 31, ROM 24, or RAM 25, including an operating system 35, one or more application programs 36, other program modules 37, and program data 38. A user may enter commands and information into the personal computer 20 through input devices such as a keyboard 40 and pointing device 42. Other input devices (not shown) may include a microphone, joystick, game pad, satellite dish, scanner, or the like. These and other input devices are often connected to the processing unit 21 through a serial port interface 46 that is coupled to the system bus, but may be connected by other interfaces, such as a parallel port, game port, or a universal serial bus (USB). A monitor 47 or other type of display device is also connected to the system bus 23 via an interface, such as a video adapter 48. In addition to the monitor, computers typically include other peripheral output devices (not shown), such as speakers and printers.
The computer 20 may operate in a networked environment using logical connections to one or more remote computers, such as remote computer 49.
These logical connections are achieved by a communication device coupled to or a part of the computer 20, the local computer; the invention is not limited to a particular type of communications device. The remote computer 49 may be another computer, a server, a router, a network PC, a client, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computer 20, although only a memory storage device 50 has been illustrated in FIG. 1. The logical connections depicted in FIG. 1 include a local-area network (LAN) 51 and a wide-area network (WAN) 52. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets and the Internet.
When used in a LAN-networking environment, the computer 20 is connected to the local network 51 through a network interface or adapter 53, which is one type of communications device. When used in a WAN-networking environment, the computer 20 typically includes a modem 54, a type of communications device, or any other type of communications device for establishing communications over the wide area network 52, such as the Internet. The modem 54, which may be internal or external, is connected to the system bus 23 via the serial port interface 46. In a networked environment, program modules depicted relative to the personal computer 20, or portions thereof, may be stored in the remote memory storage device. It is appreciated that the network connections shown are exemplary and other means of, and communications devices for, establishing a communications link between the computers may be used. The hardware and operating environment in conjunction with which embodiments of the invention may be practiced has been described. The computer in conjunction with which embodiments of the invention may be practiced may be a conventional computer, a distributed computer, or any other type of computer; the invention is not so limited. Such a computer typically includes one or more processing units as its processor, and a computer-readable medium such as a memory. The computer may also include a communications device such as a network adapter or a modem, so that it is able to communicatively couple to other computers.
System Level Overview FIG. 2 is a block diagram illustrating a system-level overview of storage network 100 that includes node 105 communicatively coupled to node 110 via network 120. Nodes 105 and 110 represent any suitable computing system such as local computer 20 or remote computer 49 depicted in FIG. 1.
Storage network 100 further includes storage subsystem 106, which comprises storage device 107, storage device 108, and storage device 109. These devices may be any suitable storage medium such as a single internal disk, multiple external disks or even a RAID cabinet. Storage subsystem 106 is coupled via bus 112, which is any suitable interconnect mechanism such as dual-connect SCSI ("Small-Computer Systems Interface"), fiber-channel, etc.
In order to form storage network 100, nodes 105 and 110 arbitrate for a logical quorum resource such as a quorum volume. In FIG. 2 the logical quorum resource is illustrated as a quorum volume that is collectively formed by physical quorum resources 111, which in this embodiment are data storage extents within data storage device 108 and data storage device 109. If either node 105 or 110 is successful at obtaining ownership of all physical quorum resources 111, the successful node may form storage network 100. As described below, inventive cluster management software and volume management software execute on each node and resolve situations where ownership of physical quorum resources 111 is split between nodes 105 and 110. On each node, the cluster management software and the volume management software cooperate with the underlying operating system to form storage network 100. As illustrated below, arbitration and management responsibilities are divided between the cluster management software and the volume management software such that cluster management software handles the arbitration process without knowing the details of volume management and storage subsystem 106. The volume management software handles the configuration and management of storage subsystem 106 without knowing how storage network 100 is formed.
FIG. 3 is a block diagram illustrating one embodiment of a node 200, such as node 105 or node 110 of FIG. 2, in which various cooperating software components carry out the inventive arbitration technique. Within node 200, cluster manager 202 oversees all cluster-specific activity and communicates to bus 112 (FIG. 2) of storage subsystem 106 via disk controller 206. As a cluster master, cluster manager 202, volume manager 204 and operating system 35 cooperatively manage the quorum volume for storage network 100 and the corresponding physical quorum resources 111. More specifically, cluster manager 202 handles the arbitration process without knowing the details of volume management and storage subsystem 106. Volume manager 204 handles all volume mapping and the configuration of storage subsystem 106 of storage network 100. Disk controller 206 handles all communications with storage subsystem 106 and may implement one of a variety of data communication protocols such as SCSI, IP, etc. Applications 210 represent any user-mode software module that interacts with storage network 100. The system level overview of the operation of an exemplary embodiment of the invention has been described in this section of the detailed description.
Methods of an Exemplary Embodiment of the Invention In the previous section, a system level overview of the operation of an exemplary embodiment of the invention was described. In this section, the particular methods performed by a computer executing an exemplary embodiment are described by reference to a series of flowcharts. The methods to be performed by a computer constitute computer programs made up of computer-executable instructions. Describing the methods by reference to a flowchart enables one skilled in the art to develop such programs including such instructions to carry out the methods on suitable computers (the processor of the computers executing the instructions from computer-readable media).
FIG. 4 illustrates how the present invention cleanly separates the responsibilities of cluster management from the responsibility of volume management. More specifically, arbitration cycle 300 illustrates one embodiment of the inventive arbitration method as performed by cluster manager 202 and volume manager 204 on each node of storage network 100. Arbitration cycle 300 is invoked when storage network 100 has not yet been established, such as when either node 105 or 110 is the first to boot, or any time storage network 100 had been previously formed but communication between nodes 105 and 110 has broken down.
The arbitration cycle 300 can be initiated by either node 105 or node 110 by proceeding from block 302 to block 304. In block 304, cluster manager 202 (FIG. 3) terminates all current ownership of storage subsystem 106. In one embodiment this is accomplished by resetting bus 112. This action in turn forces all the other nodes of storage network 100 to perform arbitration cycle 300 and places all volumes into an off-line mode. In this mode, volume manager 204 blocks all access to storage subsystem 106. In one embodiment the arbitrating nodes wait a predetermined delay period before proceeding with arbitration cycle 300 in order to ensure that all nodes of storage network 100 have entered arbitration.
In block 306, cluster manager 202 instructs volume manager 204 to scan all other nodes within storage network 100 in order to update configuration information for each new or removed storage device of storage subsystem 106. At the end of block 306 the configuration information maintained by volume manager 204 is only partially complete because the devices that were owned by other nodes may have been changed. Thus, in block 308, cluster manager 202 instructs volume manager 204 to generate a list that identifies those storage devices of storage subsystem 106 that were previously owned by nodes of storage network 100. In block 309 volume manager 204 reads and processes volume information from each storage device on the generated list and rebuilds an internal configuration database. This action ensures that the arbitrating node discovers the quorum resource for storage network 100 even if the quorum resource was owned entirely by a different node prior to arbitration cycle 300. At the conclusion of block 309, volume manager 204 has information regarding all storage devices of storage subsystem 106 and all volumes thereon.
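A sketch of blocks 306 through 309 follows. The function and parameter names are hypothetical stand-ins for whatever device-scanning and on-disk metadata services a real implementation would provide.

```python
def rebuild_configuration(all_devices, locally_owned, read_volume_info):
    """Hypothetical sketch of blocks 306-309: rescan, list devices that
    other nodes previously owned, and rebuild the configuration database."""
    config = {}
    # Block 306: scanning only locally known devices leaves the
    # configuration information partially complete.
    for dev in locally_owned:
        config[dev] = read_volume_info(dev)
    # Block 308: generate the list of devices previously owned by
    # other nodes of the storage network.
    rescan_list = [d for d in all_devices if d not in locally_owned]
    # Block 309: read volume information from each listed device, so the
    # quorum resource is discovered even if another node owned all of it.
    for dev in rescan_list:
        config[dev] = read_volume_info(dev)
    return config
```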
Next, in block 310 cluster manager 202 requests that volume manager 204 identify all physical quorum resources 111 associated with the quorum volume. The volume manager 204 determines all storage devices of storage subsystem 106 having physical quorum resources 111 and rebuilds quorum volume information for storage network 100. For example, referring to FIG. 2, volume manager 204 identifies storage devices 108 and 109 as necessary for ownership to ensure that the volume may be brought online. At the completion of block 310, quorum volume information is consistent for all nodes of storage network 100. At this point cluster manager 202 attempts to take ownership of storage devices 108 and 109.
In block 312, cluster manager 202 invokes conventional arbitration techniques provided by bus 112, such as techniques specified by the SCSI protocol, in order to arbitrate for the physical quorum resources, i.e., storage devices 108 and 109. At the conclusion of these conventional mechanisms, either node 105 or 110 may own both storage devices 108 and 109 or the ownership of physical quorum resources 111 may be split due to race conditions present in the conventional arbitration techniques.
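The per-device arbitration of block 312 can be sketched as below. The reserve callable is a hypothetical stand-in for the bus-level primitive (for example, a SCSI reservation); because each node races independently for each device, the returned set may cover all, some, or none of the physical quorum resources.

```python
def arbitrate_physical(devices, node_id, reserve):
    """Sketch of block 312: attempt to reserve each physical quorum device.

    `reserve(device, node_id)` models a bus-level arbitration primitive;
    nothing guarantees that one node wins every device.
    """
    owned = set()
    for dev in devices:
        if reserve(dev, node_id):   # races with other nodes, per device
            owned.add(dev)
    return owned                    # all, some, or none of `devices`
```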
After arbitration for physical quorum resources 111 has completed, volume manager 204 determines, in block 314, whether the local node, i.e., the node upon which cluster manager 202 is running, has successfully acquired ownership of both storage devices 108 and 109 necessary for the quorum volume. If so, volume manager 204 mounts the quorum volume and, in block 316, cluster manager 202 declares the local node to be the cluster master and informs the other nodes that storage network 100 has been formed. At this point, the other nodes terminate arbitration and join storage network 100.
If the local node does not have ownership of both storage devices 108 and 109, volume manager 204 proceeds from block 314 to block 318 and determines whether the local node has acquired ownership of any quorum volume resources, i.e., either storage device 108 or 109. If the local node does not have ownership of either then control passes to cluster manager 202 which, in block 320, terminates arbitration and waits for communication from another node that ultimately becomes the cluster master.
If the arbitrating node has ownership of one but not both of storage devices 108 and 109, then volume manager 204 proceeds from block 318 to block 322 and determines whether the volume list is sufficient to form a quorum. Volume manager 204 may use several different algorithms in determining whether the volume list is sufficient, such as a simple majority or a weighted voting scheme. If the volume list is not sufficient then volume manager 204 releases any quorum resources. Cluster manager 202 proceeds to block 320 and waits for communication from another node that ultimately becomes the cluster master.
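Both quorum tests named above (simple majority and weighted voting) reduce to a small predicate. A sketch, with hypothetical names:

```python
def quorum_sufficient(owned, all_quorum, weights=None):
    """Sketch of the block 322 test: simple majority by default, or a
    weighted vote when a weight table is supplied."""
    if weights is None:
        return 2 * len(owned) > len(all_quorum)          # simple majority
    total = sum(weights[d] for d in all_quorum)
    return 2 * sum(weights[d] for d in owned) > total    # weighted voting
```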
If, however, volume manager 204 determines that the volume list is sufficient then volume manager 204 proceeds from block 322 to block 323 and determines whether it is safe to mount the quorum volume. This determination is based on volume-specific information. For example, if the quorum volume uses concatenated or striped extents then volume manager 204 will always determine it unsafe to mount the quorum volume when only one extent is owned. As another example, when the quorum volume is a RAID 5 volume, then volume manager 204 may apply a "minus one" algorithm such that all but one of the extents are required. In addition, volume manager 204 may apply user-selectable criteria. For example, if the quorum volume is a mirror then the user may configure volume manager 204 to require all extents or to require a simple majority. If volume manager 204 can safely mount the quorum volume then volume manager 204 mounts the quorum volume and cluster manager 202 proceeds to block 316 and declares the local node the cluster master.
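The mount-safety rules of block 323 can be collected into one function. This is a sketch under the assumption that ownership is counted in extents; the names and the mirror_policy parameter model the user-selectable criterion and are illustrative, not the patent's interface.

```python
def safe_to_mount(kind, owned_extents, total_extents, mirror_policy="all"):
    """Sketch of the block 323 rules quoted above, per volume type."""
    if kind in ("concatenated", "striped"):
        return owned_extents == total_extents         # every extent required
    if kind == "raid5":
        return owned_extents >= total_extents - 1     # "minus one" rule
    if kind == "mirror":
        if mirror_policy == "all":                    # user-selectable
            return owned_extents == total_extents
        return 2 * owned_extents > total_extents      # simple majority
    return False                                      # unknown: never mount
```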
If, however, volume manager 204 determines that it cannot safely mount the quorum volume, cluster manager 202 waits a predetermined amount of time. If, in block 326, communication is not received from a cluster master within that time, cluster manager 202 returns to block 304 and repeats the inventive arbitration method. In one embodiment, the delay period increases with each iteration of arbitration cycle 300.
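The retry behavior of blocks 326 and 304, waiting for a master and then re-arbitrating with a growing delay, might be structured as below. Both helper functions are stubs standing in for the blocks named in their comments, and the doubling schedule is an assumption; the text says only that the delay period increases each iteration.

```c
/* Sketch of the retry path: wait for a cluster master, then repeat
 * arbitration with a delay that grows each iteration, as in the
 * described embodiment.  Both helpers are stubs. */
#include <stdbool.h>
#include <unistd.h>

static bool run_arbitration(void)        /* stub for blocks 304-323    */
{
    return false;                        /* pretend quorum not reached */
}

static bool wait_for_master(unsigned s)  /* stub for block 326         */
{
    sleep(s);
    return false;                        /* pretend no master appeared */
}

static void arbitration_cycle(void)
{
    unsigned delay = 1;                  /* seconds; grows per cycle   */
    while (!run_arbitration()) {
        if (wait_for_master(delay))
            return;                      /* another node became master */
        delay *= 2;                      /* increasing delay period    */
    }
}
```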
Conclusion

Various embodiments of the inventive arbitration scheme have been described that allow cluster software to arbitrate for a logical quorum resource without requiring knowledge of volume management or of the physical characteristics that underlie the formation of the logical resource. The volume management software manages the underlying storage devices without having knowledge of how ownership of the cluster is established via the arbitration process. In this manner, the present invention cleanly separates the responsibilities of cluster management from the responsibilities of volume management. It is intended that only the claims and equivalents thereof limit this invention.

Claims

What is claimed is:
1. A method for forming a storage network from one or more storage devices comprising: invoking a first software module for arbitrating for ownership of a logical quorum resource; and invoking a second software module for forming the logical quorum resource from one or more physical quorum resources at the request of the first software module.
2. The method of claim 1 wherein forming the logical quorum resource includes forming a quorum volume.
3. The method of claim 2 wherein forming the logical quorum resource includes determining whether the quorum volume is safe to mount.
4. The method of claim 1 and further including waiting a delay period for a communication from a cluster master indicating that a storage network has been formed.
5. The method of claim 1 wherein the first software module arbitrates for ownership of the physical quorum resources used by the second software module to form the logical quorum resource.
6. The method of claim 1 and further including resetting any pre-existing ownership of the physical quorum resources.
7. The method of claim 2 and further including generating a rescan list of storage devices within the storage network that were previously owned by nodes of the storage network other than a local node upon which the first and second software modules execute.
8. The method of claim 1, wherein the first software module does not have information of how the second software module forms the logical quorum resource, and further wherein the second software module does not have information of how the first software module arbitrates for the logical quorum resource.
9. A method for arbitrating for ownership of a logical quorum resource comprising one or more physical quorum resources so as to form a storage network having a plurality of storage devices comprising: arbitrating for ownership of the physical quorum resources; determining whether a quorum volume can be mounted based on the ownership of the physical quorum resources; and forming a storage network when the quorum volume can be mounted.
10. The method of claim 9 and further including waiting a delay period for a communication from a cluster master indicating that a storage network has been formed.
11. The method of claim 9 and further including resetting any pre-existing ownership of the physical quorum resources.
12. The method of claim 9 and further including generating a rescan list of storage devices within the storage network that were previously owned by nodes of the storage network.
13. The method of claim 12 and further including retrieving volume information from the storage devices on the rescan list.
14. A computer-readable medium having computer-executable instructions to cause a computer to perform a method of: invoking a first software module for arbitrating for ownership of a logical quorum resource without having information relating to formation of the logical quorum resource; and invoking a second software module for forming the logical quorum resource from one or more physical quorum resources without having information relating to the arbitration for the logical quorum resource.
15. The computer-readable medium of claim 14 wherein forming the logical quorum resource includes forming a quorum volume.
16. The computer-readable medium of claim 15 and having computer-executable instructions to cause a computer to determine whether the quorum volume is safe to mount.
17. The computer-readable medium of claim 14 and having computer-executable instructions to cause a computer to wait a delay period for a communication from a cluster master indicating that a storage network has been formed.
18. The computer-readable medium of claim 14 wherein the first software module arbitrates for ownership of the physical quorum resources.
19. The computer-readable medium of claim 14 and further including resetting any pre-existing ownership of the physical quorum resources.
20. A computing system comprising: a processor and a computer-readable medium; an operating environment executing on the processor from the computer-readable medium; a cluster manager executing on the computing system for arbitrating for ownership of a logical quorum volume and for forming a storage network upon obtaining ownership of the logical quorum volume; and a volume manager executing on the computing system for forming the logical quorum volume from one or more physical quorum resources at the request of the cluster manager.
21. The computing system of claim 20, wherein in order to arbitrate for the logical quorum volume the cluster manager arbitrates for ownership of the physical quorum resources without having information relating to the formation of the logical quorum volume.
22. The computing system of claim 21, wherein the volume manager forms the logical volume from the owned physical quorum resources without having information relating to the arbitration process that determined ownership of the physical quorum resources.
23. The computing system of claim 20, wherein during arbitration the cluster manager releases ownership of owned physical quorum resources when the volume manager is not able to write data to the storage devices without data corruption.
24. The computing system of claim 20, wherein the cluster manager waits a delay period for a communication from a cluster master indicating that a storage network has been formed when the computing system has ownership of no physical quorum resources.
25. The computing system of claim 20, wherein the cluster manager resets any pre-existing ownership of the physical quorum resources in order to initiate arbitration.
26. The computing system of claim 20, wherein the volume manager generates a rescan list of storage devices within the storage network.
27. The computing system of claim 20, wherein the volume manager retrieves volume information from the storage devices on the rescan list.
28. The computing system of claim 20, wherein the volume manager determines whether the logical quorum volume is safe for mounting.
29. The computing system of claim 20, wherein the volume manager determines whether a volume list is sufficient for a quorum.
PCT/US2000/031936 1999-11-29 2000-11-21 Quorum resource arbiter within a storage network WO2001038992A2 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2001540586A JP5185483B2 (en) 1999-11-29 2000-11-21 Quorum resource arbiter in the storage network
EP00980603A EP1234240A2 (en) 1999-11-29 2000-11-21 Quorum resource arbiter within a storage network

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/449,579 1999-11-29
US09/449,579 US6615256B1 (en) 1999-11-29 1999-11-29 Quorum resource arbiter within a storage network

Publications (2)

Publication Number Publication Date
WO2001038992A2 true WO2001038992A2 (en) 2001-05-31
WO2001038992A3 WO2001038992A3 (en) 2002-06-06

Family

ID=23784686

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2000/031936 WO2001038992A2 (en) 1999-11-29 2000-11-21 Quorum resource arbiter within a storage network

Country Status (4)

Country Link
US (1) US6615256B1 (en)
EP (1) EP1234240A2 (en)
JP (1) JP5185483B2 (en)
WO (1) WO2001038992A2 (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1415377A2 (en) * 2001-07-06 2004-05-06 Computer Associates International, Inc. System and method for managing object based clusters
WO2007013961A2 (en) * 2005-07-22 2007-02-01 Network Appliance, Inc. Architecture and method for configuring a simplified cluster over a network with fencing and quorum
EP1751660A2 (en) * 2004-03-09 2007-02-14 Scaleout Software, Inc. Scalable, software based quorum architecture
US7483978B2 (en) 2006-05-15 2009-01-27 Computer Associates Think, Inc. Providing a unified user interface for managing a plurality of heterogeneous computing environments
US7979863B2 (en) 2004-05-21 2011-07-12 Computer Associates Think, Inc. Method and apparatus for dynamic CPU resource management
US7979857B2 (en) 2004-05-21 2011-07-12 Computer Associates Think, Inc. Method and apparatus for dynamic memory resource management
US8104033B2 (en) 2005-09-30 2012-01-24 Computer Associates Think, Inc. Managing virtual machines based on business priorty
US8225313B2 (en) 2005-10-19 2012-07-17 Ca, Inc. Object-based virtual infrastructure management

Families Citing this family (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6553387B1 (en) * 1999-11-29 2003-04-22 Microsoft Corporation Logical volume configuration data management determines whether to expose the logical volume on-line, off-line request based on comparison of volume epoch numbers on each extents of the volume identifiers
US6684231B1 (en) * 1999-11-29 2004-01-27 Microsoft Corporation Migration of friendly volumes
US6769008B1 (en) * 2000-01-10 2004-07-27 Sun Microsystems, Inc. Method and apparatus for dynamically altering configurations of clustered computer systems
US7093288B1 (en) 2000-10-24 2006-08-15 Microsoft Corporation Using packet filters and network virtualization to restrict network communications
US7113900B1 (en) 2000-10-24 2006-09-26 Microsoft Corporation System and method for logical modeling of distributed computer systems
US6907395B1 (en) 2000-10-24 2005-06-14 Microsoft Corporation System and method for designing a logical model of a distributed computer system and deploying physical resources according to the logical model
US7606898B1 (en) * 2000-10-24 2009-10-20 Microsoft Corporation System and method for distributed management of shared computers
US6915338B1 (en) 2000-10-24 2005-07-05 Microsoft Corporation System and method providing automatic policy enforcement in a multi-computer service application
US6886038B1 (en) * 2000-10-24 2005-04-26 Microsoft Corporation System and method for restricting data transfers and managing software components of distributed computers
US7243374B2 (en) 2001-08-08 2007-07-10 Microsoft Corporation Rapid application security threat analysis
US8285825B1 (en) * 2002-11-13 2012-10-09 Novell, Inc. Method and system for managing network resources based on a dynamic quorum
JP4516322B2 (en) 2004-01-28 2010-08-04 株式会社日立製作所 A computer system having a shared exclusive control method between sites having a storage system shared by a plurality of host devices
US20050193105A1 (en) * 2004-02-27 2005-09-01 Basham Robert B. Method and system for processing network discovery data
JP2005310243A (en) * 2004-04-20 2005-11-04 Seiko Epson Corp Memory controller, semiconductor integrated circuit apparatus, semiconductor apparatus, microcomputer, and electronic equipment
EP1748361A1 (en) * 2004-08-23 2007-01-31 Sun Microsystems France S.A. Method and apparatus for using a USB cable as a cluster quorum device
EP1632854A1 (en) 2004-08-23 2006-03-08 Sun Microsystems France S.A. Method and apparatus for using a serial cable as a cluster quorum device
US7941309B2 (en) 2005-11-02 2011-05-10 Microsoft Corporation Modeling IT operations/policies
US8209417B2 (en) * 2007-03-08 2012-06-26 Oracle International Corporation Dynamic resource profiles for clusterware-managed resources
US7890555B2 (en) * 2007-07-10 2011-02-15 International Business Machines Corporation File system mounting in a clustered file system
JP4577384B2 (en) 2008-03-14 2010-11-10 日本電気株式会社 Management machine, management system, management program, and management method
US20110178984A1 (en) * 2010-01-18 2011-07-21 Microsoft Corporation Replication protocol for database systems
US8825601B2 (en) * 2010-02-01 2014-09-02 Microsoft Corporation Logical data backup and rollback using incremental capture in a distributed database
US8650281B1 (en) * 2012-02-01 2014-02-11 Symantec Corporation Intelligent arbitration servers for network partition arbitration
WO2016106682A1 (en) * 2014-12-31 2016-07-07 华为技术有限公司 Post-cluster brain split quorum processing method and quorum storage device and system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5828889A (en) * 1996-05-31 1998-10-27 Sun Microsystems, Inc. Quorum mechanism in a two-node distributed computer system
US6253240B1 (en) * 1997-10-31 2001-06-26 International Business Machines Corporation Method for producing a coherent view of storage network by a storage network manager using data storage device configuration obtained from data storage devices

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5822531A (en) * 1996-07-22 1998-10-13 International Business Machines Corporation Method and system for dynamically reconfiguring a cluster of computer systems

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
GELB J P: "SYSTEM-MANAGED STORAGE" IBM SYSTEMS JOURNAL, IBM CORP. ARMONK, NEW YORK, US, vol. 28, no. 1, 1989, pages 77-103, XP000054276 ISSN: 0018-8670 *
See also references of EP1234240A2 *
VERITAS SOFTWARE CORPORATION: "VERITAS Volume Manager Administrator's Reference Guide Release 3.0.1" VERITAS VOLUME MANAGER FOR SOLARIS DOCUMENTATION, [Online] May 1999 (1999-05), pages i-ii, 115-148, XP002190199. Retrieved from the Internet: <URL:ftp://ftp.support.veritas.com/pub/support/products/VolumeManager_UNIX/vm301_ref_236742.pdf> [retrieved on 2002-02-13] *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7603327B2 (en) 2001-07-06 2009-10-13 Computer Associates Think, Inc. System and method for managing object based clusters
EP1415377A4 (en) * 2001-07-06 2007-05-30 Computer Ass Think Inc System and method for managing object based clusters
EP1415377A2 (en) * 2001-07-06 2004-05-06 Computer Associates International, Inc. System and method for managing object based clusters
EP1751660A2 (en) * 2004-03-09 2007-02-14 Scaleout Software, Inc. Scalable, software based quorum architecture
EP1751660A4 (en) * 2004-03-09 2010-06-16 Scaleout Software Inc Scalable, software based quorum architecture
US7979863B2 (en) 2004-05-21 2011-07-12 Computer Associates Think, Inc. Method and apparatus for dynamic CPU resource management
US7979857B2 (en) 2004-05-21 2011-07-12 Computer Associates Think, Inc. Method and apparatus for dynamic memory resource management
WO2007013961A2 (en) * 2005-07-22 2007-02-01 Network Appliance, Inc. Architecture and method for configuring a simplified cluster over a network with fencing and quorum
WO2007013961A3 (en) * 2005-07-22 2008-05-29 Network Appliance Inc Architecture and method for configuring a simplified cluster over a network with fencing and quorum
US8104033B2 (en) 2005-09-30 2012-01-24 Computer Associates Think, Inc. Managing virtual machines based on business priorty
US8255907B2 (en) 2005-09-30 2012-08-28 Ca, Inc. Managing virtual machines based on business priority
US8225313B2 (en) 2005-10-19 2012-07-17 Ca, Inc. Object-based virtual infrastructure management
US7483978B2 (en) 2006-05-15 2009-01-27 Computer Associates Think, Inc. Providing a unified user interface for managing a plurality of heterogeneous computing environments

Also Published As

Publication number Publication date
JP5185483B2 (en) 2013-04-17
US6615256B1 (en) 2003-09-02
WO2001038992A3 (en) 2002-06-06
EP1234240A2 (en) 2002-08-28
JP2003515813A (en) 2003-05-07

Similar Documents

Publication Publication Date Title
US6615256B1 (en) Quorum resource arbiter within a storage network
US7584224B2 (en) Volume configuration data administration
EP0936547B1 (en) Method and apparatus for identifying at-risk components in systems with redundant components
JP2003515813A5 (en)
US6904599B1 (en) Storage management system having abstracted volume providers
US7007024B2 (en) Hashing objects into multiple directories for better concurrency and manageability
EP1234237B1 (en) Storage management system having common volume manager
US7360030B1 (en) Methods and apparatus facilitating volume management
EP0935374B1 (en) Dynamic and consistent naming of fabric attached storage
EP0935375B1 (en) Name service for a highly configurable multi-node processing system
EP0935186B1 (en) Volume set configuration using a single operational view
US7191357B2 (en) Hybrid quorum/primary-backup fault-tolerance model
US6105122A (en) I/O protocol for highly configurable multi-node processing system
US7036039B2 (en) Distributing manager failure-induced workload through the use of a manager-naming scheme
EP0935200B1 (en) Highly scalable parallel processing computer system architecture
US6243814B1 (en) Method and apparatus for reliable disk fencing in a multicomputer system
Devarakonda et al. Recovery in the Calypso file system
EP0989490A2 (en) Protocol for dynamic binding of shared resources
US6684231B1 (en) Migration of friendly volumes
US20120226673A1 (en) Configuration-less network locking infrastructure for shared file systems
US6629202B1 (en) Volume stacking model
CN110663031A (en) Distributed storage network
Vallath Oracle real application clusters
Chavis et al. A Guide to the IBM Clustered Network File System
Ingram High-Performance Oracle: Proven Methods for Achieving Optimum Performance and Availability

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): JP

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
ENP Entry into the national phase

Ref country code: JP

Ref document number: 2001 540586

Kind code of ref document: A

Format of ref document f/p: F

AK Designated states

Kind code of ref document: A3

Designated state(s): JP

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR

WWE Wipo information: entry into national phase

Ref document number: 2000980603

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 2000980603

Country of ref document: EP