WO1999055042A1 - System and method for establishing a multicast message delivery error recovery tree in a digital network - Google Patents


Info

Publication number
WO1999055042A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
computer
tree
elected
election
Prior art date
Application number
PCT/US1999/007750
Other languages
French (fr)
Inventor
Dah Ming Chiu
Miriam Kadansky
Radia J. Perlman
Original Assignee
Sun Microsystems, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Microsystems, Inc.
Priority to EP99915339A (EP1075747A1)
Priority to AU33877/99A (AU3387799A)
Priority to JP2000545284A (JP2002512484A)
Publication of WO1999055042A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/48 Routing tree calculation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/02 Details
    • H04L12/16 Arrangements for providing special services to substations
    • H04L12/18 Arrangements for providing special services to substations for broadcast or conference, e.g. multicast
    • H04L12/185 Arrangements for providing special services to substations for broadcast or conference, e.g. multicast with management of multicast group membership
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/16 Multipoint routing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/48 Routing tree calculation
    • H04L45/488 Routing tree calculation using root node determination
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/02 Details
    • H04L12/16 Arrangements for providing special services to substations
    • H04L12/18 Arrangements for providing special services to substations for broadcast or conference, e.g. multicast
    • H04L12/1863 Arrangements for providing special services to substations for broadcast or conference, e.g. multicast comprising mechanisms for improved reliability, e.g. status reports

Definitions

  • the invention relates generally to the field of digital networks and more particularly provides an efficient arrangement for establishing a multicast message delivery error recovery tree in a digital network.
  • a number of personal computers, workstations, and other various network resources such as mass storage subsystems, network printers and interfaces to the public telephony system, are typically interconnected in a computer network.
  • the personal computers and workstations are used by individual users to perform processing in connection with data and programs that may be stored in the network mass storage subsystems.
  • the personal computers/workstations operating as clients, download the information, including data and programs, from the network mass storage subsystems for processing.
  • the personal computers or workstations will enable processed data to be uploaded to the network mass storage subsystems for storage, to a network printer for printing, to the telephony interface for transmission over the public telephony system, or the like.
  • the network mass storage subsystems, network printers and telephony interfaces operate as shared resources, since they are available to service requests from all of the clients in the network.
  • the servers are readily available for use by all of the personal computers/workstations in the network. Networks may be spread over a fairly wide area, and may interconnect personal computers, workstations and other devices among a number of companies and individuals.
  • Computers and other resources in a network share information by transferring messages thereamong.
  • Messages can be categorized into two general classes, namely,
  • a unicast message is used by a device to transfer
  • a unicast message includes address information, including a source address, which identifies the
  • a multicast message includes address information, including a source address, which identifies the particular device that transmitted the message, and a destination address that identifies a set of devices (which may be either all of the devices or a selected subset) that are to receive the message in the case of a multicast message
  • if a particular device determines that a unicast message contains a destination address that identifies it (that is, the particular device), or that a multicast message contains a destination address that identifies a set of devices that includes the particular device, it (that is, the particular device) will receive and process the message
  • unicast messages transmitted by a particular source device to a particular destination device include sequencing information so that, if a destination device determines that there is a gap in the sequence of messages received by it from a particular source device, it can notify the source device to request retransmission.
  • the destination device may wait for a period of time after it notices the gap in the sequence to see if the missing messages might eventually be delivered, but, in any case, if the missing messages are not delivered to the destination device at least prior to expiration of the period of time, the destination device will generate the "NACK" message and transmit it to the source device to enable retransmission of the missing messages
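The gap-then-wait behaviour described above can be illustrated with a small sketch. It is not taken from the patent: the class name `GapDetector`, the integer sequence numbers starting at zero, and the injectable `clock` are all assumptions made for illustration.

```python
import time


class GapDetector:
    """Tracks sequence numbers from one source and reports overdue gaps.

    Hypothetical helper: integer sequence numbers starting at 0 are an
    assumption, not something the source text specifies.
    """

    def __init__(self, wait_seconds, clock=time.monotonic):
        self.wait_seconds = wait_seconds
        self.clock = clock
        self.next_expected = 0
        self.missing = {}  # sequence number -> time the gap was first noticed

    def on_message(self, seq):
        now = self.clock()
        if seq >= self.next_expected:
            # Every number skipped between the expected one and this one is a gap.
            for skipped in range(self.next_expected, seq):
                self.missing.setdefault(skipped, now)
            self.next_expected = seq + 1
        else:
            # A late arrival fills a previously noticed gap.
            self.missing.pop(seq, None)

    def overdue(self):
        """Sequence numbers whose waiting period has expired (NACK candidates)."""
        now = self.clock()
        return sorted(s for s, noticed in self.missing.items()
                      if now - noticed >= self.wait_seconds)
```

A receiver built this way would send a "NACK" only for the numbers returned by `overdue()`, so a message that is merely delayed does not trigger a retransmission request.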
  • a similar operation occurs in connection with multicast messages. That is, if a destination device determines that a multicast message whose destination address identifies a set that includes the destination device, in a sequence of such multicast messages from the same source device, has not been received, it (that is, the destination device) will generate a "NACK" message for transmission to the source device requesting retransmission of the message, or at least transmission of the information in the message in a unicast message to the destination device. However, unless failure to receive a multicast message by one
  • the invention provides a new and improved arrangement for establishing a multicast message delivery error recovery tree in a digital network which may be used to reduce the number of "negative acknowledgment" ("NACK") messages which may be generated for transmission to a particular source device, which generates a multicast message for transmission to a plurality of destination devices, by destination devices if they fail to receive a multicast message generated and transmitted by the source device
  • NACK negative acknowledgment
  • the invention provides a system for, and a method of, logically organizing, in a digital data network comprising a plurality of devices interconnected by a communication link, into a tree structure.
  • Each of the devices has an associated suitability value that generally relates to the device's suitability for becoming a node in the tree structure
  • the devices organize themselves
  • suitability values are such that they can become nodes in the tree broadcast over the communication
  • the devices in the network communicate with at least one of the device or devices which is or are
  • FIG 1 is a schematic diagram of a local area network including computers which establish
  • FIG 2 is a logical diagram of the local area network depicted in FIG 1 , logically organized
  • FIGS 3A and 3B depict structures of messages used in establishing a multicast message
  • FIGS 4A through 4G comprise flow diagrams depicting operations performed by the
  • FIG 1 is a schematic diagram of a local area network 10 including an arrangement for generating or constructing a multicast tree in the network, in accordance with the invention
  • local area network 10 includes a plurality of computers 11(1) through 11(N) (generally identified by reference numeral 11(n)) interconnected by a communication link 13
  • computers 11(1), ..., 11(N-1) are in the form of personal computers or computer workstations, each of which includes a system unit, a video display unit and operator input devices such as a keyboard and mouse
  • the computer 11(N) also includes a system unit, and may also include a video display unit and operator input devices
  • the computers 11(n) are of the conventional stored-program computer architecture
  • a system unit generally includes processing, memory, mass storage devices such as disk and/or tape storage elements and other elements (not separately shown), including network interface devices represented by respective arrows 14(n) for interfacing the respective computer to the communication link 13
  • a video display unit permits the computer to display processed data and processing status to
  • the communication link 13 interconnecting the computers 11(n) in the network 10 may, as is conventional, comprise wires, optical fibers or other transmission media, and may further comprise switches, for, respectively, carrying and switching signals representing messages among the computers 11(n)
  • each of the computers 11(n) typically includes a network interface device 14(n), which connects the respective computer to the communication link 13
  • the transmission media and switches comprising communication link 13 may interconnect the computers 11(n) in any convenient topology. Information is transferred among the computers 11(n) in the form of messages
  • Each message contains a header portion, which generally contains information that is useful in controlling the transfer of the message from the source computer 11(s), that is, the computer 11(n) that transmits the message, to the destination computer 11(d), that is, the computer or computers that is/are to receive the message, and a data portion, which generally contains information that is to be transferred
  • the information contained in the header portion includes message transfer protocol information, including, inter
  • the message transfer protocol information that is provided in the header of a message can also include so-called "time to live” (“TTL”) information, which can limit the scope of transmission of a message
  • TTL time to live
  • the time to live information may be used to limit the scope of transmission of a message to, for example, a particular local area network, such as local area network 10, to various groupings of contiguous local area networks in the wide area network, or to the entire wide area network.
  • if a computer 11(n) in the local area network 10 transmits a message that contains a multicast address and time to live information indicating that the scope of transmission is limited to the local area network 10, then the message will be received and processed only by the computers 11(n) in the local area network 10 which are conditioned to respond to that multicast address. Typically, the time to live information is used to control whether gateways (not shown) which are used to interconnect local area networks will transmit a message that they receive on one local area network onto another local area network.
  • the time to live information is in the form of an integer numerical value
  • when a gateway receives a message, it will decrement the time to live value and, if the decremented time to live value is greater than zero, transmit the message onto the other local area network
  • otherwise, the gateway will not transmit the message received from the one local area network onto the other local area network
  • a computer 11(n) can limit the scope of transmission of a message to its local area network by transmitting a time to live value of "one" in the message
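The gateway rule described above (decrement, then forward only if the result is greater than zero) can be sketched as follows. The function names are hypothetical and chosen only for illustration.

```python
def forward_through_gateway(ttl):
    """Return the TTL a gateway forwards with, or None if it drops the message.

    Hypothetical helper mirroring the rule in the text: decrement first,
    then forward only when the decremented value is greater than zero.
    """
    decremented = ttl - 1
    return decremented if decremented > 0 else None


def gateways_crossed(ttl):
    """Count how many gateways a message with the given initial TTL can cross."""
    crossed = 0
    while True:
        ttl = forward_through_gateway(ttl)
        if ttl is None:
            return crossed
        crossed += 1
```

With an initial value of one, `gateways_crossed(1)` is zero, which is why a time to live value of "one" confines a message to the local area network on which it was sent.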
  • the local area network 10 schematically depicted in FIG 1 comprises a part of a wide area network, such as a private wide area network maintained by a particular enterprise and/or a public wide area network such as the Internet. Similar to the local area network 10, the other portions of the wide area network that are not depicted in FIG.
  • each local area network can share information by transferring respective unicast or multicast messages in a manner similar to that described above in connection with local area network 10
  • a computer in one local area network can share information with computers in other local area networks by transmitting respective unicast or multicast messages in a manner similar to that described above in connection with local area network 10
  • the invention provides an arrangement for facilitating the logical organization of at least some of the computers 11(n) in the network 10 into a multicast message delivery error recovery tree
  • a multicast message delivery error recovery tree is used to provide for efficient error recovery in the event that one or more computers in the network (which may comprise either the local area network 10 or a wide area network including local area network 10) fail to receive a multicast message transmitted by a source computer 11(s).
  • a computer 11(n) is to be a destination computer 11(d) to receive a message that is transmitted by a source computer 11(s)
  • if the destination computer 11(d) fails to receive the message, it will generate a "negative acknowledgment" ("NACK") message for transmission to the source computer 11(s) to enable the source computer 11(s) to retransmit the message to the destination computer 11(d)
  • a destination computer 11(d) can, for example, use message sequencing information that is generally provided in the protocol information in the headers of respective messages in a sequence transmitted by the source computer 11(s) to determine whether it failed to receive a particular message.
  • a problem can arise, however, if a number of destination computers 11(d) do not receive a multicast message that is transmitted by a source computer 11(s), and generate respective "NACK" messages for transmission to the source computer 11(s).
  • a multicast message delivery error recovery tree can assist in reducing the number of "NACK" messages which are transmitted to the source computer 11(s) of a multicast message, which can serve to reduce the negative effect on processing thereby which might otherwise arise, particularly in a wide-area network in which large numbers of computers may be destinations of a multicast message.
  • the invention provides an arrangement for efficiently organizing the computers 11(n) in the local area network into a multicast message delivery error recovery tree.
  • An illustrative example of a multicast message delivery error recovery tree, identified by reference numeral 20, is depicted in FIG 2.
  • the computers 11(n) comprising local area network 10 are "logically interconnected" (as will be described below) in a manner represented by the dashed lines interconnecting the various computers depicted in FIG. 2
  • In the illustrative multicast message delivery error recovery tree 20
  • computer 11(R), which forms the root node in the tree, is the parent of the computers 11(R)(1) and 11(R)(2), which form the second level in the tree, and additionally is the parent of computer 11(R)(3), which forms a leaf node in the tree 20.
  • each computer 11(R)(1), 11(R)(2) and 11(R)(3) is a "child" of the computer 11(R) in the tree 20.
  • computer 11(R)(1) is the parent of computers 11(R)(1)(1) and 11(R)(1)(2) (which comprise nodes in the third level of tree 20) and 11(R)(1)(3) (which is a leaf node in tree 20), and each computer 11(R)(1)(1), 11(R)(1)(2) and 11(R)(1)(3) is a child of computer 11(R)(1).
  • a computer other than the computer that forms the root node will have one parent, but it can have one or more children.
  • a computer that forms a leaf node in the multicast tree is logically connected to a computer in a higher level, but no computer in any lower level and no computer forming a leaf node logically connects to it (that is, the computer 11(R)(a)(b)...(l) that forms the leaf node).
  • a computer 11(R)(a)(b)...(l) that forms a leaf node may connect to a computer at any level in the tree. It will be appreciated that the tree need not be balanced, that is, there may be different numbers of levels between the computer 11(R) which forms the root node, and the various computers 11(R)(a)(b)...(l) which form leaf nodes in the tree, and various non-leaf nodes comprising the tree 20 may have different numbers of children.
  • the number of levels between the computer 11(R) which forms the root node and a computer 11(R)(a)(b)...(l1) which forms a leaf node may differ from the number of levels between the computer 11(R) and another computer 11(R)(a)(b)...(l2) which forms a leaf node, and so forth for all computers 11(R)(a)(b)...(l) which form leaf nodes.
  • the local area network 10 in which the multicast message delivery error recovery tree 20 is formed is part of a wide area network including a plurality of local area networks
  • the multicast message delivery error recovery tree 20 depicted in FIG. 2 forms a sub-tree of a larger tree (not shown) as represented by the double-headed arrow associated with the legend "TO/FROM TREE ASSOCIATED WITH WIDE AREA NETWORK" in FIG. 2.
  • although the computer 11(R) which forms the root node in the multicast message delivery error recovery tree 20 for local area network 10 does not have a parent within the tree 20 organized as depicted in FIG 2, it will have a parent in the larger tree
  • the parent, and computers in levels thereabove in the larger tree formed in the wide area network, may be in other local area networks in the wide area network
  • computers at higher levels in the tree formed in the wide area network may include computers in the local area network 10
  • a computer in the local area network 10 may be at a node at a lower level or comprise a leaf node 11(R)(a)(b)...(l) in the multicast message delivery error recovery tree 20, and may in addition comprise a node at a higher level than the computer 11(R) that forms the root node in the multicast message delivery error recovery tree 20 formed in the local area network 10
  • the multicast message delivery error recovery tree 20 is used to provide a mechanism whereby, in the event that one or more of the computers in the network 10 fail to receive a multicast message from the computer which forms the root node in the tree formed in the wide area network, of which the local area network 10 is a part, those computers can efficiently request the re-transmission of a copy of the multicast message thereto
  • the tree 20 provides a mechanism whereby a computer, such as computer 11(R)(1)(1)(1), in the network 10, if it fails to receive a multicast message from another computer as source computer 11(s), instead of generating a "NACK" message for transfer to the source computer 11(s) requesting re-transmission of a copy of the multicast message, can generate a "NACK" message for transfer over communication link 13 to its parent computer 11(R)(1)(1) in multicast message delivery error recovery tree 20, the "NACK" message requesting it (that is, the parent computer 11(R)(1)(1)) to
  • If that parent computer 11(R)(1)(1) has a copy of the multicast message, it (that is, parent computer 11(R)(1)(1)) will transmit the multicast message over the communication link 13.
  • the child computer 11(R)(1)(1)(1) which requested the copy of the multicast message can then receive the multicast message transmitted by the parent computer 11(R)(1)(1).
  • the other computers 11(n) in the local area network 10 can also receive the multicast message and use it if they had not previously received it; the other computers 11(n) in the local area network, that is, the computers which previously received the message, can ignore the message.
  • the parent computer 11(R)(1)(1), when it transmits the multicast message, will transmit the message with a time to live value that will limit its transmission to the local area network 10, so that the message is not transmitted to other portions of the wide area network which may have received the multicast message.
  • if the computer 11(R)(1)(1) does not have a copy of the multicast message, it will generate a "NACK" message for transfer over communication link 13 to its parent computer 11(R)(1) requesting a copy therefrom. If that computer 11(R)(1) has a copy of the multicast message, it (that is, the computer 11(R)(1)) will transmit the multicast message over the communication link 13 as described above, to provide the multicast message to its child and grandchild computers 11(R)(1)(1) and 11(R)(1)(1)(1), along with any other computers 11(n) which had not previously received the message.
  • if the computer 11(R)(1) does not have a copy of the multicast message, it, in turn, will generate a "NACK" message for transfer over communication link 13 to its parent, namely, the computer 11(R) which forms the root node in the multicast tree 20, requesting a copy therefrom. If the computer 11(R) has a copy of the requested multicast message, it will transmit the multicast message over the communication link 13 as described above.
  • if the computer 11(R) which forms the root node in the multicast message delivery error recovery tree 20 determines that it does not have a copy of the requested message, it will attempt to obtain a copy.
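The upward escalation of "NACK" messages described above can be sketched as a walk from the requesting computer toward the root. This is an illustrative model only: the `TreeNode` class, the `held` set of sequence numbers, and the `find_repairer` function are assumptions, not structures defined in the source.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TreeNode:
    """Hypothetical model of one computer in the error recovery tree."""
    name: str
    parent: Optional["TreeNode"] = None
    held: set = field(default_factory=set)  # sequence numbers this node has a copy of


def find_repairer(requester, seq):
    """Walk NACKs upward from the requester's parent.

    Returns (repairer, nack_path): the first ancestor holding a copy (or None
    if even the local root lacks one) and the names of the nodes the NACK
    reached, in order.
    """
    path = []
    node = requester.parent
    while node is not None:
        path.append(node.name)
        if seq in node.held:
            return node, path
        node = node.parent
    return None, path
```

A result of `None` corresponds to the case in which the root node 11(R) does not have the message and must itself attempt to obtain a copy. In every other case the source computer 11(s) receives no "NACK" at all.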
  • the use of the multicast message delivery error recovery tree 20, and associated tree formed in the wide area network, can serve to substantially reduce the number of "NACK" messages that are transmitted to the source computer 11(s) for a multicast message. If a multicast message is not received by a particular destination computer, it is frequently the case that a large number of destination computers 11(d) will not receive the multicast message, not just a single destination computer. In that case, the destination computers which fail to receive the multicast message will generate only one "NACK" message. Thus, for example, if a computer which forms a node in the multicast message delivery error recovery tree 20 has already generated a "NACK" message requesting a copy of a particular multicast message for transmission to its parent in the tree 20, either because it itself failed to receive the multicast message or in response to receipt of a "NACK" message from one of its children requesting the multicast message, and it later receives a "NACK" message from another child requesting the same multicast message, it will not
  • the portion of the larger tree above the multicast message delivery error recovery tree 20 established for the local area network 10 will be predetermined by, for example, a system administrator, and one computer in the wide area network will be identified as a parent for the computer 11(R) in the local area network 10 that is selected as the parent for the root node in the tree 20
  • the computers 11(n) in the local area network 10 themselves determine the logical organization of the multicast message delivery error recovery tree 20 for the local area network 10. Operations performed by the computers 11(n) in establishing the multicast message delivery error recovery tree 20 in the local area network 10 will be described below in connection with the flow chart depicted in FIG 4
  • the computers 11(n) make use of three types of messages, namely, a node election message type, a node advertising message type, and a child solicitation message type
  • the establishment of a multicast message delivery error recovery tree 20 proceeds in a plurality of iterations, each iteration including two phases
  • FIG 3A depicts the structure of a node election message 30 used by computers 11(n) in one embodiment of the invention during the node election phase of each iteration
  • each of the computers 11(n) that can become a node in the multicast message delivery error recovery tree, referenced as "participating" in the iteration, multicasts node election messages that identify its suitability (as will be described below) for becoming a node in the tree
  • the node election messages also identify a number of nodes that will be selected during the iteration
  • the computers 11(n) essentially select themselves as nodes at the end of the iteration
  • node election message 30 includes a header portion 31 and a data portion 32
  • the header portion 31 contains message transfer protocol information, including a message type identifier, source and destination addresses, and a time to live value
  • the message type identifier in the header portion 31 identifies the message as being of the node election type
  • The data portion 32 contains a number of fields that are used during the iteration in selecting the computers 11(n) that are to be nodes in the tree 20, including a priority field 33, a capacity field 34, a computer system identifier field 35, a number of nodes field 36 and an election interval field 37
  • the priority field 33, capacity field 34 and computer system identifier field 35 contain values that serve to rank the participating computers in their relative suitability to become a node during the iteration
  • the priority field 33 contains a priority value that identifies a relative priority for the computer 11(n) that generates the node election message to be a node
  • Illustrative priority values used in priority field 33 include, for example, a highest "must be" a node value, a relatively high "eager" to be a node value, and
  • the number of nodes field 36 contains a value that identifies the number of computers 11(n) that are to be selected as nodes during the iteration
  • the value contained in the number of nodes field 36 will be "one." If a subsequent iteration is needed to select additional computers to operate as nodes in the multicast message delivery error recovery tree 20, the computer 11(R) previously selected as the root node in the tree 20 will provide a value identifying the number of computers 11(n) that are to be selected as nodes in the subsequent iteration
  • the election interval field 37 identifies the time interval between transmissions of node election messages by the computer system which generates the respective message
  • the values in the priority field 33, capacity field 34 and computer system identifier field 35 are used to rank the participating computers 11(n) as to their relative suitability to be selected as nodes during the iteration. Generally, the ranking is based first on the priority value, so that those computers 11(n) which have the higher priority values will be deemed more suitable and those which have lower priority values will be deemed less suitable
  • suitability is determined based on their respective capacity values, which identify the number of children that they can accommodate; that is, computers 11(n) with higher capacity values will be deemed more suitable and computers 11(n) with lower capacity values will be deemed less suitable, within the ranking as determined by their priority values
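The ranking described above (priority first, then capacity) maps naturally onto tuple comparison. The sketch below is illustrative only: the `Candidate` type, the integer encoding of priority, and the use of the computer system identifier as a final tie-breaker with higher values winning are assumptions; the source states only that the three fields together rank the participating computers.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    """Hypothetical view of the ranking fields of a node election message."""
    priority: int    # assumed integer encoding; higher means more suitable
    capacity: int    # number of children the computer can accommodate
    system_id: int   # assumed tie-breaker; "higher wins" is an assumption


def elect(candidates, number_of_nodes):
    """Return the most suitable candidates, best first.

    Tuples compare field by field, so priority dominates, capacity breaks
    priority ties, and system_id breaks any remaining tie.
    """
    return sorted(candidates, reverse=True)[:number_of_nodes]
```

Because every participating computer can apply the same comparison to the messages it hears, each one can decide locally whether it falls within the top `number_of_nodes` without a central coordinator.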
  • computers 11(n) with the same priority and capacity values the
  • CSID VAL represents a field for the computer system identifier value
  • FIG 3B depicts the structure of a node advertising message 40 used in one embodiment of the invention during the second phase of each iteration
  • the computer 11(n) that transmits the node advertising message 40 also provides some information to the other computers as to the status of the computer 11(n), which they (that is, the
  • node advertising message 40 includes a header portion 41 and
  • the header portion 41 contains message transfer protocol information, including a message type identifier and source and destination addresses, and a time to live value
  • the message type identifier in the header portion 41 identifies the message as being of the node advertising type
  • the source address identifies the particular computer 11(n) that generated and transmitted the node advertising message
  • the destination address identifies the multicast address
  • the time to live value is selected to ensure that the node advertising messages are transmitted only in the local area network
  • the data portion 42 contains a number of fields that are used during the iteration in notifying the other computers in the local area network 10 as to the logical connection availability of the computer 11(n) that generates the message 40, including a priority field 43, a capacity field 44, a computer system identifier field 45, an advertising interval field 46, a number of children field 47, a connected field 48, a number of nodes field 49 and a status field 50
  • the priority field 43, capacity field 44 and computer system identifier field 45 contain priority, capacity and computer system identifier values that correspond to the respective values in fields 33, 34 and 35 of node election message 30 as described above
  • the advertising interval field 46 identifies the time interval between transmissions of a node advertising message 40 by the particular computer 11(n) that generates the node advertising message 40
  • the number of children field 47 of node advertising message 40 contains a value that identifies the number of other computers to which the particular computer 11(n) is currently logically connected when it (that is, the particular computer 11(n)) transmits the node advertising message 40
  • a computer that receives a node advertising message 40 from the computer 11(n) can determine an excess capacity value for computer 11(n), which is a measure of the ability of the computer 11(n) to take on additional children, by determining the difference between the capacity value in field 44 and the number of children value in field 47
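The excess capacity calculation described above is a simple difference of two advertised fields. The sketch below is hypothetical: the `NodeAdvert` type and the `pick_parent` policy of preferring the node with the most spare capacity are assumptions made for illustration, since the source says only that excess capacity is a measure of a node's ability to take on additional children.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeAdvert:
    """Hypothetical subset of the fields of a node advertising message."""
    system_id: int
    capacity: int            # value from the capacity field
    number_of_children: int  # value from the number of children field

    @property
    def excess_capacity(self):
        # Difference between advertised capacity and current children.
        return self.capacity - self.number_of_children


def pick_parent(adverts):
    """Assumed policy: choose the advertiser with the most spare capacity.

    Returns None when no advertiser can accept another child.
    """
    open_nodes = [a for a in adverts if a.excess_capacity > 0]
    return max(open_nodes, key=lambda a: a.excess_capacity, default=None)
```

Other selection policies are possible; the only part taken directly from the text is the subtraction in `excess_capacity`.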
  • a computer may use the excess capacity value for several purposes, including determining whether it should attempt to logically connect as a child to a particular computer 11(n) based on its excess capacity in relation to the excess capacity values of computers which form other nodes in the multicast message delivery error recovery tree 20. Other uses of the excess capacity value will be described below
  • the connected field 48 of node advertising message 40 contains a connected flag that
  • the number of nodes field 49 identifies the current number of nodes that have been selected for the tree 20.
  • the status field 50 provides information as to the computer's status in the tree 20. In one embodiment the status field 50 normally has an "active" value, which indicates that the computer 11(n) that generates the node advertising message 40 is currently a node in the tree 20. However, as will be described below, the status field 50 can also have a "resigning" value, which indicates that the computer 11(n) that generates the node advertising message 40 is a node in tree 20, but is in the process of resigning as a node and converting to a leaf. If status field 50 of the node advertising message 40 transmitted by a computer 11(n) indicates the "resigning" status, its children in the tree will need to attempt to logically connect to other computers that are nodes in the tree.
  • the first iteration will start when at least one of the computers 11(n) of the local area network 10 has been powered on and initialized. After a computer 11(n) has been initialized, if it is associated with a priority value that would allow it to be a node in the multicast message delivery error recovery tree 20, that is, if it is associated with a priority value other than "leaf only," it will begin periodically transmitting node election messages 30.
  • If, within a selected time interval after beginning transmitting node election messages 30, the computer 11(n) does not receive a node election message 30 from other computers in the network, it (that is, computer 11(n)) will determine that it is "elected" as a node in the multicast message delivery error recovery tree 20, and stop transmitting node election messages 30. Since the computer 11(n) comprising the elected node is, at this point, the only node in the tree 20, it (that is, the computer 11(n)) will comprise the root node 11(R) in the tree 20. After the computer 11(n) determines that it is the elected node, it will begin transmitting node advertising messages 40 to notify other computers in the local area network 10 of its availability to accept children.
  • the computer 11(n) can increment its priority value by a predetermined "elected node" increment value, the purpose for which will be described below
  • if the computer 11(n) receives a node election message 30 from at least one other computer 11(n') in the local area network 10, it will compare the suitability value (that is, as described above, the concatenation of the priority value, capacity value and computer system identifier value from fields 33, 34 and 35) from the received node election message to its own suitability value (that is, the concatenation of its priority value, capacity value and computer system identifier value) that it is providing in the node election messages 30 that it is transmitting. If the computer 11(n) determines that its suitability value is less than the suitability value in the received node election message, it will stop transmitting node election messages. On the other hand, if the computer 11(n) determines that its suitability value is greater than the suitability value in the received node election message, it will continue transmitting node election messages. The other computer 11(n') will perform similar
  • Each computer in the local area network with a priority level above "leaf only" will perform similar operations
  • one of the computers if it is still transmitting node election messages, then it has not received a node election message from a computer in the local area network which has a higher suitability value, and so it (that is, the still-transmitting computer) will determine that it is "elected" a node, in this case the root node, in the tree 20, and therefore will constitute the computer 11(R) (reference FIG. 2) in the tree 20.
  • After computer 11(R) determines that it is the elected node, it will stop transmitting node election messages 30 and begin transmitting node advertising messages 40 to notify other computers in the local area network 10 of its availability to accept children. Other computers in the local area network 10 that have been powered on and initialized can, after receiving a node advertising message 40, communicate with the computer 11(R) to attempt to become children of the elected computer in the tree 20. In addition, the computer 11(R) can increment its priority value by the predetermined "elected node" increment value.
  • if the computer 11(R) that forms the root node of the multicast message delivery error recovery tree 20 determines, during the communications with the other computers in the local area network 10, that it has the capacity to accommodate all of the other computers as leaves, it can logically connect to the other computers at that point.
  • if the computer 11(R) determines that more computers are requesting to become children thereof than it can accommodate, it (that is, the computer 11(R)) can initiate a second iteration in the tree establishment process, to enable election of additional nodes for the multicast message delivery error recovery tree 20. It will be appreciated that the additional nodes will comprise one or more levels in the tree 20 below the root level. To initiate the second iteration, the computer 11(R) will resume transmitting node election messages 30. In this iteration, the node election messages 30 generated by the computer 11(R) will contain a number of nodes value in field 36 that is greater than one by an amount corresponding to the number of new nodes that are to be elected. In the node election messages generated by the computer 11(R), the priority value in field 33 will reflect the incremented priority value, that is, its original priority value incremented by the predetermined "elected node" increment value.
  • each of the other computers in the local area network 10 which can also become nodes in the tree (that is, each computer other than computer 11(R) whose priority value is greater than "leaf only") will also resume transmitting node election messages 30
  • each of the other computers will use their own priority, capacity and computer system identifier values in fields 33 through 35 of their node election messages, but will use the number of nodes and election interval values from fields 36 and 37 in the node election messages 30 received from the computer 11(R) in node election messages that they transmit. While transmitting node election messages, each of the computers will also receive node election messages from the other computers and will compare the suitability values therefrom with their own suitability values. If a computer that is transmitting node election messages 30 determines that it has received node election messages 30 with higher suitability values than its (that is, the computer's) suitability value, from a number of other computers corresponding to the number of nodes value in field 36, it (that is, the computer
  • the computer 11(R) that was elected the root node during the previous iteration may, but need not, have the highest suitability value (even with the priority value incremented by the predetermined "elected node" increment value) as among the elected nodes, since one or more computers associated with higher suitability values than the computer 11(R) may have been powered on and initialized and begun transmitting node election messages. If the computer 11(R) that was elected as the root node during the previous iteration has a priority level, as incremented by the "elected node" increment value, that is higher than the other computers, then it will remain the root node. However, if another computer has a priority level that is sufficiently high that its suitability value is higher than the suitability of computer 11(R), the other computer will be elected root node during the current iteration. Indeed, if sufficient numbers of newly-transmitting computers have higher suitability values than the previously elected computer 11(R), that computer
  • the elected computers will transmit node advertising messages 40, as in the first iteration, to notify other computers in the local area network 10 of their availability to accept children.
  • the computers elected during the iteration other than the computer 11(R) that forms the root node for the tree 20, will receive the node advertising messages 40 transmitted by the computer 11(R) and will communicate with it to become logically connected thereto, thereby to establish them as forming the second level of the tree. If a newly-elected node cannot logically connect to the computer 11(R), it may logically connect to another newly-elected computer to begin the third level in the tree. The other, non-elected computers will communicate with any of the elected computers to negotiate formation of logical connections therewith.
  • if the root node determines that additional nodes of the tree are required, it can repeat the operations through subsequent iterations.
  • the computers newly-elected to form node(s) in the tree 20 will increment their priority values by the predetermined "elected node" value, and, if a previously-elected computer is "de-elected," it will reduce its priority value to the original priority value.
  • These operations will continue until preferably all of the computers 11(n) in the local area network 10 form part of the multicast message delivery error recovery tree 20, either as nodes or leaves.
  • the root node 11(R) will logically connect as a child to a node specified for the local area network in the tree for the wide area network.
  • the computers 11(n) comprising nodes in the tree will continually transmit node advertising messages 40, so that newly-initialized computers can identify nodes in the multicast message delivery error recovery tree 20 to which they may logically connect.
  • a computer 11(n) in the local area network, after it is powered on and initialized, if it has a priority value that is other than "leaf only," will begin transmitting node election messages
  • the computers 11(n) will repeat the tree-establishment operations as described above
  • the computers 11(n) which form nodes in the tree will use their priority values, as incremented by the "elected node" increment value, so that the tree 20 will not be disturbed during the new tree-establishment operations unless the newly-initialized computer has a sufficiently high priority value
  • if the newly-initialized computer has a priority value that is higher than the priority value associated with the computer 11(R) forming the root node for the tree 20, it (that is, the newly-
  • the computers 11(n) will periodically repeat the election process. In that operation, any of the computers 11(n), other than a computer whose priority level is "leaf only," can transmit a node election message
  • the node election message transmitted by a computer 11(n) will include the computer's suitability value, including its priority level either as incremented by the "elected node" increment value if the computer currently forms a node in the tree 20, or not incremented if the computer does not form a node in the tree 20
  • the computers 11(n) that form nodes in the tree 20 attempt to optimize the tree. In that case
  • a node election message transmit sequence depicting operations performed by a node election message transmitter maintained by the computer 11(n) in connection with transmission of node election messages
  • depicting operations performed by a node election message receive sequence
  • FIGS. 4C and 4D depicting operations performed by a node election message receiver maintained by the computer 11(n) in connection with reception of node election messages
  • FIGS. 4E through 4G depicting operations performed by a tree organization component maintained by the computer 11(n) in connection with logically connecting the computer 11(n) to a respective parent and children as appropriate
  • a computer 11(n) will begin transmitting node election messages 30 either
  • the computer 11(n) will, after being powered up and initialized (step 100), initially determine whether it has a priority level greater than "leaf only" (step 101). If the computer 11(n) makes a negative determination in step 101, that is, if it has a priority level of "leaf only," it will operate only as a leaf in a multicast message delivery error recovery tree 20 established for local area network 10. In that case, the computer 11(n) can wait for a node advertising message 40 from a computer that forms a node in a multicast message delivery error recovery tree 20, or it can begin transmitting child solicitation messages to enable computers that form nodes in a tree 20 to transmit node advertising messages 40.
  • if the computer 11(n) makes a positive determination in step 101, that is, if it has a priority level above "leaf only," it can operate as a node in a multicast message delivery error recovery tree 20 established in local area network 10. In that case, computer 11(n) will proceed to a series of steps to initiate transmission of node election messages 30. In that operation, the
  • computer 11(n) will establish and initialize a node election message period timer (step 102) to be
  • the computer 11(n) will establish and initialize a number of nodes store (step 103),
  • step 104 that will be used to count the number of computers 11(n) from which it receives
  • step 105 If the computer 11(n) makes a positive determination in step 105, it will transmit a node
  • election message including its priority, capacity and computer system identifier values in fields 33,
  • step 108 interval timer established in step 107 has timed out (step 108), if it determines that the node election message period timer has not timed out (step 109), the computer 11(n) will return to step 104 to
  • step 105 that the value provided by node counter is greater than or equal to the number of nodes value, or
  • step 108 that the node election message period timer has timed out
  • step 105 If the computer 11(n) determines in step 105 that the value provided by node counter is greater than
  • the local area network 10 (step 120, FIG. 4C), it will initially determine whether it (that is, the
  • step 121 makes a negative determination in step 121, that is, if it determines that it is not currently
  • step 122 the computer 11(n) will determine whether the number of nodes value in
  • step 123 If the computer 11(n) makes a positive determination in step 123, that is, if it
  • the computer 11(n) will increase the number of nodes value in its number of nodes store (step 124) and adjust its node election message period time interval to be used when the node election message transmission
  • step 107 interval timer is next established in step 107 (step 125) for use during transmission of node election
  • step 126 If the computer 11(n) makes a positive determination in step 126, then it determines whether it
  • step 127) previously received a node election message 30 from computer 11(n') during the node election message transmission interval (step 127) and, if so, will increment the node counter used to control node election message transmission (step 128).
  • the computer 11(n) will perform operations, depicted in FIG. 4E, in connection with logically connecting to other computers as parent and/or children as appropriate.
  • the computer 11(n) will first determine whether it is the root node in the tree (step 140). In that operation, the computer 11(n) can determine whether it is the root node by determining whether the node counter has the value zero, which would indicate that it (that is, the computer 11(n)) did not receive node election messages 30 from any other computer in the local area network with higher suitability values than that of computer 11(n). If the computer 11(n) makes a positive determination in step 140, it will attempt to form a logical connection to the parent assigned to the local area network in the portion of the tree formed therefor in the wide area network (step 141).
  • if the computer 11(n) makes a negative determination in step 140, it will determine whether it is a node other than the root node in the multicast message delivery error recovery tree 20. In that operation, the computer 11(n) can determine whether the number of nodes value stored in the number of nodes store is higher than the value provided by the node counter (step 142). If the computer 11(n) makes a positive determination in step 142, then it will form a node in the tree; on the other hand, if the computer 11(n) makes a negative determination in step 142, then it received node election messages 30 from at least a number of other computers in the local area network 10 corresponding to the number of nodes value identifying the number of nodes to be elected.
  • step 141 the computer 11(n) will first increase its priority value by the "elected node" value (step 143). Thereafter, the computer 11(n) will begin transmitting node advertising messages to identify the availability of the computer 11(n) to logically connect to other computers as children (step 144).
  • the computer 11(n) determined in step 140 it will respond to the node advertising messages generated by other computers which have priority, capacity and computer system identifier values in fields 43-45 which define suitability values greater than its own to attempt to logically connect to them as children, thereby to establish respective levels in the tree 20 (step 145)
  • After the computer 11(n) has logically connected to another computer as parent of the other computer, it will increase the number of children value in field 47 of node advertising messages transmitted thereby (step 146).
  • the computer 11(n) has been accepted by another computer as a child of the other computer, it will be logically connected to that other computer and can save the identification of the other computer for use in sending "NACK" messages, and set the connected flag 48 in node advertising messages 40 transmitted thereby (step 147)
  • the computer 11(n) determines that the multicast message delivery error recovery tree 20 has been established (step 148), it will determine whether its suitability value is the lowest among the computers elected nodes in the tree (step 149)
  • the computer 11(n) can do that by, for example, determining whether the value provided by its node counter corresponds to the number of nodes value in field 49 of the node advertising messages 40 transmitted by the computers comprising nodes in the tree 20. If the computer 11(n) makes a positive determination in step 149, it will determine whether the excess capacity, as described above, of the computers comprising the other nodes is greater than the number of computers logically connected to it as children (step 150). If the computer 11(n) makes a positive determination in step 150, if it has any computers logically connected thereto as children (step 151) it will provide a value of "resigning" in the status field 50 of node advertising messages 40 transmitted thereby to enable its children to attempt to
  • a system in accordance with the invention can be constructed in whole or in part from special purpose hardware or a general purpose computer system, or any combination thereof, any portion of which may be controlled by a suitable program. Any program may in whole or in part comprise part of or be stored on the system in a conventional manner, or it may in whole or in part be provided to the system over a network or other mechanism for transferring information in a conventional manner.
  • the system may be operated and/or otherwise controlled by means of information provided by an operator using operator input elements (not shown) which may be connected directly to the system or which may transfer the information to the system over a network or other mechanism for transferring information in a conventional manner.
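
The excess capacity value mentioned in the list above (capacity value in field 44 minus number of children value in field 47) can be illustrated with a minimal Python sketch. The class name, attribute names and the "most excess capacity wins" selection policy are assumptions made for illustration; the text says only that a computer may weigh one node's excess capacity against that of other nodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeAdvertisement:
    """Illustrative subset of node advertising message 40 (names are hypothetical)."""
    sender_id: int      # computer system identifier of the advertising node
    capacity: int       # capacity value (field 44)
    num_children: int   # number of children value (field 47)

    @property
    def excess_capacity(self) -> int:
        # Difference between the capacity value and the number of children value.
        return self.capacity - self.num_children


def pick_parent(ads):
    """Assumed policy: choose the node with the most room, or None if all are full."""
    candidates = [ad for ad in ads if ad.excess_capacity > 0]
    if not candidates:
        return None
    return max(candidates, key=lambda ad: ad.excess_capacity)
```

A prospective child would call `pick_parent` on the advertisements it has heard and then attempt to connect to the returned node.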

Abstract

In a digital data network, a plurality of devices interconnected by a communication link organize themselves into a tree structure. Each of the devices has an associated suitability value that generally relates to the device's suitability for becoming a node in the tree structure. The devices organize themselves into a tree structure in one or more iterations, each iteration comprising two general steps, namely, a node election step and a tree establishment step. In the node election step, the devices whose suitability values are such that they can become nodes in the tree broadcast over the communication link node election messages including their respective suitability values. These devices also receive the node election messages that are broadcast by other devices. Each device determines whether it is elected a node in the tree structure in connection with a comparison between its suitability value and suitability values of node election messages received thereby. During the tree establishment step, the devices in the network communicate with at least one of the device or devices which is or are elected respective nodes in the tree structure to facilitate becoming respective children thereof.

Description

SYSTEM AND METHOD FOR ESTABLISHING A MULTICAST MESSAGE DELIVERY ERROR RECOVERY TREE IN A DIGITAL NETWORK
FIELD OF THE INVENTION
The invention relates generally to the field of digital networks and more particularly provides an efficient arrangement for establishing a multicast message delivery error recovery tree in a digital network.
BACKGROUND OF THE INVENTION
In modern "enterprise" digital data processing systems for use in an office environment in a company, a number of personal computers, workstations, and other various network resources such as mass storage subsystems, network printers and interfaces to the public telephony system, are typically interconnected in a computer network. The personal computers and workstations are used by individual users to perform processing in connection with data and programs that may be stored in the network mass storage subsystems. In such an arrangement, the personal computers/workstations, operating as clients, download the information, including data and programs, from the network mass storage subsystems for processing. In addition, the personal computers or workstations will enable processed data to be uploaded to the network mass storage subsystems for storage, to a network printer for printing, to the telephony interface for transmission over the public telephony system, or the like. In such an arrangement, the network mass storage subsystems, network printers and telephony interfaces operate as shared resources, since they are available to service requests from all of the clients in the network. By organizing the network in such a manner, the servers are readily available for use by all of the personal computers/workstations in the network. Networks may be spread over a fairly wide area, and may interconnect personal computers, workstations and other devices among a number of companies and individuals.
Computers and other resources (generally, "devices") in a network share information by transferring messages thereamong. Messages can be categorized into two general classes, namely, unicast messages and multicast messages. A unicast message is used by a device to transfer information to one other device in the network, whereas a multicast message is used by a device to transfer information to all of the devices, or a selected subset of the devices, in the network. Each unicast message includes address information, including a source address, which identifies the particular device that transmitted the message, and a destination address, which identifies the particular device that is to receive the message. Similarly, each multicast message includes address information, including a source address, which identifies the particular device that transmitted the message, and a destination address that identifies a set of devices (which may be either all of the devices or a selected subset) that are to receive the message. When a particular device determines that a unicast message contains a destination address that identifies it (that is, the particular device), or that a multicast message contains a destination address that identifies a set of devices that includes the particular device, it (that is, the particular device) will receive and process the message.
Although networks generally transfer messages reliably, sometimes messages are lost and not received by the destination devices, that is, the particular devices that are intended to receive them. If a destination device fails to receive a message, it can transmit a "negative acknowledgment" ("NACK") message to the source device, that is, the device that generated and transmitted the message over the network, requesting retransmission of the message. Typically, unicast messages transmitted by a particular source device to a particular destination device include sequencing information so that, if a destination device determines that there is a gap in the sequence of messages received by it from a particular source device, it can notify the source device to request retransmission. If messages can be delivered to the destination device out of the order in which they are transmitted, the destination device may wait for a period of time after it notices the gap in the sequence to see if the missing messages might be eventually delivered, but, in any case, if the missing messages are not delivered to the destination device prior to expiration of the period of time, the destination device will generate the "NACK" message and transmit it to the source device to enable retransmission of the missing messages.
A similar operation occurs in connection with multicast messages. That is, if a destination device determines that a multicast message whose destination address identifies a set that includes the destination device, in a sequence of such multicast messages from the same source device, has not been received, it (that is, the destination device) will generate a "NACK" message for transmission to the source device requesting retransmission of the message, or at least transmission of the information in the message in a unicast message to the destination device. However, unless failure to receive a multicast message by one destination device is due to a malfunction of that destination device, if one destination device fails to receive a multicast message, frequently a large number or all of the destination devices in the destination device set for the multicast message will also fail to receive the multicast message, all of which will generate and transmit "NACK" messages to the source device requesting retransmission of the multicast message. In that case, if there are a large number of destination devices in the device set for the multicast message, the source device may receive and process a large number of such "NACK" messages, which can significantly negatively affect its (that is, the source device's) other processing operations.
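
The gap detection and delayed "NACK" generation described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: integer sequence numbers starting at a known value, a single fixed wait period, and hypothetical function names; the text does not specify any of these details.

```python
def missing_sequence_numbers(received, first_expected=0):
    """Sequence numbers absent between first_expected and the highest number seen."""
    if not received:
        return []
    seen = set(received)
    return [n for n in range(first_expected, max(seen) + 1) if n not in seen]


def nacks_due(gap_noticed_at, now, wait_period):
    """Missing sequence numbers whose wait period has expired and should be NACKed.

    gap_noticed_at maps a missing sequence number to the time the gap was noticed.
    """
    return sorted(seq for seq, noticed in gap_noticed_at.items()
                  if now - noticed >= wait_period)
```

For example, a destination that has received messages 0, 1, 3 and 6 would record 2, 4 and 5 as missing and request retransmission only for those whose wait period has elapsed. If every destination in a large multicast set runs this logic against the same loss, the source receives one "NACK" per destination, which is the load the recovery tree is meant to reduce.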
SUMMARY OF THE INVENTION
The invention provides a new and improved arrangement for establishing a multicast message delivery error recovery tree in a digital network which may be used to reduce the number of "negative acknowledgment" ("NACK") messages which may be generated for transmission to a particular source device, which generates a multicast message for transmission to a plurality of destination devices, by destination devices if they fail to receive a multicast message generated and transmitted by the source device.
In brief summary, the invention provides a system for, and a method of, logically organizing a plurality of devices, interconnected by a communication link in a digital data network, into a tree structure. Each of the devices has an associated suitability value that generally relates to the device's suitability for becoming a node in the tree structure. The devices organize themselves into a tree structure in one or more iterations, each iteration comprising two general steps, namely, a node election step and a tree establishment step. In the node election step, the devices whose suitability values are such that they can become nodes in the tree broadcast over the communication link node election messages including their respective suitability values. These devices also receive the node election messages that are broadcast by other devices. Each device determines whether it is elected a node in the tree structure in connection with a comparison between its suitability value and suitability values of node election messages received thereby. During the tree establishment step, the devices in the network communicate with at least one of the device or devices which is or are elected respective nodes in the tree structure to facilitate becoming respective children thereof.
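
The node election step summarized above can be sketched in Python. The sketch assumes, following the detailed description, that a suitability value is the concatenation of priority, capacity and computer system identifier, modeled here as a tuple compared field by field, and that a device is elected when fewer devices with higher suitability were heard than there are nodes to elect. The function name and the outcome labels are hypothetical.

```python
def election_outcome(own, received, number_of_nodes):
    """Classify a device after one election interval.

    own and each entry of received are (priority, capacity, system_id) tuples;
    tuple comparison stands in for comparing the concatenated suitability value.
    """
    # Count distinct other devices heard with a higher suitability value.
    higher = sum(1 for s in set(received) if s > own)
    if higher == 0:
        return "root"          # nobody more suitable was heard
    if higher < number_of_nodes:
        return "node"          # among the number_of_nodes most suitable devices
    return "not elected"       # enough more-suitable devices exist
```

With four candidates and two nodes to elect, the two highest tuples come out as "root" and "node" and the rest as "not elected"; the unique computer system identifier in the last position breaks ties between devices with equal priority and capacity.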
BRIEF DESCRIPTION OF THE DRAWINGS
This invention is pointed out with particularity in the appended claims. The above and further advantages of this invention may be better understood by referring to the following description taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a schematic diagram of a local area network including computers which establish and organize a multicast message delivery error recovery tree in accordance with the invention;
FIG. 2 is a logical diagram of the local area network depicted in FIG. 1, logically organized in an illustrative multicast message delivery error recovery tree;
FIGS. 3A and 3B depict structures of messages used in establishing a multicast message delivery error recovery tree in connection with the invention; and
FIGS. 4A through 4G comprise flow diagrams depicting operations performed by the computers in the local area network depicted in FIG. 1, in establishing a multicast message delivery error recovery tree in accordance with the invention.
DETAILED DESCRIPTION OF AN ILLUSTRATIVE EMBODIMENT
FIG. 1 is a schematic diagram of a local area network 10 including an arrangement for generating or constructing a multicast tree in the network, in accordance with the invention. With reference to FIG. 1, local area network 10 includes a plurality of computers 11(1) through 11(N) (generally identified by reference numeral 11(n)) interconnected by a communication link 13. As is conventional, at least some of the computers 11(1), ..., 11(N-1) are in the form of personal computers or computer workstations, each of which includes a system unit, a video display unit and operator input devices such as a keyboard and mouse. The computer 11(N) also includes a system unit, and may also include a video display unit and operator input devices. The computers 11(n) are of the conventional stored-program computer architecture. A system unit generally includes processing, memory, mass storage devices such as disk and/or tape storage elements and other elements (not separately shown), including network interface devices represented by respective arrows 14(n) for interfacing the respective computer to the communication link 13. A video display unit permits the computer to display processed data and processing status to the user, and an operator input device enables the user to input data and control processing by the computer. The computers 11(n) transfer information, in the form of messages, through their respective network interface devices 14(n) among each other over the communication link 13.
The communication link 13 interconnecting the computers 11(n) in the network 10 may, as is conventional, comprise wires, optical fibers or other transmission media, and may further comprise switches, for, respectively, carrying and switching signals representing messages among the computers 11(n). As noted above, each of the computers 11(n) typically includes a network interface device 14(n), which connects the respective computer to the communication link 13. The transmission media and switches comprising communication link 13 may interconnect the computers 11(n) in any convenient topology.
Information is transferred among the computers 11(n) in the form of messages. Each message contains a header portion, which generally contains information that is useful in controlling the transfer of the message from the source computer 11(s), that is, the computer 11(n) that transmits the message, to the destination computer 11(d), that is, the computer or computers that is/are to receive the message, and a data portion, which generally contains information that is to be transferred. The information contained in the header portion includes message transfer protocol information, including, inter alia, source and destination addresses that identify the source computer 11(s) and the destination computer(s) 11(d) that is/are to receive the message, and each computer 11(n) can determine from a message's destination address whether it is to receive the message. The destination address information in a message may identify one computer 11(n) which is to receive the message, in which case the message is referred to as a "unicast" message. On the other hand, the destination address information in a message may comprise a multicast address, which enables all of the computers 11(n) in the local area network 10 to receive the message; a message which includes such a multicast address is referred to as a "multicast" message or, alternatively, a "broadcast" message. A multicast address may, alternatively, enable various subsets of the computers 11(n) to receive the message; a message which contains such destination address information will also be referred to as a multicast message. Multicast addresses provide a mechanism by which a source computer 11(s) can transmit a single multicast message, and enable a plurality of computers 11(n) to receive the message as destination computers.
The message transfer protocol information that is provided in the header of a message can also include so-called "time to live" ("TTL") information, which can limit the scope of transmission of a message. The time to live information may be used to limit the scope of transmission of a message to, for example, a particular local area network, such as local area network 10, to various groupings of contiguous local area networks in the wide area network, or to the entire wide area network. If, for example, a computer 11(n) in the local area network 10 transmits a message that contains a multicast address and time to live information indicating that the scope of transmission is limited to the local area network 10, then the message will be received and processed only by the computers 11(n) in the local area network 10 which are conditioned to respond to that multicast address. Typically, the time to live information is used to control whether gateways (not shown) which are used to interconnect local area networks will transmit a message that they receive on one local area network onto another local area network. In that case, the time to live information is in the form of an integer numerical value, and, when a gateway receives a message, it will decrement the time to live value and, if the decremented time to live value is greater than zero, transmit the message onto the other local area network. On the other hand, if the decremented time to live value is zero, the gateway will not transmit the message received from the one local area network onto the other local area network. Thus, a computer 11(n) can limit the scope of transmission of a message to its local area network by transmitting a time to live value of "one" in the message.
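The gateway behavior described above can be illustrated with a minimal sketch. The `Message` class and the `gateway_forward` function are hypothetical names introduced only for illustration; the description specifies the decrement-and-compare rule but not any particular data layout.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Message:
    """Hypothetical message with only the header fields needed here."""
    destination: str   # unicast or multicast address
    ttl: int           # integer time to live value
    payload: bytes = b""


def gateway_forward(message: Message) -> Optional[Message]:
    """Apply the time-to-live rule at a gateway.

    Returns the message to transmit onto the other local area network,
    or None if the gateway must not forward it.
    """
    decremented = message.ttl - 1
    if decremented > 0:
        # Still in scope: forward with the decremented value.
        return replace(message, ttl=decremented)
    # Decremented value reached zero: the message stays on its own network.
    return None
```

Under this rule a message sent with a time to live value of one never crosses a gateway, which is how a sender confines a multicast message to its own local area network.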
The local area network 10 schematically depicted in FIG. 1 comprises a part of a wide area network, such as a private wide area network maintained by a particular enterprise and/or a public wide area network such as the Internet. Similar to the local area network 10, the other portions of the wide area network that are not depicted in FIG. 1 generally include local area networks, each including computers interconnected by respective communication links, and the various local area networks are interconnected by various combinations of communication links and switches in a conventional manner. The computers within each local area network can share information by transferring respective unicast or multicast messages in a manner similar to that described above in connection with local area network 10. In addition, a computer in one local area network can share information with computers in other local area networks by transmitting respective unicast or multicast messages in a manner similar to that described above in connection with local area network 10.
The invention provides an arrangement for facilitating the logical organization of at least some of the computers 11(n) in the network 10 into a multicast message delivery error recovery tree. Generally, a multicast message delivery error recovery tree is used to provide for efficient error recovery in the event that one or more computers in the network (which may comprise either the local area network 10 or a wide area network including local area network 10) fail to receive a multicast message transmitted by a source computer 11(s). Typically, if a computer 11(n) is to be a destination computer 11(d) to receive a message that is transmitted by a source computer 11(s), and if the destination computer 11(d) fails to receive the message, it will generate a "negative acknowledgment" ("NACK") message for transmission to the source computer 11(s) to enable the source computer 11(s) to retransmit the message to the destination computer 11(d). This may be repeated several times until the destination computer 11(d) actually receives the message. A destination computer 11(d) can, for example, use message sequencing information that is generally provided in the protocol information in the headers of respective messages in a sequence transmitted by the source computer 11(s) to determine whether it failed to receive a particular message.
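One way a destination computer might use sequencing information to detect a missed message is sketched below. The sketch assumes that the sequencing information is a consecutive integer sequence number, which the description does not specify; `missing_sequence_numbers` is a hypothetical helper name.

```python
from typing import List


def missing_sequence_numbers(last_seen: int, just_received: int) -> List[int]:
    """Return the sequence numbers skipped between two received messages.

    Assumes (for illustration only) that the source numbers its messages
    with consecutive integers. Each returned number identifies a message
    for which a NACK would be generated.
    """
    return list(range(last_seen + 1, just_received))
```

For example, a computer that last received message 4 and then receives message 7 would conclude that messages 5 and 6 were not received.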
The use of "NACK" messages by a destination computer 11(d) to request retransmission of messages which it did not receive is generally efficient in connection with unicast messages, since there is only one destination computer 11(d) which may generate a "NACK" message if a message is not received. In that case, the source computer 11(s) will only need to process one "NACK" message for each non-received message. A problem can arise, however, if a number of destination computers 11(d) do not receive a multicast message that is transmitted by a source computer 11(s), and generate respective "NACK" messages for transmission to the source computer 11(s). In that case, the source computer 11(s) would need to process each of the "NACK" messages that it receives, which can significantly negatively affect other processing operations that are to be performed by the source computer 11(s). A multicast message delivery error recovery tree can assist in reducing the number of "NACK" messages which are transmitted to the source computer 11(s) of a multicast message, which can serve to reduce the negative effect on processing thereby which might otherwise arise, particularly in a wide area network in which large numbers of computers may be destinations of a multicast message. The invention provides an arrangement for efficiently organizing the computers 11(n) in the local area network into a multicast message delivery error recovery tree.
Before proceeding further, it would be helpful to describe the logical organization of a multicast message delivery error recovery tree in local area network 10. An illustrative example of a multicast message delivery error recovery tree, identified by reference numeral 20, is depicted in FIG. 2. Generally, in the illustrative multicast message delivery error recovery tree 20, the computers 11(n) comprising local area network 10 are "logically interconnected" (as will be described below) in a manner represented by the dashed lines interconnecting the various computers depicted in FIG. 2. In the illustrative multicast message delivery error recovery tree 20:
(i) one of the computers 11(R) forms a root node, which forms the highest level in the tree 20,
(ii) others of the computers 11(R)(1) through 11(R)(A) (generally identified by reference numeral 11(R)(a), "a" being an index between "1" and "A," inclusive), all of which are "logically connected" (as will be described below) to the root node 11(R), form a first lower level in the tree 20 below the root node,
(iii) others of the computers 11(R)(a)(1) through 11(R)(a)(B) (generally identified by reference numeral 11(R)(a)(b), "b" being an index between "1" and "B," inclusive), all of which are logically connected to a computer 11(R)(a) in the first level, form a second lower level in the tree 20 below the root node,
and so on, until finally the rest of the computers 11(R)(a)(b)...(1) through 11(R)(a)(b)...(L) (generally identified by reference numeral 11(R)(a)(b)...(l), "l" being an index between "1" and "L," inclusive) form respective "leaf nodes" in the tree 20. (It will be appreciated that the computers 11(R), 11(R)(a), 11(R)(a)(b), etc., comprise respective ones of computers 11(n) depicted in FIG. 1, the indices "(n)" used in the reference numerals in FIG. 1 being changed in FIG. 2 to depict the organization of the computers 11(n) in FIG. 1 into the multicast message delivery error recovery tree 20.) Generally,
(a) a computer that is in a level in the tree below the root node, or that forms a leaf node, is logically connected to one computer in a higher level, which computer will be referred to as its "parent" in the multicast message delivery error recovery tree 20, and
(b) a computer that is in a level that is above the lowest level in the tree is logically connected to one or more computers in the next lower level and/or to one or more computers that form respective leaf node(s) in the multicast message delivery error recovery tree 20, which will be referred to as the computer's "children."
Thus, with reference to the illustrative multicast message delivery error recovery tree 20 depicted in FIG. 2, computer 11(R), which forms the root node in the tree, is the parent of the computers 11(R)(1) and 11(R)(2), which form the second level in the tree, and additionally is the parent of computer 11(R)(3), which forms a leaf node in the tree 20. Contrariwise, each computer 11(R)(1), 11(R)(2) and 11(R)(3) is a "child" of the computer 11(R) in the tree 20. Similarly, computer 11(R)(1) is the parent of computers 11(R)(1)(1) and 11(R)(1)(2) (which comprise nodes in the third level of tree 20) and 11(R)(1)(3) (which is a leaf node in tree 20), and each computer 11(R)(1)(1), 11(R)(1)(2) and 11(R)(1)(3) is a child of computer 11(R)(1). It will be appreciated that a computer other than the computer that forms the root node will have one parent, but it can have one or more children.
Generally, a computer that forms a leaf node in the multicast tree is logically connected to a computer in a higher level, but no computer in any lower level and no computer forming a leaf node logically connects to it (that is, the computer 11(R)(a)(b)...(l) that forms the leaf node). A computer 11(R)(a)(b)...(l) that forms a leaf node may connect to a computer at any level in the tree. It will be appreciated that the tree need not be balanced, that is, there may be different numbers of levels between the computer 11(R) which forms the root node and the various computers 11(R)(a)(b)...(l) which form leaf nodes in the tree, and various non-leaf nodes comprising the tree 20 may have different numbers of children. Thus, for example, the number of levels between the computer 11(R) which forms the root node and a computer 11(R)(a)(b)...(l_1) which forms a leaf node may differ from the number of levels between the computer 11(R) and another computer 11(R)(a)(b)...(l_2) which forms a leaf node, and so forth for all computers 11(l) which form leaf nodes. In each reference numeral 11(R)(a), 11(R)(a)(b), ... 11(R)(a)(b)...(l), the sequence of values for indices "a," "b," ... "l" defines a logical path through the tree from the computer system 11(R) forming the root node, through a sequence of computers in successively lower levels, to the computer identified by the respective reference numeral.
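The parent and child relationships described above can be modeled with a small data structure. The following is only an illustrative sketch: the `TreeNode` class and its method names are hypothetical, and the example builds only those computers of FIG. 2 that are named in the text.

```python
from typing import List, Optional


class TreeNode:
    """One computer in the recovery tree (hypothetical illustration class)."""

    def __init__(self, label: str, parent: Optional["TreeNode"] = None):
        self.label = label
        self.parent = parent                  # one parent, except at the root
        self.children: List["TreeNode"] = []  # zero or more children
        if parent is not None:
            parent.children.append(self)

    def levels_below_root(self) -> int:
        """Count the logical connections between this computer and the root."""
        count, node = 0, self
        while node.parent is not None:
            count += 1
            node = node.parent
        return count


# Computers named in the description of FIG. 2.
root = TreeNode("11(R)")
n1 = TreeNode("11(R)(1)", root)
n2 = TreeNode("11(R)(2)", root)
leaf3 = TreeNode("11(R)(3)", root)        # leaf attached directly to the root
n11 = TreeNode("11(R)(1)(1)", n1)
n12 = TreeNode("11(R)(1)(2)", n1)
leaf13 = TreeNode("11(R)(1)(3)", n1)      # leaf one level further down
```

The two leaves sit at different distances from the root, which illustrates that the tree need not be balanced.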
As noted above, the local area network 10 in which the multicast message delivery error recovery tree 20 is formed is part of a wide area network including a plurality of local area networks. In one embodiment, the multicast message delivery error recovery tree 20 depicted in FIG. 2 forms a sub-tree of a larger tree (not shown), as represented by the double-headed arrow associated with the legend "TO/FROM TREE ASSOCIATED WITH WIDE AREA NETWORK" in FIG. 2. In that case, although the computer 11(R) which forms the root node in the multicast message delivery error recovery tree 20 for local area network 10 does not have a parent within the tree 20 organized as depicted in FIG. 2, it will have a parent in the larger tree. The parent, and computers in levels thereabove in the larger tree formed in the wide area network, may be in other local area networks in the wide area network. Alternatively or in addition, computers at higher levels in the tree formed in the wide area network may include computers in the local area network 10. In that case, a computer in the local area network 10 may be at a node at a lower level or comprise a leaf node 11(R)(a)(b)...(l) in the multicast message delivery error recovery tree 20, and may in addition comprise a node at a higher level than the computer 11(R) that forms the root node in the multicast message delivery error recovery tree 20 formed in the local area network 10.
The multicast message delivery error recovery tree 20 is used to provide a mechanism whereby, in the event that one or more of the computers in the network 10 fail to receive a multicast message from the computer which forms the root node in the tree formed in the wide area network, of which the local area network 10 is a part, those computers can efficiently request the re-transmission of a copy of the multicast message thereto. For example, the tree 20 provides a mechanism whereby a computer, such as computer 11(R)(1)(1)(1), in the network 10, if it fails to receive a multicast message from another computer as source computer 11(s), instead of generating a "NACK" message for transfer to the source computer 11(s) requesting re-transmission of a copy of the multicast message, can generate a "NACK" message for transfer over communication link 13 to its parent computer 11(R)(1)(1) in multicast message delivery error recovery tree 20, the "NACK" message requesting it (that is, the parent computer 11(R)(1)(1)) to provide a copy of the multicast message to the child computer 11(R)(1)(1)(1). If that parent computer 11(R)(1)(1) has a copy of the multicast message, it (that is, parent computer 11(R)(1)(1)) will transmit the multicast message over the communication link 13. The child computer 11(R)(1)(1)(1) which requested the copy of the multicast message can then receive the multicast message transmitted by the parent computer 11(R)(1)(1). It will be appreciated that the other computers 11(n) in the local area network 10 can also receive the multicast message and use it if they had not previously received it; the other computers 11(n) in the local area network, that is, the computers which previously received the message, can ignore the message. Preferably, the parent computer 11(R)(1)(1), when it transmits the multicast message, will transmit the message with a time to live value that will limit its transmission to the local area network 10, so that the message is not transmitted to other portions of the wide area network which may have received the multicast message.
On the other hand, if the computer 11(R)(1)(1) does not have a copy of the multicast message, it will generate a "NACK" message for transfer over communication link 13 to its parent computer 11(R)(1) requesting a copy therefrom. If that computer 11(R)(1) has a copy of the multicast message, it (that is, the computer 11(R)(1)) will transmit the multicast message over the communication link 13 as described above, to provide the multicast message to its child and grandchild computers 11(R)(1)(1) and 11(R)(1)(1)(1), along with any other computers 11(n) which had not previously received the message. Similarly, if the computer 11(R)(1) does not have a copy of the multicast message, it, in turn, will generate a "NACK" message for transfer over communication link 13 to its parent, namely, the computer 11(R) which forms the root node in the multicast tree 20, requesting a copy therefrom. If the computer 11(R) has a copy of the requested multicast message, it will transmit the multicast message over the communication link 13 as described above.
However, if, after receiving the request from the computer 11(R)(1), the computer 11(R) which forms the root node in the multicast message delivery error recovery tree 20 determines that it does not have a copy of the requested message, it will attempt to obtain a copy. As noted above, in one embodiment, in which the local area network 10 forms part of a wide area network, and in which the multicast message delivery error recovery tree 20 forms a sub-tree of a larger tree formed in the wide area network, if the computer 11(R) which forms the root node in the multicast message delivery error recovery tree 20 does not have a copy of the message, it will generate a "NACK" message for transfer to a computer in the wide area network that is its parent in the larger tree. This process, in which each computer which receives such a "NACK" message from its child in the tree determines whether it has a copy of the requested message, and (i) if so, transmits the multicast message, but (ii) if not, generates a "NACK" message for transfer to its parent, will continue until a "NACK" message reaches the computer comprising the root node in the tree formed in the wide area network, which is the source computer 11(s) for the message.
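The escalation of a "NACK" message from child to parent can be sketched as a walk up the tree. This is a simplified model under stated assumptions: the `Node` class, its `store` attribute and the `find_repairer` function are hypothetical names, messages are identified by integer sequence numbers, and no actual network transmission is modeled.

```python
from typing import Optional, Set


class Node:
    """Hypothetical tree node holding copies of some multicast messages."""

    def __init__(self, name: str, parent: Optional["Node"] = None):
        self.name = name
        self.parent = parent
        self.store: Set[int] = set()  # sequence numbers of messages held


def find_repairer(requester: Node, seq: int) -> Optional[Node]:
    """Return the first ancestor of `requester` that holds message `seq`.

    Mirrors the described process: each computer receiving a NACK either
    retransmits the message (it has a copy) or sends a NACK to its own
    parent (it does not).
    """
    node = requester.parent
    while node is not None:
        if seq in node.store:
            return node      # this computer retransmits the message
        node = node.parent   # otherwise the request moves up one level
    return None              # not expected when the top node is the source


# Example chain: source -> 11(R) -> 11(R)(1) -> 11(R)(1)(1) -> 11(R)(1)(1)(1)
source = Node("11(s)")
r = Node("11(R)", source)
r1 = Node("11(R)(1)", r)
r11 = Node("11(R)(1)(1)", r1)
r111 = Node("11(R)(1)(1)(1)", r11)
source.store.add(42)   # the source always holds its own message
r1.store.add(42)       # 11(R)(1) received it; 11(R)(1)(1) did not
```

In this example the request from 11(R)(1)(1)(1) passes through 11(R)(1)(1), which lacks a copy, and is satisfied by 11(R)(1) without ever reaching the source.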
It will be appreciated that the use of the multicast message delivery error recovery tree 20, and the associated tree formed in the wide area network, can serve to substantially reduce the number of "NACK" messages that are transmitted to the source computer 11(s) for a multicast message. If a multicast message is not received by a particular destination computer, it is frequently the case that a large number of destination computers 11(d) will not receive the multicast message, not just a single destination computer. In that case, the destination computers which fail to receive the multicast message will generate only one "NACK" message. Thus, for example, if a computer which forms a node in the multicast message delivery error recovery tree 20 has already generated a "NACK" message requesting a copy of a particular multicast message for transmission to its parent in the tree 20, either because it itself failed to receive the multicast message or in response to receipt of a "NACK" message from one of its children requesting the multicast message, and it later receives a "NACK" message from another child requesting the same multicast message, it will not need to generate another "NACK" message for the same multicast message for transmission to its parent. In addition, the "NACK" messages that are generated by the destination computers 11(d) are generally not for transmission to the source computer 11(s) which generated the multicast message, but instead to their respective parents in the multicast message delivery error recovery tree 20, or to their respective parents in the larger tree formed in the wide area network.
In addition, each computer forming a node in the multicast message delivery error recovery tree 20, or in the larger tree formed in the wide area network, will receive at most only a limited number of "NACK" messages, namely, at most a number corresponding to the number of its children in the respective tree. As will be described below, in connection with establishment of the multicast message delivery error recovery tree 20, each computer forming a node in the tree 20 can determine the maximum number of children it can accommodate during establishment of the tree 20. In any case, since none of the computers, including the source computer 11(s) for the multicast message and the computers forming nodes in the multicast message delivery error recovery tree 20, or the larger tree formed in the wide area network, receives an unduly large number of "NACK" messages in the event of a failure to receive a multicast message, processing of the "NACK" messages will not unduly negatively affect their other processing operations.
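The suppression of duplicate "NACK" messages at a node can be sketched as follows. The `RepairNode` class, its method names and the returned strings are all hypothetical; the sketch only illustrates the rule that a node which has already requested a given message from its parent does not request it again.

```python
from typing import List, Set


class RepairNode:
    """Hypothetical node that forwards at most one NACK per missing message."""

    def __init__(self) -> None:
        self.pending: Set[int] = set()      # messages already requested upstream
        self.sent_to_parent: List[int] = [] # log of NACKs sent to the parent

    def on_child_nack(self, seq: int, have_copy: bool = False) -> str:
        """Handle a NACK from a child for message `seq`."""
        if have_copy:
            return "retransmit"   # the node can satisfy the request itself
        if seq in self.pending:
            return "suppressed"   # a NACK for this message is already outstanding
        self.pending.add(seq)
        self.sent_to_parent.append(seq)
        return "forwarded"

    def on_repair_received(self, seq: int) -> None:
        """Clear the outstanding request once the message finally arrives."""
        self.pending.discard(seq)
```

With this rule, three children reporting the same missing message cause one "NACK" message to travel to the parent, not three.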
In one embodiment, the portion of the larger tree above the multicast message delivery error recovery tree 20 established for the local area network 10 will be predetermined by, for example, a system administrator, and one computer in the wide area network will be identified as a parent for the computer 11(R) in the local area network 10 that is selected as the root node in the tree 20. However, the computers 11(n) in the local area network 10 themselves determine the logical organization of the multicast message delivery error recovery tree 20 for the local area network 10. Operations performed by the computers 11(n) in establishing the multicast message delivery error recovery tree 20 in the local area network 10 will be described below in connection with the flow chart depicted in FIG. 4. Generally, the computers 11(n) make use of three types of messages, namely, a node election message type, a node advertising message type, and a child solicitation message type. The establishment of a multicast message delivery error recovery tree 20 proceeds in a plurality of iterations, each iteration including two phases, namely, a node election phase and a child solicitation phase. In the first iteration, a computer is selected as a root node in the tree 20, and in each subsequent iteration, a number of additional computer(s) 11(n) will be added to the tree 20. In each iteration, the computers 11(n) use messages of the node election message type during the node election phase and messages of the node advertising message type during the child solicitation phase. The use of messages of the child solicitation message type will be described below.
FIG. 3A depicts the structure of a node election message 30 used by computers 11(n) in one embodiment of the invention during the node election phase of each iteration. Generally, in each iteration each of the computers 11(n) that can become a node in the multicast message delivery error recovery tree, referenced as "participating" in the iteration, multicasts node election messages that identify its suitability (as will be described below) to become a node in the tree. The node election messages also identify a number of nodes that will be selected during the iteration. The computers 11(n) essentially self-select themselves as nodes at the end of the iteration. With reference to FIG. 3A, node election message 30 includes a header portion 31 and a data portion 32. The header portion 31 contains message transfer protocol information, including a message type identifier, source and destination addresses, and a time to live value. The message type identifier in the header portion 31 identifies the message as being of the node election type, the source address identifies the particular computer 11(n) that generated and transmitted the node election message, and the destination address identifies the multicast address for the computers 11(n) comprising the local area network 10. The time to live value is selected to ensure that the node election messages are transmitted only in the local area network.
The data portion 32 contains a number of fields that are used during the iteration in selecting the computers 11(n) that are to be nodes in the tree 20, including a priority field 33, a capacity field 34, a computer system identifier field 35, a number of nodes field 36 and an election interval field 37. The priority field 33, capacity field 34 and computer system identifier field 35 contain values that serve to rank the participating computers in their relative suitability to become a node during the iteration. In particular, priority field 33 contains a priority value that identifies a relative priority for the computer 11(n) that generates the node election message to be a node. Illustrative priority values used in priority field 33 include, for example, a highest "must be" a node value, a relatively high "eager" to be a node value, and a relatively low "reluctantly would be" a node value. In addition, the lowest "leaf only" node value can be assigned to a computer 11(n) if it will only operate as a leaf in the multicast message delivery error recovery tree 20; in that case, the computer 11(n) will not participate in the node election phase of any iteration. The capacity field 34 contains a child capacity value that identifies the number of children the computer 11(n) can accommodate. The computer system identifier field 35 contains a computer system identifier that identifies the computer 11(n) in the local area network 10; in one embodiment, the computer system identifier comprises the network address for the computer 11(n).
The number of nodes field 36 contains a value that identifies the number of computers 11(n) that are to be selected as nodes during the iteration. Thus, in the first iteration, in which the computer to serve as the root node in the multicast message delivery error recovery tree 20 is selected, the value contained in the number of nodes field 36 will be "one." If a subsequent iteration is needed to select additional computers to operate as nodes in the multicast message delivery error recovery tree 20, the computer 11(R) previously selected as the root node in the tree 20 will provide a value identifying the number of computers 11(n) that are to be selected as nodes in the subsequent iteration. These operations will continue through each subsequent iteration.
Finally, the election interval field 37 identifies the time interval between transmissions of node election messages by the computer system which generates the respective message.
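The data portion 32 of the node election message 30 can be summarized as a record type. The sketch below is illustrative only: the class and attribute names, the integer codes assigned to the priority values, and the use of seconds for the election interval are all assumptions, since the description names the fields but not their encodings.

```python
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    """Illustrative priority values; the integer codes are assumptions."""
    LEAF_ONLY = 0    # lowest: operates only as a leaf
    RELUCTANT = 1    # "reluctantly would be" a node
    EAGER = 2        # "eager" to be a node
    MUST_BE = 3      # highest: "must be" a node


@dataclass(frozen=True)
class NodeElectionData:
    """Data portion 32 of a node election message 30 (hypothetical layout)."""
    priority: Priority        # priority field 33
    capacity: int             # capacity field 34: children it can accommodate
    system_id: int            # computer system identifier field 35
    number_of_nodes: int      # number of nodes field 36
    election_interval: float  # election interval field 37 (seconds assumed)


def participates_in_election(priority: Priority) -> bool:
    """A "leaf only" computer does not take part in any node election phase."""
    return priority != Priority.LEAF_ONLY
```

In the first iteration every participating computer would send `number_of_nodes=1`, since only the root node is being selected.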
As noted above, the values in the priority field 33, capacity field 34 and computer system identifier field 35 are used to rank the participating computers 11(n) as to their relative suitability to be selected as nodes during the iteration. Generally, the ranking is based first on the priority value, so that those computers 11(n) which have the higher priority values will be deemed more suitable and those which have lower priority values will be deemed less suitable. As among computers 11(n) which have the same priority value, suitability is determined based on their respective capacity values, which identify the number of children that they can accommodate, that is, computers 11(n) with higher capacity values will be deemed more suitable and computers 11(n) with lower capacity values will be deemed less suitable, within the ranking as determined by their priority values. Finally, as among computers 11(n) with the same priority and capacity values, the suitability ranking is determined based on their respective computer system identifiers. Each computer system identifier is unique, at least among computers 11(n) in the local area network, and the use of the computer system identifier value is selected, somewhat arbitrarily, as a means to provide rankings as among computers in the local area network 10 which have the same priority and capacity values.
It will be appreciated that, if the priority values, capacity values and computer system identifier values are all numerical values, with higher priority, capacity and computer system identifier values each being represented by a higher numerical value and lower priority, capacity and computer system identifier values each being represented by a lower numerical value, a computer can generate a unitary suitability value "S" as <PRI_VAL|CAP_VAL|CSID_VAL>, where "PRI_VAL" represents a field for the priority value, "CAP_VAL" represents a field for the capacity value, "CSID_VAL" represents a field for the computer system identifier value, and "|" represents the concatenation operation, with the respective fields used in the node election messages from all of the computers 11(n) having the same number of bits. Each computer 11(n) determines its suitability ranking in connection with the number of computers in the local area network 10 which have higher suitability values than the computer 11(n).
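The concatenated suitability value can be sketched with bit shifts. Several details here are assumptions made for illustration: the 32-bit field width, the function names, and in particular the `is_elected` rule, which infers from the text that a computer selects itself when fewer computers outrank it than the number of nodes to be selected.

```python
from typing import Iterable

FIELD_BITS = 32  # assumed common width for all three fields


def suitability(pri_val: int, cap_val: int, csid_val: int,
                bits: int = FIELD_BITS) -> int:
    """Build S = <PRI_VAL|CAP_VAL|CSID_VAL> by concatenating fixed-width fields."""
    for value in (pri_val, cap_val, csid_val):
        if not 0 <= value < (1 << bits):
            raise ValueError("field value does not fit in the field width")
    # Priority occupies the most significant bits, so it dominates the
    # comparison; capacity breaks ties, then the system identifier.
    return (pri_val << (2 * bits)) | (cap_val << bits) | csid_val


def ranking(own: int, others: Iterable[int]) -> int:
    """Number of other computers with a higher suitability value."""
    return sum(1 for value in others if value > own)


def is_elected(own: int, others: Iterable[int], number_of_nodes: int) -> bool:
    """Assumed self-selection rule: elected if among the top `number_of_nodes`."""
    return ranking(own, others) < number_of_nodes
```

Because every field has the same width in every message, comparing two suitability values as integers gives the same result as comparing priority, then capacity, then identifier in turn.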
FIG. 3B depicts the structure of a node advertising message 40 used in one embodiment of the invention during the second phase of each iteration. In particular, the node advertising message 40 is used by computers 11(n) which have determined that they are nodes in the multicast message delivery error recovery tree 20 to notify other computers in the local area network 10 that they are available to have other computers logically connect to them as children in the tree 20. In that connection, the computer 11(n) that transmits the node advertising message 40 also provides some information to the other computers as to the status of the computer 11(n), which they (that is, the other computers) can use to determine whether they should logically connect to that computer 11(n).
With reference to FIG. 3B, node advertising message 40 includes a header portion 41 and a data portion 42. The header portion 41 contains message transfer protocol information, including a message type identifier, source and destination addresses, and a time to live value. The message type identifier in the header portion 41 identifies the message as being of the node advertising type, the source address identifies the particular computer 11(n) that generated and transmitted the node advertising message, and the destination address identifies the multicast address. The time to live value is selected to ensure that the node advertising messages are transmitted only in the local area network.
The data portion 42 contains a number of fields that are used during the iteration in notifying the other computers in the local area network 10 as to the logical connection availability of the computer 11(n) that generates the message 40, including a priority field 43, a capacity field 44, a computer system identifier field 45, an advertising interval field 46, a number of children field 47, a connected field 48, a number of nodes field 49 and a status field 50. The priority field 43, capacity field 44 and computer system identifier field 45 contain priority, capacity and computer system identifier values that correspond to the respective values in fields 33, 34, and 35, respectively, of node election message 30 as described above. The advertising interval field 46 identifies the time interval between transmissions of a node advertising message 40 by the particular computer 11(n) that generates the node advertising message 40.
The number of children field 47 of node advertising message 40 contains a value that identifies the number of other computers to which the particular computer 11(n) is currently logically connected when it (that is, the particular computer 11(n)) transmits the node advertising message 40. A computer that receives a node advertising message 40 from the computer 11(n) can determine an excess capacity value for computer 11(n), which is a measure of the ability of the computer 11(n) to take on additional children, by determining the difference between the capacity value in field 44 and the number of children value in field 47. A computer may use the excess capacity value for several purposes, including determining whether it should attempt to logically connect as a child to a particular computer 11(n) based on its excess capacity in relation to the excess capacity values of computers which form other nodes in the multicast message delivery error recovery tree 20. Other uses of the excess capacity value will be described below.
The connected field 48 of node advertising message 40 contains a connected flag that, if set, indicates that the computer 11(n) that generates the node advertising message 40 is logically connected to multicast message delivery error recovery tree 20, and otherwise indicates that the computer is not logically connected to the tree 20. The number of nodes field 49 identifies the current number of nodes that have been selected for the tree 20. The status field 50 provides information as to the computer's status in the tree 20. In one embodiment the status field 50 normally has an "active" value, which indicates that the computer 11(n) that generates the node advertising message 40 is currently a node in the tree 20. However, as will be described below, the status field 50 can also have a "resigning" value, which indicates that the computer 11(n) that generates the node advertising message 40 is a node in tree 20, but is in the process of resigning as a node and converting to a leaf. If status field 50 of the node advertising message 40 transmitted by a computer 11(n) indicates the "resigning" status, its children in the tree will need to attempt to logically connect to other computers that are nodes in the tree.
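The excess capacity computation, and one possible way a computer might use it when choosing a parent, can be sketched as follows. The `NodeAdvertisement` class and both function names are hypothetical, only a subset of the fields of message 40 is modeled, and the "pick the largest excess capacity" policy in `choose_parent` is an assumption; the description says only that excess capacity may be compared across nodes.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class NodeAdvertisement:
    """Subset of data portion 42 of a node advertising message 40."""
    system_id: int            # computer system identifier field 45
    capacity: int             # capacity field 44
    number_of_children: int   # number of children field 47
    status: str = "active"    # status field 50: "active" or "resigning"


def excess_capacity(ad: NodeAdvertisement) -> int:
    """Difference between the capacity value and the number of children value."""
    return ad.capacity - ad.number_of_children


def choose_parent(ads: Iterable[NodeAdvertisement]) -> Optional[NodeAdvertisement]:
    """Assumed policy: prefer the active node with the most room for children."""
    candidates = [ad for ad in ads
                  if ad.status == "active" and excess_capacity(ad) > 0]
    return max(candidates, key=excess_capacity, default=None)
```

A node advertising the "resigning" status is skipped here because its children will themselves need to connect elsewhere.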
As noted above, establishment of multicast message delivery error recovery tree 20 proceeds in a plurality of iterations. Generally, the first iteration will start when at least one of the computers 11(n) of the local area network 10 has been powered on and initialized. After a computer 11(n) has been initialized, if it is associated with a priority value that would allow it to be a node in the multicast message delivery error recovery tree 20, that is, if it is associated with a priority value other than "leaf only," it will begin periodically transmitting node election messages 30. If, within a selected time interval after beginning transmitting node election messages 30, the computer 11(n) does not receive a node election message 30 from other computers in the network, it (that is, computer 11(n)) will determine that it is "elected" as a node in the multicast message delivery error recovery tree 20, and stop transmitting node election messages 30. Since the computer 11(n) comprising the elected node is, at this point, the only node in the tree 20, it (that is, the computer 11(n)) will comprise the root node 11(R) in the tree 20. After the computer 11(n) determines that it is the elected node, it will begin transmitting node advertising messages 40 to notify other computers in the local area network 10 of its availability to accept children. Other computers in the local area network 10 that have been powered on and initialized can, after receiving a node advertising message 40, communicate with the computer 11(n) to attempt to logically become children of the computer 11(n) in the tree 20. It will be appreciated that this may occur (that is, that the computer 11(n) does not receive any node election messages 30 from other computers in the local area network within the selected time interval) if there are no other computers in the network which have been powered on and initialized which can operate as a node in the multicast message delivery error recovery tree 20 for the local area network. In addition, after being "elected" a node in the tree 20, the computer 11(n) can increment its priority value by a predetermined "elected node" increment value, the purpose for which will be described below.
On the other hand, if, within the selected time interval, the computer 11(n) receives a node election message 30 from at least one other computer 11(n') in the local area network 10, it will compare the suitability value (that is, as described above, the concatenation of the priority value, capacity value and computer system identifier value from fields 33, 34 and 35) from the received node election message to its own suitability value (that is, the concatenation of its priority value, capacity value and computer system identifier value) that it is providing in the node election messages 30 that it is transmitting. If the computer 11(n) determines that its suitability value is less than the suitability value in the received node election message, it will stop transmitting node election messages. On the other hand, if the computer 11(n) determines that its suitability value is greater than the suitability value in the received node election message, it will continue transmitting node election messages. The other computer 11(n') will perform similar operations, so that, if it determines that its suitability value is higher than that of computer 11(n), it will continue transmitting node election messages, but if it determines that its suitability value is lower than that of computer 11(n), it will stop transmitting node election messages.
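As an illustration of the comparison just described, the suitability value can be modeled as a tuple ordered by priority, then capacity, then computer system identifier, which compares the same way as a concatenation of fixed-width fields. This is a sketch only: the names `Suitability`, `as_int` and `keeps_transmitting`, the assumption that the priority is the most significant part of the concatenation, and the field widths are all hypothetical and not taken from the description.

```python
from typing import NamedTuple


class Suitability(NamedTuple):
    priority: int    # field 33, assumed most significant
    capacity: int    # field 34
    system_id: int   # field 35, assumed least significant; breaks ties


def as_int(s: Suitability, cap_bits: int = 16, id_bits: int = 32) -> int:
    """Concatenate the three fields into one integer (widths are assumed)."""
    return (s.priority << (cap_bits + id_bits)) | (s.capacity << id_bits) | s.system_id


def keeps_transmitting(own: Suitability, received: Suitability) -> bool:
    """True if this computer should keep sending node election messages."""
    return own > received  # tuple comparison is lexicographic
```

Because computer system identifiers are distinct, two computers never compare equal under this ordering, so exactly one of any pair keeps transmitting.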
Each computer in the local area network with a priority level above "leaf only" will perform similar operations. At the end of the selected time interval, if one of the computers is still transmitting node election messages, then it has not received a node election message from a computer in the local area network which has a higher suitability value, and so it (that is, the still-transmitting computer) will determine that it is "elected" a node, in this case the root node, in the tree 20, and therefore will constitute the computer 11(R) (reference FIG. 2) in the tree 20. After computer 11(R) determines that it is the elected node, it will stop transmitting node election messages 30 and begin transmitting node advertising messages 40 to notify other computers in the local area network 10 of its availability to accept children. Other computers in the local area network 10 that have been powered on and initialized can, after receiving a node advertising message 40, communicate with the computer 11(R) to attempt to become children of the elected computer in the tree 20. In addition, the computer 11(R) can increment its priority value by the predetermined "elected node" increment value.
If the computer 11(R) that forms the root node of the multicast message delivery error recovery tree 20 determines, during the communications with the other computers in the local area network 10, that it has the capacity to accommodate all of the other computers as leaves, it can logically connect to the other computers at that point. On the other hand, if the computer 11(R) determines that more computers are requesting to become children thereof than it can accommodate, it (that is, the computer 11(R)) can initiate a second iteration in the tree establishment process, to enable election of additional nodes for the multicast message delivery error recovery tree 20. It will be appreciated that the additional nodes will comprise one or more levels in the tree 20 below the root level. To initiate the second iteration, the computer 11(R) will resume transmitting node election messages 30. In this iteration, the node election messages 30 generated by the computer 11(R) will contain a number of nodes value in field 36 that is greater than one by an amount corresponding to the number of new nodes that are to be elected. In the node election messages generated by the computer 11(R), the priority value in field 33 will reflect the incremented priority value, that is, its original priority value incremented by the predetermined "elected node" increment value.
Each of the other computers in the local area network 10 which can also become nodes in the tree (that is, each computer other than computer 11(R) whose priority value is greater than "leaf only") will also resume transmitting node election messages 30. In this case, each of the other computers will use its own priority, capacity and computer system identifier values in fields 33 through 35 of its node election messages, but will use the number of nodes and election interval values from fields 36 and 37 in the node election messages 30 received from the computer 11(R) in the node election messages that it transmits. While transmitting node election messages, each of the computers will also receive node election messages from the other computers and will compare the suitability values therefrom with its own suitability value. If a computer that is transmitting node election messages 30 determines that it has received node election messages 30 with higher suitability values than its (that is, the computer's) suitability value, from a number of other computers corresponding to the number of nodes value in field 36, it (that is, the computer) will stop sending node election messages 30. Until that occurs, the computer will continue transmitting node election messages until the end of the selected time interval. The number of computers which are continuing to transmit node election messages 30 at the end of the selected time interval will correspond to the number of nodes specified in the number of nodes field 36, and so the computers that are transmitting node election messages 30 at the end of the selected time interval will constitute the set of computers that are elected to be nodes in the tree 20, including the root node.
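The outcome of such an iteration can be sketched as follows, under simplifying assumptions that are not part of the description: every candidate hears every other candidate's node election message at least once, and suitability values are modeled as (priority, capacity, identifier) tuples. The function name `elect` and the dictionary layout are hypothetical.

```python
def elect(candidates, num_nodes):
    """Return the names of the computers still transmitting at the end of
    the election interval, highest suitability first.

    candidates: dict mapping a computer name to its suitability tuple
    num_nodes:  the number of nodes value carried in field 36
    """
    elected = []
    for name, own in candidates.items():
        # Count the other computers heard with a higher suitability value.
        higher = sum(1 for other, s in candidates.items()
                     if other != name and s > own)
        # A computer stops transmitting once that count reaches num_nodes.
        if higher < num_nodes:
            elected.append(name)
    return sorted(elected, key=lambda n: candidates[n], reverse=True)


computers = {"a": (3, 4, 1), "b": (2, 8, 2), "c": (2, 8, 3), "d": (1, 2, 4)}
print(elect(computers, 2))
```

The first entry of the returned list corresponds to the root node, since no other computer outranks it.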
It will be appreciated that the computer 11(R) that was elected the root node during the previous iteration may, but need not, have the highest suitability value (even with the priority value incremented by the predetermined "elected node" increment value) as among the elected nodes, since one or more computers associated with higher suitability values than the computer 11(R) may have been powered on and initialized and begun transmitting node election messages. If the computer 11(R) that was elected as the root node during the previous iteration has a priority level, as incremented by the "elected node" increment value, that is higher than the other computers, then it will remain the root node. However, if another computer has a priority level that is sufficiently high that its suitability value is higher than the suitability of computer 11(R), the other computer will be elected root node during the current iteration. Indeed, if sufficient numbers of newly-transmitting computers have higher suitability values than the previously elected computer 11(R), that computer 11(R) may be "de-elected," that is, it may not be among the elected nodes for the iteration. In that case, the computer with the highest suitability value will be elected as the new root node 11(R) for subsequent operations, and the other elected computers will form other portions, including (but not necessarily limited to) the second level in the tree 20. However, it will be appreciated that, by incrementing the priority level of computer 11(R) that was previously elected the root node in the tree 20 by the "elected node" increment value, the previous election will be "sticky," that is, it will not be disturbed unless the priority level of the other computer(s) is or are sufficiently higher than the original, non-incremented, priority level of the computer 11(R) to warrant disturbing the previous election.
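The "sticky" effect of the increment can be seen in a short sketch. The increment value of 2 and the function names are hypothetical; the description does not give a value for the "elected node" increment, and suitability is again modeled as a (priority, capacity, identifier) tuple.

```python
ELECTED_NODE_INCREMENT = 2  # hypothetical value; not specified in the description


def effective_suitability(priority, capacity, system_id, is_elected):
    """Suitability as advertised in field 33 onward, with the priority
    raised by the "elected node" increment for a currently elected node."""
    bonus = ELECTED_NODE_INCREMENT if is_elected else 0
    return (priority + bonus, capacity, system_id)


def root_survives(root, challenger):
    """True if the previously elected root still outranks a newcomer."""
    return (effective_suitability(*root, True)
            > effective_suitability(*challenger, False))


# A newcomer whose priority is only one higher than the root's original
# priority does not disturb the election in this sketch.
print(root_survives((3, 4, 1), (4, 9, 9)))
```

A newcomer displaces the root only when its priority reaches the root's incremented priority and it wins on the remaining fields, or exceeds the incremented priority outright.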
Thereafter, the elected computers will transmit node advertising messages 40, as in the first iteration, to notify other computers in the local area network 10 of their availability to accept children. The computers elected during the iteration, other than the computer 11(R) that forms the root node for the tree 20, will receive the node advertising messages 40 transmitted by the computer 11(R) and will communicate with it to become logically connected thereto, thereby to establish them as forming the second level of the tree. If a newly-elected node cannot logically connect to the computer 11(R), it may logically connect to another newly-elected computer to begin the third level in the tree. The other, non-elected computers will communicate with any of the elected computers to negotiate formation of logical connections therewith.
If the root node determines that additional nodes of the tree are required, it can repeat the operations through subsequent iterations. In each iteration, the computers newly elected to form node(s) in the tree 20 will increment their priority values by the predetermined "elected node" value, and, if a previously-elected computer is "de-elected," it will reduce its priority value to the original priority value. These operations will continue until preferably all of the computers 11(n) in the local area network 10 form part of the multicast message delivery error recovery tree 20, either as nodes or leaves. In addition, the root node 11(R) will logically connect as a child to a node specified for the local area network in the tree for the wide area network. After the multicast message delivery error recovery tree 20 has been established, the computers 11(n) comprising nodes in the tree will continually transmit node advertising messages 40, so that newly-initialized computers can identify nodes in the multicast message delivery error recovery tree 20 to which they may logically connect.
As noted above, a computer 11(n) in the local area network, after it is powered on and initialized, if it has a priority value that is other than "leaf only," will begin transmitting node election messages. If a multicast message delivery error recovery tree 20, or any portion thereof established as described above, already exists, the computers 11(n) will repeat the tree-establishment operations as described above. In those operations, the computers 11(n) which form nodes in the tree will use their priority values, as incremented by the "elected node" increment value, so that the tree 20 will not be disturbed during the new tree-establishment operations unless the newly-initialized computer has a sufficiently high priority value. If the newly-initialized computer has a priority value that is higher than the priority value associated with the computer 11(R) forming the root node for the tree 20, it (that is, the newly-initialized computer) will become the new root node for the tree. In that case, the computer 11(R') forming the new root node will logically connect as a child to a node specified for the local area network in the tree for the wide area network, and the computer 11(R) will logically connect to the computer 11(R') forming the new root node.
In addition, after the multicast message delivery error recovery tree 20 has been established in the local area network, the computers 11(n) will periodically repeat the election process. In that operation, any of the computers 11(n), other than a computer whose priority level is "leaf only," can transmit a node election message. The node election message transmitted by a computer 11(n) will include the computer's suitability value, including its priority level either as incremented by the "elected node" increment value if the computer currently forms a node in the tree 20, or not incremented if the computer does not form a node in the tree 20. When other computers receive the node election message, if they have not themselves already initiated resuming transmission of respective node election messages, they will resume transmission, and the node election process as described above will be repeated.
In one embodiment, after the multicast message delivery error recovery tree 20 has been established, the computers 11(n) that form nodes in the tree 20 attempt to optimize the tree. In that case, the computer 11(nLSV) that forms the node in the tree 20 with the lowest suitability value will determine whether it should continue operating as a node, or enable any children logically connected thereto to logically connect to computers that form other nodes in the tree. To accomplish that, the computer 11(nLSV) will determine whether the excess capacity of all of the other computers forming nodes in the tree 20 is greater than or equal to the number of its children. In that case, the computer 11(nLSV) will transmit node advertising messages 40 that identify its status as "resigning." When the children of computer 11(nLSV) receive such node advertising messages, they will attempt to logically connect to computers that form the other nodes in the tree 20. After each such child logically connects to another computer, it will sever the logical connection with the computer 11(nLSV), and the computer 11(nLSV) can reduce the number of children value in the node advertising messages transmitted thereby. Between the time at which the child has logically connected to the other computer and the time at which it (that is, the child) has severed the logical connection with the computer 11(nLSV), it (that is, the child) will effectively be connected to both computers, and may transmit "NACK" messages to either or both computers. This will ensure that all computers 11(n) are logically connected in the multicast message delivery error recovery tree 20. After all of the children of computer 11(nLSV) have severed their connections with computer 11(nLSV), that computer 11(nLSV) will comprise a leaf in the multicast message delivery error recovery tree 20 and will stop transmitting node advertising messages 40. These operations can be repeated through a plurality of iterations, for successive computers that are nodes with low suitability values.
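The resignation test applied by the lowest-suitability node can be sketched as a single comparison. The function name and the representation of the other nodes as (capacity, number of children) pairs, as learned from fields 44 and 47 of their node advertising messages, are assumptions made for illustration.

```python
def can_resign(own_children, other_nodes):
    """Decide whether the lowest-suitability node may advertise "resigning".

    own_children: number of children currently connected to this node
    other_nodes:  iterable of (capacity, num_children) pairs for the
                  other nodes in the tree
    """
    # Total excess capacity of the other nodes, summed over all of them.
    spare = sum(capacity - children for capacity, children in other_nodes)
    # The description uses "greater than or equal to" for this comparison.
    return spare >= own_children


print(can_resign(3, [(4, 2), (5, 4)]))
```

The sketch covers only the decision; the make-before-break handover, in which a child is briefly connected to both its old and new parent, is not modeled.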
With this background, the operations performed by a computer 11(n) in connection with the invention will be described in connection with FIGS. 4A through 4G. Generally, operations described in connection with FIG. 4 include several sequences, including
(i) a node election message transmit sequence (FIGS. 4A and 4B), depicting operations performed by a node election message transmitter maintained by the computer 11(n) in connection with transmission of node election messages,
(ii) a node election message receive sequence (FIGS. 4C and 4D), depicting operations performed by a node election message receiver maintained by the computer 11(n) in connection with reception of node election messages, and
(iii) a tree organization sequence (FIGS. 4E through 4G), depicting operations performed by a tree organization component maintained by the computer 11(n) in connection with logically connecting the computer 11(n) to a respective parent and children as appropriate,
which the computer 11(n) performs during an iteration, as described above.
As described above, a computer 11(n) will begin transmitting node election messages 30 either
(i) after being powered up and initialized,
(ii) a selected time period after a multicast message delivery error recovery tree 20 was previously established, or
(iii) in response to receipt of a node election message from another computer,
for a particular node election message time period. Thus, with reference to FIG. 4A, the computer 11(n) will, after being powered up and initialized (step 100), initially determine whether it has a priority level greater than "leaf only" (step 101). If the computer 11(n) makes a negative determination in step 101, that is, if it has a priority level of "leaf only," it will operate only as a leaf in a multicast message delivery error recovery tree 20 established for local area network 10. In that case, the computer 11(n) can wait for a node advertising message 40 from a computer that forms a node in a multicast message delivery error recovery tree 20, or it can begin transmitting child solicitation messages to enable computers that form nodes in a tree 20 to transmit node advertising messages 40.
On the other hand, if the computer 11(n) makes a positive determination in step 101, that is, if it has a priority level above "leaf only," it can operate as a node in a multicast message delivery error recovery tree 20 established in local area network 10. In that case, computer 11(n) will proceed to a series of steps to initiate transmission of node election messages 30. In that operation, the computer 11(n) will establish and initialize a node election message period timer (step 102) to be used to time the time period over which the computer 11(n) will transmit node election messages. In addition, the computer 11(n) will establish and initialize a number of nodes store (step 103), which stores a number of nodes value that will be used in the node election messages, and a node counter (step 104) that will be used to count the number of computers 11(n) from which it receives node election messages 30 with higher suitability values than its own suitability value. Initially the node counter will be set to a value of "zero." Following step 104, the computer 11(n) will determine whether the number of nodes value stored in the number of nodes store is higher than the value provided by the node counter (step 105).
If the computer 11(n) makes a positive determination in step 105, it will transmit a node election message including its priority, capacity and computer system identifier values in fields 33, 34 and 35, the number of nodes value from the number of nodes store in field 36, and the election interval value in field 37 (step 106). Following step 106, computer 11(n) will establish and initialize a node election message transmission interval timer to time the interval between transmission of node election messages, which will initially be set to a default value established for the computer 11(n) (step 107). When the computer 11(n) determines that the node election message transmission interval timer established in step 107 has timed out (step 108), if it determines that the node election message period timer has not timed out (step 109), the computer 11(n) will return to step 104 to determine whether to transmit another node election message 30. It will be appreciated that the computer 11(n) will repeat steps 104 through 109 until it determines
(i) in step 105 that the value provided by node counter is greater than or equal to the number of nodes value, or
(ii) in step 109 that the node election message period timer has timed out.
If the computer 11(n) determines in step 105 that the value provided by node counter is greater than or equal to the number of nodes value, then it has received node election messages from a number of computers in the local area network whose suitability values are greater than the suitability value of computer 11(n) corresponding to the number of nodes to be established in the multicast message delivery error recovery tree 20, and so it (that is, computer 11(n)) can stop transmitting node election messages 30.
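The transmit sequence of FIGS. 4A and 4B can be sketched as a loop over simulated time rather than real timers. The function name, the use of abstract time units, and the `node_counter_at` callback (standing in for the node counter that the receive sequence updates concurrently) are assumptions made so the sketch is deterministic.

```python
def transmit_times(period, interval, num_nodes, node_counter_at):
    """Return the simulated times at which node election messages are sent.

    period:          length of the node election message period (step 102)
    interval:        gap between transmissions (step 107)
    num_nodes:       value held in the number of nodes store (step 103)
    node_counter_at: callable giving the node counter value at time t
    """
    times = []
    t = 0
    while t < period:                        # step 109: period timer still running
        if node_counter_at(t) >= num_nodes:  # step 105 negative: outranked, stop
            break
        times.append(t)                      # step 106: transmit a message
        t += interval                        # steps 107-108: wait out the interval
    return times


# With no higher-suitability computer heard, transmission runs the full period.
print(transmit_times(10, 3, 1, lambda t: 0))
```

A computer that reaches the end of the period without breaking out of the loop is among the elected nodes.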
If the computer 11(n) receives a node election message 30 from another computer 11(n') in the local area network 10 (step 120, FIG. 4C), it will initially determine whether it (that is, the computer 11(n)) is currently also transmitting node election messages (step 121). If the computer 11(n) makes a negative determination in step 121, that is, if it determines that it is not currently transmitting node election messages, if its priority level is greater than "leaf only," the computer will start transmitting node election messages as described above in connection with FIGS. 4A and 4B (step 122). In addition, the computer 11(n) will determine whether the number of nodes value in field 36 of the received message is larger than the number of nodes value in its number of nodes store (step 123). If the computer 11(n) makes a positive determination in step 123, that is, if it determines that the number of nodes value in field 36 of the received message is larger than the number of nodes value in its number of nodes store, then the computer 11(n') from which the node election message 30 was received forms a part of a previously-established multicast message delivery error recovery tree 20 that includes a plurality of nodes. In that case, the computer 11(n) will increase the number of nodes value in its number of nodes store (step 124) and adjust its node election message period time interval to be used when the node election message transmission interval timer is next established in step 107 (step 125) for use during transmission of node election messages 30. In addition, the computer 11(n) will determine whether the suitability value of the computer 11(n') from which the message was received is greater than its suitability value (step 126). If the computer 11(n) makes a positive determination in step 126, then it determines whether it previously received a node election message 30 from computer 11(n') during the node election message transmission interval (step 127) and, if so, will increment the node counter used to control node election message transmission (step 128).
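The bookkeeping of the receive sequence can be sketched as a small state object. This is an illustration under stated assumptions, not the flow chart itself: the class and method names are hypothetical, suitability is modeled as a (priority, capacity, identifier) tuple, the number of nodes store is simply raised to the received value, and each higher-suitability sender is counted at most once so that the node counter reflects distinct computers, which is one reading of steps 126 through 128.

```python
from dataclasses import dataclass, field


@dataclass
class ElectionState:
    own: tuple                     # this computer's suitability value
    num_nodes: int = 1             # number of nodes store (step 103)
    node_counter: int = 0          # node counter (step 104)
    seen_higher: set = field(default_factory=set)  # senders already counted

    def on_election_message(self, sender_id, suitability, num_nodes):
        """Process one received node election message 30.

        Returns True while this computer may keep transmitting.
        """
        if num_nodes > self.num_nodes:           # steps 123-124
            self.num_nodes = num_nodes
        if suitability > self.own and sender_id not in self.seen_higher:
            self.seen_higher.add(sender_id)      # assumed de-duplication
            self.node_counter += 1               # step 128
        return self.node_counter < self.num_nodes
```

The returned flag corresponds to the step 105 test in the transmit sequence, which reads the same node counter and number of nodes store.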
Returning to step 109 (FIG. 4B), after the node election message period timer has timed out, the computer 11(n) will perform operations, depicted in FIG. 4E, in connection with logically connecting to other computers as parent and/or children as appropriate. Thus, after the node election message period timer has timed out, the computer 11(n) will first determine whether it is the root node in the tree (step 140). In that operation, the computer 11(n) can determine whether it is the root node by determining whether the node counter has the value zero, which would indicate that it (that is, the computer 11(n)) did not receive node election messages 30 from any other computer in the local area network with higher suitability values than that of computer 11(n). If the computer 11(n) makes a positive determination in step 140, it will attempt to form a logical connection to the parent assigned to the local area network in the portion of the tree formed therefor in the wide area network (step 141).
On the other hand, if the computer 11(n) makes a negative determination in step 140, it will determine whether it is a node other than the root node in the multicast message delivery error recovery tree 20. In that operation, the computer 11(n) can determine whether the number of nodes value stored in the number of nodes store is higher than the value provided by the node counter (step 142). If the computer 11(n) makes a positive determination in step 142, then it will form a node in the tree; on the other hand, if the computer 11(n) makes a negative determination in step 142, then it received node election messages 30 from at least a number of other computers in the local area network 10 corresponding to the number of nodes value identifying the number of nodes to be elected.
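The two tests of steps 140 and 142 reduce to a pair of comparisons on the node counter, sketched below. The function name and the string labels are hypothetical, and labeling the remaining case "leaf" is an inference from the surrounding description, in which non-elected computers connect to elected nodes as children.

```python
def role_after_election(node_counter, num_nodes):
    """Classify a computer once the node election message period has ended.

    node_counter: count of computers heard with a higher suitability value
    num_nodes:    value in the number of nodes store
    """
    if node_counter == 0:          # step 140: nobody outranked this computer
        return "root"
    if num_nodes > node_counter:   # step 142: still within the elected set
        return "node"
    return "leaf"                  # outranked by at least num_nodes computers


print(role_after_election(1, 3))
```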
Following step 141, or step 142 if a positive determination is made in that step, the computer 11(n) will first increase its priority value by the "elected node" value (step 143). Thereafter, the computer 11(n) will begin transmitting node advertising messages to identify the availability of the computer 11(n) to logically connect to other computers as children (step 144). In addition, if the computer 11(n) determined in step 140 that it is not the root node, it will respond to the node advertising messages generated by other computers which have priority, capacity and computer system identifier values in fields 43-45 which define suitability values greater than its own to attempt to logically connect to them as children, thereby to establish respective levels in the tree 20 (step 145). After the computer 11(n) has logically connected to another computer as parent of the other computer, it will increase the number of children value in field 47 of node advertising messages transmitted thereby (step 146). On the other hand, after the computer 11(n) has been accepted by another computer as a child of the other computer, it will be logically connected to that other computer and can save the identification of the other computer for use in sending "NACK" messages, and set the connected flag 48 in node advertising messages 40 transmitted thereby (step 147).
After the computer 11(n) determines that the multicast message delivery error recovery tree 20 has been established (step 148), it will determine whether its suitability value is the lowest among the computers elected nodes in the tree (step 149). The computer 11(n) can do that by, for example, determining whether the value provided by its node counter corresponds to the number of nodes value in field 49 of the node advertising messages 40 transmitted by the computers comprising nodes in the tree 20. If the computer 11(n) makes a positive determination in step 149, it will determine whether the excess capacity, as described above, of the computers comprising the other nodes is greater than the number of computers logically connected to it as children (step 150). If the computer 11(n) makes a positive determination in step 150, if it has any computers logically connected thereto as children (step 151) it will provide a value of "resigning" in the status field 50 of node advertising messages 40 transmitted thereby to enable its children to attempt to logically connect to other computers forming nodes in the tree 20 (step 152). As the computer 11(n) receives notification from each child that logically connects to another computer forming a node in the tree 20 (step 153), it will reduce the number of children indicated in field 47 of node advertising messages transmitted thereby (step 154), and, after it determines that the number of children has reduced to zero (step 155), it will stop transmitting node advertising messages (step 156) and decrease its priority value by the "elected node" value (step 157).
It will be appreciated that a system in accordance with the invention can be constructed in whole or in part from special purpose hardware or a general purpose computer system, or any combination thereof, any portion of which may be controlled by a suitable program. Any program may in whole or in part comprise part of or be stored on the system in a conventional manner, or it may in whole or in part be provided to the system over a network or other mechanism for transferring information in a conventional manner. In addition, it will be appreciated that the system may be operated and/or otherwise controlled by means of information provided by an operator using operator input elements (not shown) which may be connected directly to the system or which may transfer the information to the system over a network or other mechanism for transferring information in a conventional manner.
The foregoing description has been limited to a specific embodiment of this invention. It will be apparent, however, that various variations and modifications may be made to the invention, with the attainment of some or all of the advantages of the invention. It is the object of the appended claims to cover these and such other variations and modifications as come within the true spirit and scope of the invention.
What is claimed as new and desired to be secured by Letters Patent of the United States is

Claims

1. A method of logically organizing, in a digital data network comprising a plurality of devices interconnected by a communication link, the devices into a tree structure, each of the devices having an associated suitability value, the method comprising:
A. a node election step in which at least one of the devices is enabled to broadcast, over the communication link, at least one node election message including the respective device's suitability value, any devices broadcasting node election messages also being enabled to receive from the communication link any node election messages broadcast by other devices and to determine whether it is elected a node in the tree structure in connection with a comparison between its suitability value and suitability values in any received node election messages; and
B. a tree establishment step in which at least one device which is elected a node in the tree structure is enabled to communicate with at least one other device to facilitate the other device becoming a child of the device that is elected a node in the tree.
2. A method as defined in claim 1 in which the node election step and the tree establishment step are performed in at least one iteration, in said at least one iteration, during the node election step one device being elected a node, as a root node, in the tree structure.
3. A method as defined in claim 3 in which the node election step and the tree establishment step are performed in a series of iterations, in a first iteration the root node being elected during the node election step, and each subsequent iteration at least one other device being elected as a node in the tree structure.
4. A method as defined in claim 3 in which the at least one other device elected as a node in the tree structure, during the tree establishment step of at least one of said iterations, being enabled to communicate with the device elected as root node to facilitate becoming a child thereof.
5. A method as defined in claim 3 in which the device elected as the root node is enabled to determine the number of additional nodes to be elected during each subsequent iteration.
6. A method as defined in claim 5 in which the device elected as the root node transmits, in its at least one node election message transmitted during each iteration after the first iteration, a number of nodes value identifying the number of additional nodes to be elected during the respective iteration.
7. A method as defined in claim 6 in which each of the devices enabled to broadcast at least one node election message transmits, during each iteration after the first iteration, a number of nodes value corresponding to the number of nodes value transmitted by the device elected as the root node.
8. A method as defined in claim 1 in which each device is assigned a priority value, the respective device's suitability value being a function of its priority value.
9. A method as defined in claim 1 in which each device has an associated capacity value associated with a number of children that it can accommodate, the respective device's suitability value being a function of its priority value.
10. A method as defined in claim 1 in which said at least one device is enabled to begin broadcasting at least one node election message after it is initialized.
11. A method as defined in claim 1 in which said at least one device is enabled to broadcast a plurality of node election messages, the at least one device stopping broadcasting node election messages if it receives a node election message from another device having a higher suitability value than the suitability value associated with the at least one device.
12. A method as defined in claim 1 in which, during the tree establishment step the at least one device that is elected a node in the tree is enabled to broadcast a node advertising message over the communication link indicating that it is available as a node in the tree.
13. A method as defined in claim 12 in which the at least one device that is elected a node in the tree is enabled to broadcast a node advertising message in response to receipt of a child solicitation message transmitted by at least one other device.
14. A method as defined in claim 12 in which at least one other device, after receiving a node advertising message, communicates with the at least one device that is elected a node in the tree to facilitate becoming a child thereof.
15. A system for logically organizing devices, in a digital data network comprising a plurality of devices interconnected by a communication link, into a tree structure, each of the devices having an associated suitability value, the system comprising:
A. a node election element configured to enable at least one of the devices to broadcast, over the communication link, at least one node election message including the respective device's suitability value, and to enable any devices broadcasting node election messages to receive from the communication link any node election messages broadcast by other devices and to determine whether it is elected a node in the tree structure in connection with a comparison between its suitability value and suitability values in any received node election messages; and
B. a tree establishment element configured to enable at least one device which is not elected a node in the tree structure to communicate with at least one device which is elected a node in the tree structure to facilitate becoming a child of the device that is elected a node in the tree.
16. A system as defined in claim 15 in which the node election element and the tree establishment element are configured to operate in at least one iteration, in said at least one iteration, during the node election step one device being elected a node, as a root node, in the tree structure.
17. A system as defined in claim 16 in which the node election element and the tree establishment element are configured to operate in a series of iterations, in a first iteration the root node being elected during the node election step, and each subsequent iteration at least one other device being elected as a node in the tree structure.
18. A system as defined in claim 17 in which the node election element is configured to enable at least one other device elected as a node in the tree structure, during the tree establishment step of at least one of said iterations, to communicate with the device elected as root node to facilitate becoming a child thereof.
19. A system as defined in claim 17 in which the node election element enables the device elected as the root node to determine the number of additional nodes to be elected during each subsequent iteration.
20. A system as defined in claim 19 in which the node election element enables the device elected as the root node to transmit, in its at least one node election message transmitted during each iteration after the first iteration, a number of nodes value identifying the number of additional nodes to be elected during the respective iteration.
21. A system as defined in claim 20 in which the node election element enables each of the devices to broadcast at least one node election message transmits, during each iteration after the first iteration, a number of nodes value corresponding to the number of nodes value transmitted by the device elected as the root node.
22. A system as defined in claim 15 in which each device is assigned a priority value, the respective device's suitability value being a function of its priority value.
23. A system as defined in claim 15 in which each device has an associated capacity value associated with a number of children that it can accommodate, the respective device's suitability value being a function of its priority value.
24. A system as defined in claim 15 in which said node election element enables at least one device to begin broadcasting at least one node election message after it is initialized.
25. A system as defined in claim 15 in which said node election element enables at least one device to broadcast a plurality of node election messages, and to stop broadcasting node election messages if it receives a node election message from another device having a higher suitability value than the suitability value associated with the at least one device.
26. A system as defined in claim 15 in which the tree establishment element enables the at least one device that is elected a node in the tree to broadcast a node advertising message over the communication link indicating that it is available as a node in the tree.
27. A system as defined in claim 26 in which the tree establishment element enables at least one device that is elected a node in the tree to broadcast a node advertising message in response to receipt of a child solicitation message transmitted by at least one other device.
28. A system as defined in claim 26 in which the tree establishment element enables at least one other device, after receiving a node advertising message, to communicate with the at least one device that is elected a node in the tree to facilitate becoming a child thereof.
29. A computer program product for use in connection with a computer, the computer program product enabling said computer to, in a digital data network comprising a plurality of computers interconnected by a communication link, organize themselves into a tree structure, each of the computers having an associated suitability value, the computer program product comprising a computer-readable medium having encoded thereon:
A. a node election module configured to enable at least one of the computers to broadcast, over the communication link, at least one node election message including the respective computer's suitability value, and to enable any computers broadcasting node election messages to receive from the communication link any node election messages broadcast by other computers and to determine whether it is elected a node in the tree structure in connection with a comparison between its suitability value and suitability values in any received node election messages; and
B. a tree establishment module configured to enable at least one computer which is not elected a node in the tree structure to communicate with at least one computer which is elected a node in the tree structure to facilitate becoming a child of the computer that is elected a node in the tree.
30. A computer program product as defined in claim 29 in which the node election module and the tree establishment module are configured to operate in at least one iteration, in said at least one iteration, during the node election step one computer being elected a node, as a root node, in the tree structure.
31. A computer program product as defined in claim 30 in which the node election module and the tree establishment module are configured to operate in a series of iterations, in a first iteration the root node being elected during the node election step, and each subsequent iteration at least one other computer being elected as a node in the tree structure.
32. A computer program product as defined in claim 31 in which the node election module is configured to enable at least one other computer elected as a node in the tree structure, during the tree establishment step of at least one of said iterations, to communicate with the computer elected as root node to facilitate becoming a child thereof.
33. A computer program product as defined in claim 31 in which the node election module enables the computer elected as the root node to determine the number of additional nodes to be elected during each subsequent iteration.
34. A computer program product as defined in claim 33 in which the node election module enables the computer elected as the root node to transmit, in its at least one node election message transmitted during each iteration after the first iteration, a number of nodes value identifying the number of additional nodes to be elected during the respective iteration.
35. A computer program product as defined in claim 34 in which the node election module enables each of the computers to broadcast at least one node election message transmits, during each iteration after the first iteration, a number of nodes value corresponding to the number of nodes value transmitted by the computer elected as the root node.
36. A computer program product as defined in claim 29 in which each computer is assigned a priority value, the respective computer's suitability value being a function of its priority value.
37. A computer program product as defined in claim 29 in which each computer has an associated capacity value associated with a number of children that it can accommodate, the respective computer's suitability value being a function of its priority value.
38. A computer program product as defined in claim 29 in which said node election module enables at least one computer to begin broadcasting at least one node election message after it is initialized.
39. A computer program product as defined in claim 29 in which said node election module enables at least one computer to broadcast a plurality of node election messages, and to stop broadcasting node election messages if it receives a node election message from another computer having a higher suitability value than the suitability value associated with the at least one computer.
40. A computer program product as defined in claim 29 in which the tree establishment module enables the at least one computer that is elected a node in the tree to broadcast a node advertising message over the communication link indicating that it is available as a node in the tree.
41. A computer program product as defined in claim 40 in which the tree establishment module enables at least one computer that is elected a node in the tree to broadcast a node advertising message in response to receipt of a child solicitation message transmitted by at least one other computer.
42. A computer program product as defined in claim 40 in which the tree establishment module enables at least one other computer, after receiving a node advertising message, to communicate with the at least one computer that is elected a node in the tree to facilitate becoming a child thereof.
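The node election and tree establishment steps recited in the claims above can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions, not the claimed implementation: the names `Device`, `elect`, `attach` and `build_tree`, the integer suitability value, the tie-break on device name, the default capacity of 2, and the "join the first advertised node with room" policy are all hypothetical choices made for illustration. Message loss, timers and repeated broadcasts are not modeled.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Device:
    name: str
    suitability: int   # assumed scalar; the claims only require a comparable value
    capacity: int = 2  # assumed limit on children (compare claims 9, 23 and 37)
    parent: "Device | None" = None
    children: list = field(default_factory=list)


def elect(candidates, count=1):
    """Node election step: each candidate broadcasts its suitability value and
    counts as elected only if fewer than `count` received values beat its own."""
    # One broadcast per candidate; the name is a tie-breaker (assumption).
    broadcasts = [(d.suitability, d.name) for d in candidates]
    elected = []
    for d in candidates:
        own = (d.suitability, d.name)
        higher = sum(1 for msg in broadcasts if msg > own)
        if higher < count:
            elected.append(d)
    return elected


def attach(child, advertised_nodes):
    """Tree establishment step: the child joins the first advertised node
    that still has room. Returns the chosen parent, or None if all are full."""
    for node in advertised_nodes:
        if len(node.children) < node.capacity:
            node.children.append(child)
            child.parent = node
            return node
    return None


def build_tree(devices, extra_iterations=1, nodes_per_iteration=1):
    """First iteration elects the root; each later iteration elects
    `nodes_per_iteration` more nodes (the "number of nodes value")."""
    remaining = list(devices)
    root = elect(remaining, 1)[0]
    remaining.remove(root)
    tree_nodes = [root]
    for _ in range(extra_iterations):
        if not remaining:
            break
        newly_elected = elect(remaining, nodes_per_iteration)
        for node in newly_elected:
            remaining.remove(node)
            attach(node, tree_nodes)  # newly elected nodes join existing nodes
        tree_nodes.extend(newly_elected)
    for device in remaining:          # unelected devices become children
        attach(device, tree_nodes)
    return root
```

In this sketch the comparison in `elect` stands in for a device that stops broadcasting once it hears a higher suitability value, and the `nodes_per_iteration` argument plays the role of the number of nodes value that the root node announces for iterations after the first.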
PCT/US1999/007750 1998-04-18 1999-04-08 System and method for establishing a multicast message delivery error recovery tree in a digital network WO1999055042A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP99915339A EP1075747A1 (en) 1998-04-18 1999-04-08 System and method for establishing a multicast message delivery error recovery tree in a digital network
AU33877/99A AU3387799A (en) 1998-04-18 1999-04-08 System and method for establishing a multicast message delivery error recovery tree in a digital network
JP2000545284A JP2002512484A (en) 1998-04-18 1999-04-08 System and method for establishing a multicast message delivery error recovery tree in a digital network

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/061,849 1998-04-18
US09/061,849 US6134599A (en) 1998-04-18 1998-04-18 System and method for organizing devices in a network into a tree using suitability values

Publications (1)

Publication Number Publication Date
WO1999055042A1 (en) 1999-10-28

Family

ID=22038544

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1999/007750 WO1999055042A1 (en) 1998-04-18 1999-04-08 System and method for establishing a multicast message delivery error recovery tree in a digital network

Country Status (5)

Country Link
US (1) US6134599A (en)
EP (1) EP1075747A1 (en)
JP (1) JP2002512484A (en)
AU (1) AU3387799A (en)
WO (1) WO1999055042A1 (en)

Families Citing this family (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7050432B1 (en) * 1999-03-30 2006-05-23 International Business Machines Corporation Message logging for reliable multicasting across a routing network
US6889254B1 (en) * 1999-03-30 2005-05-03 International Business Machines Corporation Scalable merge technique for information retrieval across a distributed network
US7693042B1 (en) * 1999-06-23 2010-04-06 At&T Mobility Ii Llc Intelligent presentation network management system
US6275859B1 (en) * 1999-10-28 2001-08-14 Sun Microsystems, Inc. Tree-based reliable multicast system where sessions are established by repair nodes that authenticate receiver nodes presenting participation certificates granted by a central authority
US7117273B1 (en) * 2000-01-25 2006-10-03 Cisco Technology, Inc. Methods and apparatus for maintaining a map of node relationships for a network
EP1148426A1 (en) * 2000-04-17 2001-10-24 Gladitz, Wilhelm A method for constructing objects in a computing environment
US20050237949A1 (en) * 2000-12-21 2005-10-27 Addessi Vincent M Dynamic connection structure for file transfer
US6807578B2 (en) * 2001-03-14 2004-10-19 International Business Machines Corporation Nack suppression for multicast protocols in mostly one-way networks
US7171476B2 (en) * 2001-04-20 2007-01-30 Motorola, Inc. Protocol and structure for self-organizing network
US7251222B2 (en) * 2001-05-15 2007-07-31 Motorola, Inc. Procedures for merging the mediation device protocol with a network layer protocol
US7010622B1 (en) * 2001-06-08 2006-03-07 Emc Corporation Scalable communication within a distributed system using dynamic communication trees
US8086738B2 (en) * 2007-05-24 2011-12-27 Russell Fish Distributed means of organizing an arbitrarily large number of computers
US7096356B1 (en) * 2001-06-27 2006-08-22 Cisco Technology, Inc. Method and apparatus for negotiating Diffie-Hellman keys among multiple parties using a distributed recursion approach
US7333486B2 (en) * 2001-07-16 2008-02-19 International Business Machines Corporation Methods and arrangements for monitoring subsource addressing multicast distribution trees
US7152113B2 (en) * 2001-10-19 2006-12-19 Sun Microsystems, Inc. Efficient system and method of node and link insertion for deadlock-free routing on arbitrary topologies
US7203743B2 (en) * 2001-12-28 2007-04-10 Nortel Networks Limited Hierarchical tree-based protection scheme for mesh networks
US20050169183A1 (en) * 2002-06-14 2005-08-04 Jani Lakkakorpi Method and network node for selecting a combining point
US7305430B2 (en) * 2002-08-01 2007-12-04 International Business Machines Corporation Reducing data storage requirements on mail servers
CA2500166A1 (en) * 2002-10-29 2004-05-13 British Telecommunications Public Limited Company Method and apparatus for network management
US7313101B2 (en) * 2002-12-20 2007-12-25 Hewlett-Packard Development Company, L.P. Need-based filtering for rapid selection of devices in a tree topology network
US7995497B2 (en) * 2003-02-27 2011-08-09 Hewlett-Packard Development Company, L.P. Spontaneous topology discovery in a multi-node computer system
US6915212B2 (en) * 2003-05-08 2005-07-05 Moac, Llc Systems and methods for processing complex data sets
US7596595B2 (en) * 2003-06-18 2009-09-29 Utah State University Efficient unicast-based multicast tree construction and maintenance for multimedia transmission
US7373394B1 (en) * 2003-06-30 2008-05-13 Cisco Technology, Inc. Method and apparatus for multicast cloud with integrated multicast and unicast channel routing in a content distribution network
US6996470B2 (en) * 2003-08-01 2006-02-07 Moac Llc Systems and methods for geophysical imaging using amorphous computational processing
KR100556885B1 (en) * 2003-09-18 2006-03-03 엘지전자 주식회사 Broadcasting message architecture and transmission method
US7490089B1 (en) * 2004-06-01 2009-02-10 Sanbolic, Inc. Methods and apparatus facilitating access to shared storage among multiple computers
US7821972B1 (en) * 2005-09-29 2010-10-26 Cisco Technology, Inc. System and method for building large-scale layer 2 computer networks
US20070153317A1 (en) * 2005-12-30 2007-07-05 Sap Ag Method and system for providing location based electronic device configuration and confirmation
US7876706B2 (en) * 2006-02-28 2011-01-25 Motorola, Inc. Method and apparatus for root node selection in an ad hoc network
US20070204021A1 (en) * 2006-02-28 2007-08-30 Ekl Randy L Method and apparatus for myopic root node selection in an ad hoc network
US7697456B2 (en) * 2006-02-28 2010-04-13 Motorola, Inc. Method and apparatus for omniscient root node selection in an ad hoc network
US7570927B2 (en) * 2006-06-16 2009-08-04 Motorola, Inc. Decentralized wireless communication network and method having a plurality of devices
WO2008062630A1 (en) * 2006-11-21 2008-05-29 Nec Corporation Wireless system designing method, wireless system designing system, wireless system designing apparatus and program
JP2008293197A (en) * 2007-05-23 2008-12-04 Brother Ind Ltd Information distribution system, terminal unit and program for use in the same system, and information processing method
US7872990B2 (en) * 2008-04-30 2011-01-18 Microsoft Corporation Multi-level interconnection network
US7872993B2 (en) * 2008-10-30 2011-01-18 Alcatel Lucent Method and system for classifying data packets
JP5163726B2 (en) * 2010-10-07 2013-03-13 株式会社日立製作所 Content transmission device, content reception device, and content transmission method
US8520676B2 (en) * 2010-11-09 2013-08-27 Cisco Technology, Inc. System and method for managing acknowledgement messages in a very large computer network
US9219682B2 (en) * 2012-11-05 2015-12-22 Cisco Technology, Inc. Mintree-based routing in highly constrained networks
US11082336B1 (en) 2020-01-15 2021-08-03 Cisco Technology, Inc. Automatic configuration and connection of heterogeneous bandwidth managed multicast fabrics

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5355371A (en) * 1982-06-18 1994-10-11 International Business Machines Corp. Multicast communication tree creation and control method and apparatus
US4864559A (en) * 1988-09-27 1989-09-05 Digital Equipment Corporation Method of multicast message distribution
US5684961A (en) * 1995-04-28 1997-11-04 Sun Microsystems, Inc. System for defining multicast message distribution paths having overlapping virtual connections in ATM networks and assigning identical labels to overlapping portions of the virtual channels
US5805578A (en) * 1995-10-27 1998-09-08 International Business Machines Corporation Automatic reconfiguration of multipoint communication channels

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
KIM J L ET AL: "A DISTRIBUTED ELECTION PROTOCOL FOR UNRELIABLE NETWORKS", JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, vol. 35, no. 1, 25 May 1996 (1996-05-25), pages 35 - 42, XP000591142, ISSN: 0743-7315 *
KING C -T ET AL: "RELIABLE ELECTION IN BROADCAST NETWORKS", JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, vol. 7, no. 3, 1 December 1989 (1989-12-01), pages 521 - 540, XP000081901, ISSN: 0743-7315 *
SANJOY PAUL ET AL: "RELIABLE MULTICAST TRANSPORT PROTOCOL (RMTP)", IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, vol. 15, no. 3, 1 April 1997 (1997-04-01), pages 407 - 420, XP000683937, ISSN: 0733-8716 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7747741B2 (en) 2000-04-07 2010-06-29 Net App, Inc. Method and apparatus for dynamic resource discovery and information distribution in a data network
US6748447B1 (en) 2000-04-07 2004-06-08 Network Appliance, Inc. Method and apparatus for scalable distribution of information in a distributed network
US6993587B1 (en) 2000-04-07 2006-01-31 Network Appliance Inc. Method and apparatus for election of group leaders in a distributed network
US7346682B2 (en) 2000-04-07 2008-03-18 Network Appliance, Inc. System for creating and distributing prioritized list of computer nodes selected as participants in a distribution job
US6718361B1 (en) 2000-04-07 2004-04-06 Network Appliance Inc. Method and apparatus for reliable and scalable distribution of data files in distributed networks
EP1535182A1 (en) * 2002-06-21 2005-06-01 International Business Machines Corporation Method and structure for autoconfiguration of overlay networks by automatic selection of a network designated router
EP1535182A4 (en) * 2002-06-21 2008-06-04 Ibm Method and structure for autoconfiguration of overlay networks by automatic selection of a network designated router
US8225084B2 (en) 2003-06-10 2012-07-17 Hitachi, Ltd. Content transmitting device, content receiving device and content transmitting method
US8010792B2 (en) 2004-01-16 2011-08-30 Hitachi, Ltd. Content transmission apparatus, content reception apparatus and content transmission method
US8468350B2 (en) 2004-01-16 2013-06-18 Hitachi, Ltd. Content transmission apparatus, content reception apparatus and content transmission method
US7836507B2 (en) 2004-03-19 2010-11-16 Hitachi, Ltd. Contents transmitter apparatus, contents receiver apparatus and contents transmitting method
US8209534B2 (en) 2004-03-19 2012-06-26 Hitachi, Ltd. Contents transmitter apparatus, contents receiver apparatus and contents transmitting method
WO2015040358A1 (en) * 2013-09-20 2015-03-26 Tcs John Huxley Europe Limited Messaging system

Also Published As

Publication number Publication date
AU3387799A (en) 1999-11-08
JP2002512484A (en) 2002-04-23
EP1075747A1 (en) 2001-02-14
US6134599A (en) 2000-10-17

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AL AM AT AT AU AZ BA BB BG BR BY CA CH CN CU CZ CZ DE DE DK DK EE EE ES FI FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SK SL TJ TM TR TT UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SL SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
ENP Entry into the national phase

Ref country code: JP

Ref document number: 2000 545284

Kind code of ref document: A

Format of ref document f/p: F

NENP Non-entry into the national phase

Ref country code: KR

WWE Wipo information: entry into national phase

Ref document number: 1999915339

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 1999915339

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWW Wipo information: withdrawn in national office

Ref document number: 1999915339

Country of ref document: EP