PROCESS FOR DETERMINING IN SERVICE
This invention relates to networked devices which act in concert with one another and give the appearance of a single device, an Orchestrated Entity, when interacting with devices which are not part of the Orchestrated Entity. More particularly, this invention relates to a process for the Orchestrated Entity to determine whether each device comprising part of the Orchestrated Entity can declare itself "In Service".
With the growth of distributed operations in telecommunications and computing, there are now often collections of equipment and software, i.e., devices, which operate in concert with one another in providing some service to the larger telecommunications, computing, electrical, or electro-mechanical system in which they reside.
This can be as simple as a CPU which accesses a database stored on a separate server in responding to a plurality of workstations or can consist of a more elaborate collection of equipment.
For example, in a computer environment, a group of computers may each act as a tasking station to provide information to a larger group of workstations. No tasking station computer contains the populated databases needed to perform all the requested tasks for the served workstations. Instead, when information from a database is required, the tasking station computer accesses a server on which the populated databases reside. The separation of server from tasking station allows efficient use of system resources. For example, the server can have greater memory and speed than the querying tasking stations. Moreover, more than a single server can be used to improve response time and to segregate specific types of data. Further, to assure the efficient utilization of all servers, several controllers can be used to monitor data requests from the tasking stations and to direct and distribute the tasking stations' requests among all servers. In this example, communication between the servers and the tasking stations must be through the controllers. This interactive subset of system components, from the perspective of the remainder of the system, could just as well be a single piece of equipment. It is unimportant which workstation connects to which tasking station, since each tasking station has the potential to provide the same results in response to a request from the workstation. This seamless collaboration of devices providing a service to the other resources in the system is referred to hereafter as an orchestrated entity (OE).
As long as the OE is fully functional, that is, each and every device making up the OE is functioning and the communication lines between the devices are operational, the OE can properly serve the other system resources. However, should any device or communication line go down, there is the potential that the OE may not be able to adequately serve the other system resources. As examples, the loss of a server may mean that pertinent database information cannot be accessed; the loss of a controller, that not all servers are accessible; and the loss of a tasking station, that requests from other system resources accessing the OE through that particular tasking station will not be recognized by the OE. On the other hand, OEs can be constructed to have a certain resilience against loss of one or more devices within the OE, being capable of operating without all devices operating.
The problem, then, is to determine whether the OE is sufficiently operational at any point in time to adequately provide the desired service to the other network resources. In conventional systems, this has been the task of a master device or master node which monitors the condition of each other device and communication channel. If a device or communication channel does not respond to the master node's direct polling, that device is declared "Out of Service". Based on the master node's polling results, the master device then determines whether the OE, as a whole, is sufficiently operational to adequately provide the desired service to the other network resources and, if not, declares the OE "Out of Service" to the other network resources which must then either seek the service from alternative system resources or await the OE returning to "In Service" status.
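The conventional master-node approach just described can be sketched as follows. This is a minimal illustration only, not taken from any particular system: the function name `master_poll`, the representation of each device as a callable that reports whether it answered the poll, and the `min_in_service` threshold standing in for "sufficiently operational" are all assumptions made for the example.

```python
from typing import Callable, Dict


def master_poll(devices: Dict[str, Callable[[], bool]],
                min_in_service: int) -> dict:
    """Poll each device directly, as a master node would, and derive OE status.

    `devices` maps a device name to a callable returning True when the device
    answers the poll (hypothetical representation). `min_in_service` is an
    assumed threshold for "sufficiently operational".
    """
    device_status = {}
    for name, responds in devices.items():
        # A device that does not answer the master's poll is declared Out of Service.
        device_status[name] = "In Service" if responds() else "Out of Service"

    in_service_count = sum(
        1 for state in device_status.values() if state == "In Service"
    )
    oe_status = "In Service" if in_service_count >= min_in_service else "Out of Service"
    return {"devices": device_status, "oe": oe_status}


# Example: one of three devices fails to answer the poll.
example_devices = {
    "server1": lambda: True,
    "server2": lambda: False,
    "controller1": lambda: True,
}
result = master_poll(example_devices, min_in_service=2)
```

Note that every status decision in this sketch is made in one place, which is why a failure of the master node leaves the status of the whole OE undetermined.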
The conventional approach to determining resource status for an OE suffers from several problems. First, the master node must be selected and connected to each device and line in the OE. Second, if the master node fails, the OE must be declared out of service because the status of other devices and lines in the OE cannot be otherwise determined.
There is therefore a need for an approach for determining whether devices in the OE are In Service and whether the OE is In Service without the use of a master polling device.
Each and Every Device Comprising Part of the OE determines OE Status for Itself.
This problem is solved and a major advance over the prior art is achieved by the instant invention, which provides a distributed method for each device in an OE to self-diagnose whether the OE should be considered by that device to be In Service. Each device, as part of the OE, independently determines whether sufficient other devices comprising the OE are operational for the device making the determination to declare itself In Service.
Conventional processor-based electronic equipment and software, i.e., devices, are capable of polling, that is, sending a signal through a communication line to another device, receiving a signal in response, and recognizing that responding signal. In one type of polling, called "Echo" polling, the initiating device sends a signal which is simply returned by the polled device. In a second type of polling, the initiating device sends a more complex signal, one which includes an identifier of the initiating device, and the responding device likewise responds with a more complex signal, one which includes an identifier of the responding device.
Both types of polling are used in providing device status to monitoring equipment.
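The two types of polling can be illustrated with a short sketch. The names `Signal`, `Device`, `echo_poll`, and `identified_poll` are hypothetical, and the in-memory method calls stand in for signals sent over a communication line; no particular signal format is implied.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Signal:
    payload: str
    sender_id: Optional[str] = None  # absent in a plain echo signal


class Device:
    def __init__(self, device_id: str) -> None:
        self.device_id = device_id

    def answer_echo(self, signal: Signal) -> Signal:
        # Echo polling: the polled device simply returns the signal it received.
        return signal

    def answer_identified(self, signal: Signal) -> Signal:
        # Second type: the response carries the responding device's identifier.
        return Signal(payload=signal.payload, sender_id=self.device_id)


def echo_poll(initiator: Device, target: Device) -> bool:
    """Return True when the polled device returns the signal unchanged."""
    sent = Signal(payload="poll")
    return target.answer_echo(sent) == sent


def identified_poll(initiator: Device, target: Device) -> Optional[str]:
    """Send a signal identifying the initiator; return the responder's identifier."""
    sent = Signal(payload="poll", sender_id=initiator.device_id)
    reply = target.answer_identified(sent)
    return reply.sender_id


station = Device("tasking-station-1")
server = Device("server-1")
```

With echo polling the initiating device learns only that something answered; with the second type it also learns which device answered.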
The instant invention recognizes and implements a series of rules discovered by the inventors which, through device polling, permit the devices comprising the OE to self-diagnose OE status.
To simplify this discussion, a device which relays signals from an initiating device to a responding device is hereafter called a "hub". An initiating device and a responding device are both called "nodes". The pathways by which signals are sent between the nodes and hubs are called "lines". Devices are said to "talk" to one another when an initiating signal results in a responding signal, regardless of the type of signal.
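The terminology above can be pictured as a small graph model. The sketch below rests on stated assumptions: lines are treated as undirected pairs of device names, only hubs relay signals, a failed device is listed in a `down` set, and two nodes are taken to "talk" when a signal can reach the target over operational lines and hubs. The function name `can_talk` and the device names are illustrative only.

```python
from collections import deque
from typing import FrozenSet, Iterable, Set, Tuple


def can_talk(start: str,
             target: str,
             lines: Iterable[Tuple[str, str]],
             hubs: Set[str],
             down: FrozenSet[str] = frozenset()) -> bool:
    """Breadth-first search from `start` to `target`, relaying only through hubs."""
    if start in down or target in down:
        return False
    lines = list(lines)
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for a, b in lines:
            if current not in (a, b):
                continue
            neighbor = b if current == a else a
            if neighbor in seen or neighbor in down:
                continue
            if neighbor == target:
                return True
            # Only a hub passes the signal onward; other nodes do not relay.
            if neighbor in hubs:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


# Two tasking stations and a server (nodes) joined through a controller (hub).
example_lines = [("station1", "controller1"),
                 ("controller1", "server1"),
                 ("station1", "station2")]
example_hubs = {"controller1"}
```

In this example `station1` talks to `server1` through the hub, but `station2` does not, because its only line leads to `station1`, which is a node and does not relay.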