WO2002093399A1 - Managing a remote device - Google Patents

Managing a remote device

Info

Publication number
WO2002093399A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
disp
work
server
report
Prior art date
Application number
PCT/US2002/014885
Other languages
French (fr)
Inventor
Marcio Cravo De Almeida
Nelson Alves Da Silva Filho
Agostinho De Arruda Villela
Andre Araujo Da Fosenca
Marcelo Salim Da Silva
Original Assignee
Automatos, Inc.
Priority date
Filing date
Publication date
Priority claimed from US09/853,839 (US20020169871A1)
Priority claimed from US09/954,819 (US20030055931A1)
Application filed by Automatos, Inc.
Publication of WO2002093399A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0876 Aspects of the degree of configuration automation
    • H04L41/0883 Semiautomatic configuration, e.g. proposals from system
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0803 Configuration setting
    • H04L41/0823 Configuration setting characterised by the purposes of a change of settings, e.g. optimising configuration for enhancing reliability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/085 Retrieval of network configuration; Tracking network configuration history
    • H04L41/0853 Retrieval of network configuration; Tracking network configuration history by actively collecting configuration information or by backing up configuration information
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/21 Monitoring or handling of messages
    • H04L51/222 Monitoring or handling of messages using geographical location information, e.g. messages transmitted or received in proximity of a certain spot or area
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/04 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks

Definitions

  • This invention relates to managing a remote device, including obtaining data from the remote device and presenting the data to a client device.
  • the invention is directed to obtaining data from a device using an agent.
  • This aspect includes receiving a plug-in containing system calls for obtaining the data from the device, loading the plug-in into the agent, obtaining the data from the device using the system calls, and transmitting the data over an external network using one or more of a plurality of protocols.
  • the agent may include shared libraries containing system calls for obtaining other data from the device.
  • the shared libraries may be loaded into the agent when the plug-in is loaded.
  • the data may be obtained from the device periodically, such as every minute.
  • the plurality of protocols may include simple mail transfer protocol (SMTP), hyper text transfer protocol (HTTP), and secure sockets layer (SSL) protocol. Data transmission may be effected using at least one of a proxy and socket.
  • the agent may reside on an internal network that includes the device.
  • a machine may be selected on the internal network to transmit the data over the external network.
  • the external network may include the Internet.
  • the agent may reside on the device.
  • the agent may reside on a machine located on the internal network that is not the device.
  • the network may include a network device located on the internal network and the agent may reside on a server that is also on the internal network.
  • the data may relate to one or more of the following: a processor on the device, memory on the device, a hard drive on the device, the internal network on which the device is located, and software installed on the device.
  • the invention is directed to providing, to a client, data that was obtained by an agent from a remote device on an internal network.
  • This aspect includes receiving the data via an external network, at least some of the data being received periodically, formatting the data, and making the formatted data accessible to a client via the external network.
  • This aspect may include one or more of the following features.
  • Formatting the data may include generating a report based on the data.
  • the report may be a natural language report.
  • Formatting the data may include generating a display based on the data and updating the display periodically as new data is received periodically via the external network. The data may be received every minute.
  • Formatting the data may include determining if the data indicates that an operational parameter of the device exceeds a preset limit and generating a report to a client indicating that the operational parameter exceeds the preset limit.
  • the external network may include the Internet.
  • Making the formatted data accessible to the client may include providing a World Wide Web site through which the data can be accessed by the client.
  • the formatted data may be made accessible to the client using wireless application protocol.
  • In another general aspect of the invention, a method includes automatically and repeatedly collecting data indicative of an operating state of a machine, and automatically transmitting information related to the collected data to a location remote from the machine.
  • the information is transmitted in the form of electronic mail messages complying with a standard electronic mail messaging protocol, such as a Simple Mail Transfer Protocol.
  • An article comprising a machine-readable medium on which are tangibly stored machine-executable instructions for monitoring a computer includes instructions operable to cause a processor to perform the method of the first general aspect of the invention.
  • Embodiments of the invention may include one or more of the following features.
  • a monitoring computer receives the electronic mail messages and analyzes the information to derive performance measures.
  • the monitoring computer generates a report embodying the performance measures and makes the report available electronically, for example, from a web site.
  • the report includes a natural language document expressed in a natural language format.
  • the machine may be a network server, a desktop computer, or an intelligent appliance.
  • the data collected includes a time-ordered sequence of performance measurements taken at fixed time intervals.
  • The collected data include, for example, measurements of CPU usage, process queue length, memory usage, memory paging rate, disk usage, network usage, paging space occupancy, file system occupancy, and process resource usage.
  • the collected data are typically collected from a registry, a system call, a virtual file system, a virtual device, or an input/output control call to a device.
  • the information related to the collected data is compressed and encrypted for inclusion in the electronic mail message.
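
As a rough illustration of packaging collected data for e-mail transport, the following Python sketch compresses a list of samples and attaches it to a standard e-mail message. The function names, the JSON serialization, the subject line and the file name are assumptions made for this example and do not come from the description; encryption, which the description also calls for, is left out to keep the sketch within the standard library.

```python
import bz2
import json
from email.message import EmailMessage


def build_sample_email(samples, sender, recipient):
    """Package collected samples as a compressed attachment on an e-mail message."""
    # Hypothetical serialization: JSON text compressed with bz2.
    payload = bz2.compress(json.dumps(samples).encode("utf-8"))
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "performance samples"
    msg.set_content("Collected performance data attached.")
    # Encryption is intentionally omitted in this sketch.
    msg.add_attachment(
        payload,
        maintype="application",
        subtype="octet-stream",
        filename="samples.json.bz2",
    )
    return msg


def extract_samples(msg):
    """Receiver side: take the first attachment, decompress it and parse it."""
    for part in msg.iter_attachments():
        return json.loads(bz2.decompress(part.get_content()).decode("utf-8"))
    return None
```

A message built this way could be handed to any SMTP client for delivery; the monitoring side would call something like `extract_samples` before analysis.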
  • In another general aspect of the invention, a method includes automatically and repeatedly receiving electronic mail messages that include information related to remotely collected data.
  • the collected data are indicative of a performance of a machine and the electronic mail messages comply with a standard electronic mail messaging protocol.
  • the method also includes automatically analyzing the information to determine the performance of the machine.
  • An article comprising a machine-readable medium on which are tangibly stored machine-executable instructions for monitoring a remote machine includes instructions operable to cause a machine to perform the method of the third general aspect of the invention.
  • Embodiments of the invention may include one or more of the following features.
  • the information related to the remotely collected data is extracted from the electronic mail messages.
  • the collected data is a time ordered sequence of performance measurements and analyzing the collected data includes comparing at least some of the performance measurements with a corresponding threshold value to determine whether the performance measurements are within a range of acceptable values.
  • the analysis also includes determining the number of performance measurements that are within the range of acceptable values.
  • a natural language report is generated by selecting items of information to be added to the report based on the analysis of the information included in the email messages.
  • the items of information are, for example, selected based on the comparison of the performance measurements to the threshold values or based on the number of performance measurements that are within the range of acceptable values.
  • the natural language report typically includes a natural language sentence or a graphical display.
  • the natural language sentence may include a measurement value or a threshold value. Part of the natural language sentence is sometimes enhanced, for example, using bold typeface, italicized typeface, colored typeface, underlining, or a different font size from the rest of the sentence to draw attention to the sentence.
  • the natural language sentence includes a hyperlink to more detailed information about a section of the sequence of performance measurements.
  • An electronic mail message that includes the report is generated and transmitted over a network.
  • FIG. 1 is a view of a network that includes an internal network having devices to be monitored by an agent.
  • Figs. 2 to 9 and 28 to 41 show installation screens for the agent, including the relay portion of the agent.
  • Fig. 10 is a flowchart showing a process for monitoring a device on the internal network.
  • Fig. 11 is a flowchart showing a process for providing data from a monitored device to a user.
  • Figs. 12 to 26 show Web pages for viewing the data from the monitored device.
  • Fig. 27 shows a computer on which the processes of Figs. 10 and/or 11 may be implemented.
  • Figs. 42 to 51 show a cellular telephone for viewing data obtained by the agent.
  • Figs. 52a, 52b and 53 show Web pages for enrolling in a service in order to download the agent.
  • FIG. 54 shows a system for monitoring a server
  • FIG. 55A is a table of sampling periods
  • FIG. 55B is a table of sources of data indicative of an operating state
  • FIG. 55C shows kinds of data collected
  • FIG. 55D shows data contained within a rule for analyzing data
  • FIG. 56 is a flow chart of the process of collecting data from the server
  • FIG. 57 is a flow chart of the process of transmitting the collected data
  • FIG. 58 is a flow chart of the process of analyzing the collected data and generating a report
  • FIG. 59 is a block diagram of the structure of a report
  • FIG. 60 is a flow chart of the process of installing agent software
  • FIGS. 61-91 are screenshots of the process of FIG. 60.
  • FIG. 1 shows a network system 10.
  • Network system 10 includes an internal network 11, such as a local area network (LAN), and an external network 12, such as the Internet.
  • Firewall 14 allows messages, such as e-mail, to be exchanged between devices (e.g., computers) on internal network 11 and external network 12. However, firewall 14 does not permit devices on external network 12 to directly access data stored on internal network 11.
  • Internal network 11 contains several devices. These devices may be computers with network interface cards, including servers and desktop computers, and/or network peripherals, such as routers, hubs or switches. Internal network 11 includes three desktop computers 16, 17 and 19, server 20, router 13 and switch 18. Other devices may also be included in addition to, or instead of, these devices.
  • External network 12 contains a server 21, which has access to a database 22.
  • server 21 is one or more World Wide Web (or simply "Web") servers that are capable of receiving data, storing the data in database 22, processing the data, and hosting a Web site that makes the processed data accessible to client devices, directly or indirectly via the Internet.
  • a computer program known as an "agent" is installed on a device, such as computer 19, on internal network 11.
  • the agent permits a remote client device to manage computer 19 and to monitor computer 19 and other devices on internal network 11. This is done through the use of communications provided from the agent to server 21.
  • the communications may be transmitted via e-mail using simple mail transfer protocol (SMTP), hyper text transfer protocol (HTTP) or secure sockets layer (SSL) protocol.
  • SSL is a protocol developed by Netscape® for transmitting private documents over the Internet. SSL works by using a public key to encrypt data that is transferred over an established SSL connection. Additionally, the communications might have to have additional provisions for crossing through a firewall, such as supporting authenticated proxies and the like. More than one agent may be installed on a single network.
  • Each agent 24 is comprised of three core software components: an engine 25, one or more plug-ins 26, and a relay 27. These core components may run on the same device or on different devices.
  • engine 25 and plug-ins 26 run on computer 19 and relay 27 runs on server 20.
  • Plug-ins 26 are installable computer programs that are responsible for collecting the state of hardware, operating systems and/or applications, in a device that is being managed/monitored by agent 24.
  • Examples of operating systems include, but are not limited to, the Microsoft® Windows® family (Intel 8086-like hardware platform), including NT4® (Workstation, Server, Terminal Server), Windows2000® (Professional, Server, Advanced Server) and Windows9x® (95 (all versions), 98 (all versions) and ME (Millennium)), and Linux versions with kernel 2.2 or 2.4 (RedHat 6.2 and above, Conectiva 6.0 and above).
  • the plug-ins constitute shared libraries containing system calls for collecting data from a device.
  • Engine 25 is a computer program that is responsible for controlling plug-ins 26, grouping the collected data and sending the data to relay 27 using, e.g., transmission control protocol/internet protocol (TCP/IP).
  • Relay 27 is a computer program that is responsible for sending the collected data to server 21 over the Internet (or, more generally, external network) via, e.g., SMTP, HTTP or SSL.
  • Relay 27 need not be installed in all computers on internal network 11.
  • a client can choose to install relay 27 on a single computer on internal network 11 with Internet access and direct all agents running on internal network 11 to send data to that one relay, which will then send the data to server 21.
  • Agent 24 may be installed on the device to be monitored, as is the case here, or it may be stored on another device (e.g., a server) on the same internal network as the device to be monitored (which is the case for network peripherals management).
  • relay 27 is configured to permit functions such as sending and receiving messages using e-mail or HTTP or SSL.
  • Engine 25 is then executed. After engine 25 is executed for the first time, it calls all the installed plug-ins and reads configuration information contained therein. Engine 25 creates a schedule to call the plug-ins at periodic time intervals. Once engine 25 is up and running, engine 25 will, at the time intervals, call the plug-ins. For example, a plug-in can be scheduled to execute every minute, every 5 minutes, and so on. After each plug-in executes, the plug-in returns data that it collected to engine 25.
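
The scheduling behavior described above can be sketched in Python as follows. This is only an illustrative model under stated assumptions: the `Engine` class, its method names and the use of simulated time (instead of real timers) are inventions for this example, not the agent's actual design.

```python
import heapq


class Engine:
    """Toy engine: calls each registered plug-in at its own fixed interval."""

    def __init__(self):
        self._queue = []    # heap of (next_due_time_s, registration_order, name)
        self._plugins = {}  # name -> (collect_callable, interval_s)

    def register(self, name, collect, interval_s):
        """Add a plug-in and schedule its first call one interval from time 0."""
        self._plugins[name] = (collect, interval_s)
        heapq.heappush(self._queue, (interval_s, len(self._plugins), name))

    def run_until(self, end_time_s):
        """Simulate the schedule up to end_time_s.

        Returns a list of (time_s, plugin_name, data) tuples in call order.
        """
        collected = []
        while self._queue and self._queue[0][0] <= end_time_s:
            due, order, name = heapq.heappop(self._queue)
            collect, interval_s = self._plugins[name]
            # The plug-in returns the data it collected to the engine.
            collected.append((due, name, collect()))
            heapq.heappush(self._queue, (due + interval_s, order, name))
        return collected
```

For example, registering one plug-in with a 60-second interval and another with a 300-second interval and calling `run_until(300)` yields five calls to the first and one call to the second.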
  • “Sysinfo” collects information regarding the configuration of the entire system from the point of view of the system's operating system.
  • Vmstat collects information regarding the CPU usage and memory usage of the computer system where the plug-in is installed.
  • Iostat collects information regarding the disk I/O usage of the computer system where the plug-in is installed.
  • Netstat collects information regarding the network statistics of the computer system where the plug-in is installed.
  • Fsinfo collects information regarding the file system of the computer system where the plug-in is installed.
  • “Psinfo” collects information regarding the processes that are running on the computer system where the plug-in is installed.
  • “Swpinfo” collects information regarding the swap area of the computer system where the plug-in is installed.
  • “Lvminfo” collects information regarding the logical volume manager of the computer system where the plug-in is installed.
  • “SQL Server”, where “SQL” stands for "Structured Query Language”, collects information regarding the state of a Microsoft® SQL SERVER 2000® database server on internal network 11.
  • the “SQL SERVER plug-in” collects data that enables server 21 to generate a detailed report regarding the configuration, performance, etc. of the SQL SERVER 2000® database server.
  • "Network” collects information from network devices that are connected to internal network 11, i.e., devices that are not physically part of the device on which agent resides, but are in the same internal network.
  • “Oracle” plug-in collects information regarding the state of an Oracle® database server on internal network 11. The Oracle plug-in collects data that enables server 21 to generate a report regarding the configuration, performance, etc. of the Oracle® database server.
  • Engine 25 receives the collected data from plug-ins 26 and stores the collected data in a file in a binary and, in this case, proprietary format. Engine 25 compresses the file using a compression technique, such as the BZ2 compression method. Engine 25 sends the compressed data to the relay, which is responsible for encrypting the data.
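
A minimal sketch of storing samples in a binary file format and compressing them with BZ2 is shown below. The record layout (a 32-bit timestamp followed by two 32-bit floats) is purely hypothetical, since the description only says that the format is binary and proprietary.

```python
import bz2
import struct

# Hypothetical record layout: unsigned 32-bit timestamp (seconds),
# then CPU % and memory % as 32-bit floats, little-endian.
RECORD = struct.Struct("<Iff")


def pack_samples(samples):
    """Serialize (timestamp, cpu, mem) tuples to binary and compress with bz2."""
    raw = b"".join(RECORD.pack(t, cpu, mem) for t, cpu, mem in samples)
    return bz2.compress(raw)


def unpack_samples(blob):
    """Reverse of pack_samples: decompress and decode the fixed-size records."""
    raw = bz2.decompress(blob)
    return [RECORD.unpack_from(raw, off) for off in range(0, len(raw), RECORD.size)]
```

Fixed-size records keep decoding simple on the receiving side; a real format would likely also carry a version field and a device identifier.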
  • Relay 27 receives data collected by one or more agents on internal network 11, encrypts the data, and sends the data through the Internet to server 21, where the data is analyzed.
  • Relay 27 can run in a device other than the monitored (shown) device and can receive connections from more than one agent simultaneously.
  • the relay's connection to the Internet may be dial-up or permanent and may support SMTP, HTTP and/or SSL.
  • the relay supports proxies and SOCKS (Windows® sockets), making it easier for outbound connections to go through firewalls.
  • relay 27 uses two methods of encryption. The encryption method that relay 27 selects corresponds to the transfer protocol that relay 27 uses to send the data to server 21.
  • relay 27 uses the encryption method that is available from the OpenSSL library. In this embodiment, SSL version 3/Transport Layer Security (TLS) version 1 with Rivest, Shamir, and Adleman (RSA), Triple Data Encryption Standard (3DES) is used with a key of 128.
  • RSA is a public-key encryption process developed by RSA Data Security, Inc. The RSA process is based on the fact that there is no efficient way to factor very large numbers. Deducing an RSA key, therefore, requires large amounts of computer processing power and time. The RSA process has become the de facto standard for industrial-strength encryption.
  • DES is a popular symmetric-key encryption method that uses a 56-bit key.
  • relay 27 encrypts the data using the Sapphire symmetrical encryption process, in which the key used is a session key. This means that the key will only be used once. The key used is 128 bits. The server needs this key for decryption. Therefore, relay 27 uses the RSA asymmetrical encryption process to encrypt the session key using a 1024-bit key.
  • Server 21 includes a computer program 29 to receive the encrypted and compressed data from agent 24, decrypt and decompress the data, and store the data in a database 22.
  • Database 22 may be part of, or external to, server 21.
  • Computer program 29 also retrieves the data from database 22 and presents the data to a client 30.
  • Computer program 29 may include a Web server module, which formats the data and makes the data accessible as a Web page or even a WAP (Wireless Application Protocol) page. The formatting may also include generating a report in Adobe PDF format or using Java applets for displaying real-time graphics of data collected by the agents.
  • An additional form of communicating information being collected by the agents that can be employed by server 21 is notifications.
  • Notifications are "real time" alerts sent every time a certain event happens (such as a threshold being exceeded) to portable communication devices such as cellular phones, pagers, etc.
  • real-time is defined roughly by the data sampling rate of the agent and any delays associated with data transmission.
  • the notification process may operate as follows. The user can specify occurrences that prompt a notification and the necessary configuration. For example, the user can be notified in response to changes in CPU usage, memory usage, disk I/O, network I/O, file system/logical drive utilization, and the status of a process.
  • CPU Utilization has the high point set to 80% and low point to 50%.
  • the following scenarios may occur: (1) The user has the high point flag set to false and the value is below the high point. (2) The value reaches the high point and the flag is set to false. In this case the user receives the form of notification chosen and the high point flag is set to true. (3) The value is above the high point and the high point flag is true. Nothing is done here, since the user has already been notified. (4) The value is below the high point, above the low point and the high point flag is true. Nothing is done here. (5) The value is below the low point and the high point flag is true. The user is notified that the value reached the low point and the high point flag is set to false.
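
The five scenarios above amount to a two-threshold (hysteresis) state machine. The Python sketch below is one way to express it; the class name, method name and return values are assumptions for illustration, and the default limits simply reuse the 80% and 50% CPU utilization example.

```python
class ThresholdNotifier:
    """Tracks a high point flag so each crossing triggers only one notification."""

    def __init__(self, high=80.0, low=50.0):
        self.high = high
        self.low = low
        self.high_flag = False  # True once the user has been told about the high point

    def update(self, value):
        """Process one sample; return "high", "low", or None (no notification)."""
        if not self.high_flag and value >= self.high:
            # Scenario 2: value reaches the high point while the flag is false.
            self.high_flag = True
            return "high"
        if self.high_flag and value < self.low:
            # Scenario 5: value drops below the low point while the flag is true.
            self.high_flag = False
            return "low"
        # Scenarios 1, 3 and 4: nothing to report.
        return None
```

Keeping the flag between samples is what prevents a value hovering around the high point from producing a stream of repeated alerts.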
  • Notifications in response to the status of a process function analogously.
  • the user provides the name of the processes to be monitored.
  • a user is notified once when the process stops running and receives a notification when the process starts running again.
  • Computer program 29 also analyzes the data collected from a device in order to produce a natural language and conclusive report.
  • natural language means a human-readable format that can be presented and understood by, e.g., a network administrator or the like.
  • Computer program 29 generates the reports according to a rule-based system. For each of the reports there are sets of rules that determine what goes in the report.
  • computer program 29 includes the following software modules (called "wizards") for generating different types of reports.
  • Performance Wizard Service delivered through the Internet analyzes the foregoing performance of computational servers and presents results by means of conclusive, natural language reports.
  • Consolidated Performance Wizard Service delivered through the Internet analyzes the foregoing performance of a group of computational servers, as a whole, and presents the results by means of conclusive, natural language reports.
  • Capacity Wizard Service delivered through the Internet infers the future performance behavior of computational servers, studies possible upgrades, and presents results by means of conclusive, natural language reports.
  • Consolidated Capacity Wizard Service delivered through the Internet infers the future performance of a group of computational servers, as a whole, and possible upgrades, and presents the results by means of conclusive, natural language reports.
  • Real Time Monitoring (RTM) Service delivered through the Internet shows, via an Internet browser or WAP (Wireless Application Protocol)-enabled device (such as a mobile phone or notepad), the updated status of the computational resources (such as memory usage, CPU usage, disk usage and network interface usage) of a computer.
  • the service can also send alerts by WAP, SMS (Short Message System), e-mail or similar electronic communication channels whenever the consumption of a computational resource exceeds pre-defined thresholds.
  • the RTM Wizard service generates real-time graphical displays of data from an agent monitoring a device on internal network 11.
  • Asset Wizard Service delivered through the Internet collects, keeps and analyzes information about computer hardware and software components such as hardware internal configuration, operating system version, installed software and upgrade history.
  • Oracle Wizard Service delivered through the Internet analyzes the foregoing performance behavior of an Oracle® database and presents the results by means of conclusive, natural language reports.
  • SQL Server Wizard Service delivered through the Internet analyzes the foregoing performance behavior of a Microsoft SQL Server® database and presents the results by means of conclusive, natural language reports.
  • the rules used by computer program 29 are static and configurable in terms of thresholds and tolerances.
  • Thresholds define a level, for a given resource consumption variable, above which resource usage is considered critical. For instance, with computer processing units (CPUs), a threshold value is 75% utilization. Tolerances define for what percentage of an analyzed period a threshold was exceeded. Exceeding a threshold may not indicate a problem, unless the threshold is exceeded for a certain amount of time.
  • There are four combinations of situations involving thresholds and tolerances: (1) a threshold was never exceeded, (2) a threshold was exceeded for a period of time below tolerance, (3) a threshold was exceeded for a period of time above tolerance, and (4) a threshold was exceeded all the time. Different text may be provided (e.g., displayed) in a report for each of these four situations, for every resource variable being analyzed, and for every language supported.
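
The four threshold and tolerance situations can be sketched as a small classification function that also selects report text. Everything named here is hypothetical: the situation labels, the English sentences and the choice to treat a fraction exactly equal to the tolerance as "within tolerance" are assumptions for this example, and tolerance is expressed as a fraction of the analyzed samples.

```python
def classify(measurements, threshold, tolerance):
    """Map a series of measurements to one of the four situations.

    tolerance is the fraction (0..1) of samples allowed above the threshold.
    """
    if not measurements:
        raise ValueError("no measurements to analyze")
    exceeded = sum(1 for m in measurements if m > threshold)
    if exceeded == 0:
        return "never_exceeded"
    if exceeded == len(measurements):
        return "always_exceeded"
    fraction = exceeded / len(measurements)
    # Assumption: equality with the tolerance counts as within tolerance.
    if fraction <= tolerance:
        return "exceeded_within_tolerance"
    return "exceeded_beyond_tolerance"


# Hypothetical report sentences, one per situation (one language only).
REPORT_TEXT = {
    "never_exceeded": "{name} stayed below {threshold}% for the whole period.",
    "exceeded_within_tolerance": "{name} exceeded {threshold}% only briefly.",
    "exceeded_beyond_tolerance": "{name} exceeded {threshold}% for a significant part of the period.",
    "always_exceeded": "{name} was above {threshold}% for the whole period.",
}


def report_sentence(name, measurements, threshold, tolerance):
    """Pick the sentence that matches the situation and fill in the values."""
    situation = classify(measurements, threshold, tolerance)
    return REPORT_TEXT[situation].format(name=name, threshold=threshold)
```

A multilingual report would keep one such text table per supported language and per resource variable.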
  • Prior to operation, agent(s) (including engine, relay and plug-ins) are installed on computers of internal network 11. Installation may be performed by downloading the agent software from a Web site. An agent may be downloaded and installed for each type of platform on the internal network, e.g., Linux, Windows2000, etc. The agent is installed on each device to be monitored and in each device that is to act as a relay for internal network 11. A user, such as a network administrator, identifies himself (e.g., by e-mail address) and selects desired installation options. The agent automatically enables operation under the user's account through a Web site, such as "my.automatos.com", that is accessible via the Internet. The user then activates the monitoring services on the various devices. Installation options are described in more detail below.
  • Figs. 52a and 52b show Web pages for creating an account via a Web site, from which the agent can be downloaded.
  • the Web pages request identification information for the user, such as the user's name, e-mail address, a password, and language preference, among other things.
  • Fig. 53 shows a similar Web page for entering information on the company of the user that enrolled via the Web pages of Figs. 52a and 52b.
  • Once enrolled, the user downloads the agent from the Web site and begins the installation process.
  • agent 24 generates and displays a graphical user interface (GUI) that has three tabs for checking the status of the agent and altering the agent's operation.
  • the tabs are: "Status", “Settings" and "Start/Stop”. Each tab may have different panels. Each panel presents a set of closely related parameters displayed in separate fields. Some of these parameters can be edited. Each tab is described below, along with the meaning and functionality of the fields contained therein.
  • Fig. 2 shows an example of status tab 31.
  • Status tab 31 is displayed on a device running agent 24.
  • the fields in status tab 31 are fixed, meaning that they cannot be edited.
  • machine panel 32 presents information describing the device on which the agent is installed, e.g., device 19. This information includes the operating system 34 of the device, the name 35 of the device and the MachineID 36 of the device. "MachineID" is the device's machine identifier. The MachineID is a number that is generated during installation and that uniquely identifies device 19 to computer program 29 running in server 21 (shown in Fig. 1).
  • Agent panel 37 presents a start time 39, which is the date and time of the agent's activation, and a PID number 40, which is the agent's process ID (identifier) number.
  • a process ID is a number that identifies a process in an operating system on the monitored device. Using the process ID or "PID", it is possible to send signals to a process running in an operating system, such as an instruction for the process to terminate.
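
As a brief illustration of signaling a process by its PID, the Python sketch below starts a throwaway child process and asks it to terminate. The helper name `terminate_by_pid` and the sleeping child are assumptions for this example; on POSIX systems this sends SIGTERM, while on Windows `os.kill` with this signal ends the process directly.

```python
import os
import signal
import subprocess
import sys


def terminate_by_pid(pid):
    """Ask the process identified by pid to terminate."""
    os.kill(pid, signal.SIGTERM)


if __name__ == "__main__":
    # Start a child that would otherwise sleep for 30 seconds.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
    terminate_by_pid(child.pid)
    child.wait(timeout=10)
    print("child exited with code", child.returncode)
```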
  • the modules field 41 shows each active collection module and its version number. Each module is responsible for coordinating the collection of data related to a specific service (e.g., Capacity Wizard, Performance Wizard, etc.). Whenever plug-ins are installed for new services, new modules are inserted and collectors may be added.
  • Collector field 42 shows the name of each collector within a device being managed and indicates if such collectors are active ("UP"). Each collector is responsible for collecting data from a certain device resource, such as hard disk, memory, etc.
  • Fig. 28 shows status tab 31 with other options 43 in the pull-down menu of collector field 42.
  • Data TX Panel 44 shows the Internet Protocol (IP) address 45 of the device in which the agent is installed and indicates if the device is currently sending samples to server 21.
  • the device's IP address is 127.0.0.1 and it is sending samples. If the device were not sending samples, icon 46 (Fig. 3) would be displayed in lieu of icon 47.
  • LastTXBytes field 49 shows the number of bytes sent to relay 27 in the last collected data sample.
  • TotalTXBytes field 50 shows the total number of bytes sent to relay 27 to date.
  • Sent field 51 shows the amount of collected data sent to relay 27.
  • Last Sent field 52 shows the date and time that the last collected data sample was sent to server 21.
  • Failures field 54 shows the number of failed sample transmission attempts.
  • Last Failures field 55 shows the date and time of the last failed sample transmission attempt. When no failures occur an "unknown" status is indicated (as shown). Also shown in Fig. 2 is an agent service indicator 2. "UP” (shown) indicates that the agent is active. “DOWN” (not shown) indicates that the agent is inactive.
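  • The transmission counters shown in the Data TX panel could be kept in a small record such as the following Python sketch. The class name `TxStats` and its methods are hypothetical, and the sketch assumes that the Sent field counts samples; only the field meanings are taken from the description above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class TxStats:
    """Hypothetical holder for the Data TX panel counters."""
    last_tx_bytes: int = 0       # LastTXBytes: size of the most recent sample
    total_tx_bytes: int = 0      # TotalTXBytes: running total of bytes sent
    sent: int = 0                # Sent: number of samples sent (assumed meaning)
    failures: int = 0            # Failures: failed transmission attempts
    last_sent: Optional[datetime] = None
    last_failure: Optional[datetime] = None

    def record_success(self, n_bytes: int, when: datetime) -> None:
        self.last_tx_bytes = n_bytes
        self.total_tx_bytes += n_bytes
        self.sent += 1
        self.last_sent = when

    def record_failure(self, when: datetime) -> None:
        self.failures += 1
        self.last_failure = when

    def last_failure_text(self) -> str:
        # The GUI shows "unknown" when no failure has occurred.
        if self.last_failure is None:
            return "unknown"
        return self.last_failure.isoformat()
```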
  • Fig. 4 shows an example of settings tab 57.
  • Settings tab 57 is displayed on a device running agent 24. Some of the fields in settings tab 57 are fixed, others may be edited.
  • General panel 59 displays a CustomerID field 60 and a TMP (temporary) path field 61. CustomerID field 60 shows the e-mail address used during enrollment and input when the agent is installed.
  • TMP path field 61 shows where samples are stored until they are sent to relay 27.
  • Primary Relay panel 62 contains Relay Server field 69, which shows the IP address of the primary relay device on internal network 11, and Relay Port field 65, which shows the primary relay device's IP port number.
  • Alternate Relay panel 66 includes a Relay Server field 67 and a Relay Port field 69.
  • Relay Server field 67 indicates an alternate relay server's IP address. The alternate relay is automatically used when the primary relay is down.
  • Relay Port field 69 provides the alternate relay server's IP port number. Clicking on Apply button 70 executes any alterations made in the fields shown in Fig. 4.
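  • The automatic switch to the alternate relay can be sketched as follows in Python. This is only an illustration under stated assumptions: the function `send_with_failover` and the injected `send` callable are hypothetical, a relay is treated as "down" when the send attempt raises `OSError`, and the addresses in the usage note are documentation placeholders.

```python
from typing import Callable, Tuple

Relay = Tuple[str, int]  # (IP address, port number)


def send_with_failover(
    sample: bytes,
    primary: Relay,
    alternate: Relay,
    send: Callable[[Relay, bytes], None],
) -> Relay:
    """Send `sample` to the primary relay, falling back to the alternate.

    Returns the relay that accepted the sample. If the alternate also
    fails, its exception propagates to the caller.
    """
    try:
        send(primary, sample)
        return primary
    except OSError:
        # Primary relay is unreachable: use the alternate relay instead.
        send(alternate, sample)
        return alternate
```

A caller might pass `("192.0.2.10", 1999)` and `("192.0.2.11", 1999)` as the two relays, together with a `send` function that opens a socket to the given address.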
  • the Start/stop tab 71 is displayed on a device running agent 24. In this tab, it is possible to activate and/or deactivate agent data sampling.
  • Fig. 5 shows start/stop tab 71 when agent 24 is active ("UP").
  • Fig. 6 shows start/stop tab 71 when agent 24 is inactive ("DOWN").
  • In Agent Service panel 72, Start button 74 activates agent sampling (i.e., data collecting) (shown active), and Stop button 75 deactivates agent sampling.
  • Reload Plug-ins button 76 reloads plug-ins installed in the agent.
  • GUI 77 for the relay is similar to the GUI (Fig. 2) for the agent.
  • GUI 77 is displayed on relay server 20 (Fig. 1) during installation and/or operation.
  • relay GUI 77 also has Status tab 79, Settings tab 80, and Start/Stop tab 81 with similar panels and functionalities as those described above.
  • Fig. 7 shows the relay GUI status tab 79. As was the case with the agent GUI status tab, most of the fields in relay GUI status tab 79 cannot be edited.
  • Machine panel 82 presents information describing relay server 20: its operating system, name and MachineID.
  • the example presented in Fig. 7 shows a computer (relay server) named "WRIEIRO2" executing Windows 2000 Professional with Service Pack 1 installed.
  • the relay server can be installed on a different operating system than the one on which the agents are installed.
  • Relay panel 84 includes Version field 85, which provides the relay's version number, Start Time field 86, which provides the date and time of relay activation, and PID field 87, which provides the process ID number.
  • Data RX (Receive) panel 89 includes the TX (Transmit) Queue Len field 90 which indicates a backlog of samples to send to server 21 (Fig. 1), TotalRXBytes field 91 which shows the total amount of bytes received by the relay from all agents until the present, and Active Sessions field 92 which shows the number of active agents' sessions that are sending samples to the relay.
  • the IP addresses of the agents that are generating the samples are listed in drop-down field 94.
  • Data TX (Transmit) panel 95 includes the following fields.
  • Data TX time field 96 shows the amount of time spent transmitting a last sample from relay 27 to server 21.
  • Sent field 97 shows the amount of collected samples sent from relay 27 to server 21.
  • Failures field 99 shows the number of failed data transmission attempts from relay 27 to server 21.
  • Mode field 100 shows the mode of transmission from relay 27 to server 21: in this embodiment, either SMTP for e-mail data transmission or SSL for SSL data transmission.
  • LastTXBytes field 101 shows the amount of bytes sent by relay 27 to server 21 in an immediately preceding transmission.
  • Last Sent field 102 shows the date and time that the last collected sample was sent from relay 27 to server 21.
  • Last Failure field 104 shows the date and time of the last failed data transmission attempt. When no failures occur "unknown" is displayed.
  • Status tab 79 also includes a relay service indicator 105.
  • Relay service indicator 105 indicates "UP” when relay 27 is active and “DOWN” when relay 27 is inactive.
  • TX and RX statistics are reset, e.g., TotalRXBytes, DataTXTime, etc.
  • Figs. 8 and 29 to 41 depict settings tab 80.
  • Settings tab 80 is displayed on a device running relay 27. Some of the fields in settings tab 80 are fixed, others may be edited.
  • General Panel 106 includes the following fields.
  • CustomerID field 107 displays the e-mail address input while installing the relay. This e-mail address identifies the user in my.automatos.com and cannot be edited.
  • TMP path field 109 indicates where samples are stored until they are sent to server 21.
  • Communications port field 110 (Fig. 29) displays the IP communication port used to transmit samples from agent 24 to relay 27. In this example, the default value is 1999.
  • Protocol selection panel 111 allows a user to select protocols 113 (Fig. 31), including SSL, HTTP and SMTP, that may be used to transmit data over the Internet.
  • Fig. 30 shows the case where SSL is selected. In this case, the server name and port 112 are input.
  • Fig. 32 shows the case where HTTP is selected. In this case as well, the server name and port 114 are input.
  • Fig. 33 shows the case where SMTP is selected. In this case the server name and port 118 are input, along with e-mail addresses 111, including the sender's e-mail address ("FROM") and the recipient's e-mail address ("TO").
  • the SMTP server default address is mail.automatos.com (not shown) and the SSL server default address is ssl.automatos.com (not shown).
  • Figs. 34 to 41 show screens for allowing a user to select firewall settings 128.
  • the Start/stop tab 81 (Fig. 9) is displayed on a relay device. In this tab, it is possible to activate and/or deactivate data sampling transmission. Start/stop tab 81 indicates "START" 122, when relay service is "UP" 124, and "STOP" 125 when relay service is "DOWN" (not shown).
  • Fig. 10 shows a process 126 performed by agent 24 (including relay 27) for obtaining data from a device and providing that data to a remote server (or other type of processing device).
  • Fig. 11 shows a process 127 performed by remote server 21 for processing received data and making that data accessible to remote client 30, e.g., over the Internet.
  • agent 24 is activated and receives (1001) a plug-in containing system calls for obtaining data from device 19. It is noted that agent 24 may use a previously-installed plug-in to obtain data from device 19. A new plug-in is used if agent 24 needs to retrieve added or different data not obtainable by plug-ins already available to agent 24. Agent 24 loads (1002) the new plug-in, along with the preexisting plug-ins.
  • engine 25 creates (1003) a schedule to call the plug-ins at periodic time intervals. For example, a plug-in can be scheduled to execute every minute (as in this example), every 5 minutes, and so on. After each plug-in executes, the plug-in returns data that it collected to engine 25.
  • process 126 waits (1004) for the scheduled time interval (one minute here) and calls (1005) the scheduled plug-in at the appropriate time.
  • the plug-in collects the appropriate data from the monitored device.
  • engine 25 uses system calls from the new plug-in to obtain (1006) data from device 19.
  • Engine 25 may also obtain any other available data using the system calls from the pre-existing plug-ins.
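  • The schedule-wait-call loop of blocks 1003 to 1006 can be sketched in Python as follows. The sketch runs over virtual time so that it finishes immediately; a real agent would sleep between ticks. The names `run_schedule` and `Plugin`, and the representation of a plug-in as a callable returning a dictionary, are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

Plugin = Callable[[], dict]  # a plug-in returns the data it collected


def run_schedule(
    plugins: Dict[str, Tuple[int, Plugin]],
    duration_s: int,
    tick_s: int = 60,
) -> List[Tuple[int, str, dict]]:
    """Simulate calling each plug-in at its own periodic interval.

    `plugins` maps a plug-in name to (interval in seconds, callable).
    Returns (virtual time, plug-in name, collected data) tuples in call order.
    """
    results = []
    for now in range(tick_s, duration_s + 1, tick_s):
        for name, (interval_s, plugin) in plugins.items():
            if now % interval_s == 0:  # this plug-in is due at this tick
                results.append((now, name, plugin()))
    return results


if __name__ == "__main__":
    demo_plugins = {
        "cpu": (60, lambda: {"usr": 1}),      # every minute
        "disk": (300, lambda: {"free": 2}),   # every 5 minutes
    }
    for entry in run_schedule(demo_plugins, duration_s=600):
        print(entry)
```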
  • the data may relate to, but is not limited to, one or more of the following: a processor on the device, a memory on the device, a hard drive on the device, an internal network on which the device is located, an operating system of the device, and/or software installed on the device.
  • Engine 25 compresses (1007) the obtained data and transmits the compressed data to relay 27.
  • relay 27 may reside on the same device as engine 25 or on a different device (shown).
  • Relay 27 encrypts (1007) the data that it receives from engine 25 and transmits (1008) the encrypted data to server 21 over the Internet. Blocks 1004 to 1008 may be repeated periodically, as shown, in order to obtain real-time data from device 19. Data is thus transmitted from agent 24 to server 21 periodically, thereby allowing a client to monitor changes in device 19 in real-time. This feature is described in more detail below.
  • server 21 receives (1101) the compressed and encrypted data. The data is received periodically, as it is transmitted, e.g., every minute, five minutes, etc.
  • Computer program 29 in server 21 decompresses and decrypts the data and stores the data in database 22. Alternatively, instead of storing the data in database 22, computer program 29 may process the data as it is received, which is the case when real time notification is utilized.
  • Computer program 29 formats (1102) the data for display.
  • the data is formatted as one or more Web pages (e.g., Figs. 15 to 18), reports (see the attached appendices), notification messages (e.g., pager messages, e-mails, etc.) and/or graphs/charts (e.g., Fig. 25) for showing real-time operation behavior of device 19.
  • Computer program 29 makes the formatted data accessible to a remote client via the Internet. That is, computer program 29 functions as a Web server to provide a Web site containing Web pages with the formatted data. A user at client 30 can navigate through the site/data via one or more hyperlinks. Computer program 29 may generate natural language reports that indicate that an operational parameter of a device exceeds a preset limit. In this scenario, computer program 29 determines if received data indicates that an operational parameter of the device exceeds a preset limit and generates a report to client 30 indicating that the preset limit has been exceeded. Preset limits for the operational parameters may be stored in, and retrieved from, database 22 by computer program 29. Client 30 (Fig. 1) can access the formatted data from server 21 through one or more Web pages. Fig.
  • Web page 140 contains hyperlinks 141, 142 and 144 to data for devices, in this case computers, being monitored by agents.
  • Window 145 provides a list 146, which contains groupings by "department" of one or more devices being monitored by agents.
  • hyperlink 142 provides links to data for all computers being monitored.
  • hyperlink 144 provides links to data for a selected group from list 146. If hyperlink 146 is selected, Web page 147 (Fig. 13) is displayed. Web page 147 contains link 149 to one computer (BOSB000117) and link 150 to another computer (WVTLLELA). Clicking on hyperlink 149 displays Web page 151 (Fig. 14). Web page
  • hyperlinks 154 which allow a user to display information about the selected device.
  • Web page 152 displays information about the configuration and operation of the selected computer. As shown, this information includes the operating system on the computer, the operating system version, the CPU on the computer, the CPU speed, the amount of memory, the type of CD-ROM (Compact Disc Read Only Memory) on the computer, along with other information.
  • Clicking on hyperlink 156 (Fig. 14) displays the capacity of the device's hard drive, shown in Web page 157 (Fig. 16).
  • Clicking on hyperlink 159 displays network information (e.g., the IP address) for device 19, shown in Web page 160 (Fig. 17).
  • Clicking on hyperlink 161 displays a list of the software installed on device 19, shown in Web page 162 (Fig. 18). Other information also may be accessible.
  • Web page 164 (Fig. 19) is also accessible through the Web site provided by server 21.
  • Web page 164 provides options for viewing statistics relating to monitored devices. For example, clicking on hyperlink 165 displays Web page 166 (Fig. 20).
  • Web page 166 provides a list 167 of groupings of devices (by department), along with buttons 169 which link to Web pages that provide statistics for a selected grouping from list 167.
  • Selecting "All Dept" 170 and button 171 on Web page 166 displays Web page 172 (Fig. 21).
  • Web page 172 identifies the CPU on all computers from list 167. To select only computers from a single group (i.e., department), select that group and button 171.
  • Selecting button 174 (Fig. 20) generates a Web page 175 (Fig. 22) that displays operating system information for computers from a selected group.
  • Selecting button 176 generates a Web page (not shown) that displays memory statistics for computers from a selected group.
  • Selecting button 177 generates a Web page (not shown) that displays software statistics (e.g., software installed, versions, etc.) for computers from a selected group.
  • Selecting button 179 generates a Web page (not shown) that displays product information (e.g., model, version, etc.) for computers from a selected group.
  • Selecting button 180 generates a Web page (not shown) that displays manufacturer information for computers from a selected group.
  • Fig. 23 shows another example of a Web page 181 displayed by server 21.
  • Web page 181 allows a user to access services through server 21. Among these services are real-time monitor (RTM) wizard 182.
  • RTM wizard 182 is part of computer program 29 and allows a client to view data from device 19 as that data changes in real-time.
  • Selecting RTM wizard 182 displays Web page 184 (Fig. 24), in which a user can select a device 185 to be monitored from pull-down menu 186.
  • a window 187 (Fig. 25) is displayed for showing the status of a selected function over time.
  • a user can choose to monitor a device's memory usage 189, disk input/output (I/O) 190, CPU usage 191, and network I/O 192.
  • the selected function is displayed in terms of percentage of use 194 versus time 195 and is updated automatically as new data arrives at server 21.
  • Web page 196 (Fig. 26) also provides options for obtaining natural-language reports based on the data collected by agent 24.
  • Performance wizard 197, capacity wizard 199, Oracle wizard 200, SQL server wizard 201, and asset wizard 202 are software modules that are included within computer program 29. These modules analyze the data received from the agent(s), generate reports, and provide those reports to a user, in Adobe PDF format, at client 30, on demand (through the site) or automatically (by e-mail).
  • the various reports generated by the "wizards" provide information relating to one or more devices on a network over a period of time, although each report is different.
  • the reports combine data, charts, and natural language information, making them look like reports generated by a human being.
  • Reports may include hyperlinks linking their sections, to make it easy to access a section that interests the user.
  • the beginning of each report also may contain a summary of the information found in more detail in other sections of the report, making it easy to jump to the other sections.
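  • A natural-language statement that an operational parameter exceeds a preset limit could be produced along the following lines. This Python sketch is illustrative only: the function `limit_report`, the parameter names, and the sentence wording are assumptions, and the limits are passed in directly rather than retrieved from a database.

```python
from typing import Dict, List


def limit_report(
    device: str,
    readings: Dict[str, float],
    limits: Dict[str, float],
) -> List[str]:
    """Return one sentence per operational parameter that exceeds its limit."""
    lines = []
    for name, value in sorted(readings.items()):
        limit = limits.get(name)
        if limit is not None and value > limit:
            lines.append(
                f"{device}: {name} is {value:.1f}, "
                f"which exceeds the preset limit of {limit:.1f}."
            )
    return lines


if __name__ == "__main__":
    for sentence in limit_report(
        "device-19",
        {"cpu_pct": 97.0, "mem_pct": 40.0},
        {"cpu_pct": 90.0, "mem_pct": 80.0},
    ):
        print(sentence)
```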
  • Appendix A shows an example of a report generated by asset wizard 202.
  • Appendix B shows an example of a report generated by Oracle wizard 200.
  • Appendix C shows example reports generated by SQL server wizard 201.
  • Appendix D shows an example of a report generated by performance wizard 197.
  • Appendix E shows an example of a report generated by capacity wizard 199.
  • Other types of reports may be generated instead of, or in addition to, the reports shown in the appendices.
  • On Web page 196 (Fig. 26), the user can select a starting date 205 and an ending date 206 for the report.
  • Computer program 29 generates and displays a report that encompasses that time period.
  • Pull-down menu 207 allows the user to select the device or devices about which to generate a report.
  • Web page 196 relates to SQL server wizard 201; however, similar Web pages are provided for the other wizards shown in Fig. 26.
  • Server 21 may also transmit the device monitor data (e.g., reports, etc.) using wireless application protocol (WAP) to a wireless device, such as a cellular telephone 230 (Fig. 42).
  • Fig. 42 shows a screen 232 for a wireless user to select the language in which to receive information.
  • Fig. 43 shows the selection of languages 233 on screen 232.
  • Fig. 44 shows a screen 235 for the user to enter a login ID, here called an "alias”.
  • Fig. 45 shows a screen 236 for the user to enter a password.
  • Fig. 46 shows a screen 237 for the user to obtain a list of devices on internal network 11 for which monitoring data is available.
  • Fig. 47 shows a screen 238 that shows the list of devices (in this example, servers).
  • Fig. 48 shows a screen 239 which allows the user to select which features to monitor on the selected server, e.g., configuration, CPU usage, virtual memory, disk I/O, etc.
  • Fig. 49 shows a screen 240 with the selected data, in this case, CPU usage.
  • Fig. 50 shows a screen 241 with the selected data, in this case, virtual memory usage.
  • Fig. 51 shows a screen 242 with the selected data, in this case, network information.
  • Fig. 27 shows a computer 210 on which either of processes 126 or 127 may be implemented. That is, computer 210 may represent either a device with an installed agent on internal network 11 or server 21 (Fig. 1).
  • Computer 210 includes a processor 211, a memory 212, and a storage medium 214 (e.g., a hard disk) (see view 215).
  • Storage medium 214 stores machine-executable instructions 216 that are executed by processor
  • Processes 126 and 127 are not limited to use with the hardware and software of Fig. 27. They may find applicability in any computing or processing environment. Processes 126 and 127 may be implemented in hardware, software, or a combination of hardware and software.
  • Processes 126 and 127 may be implemented in computer programs executing on programmable computers or other machines that each include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage components), at least one input device, and one or more output devices.
  • Program code may be applied to data entered using an input device (e.g., a mouse or keyboard) to perform processes 126 and 127 and to generate information.
  • Each such program may be implemented in a high level procedural or object-oriented programming language to communicate with a computer system. However, the programs can be implemented in assembly or machine language. The language may be a compiled or an interpreted language.
  • Each computer program may be stored on a storage medium or other type of article of manufacture, such as a CD-ROM, hard disk, or magnetic diskette, that is readable by a general or special purpose programmable computer for configuring and operating the computer when the storage medium or device is read by the computer to perform processes 126 and 127.
  • Processes 126 and/or 127 may also be implemented as an article of manufacture, such as a machine-readable storage medium, configured with a computer program, where, upon execution, instructions in the computer program cause a machine to operate in accordance with processes 126 and 127.
  • the invention is not limited to the specific embodiments described above.
  • the invention is not limited to the protocols, hardware, or software described herein.
  • the invention is not limited to generating the specific Web pages or reports described herein.
  • the blocks of Figs. 10 and 11 may be reordered and/or blocks may be left out or added.
  • a system 310 includes a local server 312 connected to an intranet 314 that is connected to the Internet 316 through a firewall 318.
  • Intranet 314 includes a Mail server 3150 with a Simple Mail Transfer Protocol (SMTP) server 3151 that delivers mail to and from the intranet 314.
  • Intranet 314 also includes a workstation 3152 that is used by an administrator of the Intranet.
  • Workstation 3152 typically has a web browser 343b for browsing web pages and a mail client 3154 for receiving and sending email messages, for example, through SMTP server 3151.
  • a monitor server 320, which is also connected to the Internet 316, monitors the operations of the local server 312 automatically without requiring continued involvement by an administrator of the local server 312.
  • the administrator of the local server 312 may have a laptop computer 322, which is connected to the Internet 316 and may be used to access the local server 312.
  • local server 312 executes an agent 324, which collects data that indicates the operating state of the local server 312, including configuration information and performance data. The data provide a measure of how well the local server 312 is performing its intended functions.
  • Agent 324 automatically transmits the collected data using email (which conforms to a standard email protocol) to an email address associated with the monitor server 320.
  • the monitor server 320 analyzes the data and automatically generates a report containing a summary of the status of the local server 312, diagnoses of problems or defects that may exist in the local server 312, and a listing of resources on the local server 312 that may need to be updated to keep up with future demands on the local server 312.
  • the monitor server 320 transmits the report using an email (which also conforms to a standard email protocol) to an email address associated with the administrator of the local server 312.
  • the administrator can then access the report from any computer that is reachable by email, including laptop computer 322 and workstation 3152.
  • the administrator can also access the report from a web page on monitor server 320 from any computer that has a web browser, such as workstation 3152.
  • the system 310 provides automatic unattended continuous monitoring of the local server 312 and automatically sends performance reports to any authorized person located anywhere using simple email. By using email to send the data and the report, the system 310 allows information to be sent through the firewall 318 without compromising the security of the intranet 314 or requiring that the firewall 318 be reconfigured.
  • Local Server 312 includes a processor 330 and a storage subsystem 332.
  • Storage subsystem 332 is a computer readable medium, such as computer memory, a floppy disk, a hard disk, a CDROM, an optical disk or a tape drive.
  • Storage subsystem 332 stores an operating system program 334 that is executed by the processor 330.
  • local server 312 may have any one of a variety of operating systems installed.
  • Operating system 334 includes a kernel 336, which further contains device drivers 338 that are used by the operating system to access devices in the local server 312.
  • the device drivers 338 provide an input/output control (“IOCTL”) application programming interface (“API”) 339 that may be used to obtain performance data from the device drivers 338.
  • the operating system 334 provides a system call API 340 and a registry 342 that may be used to obtain performance information from the operating system 334.
  • Storage subsystem 332 also includes a file system 342 that contains system files 344 that are used by the operating system 334 to store data and a web browser 343 that may be used to browse web pages, as described in greater detail below.
  • Storage subsystem 332 also stores agent software 324, which is executed by the processor 330 to collect and transmit data.
  • Agent software 324 occupies very little storage space on storage subsystem 332. Typically, agent software 324 occupies about 600KB of storage space.
  • Processor 330 executes agent software 324 as a background process, known as a service or a daemon process. Very little memory and processing power is required to execute agent software 324. Typically, agent software 324 requires less than 1% of the processing power of processor 330 and about 3.5 megabytes of memory to execute.
  • Agent software 324 includes a data retriever module 346 that retrieves the data, a timer module 348 which directs the data retriever module 346 to retrieve the data at certain time intervals, a data compressor module 350 to compress the collected data, a data encryptor 352 to encrypt the data, and an SMTP sender module 354 to send the data via email.
  • the data retriever 346 includes a registry module 356 which retrieves data from the registry 342, a system call module 358 which uses the system call API 340 to retrieve data from the operating system 334, an IOCTL module 360 which retrieves data from device drivers 338, and a file system module 362 which retrieves data from system files 344 contained within file system 342.
  • the timer module 348 can be configured in a selected one of possible data collection modes, each of which is represented by a row 3202a, 3202b of FIG. 55A.
  • the configuration mode is selected in a user interface screen of agent software 324.
  • Although the timer module 348 has multiple configuration modes, only two of them, 3202a and 3202b, are shown in FIG.
  • Each configuration mode is associated with a sampling period 3204a, 3204b, after which the data retriever 346 collects a new sample of the data from the local server 312.
  • Each configuration mode is also associated with an entry period 3206a, 3206b.
  • the data retriever 346 computes an average of the data samples collected over the duration of the same entry period 3206a, 3206b and writes the average in a current one of the data files 366.
  • the timer module 348 causes the data to be written in a new data file after each upload period 3208a, 3208b of the selected configuration.
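  • The relationship between the sampling period, the entry period, and the upload period can be sketched in Python as follows. The period values in the usage note are hypothetical (the values of FIG. 55A are not reproduced here), the function name `build_files` is an assumption, and each data file is modeled as a plain list of averaged entries.

```python
from statistics import fmean
from typing import List


def build_files(
    samples: List[float],
    sampling_s: int,
    entry_s: int,
    upload_s: int,
) -> List[List[float]]:
    """Average samples per entry period and group entries per upload period.

    `samples` holds one reading per sampling period. Each entry is the
    mean of the samples taken during one entry period; a new data file
    is started after each upload period.
    """
    per_entry = entry_s // sampling_s   # samples averaged into one entry
    per_file = upload_s // entry_s      # entries written to one data file
    entries = [
        fmean(samples[i:i + per_entry])
        for i in range(0, len(samples) - per_entry + 1, per_entry)
    ]
    return [entries[i:i + per_file] for i in range(0, len(entries), per_file)]
```

For example, with a 15-second sampling period, a 30-second entry period, and a 60-second upload period, eight samples yield four averaged entries split across two data files.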
  • each column 3210a-3210e of FIG. 55B corresponds to a different operating system.
  • the IBM AIX version of agent software acquires data from a virtual device file "/dev/kmem" 3212 within the file system 342 and from system calls 3214 from the system call API 340 (FIG. 54).
  • the Solaris version acquires data from a "/proc" virtual file system, from system calls 3218, and from IOCTL calls 3219.
  • the HP UX version acquires data from IOCTLs 3220 from the IOCTL API 339 (FIG. 54) and from system calls 3222.
  • the Linux version acquires data from IOCTLs 3224, system calls 3226, and the "/proc" virtual file system 3228.
  • the Windows version acquires data from the registry 342, system calls 3232 and IOCTLs 3234.
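  • As an illustration of acquiring data from the "/proc" virtual file system on Linux, the following Python sketch parses the text of "/proc/meminfo". The function names are hypothetical, and the choice of "/proc/meminfo" is an assumption; the description does not state which "/proc" entries the agent reads.

```python
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo-style text into {field name: numeric value}.

    Most fields are reported in kB (e.g. "MemFree:  1024000 kB"); a few,
    such as the HugePages counters, are plain counts with no unit.
    """
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def read_meminfo(path: str = "/proc/meminfo") -> Dict[str, int]:
    """Read and parse the live file (Linux only)."""
    with open(path, encoding="ascii") as fh:
        return parse_meminfo(fh.read())
```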
  • data retriever collects data about the components or inventory 3239 of the local server 312, processor or CPU usage 3240, process queues 3242 which are listings of tasks awaiting performance by the processor, memory usage 3244, disk usage 3246, network usage 3248, resource usage or the amount of resources used by each process 3250, paging space occupancy 3252, file system occupancy 3254, and logical drive occupancy 3256.
  • the inventory data 3239 includes a CPU version 3239 that indicates the processor type 3239a and a CPU clock rate 3239b.
  • A typical CPU version may be "Pentium IV, stepping 6" and a typical clock rate is "1.5 GHz".
  • the inventory data also includes operating system information such as an operating system version 3239c, a version release number 3239d, a maintenance release number 3239e, and a patch level number 3239f.
  • the CPU usage data includes user mode (“usr”) CPU usage 3240a, system mode (“sys”) CPU usage 3240b, time spent by the CPU waiting for blocked processes (“wio”) 3240d, and idle time (idle) 3240c when the CPU has no tasks to perform.
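  • One common way to turn such CPU figures into percentages is to difference two snapshots of cumulative time counters. The Python sketch below assumes, for illustration only, that "usr", "sys", "wio" and "idle" are available as cumulative tick counters; the description does not specify how the agent derives these values, and the function name `cpu_percentages` is hypothetical.

```python
from typing import Dict

CPU_FIELDS = ("usr", "sys", "wio", "idle")


def cpu_percentages(
    before: Dict[str, int],
    after: Dict[str, int],
) -> Dict[str, float]:
    """Return the share of elapsed CPU time spent in each mode, in percent."""
    deltas = {k: after[k] - before[k] for k in CPU_FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        # No time elapsed between snapshots: report zero usage everywhere.
        return {k: 0.0 for k in CPU_FIELDS}
    return {k: 100.0 * d / total for k, d in deltas.items()}


if __name__ == "__main__":
    first = {"usr": 100, "sys": 50, "wio": 10, "idle": 840}
    second = {"usr": 150, "sys": 70, "wio": 20, "idle": 960}
    print(cpu_percentages(first, second))
```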
  • the process queue data 3242 includes blocked queue data 3242a about processes that cannot be performed because the processor 330 is waiting, for example, for an input/output operation and run queue data 3242b about processes that are ready to be performed by the processor 330.
  • the memory usage data 3244 includes free memory data (“fre") 3244a, total active virtual memory data (“avm”) 3244b, page-ins per second (“pi") 3244c, and page-outs per second (“po") 3244d.
  • the disk usage data 3246 includes disk bandwidth data ("tm_act”) 3246a, disk transfers per second (“tps”) 3246b, disk read counter data 3246c, and disk write counter data 3246d.
  • the data collected about the resources used by each process includes memory usage 3250a, input/output usage 3250b, and CPU usage 3250c.
  • the collected data is stored in a data file.
  • a sample data file is attached hereto as appendix F. Although the data files are typically stored in binary format, the sample data file in appendix F is configured in ASCII format to make it readable.
  • compressor 350 compresses the data files, and encryptor 352 encrypts the compressed files to reduce the risk of an unauthorized person accessing the data.
  • SMTP sender 354 then sends the data over the Intranet 314 via email to an email address associated with the monitor server 320.
  • the email message is sent via the Simple Mail Transfer Protocol ("SMTP"), typically through SMTP server 3151.
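  • The compress, encrypt, and email steps can be sketched in Python with the standard `zlib` and `email` modules. This is a sketch under stated assumptions, not the agent's implementation: the cipher is supplied by the caller, `toy_xor` is a placeholder that provides no real security, and the addresses, subject line, and file name are hypothetical.

```python
import zlib
from email.message import EmailMessage
from typing import Callable


def toy_xor(data: bytes) -> bytes:
    """Placeholder for a real cipher. NOT secure; for illustration only."""
    return bytes(b ^ 0x5A for b in data)


def build_sample_email(
    data_file: bytes,
    encrypt: Callable[[bytes], bytes],
    sender: str,
    recipient: str,
) -> EmailMessage:
    """Compress the data file, encrypt the result, and attach it to an email."""
    payload = encrypt(zlib.compress(data_file))  # compress first, then encrypt
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "agent sample"
    msg.set_content("Collected data attached.")
    msg.add_attachment(
        payload,
        maintype="application",
        subtype="octet-stream",
        filename="sample.bin",
    )
    return msg
```

The resulting message could then be handed to an SMTP server, for example with `smtplib.SMTP(host).send_message(msg)`. Compressing before encrypting matters because well-encrypted data is close to incompressible.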
  • Firewall 318, which contains a processor 370 and a storage subsystem 372, is configured to allow only certain kinds of information to be conveyed between Intranet
  • Firewall 318 is typically configured to allow email messages to be transmitted from the mail server 3150 into the Internet 316, allowing email messages sent from the SMTP sender 354 to be delivered to the monitor server 320.
  • firewall 318 may have an SMTP gateway 374 contained within the storage subsystem 372 of the firewall 318 that allows email messages to be securely transmitted from SMTP sender 354 to the monitor server 320 without going through mail server 3150. In either case, the monitor server 320 eventually receives the email message from the Internet 316.
  • Monitor server 320 includes a processor 380 and storage subsystem 382.
  • Storage subsystem 382 stores mail server software 384 for sending and receiving email messages, a data analyzer 386 for analyzing data, a relational database management system (RDBMS) 388, a file system 390 for storing files, and a web server 391 for serving web pages 393.
  • multiple computers are used to perform the tasks of the monitor server 320.
  • the web server 391 may, for example, be stored and executed on a separate computer to increase the responsiveness of the system.
  • Mail Server 384 includes an SMTP server 386 and a POP server 387.
  • SMTP server 386 receives the mail message containing the collected data and POP server makes the mail message available to analyzer 386 via the post office protocol ("POP").
  • the email message may be directly retrieved from the SMTP server using an "SMTP EXIT" call that is supported by the SMTP server 386.
  • RDBMS 388 stores User IDs 399 for identifying different users of the monitor server 320, Customer IDs 3100 to identify different organizations that have signed on for the monitoring service, Machine IDs 3102 for identifying the different servers being monitored for each of the organizations, an email address 3104 associated with the administrator of each of the machines, and data 3106 from the machines.
  • Analyzer 386 includes a POP client 3110 that retrieves the email message from the POP server 387 and extracts the data from it. In extracting the data, the POP client first decrypts the message and then decompresses the data. Analyzer 386 may be configured to store the data in the data section 3106 of the RDBMS or in data files 3113 contained within file system 390. Analyzer 386 includes an engine 3112, which analyzes the data based on a set of rules 3114 contained within the analyzer. The analyzer may alternatively be configured to store the rules 3114 within RDBMS 388. A report generator 3116 of the analyzer generates a performance report 3118 for the local server 312 based on the analysis of the engine 3112. By performing the analysis of the data and generating the report on the monitor server 320 instead of the local server 312, the system 310 reduces the processing power and memory required on the local server 312 to monitor the server.
  • each rule is typically associated with a threshold value 3270 that specifies an acceptable range for a type of performance measurement, such as CPU usage, and a tolerance value 3272 that indicates how long a period of time the performance measurement may be out of the acceptable range when the local server 312 is operating properly.
  • Table 3274 shows the different pieces of information that are added to the report depending on whether or not performance measurement violates the threshold 3270 and on whether the period over which the threshold 3270 is violated is greater than the tolerance 3272.
  • Column 3276 shows text 3276a that is added to the report when performance measurement remains within the range specified by the threshold, while column 3278 shows two different versions 3278a and 3278b of text that are displayed when the performance measurement goes beyond the range.
  • the first version 3278a is only added to the report when the range is violated for a period that is less than the tolerance 3272 and the second version 3278b is only added to the report when the range is violated over a period that is greater than the tolerance 3272.
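The selection among text 3276a, 3278a, and 3278b described above can be sketched as follows. This is only an illustration, not the patented implementation: it assumes the threshold is a single upper bound, that the tolerance is counted in consecutive samples, and that a violation exactly equal to the tolerance selects version 3278a (the text leaves that case open). The names `Rule`, `longest_violation`, and `select_text` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    threshold: float  # assumed upper bound of the acceptable range (threshold 3270)
    tolerance: int    # assumed longest tolerated run of violating samples (tolerance 3272)


def longest_violation(samples, threshold):
    """Return the length of the longest run of consecutive samples above the threshold."""
    longest = current = 0
    for value in samples:
        current = current + 1 if value > threshold else 0
        longest = max(longest, current)
    return longest


def select_text(samples, rule):
    """Pick the report text version and the percentage of samples above the threshold."""
    run = longest_violation(samples, rule.threshold)
    percent = 100.0 * sum(v > rule.threshold for v in samples) / len(samples)
    if run == 0:
        return "3276a", percent   # measurement stayed within range
    if run <= rule.tolerance:
        return "3278a", percent   # violated, but not for longer than the tolerance
    return "3278b", percent       # violated for longer than the tolerance
```

For example, `select_text([10, 95, 96, 20], Rule(threshold=90, tolerance=3))` selects version 3278a and reports that 50% of the samples exceeded the threshold.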
  • the analyzer 386 and the report generator 3116 generate a natural language report summarizing the collected data in a manner that is easy to understand.
  • the report generator may also be configured to include the actual percentage of the data, e.g. 40%, that exceeds the threshold value in the text segments 3278a and 3278b.
  • the versions 3278a and 3278b include text 3280a and 3280b that is emphasized to draw the attention of the reader.
  • the text 3280a and 3280b may be emphasized to alert the reader to a problem with the local server 312.
  • Report generator 3116 can be configured to emphasize the text 3280 using italics, bold face font, underlining, larger fonts, a different foreground color, or a different background.
  • report generator 3116 generates an email message containing the report 3118 and retrieves an email address 3104 from RDBMS 388 associated with the administrator of the local server 312.
  • the report generator 3116 uses the SMTP server 386 to send the report to the email address.
  • Report generator 3116 also generates a web page corresponding to the report and provides the web page to web server 391.
  • the administrator of the local server 312 may retrieve the email message from any computer, such as laptop computer 322, that is equipped with a mail client.
  • Laptop computer 322 includes a processor 3120 and a storage subsystem 3122, which contains mail client software 3124.
  • Processor 3120 executes mail client software 3124, causing laptop computer 322 to retrieve the performance report email from an email server associated with the administrator.
  • the administrator can then view the report on a display associated with laptop computer 322.
  • the administrator can log onto web server 391 from a remote computer and view the report as a web page.
  • the agent software 324 initializes the monitoring process by getting (3304) the data upload period 3202 (FIG. 55A) corresponding to the timer configuration. Agent software 324 then determines (3306) the sample period 3204 (FIG. 55A) and entry period 3206 (FIG. 55A) of the timer configuration, for example, by looking them up in a table similar to FIG. 55A. Agent software 324 then starts (3308) the upload timer, starts (3310) the entry timer, and starts (3312) the sample timer of the timer module 348. Agent software 324 resets (3314) the total value and the counter value to zero. Agent software 324 checks (3316) whether the value of the sample timer is greater than or equal to the sample period.
  • If it is, data retriever 346 retrieves (3318) sample data values as previously described. Agent software 324 increments (3320) the total values by the value of the retrieved data, increments (3322) the value of the counter by one, and resets (3324) the sample timer. Agent software 324 then checks (3326) whether the value of the entry timer is greater than or equal to the entry period. If it is not, then agent software 324 repeats the process (3316-3326) of collecting another sample of data. Otherwise, if the value of the entry timer is greater than or equal to the value of the entry period, the data retriever 346 writes (3328) the ratio of the total values to the counter value to the data file and resets
  • Agent software 324 then checks (3332) if the value of the upload timer is greater than or equal to the upload period. If it is not, then agent software 324 resets (3314) the total values and the counter value and repeats the process (3316-3332) of making another data entry into the data file. Otherwise, if the value of the upload timer is greater than or equal to the upload period, agent software 324 directs (3334) the compressor 350, encryptor 352, and the SMTP sender 354 to send the data file via SMTP. Agent software 324 creates (3336) a new empty data file for collecting more data, resets (3338) the upload timer to zero, and repeats the process (3314-3334) of populating the new file with data.
  • the process of collecting the data is typically implemented using timer interrupts of the processor 330 instead of the timer loops of FIG. 56 to minimize the CPU usage of the software agent 324.
  • the process may also be implemented using a sleep command.
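The sampling scheme above, in which samples are averaged into one entry per entry period until the upload period elapses, can be sketched with a sleep-based loop. This is a minimal illustration under stated assumptions: `read_sample` is a hypothetical callable standing in for the data retriever, elapsed time is approximated by summing the sleep intervals rather than reading real timers, and `collect_entries` is not a name from the original.

```python
import time


def collect_entries(read_sample, sample_period, entry_period, upload_period,
                    sleep=time.sleep):
    """Return one averaged entry per entry period until the upload period elapses."""
    entries = []
    elapsed_upload = 0.0
    while elapsed_upload < upload_period:
        total, count, elapsed_entry = 0.0, 0, 0.0   # reset total and counter
        while elapsed_entry < entry_period:
            sleep(sample_period)                     # wait for the next sample
            total += read_sample()                   # add the sample to the total
            count += 1                               # count the sample
            elapsed_entry += sample_period
        entries.append(total / count)                # entry = total / counter
        elapsed_upload += elapsed_entry
    return entries
```

Passing `sleep=lambda seconds: None` makes the loop run instantly, which is convenient for testing; at the end of the upload period the returned entries would be handed to the compress, encrypt, and send steps.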
  • FIG. 57 the process of sending the data file from the local server
  • agent software 324 reads (402) a closed data file into memory.
  • Compressor 350 compresses (404) the data contained within the file using the BZIP2 algorithm before encryptor 352 encrypts (406) the compressed data using the Sapphire algorithm.
  • Agent software 324 generates (408) an email message from the encrypted data by, for example, adding source and destination addresses to the email message.
  • Agent software 324 incorporates the encrypted file in the email message as an attachment.
  • Agent software 324 checks (412) if the email message was successfully sent. If it was not, agent software 324 closes (420) the unsent file and terminates the process of sending files. The closed file is resent at a later time when the agent software is invoked.
  • agent software 324 checks
  • agent software 324 reads (416) the first of the unsent files to memory and performs the process (404-420) of sending the file.
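The compress, encrypt, and attach steps (404-408) and the matching decrypt-then-decompress step on the monitor server can be sketched with the Python standard library. The compression uses the real `bz2` module. The cipher is only a placeholder: the Sapphire algorithm named above is not available in the standard library, so an insecure hash-based XOR keystream stands in for it here. The function names, subject line, and attachment filename are all hypothetical.

```python
import bz2
import hashlib
from email.message import EmailMessage


def keystream_xor(data: bytes, key: bytes) -> bytes:
    """Placeholder symmetric cipher (NOT Sapphire, NOT secure): XOR with a SHA-256 keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def build_upload_message(raw: bytes, key: bytes, sender: str, recipient: str) -> EmailMessage:
    """Compress, then encrypt, then attach the data file to an email message."""
    encrypted = keystream_xor(bz2.compress(raw), key)
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "collected data"          # hypothetical subject line
    msg.set_content("Collected data attached.")
    msg.add_attachment(encrypted, maintype="application",
                       subtype="octet-stream", filename="data.bin")
    return msg


def extract_upload(msg: EmailMessage, key: bytes) -> bytes:
    """Receiving side: decrypt the attachment first, then decompress it."""
    attachment = next(msg.iter_attachments())
    return bz2.decompress(keystream_xor(attachment.get_payload(decode=True), key))
```

The resulting message could then be handed to `smtplib.SMTP(host, port).send_message(msg)`; the host and port would come from the SMTP settings entered during installation.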
  • When the engine 3112 receives (502) data from the POP client 3110, it selects (504) the first data type for processing.
  • the engine 3112 retrieves (506) tolerances and thresholds for the rules corresponding to the selected data type.
  • the engine then reduces (508) the data being analyzed to produce a smaller data set that captures the information contained within the larger data set.
  • the engine for example, reduces CPU usage data to one entry per minute by only selecting the CPU usage datum with the largest value in each minute. By reducing the data, the time required to analyze the data is reduced.
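The reduction step can be illustrated with a short sketch that keeps only the largest value in each minute. It assumes samples arrive as `(timestamp_in_seconds, value)` pairs; that layout and the function name are hypothetical, not taken from the original.

```python
def reduce_to_minute_max(samples):
    """Map each minute index to the largest sample value observed in that minute."""
    reduced = {}
    for timestamp, value in samples:
        minute = int(timestamp // 60)
        if minute not in reduced or value > reduced[minute]:
            reduced[minute] = value
    return reduced
```

Keeping the maximum rather than the mean preserves short spikes, which are the entries most likely to violate a threshold.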
  • the engine 3112 then checks (510) whether the data needs to be extrapolated to predict future trends or needs.
  • File system or logical drive data may need to be extrapolated to allow the engine to identify a need to update or replace resources to keep up with future demands on the local server 312. If the data needs to be extrapolated, the engine extrapolates (512) the reduced data.
  • the engine 3112 determines (514) the number of entries, if any, in the selected data that exceed the threshold of the corresponding rule.
  • the engine 3112 checks (516) if no entries in the selected data exceed the threshold of the corresponding rule.
  • If no entries exceed the threshold, the report generator 3116 presents (518) a first display, such as a set of traffic lights that has the green light on, in the report before generating (532) natural language text to include in the report. Otherwise, if some entries exceed the threshold, the report generator 3116 generates (520) and presents blow-ups for entries exceeding the threshold.
  • the blow-ups contain more detailed information about the entries that exceed the threshold values and are typically used by an administrator to determine why the threshold value was exceeded.
  • the engine 3112 checks (522) if the number of entries that exceed the threshold value is below the tolerance value of the corresponding rule.
  • If it is, the report generator 3116 presents (524) a second display, such as a set of traffic lights that has the yellow light on, before generating (532) natural language text to include in the report. Otherwise, if the number of entries that exceed the threshold value is above the tolerance value of the corresponding rule, the engine 3112 checks (526) whether all the entries exceed the threshold value. If not all of the entries exceed the threshold value, the report generator 3116 presents (528) a third display, such as a set of traffic lights with the red light on. Otherwise the report generator 3116 presents (530) a fourth graphic display that includes the red light and a warning that the resources represented by the data are insufficient. The report generator then selects (532) natural language text describing the selected data, as described above with reference to FIG. 55D, and presents the selected text in the report. The engine 3112 selects the next data type and repeats the process (506-532) described above.
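The choice among the four displays can be sketched as a small decision function. This is an illustration under assumptions: the threshold is treated as an upper bound, the tolerance as a count of entries, and a count equal to the tolerance is treated as exceeding it (the text only covers "below" and "above"). The function name and the returned labels are hypothetical.

```python
def choose_display(entries, threshold, tolerance):
    """Return which display the report would show for one data type."""
    exceeding = sum(1 for entry in entries if entry > threshold)
    if exceeding == 0:
        return "green"            # first display: no entry exceeds the threshold
    if exceeding < tolerance:
        return "yellow"           # second display: violations stay below the tolerance
    if exceeding < len(entries):
        return "red"              # third display: tolerance exceeded, but not by every entry
    return "red+warning"          # fourth display: every entry exceeds the threshold
```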
  • the report 602 is, for example, a HyperText Markup Language (“HTML”) document or a Portable Document Format (“PDF”) document that is attached to the reply email message from the monitor server as an attachment.
  • Each report 602 has a brief introduction 604 that includes an inventory of the subsystems of the local server 312.
  • the report 602 also includes an executive summary 608, which, for example, has paragraphs 610a describing the performance of the CPU or processor 330, paragraphs 610b describing the performance of memory, paragraphs 610c describing the performance of the disks, and paragraphs 610d describing the performance of the network.
  • Each of the paragraphs 610 includes a hypertext link 612 to more detailed information about the corresponding component.
  • Each of the paragraphs may also have possible problems 614 in the corresponding component highlighted or emphasized to draw the reader's attention, as previously described.
  • the report 602 has details 616 which are divided into sections corresponding to the paragraphs in the executive summary 608.
  • the details 616 include, for example, a CPU section 618a, a memory section 618b, a disk section 618c, and a network section 618d.
  • Each of the sections contains usage information 620 that includes a graphic, such as a traffic light, indicating the performance of the component, natural language text describing the performance of the component in words, and a graph showing a plot of the data of the component.
  • the report presents the performance data in a format that is easy to understand.
  • the report 602 also includes blow-up detail 630 for each set of performance data that is not within the range of values set by the threshold values.
  • the blow-up detail 630 includes resource usage 632 for each process.
  • the resource usage 632 includes CPU usage 632a, input/output usage 632b, and memory usage 632c.
  • the report 602 also includes information on the occupancy of such resources, such as, paging space occupancy 640, file system occupancy 644, and logical drive occupancy 648.
  • the occupancy information typically includes extrapolations to allow an administrator to predict when the resources corresponding to the occupancy information will need to be updated or replaced. For instance, if the extrapolated occupancy data shows that the file system will be fully occupied in the next 15 days, an administrator may configure the server to expand an expandable resource, such as paging space. The administrator may also start looking into an upgrade or replacement of the components on the local server 312 to keep up with the demand for file system space.
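One simple way to produce such an extrapolation is a least-squares line through daily occupancy readings. The original does not say which extrapolation method is used, so the linear fit below is an assumption, as are the function name and the one-reading-per-day input.

```python
def days_until_full(daily_occupancy, capacity):
    """Estimate days from the last reading until capacity is reached, or None if not growing."""
    n = len(daily_occupancy)
    if n < 2:
        return None
    mean_x = (n - 1) / 2
    mean_y = sum(daily_occupancy) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(daily_occupancy))
    slope = sxy / sxx                     # growth per day from the least-squares fit
    if slope <= 0:
        return None                       # occupancy is flat or shrinking
    intercept = mean_y - slope * mean_x
    full_day = (capacity - intercept) / slope
    return max(0.0, full_day - (n - 1))   # days remaining after the last reading
```

With readings of 50, 52, and 54 units on consecutive days and a capacity of 100 units, the fitted growth is 2 units per day, so the estimate is 23 days after the last reading.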
  • a sample report is attached hereto as appendix G.
  • an administrator loads (702) a web page from web server 391 onto web browser 343.
  • the web page contains instructions for installing the software.
  • the user creates (704) a customer account on the monitor server 320.
  • the customer account is associated with a customer ID 3100 and a user ID 399.
  • the customer ID 3100 and the user ID 399 are, for example, generated by the monitor server 320 using a hash function with the customer's phone number as the input to the hash function.
  • the customer ID typically has fourteen digits, twelve of which are from the hash function and two of which provide a checksum of the other twelve digits.
  • the machine ID also has fourteen digits, two of which are a checksum and twelve of which are from a hash function.
  • the machine ID is generated differently, depending on the operating system 334 of the local server 312. For example, on a UNIX RISC machine, the twelve digits of the machine ID are obtained from the unique UNAME of the machine, provided by the operating system.
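A fourteen-digit identifier of this shape can be sketched as follows. The original names neither the hash function nor the checksum rule nor the position of the checksum digits, so SHA-256, a digit-sum checksum modulo 100, and placing the checksum last are all assumptions made for illustration; `make_id` and `is_valid_id` are hypothetical names.

```python
import hashlib


def make_id(seed: str) -> str:
    """Build a 14-digit ID: 12 digits derived from a hash of the seed plus a 2-digit checksum."""
    digest = hashlib.sha256(seed.encode("utf-8")).hexdigest()
    body = f"{int(digest, 16) % 10**12:012d}"        # twelve digits from the hash
    checksum = sum(int(d) for d in body) % 100       # assumed checksum rule
    return f"{body}{checksum:02d}"


def is_valid_id(candidate: str) -> bool:
    """Check the length, the digits, and the assumed checksum of a candidate ID."""
    if len(candidate) != 14 or not candidate.isdigit():
        return False
    body, check = candidate[:12], candidate[12:]
    return sum(int(d) for d in body) % 100 == int(check)
```

The seed would be the customer's phone number for a customer ID, or a machine-specific value such as the UNAME for a machine ID. The checksum lets the installer reject a mistyped ID before contacting the monitor server.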
  • the user downloads (706) the agent software 324 from the monitor server 320 and installs (708) it on the local server 312.
  • the user registers (710) the agent software 324 with the monitor server 320, thereby creating a unique machine ID 3102 associated with the local server.
  • the machine ID 3102 is also associated with the user ID
  • the user loads the web page 802 onto the web browser 343 by typing a uniform resource locator (URL) 804 into an input 806 of the browser 343.
  • the browser 343 loads the web page 802.
  • Web page 802 includes a hyperlink 808.
  • the web browser 343 loads an instruction web page, which is described below with reference to FIG. 62. As shown in FIG. 62, upon clicking on the hyperlink 808, the web browser 343 loads an instruction web page 902 that contains instructions for installing agent software
  • Web page 902 contains a menu section 904 that has links 904a-904b that a user can click on for instructions for performing the steps in the installation of agent 324. The user can click on link 904a for instructions on creating an account, link 904b for instructions on downloading agent software 324, link 904c for instructions on installing agent software
  • a section 906 of web page 902 contains instructions for creating an account. After reading the instructions, the user may click on link 908 to create an account.
  • Fig. 63 shows a section of the web page 902 that contains instructions 910 for downloading agent software 324 and instructions 912a for installing the agent. The user moves scrollbar 913 to reveal the section shown in FIG. 63. After reading the instructions, the user may click on hyperlink 914 to download agent software 324.
  • Fig. 64 shows another section of the web page 902 containing additional instruction 912b for installing the software.
  • Fig. 65 shows yet another section of the web page 902 containing instructions 920 for registering the local server 312 or enabling the equipment. After reading the instructions, the user may register the server 312 by clicking on a hyperlink. Web page 902 also contains a section that has additional instructions for users that have already installed the agent software 324.
  • FIG. 66 shows a first section 1300a of web page 1300 that is loaded by web browser 343 when the user clicks on hyperlink 908 (FIG. 62) to create an account.
  • Section 1300a collects personal data from the user.
  • Section 1300a includes an input 1302 for entering a salutation that is to be used when referring to the user, an input 1304 for entering the first name of the user and an input 1306 for entering the last name of the user.
  • Section 1300a also includes an input 1310 for selecting the user's job title and an input 1312 for entering the user's department.
  • Section 1300a also includes an input 1314 for selecting a language that the user would like to communicate in and an input 1312 for selecting the medium through which the user heard about the web server 391.
  • Section 1300b includes an input 1320 for entering a name of the company, inputs 1322-1332 for entering the company's address information, input 1334 for entering telephone information and input
  • Section 1300b also has inputs 1338-1344 for entering demographic information about the company.
  • the user uses input 1338 to select an industry with which the company is associated, input 1340 to select the number of employees in the company, input 1342 to select the number of servers in the company, and input 1344 to enter the number of server pools in the company.
  • Fig. 68 shows a third section 1300c of the web page 1300 for entering authentication or "login" information about the user.
  • Section 1300c includes an input
  • Section 1300c also contains an input 1354 for entering a login name, which is stored as user ID 399 on the monitor server 320. The user uses inputs 1356 and 1358 to enter and confirm a password for authenticating the user.
  • Section 1300c also contains inputs 1360-1362 for entering information that the user may use to retrieve a forgotten password.
  • Input 1360 is used for entering a question, such as "what is your mother's maiden name?" that only the user would know and input 1362 is for entering the answer to the question in input 1360.
  • monitor server 320 presents the question from input 1360 to the user. If the user can provide the answer from input 1362, the server provides the password from input 1354 to the user. Thus, monitor server 320 collects authentication information from the user.
  • FIG. 69 shows yet another section 1300d of the web page 1300 for creating an account.
  • Section 1300d includes a button 1370 that the user may click on to submit the information entered in sections 1300a- 1300c to the server.
  • Section 1300d also contains a second button 1372 that the user may use to clear all the data entered in sections 1300a to 1300c if the user wants to re-enter the data.
  • FIG. 70 shows a web page 1700 that is presented to the user after clicking on the button 1370 (FIG. 69) to submit account information.
  • Web page 1700 includes a customer ID number 1702 for the user.
  • Web page 1700 also contains information 1703 notifying the user that the customer ID has been sent to the email address 1350 (Fig. 68) provided by the user.
  • Web page 1700 includes a hyperlink 1704 that the user may use to download agent software 324.
  • FIG. 71 shows a first section 1800a of a web page 1800 that the user may use to download agent software 324.
  • the section 1800a includes a hyperlink 1802a that the user may click on to obtain additional information about installing the agent 324 on a UNIX operating system.
  • Section 1800a also includes a hyperlink 1802b that the user may click on to obtain additional information on installing the agent 324 on a Microsoft Windows operating system.
  • FIG. 72 shows a second section 1800b of the web page 1800.
  • Section 1800b includes a first portion 1804a relating to installing the agent on a Microsoft Windows computer and a second portion 1804b relating to installing the agent on a Linux computer.
  • the first portion 1804a includes a hyperlink 1806a for downloading a Windows version of the agent software 324 using the hypertext transfer protocol ("HTTP") and a second hyperlink 1808a for downloading the Windows version of the agent software using the file transfer protocol ("FTP").
  • the first portion also contains information 1810a on the different versions of the Windows operating system supported by the Windows version of agent software 324.
  • the second portion 1804b includes a hyperlink 1806b for downloading a Linux version of the agent software 324 using HTTP and a second hyperlink 1808b for downloading the Linux version of the agent software 324 using FTP.
  • the second portion also contains information 1810b on the different versions of the Linux operating system supported by the Linux version of agent software 324.
  • FIGs. 73 and 74 also show sections 1800c and 1800d of the web page 1800.
  • the sections 1800c, 1800d contain portions 1804c, 1804d, 1804e, which respectively relate to installing agent software 324 on the IBM RS 6000 operating system, Sun operating systems, and HP-UX operating system.
  • Each of the portions includes hyperlinks 1806c, 1806d, and 1806e for downloading agent software 324 via HTTP and hyperlinks 1808c,
  • Each of the portions also includes information 1810c, 1810d, and 1810e about the different versions of the corresponding operating system that are supported by the agent software 324.
  • As shown in FIG. 75, upon clicking on one of the download hyperlinks 1806a-1808e (FIGS. 72-74), the web browser 343 presents the user with a dialog 2200 asking the user whether the user would like to run agent installation software or to save it on the user's hard drive.
  • the user uses option controls 2202 and 2204 and then clicks on an "OK" button 2206 to submit the user's choice.
  • the user may also cancel the download by clicking on a "cancel" button 2208.
  • FIG. 76 shows the dialog 2300 that is presented to users who opt to save the agent installation software in the dialog of FIG. 75.
  • the dialog 2300 includes an input 2302 for selecting a directory where the agent installation software should be saved.
  • the dialog also includes an input 2304 for selecting a name that should be assigned to the agent installation software.
  • the user submits his selections by clicking on a "save" button 2306.
  • the user may also cancel the download by clicking on a "cancel" button 2308.
  • the user may execute the software by clicking on an icon associated with the installation software.
  • FIG. 77 shows a dialog 2400 that is presented to a user upon clicking on the installation software.
  • the dialog 2400 includes a message 2402 welcoming the user to the installation process. The user may continue with the process by clicking the "next" button 2404. The user may also cancel the installation by clicking on the cancel button 2406.
  • FIG. 78 shows a dialog 2500 that prompts the user for a customer ID 3100 (FIG. 1).
  • a valid customer ID is required before the agent software 324 can be installed.
  • customer IDs 3100 are assigned to users when they create an account on the monitor server 320.
  • the dialog 2500 includes an input 2502 for entering the customer ID, a "next" button 2504 for submitting the entered customer ID and proceeding with the installation process, a "back" button 2506 for moving back in the installation process, and a "cancel" button 2508 for terminating the installation.
  • FIG. 79 shows a dialog 2600 for entering SMTP information.
  • Dialog 2600 includes an input 2602 for entering an SMTP server, such as SMTP server 386, which will be used to transmit reports to the monitor server 320.
  • Dialog 2600 also includes an input 2604 for selecting an Internet Protocol ("IP") port that will be used to communicate with the SMTP server and an input 2606 for entering an email address from which the reports should be transmitted.
  • Dialog 2600 also includes a "next" button 2608 for submitting the data entered in the dialog 2600 and continuing with the installation process.
  • FIG. 80 shows a dialog 2700 that is used to select a directory in which agent software 324 should be installed. The user may change the directory by clicking on
  • Fig. 81 shows a dialog 2800 that is used to select whether the user would like a typical, compact, or custom installation based on selection inputs 2802.
  • the compact option only installs the minimum components of agent software 324 that are required for the agent to operate.
  • the compact option is often chosen on computers that have limited storage space.
  • the custom option allows the user to select the components that they would like to install. The user submits their selection and continues with the installation process by clicking a "next" button 2804.
  • FIG. 82 shows a dialog 2900 that is presented during a custom installation to allow the user to select the components they would like to install.
  • Options 2902 are used to select whether the user would like to install computer program files, documentation, or sample files of the agent software 324. The user submits their selection and proceeds with the installation software by clicking on the "next" button 2904.
  • FIG. 83 shows a dialog 33000 that is used to enable the monitor server 320 to receive data from the agent software 324 on the local server 312.
  • the user may opt to enable the service by selecting input 33002.
  • the user may also opt to enable the service later by selecting input 33004.
  • the user can then enable the software on the web pages 393 presented by the monitor server 320.
  • the user submits their selection and proceeds with the installation process by clicking the "next" button 33006.
  • FIG. 84 shows a dialog 33100 that is presented to the user to allow the user to enter information that is required to enable the monitor server 320 to receive data from the local server 312.
  • the dialog 33100 includes an input 33102 for entering an email address where monitoring reports for the local server 312 should be sent.
  • the dialog 33100 also includes inputs 33104 and 33106 for entering and confirming a password for encrypting information sent from the monitor server 320 to the local server 312.
  • the user submits their selection and proceeds with the installation process by clicking the "next" button 33
  • FIG. 85 shows a dialog 33200 informing the user of the progress in transmitting the enablement information to the monitor server 320.
  • the dialog 33200 includes a log window 33202 containing a log of communications between the local server 312 and the monitor server 320. The user proceeds with the installation process by clicking the "next" button 33204.
  • FIG. 86 shows an email message 33300 that is transmitted by the monitor server 320 to the email address entered in input 33102 (FIG. 84) to inform the user that the service was successfully enabled.
  • Message 33300 includes a machine ID 33302 and a machine name 33304 that are assigned to the local server 312 by the monitor server 320, in addition to information 33308 about the number of processors and the class of the equipment on the local server 312.
  • Message 33300 also includes a customer ID 33306 associated with the user and a password 33310 for encrypting messages relating to the local server 312.
  • FIG. 87 shows a dialog 33400 that is presented to the user when the installation is complete. The user may close the dialog by clicking on the finish button 33402.
  • FIG. 88 shows an email message 33500 that is transmitted by the monitor server 320 to the email address entered in input 33102 (FIG. 84) to inform the user that agent software 324 was successfully installed.
  • Message 33500 includes the name 33502, the version 33504 of the operating system 334, the number 33506 of processors 330, and the amount 33508 of memory on the local server 312.
  • FIG. 89 shows a first panel 33600 of a user interface for agent software 324.
  • Panel 33600 displays the version 33602 of the operating system, the name 33604, and the machine ID 33606 of the local server 312.
  • Panel 33600 also contains information 33610 about the data retriever and information 33608 about the SMTP sender 354. The user may switch to a second panel 3700 (FIG. 90) by clicking on selector 3612.
  • FIG. 90 shows a second panel 3700 of the user interface of agent software 324.
  • Panel 3700 includes an input 3702 for selecting a data upload interval or period, an input 3704 for changing the customer ID 3100, an input 3706 for entering a path to a file where the collected data should be stored, an input 3708 for entering a path to a file where the activities of agent software 324 should be logged, an input 3710 for disabling the delivery of reports by mail for users who only want to view reports through a web browser, an input 3712 for selecting an email address where reports are to be sent, an input 3714 for selecting an email address from which collected data should be sent to the monitor server 320, an input 3716 for changing the SMTP server, and an input 3718 for selecting the SMTP port.
  • the user submits any selections entered on panel 3700 by clicking "apply" button 3720.
  • the user may switch to a third panel of the user interface by clicking on selector 3722.
  • FIG. 91 shows a third panel 3800 of the user interface of agent software 324.
  • Panel 3800 includes a first button 3802 for starting agent software 324 and a second button 3804 for stopping the agent software.
  • the agent software 324 is normally started automatically when the computer is turned on, as described above.
  • Button 3804 may be used to stop the agent software 324.
  • Button 3802 may later be used to restart the agent software 324.
  • Button 3806 may be used to send a test email message, known as a probe, to the monitor server 320.
  • the test email message is used as a diagnostic tool to determine whether email is being conveyed from the SMTP sender 354 to the monitor server 320.
  • the agent software 324 may be used on a server that is not protected by a firewall.
  • Pentium II or Pentium II Xeon (Deschutes) 1 6%
  • AMD-K6(tm) 3D processor 1 6%
  • Pentium III Xeon (Coppermine) 1 6%
  • Shape Explorer 2 Shape Explorer Help 2 SmartShape Wizard 2
  • The data used in this report was obtained from an exclusive collector, developed specially for this end, executing on the target instance with high resolution and low intrusion.
  • This collector obtains data directly from the Oracle instance, without any other libraries or additional tools, with a minimum overhead on the system.
  • The data collected is stored using a binary format, in order to provide persistence. When automatically sent, it is compressed and encrypted, to ensure fast delivery and confidentiality.
  • This report is based on years of experience in performance analysis and capacity planning.
  • The tool used to generate this report operates in a completely automatic way, without direct human intervention. It uses an extensible inference machine, based on heuristics and rules, and is subject to continuous improvements. Using concepts such as "watermarks" and tolerance, it is possible to determine if a computational resource usage is excessive and if the excess is relevant.
  • The buffer cache hit ratio was high during most of the monitored period.
  • The table below shows the main configuration parameters for the tablespace.
  • The graph shows that the tablespace usage rate was low all the time.
  • The table below shows the main configuration parameters for the tablespace.
  • The graph shows that the tablespace usage rate was high all the time. You may consider increasing the tablespace.
  • The table below lists the datafiles in the database, with their tablespace, location, creation date, status, activation mode, occupied bytes and free bytes.
  • TSCMDA02 E ⁇ INST00 ⁇ C T1 ⁇ DBS ⁇ DTRTSCMT1 DA021.
  • TSC IX02 E ⁇ INST00 ⁇ CMT1 ⁇ DBS ⁇ DTRTSCMT11X021.
  • The number of connections to the database did not exceed the limit, and was not a problem.
  • The table below shows the database redo logs, their switch history and their status.
  • This section shows the users with the most I/O activity in the database, per day.
  • The data used in this report was obtained from an exclusive collector, developed specially for this end, executing on the target instance with high resolution and low intrusion.
  • This collector obtains data directly from the SQL Server instance, without any other libraries or additional tools, with a minimum overhead on the system.
  • The data collected is stored using a binary format, in order to provide persistence. When automatically sent, it is compressed and encrypted, to ensure fast delivery and confidentiality.
  • This report is based on years of experience in performance analysis and capacity planning.
  • The tool used to generate this report operates in a completely automatic way, without direct human intervention. It uses an extensible inference machine, based on heuristics and rules, and is subject to continuous improvements. Using concepts such as "watermarks" and tolerance, it is possible to determine if a computational resource usage is excessive and if the excess is relevant.
  • The procedure cache hit rate remained low for most of the monitored period.
  • The server's memory consumption was low most of the time, so it was not a problem.
  • Locked memory remained low throughout the monitored period, so this was not a problem.
  • The Procedure Cache had a low memory usage throughout the period, but it exceeded the threshold on occasion, and this may have caused a constraint.
  • SQL Server instance PROLIANT: Below is a list of all databases for SQL Server instance PROLIANT, along with the database name, size and the list of log and data files.
  • MSDBLog C:\Program Files\Microsoft SQL Server\MSSQL\data\msdblog.ldf 10,752.0 newdb2cop newdb2cop_Data
  • The table below shows the I/O data for the disks used by the SQL Server, along with the data files in each disk and their locations.
  • The graph below shows the number of connections made to the SQL Server during the monitored period.
  • The limit of simultaneous connections to the database is "unlimited".
  • The buffer cache hit rate was high all the time, indicating that the server's memory is sufficient.

Abstract

An agent (14) obtains data from a device (19) by receiving a plug-in (26) containing system calls for obtaining the data from the device (19), loading the plug-in into the agent (14), obtaining the data from the device using the system calls, and transmitting the data over an external network (12) using one or more of a plurality of protocols. The data is provided to a client (30) by formatting the data, and making the formatted data accessible to a client (30) via the external network (12). Data indicative of an operating state of a machine is automatically and repeatedly collected. Information related to the collected data is automatically transmitted to a location remote from the machine. The information is transmitted in the form of electronic mail messages complying with a standard electronic mail messaging protocol.

Description

MANAGING A REMOTE DEVICE
TECHNICAL FIELD
This invention relates to managing a remote device, including obtaining data from the remote device and presenting the data to a client device.
BACKGROUND
Today's rapidly changing information technology (IT) environment has created significant obstacles, or "pain points", for corporate IT managers worldwide. Corporations and their IT departments are faced with the daunting task of managing the sheer growth in the size and complexity of their internal and external networks, as well as the rapid integration of new Web-based applications with legacy systems. This creates a need for highly trained, specialized IT staff with the expertise to manage the many different systems that make up the internal and external networks. When combined with an overall shortage of IT talent in the marketplace, more cautious IT spending, and a generally insufficient level of specialized training within existing IT staffs, the need for scalable third party management solutions has become urgent.
Third party management solutions can sometimes bring more problems than solutions. The implementation cycle associated with management tools is very long. The associated costs are also higher than many IT departments had planned. When combined with the need for a specialized team to operate the third party tools, IT departments need to look elsewhere, creating a need for outsourced IT management services, which can deliver a continuous automated IT management solution, using the Internet, for example.
Firewalls and other internal network security systems can prevent third party remote access to data stored in devices on an internal network. This can be problematic, particularly for network administrators who cannot access the internal network, but who need to obtain information about one or more devices on the internal network. Systems currently exist which allow such a device to send pre-selected status information to a remote device via electronic mail (e-mail). These existing systems, however, do not provide enough flexibility for some users.
SUMMARY
In general, in one aspect, the invention is directed to obtaining data from a device using an agent. This aspect includes receiving a plug-in containing system calls for obtaining the data from the device, loading the plug-in into the agent, obtaining the data from the device using the system calls, and transmitting the data over an external network using one or more of a plurality of protocols. This aspect may include one or more of the following features. The agent may include shared libraries containing system calls for obtaining other data from the device. The shared libraries may be loaded into the agent when the plug-in is loaded. The data may be obtained from the device periodically, such as every minute. The plurality of protocols may include simple mail transfer protocol (SMTP), hyper text transfer protocol (HTTP), and secure sockets layer (SSL) protocol. Data transmission may be effected using at least one of a proxy and socket.
The agent may reside on an internal network that includes the device. A machine may be selected on the internal network to transmit the data over the external network. The external network may include the Internet. The agent may reside on the device. The agent may reside on a machine located on the internal network that is not the device. The network may include a network device located on the internal network and the agent may reside on a server that is also on the internal network. The data may relate to one or more of the following: a processor on the device, memory on the device, a hard drive on the device, the internal network on which the device is located, and software installed on the device.
In general, in another aspect, the invention is directed to providing, to a client, data that was obtained by an agent from a remote device on an internal network. This aspect includes receiving the data via an external network, at least some of the data being received periodically, formatting the data, and making the formatted data accessible to a client via the external network. This aspect may include one or more of the following features.
Formatting the data may include generating a report based on the data. The report may be a natural language report. Formatting the data may include generating a display based on the data and updating the display periodically as new data is received periodically via the external network. The data may be received every minute. Formatting the data may include determining if the data indicates that an operational parameter of the device exceeds a preset limit and generating a report to a client indicating that the operational parameter exceeds the preset limit.
The external network may include the Internet. Making the formatted data accessible to the client may include providing a World Wide Web site through which the data can be accessed by the client. The formatted data may be made accessible to the client using wireless application protocol.
In another general aspect of the invention, a method includes automatically and repeatedly collecting data indicative of an operating state of a machine, and automatically transmitting information related to the collected data to a location remote from the machine. The information is transmitted in the form of electronic mail messages complying with a standard electronic mail messaging protocol, such as the Simple Mail Transfer Protocol.
In another general aspect of the invention, an article comprising a machine-readable medium on which are tangibly stored machine-executable instructions for monitoring a computer includes instructions operable to cause a processor to perform the method of the first general aspect of the invention.
Embodiments of the invention may include one or more of the following features. A monitoring computer receives the electronic mail messages and analyzes the information to derive performance measures. The monitoring computer generates a report embodying the performance measures and makes the report available electronically, for example, from a web site. The report includes a natural language document expressed in a natural language format.
The machine may be a network server, a desktop computer, or an intelligent appliance. The data collected includes a time-ordered sequence of performance measurements taken at fixed time intervals. The collected data, for example, include measurements of CPU usage, process queue length, memory usage, memory paging rate, disk usage, network usage, paging space occupancy, file system occupancy, and process resource usage. The collected data are typically collected from a registry, a system call, a virtual file system, a virtual device, or an input/output control call to a device. The information related to the collected data is compressed and encrypted for inclusion in the electronic mail message.
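As a minimal sketch of how collected data might be packaged into a standard electronic mail message, the following Python example attaches a compressed payload to a message built with the standard library. The function name, addresses, subject line and attachment file name are illustrative assumptions, not taken from the described system, and the encryption step mentioned above is deliberately omitted.

```python
import bz2
from email.message import EmailMessage


def build_data_email(payload: bytes, sender: str, recipient: str) -> EmailMessage:
    """Wrap collected data in an e-mail message (hypothetical helper).

    The payload is compressed with bz2 before being attached. A real agent
    would also encrypt the payload; that step is left out of this sketch.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "collected data"  # assumed subject line
    msg.set_content("Collected performance data is attached.")
    msg.add_attachment(
        bz2.compress(payload),
        maintype="application",
        subtype="octet-stream",
        filename="samples.bin.bz2",  # assumed file name
    )
    return msg
```

A message built this way could then be handed to any SMTP client, for example `smtplib.SMTP(...).send_message(msg)`, with the server name and port taken from the agent's settings.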
In another general aspect of the invention, a method includes automatically and repeatedly receiving electronic mail messages that include information related to remotely collected data. The collected data are indicative of a performance of a machine and the electronic mail messages comply with a standard electronic mail messaging protocol. The method also includes automatically analyzing the information to determine the performance of the machine.
In yet another general aspect of the invention, an article comprising a machine-readable medium on which are tangibly stored machine-executable instructions for monitoring a remote machine includes instructions operable to cause a machine to perform the method of the third general aspect of the invention.
Embodiments of the invention may include one or more of the following features. The information related to the remotely collected data is extracted from the electronic mail messages. The collected data is a time-ordered sequence of performance measurements and analyzing the collected data includes comparing at least some of the performance measurements with a corresponding threshold value to determine whether the performance measurements are within a range of acceptable values. The analysis also includes determining the number of performance measurements that are within the range of acceptable values.
A natural language report is generated by selecting items of information to be added to the report based on the analysis of the information included in the email messages. The items of information are, for example, selected based on the comparison of the performance measurements to the threshold values or based on the number of performance measurements that are within the range of acceptable values. The natural language report typically includes a natural language sentence or a graphical display. The natural language sentence may include a measurement value or a threshold value. Part of the natural language sentence is sometimes enhanced, for example, using bold typeface, italicized typeface, colored typeface, underlining, or a different font size from the rest of the sentence to draw attention to the sentence. The natural language sentence includes a hyperlink to more detailed information about a section of the sequence of performance measurements. An electronic mail message that includes the report is generated and transmitted over a network.
Other features and advantages of the invention will be apparent from the following description and from the claims.
DESCRIPTION OF THE DRAWINGS
Fig. 1 is a view of a network that includes an internal network having devices to be monitored by an agent.
Figs. 2 to 9 and 28 to 41 show installation screens for the agent, including the relay portion of the agent.
Fig. 10 is a flowchart showing a process for monitoring a device on the internal network.
Fig. 11 is a flowchart showing a process for providing data from a monitored device to a user.
Figs. 12 to 26 show Web pages for viewing the data from the monitored device.
Fig. 27 shows a computer on which the processes of Figs. 10 and/or 11 may be implemented.
Figs. 42 to 51 show a cellular telephone for viewing data obtained by the agent.
Figs. 52a, 52b and 53 show Web pages for enrolling in a service in order to download the agent.
FIG. 54 shows a system for monitoring a server;
FIG. 55A is a table of sampling periods;
FIG. 55B is a table of sources of data indicative of an operating state;
FIG. 55C shows kinds of data collected;
FIG. 55D shows data contained within a rule for analyzing data;
FIG. 56 is a flow chart of the process of collecting data from the server;
FIG. 57 is a flow chart of the process of transmitting the collected data;
FIG. 58 is a flow chart of the process of analyzing the collected data and generating a report;
FIG. 59 is a block diagram of the structure of a report;
FIG. 60 is a flow chart of the process of installing agent software; and
FIGS. 61-91 are screenshots of the process of FIG. 60.
DESCRIPTION
Fig. 1 shows a network system 10. Network system 10 includes an internal network 11, such as a local area network (LAN), and an external network 12, such as the Internet. Internal network 11 is segregated from external network 12 via a firewall 14. Firewall 14 allows messages, such as e-mail, to be exchanged between devices (e.g., computers) on internal network 11 and external network 12. However, firewall 14 does not permit devices on external network 12 to directly access data stored on internal network 11.
Internal network 11 contains several devices. These devices may be computers with network interface cards, including servers and desktop computers, and/or network peripherals, such as routers, hubs or switches. Internal network 11 includes three desktop computers 16, 17 and 19, server 20, router 13 and switch 18. Other devices may also be included in addition to, or instead of, these devices.
External network 12 contains a server 21, which has access to a database 22. In this embodiment, server 21 is one or more World Wide Web (or simply "Web") servers that are capable of receiving data, storing the data in database 22, processing the data, and hosting a Web site that makes the processed data accessible to client devices, directly or indirectly via the Internet. The details of the processing performed by server 21 and the Web site hosted by server 21 are provided below.
A computer program, known as an "agent", is installed on a device, such as computer 19, on internal network 11. The agent permits a remote client device to manage computer 19 and to monitor computer 19 and other devices on internal network 11. This is done through the use of communications provided from the agent to server 21. The communications may be transmitted via e-mail using simple mail transfer protocol (SMTP), hyper text transfer protocol (HTTP) or secure sockets layer (SSL) protocol. SSL is a protocol developed by Netscape® for transmitting private documents over the Internet. SSL works by using a public key to encrypt data that is transferred over an established SSL connection. Additionally, the communications may need additional provisions for crossing a firewall, such as support for authenticated proxies and the like. More than one agent may be installed on a single network.
Each agent 24 comprises three core software components: an engine 25, one or more plug-ins 26, and a relay 27. These core components may run on the same device or on different devices. Here, engine 25 and plug-ins 26 run on computer 19 and relay 27 runs on server 20. Plug-ins 26 are installable computer programs that are responsible for collecting the state of hardware, operating systems and/or applications, in a device that is being managed/monitored by agent 24. Examples of operating systems that may be managed/monitored include, but are not limited to, the Microsoft® Windows® family (Intel 8086-like hardware platform), including NT4® (Workstation, Server, Terminal Server), Windows2000® (Professional, Server, Advanced Server), Windows9x® (95 (all versions), 98 (all versions) and ME (Millennium)), and Linux versions kernel 2.2, 2.4 (RedHat 6.2 and above, Conectiva 6.0 and above).
The plug-ins constitute shared libraries containing system calls for collecting data from a device. Engine 25 is a computer program that is responsible for controlling plug-ins 26, grouping the collected data and sending the data to relay 27 using, e.g., transmission control protocol/internet protocol (TCP/IP). Relay 27 is a computer program that is responsible for sending the collected data to server 21 over the Internet (or, more generally, external network) via, e.g., SMTP, HTTP or SSL. Relay 27 need not be installed in all computers on internal network 11. A client can choose to install relay 27 on a single computer on internal network 11 with Internet access and direct all agents running on internal network 11 to send data to that one relay, which will then send the data to server 21.
Agent 24 may be installed on the device to be monitored, as is the case here, or it may be installed on another device (e.g., a server) on the same internal network as the device to be monitored (which is the case for network peripherals management). During the installation process, relay 27 is configured to permit functions such as sending and receiving messages using e-mail or HTTP or SSL. Engine 25 is then executed. After engine 25 is executed for the first time, it calls all the installed plug-ins and reads configuration information contained therein. Engine 25 creates a schedule to call the plug-ins at periodic time intervals. Once engine 25 is up and running, engine 25 will, at the time intervals, call the plug-ins. For example, a plug-in can be scheduled to execute every minute, every 5 minutes, and so on. After each plug-in executes, the plug-in returns data that it collected to engine 25.
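The scheduling behavior described above can be sketched as follows. This is only an illustration under stated assumptions: the class name, method names and the use of an explicit clock value passed to `tick` are hypothetical, and the plug-ins are stand-in Python callables rather than shared libraries.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Engine:
    """Toy engine that calls registered plug-ins at fixed intervals."""

    # plug-in name -> (interval in seconds, collection function)
    plugins: Dict[str, Tuple[int, Callable[[], dict]]] = field(default_factory=dict)
    # plug-in name -> time (in seconds) at which it is next due
    next_due: Dict[str, int] = field(default_factory=dict)
    # (time, plug-in name, data returned by the plug-in)
    collected: List[Tuple[int, str, dict]] = field(default_factory=list)

    def register(self, name: str, interval_s: int,
                 collect: Callable[[], dict], now: int = 0) -> None:
        """Add a plug-in to the schedule; its first call is one interval from now."""
        self.plugins[name] = (interval_s, collect)
        self.next_due[name] = now + interval_s

    def tick(self, now: int) -> None:
        """Call every plug-in that is due and store the data it returns."""
        for name, (interval_s, collect) in self.plugins.items():
            if now >= self.next_due[name]:
                self.collected.append((now, name, collect()))
                self.next_due[name] = now + interval_s
```

For example, registering one plug-in with a 60-second interval and another with a 300-second interval, then calling `tick` once per simulated minute, results in the first plug-in being called five times as often as the second.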
In this embodiment, the following plug-ins are available, although other plug-ins may be used instead of, or in addition to, the following. "Sysinfo" collects information regarding the configuration of the entire system from the point of view of the system's operating system. "Vmstat" collects information regarding the CPU usage and memory usage of the computer system where the plug-in is installed. "Iostat" collects information regarding the disk I/O usage of the computer system where the plug-in is installed. "Netstat" collects information regarding the network statistics of the computer system where the plug-in is installed. "Fsinfo" collects information regarding the file system of the computer system where the plug-in is installed. "Psinfo" collects information regarding the processes that are running on the computer system where the plug-in is installed. "Swpinfo" collects information regarding the swap area of the computer system where the plug-in is installed. "Lvminfo" collects information regarding the logical volume manager of the computer system where the plug-in is installed. "SQL Server", where "SQL" stands for "Structured Query Language", collects information regarding the state of a Microsoft® SQL SERVER 2000® database server on internal network 11. The "SQL SERVER plug-in" collects data that enables server 21 to generate a detailed report regarding the configuration, performance, etc. of the SQL SERVER 2000® database server. "Network" collects information from network devices that are connected to internal network 11, i.e., devices that are not physically part of the device on which the agent resides, but are in the same internal network. The "Oracle" plug-in collects information regarding the state of an Oracle® database server on internal network 11. The Oracle plug-in collects data that enables server 21 to generate a report regarding the configuration, performance, etc. of the Oracle® database server.
Engine 25 receives the collected data from plug-ins 26 and stores the collected data in a file in a binary and, in this case, proprietary format. Engine 25 compresses the file using a compression technique, such as the BZ2 compression method. Engine 25 sends the compressed data to the relay, which is responsible for encrypting the data.
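A minimal sketch of the store-then-compress step is shown below. The record layout (timestamp, metric identifier, value) is an assumption made for illustration only, since the actual binary format is proprietary and is not specified here; the compression uses the bz2 module from the Python standard library.

```python
import bz2
import struct

# Hypothetical fixed-size record: 4-byte timestamp, 2-byte metric id,
# 4-byte float value, little-endian with no padding (10 bytes in total).
RECORD = struct.Struct("<IHf")


def pack_samples(samples):
    """Serialize (timestamp, metric_id, value) tuples into a binary blob."""
    return b"".join(RECORD.pack(ts, metric, value) for ts, metric, value in samples)


def unpack_samples(blob):
    """Inverse of pack_samples: recover the list of sample tuples."""
    return [RECORD.unpack_from(blob, offset)
            for offset in range(0, len(blob), RECORD.size)]


def compress_for_relay(samples):
    """Pack the samples and compress the result before handing it to the relay."""
    return bz2.compress(pack_samples(samples))


def decompress_from_relay(blob):
    """Decompress and unpack data produced by compress_for_relay."""
    return unpack_samples(bz2.decompress(blob))
```

A fixed-size record keeps parsing trivial on the receiving side; a real format would likely also carry a header identifying the plug-in and the host.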
Relay 27 receives data collected by one or more agents on internal network 11, encrypts the data, and sends the data through the Internet to server 21, where the data is analyzed. Relay 27 can run in a device other than the monitored (shown) device and can receive connections from more than one agent simultaneously. The relay's connection to the Internet may be dial-up or permanent and may support SMTP, HTTP and/or SSL. In addition, the relay supports proxies and SOCKS (Windows® sockets), making it easier for outbound connections to go through firewalls. In this embodiment, relay 27 uses two methods of encryption. The encryption method that relay 27 selects corresponds to the transfer protocol that relay 27 uses to send the data to server 21. If SSL is used to transfer the data, relay 27 uses the encryption method that is available from the OpenSSL library. In this embodiment, SSL version 3/Transport Layer Security (TLS) version 1 with Rivest, Shamir, and Adleman (RSA) and Triple Data Encryption Standard (3DES) is used with a key of 128. RSA is a public-key encryption process developed by RSA Data Security, Inc. The RSA process is based on the fact that there is no efficient way to factor very large numbers. Deducing an RSA key, therefore, requires large amounts of computer processing power and time. The RSA process has become the de facto standard for industrial-strength encryption. DES is a popular symmetric-key encryption method that uses a 56-bit key.
If SMTP or HTTP is used to transfer the data, relay 27 encrypts the data using the Sapphire symmetric encryption process, in which the key used is a session key. This means that the key will only be used once. The key used is 128 bits. The server needs this key for decryption. Therefore, relay 27 uses the RSA asymmetric encryption process to encrypt the session key using a 1024-bit key.
Server 21 includes a computer program 29 to receive the encrypted and compressed data from agent 24, decrypt and decompress the data, and store the data in a database 22. Database 22 may be part of, or external to, server 21. Computer program 29 also retrieves the data from database 22 and presents the data to a client 30. Computer program 29 may include a Web server module, which formats the data and makes the data accessible as a Web page or even a WAP (Wireless Application Protocol) page. The formatting may also include generating a report in Adobe PDF format or using Java applets for displaying real-time graphics of data collected by the agents.
An additional form of communicating information being collected by the agents that can be employed by server 21 is notifications. Notifications are "real time" alerts sent every time a certain event happens (such as a threshold being exceeded) to portable communication devices such as cellular phones, pagers, etc. In this context, real-time is defined roughly by the data sampling rate of the agent and any delays associated with data transmission. The notification process may operate as follows. The user can specify occurrences that prompt a notification and the necessary configuration. For example, the user can be notified in response to changes in CPU usage, memory usage, disk I/O, network I/O, file system/logical drive utilization, and the status of a process. For CPU usage, memory usage, disk I/O, network I/O, and file system/logical drive utilization, the user configures a high point and a low point, e.g., CPU utilization has the high point set to 80% and the low point set to 50%. The following scenarios may occur:
(1) The high point flag is set to false and the value is below the high point.
(2) The value reaches the high point and the high point flag is set to false. In this case the user receives the form of notification chosen and the high point flag is set to true.
(3) The value is above the high point and the high point flag is true. Nothing is done here, since the user has already been notified.
(4) The value is below the high point, above the low point and the high point flag is true. Nothing is done here.
(5) The value is below the low point and the high point flag is true. The user is notified that the value reached the low point and the high point flag is set to false.
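The five scenarios amount to a two-state hysteresis check, which can be sketched as follows. The class and method names are hypothetical, and the handling of a value exactly equal to the high or low point (treated here as having "reached" it) is an assumption, since the scenarios do not state it explicitly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HighLowNotifier:
    """Tracks one resource and decides when a notification is due."""

    high: float              # high point, e.g. 80 (% CPU utilization)
    low: float               # low point, e.g. 50
    high_flag: bool = False  # True once the user has been told about the high point

    def update(self, value: float) -> Optional[str]:
        """Return "high", "low", or None for a newly sampled value."""
        if not self.high_flag:
            if value >= self.high:   # scenario (2): high point reached
                self.high_flag = True
                return "high"
            return None              # scenario (1): nothing to report
        if value <= self.low:        # scenario (5): back down to the low point
            self.high_flag = False
            return "low"
        return None                  # scenarios (3) and (4): already notified
```

Keeping a gap between the high and low points prevents a value that hovers around a single threshold from generating a stream of alternating notifications.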
Notifications in response to the status of a process function analogously. The user provides the names of the processes to be monitored. A user is notified once when the process stops running and receives a notification when the process starts running again. Generally speaking, only the resources the user has chosen are verified.
Computer program 29 also analyzes the data collected from a device (e.g., device 19) in order to produce a conclusive, natural language report. In this context, the term "natural language" means a human-readable format that can be presented and understood by, e.g., a network administrator or the like. Computer program 29 generates the reports according to a rule-based system. For each of the reports there are sets of rules that determine what goes in the report.
In this embodiment, computer program 29 includes the following software modules (called "wizards") for generating different types of reports.
Performance Wizard: Service delivered through the Internet that analyzes the past performance of computational servers and presents results by means of conclusive, natural language reports.
Consolidated Performance Wizard: Service delivered through the Internet that analyzes the past performance of a group of computational servers, as a whole, and presents the results by means of conclusive, natural language reports.
Capacity Wizard: Service delivered through the Internet that infers the future performance behavior of computational servers, studies possible upgrades, and presents results by means of conclusive, natural language reports.
Consolidated Capacity Wizard: Service delivered through the Internet that infers the future performance of a group of computational servers, as a whole, and possible upgrades, and presents the results by means of conclusive, natural language reports.
Real Time Monitoring (RTM): Service delivered through the Internet that shows, via an Internet browser or WAP (Wireless Application Protocol)-enabled device (such as a mobile phone or notepad), the updated status of the computational resources (such as memory usage, CPU usage, disk usage and network interface usage) of a computer. The service can also send alerts by WAP, SMS (Short Message System), e-mail or similar electronic communication channels whenever the consumption of a computational resource exceeds pre-defined thresholds. The RTM Wizard service generates real-time graphical displays of data from an agent monitoring a device on internal network 11.
Asset Wizard: Service delivered through the Internet that collects, keeps and analyzes information about computer hardware and software components such as hardware internal configuration, operating system version, installed software and upgrade history.
Oracle Wizard: Service delivered through the Internet that analyzes the past performance behavior of an Oracle® database and presents the results by means of conclusive, natural language reports.
SQL Server Wizard: Service delivered through the Internet that analyzes the past performance behavior of a Microsoft SQL Server® database and presents the results by means of conclusive, natural language reports.
The rules used by computer program 29 are static and configurable in terms of thresholds and tolerances. This means that the addition of new rules requires adding or changing existing code in computer program 29, while changing the criteria of existing rules does not require such a change. Thresholds define a level, for a given resource consumption variable, above which resource usage is considered critical. For instance, for central processing units (CPUs), an example threshold value is 75% utilization. Tolerances define for what percentage of an analyzed period a threshold may be exceeded. Exceeding a threshold may not indicate a problem, unless the threshold is exceeded for a certain amount of time.
There are four combinations of situations involving thresholds and tolerances: (1) a threshold was never exceeded, (2) a threshold was exceeded for a period of time below tolerance, (3) a threshold was exceeded for a period of time above tolerance, and (4) a threshold was exceeded all the time. Different text may be provided (e.g., displayed) in a report for each of these four situations, for every resource variable being analyzed, and for every language supported.
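The four threshold/tolerance situations can be illustrated with a minimal Python sketch. It is not the implementation of computer program 29: the names `Situation`, `classify`, `REPORT_TEXT` and `report_line`, the report wording, and the choice to treat "exactly at tolerance" as below tolerance are all assumptions made for illustration. Tolerance is expressed here as a fraction (0 to 1) of the analyzed period.

```python
from enum import Enum
from typing import Sequence


class Situation(Enum):
    NEVER_EXCEEDED = 1            # situation (1)
    EXCEEDED_BELOW_TOLERANCE = 2  # situation (2)
    EXCEEDED_ABOVE_TOLERANCE = 3  # situation (3)
    ALWAYS_EXCEEDED = 4           # situation (4)


def classify(samples: Sequence[float], threshold: float, tolerance: float) -> Situation:
    """Classify one analyzed period of samples for a single resource variable.

    threshold: level above which usage is considered critical (e.g. 75.0).
    tolerance: fraction of the period for which exceeding is tolerated (assumed 0..1).
    """
    if not samples:
        raise ValueError("at least one sample is required")
    exceeded = sum(1 for s in samples if s > threshold)
    if exceeded == 0:
        return Situation.NEVER_EXCEEDED
    if exceeded == len(samples):
        return Situation.ALWAYS_EXCEEDED
    # Assumption: a fraction exactly equal to the tolerance counts as "below".
    if exceeded / len(samples) <= tolerance:
        return Situation.EXCEEDED_BELOW_TOLERANCE
    return Situation.EXCEEDED_ABOVE_TOLERANCE


# Hypothetical report text, one entry per (language, situation) pair.
REPORT_TEXT = {
    ("en", Situation.NEVER_EXCEEDED): "{resource} usage never exceeded {threshold:g}%.",
    ("en", Situation.EXCEEDED_BELOW_TOLERANCE): "{resource} usage exceeded {threshold:g}% only briefly.",
    ("en", Situation.EXCEEDED_ABOVE_TOLERANCE): "{resource} usage exceeded {threshold:g}% for too long.",
    ("en", Situation.ALWAYS_EXCEEDED): "{resource} usage exceeded {threshold:g}% all the time.",
}


def report_line(language: str, resource: str, samples: Sequence[float],
                threshold: float, tolerance: float) -> str:
    """Pick the text for the situation found and fill in the resource details."""
    situation = classify(samples, threshold, tolerance)
    return REPORT_TEXT[(language, situation)].format(resource=resource, threshold=threshold)
```

Keeping the text in a table keyed by language and situation mirrors the idea that criteria can change without touching code, while a new rule would require a new classification function.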
Prior to operation, agent(s) (including engine, relay and plug-ins) are installed on computers of internal network 11. Installation may be performed by downloading the agent software from a Web site. An agent may be downloaded and installed for each type of platform on the internal network, e.g., Linux, Windows 2000, etc. The agent is installed on each device to be monitored and on each device that is to act as a relay for internal network 11. A user, such as a network administrator, identifies himself (e.g., by e-mail address) and selects desired installation options. The agent automatically enables operation under the user's account through a Web site, such as "my.automatos.com", that is accessible via the Internet. The user then activates the monitoring services on the various devices. Installation options are described in more detail below.
Figs. 52a and 52b show Web pages for creating an account via a Web site, from which the agent can be downloaded. The Web pages request identification information for the user, such as the user's name, e-mail address, a password, and language preference, among other things. Fig. 53 shows a similar Web page for entering information on the company of the user that enrolled via the Web pages of Figs. 52a and 52b. Once enrolled, the user downloads the agent from the Web site and begins the installation process. During installation and operation, agent 24 generates and displays a graphical user interface (GUI) that has three tabs for checking the status of the agent and altering the agent's operation. The tabs are: "Status", "Settings" and "Start/Stop". Each tab may have different panels. Each panel presents a set of closely related parameters displayed in separate fields. Some of these parameters can be edited. Each tab is described below, along with the meaning and functionality of the fields contained therein.
Fig. 2 shows an example of status tab 31. Status tab 31 is displayed on a device running agent 24. The fields in status tab 31 are fixed, meaning that they cannot be edited.
In Fig. 2, machine panel 32 presents information describing the device on which the agent is installed, e.g., device 19. This information includes the operating system 34 of the device, the name 35 of the device and the MachineID 36 of the device. "MachineID" is the device's machine identifier. The MachineID is a number that is generated during installation and that uniquely identifies device 19 to computer program 29 running in server 21 (shown in Fig. 1).
Agent panel 37 presents a start time 39, which is the date and time of the agent's activation, and a PID number 40, which is the agent's process ID (identifier) number. A process ID is a number that identifies a process in an operating system on the monitored device. Using the process ID or "PID", it is possible to send signals to a process running in an operating system, such as an instruction for the process to terminate. The modules field 41 shows each active collection module and its version number. Each module is responsible for coordinating the collection of data related to a specific service (e.g., Capacity Wizard, Performance Wizard, etc.). Whenever plug-ins are installed for new services, new modules are inserted and collectors may be added. Collector field 42 shows the name of each collector within a device being managed and indicates if such collectors are active ("UP"). Each collector is responsible for collecting data from a certain device resource, such as hard disk, memory, etc. Fig. 28 shows status tab 31 with other options 43 in the pull-down menu of collector field 42.
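As an illustration of signalling a process by its PID, the following Python sketch asks a process to terminate. It assumes a POSIX operating system; `request_termination` and `demo` are hypothetical names, and the child process started in `demo` is only a stand-in for a running agent.

```python
import os
import signal
import subprocess
import sys


def request_termination(pid: int) -> None:
    """Send SIGTERM to the process identified by pid (POSIX semantics assumed)."""
    os.kill(pid, signal.SIGTERM)


def demo() -> int:
    """Start a throwaway child process, terminate it by PID, and return its exit status."""
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    request_termination(child.pid)
    # On POSIX, a child killed by a signal reports the negated signal number.
    return child.wait(timeout=10)
```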
Data TX Panel 44 shows the Internet Protocol (IP) address 45 of the device in which the agent is installed and indicates if the device is currently sending samples to server 21. In the example of Fig. 2, the device's IP address is 127.0.0.1 and it is sending samples. If the device were not sending samples, icon 46 (Fig. 3) would be displayed in lieu of icon 47. LastTXBytes field 49 shows the number of bytes sent to relay 27 in the last collected data sample. TotalTXBytes field 50 shows the total number of bytes sent to relay 27 to date. Sent field 51 shows the amount of collected data sent to relay 27.
Last Sent field 52 shows the date and time that the last collected data sample was sent to server 21. Failures field 54 shows the number of failed sample transmission attempts. Last Failures field 55 shows the date and time of the last failed sample transmission attempt. When no failures occur an "unknown" status is indicated (as shown). Also shown in Fig. 2 is an agent service indicator 2. "UP" (shown) indicates that the agent is active. "DOWN" (not shown) indicates that the agent is inactive.
Fig. 4 shows an example of settings tab 57. Settings tab 57 is displayed on a device running agent 24. Some of the fields in settings tab 57 are fixed; others may be edited.
In Fig. 4, General panel 59 displays a CustomerID field 60 and a TMP (temporary) path field 61. CustomerID field 60 shows the e-mail address used during enrollment and input when the agent is installed. TMP path field 61 shows where samples are stored until they are sent to relay 27. Primary Relay panel 62 contains Relay Server field 69, which shows the IP address of the primary relay device on internal network 11, and Relay Port field 65, which shows the primary relay device's IP port number.
Alternate Relay panel 66 includes a Relay Server field 67 and a Relay Port field 69. Relay Server field 67 indicates an alternate relay server's IP address. The alternate relay is automatically used when the primary relay is down. Relay Port field 69 provides the alternate relay server's IP port number. Clicking on Apply button 70 executes any alterations made in the fields shown in Fig. 4.
The Start/stop tab 71 is displayed on a device running agent 24. In this tab, it is possible to activate and/or deactivate agent data sampling. Fig. 5 shows start/stop tab 71 when agent 24 is active ("UP"). Fig. 6 shows start/stop tab 71 when agent 24 is inactive ("DOWN").
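The primary/alternate relay arrangement configured in the settings tab can be sketched as follows. This is only an assumption about how "down" might be detected (a failed TCP connection); the text does not say how the agent decides that the primary relay is unavailable. The names `tcp_reachable` and `choose_relay` are hypothetical.

```python
import socket
from typing import Callable, Optional, Tuple

Address = Tuple[str, int]  # (relay server IP address, relay port)


def tcp_reachable(address: Address, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the address can be opened."""
    try:
        with socket.create_connection(address, timeout=timeout):
            return True
    except OSError:
        return False


def choose_relay(primary: Address, alternate: Address,
                 is_up: Callable[[Address], bool] = tcp_reachable) -> Optional[Address]:
    """Prefer the primary relay; fall back to the alternate when the primary is down."""
    if is_up(primary):
        return primary
    if is_up(alternate):
        return alternate
    return None  # neither relay is reachable; samples stay in the TMP path
```

Passing `is_up` as a parameter keeps the selection logic testable without a network, for example `choose_relay(p, a, is_up=lambda addr: addr == a)`.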
In Agent Service panel 72, Start button 74 activates agent sampling (i.e., data collecting) (shown active) and Stop button 75 deactivates agent sampling. Reload Plug-ins button 76 reloads plug-ins installed in the agent.
Referring now to Fig. 7, a GUI 77 for the relay is similar to the GUI (Fig. 2) for the agent. GUI 77 is displayed on relay server 20 (Fig. 1) during installation and/or operation. As shown in Fig. 7, relay GUI 77 also has Status tab 79, Settings tab 80, and Start/Stop tab 81 with similar panels and functionalities as those described above.
Fig. 7 shows the relay GUI status tab 79. As was the case with the agent GUI status tab, most of the fields in relay GUI status tab 79 cannot be edited. Machine panel 82 presents information describing relay server 20, its operating system, name and MachineID. The example presented in Fig. 7 shows a computer (relay server) named "WRIEIRO2" executing Windows 2000 Professional with Service Pack 1 installed. The relay server can be installed on an operating system different from the one on which the agents are installed.
Relay panel 84 includes Version field 85, which provides the relay's version number, Start Time field 86, which provides the date and time of relay activation, and PID field 87, which provides the process ID number.
Data RX (Receive) panel 89 includes the TX (Transmit) Queue Len field 90, which indicates a backlog of samples to send to server 21 (Fig. 1), TotalRXBytes field 91, which shows the total number of bytes received by the relay from all agents until the present, and Active Sessions field 92, which shows the number of active agents' sessions that are sending samples to the relay. The IP addresses of the agents that are generating the samples are listed in drop-down field 94. Data TX (Transmit) panel 95 includes the following fields. Data TX time field 96 shows the amount of time spent transmitting a last sample from relay 27 to server 21. Sent field 97 shows the number of collected samples sent from relay 27 to server 21. Failures field 99 shows the number of failed data transmission attempts from relay 27 to server 21. Mode field 100 shows the mode of transmission from relay 27 to server 21: in this embodiment, either SMTP for e-mail data transmission or SSL for SSL data transmission. LastTXBytes field 101 shows the number of bytes sent by relay 27 to server 21 in an immediately preceding transmission. Last Sent field 102 shows the date and time that the last collected sample was sent from relay 27 to server 21. Last Failure field 104 shows the date and time of the last failed data transmission attempt. When no failures occur, "unknown" is displayed.
Status tab 79 also includes a relay service indicator 105. Relay service indicator 105 indicates "UP" when relay 27 is active and "DOWN" when relay 27 is inactive.
When relay 27 is switched from "UP" to "DOWN", the TX and RX statistics are reset, e.g., TotalRXBytes, DataTXTime, etc.
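A small Python sketch of the reset-on-stop behavior follows. The classes `RelayStats` and `Relay` and their attribute names are hypothetical; they only loosely mirror a few of the fields shown on the status tab.

```python
from dataclasses import dataclass


@dataclass
class RelayStats:
    """A few of the RX/TX counters shown on the relay status tab (names assumed)."""
    total_rx_bytes: int = 0
    last_tx_bytes: int = 0
    sent: int = 0
    failures: int = 0


class Relay:
    def __init__(self) -> None:
        self.up = True
        self.stats = RelayStats()

    def record_rx(self, num_bytes: int) -> None:
        """Account for bytes received from an agent."""
        self.stats.total_rx_bytes += num_bytes

    def record_tx(self, num_bytes: int, ok: bool) -> None:
        """Account for one transmission attempt toward the server."""
        if ok:
            self.stats.sent += 1
            self.stats.last_tx_bytes = num_bytes
        else:
            self.stats.failures += 1

    def set_up(self, up: bool) -> None:
        """Change the service state; an UP-to-DOWN transition resets the statistics."""
        if self.up and not up:
            self.stats = RelayStats()
        self.up = up
```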
Figs. 8 and 29 to 41 depict settings tab 80. Settings tab 80 is displayed on a device running relay 27. Some of the fields in settings tab 80 are fixed, others may be edited.
General Panel 106 (Fig. 8) includes the following fields. CustomerID field 107 displays the e-mail address input while installing the relay. This e-mail address identifies the user in my.automatos.com and cannot be edited. TMP path field 109 indicates where samples are stored until they are sent to server 21. Communications port field 110 (Fig. 29) displays the IP communication port used to transmit samples from agent 24 to relay 27. In this example, the default value is 1999.
Protocol selection panel 111 (Figs. 30 to 33) allows a user to select protocols 113 (Fig. 31), including SSL, HTTP and SMTP, that may be used to transmit data over the Internet. Fig. 30 shows the case where SSL is selected. In this case, the server name and port 112 are input. Fig. 32 shows the case where HTTP is selected. In this case as well, the server name and port 114 are input. Fig. 33 shows the case where SMTP is selected. In this case the server name and port 118 are input, along with e-mail addresses 111, including the sender's e-mail address ("FROM") and the recipient's e-mail address ("TO"). In this embodiment, the SMTP server default address is mail.automatos.com (not shown) and the SSL server default address is ssl.automatos.com (not shown).
Figs. 34 to 41 show screens for allowing a user to select firewall settings 128. In this embodiment, there are several proxy and Windows® sockets (SOCKS) configurations. Basically, the user inputs the name or IP address of the proxy or SOCKS server and the port of the proxy or SOCKS server. In the case of an authenticated proxy or SOCKS server, a login ID and password may be required. Different screen configurations for inputting this information are shown in Figs. 34 to 41. The Start/stop tab 81 (Fig. 9) is displayed on a relay device. In this tab, it is possible to activate and/or deactivate data sampling transmission. Start/stop tab 81 indicates "START" 122, when relay service is "UP" 124, and "STOP" 125 when relay service is "DOWN" (not shown).
Fig. 10 shows a process 126 performed by agent 24 (including relay 27) for obtaining data from a device and providing that data to a remote server (or other type of processing device). Fig. 11 shows a process 127 performed by remote server 21 for processing received data and making that data accessible to remote client 30, e.g., over the Internet.
Referring also to Fig. 1, in process 126, agent 24 is activated and receives (1001) a plug-in containing system calls for obtaining data from device 19. It is noted that agent 24 may use a previously-installed plug-in to obtain data from device 19. A new plug-in is used if agent 24 needs to retrieve added or different data not obtainable by plug-ins already available to agent 24. Agent 24 loads (1002) the new plug-in, along with the preexisting plug-ins.
As noted, engine 25 creates (1003) a schedule to call the plug-ins at periodic time intervals. For example, a plug-in can be scheduled to execute every minute (as in this example), every 5 minutes, and so on. After each plug-in executes, the plug-in returns data that it collected to engine 25.
Accordingly, process 126 waits (1004) for the scheduled time interval (one minute here) and calls (1005) the scheduled plug-in at the appropriate time. The plug-in collects the appropriate data from the monitored device. Here, engine 25 uses system calls from the new plug-in to obtain (1006) data from device 19. Engine 25 may also obtain any other available data using the system calls from the pre-existing plug-ins. The data may relate to, but is not limited to, one or more of the following: a processor on the device, a memory on the device, a hard drive on the device, an internal network on which the device is located, an operating system of the device, and/or software installed on the device.
Engine 25 compresses (1007) the obtained data and transmits the compressed data to relay 27. As noted above, relay 27 may reside on the same device as engine 25 or on a different device (shown).
Relay 27 encrypts (1007) the data that it receives from engine 25 and transmits (1008) the encrypted data to server 21 over the Internet. Blocks 1004 to 1008 may be repeated periodically, as shown, in order to obtain real-time data from device 19. Data is thus transmitted from agent 24 to server 21 periodically, thereby allowing a client to monitor changes in device 19 in real-time. This feature is described in more detail below. In process 127 (Fig. 11), server 21 receives (1101) the compressed and encrypted data. The data is received periodically, as it is transmitted, e.g., every minute, five minutes, etc. Computer program 29 in server 21 decompresses and decrypts the data and stores the data in database 22. Alternatively, instead of storing the data in database 22, computer program 29 may process the data as it is received, which is the case when real time notification is utilized.
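The collect, compress, transmit and decompress steps of processes 126 and 127 can be sketched in Python as follows. This is a simplified illustration under several assumptions: plug-ins are modeled as plain callables, samples are serialized as JSON, compression uses zlib, and encryption is omitted because the text does not name a cipher. None of the function names come from the source.

```python
import json
import time
import zlib
from typing import Callable, Dict

Plugin = Callable[[], Dict[str, float]]  # a plug-in returns the data it collected


def collect_once(plugins: Dict[str, Plugin]) -> Dict[str, Dict[str, float]]:
    """Call every loaded plug-in once and gather the returned data."""
    return {name: plugin() for name, plugin in plugins.items()}


def pack(sample: dict) -> bytes:
    """Engine side: serialize and compress one sample."""
    return zlib.compress(json.dumps(sample, sort_keys=True).encode("utf-8"))


def unpack(payload: bytes) -> dict:
    """Server side: decompress and deserialize one sample."""
    return json.loads(zlib.decompress(payload).decode("utf-8"))


def run(plugins: Dict[str, Plugin], send: Callable[[bytes], None],
        interval_s: float, cycles: int,
        sleep: Callable[[float], None] = time.sleep) -> None:
    """Wait for the scheduled interval, collect, compress and hand off, repeatedly."""
    for _ in range(cycles):
        sleep(interval_s)
        send(pack(collect_once(plugins)))
```

In a real deployment `send` would forward the payload to the relay, which would encrypt it before transmission; here it can be any callable, such as `list.append`, for experimentation.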
Computer program 29 formats (1102) the data for display. In this embodiment, the data is formatted as one or more Web pages (e.g., Figs. 15 to 18), reports (see the attached appendices), notification messages (e.g., pager messages, e-mails, etc.) and/or graphs/charts (e.g., Fig. 25) for showing real-time operation behavior of device 19.
Computer program 29 makes the formatted data accessible to a remote client via the Internet. That is, computer program 29 functions as a Web server to provide a Web site containing Web pages with the formatted data. A user at client 30 can navigate through the site/data via one or more hyperlinks. Computer program 29 may generate natural language reports that indicate an operational parameter of a device exceeds a preset limit. In this scenario, computer program 29 determines if received data indicates that an operational parameter of the device exceeds a preset limit and generates a report to client 30 indicating that the preset limit has been exceeded. Preset limits for the operational parameters may be stored in, and retrieved from, database 22 by computer program 29. Client 30 (Fig. 1) can access the formatted data from server 21 through one or more Web pages. Fig. 12 shows an example of a Web page 140 that can be used to access the data. Web page 140 contains hyperlinks 141, 142 and 144 to data for devices, in this case computers, being monitored by agents. Window 145 provides a list 146, which contains groupings by "department" of one or more devices being monitored by agents.
Clicking on hyperlink 142 provides links to data for all computers being monitored. Clicking on hyperlink 144 provides links to data for a selected group from list 146. If hyperlink 146 is selected, Web page 147 (Fig. 13) is displayed. Web page 147 contains link 149 to one computer (BOSB000117) and link 150 to another computer (WVTLLELA). Clicking on hyperlink 149 displays Web page 151 (Fig. 14). Web page 151 provides hyperlinks 154, which allow a user to display information about the selected device.
Clicking on hyperlink 155 displays the general information shown on Web page 152 (Fig. 15) about the selected computer. Web page 152 displays information about the configuration and operation of the selected computer. As shown, this information includes the operating system on the computer, the operating system version, the CPU on the computer, the CPU speed, the amount of memory, the type of CD-ROM (Compact Disc Read Only Memory) on the computer, along with other information. Clicking on hyperlink 156 (Fig. 14) displays the capacity of the device's hard drive, shown in Web page 157 (Fig. 16). Clicking on hyperlink 159 displays network information (e.g., the IP address) for device 19, shown in Web page 160 (Fig. 17). Clicking on hyperlink 161 displays a list of the software installed on device 19, shown in Web page 162 (Fig. 18). Other information also may be accessible.
Web page 164 (Fig. 19) is also accessible through the Web site provided by server 21. Web page 164 provides options for viewing statistics relating to monitored devices. For example, clicking on hyperlink 165 displays Web page 166 (Fig. 20). Web page 166 provides a list 167 of groupings of devices (by department), along with buttons 169 which link to Web pages that provide statistics for a selected grouping from list 167.
Selecting "All Dept" 170 and button 171 on Web page 166 displays Web page 172 (Fig. 21). Web page 172 identifies the CPU on all computers from list 167. To select only computers from a single group (i.e., department), select that group and button 171. Selecting button 174 (Fig. 20) generates a Web page 175 (Fig. 22) that displays operating system information for computers from a selected group. Selecting button 176 generates a Web page (not shown) that displays memory statistics for computers from a selected group. Selecting button 177 generates a Web page (not shown) that displays software statistics (e.g., software installed, versions, etc.) for computers from a selected group.
Selecting button 179 generates a Web page (not shown) that displays product information (e.g., model, version, etc.) for computers from a selected group. Selecting button 180 generates a Web page (not shown) that displays manufacturer information for computers from a selected group. Fig. 23 shows another example of a Web page 181 displayed by server 21. Web page 181 allows a user to access services through server 21. Among these services is real-time monitor (RTM) wizard 182. RTM wizard 182 is part of computer program 29 and allows a client to view data from device 19 as that data changes in real-time. Selecting RTM wizard 182 displays Web page 184 (Fig. 24), in which a user can select a device 185 to be monitored from pull-down menu 186. Once the device has been selected, a window 187 (Fig. 25) is displayed for showing the status of a selected function over time. In this embodiment, a user can choose to monitor a device's memory usage 189, disk input/output (I/O) 190, CPU usage 191, and network I/O 192. The selected function is displayed in terms of percentage of use 194 versus time 195 and is updated automatically as new data arrives at server 21.
Web page 196 (Fig. 26) also provides options for obtaining natural-language reports based on the data collected by agent 24. Performance wizard 197, capacity wizard 199, Oracle wizard 200, SQL server wizard 201, and asset wizard 202 are software modules that are included within computer program 29. These modules analyze the data received from the agent(s), generate reports, and provide those reports to a user, in Adobe PDF format, at client 30, on demand (through the site) or automatically (by e-mail).
Generally speaking, the various reports generated by the "wizards" provide information relating to one or more devices on a network over a period of time, although each report is different. The reports combine data, charts, and natural language information, making them look like reports generated by a human being. Reports may include hyperlinks linking their sections, to make it easy to access a section that interests the user. The beginning of each report also may contain a summary of the information found in more detail in other sections of the report, making it easy to jump to the other sections.
Appendix A shows an example of a report generated by asset wizard 202. Appendix B shows an example of a report generated by Oracle wizard 200. Appendix C shows examples of reports generated by SQL server wizard 201. Appendix D shows an example of a report generated by performance wizard 197. Appendix E shows an example of a report generated by capacity wizard 199. Other types of reports may be generated instead of, or in addition to, the reports shown in the appendices.
As shown in Web page 196 (Fig. 26), for time-related reports, the user can select a starting date 205 and an ending date 206 for the report. Computer program 29 generates and displays a report that encompasses that time period. Pull-down menu 207 allows the user to select the device or devices about which to generate a report. Web page 196 relates to SQL server wizard 201; however, similar Web pages are provided for the other wizards shown in Fig. 26.
Server 21 may also transmit the device monitor data (e.g., reports, etc.) using wireless application protocol (WAP) to a wireless device, such as a cellular telephone 230 (Fig. 42). Fig. 42 shows a screen 232 for a wireless user to select the language in which to receive information. User inputs to the wireless device are likewise sent back to server 21 via WAP. Fig. 43 shows the selection of languages 233 on screen 232. Fig. 44 shows a screen 235 for the user to enter a login ID, here called an "alias". Fig. 45 shows a screen 236 for the user to enter a password. Fig. 46 shows a screen 237 for the user to obtain a list of devices on internal network 11 for which monitoring data is available. Fig. 47 shows a screen 238 that shows the list of devices (in this example, servers). Fig. 48 shows a screen 239 which allows the user to select which features to monitor on the selected server, e.g., configuration, CPU usage, virtual memory, disk I/O, etc. Fig. 49 shows a screen 240 with the selected data, in this case, CPU usage. Fig. 50 shows a screen 241 with the selected data, in this case, virtual memory usage. Fig. 51 shows a screen 242 with the selected data, in this case, network information.
Fig. 27 shows a computer 210 on which either of processes 126 or 127 may be implemented. That is, computer 210 may represent either a device with an installed agent on internal network 11 or server 21 (Fig. 1). Computer 210 includes a processor 211, a memory 212, and a storage medium 214 (e.g., a hard disk) (see view 215). Storage medium 214 stores machine-executable instructions 216 that are executed by processor 211 out of memory 212 to perform processes 126 and/or 127.
Although a personal computer is shown in Fig. 27, processes 126 and 127 are not limited to use with the hardware and software of Fig. 27. They may find applicability in any computing or processing environment. Processes 126 and 127 may be implemented in hardware, software, or a combination of hardware and software.
Processes 126 and 127 may be implemented in computer programs executing on programmable computers or other machines that each include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage components), at least one input device, and one or more output devices. Program code may be applied to data entered using an input device (e.g., a mouse or keyboard) to perform processes 126 and 127 and to generate information. Each such program may be implemented in a high level procedural or object-oriented programming language to communicate with a computer system. However, the programs can be implemented in assembly or machine language. The language may be a compiled or an interpreted language. Each computer program may be stored on a storage medium or other type of article of manufacture, such as a CD-ROM, hard disk, or magnetic diskette, that is readable by a general or special purpose programmable computer for configuring and operating the computer when the storage medium or device is read by the computer to perform processes 126 and 127. Processes 126 and/or 127 may also be implemented as an article of manufacture, such as a machine-readable storage medium, configured with a computer program, where, upon execution, instructions in the computer program cause a machine to operate in accordance with processes 126 and 127.
The invention is not limited to the specific embodiments described above. For example, the invention is not limited to the protocols, hardware, or software described herein. The invention is not limited to generating the specific Web pages or reports described herein. The blocks of Figs. 10 and 11 may be reordered and/or blocks may be left out or added.
As shown in FIG. 54, a system 310 includes a local server 312 connected to an intranet 314 that is connected to the Internet 316 through a firewall 318. Intranet 314 includes a mail server 3150 with a Simple Mail Transfer Protocol (SMTP) server 3151 that delivers mail to and from the intranet 314. Intranet 314 also includes a workstation 3152 that is used by an administrator of the intranet. Workstation 3152 typically has a web browser 343b for browsing web pages and a mail client 3154 for receiving and sending email messages, for example, through SMTP server 3151. A monitor server 320, which is also connected to the Internet 316, monitors the operations of the local server 312 automatically without requiring continued involvement by an administrator of the local server 312. The administrator of the local server 312 may have a laptop computer 322, which is connected to the Internet 316 and may be used to access the local server 312. For purposes of automatic monitoring, local server 312 executes an agent 324, which collects data that indicates the operating state of the local server 312, including configuration information and performance data. The data provide a measure of how well the local server 312 is performing its intended functions. Agent 324 automatically transmits the collected data using email (which conforms to a standard email protocol) to an email address associated with the monitor server 320. The monitor server 320 analyzes the data and automatically generates a report containing a summary of the status of the local server 312, diagnoses of problems or defects that may exist in the local server 312, and a listing of resources on the local server 312 that may need to be updated to keep up with future demands on the local server 312. The monitor server 320 transmits the report using an email (which also conforms to a standard email protocol) to an email address associated with the administrator of the local server 312.
The administrator can then access the report from any computer that is reachable by email, including laptop computer 322 and workstation 3152. The administrator can also access the report from a web page on monitor server 320 from any computer that has a web browser, such as workstation 3152. Thus, the system 310 provides automatic unattended continuous monitoring of the server 320 and automatically sends performance reports to any authorized person located anywhere using simple email. By using email to send the data and the report, the system 310 allows information to be sent through the firewall 318 without compromising the security of the intranet 314 or requiring that the firewall 318 be reconfigured.
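Sending collected data as ordinary email can be sketched with Python's standard `email` and `smtplib` modules. The message layout below (a zlib-compressed attachment named `sample.bin`, the subject line, and the example addresses) is an assumption for illustration; the source does not describe the message format, and encryption is left out.

```python
import smtplib
import zlib
from email.message import EmailMessage


def build_sample_message(sender: str, recipient: str, host_name: str,
                         payload: bytes) -> EmailMessage:
    """Wrap one compressed data sample in a standard email message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"monitoring sample from {host_name}"  # hypothetical subject
    msg.set_content("Compressed monitoring sample attached.")
    msg.add_attachment(zlib.compress(payload), maintype="application",
                       subtype="octet-stream", filename="sample.bin")
    return msg


def send_message(msg: EmailMessage, smtp_host: str, smtp_port: int = 25) -> None:
    """Hand the message to an SMTP server, such as one on the intranet."""
    with smtplib.SMTP(smtp_host, smtp_port) as smtp:
        smtp.send_message(msg)
```

Because the sample leaves the intranet as regular outbound mail, no inbound port needs to be opened on the firewall, which is the property the passage above relies on.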
Local server 312 includes a processor 330 and a storage subsystem 332. Storage subsystem 332 is a computer readable medium, such as computer memory, a floppy disk, a hard disk, a CD-ROM, an optical disk or a tape drive. Storage subsystem 332 stores an operating system program 334 that is executed by the processor 330. As will be described in greater detail below with reference to FIG. 55A, local server 312 may have any one of a variety of operating systems installed. Operating system 334 includes a kernel 336, which further contains device drivers 338 that are used by the operating system to access devices in the local server 312. The device drivers 338 provide an input/output control ("IOCTL") application programming interface ("API") 339 that may be used to obtain performance data from the device drivers 338. The operating system 334 provides a system call API 340 and a registry 342 that may be used to obtain performance information from the operating system 334. Storage subsystem 332 also includes a file system 342 that contains system files 344 that are used by the operating system 334 to store data and a web browser 343 that may be used to browse web pages, as described in greater detail below.
Storage subsystem 332 also stores agent software 324, which is executed by the processor 330 to collect and transmit data. Agent software 324 occupies very little storage space on storage subsystem 332. Typically, agent software 324 occupies about 600 KB of storage space. Processor 330 executes agent software 324 as a background process, known as a service or a daemon process. Very little memory and processing power is required to execute agent software 324. Typically, agent software 324 requires less than 1% of the processing power of processor 330 and about 3.5 megabytes of memory to execute.
Agent software 324 includes a data retriever module 346 that retrieves the data, a timer module 348 which directs the data retriever module 346 to retrieve the data at certain time intervals, a data compressor module 350 to compress the collected data, a data encryptor 352 to encrypt the data, and an SMTP sender module 354 to send the data via email. The data retriever 346 includes a registry module 356 which retrieves data from the registry 342, a system call module 358 which uses the system call API 340 to retrieve data from the operating system 334, an IOCTL module 360 which retrieves data from device drivers 338, and a file system module 362 which retrieves data from system files 344 contained within file system 342.
Referring to FIG. 55A, the timer module 348 can be configured in a selected one of possible data collection modes, each of which is represented by a row 3202a, 3202b of FIG. 55A. As will be described in greater detail below, the configuration mode is selected in a user interface screen of agent software 324. Although the timer module 348 has multiple configuration modes, only two of them 3202a, 3202b are shown in FIG. 55A. Each configuration mode is associated with a sampling period 3204a, 3204b, after which the data retriever 346 collects a new sample of the data from the local server 312. Each configuration mode is also associated with an entry period 3206a, 3206b. The data retriever 346 computes an average of the data samples collected over the duration of the same entry period 3206a, 3206b and writes the average in a current one of the data files 366. The timer module 348 causes the data to be written in a new data file after each upload period 3208a, 3208b of the selected configuration.
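The relationship between the sampling period, the entry period and the upload period can be illustrated with a short Python sketch. The function names and the period values used in the usage note are hypothetical (the values in FIG. 55A are not reproduced here), and each period is assumed to be a whole multiple of the shorter one.

```python
from statistics import fmean
from typing import List, Sequence


def entries_from_samples(samples: Sequence[float], sampling_period_s: int,
                         entry_period_s: int) -> List[float]:
    """Average the samples collected during each complete entry period."""
    if entry_period_s % sampling_period_s != 0:
        raise ValueError("entry period must be a multiple of the sampling period")
    per_entry = entry_period_s // sampling_period_s
    # A trailing, incomplete group of samples is left out until it fills up.
    return [fmean(samples[i:i + per_entry])
            for i in range(0, len(samples) - per_entry + 1, per_entry)]


def split_into_files(entries: Sequence[float], entry_period_s: int,
                     upload_period_s: int) -> List[List[float]]:
    """Group entries so that a new data file starts after each upload period."""
    if upload_period_s % entry_period_s != 0:
        raise ValueError("upload period must be a multiple of the entry period")
    per_file = upload_period_s // entry_period_s
    return [list(entries[i:i + per_file]) for i in range(0, len(entries), per_file)]
```

For example, with a 10-second sampling period and a 30-second entry period, six samples yield two averaged entries; with a 60-second upload period, those two entries fill one data file.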
As shown in FIG. 55B, different versions of agent software 324 are available for different operating systems and each of the versions is tailored to acquire data from its corresponding operating system. Each column 3210a-3210e of FIG. 55B corresponds to a different operating system. As shown in the first column 3210a, the IBM AIX version of agent software acquires data from a virtual device file "/dev/kmem" 3212 within the file system 342 and from system calls 3214 from the system call API 340 (FIG. 54). The Solaris version acquires data from a "/proc" virtual file system, from system calls 3218, and from IOCTL calls 3219. The HP UX version acquires data from IOCTLs 3220 from the IOCTL API 339 (FIG. 54) and from system calls 3222. The Linux version acquires data from IOCTLs 3224, system calls 3226, and the "/proc" virtual file system 3228. The Windows version acquires data from the registry 342, system calls 3232 and IOCTLs 3234.
As shown in FIG. 55C, data retriever collects data about the components or inventory 3239 of the local server 312, processor or CPU usage 3240, process queues 3242 which are listings of tasks awaiting performance by the processor, memory usage 3244, disk usage 3246, network usage 3248, resource usage or the amount of resources used by each process 3250, paging space occupancy 3252, file system occupancy 3254, and logical drive occupancy 3256.
The inventory data 3239 includes a CPU version 3239 that indicates the processor type 3239a and a CPU clock rate 3239b. A typical CPU version may be "Pentium IV, stepping 6" and a typical clock rate is "1.5 GHz". The inventory data also includes operating system information such as an operating system version 3239c, a version release number 3239d, a maintenance release number 3239e, and a patch level number 3239f.
The CPU usage data includes user mode ("usr") CPU usage 3240a, system mode ("sys") CPU usage 3240b, time spent by the CPU waiting for blocked processes ("wio") 3240d, and idle time ("idle") 3240c when the CPU has no tasks to perform. The process queue data 3242 includes blocked queue data 3242a about processes that cannot be performed because the processor 330 is waiting, for example, for an input/output operation and run queue data 3242b about processes that are ready to be performed by the processor 330. The memory usage data 3244 includes free memory data ("fre") 3244a, total active virtual memory data ("avm") 3244b, page-ins per second ("pi") 3244c, and page-outs per second ("po") 3244d. The disk usage data 3246 includes disk bandwidth data ("tm_act") 3246a, disk transfers per second ("tps") 3246b, disk read counter data 3246c, and disk write counter data 3246d. The data collected about the resources used by each process includes memory usage 3250a, input/output usage 3250b, and CPU usage 3250c. The collected data is stored in a data file. A sample data file is attached hereto as appendix F. Although the data files are typically stored in binary format, the sample data file in appendix F is configured in ASCII format to make it readable.
Referring again to FIG. 54, compressor 350 compresses the data files, and encryptor 352 encrypts the compressed files to reduce the risk of an unauthorized person accessing the data. SMTP sender 354 then sends the data over the Intranet 314 via email to an email address associated with the monitor server 320. The email message is sent via the Simple Mail Transfer Protocol ("SMTP"), typically through SMTP server 3151.
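The compress, encrypt, and mail sequence may be sketched, purely as an example, using the Python standard library. In this sketch `bz2` supplies bzip2 compression and `email.message.EmailMessage` builds the message. The function names, the subject line, the attachment filename, and in particular `toy_encrypt` are assumptions of this example: `toy_encrypt` is a placeholder XOR transform that stands in for the encryptor 352 and is neither the cipher used by the agent nor secure.

```python
import bz2
from email.message import EmailMessage


def toy_encrypt(data: bytes, key: bytes) -> bytes:
    """Placeholder XOR transform (its own inverse). NOT a real cipher."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def build_message(data_file: bytes, key: bytes,
                  sender: str, recipient: str) -> EmailMessage:
    """Compress, then encrypt, then wrap the result as an email attachment."""
    compressed = bz2.compress(data_file)
    encrypted = toy_encrypt(compressed, key)
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "agent data upload"  # hypothetical subject line
    msg.set_content("Collected data attached.")
    msg.add_attachment(encrypted, maintype="application",
                       subtype="octet-stream", filename="data.bz2.enc")
    return msg


def extract_data(msg: EmailMessage, key: bytes) -> bytes:
    """Receiving side: decrypt first, then decompress."""
    attachment = next(msg.iter_attachments())
    encrypted = attachment.get_payload(decode=True)
    return bz2.decompress(toy_encrypt(encrypted, key))
```

The message returned by `build_message` could then be handed to `smtplib.SMTP(host, port).send_message(msg)`, where `host` and `port` identify the outgoing SMTP server; sending is left out of the sketch so that it runs without a network.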
Firewall 318, which contains a processor 370 and a storage subsystem 372, is configured to allow only certain kinds of information to be conveyed between Intranet 314 and Internet 316. Firewall 318 is typically configured to allow email messages to be transmitted from the mail server 3150 into the Internet 316, allowing email messages sent from the SMTP sender 354 to be delivered to the monitor server 320. Alternatively, firewall 318 may have an SMTP gateway 374 contained within the storage subsystem 372 of the firewall 318 that allows email messages to be securely transmitted from SMTP sender 354 to the monitor server 320 without going through mail server 3150. In either case, the monitor server 320 eventually receives the email message from the Internet 316.
Monitor server 320 includes a processor 380 and storage subsystem 382. Storage subsystem 382 stores mail server software 384 for sending and receiving email messages, a data analyzer 386 for analyzing data, a relational database management system ("RDBMS") 388 for storing information, a file system 390 for storing files, and a web server 391 for serving web pages 393. In certain instances, multiple computers are used to perform the tasks of the monitor server 320. In these instances, the web server 391 may, for example, be stored and executed on a separate computer to increase the responsiveness of the system.
Mail server 384 includes an SMTP server 386 and a POP server 387. SMTP server 386 receives the mail message containing the collected data and POP server 387 makes the mail message available to analyzer 386 via the Post Office Protocol ("POP"). Alternatively, the email message may be directly retrieved from the SMTP server using an "SMTP EXIT" call that is supported by the SMTP server 386. RDBMS 388 stores User IDs 399 for identifying different users of the monitor server 320, Customer IDs 3100 to identify different organizations that have signed on for the monitoring service, Machine IDs 3102 for identifying the different servers being monitored for each of the organizations, an email address 3104 associated with the administrator of each of the machines, and data 3106 from the machines.
Analyzer 386 includes a POP client 3110 that retrieves the email message from the POP server 387 and extracts the data from it. In extracting the data, the POP client first decrypts the message and then decompresses the data. Analyzer 386 may be configured to store the data in the data section 3106 of the RDBMS or in data files 3113 contained within file system 390. Analyzer 386 includes an engine 3112, which analyzes the data based on a set of rules 3114 contained within the analyzer. The analyzer may alternatively be configured to store the rules 3114 within RDBMS 388. A report generator 3116 of the analyzer generates a performance report 3118 for the local server 312 based on the analysis of the engine 3112. By performing the analysis of the data and generating the report on the monitor server 320 instead of the local server 312, the system 310 reduces the processing power and memory required on the local server 312 to monitor the server.
As shown in FIG. 55D, each rule is typically associated with a threshold value 3270 that specifies an acceptable range for a type of performance measurement, such as CPU usage, and a tolerance value 3272 that indicates how long a period of time the performance measurement may be out of the acceptable range when the local server 312 is operating properly. Table 3274 shows the different pieces of information that are added to the report depending on whether or not the performance measurement violates the threshold 3270 and on whether the period over which the threshold 3270 is violated is greater than the tolerance 3272. Column 3276 shows text 3276a that is added to the report when the performance measurement remains within the range specified by the threshold, while column 3278 shows two different versions 3278a and 3278b of text that are displayed when the performance measurement goes beyond the range. The first version 3278a is only added to the report when the range is violated for a period that is less than the tolerance 3272 and the second version 3278b is only added to the report when the range is violated over a period that is greater than the tolerance 3272. Thus the analyzer 386 and the report generator 3116 generate a natural language report summarizing the collected data in a manner that is easy to understand. The report generator may also be configured to include the actual percentage of the data, e.g. 40%, that exceeds the threshold value in the text segments 3278a and 3278b. The versions 3278a and 3278b include text 3280a and 3280b that is emphasized to draw the attention of the reader. For example, the text 3280a and 3280b may be emphasized to alert the reader to a problem with the local server 312. Report generator 3116 can be configured to emphasize the text 3280 using italics, bold face font, underlining, larger fonts, a different foreground color, or a different background.
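The selection among the three text versions of FIG. 55D may be illustrated by the following minimal sketch. It makes several assumptions that the description does not fix: the acceptable range is modeled as a single upper bound, the violation period is approximated as the number of out-of-range samples multiplied by the sampling period, a violation period exactly equal to the tolerance is treated like the shorter case, and the class name `Rule`, the function name `select_text`, and the `{pct}` placeholder are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    threshold: float      # assumed upper bound of the acceptable range
    tolerance: float      # seconds the measurement may stay out of range
    text_ok: str          # text added when the range is respected
    text_brief: str       # text for violations shorter than the tolerance
    text_sustained: str   # text for violations longer than the tolerance


def select_text(rule: Rule, samples: list[float], period: float) -> str:
    """Pick the report text for samples taken every `period` seconds."""
    over = sum(1 for s in samples if s > rule.threshold)
    if over == 0:
        return rule.text_ok
    violated_for = over * period            # approximate violation period
    pct = 100.0 * over / len(samples)       # share of samples out of range
    if violated_for <= rule.tolerance:
        template = rule.text_brief
    else:
        template = rule.text_sustained
    return template.format(pct=pct)
```

The `{pct}` placeholder lets the generated sentence carry the actual percentage of data exceeding the threshold, as the report generator may be configured to do.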
Referring again to FIG. 54, report generator 3116 generates an email message containing the report 3118 and retrieves an email address 3104 from RDBMS 388 associated with the administrator of the local server 312. The report generator 3116 uses the SMTP server 386 to send the report to the email address. Report generator 3116 also generates a web page corresponding to the report and provides the web page to web server 391. The administrator of the local server 312 may retrieve the email message from any computer, such as laptop computer 322, that is equipped with a mail client. Laptop computer 322 includes a processor 3120 and a storage subsystem 3122, which contains mail client software 3124. Processor 3120 executes mail client software 3124, causing laptop computer 322 to retrieve the performance report email from an email server associated with the administrator. The administrator can then view the report on a display associated with laptop computer 322. Alternatively, the administrator can log onto web server 391 from a remote computer and view the report as a web page.
As shown in FIG. 56, the agent software 324 initializes the monitoring process by getting (3304) the data upload period 3202 (FIG. 55A) corresponding to the timer configuration. Agent software 324 then determines (3306) the sample period 3204 (FIG. 55A) and entry period 3206 (FIG. 55A) of the timer configuration, for example, by looking them up in a table similar to FIG. 55A. Agent software 324 then starts (3308) the upload timer, starts (3310) the entry timer, and starts (3312) the sample timer of the timer module 348. Agent software 324 resets (3314) the total value and the counter value to zero. Agent software 324 checks (3316) whether the value of the sample timer is greater than or equal to the sample period. If the value is not, then it waits for the value of the sample timer to reach the sample period. Otherwise, if the value is greater than or equal to the sample period, data retriever 346 retrieves (3318) sample data values as previously described. Agent software 324 increments (3320) the total values by the value of the retrieved data, increments (3322) the value of the counter by one, and resets (3324) the sample timer. Agent software 324 then checks (3326) whether the value of the entry timer is greater than or equal to the entry period. If it is not, then agent software 324 repeats the process (3316-3326) of collecting another sample of data. Otherwise, if the value of the entry timer is greater than or equal to the value of the entry period, the data retriever 346 writes (3328) the ratio of the total values to the counter value to the data file and resets (3330) the entry timer value to zero.
Agent software 324 then checks (3332) if the value of the upload timer is greater than or equal to the upload period. If it is not, then agent software 324 resets (3314) the total values and the counter value and repeats the process (3316-3332) of making another data entry into the data file. Otherwise, if the value of the upload timer is greater than or equal to the upload period, agent software 324 directs (3334) the compressor 350, encryptor 352, and the SMTP sender 354 to send the data file via SMTP. Agent software 324 creates (3336) a new empty data file for collecting more data, resets (3338) the upload timer to zero, and repeats the process (3314-3334) of populating the new file with data.
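The interplay of the sample, entry, and upload timers of FIG. 56 may be illustrated with a small simulation. This is a sketch under stated assumptions rather than the agent's implementation: one loop iteration stands for one simulated second instead of a real timer, the nested loops of FIG. 56 are flattened into a single loop, an "upload" is modeled by appending the finished file to a list, and the names `simulate_agent` and `read_sample` are hypothetical.

```python
from typing import Callable


def simulate_agent(read_sample: Callable[[], float],
                   sample_period: int, entry_period: int,
                   upload_period: int, duration: int) -> list[list[float]]:
    """Return the data files 'uploaded' during `duration` simulated seconds.

    Each file is a list of entries; each entry is the average of the
    samples collected during one entry period.
    """
    uploads: list[list[float]] = []
    data_file: list[float] = []
    total, count = 0.0, 0
    sample_timer = entry_timer = upload_timer = 0
    for _ in range(duration):               # one iteration = one second
        sample_timer += 1
        entry_timer += 1
        upload_timer += 1
        if sample_timer >= sample_period:   # collect a sample
            total += read_sample()
            count += 1
            sample_timer = 0
        if entry_timer >= entry_period and count:
            data_file.append(total / count) # write ratio of total to counter
            total, count = 0.0, 0
            entry_timer = 0
        if upload_timer >= upload_period:   # hand the file off, start anew
            uploads.append(data_file)
            data_file = []
            upload_timer = 0
    return uploads
```

With a sample period of 1, an entry period of 2, and an upload period of 4, each uploaded file holds two entries, each the mean of two consecutive samples.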
The process of collecting the data is typically implemented using timer interrupts of the processor 330 instead of the timer loops of FIG. 56 to minimize the CPU usage of the software agent 324. The process may also be implemented using a sleep command.
As shown in FIG. 57, the process of sending the data file from the local server 312 begins when the agent software 324 reads (402) a closed data file into memory. Compressor 350 compresses (404) the data contained within the file using the BZIP2 algorithm before encryptor 352 encrypts (406) the compressed data using the Sapphire algorithm. Agent software 324 generates (408) an email message from the encrypted data by, for example, adding source and destination addresses to the email message. Agent software 324 incorporates the encrypted file in the email message as an attachment. SMTP sender 354 then sends (410) the email message using the SMTP protocol. Agent software 324 then checks (412) if the email message was successfully sent. If it was not, agent software 324 closes (420) the unsent file and terminates the process of sending files. The closed file is resent at a later time when the agent software is invoked.
Otherwise, if the email message was successfully sent, agent software 324 checks (414) whether there are any other closed files that have not been sent. If there are none, software agent 324 terminates the process of sending files. Otherwise, if there is a closed unsent file, agent software 324 reads (416) the first of the unsent files to memory and performs the process (404-420) of sending the file.
As shown in FIG. 58, when the engine 3112 receives (502) data from the POP client 3110, it selects (504) the first data type for processing. The engine 3112 retrieves (506) tolerances and thresholds for the rules corresponding to the selected data type. The engine then reduces (508) the data being analyzed to produce a smaller data set that captures the information contained within the larger data set. The engine, for example, reduces CPU usage data to one entry per minute by only selecting the CPU usage datum with the largest value in each minute. By reducing the data, the time required to analyze the data is reduced.
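The per-minute reduction described for CPU usage data may be sketched as follows. The representation of the data as pairs of a timestamp in seconds and a value, and the function name `reduce_to_minute_peaks`, are assumptions made for this example.

```python
from typing import Iterable


def reduce_to_minute_peaks(
        samples: Iterable[tuple[float, float]]) -> list[tuple[int, float]]:
    """Keep only the largest value observed in each one-minute bucket.

    `samples` holds (timestamp_seconds, value) pairs; the result holds one
    (minute_start_seconds, peak_value) pair per minute, in time order.
    """
    peaks: dict[int, float] = {}
    for timestamp, value in samples:
        minute = int(timestamp // 60) * 60
        if minute not in peaks or value > peaks[minute]:
            peaks[minute] = value
    return sorted(peaks.items())
```

Selecting the maximum rather than the mean preserves short spikes, which is consistent with the goal of detecting entries that exceed a threshold.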
The engine 3112 then checks (510) whether the data needs to be extrapolated to predict future trends or needs. File system or logical drive data, for example, may need to be extrapolated to allow the engine to identify a need to update or replace resources to keep up with future demands on the local server 312. If the data needs to be extrapolated, the engine extrapolates (512) the reduced data. The engine 3112 then determines (514) the number of entries, if any, in the selected data that exceed the threshold of the corresponding rule. The engine 3112 then checks (516) if no entries in the selected data exceed the threshold of the corresponding rule. If no entries exceed the threshold, the report generator 3116 presents (518) a first display, such as a set of traffic lights that has the green light on, in the report before generating (532) natural language text to include in the report. Otherwise, if some entries exceed the threshold, the report generator 3116 generates (520) and presents blow-ups for entries exceeding the threshold. The blow-ups contain more detailed information about the entries that exceed the threshold values and are typically used by an administrator to determine why the threshold value was exceeded. The engine 3112 then checks (522) if the number of entries that exceed the threshold value is below the tolerance value of the corresponding rule. If it is, then the report generator 3116 presents (524) a second display, such as a set of traffic lights that has the yellow light on, before generating (532) natural language text to include in the report. Otherwise, if the number of entries that exceed the threshold value is above the tolerance value of the corresponding rule, the engine 3112 checks (536) whether all the entries exceed the threshold value. If not all of the entries exceed the threshold value, the report generator 3116 presents (528) a third display, such as a set of traffic lights with the red light on. Otherwise the report generator 3116 presents (530) a fourth graphic display that includes the red light and a warning that the resources represented by the data are insufficient. The report generator then selects (532) natural language text describing the selected data, as described above with reference to FIG. 55D, and presents the selected text in the report. The engine 3112 selects the next data type and repeats the process (506-532) described above.
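The choice among the four displays may be summarized, by way of example, in the following sketch. The string labels and the function name `classify` are hypothetical, the threshold is modeled as an upper bound, and the case in which the number of out-of-range entries exactly equals the tolerance, which the description leaves open, is treated here as not being below the tolerance.

```python
def classify(entries: list[float], threshold: float, tolerance: int) -> str:
    """Map a set of entries to one of the four displays of FIG. 58."""
    over = sum(1 for e in entries if e > threshold)
    if over == 0:
        return "green"                # first display: no entry out of range
    if over < tolerance:
        return "yellow"               # second display: within tolerance
    if over < len(entries):
        return "red"                  # third display: tolerance exceeded
    return "red-insufficient"         # fourth display: every entry too high
```

In a fuller implementation each label would select a graphic, such as a traffic light image, and the fourth label would additionally trigger the warning that the resource is insufficient.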
As shown in FIG. 59, the report 602 is, for example, a HyperText Markup Language ("HTML") document or a Portable Document Format ("PDF") document that is attached to the reply email message from the monitor server. Each report 602 has a brief introduction 604 that includes an inventory of the subsystems of the local server 312. The report 602 also includes an executive summary 608, which, for example, has paragraphs 610a describing the performance of the CPU or processor 330, paragraphs 610b describing the performance of memory, paragraphs 610c describing the performance of the disks, and paragraphs 610d describing the performance of the network. Each of the paragraphs 610 includes a hypertext link 612 to more detailed information about the corresponding component. Each of the paragraphs may also have possible problems 614 in the corresponding component highlighted or emphasized to draw the reader's attention, as previously described.
The report 602 has details 616 which are divided into sections corresponding to the paragraphs in the executive summary 608. The details 616 include, for example, a CPU section 618a, a memory section 618b, a disk section 618c, and a network section 618d. Each of the sections contains usage information 620 that includes a graphic, such as a traffic light indicating the performance of the component, natural language text describing the performance of the component in words, and a graph showing a plot of the data of the component. Thus, the report presents the performance data in a format that is easy to understand. The report 602 also includes blow-up detail 630 for each set of performance data that is not within the range of values set by the threshold values. The blow-up detail 630 includes resource usage 632 for each process. The resource usage 632 includes CPU usage 632a, input/output usage 632b, and memory usage 632c.
The report 602 also includes information on the occupancy of resources, such as paging space occupancy 640, file system occupancy 644, and logical drive occupancy 648. The occupancy information typically includes extrapolations to allow an administrator to predict when the resources corresponding to the occupancy information will need to be updated or replaced. For instance, if the extrapolated occupancy data shows that the file system will be fully occupied in the next 15 days, an administrator may configure the server to expand an expandable resource, such as paging space. The administrator may also start looking into an upgrade or replacement of the components on the local server 312 to keep up with the demand for file system space. A sample report is attached hereto as appendix G.
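The description does not specify how the occupancy data is extrapolated. Purely as one possibility, a least-squares straight line may be fitted to daily occupancy percentages and projected forward to the 100% level, as in the following sketch. The linear model, the daily sampling, and the function name `days_until_full` are all assumptions of this example.

```python
from typing import Optional


def days_until_full(daily_occupancy_pct: list[float]) -> Optional[float]:
    """Estimate days from the last sample until occupancy reaches 100%.

    Fits a least-squares line to one sample per day. Returns None when
    occupancy is flat or falling, since no fill date can be projected.
    """
    n = len(daily_occupancy_pct)
    if n < 2:
        raise ValueError("need at least two daily samples")
    mean_x = (n - 1) / 2
    mean_y = sum(daily_occupancy_pct) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in enumerate(daily_occupancy_pct))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    day_full = (100.0 - intercept) / slope   # day index at which line hits 100
    return max(0.0, day_full - (n - 1))      # measured from the last sample
```

For five daily samples rising by two percentage points per day from 70%, the fitted line reaches 100% eleven days after the last sample.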
As shown in FIG. 60, to install agent software 324 (FIG. 54), an administrator loads (702) a web page from web server 391 onto web browser 343. The web page contains instructions for installing the software. Based on the instructions, the user creates (704) a customer account on the monitor server 320. The customer account is associated with a customer ID 3100 and a user ID 399. The customer ID 3100 and the user ID 399 are, for example, generated by the monitor server 320 using a hash function with the customer's phone number as the input to the hash function. The customer ID typically has fourteen digits, twelve of which are from the hash function and two of which provide a checksum of the other twelve digits. The machine ID also has fourteen digits, two of which are a checksum and twelve of which are from a hash function. The machine ID is generated differently, depending on the operating system 334 of the local server 312. For example, on a UNIX RISC machine, the twelve digits of the machine ID are obtained from the unique UNAME of the machine, provided by the operating system. The user then downloads (706) the agent software 324 from the monitor server 320 and installs (708) it on the local server 312. The user then registers (710) the agent software 324 with the monitor server 320, thereby creating a unique machine ID 3102 associated with the local server. The machine ID 3102 is also associated with the user ID 399 and customer ID 3100 of the user.
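A fourteen-digit identifier of the kind described, with twelve digits taken from a hash and two checksum digits, may be sketched as follows. The description names neither the hash function nor the checksum rule, so this example assumes SHA-256, a checksum equal to the sum of the twelve digits modulo 100, and placement of the checksum at the end; the function names `make_id` and `is_valid_id` are likewise hypothetical.

```python
import hashlib


def make_id(seed: str) -> str:
    """Derive a 14-digit ID: 12 hash digits followed by a 2-digit checksum."""
    digest = hashlib.sha256(seed.encode("utf-8")).digest()
    # Reduce the digest to exactly twelve decimal digits (zero padded).
    body = f"{int.from_bytes(digest, 'big') % 10**12:012d}"
    checksum = sum(int(d) for d in body) % 100   # assumed checksum rule
    return f"{body}{checksum:02d}"


def is_valid_id(candidate: str) -> bool:
    """Check the length, the digits, and the checksum of a candidate ID."""
    if len(candidate) != 14 or not (candidate.isascii() and candidate.isdigit()):
        return False
    body, check = candidate[:12], candidate[12:]
    return sum(int(d) for d in body) % 100 == int(check)
```

The seed would be the customer's phone number for a customer ID, or a machine-specific value such as the UNAME output for a machine ID; the checksum lets an installer reject a mistyped ID before contacting the monitor server.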
The process of downloading and installing the Windows version of agent software 324 will now be described with reference to FIGS. 61-91.
As shown in FIG. 61, the user loads the web page 802 onto the web browser 343 by typing a uniform resource locator (URL) 804 into an input 806 of the browser 343. The browser 343 loads the web page 802. Web page 802 includes a hyperlink 808. When the user clicks on the hyperlink 808, the web browser 343 loads an instruction web page, which is described below with reference to FIG. 62.
As shown in FIG. 62, upon clicking on the hyperlink 808, the web browser 343 loads an instruction web page 902 that contains instructions for installing agent software 324. Web page 902 contains a menu section 904 that has links 904a-904d that a user can click on to view instructions for performing the steps in the installation of agent 324. The user can click on link 904a for instructions on creating an account, link 904b for instructions on downloading agent software 324, link 904c for instructions on installing agent software 324, and link 904d for registering equipment. A section 906 of web page 902 contains instructions for creating an account. After reading the instructions, the user may click on link 908 to create an account.
FIG. 63 shows a section of the web page 902 that contains instructions 910 for downloading agent software 324 and instructions 912a for installing the agent. The user moves scrollbar 913 to reveal the section shown in FIG. 63. After reading the instructions, the user may click on hyperlink 914 to download agent software 324. FIG. 64 shows another section of the web page 902 containing additional instructions 912b for installing the software.
FIG. 65 shows yet another section of the web page 902 containing instructions 920 for registering the local server 312 or enabling the equipment. After reading the instructions, the user may register the server 312 by clicking on a hyperlink. Web page 902 also contains a section that has additional instructions for users that have already installed the agent software 324.
FIG. 66 shows a first section 1300a of web page 1300 that is loaded by web browser 343 when the user clicks on hyperlink 908 (FIG. 62) to create an account. Section 1300a collects personal data from the user. Section 1300a includes an input 1302 for entering a salutation that is to be used when referring to the user, an input 1304 for entering the first name of the user and an input 1306 for entering the last name of the user. Section 1300a also includes an input 1310 for selecting the user's job title and an input 1312 for entering the user's department. Section 1300a also includes an input 1314 for selecting a language that the user would like to communicate in and an input 1312 for selecting the medium through which the user heard about the web server 391.
FIG. 67 shows a second section 1300b of the web page 1300 for entering information about a company that the user is associated with. Section 1300b includes an input 1320 for entering a name of the company, inputs 1322-1332 for entering the company's address information, input 1334 for entering telephone information and input 1336 for entering fax information. Section 1300b also has inputs 1338-1344 for entering demographic information about the company. The user uses input 1338 to select an industry that the company is associated with, input 1340 to select the number of employees in the company, input 1342 to select the number of servers in the company, and input 1344 to enter the number of server pools in the company.
FIG. 68 shows a third section 1300c of the web page 1300 for entering authentication or "login" information about the user. Section 1300c includes an input 1350 for entering an email address that the monitor server 320 uses to communicate with the user and an input 1352 for confirming the email address to ensure that the user does not mistype the address. Section 1300c also contains an input 1354 for entering a login name, which is stored as user ID 399 on the monitor server 320. The user uses inputs 1356 and 1358 to enter and confirm a password for authenticating the user. Section 1300c also contains inputs 1360-1362 for entering information that the user may use to retrieve a forgotten password. Input 1360 is used for entering a question, such as "what is your mother's maiden name?", that only the user would know and input 1362 is for entering the answer to the question in input 1360. Should the user forget his password, monitor server 320 presents the question from input 1360 to the user. If the user can provide the answer from input 1362, the server provides the password from input 1354 to the user. Thus, monitor server 320 collects authentication information from the user.
FIG. 69 shows yet another section 1300d of the web page 1300 for creating an account. Section 1300d includes a button 1370 that the user may click on to submit the information entered in sections 1300a-1300c to the server. Section 1300d also contains a second button 1372 that the user may use to clear all the data entered in sections 1300a to 1300c if the user wants to re-enter the data.
FIG. 70 shows a web page 1700 that is presented to the user after clicking on the button 1370 (FIG. 69) to submit account information. Web page 1700 includes a customer ID number 1702 for the user. Web page 1700 also contains information 1703 notifying the user that the customer ID has been sent to the email address 1350 (FIG. 68) provided by the user. Web page 1700 includes a hyperlink 1704 that the user may use to download agent software 324.
FIG. 71 shows a first section 1800a of a web page 1800 that the user may use to download agent software 324. The section 1800a includes a hyperlink 1802a that the user may click on to obtain additional information about installing the agent 324 on a UNIX operating system. Section 1800a also includes a hyperlink 3102b that the user may click on to obtain additional installation information and a hyperlink 1802b that the user may click on to retrieve additional information on installing the agent 324 on a Microsoft Windows operating system.
FIG. 72 shows a second section 1800b of the web page 1800. Section 1800b includes a first portion 1804a relating to installing the agent on a Microsoft Windows computer and a second portion 1804b relating to installing the agent on a Linux computer. The first portion 1804a includes a hyperlink 1806a for downloading a Windows version of the agent software 324 using the hypertext transfer protocol ("HTTP") and a second hyperlink 1808a for downloading the Windows version of the agent software using the file transfer protocol ("FTP"). The first portion also contains information 1810a on the different versions of the Windows operating system supported by the Windows version of agent software 324. The second portion 1804b includes a hyperlink 1806b for downloading a Linux version of the agent software 324 using HTTP and a second hyperlink 1808b for downloading the Linux version of the agent software 324 using FTP. The second portion also contains information 1810b on the different versions of the Linux operating system supported by the Linux version of agent software 324.
FIGS. 73 and 74 also show sections 1800c and 1800d of the web page 1800. The sections 1800c, 1800d contain portions 1804c, 1804d, 1804e, which respectively relate to installing agent software 324 on the IBM RS 6000 operating system, Sun operating systems, and HP-UX operating system. Each of the portions includes hyperlinks 1806c, 1806d, and 1806e for downloading agent software 324 via HTTP and hyperlinks 1808c, 1808d, and 1808e for downloading agent software 324 via FTP. Each of the portions also includes information 1810c, 1810d, and 1810e about the different versions of the corresponding operating system that are supported by the agent software 324.
As shown in FIG. 75, upon clicking on one of the download hyperlinks 1806a-1808e (FIGS. 72-74), the web browser 343 presents the user with a dialog 2200 asking the user whether the user would like to run agent installation software or to save it on the user's hard drive. The user uses option controls 2202 and 2204 and then clicks on an "OK" button 2206 to submit the user's choice. The user may also cancel the download by clicking on a "cancel" button 2208.
FIG. 76 shows the dialog 2300 that is presented to users who opt to save the agent installation software in the dialog of FIG. 75. The dialog 2300 includes an input 2302 for selecting a directory where the agent installation software should be saved. The dialog also includes an input 2304 for selecting a name that should be assigned to the agent installation software. The user submits his selections by clicking on a "save" button 2306. The user may also cancel the download by clicking on a "cancel" button 2308. After saving the agent installation software, the user may execute the software by clicking on an icon associated with the installation software.
FIG. 77 shows a dialog 2400 that is presented to a user upon clicking on the installation software. The dialog 2400 includes a message 2402 welcoming the user to the installation process. The user may continue with the process by clicking the "next" button 2404. The user may also cancel the installation by clicking on the cancel button 2406.
FIG. 78 shows a dialog 2500 that prompts the user for a customer ID 3100 (FIG. 54). A valid customer ID is required before the agent software 324 can be installed. As previously described with reference to FIG. 70, customer IDs 3100 are assigned to users when they create an account on the monitor server 320. The dialog 2500 includes an input 2502 for entering the customer ID, a "next" button 2504 for submitting the entered customer ID and proceeding with the installation process, a "back" button 2506 for moving back in the installation process, and a "cancel" button 2508 for terminating the installation.
FIG. 79 shows a dialog 2600 for entering SMTP information. Dialog 2600 includes an input 2606 for entering an SMTP server, such as SMTP server 386, which will be used to transmit reports to the monitor server 320. Dialog 2600 also includes an input 2604 for selecting an Internet Protocol ("IP") port that will be used to communicate with the SMTP server and an input 2606 for entering an email address from which the reports should be transmitted. Dialog 2600 also includes a "next" button 2608 for submitting the data entered in the dialog 2600 and continuing with the installation process.
FIG. 80 shows a dialog 2700 that is used to select a directory in which agent software 324 should be installed. The user may change the directory by clicking on "browse" button 2704, which opens a directory selection dialog. The user submits the selected directory and proceeds with the installation process by clicking on the "next" button 2706.
Fig. 81 shows a dialog 2800 that is used to select whether the user would like a typical, compact, or custom installation based on selection inputs 2802. The compact option only installs the minimum components of agent software 324 that are required for the agent to operate. The compact option is often chosen on computers that have limited storage space. The custom option allows the user to select the components that they would like to install. The user submits their selection and continues with the installation process by clicking a "next" button 2804.
FIG. 82 shows a dialog 2900 that is presented during a custom installation to allow the user to select the components they would like to install. Options 2902 are used to select whether the user would like to install computer program files, documentation, or sample files of the agent software 324. The user submits their selection and proceeds with the installation process by clicking on the "next" button 2904.
FIG. 83 shows a dialog 33000 that is used to enable the monitor server 320 to receive data from the agent software 324 on the local server 312. The user may opt to enable the service by selecting input 33002. The user may also opt to enable the service later by selecting input 33004. The user can then enable the software on the web pages 393 presented by the monitor server 320. The user submits their selection and proceeds with the installation process by clicking the "next" button 33006.
FIG. 84 shows a dialog 33100 that is presented to the user to allow the user to enter information that is required to enable the monitor server 320 to receive data from the local server 312. The dialog 33100 includes an input 33102 for entering an email address where monitoring reports for the local server 312 should be sent. The dialog 33100 also includes inputs 33104 and 33106 for entering and confirming a password for encrypting information sent from the monitor server 320 to the local server 312. The user submits their selection and proceeds with the installation process by clicking the "next" button 33108.
FIG. 85 shows a dialog 33200 informing the user of the progress in transmitting the enablement information to the monitor server 320. The dialog 33200 includes a log window 33202 containing a log of communications between the local server 312 and the monitor server 320. The user proceeds with the installation process by clicking the "next" button 33204.
FIG. 86 shows an email message 33300 that is transmitted by the monitor server 320 to the email address entered in input 33102 (FIG. 84) to inform the user that the service was successfully enabled. Message 33300 includes a machine ID 33302 and a machine name 33304 that are assigned to the local server 312 by the monitor server 320, in addition to information 33308 about the number of processors and the class of the equipment on the local server 312. Message 33300 also includes a customer ID 33306 associated with the user and a password 33310 for encrypting messages relating to the local server 312.
FIG. 87 shows a dialog 33400 that is presented to the user when the installation is complete. The user may close the dialog by clicking on the finish button 33402.
FIG. 88 shows an email message 33500 that is transmitted by the monitor server 320 to the email address entered in input 33102 (FIG. 84) to inform the user that agent software 324 was successfully installed. Message 33500 includes the name 33502, the version 33504 of the operating system 334, the number 33506 of processors 330, and the amount 33508 of memory on the local server 312.
FIG. 89 shows a first panel 33600 of a user interface for agent software 324. Panel 33600 displays the version 33602 of the operating system, the name 33604, and the machine ID 33606 of the local server 312. Panel 33600 also contains information 33610 about the data retriever and information 33608 about the SMTP sender 354. The user may switch to a second panel 3700 (FIG. 90) by clicking on selector 3612.
FIG. 90 shows a second panel 3700 of the user interface of agent software 324. Panel 3700 includes an input 3702 for selecting a data upload interval or period, an input 3704 for changing the customer ID 3100, an input 3706 for entering a path to a file where the collected data should be stored, an input 3708 for entering a path to a file where the activities of agent software 324 should be logged, an input 3710 for disabling the delivery of reports by mail for users who only want to view reports through a web browser, an input 3712 for selecting an email address where reports are to be sent, an input 3714 for selecting an email address from which collected data should be sent to the monitor server 320, an input 3716 for changing the SMTP server, and an input 3718 for selecting the SMTP port. The user submits any selections entered on panel 3700 by clicking "apply" button 3720. The user may switch to a third panel of the user interface by clicking on selector 3722.
FIG. 91 shows a third panel 3800 of the user interface of agent software 324. Panel 3800 includes a first button 3802 for starting agent software 324 and a second button 3804 for stopping the agent software. The agent software 324 is normally started automatically when the computer is turned on, as described above. Button 3804 may be used to stop the agent software 324. Button 3802 may later be used to restart the agent software 324. Button 3806 may be used to send a test email message, known as a probe, to the monitor server 320. The test email message is used as a diagnostic tool to determine whether email is being conveyed from the SMTP sender 354 to the monitor server 320. The agent software 324 may be used on a server that is not protected by a firewall.
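The probe described above can be illustrated with a minimal sketch. The specification does not give the probe's format, so the subject line, body text, addresses, and the function name `build_probe` are all hypothetical; only the idea of a test email sent toward the monitor server comes from the text.

```python
from email.message import EmailMessage


def build_probe(sender: str, monitor_address: str, machine_id: str) -> EmailMessage:
    """Build a diagnostic test email (hypothetical format, not from the specification)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = monitor_address
    msg["Subject"] = f"PROBE {machine_id}"  # assumed subject convention
    msg.set_content("Diagnostic probe: no monitoring data attached.")
    return msg


# Placeholder addresses and machine ID, used only for illustration.
probe = build_probe("agent@example.com", "monitor@example.com", "M-0001")

# Delivery would go through the configured SMTP server and port, for example:
#   import smtplib
#   with smtplib.SMTP(smtp_host, smtp_port) as smtp:
#       smtp.send_message(probe)
```

Building the message separately from sending it keeps the probe content easy to inspect when email delivery is the thing being diagnosed.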
Other embodiments not described herein are also within the scope of the following claims.
Appendix A
Figure imgf000059_0001
AUTOMATOS Asset Wizard
Consolidated IT Report
Computers by department
Department Computers %
ACCELA 6%
AUTOMATOS 6%
COMDEX 6%
COMPAQ 22%
DEMO 17%
FISH&RICHARDSON 11%
NATIONAL ACCOUNTS 6%
OPTIGLOBE - SANDRO 6%
PAULO ALMEIDA 6%
R&D 11%
THEBANKANDTRUST 6%
CPU
CPU Type Number of computers %
Pentium II or Pentium II Xeon (Deschutes) 1 6%
AMD-K6(tm) 3D processor 1 6%
Celeron (Mendocino) 1 6%
Pentium III (Coppermine) 8 44%
Mobile Pentium II 2 11%
Pentium III Xeon (Coppermine) 1 6%
Celeron (Coppermine) 4 22%
Operating System
OS Number of computers %
WINDOWS 2000 15 83%
WINDOWS 98 2 11%
WINDOWS ME 1 6%
Memory
Memory Number of computers %
0-64 3 17%
64-128 4 22%
128-256 9 50%
256-512 1 6%
above 512 1 6%
Installed software
Software Installations
Automatos Desktop Agent (Corporate Edition) 18
WebFldrs 1
Adobe Acrobat .0 10
WinZip 8
Adobe Acrobat 5.0 8
ICQ 7
LiveUpdate 6
Windows Media Player 7.1 6
Intel(R) PRO Ethernet Adapter and Software 6
Automatos Server Agent 3.1.2 5
NetMeeting 3.01 5
Windows 2000 Service Pack 2 5
PowerArchiver 5
Mjufce Components 4
RealPlayer Basic 4
Winamp (remove only) 4
Windows 2000 Hotfix (Pre-SP3) [See Q300972 for more information] 4
Windows 2000 Hotfix (Pre-SP1) [See Q25393 for more information] 4
Compaq Management Agents 4
Norton AntiVirus Corporate Edition 4
Microsoft Office 2000 SR-1 Standard 4
Add-ons 3
Block Diagrams 3
Block Diagrams Help 3
Block Diagrams Samples 3
Borders and Backgrounds 3
Borders and Backgrounds Help 3
Callouts and Connectors 3
Callouts and Connectors Help 3
CAD Drawing Display 3
CAD Drawing Display Samples 3
Clip Art and Symbols 3
Clip Art and Symbols Help 3
Custom Properties Editor 3
Database Wizard 3
Database Wizard Samples 3
Developing Visio Solutions 3
Developing Visio Solutions Help 3
Flowcharts 3
Flowcharts Help 3
Flowcharts Samples 3
Forms and Charts 3
Forms and Charts Help 3
Forms and Charts Samples 3
Graphics Filters 3
Help for Visio 2000 (HTML Help) 3
Maps 3
Maps Help 3
Maps Samples 3
Microsoft ActiveSync 3.1 3
Microsoft Office Integration 3
Microsoft Visio 2000 3
Microsoft Visual Studio Service Pack 3 3
Network Diagrams 3
Network Diagrams Help 3
Network Diagrams Samples 3
Office Layout 3
Office Layout Help 3
Office Layout Samples 3
Organization Charts 3
Organization Charts Help 3
Organization Charts Samples 3
Page Layout Wizard 3
Program Files 3
Program Files Help 3
Project Schedules 3
Project Schedules Help 3
Project Schedules Samples 3
Property Reporting Wizard 3
QuickTime 3
Release Notes 3
Sample Drawings 3
Save as HTML 3
Solutions 3
VBA 3
Visio 3
Visio Core Files 3
DoS attack generating large number of zero length entries on the reassembly queue - KB article Q259728 3
Microsoft Internet Explorer 5.5 SP1 3
Windows 2000 Hotfix (Pre-SP2) [See Q280838 for more information] 3
Compaq ARMS Server Agent 3
Microsoft Office 2000 Premium 3
Microsoft Office 2000 SR-1 Premium 3
Palm Desktop 3
Advanced Network Diagramming 2
Advanced Network Diagramming Help 2
Advanced Network Diagramming Samples 2
Automatos Server Agent 2
Citrix ICA Client 2
Database Design 2
Database Design Help 2
Database Design Samples 2
Directory Services 2
Directory Services Help 2
Directory Services Samples 2
Internet Diagrams 2
Internet Diagrams Help 2
Internet Diagrams Samples 2
LiveAdvisor (Symantec Corporation) 2
MSDE 2
Online Documentation 2
Shape Explorer 2
Shape Explorer Help 2
SmartShape Wizard 2
Software Design 2
Software Design Help 2
Software Design Samples 2
Stencil Report Wizard 2
Symantec pcAnywhere 2
Lotus SmartSuite Release 9.5 2
ATI Display Driver 2
Carbon Copy 32 2
Compaq iPAQ H3000 Reference Guide 2
Compaq iPAQ H3000 Tour 2
Compaq Remote Services 2
Dicionário Webster 2
LG PC-Sync 2
Microsoft Encarta Encyclopedia 2000 2
Microsoft Pocket Streets 2001 2
Microsoft Project 2000 2
MSN Messenger Service 3.6 2
Norton AntiVirus 5.0 for Windows NT 2
Pivotal Relationship 99 2
The DIGITAL Commemorative CD-ROM 2
Transcriber CE Uninstall 2
Windows Media Player 7 2
Audiogalaxy Satellite 2
Copernic 2001 Pro 2
D-Link DWL-650 Control Utility 2
Desktop Architect 2
DVDExpress 2
Eudora 2
Intel SpeedStep technology Applet 2
McAfee VirusScan v .0.3 (Licensed) 2
RealPlayer 7 Basic 2
Synaptics TouchPad 2
WinVNC 3.3.3 2
AT&T Global Network Dialer 2
AvantGo Client 2
Corel Applications 2
LiveReg (Symantec Corporation) 2
LiveUpdate 1.6 (Symantec Corporation) 2
Mobile Link 2
SiSoft Sandra 2001se Standard 2
VitalSigns Software's Net.Medic 2
XTNDConnect PC for MyPalm 2
Tera Term Pro 2
Forte(tm) for Java(tm), release 3.0, Community Edition 2
LeechFTP me 2
Netscape Communicator 4.7 2
SoundMAXWDM 2
AirCard 300 for Pocket PC
AirCard 3XX
Audio Converter 3.0 (Limited Edition)
AutoDiscovery and Layout
AutoDiscovery and Layout Help
AutoDiscovery and Layout Samples
BatteryScope
BlueKite
Bonus Pack Documentation
BTrieve
CAD Drawing Converter
CAD Drawing Converter Help
CAD Drawing Converter Samples
Compaq Insight Manager 4.90
Developing Visio Solutions VNOM Sample
DivX Codec 3.1alpha release
EarthLink 5.0
F-Secure SSH Tunnel & Terminal
GASP Net 5.2.4
HotKey Utility
InoculateIT
Java 2 Runtime Environment Standard Edition v1.3
Jog Dial Utility
LDAP Driver
Microsoft IntelliType Pro
Microsoft Office 2000 SR-1 Professional
Microsoft Repository
Microsoft Software Inventory Analyzer
Minitab Student Release 12
Motion JPEG Software Decoder
NDS Extensions
PowerPanel
Print ShapeSheet
Program Files Enterprise
Program Files Enterprise Help
Project Accounting
Quicken Basic 2000
Reflection Suite for X 8.0
Release Notes Enterprise
Sierra Wireless AirCard 510
Sony Notebook Setup
Sony Utilities Dll
Terminal Services Client
UML Specification
VERITAS Backup Exec 8.x
VERITAS NetBackup
Visio(R) Network Equipment
WinAMP Skin Importer
Windows 2000 Application Compatibility Update
Windows 2000 Hotfix (Pre-SP3) [See Q293826 for more information]
Windows Media 7 PowerToys
WinRAR archiver
Yahoo! Messenger
ATI Win2k Display Driver
Comet - Comet Cursor web browser extension
Comet - My Comet Cursor desktop application
Compaq 56K (V.90) Mini PCI
Compaq ActiveUpdate
Compaq DMI Insight Web Based Agents
Compaq Insight Manager 7
Compaq PowerCon Enhancements V1.00
Compaq Survey Utility
Compaq Version Control Repository Manager 1.0
Compaq Wireless LAN
Coupons
Halloweenl Screen Saver
Hoyle Card Games 4
InterVideo WinDVD
Microsoft Image Composer 1.5
Microsoft Project 98
Microsoft Web Publishing Wizard 1.53
Microsoft XML Parser
QFD Capture 4.0
turkey2 Screen Saver
WebEx Client
Yahoo! Player
Actuate LRX for Microsoft Internet Explorer
Aquatica 3
Carbon Copy Access Edition
CentraOne
Comet Cursor
Compaq Diagnostics For Windows NT
Compaq Insight Manager LC Remote Management
Easy Access Keyboard
FunnelWeb
Internet Explorer Error Reporting
Linksys PrintServer Driver
Localizar... Na Internet
Macromedia Dreamweaver 4
Macromedia Extension Manager
MGI PhotoSuite 4 (Remove Only)
Microsoft Internet Explorer 5 and Internet Tools
Microsoft Internet Explorer 5.01 SP2
Microsoft Outlook Express 5
Microsoft Windows 2000 Resource Kit
Microsoft Windows Critical Update Notification
Microsoft Windows Media Player 6.1
MouseWare 9.01
MUSICMATCH Jukebox
Netscape Communicator 4.61
Netscape Communicator 4.78
NetSupport PC-Duo
Norton Ghost
QuickCam
RealJukebox
S3 Gamma
S3 Information Property Sheet Page
S3 Refresh
TopStyle Lite (Version 1.5)
webHancer Customer Companion
Windows 2000 Hotfix (Pre-SP1) [See Q259728 for more information]
Windows 2000 Hotfix (Pre-SP2) [See Q260233 for more information]
Windows 2000 Hotfix (Pre-SP3) [See Q252795 for more information]
3DGreetings Embedded Player
Adobe Acrobat Reader 3.02
Adobe Circulate
AnswerWorks Runtime
Automatos Server Agent 3.1.3
Compaq Diagnostics for Windows
Compaq Insight Management Agents
Compaq Insight Management Web Agents
Creative Video Blaster WebCam Go Plus Driver
Creative WebCam Go Control
D-Link Wireless LAN AP Manager
Discador iG v3.01
Find... On the Internet
FR Employee List Wizard 1.1
Hands Client
HP 9100c Digital Sender - Client
HP PrecisionScan LTX
HP Scan-to-Web Wizard
IomegaWare
IRPF2001
LEXIS-NEXIS Office 97
Lotus Organizer 97
Lotus SmartSuite Versão 9
Matrox PowerDesk 4.18.026
MGI PhotoSuite 8.06 (Remove Only)
MGI VideoWave SE+ (Remove Only)
Microsoft Internet Explorer 4.0 Setup Files
Microsoft Music Control
Microsoft Office Sounds
Microsoft Wallet
Microsoft Web Publishing Wizard 1.6
Motherboard Monitor 5.0
Norton AntiVirus 2001
PGPfreeware 6.5.2a
PhoneTools
Priority Packet
Receitanet 2001
Restore Winsock 1.1 Configuration
S3 Gamma Utility
S3DuoVue Utility
Screen Scroller Mouse
Show do Milhão 3
Stariab CATBrowser - beta v1.0
Stariab CATBrowser - beta v1.0 (C:\Program Files\CATBrowser)
TrueSync Products
VBA (2720)
VDOLive Player
Viking Utilities
Visio Professional
WestMate
ZoneAlarm
McAfee.com Automatic Installer
McAfee.com MailScan Components
Microsoft Internet Explorer 5.5 SP2
Microsoft Internet Explorer Administration Kit 5
Microsoft Office 2000 Resource Kit Tools and Utilities
Microsoft Office XP Professional with FrontPage
Discador iG v3.02
HSP56 MR Drivers
1st Page 2000 2.00 Free
602Pro LAN SUITE SendFax
602Pro PC SUITE 2000
AFPL Ghostscript 7.00
AFPL Ghostscript Fonts
Anonymizer Plugin (remove only)
CDex extraction audio
Charts v 1.12 Demo
Cooktop
DJ Java Decompiler v.2.8.8.54
GSview 4.01 beta
Headway review 2.4
HP LaserJet 4050 Printing System
IBM DB2
Java 2 Runtime Environment Standard Edition v1.3.0_01
Java 2 Runtime Environment Standard Edition v1.3.1
Java 2 SDK Standard Edition v1.3.0_02
Java 2 SDK Standard Edition v1.3.1
MarrowSoft Xselerator
Microsoft Internet Explorer 6
MouseWare 9.25
NeoTrace Express 3.0
QuickLatin
SoftQuad XMetaL 2.1 Eval
Syndeo v2.3
Turbo XML Version 2.2
UltraEdit-32 Uninstall
UP.SDK 3.2 for WML
UP.SDK 4.1
XML Transform 1.0
ATI Display Driver Utilities
AutoPlay Extender
Cabinet File Viewer
Command Prompt Here PowerToy
Contents Submenu
DBWT
Dell AccessDirect
Dell Dock Quick Install for Windows
Dell Internal Modem Diagnostics Tool
Dell Solution Center
DellEPro Internet Service
Explore From Here (Remove only)
Find... Extensions
FlexiCD (Remove only)
i-LEARN My Dell PC
I-Price
IBM AS/400 Client Access Express for Windows
Image Expert 2000 v3J2
Microsoft Interactive Training
Microsoft Office XP Media Content
Microsoft Office XP Small Business
Microsoft Proxy Client
NetObjects Fusion 5.0
Netscape 6 (6.1)
ODBC Pack 3.01 for Windows 95 Windows NT
Program Files Professional
Program Files Professional Help
Release Notes Professional
Send To Extensions PowerToy
Socket Viewer
Softex BayManager
Target Context Menu (Remove Only)
Trellix Web Dell Edition
Trellix Web Demo Shortcut Adder
TREEV
User's Guides
Machines which have not sent data for more than a day
Computer name Last contact
MCHAN 2001-09-04 16:28:05.000000
AUTOMATOS-2 2001-09-04 16:16:40.000000
AUTOMATO-MKTG 2001-08-31 20:31:28.000000
BCDCMARY 2001-09-04 16:27:40.000000
BCDCCIMTEST 2001-09-04 16:27:44.000000
CSIMOES02 2001-09-04 16:28:09.000000
CSIMOES02 2001-08-30 16:30:32.000000
GIBSON 2001-09-04 16:27:42.000000
EDIT-23AKH93 2001-08-24 16:49:02.000000
NMLESSA 2001-09-04 16:16:50.000000
WCLAYTON 2001-09-04 16:16:41.000000
WMARCIA 2001-09-04 10:36:55.000000
DEMO02 2001-08-31 20:01:29.000000
WNODARI 2001-09-04 16:27:52.000000
DRBT5 2001-08-31 17:11:28.000000
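A list like the one above can be derived by comparing each machine's last contact time with the time the report is generated. The following is a minimal sketch under stated assumptions: the report generation time is not given in the appendix, so `report_time` is a hypothetical value, and the function name `stale_machines` is illustrative.

```python
from datetime import datetime, timedelta


def stale_machines(last_contact, now, max_silence=timedelta(days=1)):
    """Return the names of machines whose last contact is older than max_silence."""
    return sorted(name for name, seen in last_contact.items() if now - seen > max_silence)


# Three rows copied from the table above.
last_contact = {
    "MCHAN": datetime(2001, 9, 4, 16, 28, 5),
    "DEMO02": datetime(2001, 8, 31, 20, 1, 29),
    "DRBT5": datetime(2001, 8, 31, 17, 11, 28),
}

# Assumed report time; the appendix does not state when the report was produced.
report_time = datetime(2001, 9, 5, 12, 0, 0)
stale = stale_machines(last_contact, report_time)
```

With this assumed report time, MCHAN was last seen less than a day earlier and is not flagged, while DEMO02 and DRBT5 are.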
Appendix B
Figure imgf000074_0001
Introduction
This performance analysis report for Oracle instance cmt1 is based on data collected on the host SGR-RJ-17 from 08/16/2001 at 18:00 to 08/21/2001 at 18:00.
The data used in this report was obtained from an exclusive collector, developed specially for this end, executing on the target instance with high resolution and low intrusion. This collector obtains data directly from the Oracle instance, without any other libraries or additional tools, with a minimum overhead on the system. The data collected is stored using a binary format, in order to provide persistence. When automatically sent, it is compressed and encrypted, to ensure fast delivery and confidentiality.
The content of this report is based on years of experience in performance analysis and capacity planning. The tool used to generate this report operates in a completely automatic way, without direct human intervention. It uses an extensible inference machine, based on heuristics and rules, and is subject to continuous improvements. Using concepts such as "watermarks" and tolerance, it is possible to determine if a computational resource usage is excessive and if the excess is relevant.
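The report does not define its "watermarks" or its tolerance. As a minimal sketch, assume a watermark is a usage threshold and the tolerance is the fraction of samples that may exceed it before the excess is treated as relevant. The function name and both parameters below are assumptions made for illustration, not the rules used by the actual inference machine.

```python
def assess_usage(samples, watermark, tolerance):
    """Return (excessive, relevant) for a series of usage samples.

    excessive: at least one sample is above the watermark.
    relevant:  the fraction of samples above the watermark exceeds the tolerance.
    """
    if not samples:
        return (False, False)
    over = sum(1 for s in samples if s > watermark)
    fraction_over = over / len(samples)
    return (over > 0, fraction_over > tolerance)


# Hypothetical CPU usage samples (percent), watermark of 90 %, tolerance of 50 % of samples.
brief_spike = assess_usage([10, 20, 95, 30], watermark=90, tolerance=0.5)
sustained = assess_usage([95, 96, 97, 30], watermark=90, tolerance=0.5)
```

Separating "excessive" from "relevant" lets a report mention a brief spike without raising it as a problem.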
During the monitoring period, the summary configuration of the instance, which has been obtained dynamically, was:
Instance number 1
Instance name cmt1
Machine name SGR-RJ-17
Version 8.1.6.0.0
Status OPEN
Parallel Database NO
Archive Mode STOPPED
Database status ACTIVE
Instance function PRIMARY_INSTANCE
Startup Time 06-08-2001 16:27:15
Figure imgf000076_0001
These are the highlights of the monitored period:
The number of connections was low during all of the monitored period.
The buffer cache hit ratio was high during most of the monitored period.
Server CPU usage was low during all of the monitored period.
Configuration data
The table below lists the configuration parameters with their current values, whether each value is the default, whether it may be modified during a session, and whether it has been modified since installation.
Parameter                      Value                   Default  Session modifiable  Modified
background_dump_dest           e:\inst00\cmt1\dbs\bkg  FALSE    FALSE               FALSE
buffer_pool_keep                                       TRUE     FALSE               FALSE
compatible                     8.1.0                   FALSE    FALSE               FALSE
cpu_count                      2                       TRUE     FALSE               FALSE
db_block_buffers               67936                   FALSE    FALSE               FALSE
db_block_size                  8192                    FALSE    FALSE               FALSE
db_file_multiblock_read_count  8                       FALSE    TRUE                FALSE
db_writer_processes            1                       TRUE     FALSE               FALSE
dbwr_io_slaves                 0                       TRUE     FALSE               FALSE
hash_area_size                 131072                  TRUE     TRUE                FALSE
java_pool_size                 20971520                FALSE    FALSE               FALSE
large_pool_size                614400                  FALSE    FALSE               FALSE
lock_sga                       FALSE                   TRUE     FALSE               FALSE
log_archive_dest                                       TRUE     FALSE               FALSE
log_archive_max_processes      1                       TRUE     FALSE               FALSE
log_archive_start              FALSE                   TRUE     FALSE               FALSE
log_buffer                     32768                   FALSE    FALSE               FALSE
log_checkpoint_interval        10000                   FALSE    FALSE               FALSE
log_checkpoint_timeout         1800                    FALSE    FALSE               FALSE
max_dump_file_size             10240                   FALSE    TRUE                FALSE
open_cursors                   100                     FALSE    FALSE               FALSE
optimizer_mode                 CHOOSE                  TRUE     TRUE                FALSE
pre_page_sga                   FALSE                   TRUE     FALSE               FALSE
processes                      50                      FALSE    FALSE               FALSE
shared_pool_reserved_size      9275596                 TRUE     FALSE               FALSE
shared_pool_size               185511936               FALSE    FALSE               FALSE
sort_area_retained_size        65536                   FALSE    TRUE                FALSE
sort_area_size                 65536                   FALSE    TRUE                FALSE
user_dump_dest                 e:\inst00\cmt1\dbs\usr  FALSE    FALSE               FALSE
Syst
Figure imgf000079_0001
9 Java Pool: 20 MB (2.75%)
Occupation Tablespaces
Figure imgf000080_0001
The table below shows the main configuration parameters for the tablespace. The graph shows that the tablespace usage rate was low all the time.
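Tablespace SYSTEM
Initial Extent 16384
Minimum Extent 1
Maximum Extent 505
Next Extent 16384
Percentage increase 50
Largest 146038784
Content PERMANENT
Status ONLINE
Figure imgf000080_0003
Figure imgf000080_0007
Figure imgf000080_0002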
Disk Usage
Figure imgf000080_0005
Figure imgf000080_0004
Figure imgf000080_0006
Figure imgf000080_0008
Disk Occupation Tablespaces
The table below shows the main configuration parameters for the tablespace. The graph shows that the tablespace usage rate was low all the time.
Tablespace TSCMDA01
Initial Extent 40960
Minimum Extent 1
Maximum Extent 505
Next Extent 40960
Percentage increase 50
Largest 490807296
Content PERMANENT
Status ONLINE
Disk Usage
Figure imgf000081_0001
Chart: Used (MB) and Free (MB) by date, 16/08 to 21/08; bar labels 79.37 %, 79.37 %, 79.40 %, 79.40 %, 79.40 %, 79.40 %
Disk Occupation Tablespaces
Figure imgf000082_0001
The table below shows the main configuration parameters for the tablespace. The graph shows that the tablespace usage rate was low all the time.
Tablespace TSCMIX01
Initial Extent 40960
Minimum Extent 1
Maximum Extent 505
Next Extent 40960
Percentage increase 50
Largest 183459840
Content PERMANENT
Status ONLINE
Disk Usage
Figure imgf000082_0002
Chart: Used (MB) and Free (MB) by date, 16/08 to 21/08
Disk Occupation Tablespaces
Figure imgf000083_0001
The table below shows the main configuration parameters for the tablespace. The graph shows that the tablespace usage rate was low all the time.
Tablespace TSCMTM01
Initial Extent 40960
Minimum Extent 1
Maximum Extent 0
Next Extent 40960
Percentage increase 50
Largest 382476288
Content TEMPORARY
Status ONLINE
Disk Usage
Figure imgf000083_0002
Chart: Used (MB) and Free (MB) by date, 16/08 to 21/08
Disk Occupation Tablespaces
Figure imgf000084_0001
The table below shows the main configuration parameters for the tablespace. The graph shows that the tablespace usage rate was low all the time.
Tablespace TSCMRB01
Initial Extent 40960
Minimum Extent 1
Maximum Extent 505
Next Extent 40960
Percentage increase 50
Largest 503799808
Content PERMANENT
Status ONLINE
Disk Usage
Figure imgf000084_0002
Chart: Used (MB) and Free (MB) by date, 16/08 to 21/08
Disk Occupation Tablespaces
The table below shows the main configuration parameters for the tablespace. The graph shows that the tablespace usage rate was low all the time.
Tablespace RBSWORK
Initial Extent 40960
Minimum Extent 1
Maximum Extent 505
Next Extent 40960
Percentage increase 50
Largest 838852608
Content PERMANENT
Status ONLINE
Disk Usage
Figure imgf000085_0001
Chart: Used (MB) and Free (MB) by date, 16/08 to 21/08
Disk Occupation Tablespaces
Figure imgf000086_0001
The table below shows the main configuration parameters for the tablespace. The graph shows that the tablespace usage rate was low all the time.
Tablespace TSCMDA02
Initial Extent 1048576
Minimum Extent 1
Maximum Extent 120
Next Extent 106496
Percentage increase 1
Largest 538615808
Content PERMANENT
Status ONLINE
Disk Usage
Figure imgf000086_0002
Chart: Used (MB) and Free (MB) by date, 16/08 to 21/08
Disk Occupation Tablespaces
The table below shows the main configuration parameters for the tablespace. The graph shows that the tablespace usage rate was high all the time. You may consider increasing the tablespace.
Tablespace TSCMIX02
Initial Extent 131072
Minimum Extent 1
Maximum Extent 120
Next Extent 131072
Percentage increase 1
Largest 5808128
Content PERMANENT
Status ONLINE
Disk Usage
Figure imgf000087_0001
Chart legend: Used (MB), Free (MB)
Disk Occupation Data file
The table below lists the data files in the database, with their tablespace, location, creation date, status, activation mode, occupied bytes and free bytes.
Tablespace  Name  Creation Date  Status  Active  Used  Free
SYSTEM  E:\INST00\CMT1\DBS\DTF\TSCMT1SYS011.DTF  23-07-2001 16:45:20  SYSTEM  READ WRITE  200 MB  139 MB
TSCMDA01  E:\INST00\CMT1\DBS\DTF\TSCMT1DA011.DTF  23-07-2001 17:37:17  ONLINE  READ WRITE  2000 MB  519 MB
TSCMIX01  E:\INST00\CMT1\DBS\DTF\TSCMT1IX011.DTF  23-07-2001 17:37:45  ONLINE  READ WRITE  1000 MB  179 MB
TSCMTM01  E:\INST00\CMT1\DBS\DTF\TSCMT1TM011.DTF  23-07-2001 17:37:58  ONLINE  READ WRITE  500 MB  364 MB
TSCMRB01  E:\INST00\CMT1\DBS\DTF\TSCMT1RB011.DTF  23-07-2001 17:38:12  ONLINE  READ WRITE  500 MB  480 MB
RBSWORK  E:\INST00\CMT1\DBS\DTF\RBSWORK.DTF  23-07-2001 18:25:50  ONLINE  READ WRITE  1000 MB  799 MB
TSCMDA02  E:\INST00\CMT1\DBS\DTF\TSCMT1DA021.DTF  01-08-2001 11:33:36  ONLINE  READ WRITE  1000 MB  513 MB
TSCMIX02  E:\INST00\CMT1\DBS\DTF\TSCMT1IX021.DTF  14-08-2001 13:34:51  ONLINE  READ WRITE  300 MB  5 MB
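The data file table can be used to flag tablespaces that are close to full, which is consistent with the recommendation given for TSCMIX02. The sketch below is hedged: it assumes the "Used" column holds the allocated file size (the values are round numbers, and the "Largest" free extents reported earlier match the "Free" column), and the 10 % threshold and the name `nearly_full` are hypothetical choices, not values taken from the report.

```python
# (size_mb, free_mb) per tablespace, copied from the data file table.
# Assumption: the "Used" column is the allocated file size, not the occupied space.
datafiles = {
    "SYSTEM": (200, 139),
    "TSCMDA01": (2000, 519),
    "TSCMIX01": (1000, 179),
    "TSCMTM01": (500, 364),
    "TSCMRB01": (500, 480),
    "RBSWORK": (1000, 799),
    "TSCMDA02": (1000, 513),
    "TSCMIX02": (300, 5),
}


def nearly_full(files, min_free_fraction=0.10):
    """Return tablespaces whose free space is below min_free_fraction of the file size."""
    return [name for name, (size_mb, free_mb) in files.items()
            if free_mb / size_mb < min_free_fraction]


flagged = nearly_full(datafiles)
```

Under these assumptions only TSCMIX02, with 5 MB free out of 300 MB, falls below the threshold.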
Sessions
Figure imgf000089_0001
The number of connections to the database did not exceed the limit, and was not a problem.
Figure imgf000089_0002
Chart: connections (axis ticks 10 to 50) by date, 08/16 to 08/21
Redo Logs
The table below shows the database redo logs, their switch history and their status.
Group# Sequence# Bytes Members Archived Status First Time
1 1231 3145728 1 NO CURRENT 19-08-2001 17:00:04
2 1229 3145728 1 NO INACTIVE 18-08-2001 18:30:52
3 1230 3145728 1 NO INACTIVE 18-08-2001 18:44:55
Figure imgf000091_0001
Figure imgf000091_0002
The redo allocation hit ratio remained above the limit for all of the monitored period; there were no constraints in the access to the redo logs.
Redo Allocations
Figure imgf000091_0005
Figure imgf000091_0006
Figure imgf000091_0004
Figure imgf000091_0003
Figure imgf000092_0001
The redo copy hit ratio remained above the limit for all of the monitored period, so access to the redo logs was not a problem.
Redo Copies
Figure imgf000092_0005
Figure imgf000092_0002
Figure imgf000092_0004
Figure imgf000092_0003
Figure imgf000093_0001
Cache Hit Ratios Buffer Cache
The buffer cache hit ratio was high for most of the monitored period. On some occasions, however, it fell below the limit, indicating that the buffer cache may not be large enough.
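The report does not state how the buffer cache hit ratio is computed. A commonly cited formula for Oracle is 1 - physical reads / (db block gets + consistent gets); the sketch below applies it to hypothetical counter values and compares the result with an assumed limit of 0.90. The function name, the sample counters, and the limit are illustrative assumptions.

```python
def buffer_cache_hit_ratio(db_block_gets, consistent_gets, physical_reads):
    """Return 1 - physical_reads / logical_reads, or None when there were no logical reads."""
    logical_reads = db_block_gets + consistent_gets
    if logical_reads == 0:
        return None
    return 1.0 - physical_reads / logical_reads


# Hypothetical counters for one sampling interval.
ratio = buffer_cache_hit_ratio(db_block_gets=1000, consistent_gets=9000, physical_reads=500)

HIT_RATIO_LIMIT = 0.90  # assumed limit, not taken from the report
below_limit = ratio is not None and ratio < HIT_RATIO_LIMIT
```

A ratio that repeatedly falls below the limit suggests that more reads are going to disk than the cache size should allow, which is the condition the report describes.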
Figure imgf000093_0004
Figure imgf000093_0006
Figure imgf000093_0005
Figure imgf000093_0002
Figure imgf000093_0003
Cache Hit Ratios Library Cache
Figure imgf000094_0001
The Library Cache Hit ratio remained above the limit for all of the monitored period, not indicating any performance problems.
Figure imgf000094_0002
08/16 17 18 19 20 21 date
Cache Hit Ratios Dictionary Cache
Figure imgf000095_0001
The Dictionary Cache Hit ratio remained above the limit for all of the monitored period, not resulting in a database bottleneck.
Figure imgf000095_0002
08/16 17 18 19 20 21 date
Rollback segments
The following rollback segments presented constraints:
Rollback segments presenting constraints
Figure imgf000096_0001
Chart: rollback segments SYSTEM and SYSROL by date, 08/16 to 08/21
Data files I/O
Below are the total I/O rates for all the disks used by the database, compared to all the data file I/O rates.
hdisk0
Figure imgf000097_0001
Chart: Disk I/O and Oracle I/O by date, 08/16 to 08/21
Data files I/O
Below are the I/O rates for all the disks occupied by Oracle, with the percentage each of them represented in the total database I/O.
Figure imgf000098_0001
hdisk0 (100.00%)
User I/O 16/08
This section shows the users with the most I/O activity in the database, per day.
Users with the most I/O
User name System User Total
ORACLE PROC SYSTEM 442
SYSTEM SYSTEM 276
DBSNMP SYSTEM 10
User I/O 17/08
Users with the most I/O
User name System User Total
ORACLE PROC SYSTEM 4862
SYSTEM SYSTEM 3036
DBSNMP SYSTEM 110
User I/O 18/08
Users with the most I/O
User name System User Total
ORACLE PROC SYSTEM 8670
SYSTEM SYSTEM 5244
DBSNMP SYSTEM 190
User I/O 19/08
Users with the most I/O
User name System User Total
ORACLE PROC SYSTEM 9680
SYSTEM SYSTEM 5520
DBSNMP SYSTEM 200
User I/O 20/08
Users with the most I/O
User name System User Total
ORACLE PROC SYSTEM 5390
SYSTEM SYSTEM 3036
DBSNMP SYSTEM 110
User I/O 21/08
Users with the most I/O
User name System User Total
ORACLE PROC SYSTEM 1960
SYSTEM SYSTEM 1104
DBSNMP SYSTEM 40
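The per-day tables above can be combined into a ranking for the whole monitored period. The following minimal sketch uses the totals copied from the six daily tables; the dictionary layout and variable names are illustrative and are not part of the report.

```python
from collections import Counter

# Daily I/O totals per user, copied from the tables for 16/08 through 21/08.
daily_io = {
    "16/08": {"ORACLE PROC": 442, "SYSTEM": 276, "DBSNMP": 10},
    "17/08": {"ORACLE PROC": 4862, "SYSTEM": 3036, "DBSNMP": 110},
    "18/08": {"ORACLE PROC": 8670, "SYSTEM": 5244, "DBSNMP": 190},
    "19/08": {"ORACLE PROC": 9680, "SYSTEM": 5520, "DBSNMP": 200},
    "20/08": {"ORACLE PROC": 5390, "SYSTEM": 3036, "DBSNMP": 110},
    "21/08": {"ORACLE PROC": 1960, "SYSTEM": 1104, "DBSNMP": 40},
}

# Sum each user's I/O across all days.
totals = Counter()
for per_user in daily_io.values():
    totals.update(per_user)

# Users ordered from most to least I/O over the whole period.
ranking = totals.most_common()
```

Over the six days the totals are 31004 for ORACLE PROC, 18216 for SYSTEM, and 660 for DBSNMP, so the ordering matches each daily table.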
Memory Usage
Figure imgf000105_0001
Server memory consumption was high during all of the monitored period. Oracle's maximum consumption was 96.5% of it.
Figure imgf000105_0002
Oracle Memory Used Memory Total Memory
CPU Usage
Figure imgf000106_0001
CPU consumption was low throughout the monitored period. Oracle's maximum consumption was 100% of the total.
CPU Usage
Figure imgf000106_0003
Figure imgf000106_0002
08/16 17 18 19 20 21 date
CPU Oracle
Appendix C
Figure imgf000108_0001
Introduction
This performance analysis report for SQL Server instance PROLIANT is based on data collected on the host proliant from 08/10/2001 at 00:00 to 08/18/2001 at 23:00.
The data used in this report was obtained from an exclusive collector, developed specially for this end, executing on the target instance with high resolution and low intrusion. This collector obtains data directly from the SQL Server instance, without any other libraries or additional tools, with a minimum overhead on the system. The data collected is stored using a binary format, in order to provide persistence. When automatically sent, it is compressed and encrypted, to ensure fast delivery and confidentiality.
The content of this report is based on years of experience in performance analysis and capacity planning. The tool used to generate this report operates in a completely automatic way, without direct human intervention. It uses an extensible inference machine, based on heuristics and rules, and is subject to continuous improvements. Using concepts such as "watermarks" and tolerance, it is possible to determine if a computational resource usage is excessive and if the excess is relevant.
During the monitoring period, the summary configuration of the instance, which has been obtained dynamically, was:
Figure imgf000109_0001
Name : Microsoft SQL Server 2000
Version : 8.00.194 (Intel X86)
Edition : Enterprise Evaluation Edition
Summary
Figure imgf000110_0001
This report refers to the period from 08/10/2001, at 00:00, to 08/18/2001, at 23:00. The following highlights were registered:
CPU Consumption
Memory
Memory Locks
Procedure Cache
Connection Counters
Hit Rate
Logs
The procedure cache hit rate remained low for most of the monitored period.
The log usage rate remained high throughout the monitored period.
Memory
Figure imgf000111_0001
The server's memory consumption was low most of the time, so it was not a problem.
Total Memory
Figure imgf000111_0002
08/10 11 12 13 14 15 16 17 18 date
System (312.4 MB) SQL Server (132.1 MB)
Memory Memory Locks
Figure imgf000112_0001
Locked memory remained low throughout the monitored period, so this was not a problem.
Figure imgf000112_0002
08/10 11 12 13 14 15 16 17 18 date
Memory Procedure Cache
The Procedure Cache had low memory usage throughout the period, but it exceeded the threshold on occasion, and this may have caused a constraint.
Figure imgf000113_0001
Disk Occupation
The data below refers to the SQL Server data files' disk utilization at the end of the monitoring period. All disks are shown, with the free and the used percentages.
hdisk0
Figure imgf000114_0001
Used (%) (78.58%) Free (%) (21.42%)
Disk Occupation
Below is a list of all databases for SQL Server instance PROLIANT, along with the database name, size and the list of log and data files.
Database  Name  File name  Size (KB)
copiaDB2  copiaDB2_dat  C:\Program Files\Microsoft SQL Server\MSSQL\data\copiaDB2.mdf  307,200.0
copiaDB2  copiaDB2_log  C:\Program Files\Microsoft SQL Server\MSSQL\data\copiaDB2.ldf  102,400.0
insight_db_V2  insight_db_V2  C:\Program Files\Microsoft SQL Server\MSSQL\data\insight_db_V2.mdf  30,720.0
insight_db_V2  insight_db_V2Log  C:\Program Files\Microsoft SQL Server\MSSQL\data\insight_db_V2Log.ldf  51,080.0
master  master  C:\Program Files\Microsoft SQL Server\MSSQL\data\master.mdf  108,992.0
master  mastlog  C:\Program Files\Microsoft SQL Server\MSSQL\data\mastlog.ldf  33,088.0
model  modeldev  C:\Program Files\Microsoft SQL Server\MSSQL\data\model.mdf  640.0
model  modellog  C:\Program Files\Microsoft SQL Server\MSSQL\data\modellog.ldf  512.0
msdb  MSDBData  C:\Program Files\Microsoft SQL Server\MSSQL\data\msdbdata.mdf  25,280.0
msdb  MSDBLog  C:\Program Files\Microsoft SQL Server\MSSQL\data\msdblog.ldf  10,752.0
newdb2cop  newdb2cop_Data  D:\MSSQL7\Data\newdb2cop_Data.MDF  30,720.0
newdb2cop  newdb2cop_Log  D:\MSSQL7\Data\newdb2cop_Log.LDF  40,712.0
NEWDB2  NEWDB2_1_Data  C:\MSSQL7\Data\NewDB2c  256,000.0
NEWDB2  NEWDB2_1_Log  C:\MSSQL7\Data\newdb2b_log  327,688.0
NEWDB2  NEWDB2_Data  D:\MSSQL7\Data\NEWDB2_Data.MDF  409,600.0
NEWDB2  NEWDB2_Log  D:\MSSQL7\Data\NEWDB2_Log.LDF  65,736.0
Northwind  Northwind  C:\Program Files\Microsoft SQL Server\MSSQL\data\northwnd.mdf  3,008.0
Northwind  Northwind_log  C:\Program Files\Microsoft SQL Server\MSSQL\data\northwnd.ldf  2,048.0
NWCOPY  nwcopy_Data  C:\MSSQL7\data\nwcopy_Data.MDF  1,792.0
NWCOPY  nwcopy_Log  C:\MSSQL7\data\nwcopy_Log.LDF  1,024.0
NWCOPY  NWCOPY_1_Data  D:\MSSQL7\Data\nwcopy2_Data.MDF  20,480.0
pubs  pubs  C:\Program Files\Microsoft SQL Server\MSSQL\data\pubs.mdf  1,408.0
pubs  pubs_log  C:\Program Files\Microsoft SQL Server\MSSQL\data\pubs_log.ldf  768.0
tempdb  tempdev  C:\Program Files\Microsoft SQL Server\MSSQL\data\tempdb.mdf  102,400.0
tempdb  templog  C:\Program Files\Microsoft SQL Server\MSSQL\data\templog.ldf  51,200.0
Disk I/O
The table below shows the I/O data for the disks used by the SQL Server, along with the data files in each disk and their locations.
Disk | KB/sec | Data file | Location
hdiskO | 146.08 | copiaDB2_dat | C:\Program Files\Microsoft SQL Server\MSSQL\data\copiaDB2.mdf
hdiskO | 146.08 | copiaDB2_log | C:\Program Files\Microsoft SQL Server\MSSQL\data\copiaDB2.ldf
hdiskO | 146.08 | insight_db_V2 | C:\Program Files\Microsoft SQL Server\MSSQL\data\insight_db_V2.mdf
hdiskO | 146.08 | insight_db_V2Log | C:\Program Files\Microsoft SQL Server\MSSQL\data\insight_db_V2Log.ldf
hdiskO | 146.08 | master | C:\Program Files\Microsoft SQL Server\MSSQL\data\master.mdf
hdiskO | 146.08 | mastlog | C:\Program Files\Microsoft SQL Server\MSSQL\data\mastlog.ldf
hdiskO | 146.08 | modeldev | C:\Program Files\Microsoft SQL Server\MSSQL\data\model.mdf
hdiskO | 146.08 | modellog | C:\Program Files\Microsoft SQL Server\MSSQL\data\modellog.ldf
hdiskO | 146.08 | MSDBData | C:\Program Files\Microsoft SQL Server\MSSQL\data\msdbdata.mdf
hdiskO | 146.08 | MSDBLog | C:\Program Files\Microsoft SQL Server\MSSQL\data\msdblog.ldf
hdiskO | 146.08 | newdb2cop_Data | D:\MSSQL7\Data\newdb2cop_Data.MDF
hdiskO | 146.08 | newdb2cop_Log | D:\MSSQL7\Data\newdb2cop_Log.LDF
hdiskO | 146.08 | NEWDB2_1_Data | C:\MSSQL7\Data\NewDB2c
hdiskO | 146.08 | NEWDB2_1_Log | C:\MSSQL7\Data\newdb2b_log
hdiskO | 146.08 | NEWDB2_Data | D:\MSSQL7\Data\NEWDB2_Data.MDF
hdiskO | 146.08 | NEWDB2_Log | D:\MSSQL7\Data\NEWDB2_Log.LDF
Disk I/O Consolidated
Below is the list of disks that contain at least one SQL Server data file, along with how much I/O each one is responsible for, as a percentage of the total I/O.
Figure imgf000118_0001
hdiskO (100.00%)
Connection Counters
Figure imgf000119_0001
The graph below shows the number of connections made to the SQL Server during the monitored period. The limit of simultaneous connections to the database is "unlimited".
Figure imgf000119_0002
Buffer Cache hits
Figure imgf000120_0001
The buffer cache hit rate was high all the time, indicating that the server's memory is sufficient.
Figure imgf000120_0002
Log Cache hits
Figure imgf000121_0001
The log cache hit rate remained low throughout the monitored period, which probably indicates a memory shortage for the SQL Server.
copiaDB2
Figure imgf000121_0002
The log cache hit rate remained low throughout the monitored period, which probably indicates a memory shortage for the SQL Server.
insight_db_V2
Figure imgf000121_0003
Figure imgf000122_0001
The log cache hit rate remained low throughout the monitored period, which probably indicates a memory shortage for the SQL Server.
master
Figure imgf000122_0002
The log cache hit rate remained low throughout the monitored period, which probably indicates a memory shortage for the SQL Server.
model
Figure imgf000122_0003
Figure imgf000123_0001
The log cache hit rate remained low throughout the monitored period, which probably indicates a memory shortage for the SQL Server.
msdb
Figure imgf000123_0002
The log cache hit rate remained low throughout the monitored period, which probably indicates a memory shortage for the SQL Server.
newdb2cop
Figure imgf000123_0003
Figure imgf000124_0001
The log cache hit rate remained low throughout the monitored period, which probably indicates a memory shortage for the SQL Server.
NEWDB2
Figure imgf000124_0002
The log cache hit rate remained low throughout the monitored period, which probably indicates a memory shortage for the SQL Server.
Northwind
Figure imgf000124_0003
The log cache hit rate remained low throughout the monitored period, which probably indicates a memory shortage for the SQL Server.
NWCOPY
Figure imgf000125_0001
The log cache hit rate remained low throughout the monitored period, which probably indicates a memory shortage for the SQL Server.
pubs
Figure imgf000125_0002
Figure imgf000126_0001
The log cache hit rate remained low throughout the monitored period, which probably indicates a memory shortage for the SQL Server.
tempdb
Figure imgf000126_0002
Logs
Below are the database log occupation rates. For each database, there is a graph indicating the evolution of these logs during the monitored period, and another graph showing the maximum daily occupation (used versus free). Only the last 7 monitored days are shown.
The log usage remained low during the whole monitored period, not indicating any problems.
copiaDB2
Figure imgf000128_0001
copiaDB2
Figure imgf000128_0002
Figure imgf000129_0001
The log usage remained low during the whole monitored period, not indicating any problems.
insight_db_V2
Figure imgf000129_0002
insight_db_V2
Figure imgf000129_0003
The log usage remained low during the whole monitored period, not indicating any problems.
master
Figure imgf000130_0001
master
Figure imgf000130_0002
Figure imgf000130_0003
Figure imgf000131_0001
The log usage remained low during the whole monitored period, not indicating any problems.
model
Figure imgf000131_0002
model
Figure imgf000131_0003
Figure imgf000132_0001
The log usage remained low during the whole monitored period, not indicating any problems.
msdb
Figure imgf000132_0002
msdb
Figure imgf000132_0003
The log usage remained low during the whole monitored period, not indicating any problems.
newdb2cop
Figure imgf000133_0001
newdb2cop
Figure imgf000133_0002
The log usage rate remained high throughout the monitored period, indicating the possibility of a shortage in disk space.
NEWDB2
Figure imgf000134_0001
NEWDB2
Figure imgf000134_0002
Figure imgf000135_0001
The log usage remained low during the whole monitored period, not indicating any problems.
Northwind
Figure imgf000135_0002
Northwind
Figure imgf000135_0003
Figure imgf000135_0004
Figure imgf000136_0001
The log usage remained low during the whole monitored period, not indicating any problems.
NWCOPY
Figure imgf000136_0002
NWCOPY
Figure imgf000136_0003
Figure imgf000137_0001
Figure imgf000137_0002
The log usage remained low during the whole monitored period, not indicating any problems.
pubs
Figure imgf000137_0003
pubs
Figure imgf000137_0004
Figure imgf000138_0001
The log usage remained low during the whole monitored period, not indicating any problems.
tempdb
Figure imgf000138_0002
tempdb
Figure imgf000138_0003
Backup
Below is the list of system backups, their name, creation date and ending date.
Name | Creation date | Ending date
copiaDB2 | 2001-07-30 10:07:35 |
insight_db_V2 | 2001-06-07 20:12:19 | 2001-07-06 11:38:35
master | 2000-08-06 01:29:12 | 2001-07-22 02:00:34
model | 2000-08-06 01:40:52 | 2001-07-15 02:00:23
msdb | 2000-08-06 01:40:56 | 2001-07-01 02:00:38
newdb2cop | 2001-06-27 19:04:26 |
NEWDB2 | 2001-06-25 16:56:44 | 2001-08-07 18:01:05
Northwind | 2000-08-06 01:41:00 |
NWCOPY | 2001-06-29 12:36:40 | 2001-07-29 02:00:18
pubs | 2000-08-06 01:40:58 |
tempdb | 2001-08-17 15:19:13 |
Concepts
In order to understand a performance analysis report, one must review a few basic concepts. Performance measurements are arbitrary and are usually based on the perception of the end users and on the nature of the commercial application. In a typical application, end user queries in T-SQL (see Glossary) format are sent through the network to the SQL Server 2000. The SQL Server 2000 executes these queries and, if necessary, sends the results back to the client. The time from the start to the end of this procedure is called the response time of the query. Response time measurement defines the system's performance. If the response time is within a reasonable limit, performance is considered satisfactory. There are several resources and variables that may have a significant effect upon the system's performance. These resources are:
• Memory
• Locks
• Disk occupation
• Number of connections
• Hit rate
• Logs
• CPU
• Disk I/O
Each one of these resources, collectively or individually, may affect the performance of the system. Capacity planning, therefore, is an important step in the design of an application. The resources must be carefully evaluated, to make the environment as efficient as possible. Below is a description of the resources, with hints for optimizing them:
1 - Memory
Memory has different functions in a database environment. It is a fast storage area for program data and disk data caching. Memory consumption, considered not only as real memory (RAM), but also including the virtual memory subsystem, may be evaluated in paging activity, virtual memory usage and paging space. There are a few parameters that indicate the SQL Server 2000's performance, regarding memory usage. From these we may infer if the database occupies too much memory and if there is the possibility of a memory shortage. Target Memory is the total memory reserved by the SQL Server 2000, while SQL Memory indicates how much was really used. These values should be close together, unless the database is idle. Fill factor indicates the percentage of free space on each index page; too large a percentage means the index is fragmented, too small a percentage implies performance degradation. Normally, the ideal value for the Fill factor is about 50%, but if there are many page splits the Fill factor should be reduced to 30%.
2 - Locks
Locks occur when memory resources are blocked by the SQL-Server 2000 to prevent two users from accessing and altering the same table at the same time. If there are too many locks, the database's performance will deteriorate, since users will constantly have to wait for resources to become available.
3 - Disk Occupation
Indicates if the log file occupies a space proportional to the data file, and how much space it really uses. If the data file does not have the autoextent attribute, it will be necessary to reduce or increase the log file accordingly.
4 - Connection Counter
This expresses the number of connections to the SQL Server in a given period. Too many connections may cause a memory shortage, since each user consumes 40 Kbytes.
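The memory cost of connections described above can be estimated with simple arithmetic. The sketch below is only an illustration: the 40 KB per connection figure comes from the text, while the function names and the 1 MB = 1024 KB conversion are assumptions made for the example.

```python
KB_PER_CONNECTION = 40  # per-connection figure quoted in the text above


def connection_memory_kb(connections: int) -> int:
    """Estimated memory consumed by user connections, in KB."""
    if connections < 0:
        raise ValueError("connections must be non-negative")
    return connections * KB_PER_CONNECTION


def connection_memory_share(connections: int, total_memory_mb: float) -> float:
    """Fraction of total memory taken by connections (assumes 1 MB = 1024 KB)."""
    if total_memory_mb <= 0:
        raise ValueError("total_memory_mb must be positive")
    return connection_memory_kb(connections) / (total_memory_mb * 1024)
```

For example, 500 connections on a machine with 191 MB of memory would account for roughly a tenth of the total memory under these assumptions.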
5 - Hit Rate
SQL Server 2000 has a buffer cache to improve response time for data that is frequently accessed. If the hit rate of the buffer cache is high, the disks are used less, which improves the performance of the database.
6 - Logs
Log monitoring indicates how many alterations are being made in the database. For a database with a lot of recording activity, the time for an alteration to be executed may be reduced to increase security.
7 - CPU
Measuring CPU usage in the SQL Server 2000 is crucial for detecting performance problems. If total CPU usage exceeds 80% for long periods, there is a CPU bottleneck. Also, if the process queue is greater than 1 process per processor, a CPU bottleneck will be identified.
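The two CPU rules above can be sketched as a small check over sampled data. This is a hypothetical illustration, not the tool's implementation: the text does not define "long periods", so the sketch assumes a configurable number of consecutive samples (`sustained_samples`), and it assumes the process queue rule is applied to the mean queue length over the samples.

```python
from typing import Sequence


def has_cpu_bottleneck(
    cpu_usage_pct: Sequence[float],
    queue_lengths: Sequence[float],
    processors: int,
    usage_limit_pct: float = 80.0,   # limit quoted in the text
    sustained_samples: int = 3,      # assumption: what counts as a "long period"
) -> bool:
    """Return True if either CPU bottleneck rule is triggered."""
    if processors < 1:
        raise ValueError("processors must be at least 1")

    # Rule 1: total CPU usage above the limit for a sustained run of samples.
    run = 0
    for usage in cpu_usage_pct:
        run = run + 1 if usage > usage_limit_pct else 0
        if run >= sustained_samples:
            return True

    # Rule 2: more than 1 queued process per processor (mean over the samples).
    if queue_lengths:
        mean_queue = sum(queue_lengths) / len(queue_lengths)
        return mean_queue / processors > 1.0
    return False
```

Short spikes above 80% do not trigger the first rule; only an unbroken run of high samples does, which matches the "for long periods" wording.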
8 - Disk I/O
If the I/O subsystem is working efficiently, every time a server has to read or write data it will do so without waiting. But if a server has too great a workload, reading and writing will have to alternate. This can significantly reduce the performance of the server.
General Configuration Tips for the SQL Server
In this document we will present a description of several important factors in the configuration and monitoring of the database server.
- Partitions formatted in NTFS
This kind of format should only be used up to 80% capacity. Beyond this point, the I/O performance will fall drastically. NTFS partitions need space for management.
- Limit the number of network protocols in Windows
If a great number of protocols are configured in the server, the increased network traffic may harm performance. It is advisable to always use TCP/IP for communication between SQL Server and the clients.
- Sort Order
The sorting order chosen during installation of the SQL Server may affect performance. These are the possibilities:
Binary - The fastest, but may cause problems in the client applications.
Case Insensitive - The second fastest, use it if possible.
Accent insensitive, uppercase preference and Case Sensitive - The slowest
- Location of the Data Files
The best procedure is to create the data files and logs on separate disks or arrays, in order to isolate reading and writing conflicts. Use as many physical disks as are available for the creation of the data files.
- Max Async I/O
If the SQL Server has an excellent disk controller, this parameter may be increased. The default value is 32 (maximum is 255), which is sufficient in most cases. A starting rule when increasing this parameter is to multiply the number of physical drives that support simultaneous I/O by 2 or 3.
- Recovery Interval
This parameter defines the appropriate recovery interval for the SQL Server. If a server is actively being used for INSERT, UPDATE and DELETE operations, it is possible that the default value for this parameter (0) is not ideal. If the server presents periods with 100% read/write activity, this value may be increased until an optimum value is found.
- SP_TABLEOPTION PINTABLE
If one small table (or a few) is used much more than the others, this parameter may be used to keep it always in the cache after its first reading.
- Network Packet Size
If the information exchanged between the clients and the servers consists of images, or any other large pieces of data, this parameter may be increased to improve performance (default is 4096 bytes).
- Max Degree of Parallelism
If the SQL Server is used in OLTP and not OLAP applications, this parameter may be disabled to increase performance of the server. When this parameter is enabled, the SQL Server analyzes every query to verify the possibility of dividing it across more than one processor. This is unnecessary in OLTP applications, since in this case most operations are simple and do not require parallelism.
- Max Worker Threads
This parameter indicates the maximum number of threads that the server reserves for the SQL Server (sqlservr.exe). Each user connection uses one thread. If there are more connected users than available threads, the SQL Server will use thread pooling, degrading performance. This parameter (default 255) should always be slightly greater than the Max user connections parameter.
- TempDB
If the database is being used a lot by the applications, its physical location should be apart from other data files.
- Comments about Data Files and File groups
1. A data file or a file group may not be used in more than one database.
2. A data file may only belong to one file group
3. Data and log may not belong in the same data file.
4. File groups are mechanisms used to associate objects to specific files.
5. Tables may only belong to one file group.
6. Several data files may be created in different disks and associated to a single file group.
7. File groups have a proportional growth strategy- the free space in the data files will always be proportional. For example: if a data file has 100 MB and another 200 MB, for each byte recorded in the first data file, two will be recorded in the second.
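The proportional growth strategy in item 7 can be illustrated with a short sketch. It assumes that each write is split across the files of a file group in proportion to their free space; the function and file names are hypothetical and chosen only for the example.

```python
from typing import Dict


def proportional_fill(free_space_mb: Dict[str, float], write_mb: float) -> Dict[str, float]:
    """Split a write across data files in proportion to each file's free space."""
    total_free = sum(free_space_mb.values())
    if total_free <= 0:
        raise ValueError("no free space in the file group")
    if write_mb > total_free:
        raise ValueError("write does not fit in the file group")
    return {
        name: write_mb * free / total_free
        for name, free in free_space_mb.items()
    }
```

With one file holding 100 MB of free space and another 200 MB, a 30 MB write is split into 10 MB and 20 MB, which reproduces the one-to-two ratio of the example in item 7.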
- Monitoring of the Data Files
Normally, the more data files, the better the parallelism will be, but for some data files the saturation point may have been reached. To evaluate how often the data files are used, we must use the Windows NT Performance Monitor, checking the Physical Disk and Disk Queue Length counters.
- Extended Memory Size
This option is available in the SQL Server for future versions of the Windows NT on alpha platforms. It indicates the number of megabytes that the SQL Server will use as cache in the memory, above 2GB.
- Glossary
Buffer cache - A definite amount of physical memory that is reserved by SQL-Server for data that is used more frequently. This reduces disk usage and improves general performance.
Disk I/O - This process strongly affects performance, since read/write operations are usually much slower than physical memory operations.
Index page - Page used by SQL-Server to ease the search for information.
Log file - File where all database alterations are registered. It is possible to restore information if there are problems.
Paging - Memory is organized in pages. The operating system can transfer these pages from physical memory to disk and vice-versa. This process is called paging.
Process queue - This is the queue of processes "waiting" to be processed by the CPU.
T-SQL - Transact-SQL is SQL-Server's programming language.
Virtual memory - The operating system manages the total available memory, composed of RAM and disk, creating a single "virtual" memory block.
Figure imgf000145_0001
Network Usage
Memory Recommendations
Disk Space Recommendation
Disk I/O Recommendation
Concepts
Tool Description
Figure imgf000145_0002
Introduction
Based on the data collected in machine proliant, from 08/09/2001, at 00:00, to 09/01/2001, at 17:00, this capacity planning report was produced.
The data used in this report was obtained from an exclusive collector, with high resolution and low intrusion, developed especially for this end. This collector obtained data directly from the core of the operating system, with no need for libraries or additional utilities, and minimum impact on the environment. The collected data is stored in binary format, to provide persistence. The data is automatically sent, compressed and encrypted to ensure fast transit and confidentiality.
The content of this report is the result of years of experience in performance analysis and capacity planning. The tool used to generate this document operates in a totally automatic manner, without direct human intervention. It uses an extensible inference machine, based on heuristics and rules which are continuously improved. Through the use of concepts such as "watermarks" and regression, it is possible to determine when a computational resource will reach its saturation point.
Profile description:
ML330
We have assumed that the workload is CPU-bound, therefore the upgrades will be calculated based on SPECint.
During the monitored period, this was the summary configuration of the target machine:
Figure imgf000146_0001
OS MS Windows 2000 Advanced
Version 5.0.2195 (sp 2.0) Service Pack 2
Host Proliant
IP address 192.168.1.18
Processors 1 Pentium III (Coppermine)
Speed 728 MHz
Memory 191 MB
Summary proliant
The last boot in machine proliant was on 08/30/2001, at 19:03.
This report refers to the monitoring which took place between 08/09/2001, at 00:00, and 09/01/2001, at 17:00. The future horizon considered was 180 days. See below this period's highlights:
The CPU had a usage reduction. That is why no projection was made.
Memory is highly saturated, exceeding the limit of 90%, peaking at 228% with a growth of 20.5% per month. If the amount of memory is increased by 291%, maximum usage should remain below the threshold of 90% for the considered horizon.
The analyzed disks presented a good occupation, with the average usage not reaching the saturation level of 70%, growing 5.9% per month. The usage peaked at 60.5%. It is estimated that the saturation level will be reached in 01/2002. If the disk space is increased by 4.2%, maximum usage should remain below the threshold of 70% for the considered horizon.
The analyzed disks presented a good I/O performance (time active), with the average usage not reaching the saturation limit of 40%, growing 69.9% per month. The total usage peaked at 32%. It is estimated that the saturation limit will be reached in 11/2001. If the number of disks is increased by 77.9%, maximum usage should remain below the threshold of 40% for the considered horizon.
Network usage was satisfactory, with the average usage not reaching the saturation limit of 70%. Usage peaked at 0.2% and grew 102.3% per month.
For this environment to operate satisfactorily in a future horizon of 180 days, it is necessary to add 832 MB of memory, add at least 0.4 GB of disk space and add 1 disk, spreading the load over the disks.
CPU Usage
a usage reduction. Because of that, there is no projected growth. The as 6.9% (SPECint_rate 22.11). The reliability of the linear regression is
Figure imgf000148_0001
Figure imgf000148_0002
Memory
During the analyzed period, the memory was highly saturated, exceeding the limit of 90%, with a growth of 20.5% per month, peaking at 228%. The future horizon considered is 180 days. If the memory is increased by 291%, the usage will remain below the threshold of 90% for the considered horizon. The reliability of the linear regression is 85.9%.
Figure imgf000149_0001
The lines below represent the total memory consumption and its growth.
Present Growth
Figure imgf000149_0002
Projected Growth
Figure imgf000149_0003
Disk Occupation
During the analyzed period, the analyzed disks presented an average (space) occupation below the saturation threshold of 70%, growing 5.9% per month. The peak was 60.5%. For a future horizon of 180 days, it is estimated that this saturation threshold will be reached in 01/2002. If the disk space is increased by 4.2%, maximum usage should remain below the threshold of 70% for the considered horizon. The reliability of the linear regression is 98.6%.
All disks are considered here as a single storage device.
Present Growth
Figure imgf000150_0001
Projected Growth
Figure imgf000150_0002
Disk I/O
During the analyzed period, the analyzed disks presented a good I/O performance (time active). The average usage did not reach the saturation limit of 40%, growing 69.9% per month. The usage peaked at 32%. In a future horizon of 180 days, it is estimated that the saturation limit will be reached in 11/2001. If the number of disks is increased by 77.9%, maximum usage should remain below the threshold of 40% for the considered horizon. The reliability of the linear regression is 96.4%.
Figure imgf000151_0001
The line below represents the total disk time active. Here, all disks are considered as one.
Present Growth
Figure imgf000151_0002
Projected Growth
Figure imgf000151_0003
Figure imgf000152_0001
Network Usage
Network usage was satisfactory, with the average usage not reaching the saturation limit of 70%, growing 102.3% per month and peaking at 0.2%. The future horizon considered is 180 days. The reliability of the linear regression is of 100%.
Here the total network bandwidth is considered, and the total consumption, aggregating all network adaptors.
Present Growth
Figure imgf000152_0002
Projected Growth
Figure imgf000152_0003
Memory Recommendations
For this environment to operate satisfactorily in a future horizon of 180 days, it is necessary to add 832 MB of memory, as shown in the graph below.
Figure imgf000153_0001
Disk Space Recommendation
For this environment to operate satisfactorily in a future horizon of 180 days, it is necessary to add at least 0.4 GB of disk space, as shown below.
The lines represent, respectively, the present configuration (orange) and the recommended configuration (brown) for the machine to operate within the usage limit.
Figure imgf000154_0001
Figure imgf000154_0002
Disk I/O Recommendation
For this environment to operate satisfactorily in a future horizon of 180 days, it is necessary to add 1 disk, as shown in the graph below.
Figure imgf000155_0001
Concepts
To understand a capacity planning report it is necessary to understand a few basic concepts. The idea is not to present a treatise, but to explain some fundamental aspects of capacity planning.
A computer installation may be represented by two compound systems, the user community and the computer system. The computer system is a hardware, software and transmission line complex destined to fulfill the processing and information needs of the users. These needs are communicated to the computer system through programs, data and commands produced by the users. This collection of programs, data and commands is called workload.
The computer system has a limited performance that may be quantified with measures such as: usage rate, response time, processing rate, availability index, etc. The performance of a computer system depends upon the interaction between the workload and the resources of the system. A new concept arises, the system's Capacity. This concept may be defined as the workload that a given computer system can process without exceeding the performance limits imposed by the installation.
When the workload surpasses the established limits, the system is said to have "exceeded the saturation limit", or the system is "saturated". Usually, from this moment on the response times to user requests become too slow or present an erratic behaviour.
The basic purpose of Capacity Planning is to provide, in due time, the necessary reports for rendering efficient Information Technology services.
There are different capacity planning techniques, depending basically on three factors: complexity, precision and cost. These techniques are the following, in order of growing complexity, precision and cost:
-empirical rules
-linear analysis
-analytic models
-simulation models
-"benchmarks"
Empirical rules are based on experience, knowledge, practice and feeling. This approach is cheap but very unreliable.
Linear Analysis, such as "Capacity Wizard", is based on the performance analysis for the current workload and the linear projection of the future behavior of this system from the present performance.
This method presents an excellent cost/benefit ratio, especially when applied on a large scale in an automatic manner. One can have almost immediate capacity evaluations for hundreds of servers and workstations, with minimal human intervention. These evaluations will permit rapid visualization of future bottlenecks in the system, permitting preventive measures, not mere reactions.
An additional advantage of Capacity Wizard is providing a constant and continued evaluation, before and after configuration changes. This means that, after an upgrade or improvement in the system, a new report will indicate if the results of the change were as good as predicted in the previous report (due to the great quantity of variables and unexpected occurrences involved, Capacity Planning is not an exact science).
The analytic model technique is based on a set of mathematical equations that represent the structure and functioning of a computer system during a given interval of time. Operational analysis and stochastic modeling are the techniques used.
Simulation is a numerical method describing the dynamic behavior of a system through time. These models consume a lot of CPU resources until they reach the intended results.
Benchmarking consists in the selection of a group of applications that represent as closely as possible the total workload involved. This group will be processed in a system as similar as possible to the system in question. This technique has many complications, such as the true representation of the programs and data masses and of the equipment used.
Tool Description
This tool makes the following analyses and projections:
-CPU
-Memory
-Disks- occupation
-Disks- usage
-Network Cards
The agent, installed in the analyzed system, collects data globally and per process for each of these resources.
The tool searches, in the collector agent database, the data used in the analysis. These data are validated and rearranged in the best manner.
The hourly consumptions, for each resource, are measured. The greatest recorded consumption is chosen as representative for that day. If the sample is long enough- default period is 3 months- the greatest consumption of the week will represent weekly consumption.
After these consumptions are defined, the tool will execute their linear regression, defining a line segment that approximates the defined daily or weekly consumption.
Once this segment is defined, one can extend it into the future and estimate when saturation of the resource will occur (the point where this line crosses the saturation level).
It may be possible to find that the resource is already saturated.
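The procedure described above (daily peak of the hourly consumption, linear regression, extension of the line to the saturation level) can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool's actual code: the function names and the data layout (pairs of day index and usage percentage) are hypothetical, an ordinary least-squares fit is assumed, and "already saturated" is judged on the fitted line at the last measured day.

```python
from typing import Iterable, List, Optional, Tuple

Point = Tuple[float, float]  # (day index, usage in percent)


def daily_peaks(hourly: Iterable[Point]) -> List[Point]:
    """Keep the greatest hourly consumption of each day as that day's value."""
    peaks = {}
    for day, usage in hourly:
        peaks[day] = max(peaks.get(day, usage), usage)
    return sorted(peaks.items())


def fit_line(points: List[Point]) -> Tuple[float, float]:
    """Ordinary least-squares line through the points: (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx  # requires at least two distinct days
    return slope, mean_y - slope * mean_x


def days_until_saturation(points: List[Point], saturation_pct: float) -> Optional[float]:
    """Days after the last measurement until the fitted line crosses saturation.

    Returns 0.0 if the fitted line is already at or above the saturation level,
    and None if usage is flat or falling (no projection is possible).
    """
    slope, intercept = fit_line(points)
    last_day = points[-1][0]
    if slope * last_day + intercept >= saturation_pct:
        return 0.0
    if slope <= 0:
        return None
    crossing_day = (saturation_pct - intercept) / slope
    return crossing_day - last_day
```

Returning `None` for a falling trend mirrors the case in which no projection is made because usage decreased during the monitored period.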
The tool accepts several definitions for saturation level. If the customer does not want to define them, the tool will use the manufacturer's measurements for CPU and memory, and 50% capacity for disks and network cards.
If the user wants to define his own limits, he may do so. In doing so, he may know in advance what would happen if he upgraded his equipment.
Besides reporting the moment of saturation, the tool reports the alteration necessary to prevent it.
Example: How much must we upgrade the present CPU so that it does not become saturated for the next 18 months?
Again, there is a default option (1 year), for all the resources.
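One way to answer the kind of question in the example above is to project usage to the end of the chosen interval and compare it with the saturation level. The sketch below is an assumption about how such a figure could be derived, not the tool's documented calculation: it assumes usage grows linearly (in percentage points of the current capacity per day) and that adding capacity lowers the usage percentage proportionally. All names are hypothetical.

```python
def required_upgrade_pct(
    current_usage_pct: float,
    growth_pct_points_per_day: float,
    saturation_pct: float,
    horizon_days: int = 365,  # default interval of 1 year, as stated in the text
) -> float:
    """Capacity increase (in percent) that keeps usage below saturation at the horizon."""
    if saturation_pct <= 0:
        raise ValueError("saturation_pct must be positive")
    # Linear projection of usage, expressed relative to the current capacity.
    projected_pct = current_usage_pct + growth_pct_points_per_day * horizon_days
    # Capacity must grow by this factor for projected usage to equal the threshold.
    factor = projected_pct / saturation_pct
    return max(0.0, (factor - 1.0) * 100.0)
```

For an 18-month question such as the one in the example, `horizon_days` would be set to roughly 548 instead of the default.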
The tool provides different types of graphs and charts, showing which applications consumed the most resources.
The following options influence the calculation of future projections and increments (equipment upgrades or alterations):
For CPU and Memory:
1) Increment based on the present equipment (default). Note: if the equipment can no longer be upgraded, the user must choose the following option.
2) User-determined increment (the user determines how many times he wants performance improved: 2x, 1.5x, etc.). In this case, the tool will not have to search any hardware databases.
For disks and network cards:
1) Pre-determined default increment. This increment assumes that an identical piece of equipment was placed alongside the original one and the consumption was divided in half.
2) User-determined increments. (Note: the user may also define negative increments, that is, a downgrade of the equipment.)
Interval of time before saturation:
1) Pre-determined (default). This assumes that the resource will not become saturated before one year starting on the last measurement. If the resource does become saturated before one year, the report informs the necessary upgrade to avoid this.
2) User-determined. The user determines how many days must pass before the resource becomes saturated.
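How the horizon and the increment interact can be illustrated with a short sketch. It assumes, purely for illustration, that utilization is expressed as a percentage of present capacity and that it falls in inverse proportion to added capacity; the function names and the numeric values are hypothetical and are not taken from the tool.

```python
def projected_utilization(slope, intercept, last_day, horizon_days):
    """Utilization (%) the fitted line predicts horizon_days after the last sample."""
    return slope * (last_day + horizon_days) + intercept


def required_capacity_factor(projected, saturation_level):
    """Factor by which capacity must be multiplied to stay at the saturation level.

    Assumes utilization is inversely proportional to capacity. A result
    below 1.0 means the present equipment could even be downgraded.
    """
    return projected / saturation_level


# Hypothetical fit: 40% on day 0, growing 0.25 points per day, 90 days of samples.
projected = projected_utilization(0.25, 40.0, last_day=90, horizon_days=365)
print(projected)                                  # 153.75
print(required_capacity_factor(projected, 75.0))  # 2.05
# Default disk increment: an identical unit alongside halves the consumption.
print(projected / 2)                              # 76.875
```

In this made-up case the default disk increment (doubling) would still leave the projected utilization slightly above a 75% limit at the one-year horizon, so a larger user-determined increment would be needed.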
Appendix D
Figure imgf000161_0001
Introduction
This performance analysis report was prepared from data collected on the host proliant from 08/03/2001, at 00:00, to 08/10/2001, at 23:00.
The data used in this report was obtained from an exclusive collector, developed specifically for this purpose, executing on the target machine with high resolution and low intrusion. This collector obtains data directly from the operating system, without any other libraries or additional tools, with a minimum overhead on the system. The data collected is stored using a binary format, in order to provide persistence. When automatically sent, it is compressed and encrypted, to ensure fast delivery and confidentiality.
The content of this report is based on years of experience in performance analysis and capacity planning. The tool used to generate this report operates in a completely automatic way, without direct human intervention. It uses an extensible inference machine, based on heuristics and rules, and is subject to continuous improvements. Using concepts such as "watermarks" and tolerance, it is possible to determine if a computational resource usage is excessive and if the excess is relevant.
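One possible reading of "watermarks" and tolerance is sketched below: the watermark is the usage level regarded as excessive, and the tolerance is the share of the monitoring period that may exceed it before the excess is reported as relevant. This interpretation, the function name and the sample values are assumptions made for illustration; the report does not define the rule precisely.

```python
def is_relevant_excess(samples, watermark, tolerance):
    """True if the fraction of samples above watermark exceeds tolerance."""
    above = sum(1 for value in samples if value > watermark)
    return above / len(samples) > tolerance


# Hypothetical hourly CPU usage samples (%).
cpu = [80.0, 90.0, 60.0, 99.4, 85.0, 70.0, 95.0, 88.0]
print(is_relevant_excess(cpu, watermark=75.0, tolerance=0.5))  # True
print(is_relevant_excess(cpu, watermark=75.0, tolerance=0.8))  # False
```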
During the monitoring period, the summary configuration of the machine, which has been obtained dynamically, was:
Figure imgf000162_0001
OS MS Windows 2000 Advanced
Version 5.0.2195 (sp 2.0) Service Pack 2
Host proliant
IP address 192.168.1.18
Processors 1 Pentium III (Coppermine)
Speed 728 MHz
Memory 191 MB
Summary proliant
The last boot of the host proliant took place on 08/07/2001, at 10:32.
This report is based on monitoring which occurred between 08/03/2001, at 00:00, and 08/10/2001, at 23:00. The following was worth highlighting within this period:
For most of the time, CPU usage remained above 75%, reaching the maximum of 99.4%, and causing a bottleneck in the system.
A bottleneck was caused by the runnable process queue, because it exceeded the number of active processors for most of the time.
The average paging rate was high all the time, reaching 185.1 pps and causing a strong memory constraint.
During the monitoring period all the real memory was used for processes and file caching.
The high amount of virtual memory in use, during all the monitoring period, indicates a need for more real memory.
The network bandwidth was always sufficient, not indicating any constraints.
For over 25% of the monitoring period, disk hdisk0 exceeded the usage limit.
The paging space usage exceeded the level of 70% for more than 50% of the time, reaching a maximum of 93.5%, thus resulting in a bottleneck in the system.
Availability
During this monitoring period the machine maintained the following availability rate.
Figure imgf000164_0001
CPU Usage
Figure imgf000165_0001
CPU usage exceeded the level of 75% for most of the time, as shown in the graph below, thus resulting in a bottleneck in the system.
Figure imgf000165_0002
CPU Usage Zoom
On 08/07, at 12:00, the CPU reached its maximum recorded usage. The graph below refers to this day.
Figure imgf000166_0001
CPU Usage Zoom
These 10 processes were the ones which used the CPU the most on 08/07, at 12:00, when the highest level of usage was recorded.
ABSOLUTE
PROCESS PID THREADS USAGE
sqlservr 944 36 9.8%
CSRSS 232 15 7.0%
CMD 1568 1 1.6%
CMD 2584 1 1.4%
CMD 780 1 1.4%
CMD 1780 1 1.3%
LSASS 292 20 1.3%
CMD 2744 1 1.3%
explorer 2044 11 0.9%
Idle 0 1 0.6%
GROUP
PROCESS THREADS USAGE
sqlservr 67 9.9%
CMD 7 7.1%
CSRSS 15 7.0%
LSASS 20 1.3%
explorer 11 0.9%
Idle 1 0.6%
cqmghost 9 0.4%
osql 28 0.3%
jre 137 0.3%
System 46 0.3%
Process Queue
Figure imgf000168_0001
For most of the monitoring period the runnable process queue exceeded the level of 1, which resulted in a bottleneck, as the graph below demonstrates.
The greatest number of processes running was 64. This happened on 08/03, at 00:00.
The maximum number of running threads, 714, occurred on 08/04, at 23:00.
Figure imgf000168_0002
Process Queue Zoom
The runnable process queue reached its highest point on 08/05, at 16:00.
At this exact moment, 54 processes and 681 threads were simultaneously running.
Figure imgf000169_0001
Figure imgf000170_0001
CPU Usage Zoom
These 10 processes were the ones which most consumed the CPU on 08/05, at 16:00, when the runnable processes queue reached its highest number.
ABSOLUTE
PROCESS PID THREADS USAGE
sqlservr 2040 44 9.9%
CSRSS 232 20 7.9%
Idle 0 1 2.6%
jre 784 150 2.2%
System 8 41 1.9%
CMD 1720 1 1.6%
CMD 3232 1 1.4%
CMD 2812 1 1.4%
CMD 1140 1 1.4%
CMD 1476 1 1.4%
GROUP
PROCESS THREADS USAGE
sqlservr 77 10.0%
CSRSS 20 7.9%
CMD 7 7.3%
Idle 1 2.6%
jre 150 2.2%
System 47 1.9%
LSASS 28 1.2%
cqmghost 9 0.3%
osql 16 0.3%
explorer 9 0.1%
Memory
The average paging rate was high all the time, indicating strong memory constraint.
Figure imgf000171_0001
Figure imgf000171_0002
The virtual memory in use was high during all the recorded period.
Figure imgf000171_0003
All the real memory was used in processes and file caching during the monitoring period.
Memory Zoom
The graph below represents the date 08/03. On this day, at 20:00, memory reached the highest level of usage.
Figure imgf000172_0001
Figure imgf000172_0002
Memory Zoom
On 08/03, at 20:00, the highest level of paging occurred. These are the 10 processes that consumed most of the memory at this moment. Usage is shown in KB.
PROCESS PID THREADS USAGE
jre 784 151 430,731
sqlservr 2040 37 43,836
sqlservr 828 32 13,240
inetinfo 1588 27 6,100
DLLHOST 2436 26 2,900
explorer 2656 11 2,929
surveyor 1044 7 2,680
SERVICES 280 34 2,897
Win32s! 1188 15 1,220
LSASS 292 18 2,626
Network I/O Compaq Ethernet/FastEthernet
Figure imgf000174_0001
There was no interface overload. Both transmission and reception rates remained below 70%.
There were no errors in transmission and/or reception.
Figure imgf000174_0002
Network I/O MS TCP Loopback
Figure imgf000175_0001
There was no interface overload. Both transmission and reception rates remained below 70%.
There were no errors in transmission and/or reception.
Figure imgf000175_0002
Disk I/O
Figure imgf000176_0001
The disk hdisk0 was analyzed for the following requisites: transaction rate, transfers per second and usage.
For over 25% of the monitoring period, disk hdisk0 exceeded the usage limit.
The graphs relating to disk hdisk0 are shown on the following pages.
Disk I/O
hdisk0
Figure imgf000177_0001
hdisk0
Figure imgf000177_0002
hdisk0
Figure imgf000177_0003
Disk I/O Zoom
On 08/03, at 18:00, disk I/O showed the highest activity of the whole monitoring period. The graph below represents the status on this day.
Figure imgf000178_0001
Figure imgf000178_0002
Disk I/O Zoom
The graph below shows the disk hdisk0 Read/Write ratio during the highest I/O activity period.
Figure imgf000179_0001
Paging Space
Figure imgf000180_0001
The paging space usage, 294,912 KB, exceeded the level of 70%, as shown in the graph below, reaching a maximum of 93.5%, thus resulting in a bottleneck of the system.
The occupation rate trend for paging space was not analyzed due to the short sample period.
Figure imgf000180_0002
File System
Status of the file system at the end of the monitoring period:
MountPoint PhysDrv Type Total (KB) Free (KB) %Used
D:\ NTFS 4,707,012 3,241,196 31
C:\ NTFS 4,136,737 886,382 78
Disk Occupation
This was the situation of the disk at the end of the monitored period:
Disk_Number Signature Size (MB) Free (MB) %Used
0 286383549 8,675 0 100
03/AUG CPU
Processes that used the most CPU during the monitoring period, starting from 08/03.
ABSOLUTE
PROCESS PID THREADS USAGE
Idle 0 1 22.9%
CSRSS 232 20 7.4%
sqlservr 856 43 6.8%
jre 784 165 2.3%
sqlservr 2040 34 1.1%
CMD 2856 1 1.0%
CMD 764 1 1.0%
System 8 46 1.0%
CMD 2116 1 1.0%
CMD 2880 1 1.0%
GROUP
PROCESS THREADS USAGE
Idle 1 22.9%
sqlservr 75 8.0%
CSRSS 20 7.4%
CMD 8 5.8%
jre 165 2.3%
System 46 1.5%
LSASS 21 1.0%
cqmghost 9 0.4%
osql 25 0.3%
explorer 11 < 0.1%
04/AUG CPU
ABSOLUTE
PROCESS PID THREADS USAGE
sqlservr 2040 43 9.7%
CSRSS 232 20 7.4%
Idle 0 1 6.5%
jre 784 176 2.3%
CMD 1720 1 1.6%
CMD 2812 1 1.4%
CMD 3232 1 1.4%
CMD 1140 1 1.4%
CMD 1476 1 1.4%
System 8 46 1.3%
GROUP
PROCESS THREADS USAGE
sqlservr 75 9.8%
CSRSS 20 7.4%
CMD 7 7.4%
Idle 1 6.5%
jre 176 2.3%
System 46 1.6%
LSASS 28 1.3%
cqmghost 9 0.4%
osql 41 0.4%
explorer 9 < 0.1%
Top 10
05/AUG CPU
ABSOLUTE
PROCESS PID THREADS USAGE
sqlservr 2040 43 9.7%
CSRSS 232 20 7.5%
Idle 0 1 6.0%
jre 784 174 2.3%
System 8 46 2.0%
CMD 1720 1 1.6%
CMD 2812 1 1.4%
CMD 3232 1 1.4%
CMD 1476 1 1.4%
CMD 1140 1 1.4%
GROUP
PROCESS THREADS USAGE
sqlservr 76 9.8%
CSRSS 20 7.5%
CMD 7 7.4%
Idle 1 6.0%
jre 174 2.3%
System 46 2.0%
LSASS 28 1.2%
cqmghost 9 0.4%
osql 73 0.4%
explorer 9 < 0.1%
06/AUG CPU
ABSOLUTE
PROCESS PID THREADS USAGE
Idle 0 1 19.0%
Figure imgf000186_0001
CMD 2812 1 1.2%
CMD 3232 1 1.2%
CMD 1140 1 1.1%
CMD 1476 1 1.1%
GROUP
PROCESS THREADS USAGE
Idle 1 19.1%
sqlservr 77 8.7%
CSRSS 20 6.4%
CMD 7 6.3%
jre 171 2.0%
System 46 1.7%
LSASS 28 1.1%
cqmghost 9 0.4%
osql 367 0.3%
explorer 9 < 0.1%
07/AUG CPU
ABSOLUTE
PROCESS PID THREADS USAGE
Idle 0 1 43.2%
sqlservr 944 35 6.5%
CSRSS 232 11 4.4%
CMD 1568 1 1.0%
CMD 2584 1 0.8%
LSASS 292 20 0.8%
CMD 780 1 0.8%
CMD 2744 1 0.8%
CMD 1780 1 0.8%
sqlservr 856 35 0.4%
GROUP
PROCESS THREADS USAGE
Idle 1 43.2%
sqlservr 66 7.1%
CMD 7 4.4%
CSRSS 11 4.4%
LSASS 20 0.8%
cqmghost 9 0.4%
explorer 11 0.3%
jre 135 0.3%
osql 28 0.2%
System 46 0.1%
08/AUG CPU
ABSOLUTE
PROCESS PID THREADS USAGE
Idle 0 1 95.6%
sqlservr 944 39 1.7%
explorer 2044 9 0.5%
sqlservr 828 33 0.5%
cqmghost 1740 9 0.3%
taskmgr 2056 3 0.3%
jre 796 136 0.3%
WinVNC 1480 5 0.2%
LSASS 292 18 0.2%
DLLHOST 2512 25 < 0.1%
GROUP
PROCESS THREADS USAGE
Idle 1 95.6%
sqlservr 72 2.2%
explorer 9 0.5%
cqmghost 9 0.3%
taskmgr 3 0.3%
jre 136 0.3%
WinVNC 5 0.2%
LSASS 18 0.2%
DLLHOST 35 < 0.1%
inetinfo 29 < 0.1%
09/AUG CPU
ABSOLUTE
PROCESS PID THREADS USAGE
Idle 0 1 96.6%
sqlservr 944 39 1.7%
sqlservr 828 33 0.5%
cqmghost 1740 9 0.3%
jre 796 135 0.3%
LSASS 292 18 0.2%
WinVNC 1480 5 < 0.1%
DLLHOST 2512 25 < 0.1%
inetinfo 1632 28 < 0.1%
explorer 2044 9 < 0.1%
GROUP
PROCESS THREADS USAGE
Idle 1 96.6%
sqlservr 72 2.2%
cqmghost 9 0.3%
jre 135 0.3%
LSASS 18 0.2%
WinVNC 5 < 0.1%
DLLHOST 35 < 0.1%
inetinfo 28 < 0.1%
explorer 9 < 0.1%
aengine 6 < 0.1%
10/AUG CPU
ABSOLUTE
PROCESS PID THREADS USAGE
Idle 0 1 96.5%
sqlservr 944 39 1.7%
sqlservr 828 33 0.5%
cqmghost 1740 9 0.3%
jre 796 135 0.3%
WinVNC 1480 5 0.2%
LSASS 292 18 0.1%
DLLHOST 2512 25 < 0.1%
inetinfo 1632 29 < 0.1%
System 8 46 < 0.1%
GROUP
PROCESS THREADS USAGE
Idle 1 96.5%
sqlservr 72 2.2%
cqmghost 9 0.3%
jre 135 0.3%
WinVNC 5 0.2%
LSASS 18 0.1%
DLLHOST 35 < 0.1%
inetinfo 29 < 0.1%
aengine 6 < 0.1%
System 46 < 0.1%
03/AUG Memory
Processes that used the most memory during the monitoring period, starting from 08/03. The usage is shown in KB.
ABSOLUTE
PROCESS PID THREADS USAGE
jre 784 165 433,005
sqlservr 856 43 48,601
sqlservr 2040 34 45,324
sqlservr 828 32 13,240
inetinfo 1588 27 6,100
WINLOGON 252 15 5,480
mmc 2896 3 4,750
aengine 652 6 3,894
aengine 836 6 3,774
explorer 2656 11 2,926
04/AUG Memory
ABSOLUTE
PROCESS PID THREADS USAGE
jre 784 176 472,577
sqlservr 2040 43 47,673
sqlservr 828 32 13,247
WINLOGON 252 15 5,480
aengine 836 6 3,758
LSASS 292 28 2,970
SERVICES 280 34 2,909
cpqwmgmt 1864 5 2,568
snmp 960 10 2,412
Webdmi 1132 8 2,332
05/AUG Memory
ABSOLUTE
PROCESS PID THREADS USAGE
jre 784 174 473,016
sqlservr 2040 43 46,765
sqlservr 828 33 13,323
WINLOGON 252 15 5,480
aengine 836 6 3,765
LSASS 292 28 2,966
SERVICES 280 35 2,908
surveyor 1044 7 2,680
cpqwmgmt 1864 5 2,568
arelay 648 6 2,444
06/AUG Memory
ABSOLUTE
PROCESS PID THREADS USAGE
jre 784 171 471,122
sqlservr 856 32 107,719
sqlservr 2040 44 47,548
jre 792 136 42,706
IEXPLORE 2696 17 17,022
sqlservr 824 31 13,327
sqlservr 828 33 13,325
inetinfo 1584 27 5,892
WINLOGON 252 15 5,494
aengine 836 6 3,820
07/AUG Memory
ABSOLUTE
PROCESS PID THREADS USAGE
sqlservr 856 35 124,860
sqlservr 944 35 116,700
jre 796 142 44,393
jre 792 135 43,765
sqlservr 824 31 13,360
sqlservr 828 31 13,339
IEXPLORE 2804 13 8,671
inetinfo 1584 27 6,072
inetinfo 1632 27 5,961
WINLOGON 252 15 5,480
08/AUG Memory
ABSOLUTE
PROCESS PID THREADS USAGE
Figure imgf000196_0001
sqlservr 828 33 13,330
inetinfo 1632 29 6,131
WINLOGON 252 15 5,480
explorer 2044 9 4,594
mmc 2972 7 3,932
SERVICES 280 36 3,031
surveyor 1088 7 2,853
DLLHOST 2512 25 2,788
09/AUG Memory
PROCESS PID THREADS USAGE
sqlservr 944 39 120,017
jre 796 135 43,987
sqlservr 828 33 13,333
inetinfo 1632 28 6,124
WINLOGON 252 15 5,480
explorer 2044 9 4,848
mmc 2972 7 3,932
SERVICES 280 37 3,094
surveyor 1088 7 2,875
DLLHOST 2512 25 2,843
10/AUG Memory
ABSOLUTE
PROCESS PID THREADS USAGE
sqlservr 944 39 119,802
jre 796 135 44,126
sqlservr 828 33 13,217
inetinfo 1632 29 6,109
WINLOGON 252 15 5,493
explorer 2044 9 4,839
mmc 2972 7 3,932
aengine 2048 6 3,804
SERVICES 280 37 3,078
DLLHOST 2512 25 2,884
Concepts
In order to understand a performance analysis report, it may be convenient to review some basic concepts. The idea here is not to write a treatise on the subject, but to go through some fundamental aspects related to performance.
System performance means different things to different people. This can range from resource consumption to amount of work performed per unit of time. It will be assumed here that improving performance means improving response time of end users and/or increasing throughput of both end user work and batch work.
The performance of any system depends on how tied up key resources are. This is because system performance is, essentially, a function of the time each key resource takes to service a request, plus the time a request has spent queued waiting to be serviced (more details on queues ahead). In the case of an information-processing environment based on computers, the key resources are CPU, memory, disk I/O and network I/O.
In order to evaluate resource consumption, criteria must be established. These criteria consist of judging which system performance variables best express this consumption, since many are available. In addition, the watermarks (the point where a resource starts to be considered overcommitted, also known as thresholds) for these variables need to be defined. These watermarks are approximate and can vary depending on the characteristics of the system being analyzed.
Description of key resources
1 - CPU
CPUs can play a significant role in the response time of computational environments, especially when other resources are abundant. This is particularly true in environments where most of the data required is available in memory. For CPU, the key variables to evaluate resource consumption are run-queue and CPU usage.
1.1 - Run-queue
Run-queue means the number of processes (threads, in fact) that are runnable (ready to execute), either queued waiting for a CPU or already executing. It is a measure of how heavily the CPU is used in an environment comprised of many processes (a commercial transaction environment, e.g.). The typical watermarks for run-queue range between the number of processors available and five times this value. The exact value depends on the response time required for a transaction, versus the amount of CPU required by this transaction.
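The watermark range for run-queue can be expressed as a small check. This is only a sketch: the function name and the three labels are hypothetical, and where exactly the watermark sits inside the range depends on the response-time requirements mentioned above.

```python
def classify_run_queue(run_queue, processors):
    """Place a run-queue length relative to the range [processors, 5 * processors]."""
    low = processors
    high = 5 * processors
    if run_queue <= low:
        return "below watermark range"
    if run_queue <= high:
        return "within watermark range"
    return "above watermark range"


print(classify_run_queue(1, processors=1))   # below watermark range
print(classify_run_queue(3, processors=1))   # within watermark range
print(classify_run_queue(12, processors=2))  # above watermark range
```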
1.2 - CPU Usage
CPU usage is a measure of how heavily the CPU is used in an environment comprised of few heavy processes (such is the case of scientific or commercial environments with few, but complex, batches). It can be used as a criterion for environments with many processes, but run-queue is more meaningful in these cases. CPU usage is expressed as a percentage and can be broken into four categories: usr, sys, idle and wio. Usr stands for user mode, the mode in which a process executes when not using any operating system service. Sys means system mode, which is the mode a process is placed into when using any operating system service. Idle, as the term suggests, is when a CPU has no process to execute. Wio stands for waiting for I/O, a special case of idleness, where the CPU is available, but there are processes waiting for an I/O operation to complete. CPU usage is normally a concern when usr+sys is above 75 to 85%, in an environment with multiple processes, or is close to 100% / number of processors, in an environment with few processes.
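The two CPU usage criteria can be sketched as follows. The function name, the default watermark of 75% and the `margin` used to approximate "close to" are assumptions introduced for this example only.

```python
def cpu_usage_is_concern(usr, sys_pct, processors, many_processes,
                         watermark=75.0, margin=5.0):
    """Apply the usr+sys criterion for many-process or few-process environments.

    usr and sys_pct are percentages of total CPU capacity. margin is a
    hypothetical allowance standing in for "close to 100% / processors".
    """
    busy = usr + sys_pct
    if many_processes:
        return busy > watermark
    return busy >= 100.0 / processors - margin


print(cpu_usage_is_concern(70.0, 12.0, 1, many_processes=True))   # True
print(cpu_usage_is_concern(20.0, 4.0, 4, many_processes=False))   # True
print(cpu_usage_is_concern(10.0, 2.0, 4, many_processes=False))   # False
```

The second call shows why a low overall percentage can still be a concern: with four processors and few processes, 24% busy may mean one processor is nearly fully occupied by a single heavy process.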
2 - Memory
Memory can play different roles in a computational environment, ranging from fast storage area for program data to disk data caching (making up for the slower speed of disk subsystems). This means that memory is consumed for very distinct purposes. Memory consumption, being understood as not only real memory (RAM), but as the entire virtual memory subsystem, can be well evaluated by paging activity, virtual memory usage and paging space usage.
Paging activity occurs when the real memory being managed by a virtual memory subsystem is overcommitted. To a small degree, it is not a problem, since the main purpose of the virtual memory subsystem is to be able to maximize system throughput by allowing process memory to be swapped in and out. It becomes a critical issue when paging reaches high rates. The point is that paging indicates that the sum of the working sets (ranges of virtual memory addresses of processes that need to be accessible at a given moment) of the processes, plus what is set aside for the operating system and file caching, exceeds the amount of real memory available. Paging is broken down into page in (pi) and page out (po). A page in is usually regarded as more serious, since it may indicate a thrashing condition (the system is spending too much time just paging). The watermark for paging (pi+po) is in the range of 10 pages per second.
Paging space usage is a fundamental concern when analyzing the state of the virtual memory manager. If no paging space is available, definitely no new process will be spawned and, very likely, some existing processes may be terminated by the operating system in order to make room in the paging space. So, real memory constraints impact performance, but paging space constraints put at risk the entire execution environment. The amount of virtual memory devoted to process segments (process data area) is directly related to paging space usage, if the operating system in question is working with early paging space allocation (allocate space in the paging area whenever it is allocated in real memory). In this case, the amount of space being used in the paging area is the sum of the data areas of all running processes, which is the major component in determining the amount of real memory required by the system. If the amount of virtual memory in use exceeds the amount of real memory available, paging will very likely occur, initiating performance degradation. In an environment experiencing significant growth rates, especially in terms of users, it is advisable to keep the average paging space usage rate at 50%. Obviously, this concern does not exist or is less serious in the case of operating systems that are able to allocate paging space dynamically.
3 - Disk I/O
Disk I/O is certainly one of the main subjects when performance is under discussion. This is particularly true in commercial environments. Disks, being mechanical devices (in comparison with other devices which are faster, because they are electronic) can, if not properly used, put in jeopardy the performance of an entire system. In addition, disks can present two very distinct personalities: one when accessing data in random mode (slower, because it involves arm movement), and another when accessing data in sequential mode (faster, because it only involves platter movement). Also, the performance of disks varies according to the blocking factor (amount of data involved in a single operation), since the impact of the overhead is diluted in the case of large blocks. Therefore, disks must be closely watched. Among the key variables that provide information on disk I/O usage are bandwidth occupation, transfers (I/Os) per second, transfer rate (usually expressed in KB/s) and physical read to write ratio.
3.1 - Bandwidth occupation
Bandwidth occupation is probably the most important variable when evaluating disk I/O. It is calculated based on the number of samples taken within a period (1 second, e.g.) that found a given disk to be busy. It is highly dependent on the rate of requests being sent to the disks and the type of data access being required by these requests (since random requests take longer to be serviced than sequential requests). With bandwidth occupation, it is possible to estimate whether disk I/O requests for a given disk are spending time in queues, instead of being serviced promptly. The disk bandwidth occupation watermarks regarded as acceptable vary from 15%, for environments with predominantly random access (OLTP with simple transactions), to 65%, for environments with predominantly sequential access (data warehouse, complex batch applications, etc.). A criterion of 40% is adequate for mixed environments (which constitute the great majority of cases).
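As a sketch of this calculation, bandwidth occupation can be taken as the percentage of samples that found the disk busy and then compared with the watermark for the access pattern. The names `DISK_BUSY_WATERMARK`, `disk_busy_percent` and `disk_over_watermark` are hypothetical, and the sample data is made up.

```python
# Watermarks (%) by predominant access pattern, as given in the text above.
DISK_BUSY_WATERMARK = {"random": 15.0, "mixed": 40.0, "sequential": 65.0}


def disk_busy_percent(busy_flags):
    """Percentage of samples in which the disk was found busy (1) rather than idle (0)."""
    return 100.0 * sum(busy_flags) / len(busy_flags)


def disk_over_watermark(busy_flags, access="mixed"):
    """True if bandwidth occupation exceeds the watermark for the access pattern."""
    return disk_busy_percent(busy_flags) > DISK_BUSY_WATERMARK[access]


# Ten hypothetical samples taken within one period: busy in 3 of 10.
flags = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
print(disk_busy_percent(flags))              # 30.0
print(disk_over_watermark(flags, "random"))  # True
print(disk_over_watermark(flags, "mixed"))   # False
```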
3.2 - Transfers per second
Transfers per second provide good complementary information to bandwidth occupation, especially in order to evaluate more subtle bottlenecks such as disk adapters (SCSI, FC-AL, SSA, etc.). Disk adapters have a ceiling, in terms of transfers per second, that might be reached without notice, limiting, therefore, the I/O capability of multiple disks. Physical disks support about 100 to 120 transfers per second, when accessing data in random mode, and 10x these values or more, when accessing data in sequential mode. So, it is considered acceptable to keep disks operating at a sustained rate of about 50% of these values.
3.3 - Transfer rate
Transfer rate indicates the amount of data that is being received from disks or being sent to disks, per unit of time. Like bandwidth occupation and transfers per second, transfer rate is a function of the rate of requests being sent to the disks and the type of access being required by these requests (random access requires more time to be serviced and, therefore, limits transfer rates). Another key aspect of transfer rate is that it may also expose the ceiling of disk adapters, regarding this characteristic. In addition, computer I/O buses may also impose a limit on the transfer rates of adapters inserted in these buses. The typical watermarks for transfer rates, per individual physical disk, vary from 400 to 1,000 KB/s, for random I/O, to between 4,000 KB/s and 25,000 KB/s (when using large blocks), for sequential I/O.
3.4 - Physical read to write ratio
The physical read to write ratio for disk I/O is important in determining whether a given database configuration is adequate and whether the type of disk layout being used is the most suitable. When the number of physical reads exceeds 5x the number of writes, it might mean (although not necessarily) that the size of a database buffer cache is not big enough. Therefore, the database software might be operating with an unacceptable hit ratio (ratio of logical reads that are satisfied by the cache). A physical read to write ratio below 2.5x, although not a problem in itself, might not be adequate for certain disk layouts such as RAID-5, since this arrangement has a significant write penalty (additional effort required to perform write operations, when compared to read operations).
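The two ratio thresholds can be applied as in the sketch below. The function name and the wording of the findings are hypothetical; the thresholds of 5x and 2.5x come from the paragraph above, and both findings are hints rather than conclusions.

```python
def read_write_findings(physical_reads, physical_writes):
    """Return hints suggested by the physical read to write ratio."""
    if physical_writes == 0:
        return ["no writes recorded; ratio undefined"]
    ratio = physical_reads / physical_writes
    findings = []
    if ratio > 5.0:
        findings.append("database buffer cache may be too small")
    if ratio < 2.5:
        findings.append("ratio may not suit layouts with a write penalty such as RAID-5")
    return findings


print(read_write_findings(600, 100))  # ['database buffer cache may be too small']
print(read_write_findings(300, 100))  # []
```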
4 - Network I/O
Network I/O, although also a key resource, seldom plays a major role in influencing response time, except when wide area networks (WANs) are involved. Nevertheless, this resource must also be monitored, for it may hide some surprises. The key variables to evaluate network I/O are transfer rate (or bandwidth occupation of the maximum transfer rate), error rate and latency.
4.1 - Transfer rate
Similar to disk I/O, the transfer rate of network adapters depends on block size, although the impact is not as significant. On the other hand, there is direct correlation between transfer rate and bandwidth usage in the case of network I/O, whereas with disk I/O this is not the case (since the data access mode must be considered). Typically, collision-prone networks (Ethernet, without switches, e.g.) should operate at 30 to 40% of their nominal capacity and other types of networks should be kept at 50 to 70% of their nominal capacity.
4.2 - Error rate
Error rate provides a measure of how effective the transfer rate is in a network, since a high error rate will mean that the effective transfer rate is very low. Most network adapters and network device drivers provide some means of retrieving error information, which is presented as a complementary statistic to the transfer rate itself. Major causes of high error rates in LANs are collisions (two or more network devices trying to send data at the same time), standing waves (signal that remains in the network due to poor termination), full-duplex/half-duplex mismatch (operation mode mismatch between adapters and hubs/switches) and speed mismatch (speed mismatch between adapters and hubs/switches). A major cause of high error rates in WANs is noise (heat, electromagnetic noise, etc.). The watermark for error rate in LANs should be 1% or even less and for WANs should be about 5%.
4.3 - Network latency
Network latency can be measured by various schemes, the most common being echo requests (TCP/IP ping, e.g.). It is a measure of how long the first packet, of a chain of packets, took to reach its destination. The point is that it is not enough to have high bandwidth if the first packet requires a very significant time to arrive. This is the case with satellite links, but may also apply to confined networks. For instance, the latency of ATM and gigabit Ethernet can be very small for conventional applications, but it is high for parallel computing applications. Another aspect of latency is that it may have an important software component, such as the one caused by operating system protocol stacks (communication subsystems). The typical latencies desired for corporate LANs (Local Area Networks) are in the range of 1 to 10 ms.
Automatos does not guarantee that the recommendations in this report are necessarily the best opportunities available in the market for the specific needs of the client. Automatos will not be held responsible for any kind of damages or losses relating to the recommendations, either directly, indirectly, punitively, incidentally, specially or consequently, including, but not limited to, loss of functions, data or profit, regardless of any contractual, non-contractual or objective obligation. Automatos will not be held responsible even if previously informed of the possibility of damage. The decision on which investment, product or service to use is the sole responsibility of the client.
Appendix E
Figure imgf000206_0001
Introduction
The current capacity planning report was based on data collected on the hosts ACMEsrv01 and ACMEsrv02, from 07/27/2001, at 13:00, to 08/08/2001, at 23:00.
The data used in this report has been obtained from an exclusive collector, specially developed for this purpose, which executes on the target machine with high resolution and low intrusion. This collector obtains data directly from the operating system, without any other libraries or additional tools, with a minimum overhead on the system. The data collected is stored using a binary format, in order to provide persistence. When automatically sent, it is compressed and encrypted, to ensure fast delivery and confidentiality.
The content of this report is based on years of experience in performance analysis and capacity planning. The tool used to generate this report operates in a completely automatic way, without direct human intervention. It uses an extensible inference machine, based on heuristics and rules, improved continuously. Using concepts such as "watermarks" and tolerance, it is possible to determine if a computational resource usage is excessive and if the excess is relevant.
Summary
The table below represents the status of the machines analyzed in this report. The red square indicates that the machine has already exceeded the saturation limit. The black square indicates that the saturation limit will be exceeded in the future horizon considered. Click on the square to observe the corresponding graph.
Processor Memory Disk Disk I/O Network
ACMEsrv01 ■ ■ ■ ■
ACMEsrv02 ■ ■
The CPU utilization for machine ACMEsrv01 is saturated. The CPU recommendation is exchanging this model for 6 Compaq Proliant ML750 2048 machines, with 7 CPUs each (tpm = 446390.7 total).
The memory utilization for machine ACMEsrv02 will become saturated in the future horizon of 300 days. The memory recommendation is to add 1473 MB memory.
The disk space utilization for machine ACMEsrv01 is saturated. The disk space recommendation is to add at least another 52.6 GB of disk space.
The disk space utilization for machine ACMEsrv02 will become saturated in the future horizon of 300 days. The disk space recommendation is to add at least another 6.3 GB of disk space.
The disk I/O utilization for machine ACMEsrv01 is saturated. The disk I/O recommendation is an upgrade to 12 disks, spreading the load among them.
The network utilization for machine ACMEsrv01 will become saturated in the future horizon of 300 days. The network recommendation is to add 2 network adaptors, each one with a 100 Mb/s capacity.
CPU Usage
Machine ACMEsrv01 presented CPU saturation during the monitored period. The future horizon considered is 300 days. Utilization grew 3763.3% per month. The reliability of this projection is 60.9%.
To keep CPU utilization below the limit of 75%, an upgrade of 4376.5% is necessary.
The CPU recommendation is exchanging this model for 6 Compaq Proliant ML750 2048 machines, with 7 CPUs each (tpm = 446390.7 total).
Figure imgf000209_0001
Actual: tpm 9833.48
Minimum necessary: tpm 440194.65
Recommendation: Compaq Proliant ML750 2048
Memory
Machine ACMEsrv02 presented memory saturation in the future horizon of 300 days. Utilization grew 40.1% per month. The reliability of this projection is 97.8%.
To keep memory utilization below the limit of 75%, an upgrade of 124% is necessary.
The memory recommendation is to add 1473 MB memory.
Figure imgf000210_0001
Date axis: 08/2001 to 05/2002
Actual: 575 MB
Minimum necessary: 1287.83 MB
Recommendation: 2048 MB
Disk Occupation
Machine ACMEsrv01 presented disk space saturation during the monitored period. The future horizon considered is 300 days. Utilization grew 19% per month. The accountability of this projection is 94.6%.
To keep disk space utilization below the limit of 75%, an upgrade of 279% is necessary.
The disk space recommendation is to add at least another 52.6 GB of disk space.
Figure imgf000211_0001
Date axis: 08/2001 to 05/2002
Actual: 17.41 GB
Minimum necessary: 65.97 GB
Recommendation: 70 GB
Disk Occupation
Machine ACMEsrv02 presented disk space saturation in the future horizon of 300 days. Utilization grew 6.6% per month. The accountability of this projection is of 99.6%.
To keep disk space utilization below the limit of 75%, an upgrade of 63.7% is necessary.
The disk space recommendation is to add at least another 6.3 GB of disk space.
Figure imgf000212_0001
Date axis: 08/2001 to 05/2002
Actual: 3.73 GB
Minimum necessary: 6.11 GB
Recommendation: 10 GB
Disk I/O
Machine ACMEsrv01 presented disk I/O saturation during the monitored period. The future horizon considered is 300 days. Utilization grew 124.5% per month. The accountability of this projection is 64.2%.
To keep disk I/O utilization below the limit of 75%, an upgrade of 1144% is necessary.
The disk I/O recommendation is an upgrade to 12 disks, spreading the load among them.
Figure imgf000213_0001
Date axis: 08/2001 to 05/2002
Actual: 1 Disks
Minimum necessary: 12.44 Disks
Recommendation: 13 Disks
Network Usage
Machine ACMEsrv01 presented network saturation within the future horizon of 300 days. Utilization grew 976.8% per month. The accountability of this projection is 93.3%.
To keep network utilization below the limit of 75%, an upgrade of 185.5% is necessary.
The network recommendation is to add 2 network adaptors, each one with a 100 Mb/s capacity.
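The adaptor count is consistent with dividing the capacity shortfall by the capacity of one adaptor and rounding up. A sketch under that assumption, using the network figures of this section (the function name and its parameters are illustrative only, not taken from the described tool):

```python
import math


def adaptors_to_add(actual_mbps: float,
                    minimum_necessary_mbps: float,
                    adaptor_mbps: float = 100.0) -> int:
    """Number of extra adaptors needed to cover the capacity shortfall."""
    shortfall = max(0.0, minimum_necessary_mbps - actual_mbps)
    return math.ceil(shortfall / adaptor_mbps)


actual = 104          # Mb/s currently installed
minimum = 296.91      # Mb/s needed to stay under the limit
count = adaptors_to_add(actual, minimum)
recommended = actual + count * 100

print(f"Add {count} adaptors for a total of {recommended} Mb/s")
```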
Figure imgf000214_0001
Date axis: 08/2001 to 05/2002
Actual: 104 Mb/s
Minimum necessary: 296.91 Mb/s
Recommendation: 304 Mb/s
List of Machines
Name       Model  Processor                 Number of CPUs  Clock (MHz)  Memory (MB)  OS       Version
ACMEsrv01         Pentium III (Coppermine)  1               728          191          Win2000  0.0.
ACMEsrv02         Pentium III (Coppermine)  1               728          575          Win2000  0.0.
Performance Indicators
Name       Nominal    Used       Nominal   Used      Total        Used         Total Disk  Used Disk
           SPECint95  SPECint95  TPMC      TPMC      Memory (MB)  Memory (MB)  Space (GB)  Space (GB)
ACMEsrv01  321.3      320.4      9,833.4   9,806.9   191          644          17          1
ACMEsrv02  321.3      160.4      9,833.4   4,910.8   575          178          3
Total      642.6      480.8      19,666.9  14,717.7  766          822          20          1
Figure imgf000217_0001
Introduction
This performance analysis report is based on data collected on the hosts ACMEsrv01, ACMEsrv02, ACMEsrv03, ACMEtst04 and ACMEtst05, from 06/15/2001 at 00:00 to 06/19/2001 at 23:00.
The data used in this report was obtained by a dedicated collector, specially developed for this purpose, which runs on the target machine with high resolution and low intrusion. The collector obtains data directly from the operating system, without any other libraries or additional tools, and with minimal overhead on the system. The collected data is stored in a binary format to provide persistence. When sent automatically, it is compressed and encrypted to ensure fast delivery and confidentiality.
The content of this report is based on years of experience in performance analysis and capacity planning. The tool used to generate this report operates in a completely automatic way, without direct human intervention. It uses an extensible inference engine, based on heuristics and rules that are continuously improved. Using concepts such as "watermarks" and tolerance, it is possible to determine whether the usage of a computational resource is excessive and whether the excess is relevant.
Summary
The chart below summarizes the situation of the machines analyzed in this report. The red square indicates that a machine has exceeded the threshold on average. The black square indicates that only maximum values exceeded the threshold. Click on the square to see the corresponding graph.
Processor Memory Network Disk
ACMEsrv01
ACMEsrv02
ACMEsrv03
ACMEtst04
ACMEtst05
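The color rule described for this chart can be sketched as follows. This is only an illustration: the function name is hypothetical, the values marked as invented are placeholders for figures the report does not give, and the tolerance handling mentioned in the Introduction is not modelled.

```python
from typing import Optional


def summary_marker(average: float, maximum: float, limit: float) -> Optional[str]:
    """Return the square color for one resource of one machine, or None."""
    if average > limit:
        return "red"      # the threshold was exceeded on average
    if maximum > limit:
        return "black"    # only the maximum values exceeded the threshold
    return None           # no square is drawn


# ACMEsrv02 disk occupation: average 75% against a 70% limit
# (the maximum of 80.0 is invented for the example).
print(summary_marker(average=75.0, maximum=80.0, limit=70.0))

# ACMEsrv02 processor: maximum 80% against a 75% limit
# (the average of 40.0 is invented for the example).
print(summary_marker(average=40.0, maximum=80.0, limit=75.0))
```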
Machine ACMEsrv02 reached a maximum CPU consumption of 80%, exceeding the limit of 75%.
Machine ACMEtst04 reached a maximum CPU consumption of 96.8%, exceeding the limit of 75%.
Machine ACMEtst05 reached a maximum CPU consumption of 93.3%, exceeding the limit of 75%.
Machine ACMEtst04 reached a maximum paging rate of 10.2 pg/sec, exceeding the limit of 10 pg/sec.
Machine ACMEtst05 reached a maximum paging rate of 25.8 pg/sec, exceeding the limit of 10 pg/sec.
Machine ACMEtst04's network devices reached a maximum reception level of 84.5%, exceeding the limit of 70%.
Machine ACMEtst05's network devices reached a maximum transmission level of 91%, exceeding the limit of 70%.
Machine ACMEsrv03 reached a maximum disk active time of 57.8%, exceeding the limit of 40%.
Machine ACMEsrv02 registered an average disk occupation of 75%, exceeding the limit of 70%.
Machine ACMEtst04 registered an average disk occupation of 89.1%, exceeding the limit of 70%.
Machine ACMEtst05 registered an average disk occupation of 89.4%, exceeding the limit of 70%.
Processor
Below, each pair of graphs represents the average and maximum values for CPU consumption in the specified period, subdivided into user and system modes.
averages
Figure imgf000220_0001
ACMEsrv01 ACMEsrv02 ACMEsrv03 ACMEtst04 ACMEtst05
Legend: user, system
maximum values
Figure imgf000220_0002
ACMEsrv01 ACMEsrv02 ACMEsrv03 ACMEtst04 ACMEtst05
Legend: user, system
Memory
Each pair of graphs below represents the average and maximum values for memory consumption in the specified period.
averages
Figure imgf000221_0001
ACMEsrv01 ACMEsrv02 ACMEsrv03 ACMEtst04 ACMEtst05
maximum values
Figure imgf000221_0002
ACMEsrv01 ACMEsrv02 ACMEsrv03 ACMEtst04 ACMEtst05
Paging
Each pair of graphs below shows the average and maximum paging rates for the period, subdivided into page-in (pi) and page-out (po).
averages
Figure imgf000222_0001
ACMEsrv01 ACMEsrv02 ACMEsrv03 ACMEtst04 ACMEtst05
Legend: pi, po
maximum values
Figure imgf000222_0002
ACMEsrv01 ACMEsrv02 ACMEsrv03 ACMEtst04 ACMEtst05
Legend: pi, po
Network I/O
Each pair of graphs below represents the maximum and average values for the network traffic in the period. Loopback interfaces were not taken into account, only transmission and reception.
averages
Figure imgf000223_0001
ACMEsrv01 ACMEsrv02 ACMEsrv03 ACMEtst04 ACMEtst05
Legend: transmission, reception
maximum values
Figure imgf000223_0002
ACMEsrv01 ACMEsrv02 ACMEsrv03 ACMEtst04 ACMEtst05
Legend: transmission, reception
Disk Activity
Each pair of graphs below represents the average and maximum values for the disk usage rate (time active) in the specified period.
averages
Figure imgf000224_0001
ACMEsrv01 ACMEsrv02 ACMEsrv03 ACMEtst04 ACMEtst05
maximum values
Figure imgf000224_0002
ACMEsrv01 ACMEsrv02 ACMEsrv03 ACMEtst04 ACMEtst05
Disk Occupation
Each pair of graphs below represents the average and maximum values for the disk occupation in the specified period.
averages
Figure imgf000225_0001
ACMEsrv01 ACMEsrv02 ACMEsrv03 ACMEtst04 ACMEtst05
maximum values
Figure imgf000225_0002
ACMEsrv01 ACMEsrv02 ACMEsrv03 ACMEtst04 ACMEtst05
Processes which most consumed the CPU during the monitoring period
ACMEsrv01 process usage
Figure imgf000226_0001
ACMEsrv02 process usage
Figure imgf000226_0002
ACMEsrv03 process usage
Figure imgf000227_0001
ACMEtst04 process usage
Figure imgf000227_0002
ACMEtst05 process usage
Figure imgf000228_0001
List of Machines
Name       Model         Processor       Number of CPUs  Clock (MHz)  Memory (MB)  OS   Version
ACMEsrv01  IBM.9076-270  PowerPC_POWER3  2               500          3,072        AIX  4.3.3.29
ACMEsrv02  IBM.9076-260  PowerPC_POWER3  2               750          2,048        AIX  4.3.3.16
ACMEsrv03  IBM.9076-260  PowerPC_POWER3  2               500          2,816        AIX  4.3.3.29
ACMEtst04  IBM.9076-260  PowerPC_POWER3  2               600          1,024        AIX  4.3.3.29
ACMEtst05  IBM,9076-260  PowerPC_POWER3  2               450          2,560        AIX  4.3.3.29
Performance Indicators
Name       Nominal    Used       Nominal   Used      Total        Used         Total Disk  Used Disk
           SPECint95  SPECint95  TPMC      TPMC      Memory (MB)  Memory (MB)  Space (GB)  Space (GB)
ACMEsrv01  349.2      157.7      9,596.6   4,333.7   3,072        365          103         79
ACMEsrv02  349.2      279.4      9,596.6   7,676.7   2,048        390          112         84
ACMEsrv03  349.2      255.5      9,596.6   7,021.7   2,816        178          72          35
ACMEtst04  349.2      338.2      9,596.5   9,292.3   1,024        138          229         204
ACMEtst05  349.2      325.9      9,596.5   8,955.9   2,560        505          112         100
Total      1,746.4    1,356.8    47,982.6  37,280.5  11,520       1,576        628         502
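As a cross-check, dividing the used SPECint95 value by the nominal value reproduces the maximum CPU consumption quoted in the Summary for the three hosts that exceeded the limit. A sketch of that ratio, assuming the used figure is simply the nominal capacity scaled by utilization (the helper name is hypothetical):

```python
def utilization_pct(used: float, nominal: float) -> float:
    """Used capacity as a percentage of nominal capacity."""
    return used / nominal * 100.0


# (nominal SPECint95, used SPECint95) taken from the table above.
rows = {
    "ACMEsrv02": (349.2, 279.4),
    "ACMEtst04": (349.2, 338.2),
    "ACMEtst05": (349.2, 325.9),
}

for host, (nominal, used) in rows.items():
    print(f"{host}: {utilization_pct(used, nominal):.1f}%")
```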
Appendix F
SHOWSUM Windows v2.0.3.0 - Nov 9 2000 00:24:17
Copyright (c) Automatos MCMXCIX - MM. All rights reserved.
Opening file "pwc_20010118_0000_w4.sum"
Total records: 24
--Header
Col version: 2.0.2.0
OS version: 4.0.1381.0
OS type: Windows
Platform: x86
VirtualHour: 3600 - 00 01:00:00 (DD HH:MM:SS)
VirtualDay: 86400 - 01 00:00:00 (DD HH:MM:SS)
Timestamp: 2001/01/17 23:00
Col start time: 2001/01/10 11:09
--SysInfo
Sysver: MS Windows NT Server ver 4.0.1381 (sp 6.0) Service Pack 6
CustomerID: 05500100011513
MachineID: 90A30000000052
CompName: SAPDTI
HostName: sapdti
IPaddress: 192.168.1.149
Processor: x86 Family 6 Model 7 Stepping 2
ProcessorSpeed: 497 (MHz)
ProcessorIdent: GenuineIntel
TotalProcessors: 2
ActiveProcessors: 2
TotalMemory: 1047976 (KB)
PageSize: 4096 (B)
RegistrySizeLimit: -1 (B)
CurrentRegistrySize: 8032256 (B)
CurrentSyslogSize: 720984 (B)
BiosVersion: IBM BIOS Ver 7.0
BiosDate: 01/29/99
SystemIdentifier: AT/AT COMPATIBLE
SystemDir: C:\WINNT\System32
WindowsDir: C:\WINNT
LastBootMode: Normal
BootTime: Wed Jan 10 12:07:06 2001
Tzname0: E. South America Standard Time
Tzname1: E. South America Daylight Time
Locale: C
--Timestamp (Record 1)
Wed Jan 17 23:09:00 2001
--VMstat
Processors: 2
Interval: 3,599,967,900 (us) usr: 5.18% sys: 0.62% idle: 188.37% queue: 0 freepte: 39590 pi: 1.53/s po: 2.29/s
--IOstat
Dk Signatur Read(B) Write(B) Read(t) Write(t) Idle(t) R(c) W(c) Q S
0 35CB34EC 4,518,912 4,220,928 526.36 719.01 2354.61 990 1008 0 0 3,600,000,000
1 CE2EE96C 35,814,400 48,538,624 2362.13 1541.60 0.00 4937 1877 0 0 3,903,739,408
2 CE1F98E1 2,187,264 10,772,480 189.31 8693.15 0.00 265 677 0 0 8,882,474,152
3 CE2EE96D 32,768 647,168 3.68 15.88 3580.42 4 14 0 0 3,600,000,000
--FSstat
MountPoint PhysDrv Type Free (KB) Total (KB) %free
VolNum
C:\pagefile.sys PagingFile 128 2,048 6.25%
631AF993 2
D:\pagefile.sys PagingFile 2,378,916 3,072,000 77.44%
B5D97A0C 3000
H:\pagefile.sys PagingFile 344 184,320 0.19%
00D37BF0 180
C:\ 0 NTFS 1,099,205 4,192,492 26.22%
B8BA90A3 0
D:\ 1 NTFS 121,252 12,129,484 1.00%
6470EC21 0
E:\ 2 NTFS 3,130,168 35,540,408 8.81%
E8E20702 0
F:\ 3 NTFS 559,972 1,198,060 46.74%
A891FC02 0
H:\ 0 NTFS 21,314 208,025 10.25%
5CDA4CC9 0
--Network
T S MTU Speed RX-Kb/s TX-Kb/s RXe TXe Interface
24 1 1500 10000000 0 0 0 0 MS TCP Loopback interface [None] 3,600,046,875
6 1 1500 10000000 0 0 0 0 AMD PCNET Family Ethernet Adapter [0004AC4CE34D] 3,600,046,875
--TopTen
Interval: 3,600,078,125 (us)
ActiveCPUs: 2
TotalProcesses: 50
TotalThreads: 327
TotalIO: 0
TotalHandles: 8431
CPU absolute
PID User Sys Thrds Hlds
Idle 0 0 6,781,391 2 0
disp+work 257 232,953 11,672 4 655
sqlservr 153 79,922 12,125 41 947
disp+work 299 29,078 3,188 4 177
disp+work 323 8,860 1,047 4 166
disp+work 267 6,891 640 4 273
System 2 0 7,375 35 1033
perfwcol 179 1,281 4,547 2 71
disp+work 270 5,031 297 4 174
inetinfo 133 2,625 2,407 21 342
CPU group
User Sys Thrds Hlds
Idle 0 6,781,391 2 0
disp+work 288,766 17,938 82 3877
sqlservr 79,922 12,125 41 947
System 0 7,375 35 1033
perfwcol 1,281 4,547 2 71
inetinfo 2,625 2,407 21 342
sqlagent 500 47 8 91
saposcol 172 265 4 77
SERVICES 63 93 20 285
msdtc 16 31 21 110
MEM absolute
PID Private Shared
sqlservr 153 834,699,264 0
disp+work 257 53,698,560 0
disp+work 267 26,587,136 0
disp+work 323 12,963,840 0
disp+work 330 11,644,928 0
disp+work 259 10,477,568 0
disp+work 290 9,560,064 0
disp+work 254 7,888,896 0
disp+work 299 7,815,168 0
disp+work 296 6,975,488 0
MEM group
Private Shared
sqlservr 834,699,264 0
disp+work 176,537,600 0
mmc 4,349,952 0
gwrd 2,596,864 0
sqlagent 2,260,992 0
perfwcol 1,658,880 0
msg_server 1,495,040 0
SERVICES 1,433,600 0
EXPLORER 1,413,120 0
sqlmangr 1,409,024 0
IO absolute
PID IO
IO group
IO
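The counters within a record can be cross-checked against each other. The sketch below uses values from Record 1 and rests on assumptions that the listing itself does not state: that the TopTen CPU times are in milliseconds, that Read(t), Write(t) and Idle(t) are in seconds, and that the free-space percentage is free divided by total. All function names are illustrative only.

```python
def cpu_percent(cpu_ms: float, interval_us: float) -> float:
    """CPU time as a percentage of one processor over the interval.

    With two processors the idle value can approach 200%.
    """
    return (cpu_ms * 1000.0) / interval_us * 100.0


def disk_busy_percent(read_s: float, write_s: float, idle_s: float) -> float:
    """Share of the observed time that a disk spent reading or writing."""
    busy = read_s + write_s
    return busy / (busy + idle_s) * 100.0


def percent_free(free_kb: float, total_kb: float) -> float:
    """Free space as a percentage of total space."""
    return free_kb / total_kb * 100.0


# Idle process, Sys column of "CPU absolute", against the TopTen interval.
idle = cpu_percent(6_781_391, 3_600_078_125)
# Disk 0 of the IOstat section.
disk0 = disk_busy_percent(526.36, 719.01, 2354.61)
# C:\ volume of the FSstat section.
c_drive = percent_free(1_099_205, 4_192_492)

print(f"idle: {idle:.2f}%  disk 0 busy: {disk0:.2f}%  C:\\ free: {c_drive:.2f}%")
```

Under these assumptions the idle figure agrees with the VMstat line of the same record and the free-space figure agrees with the %free column.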
--Timestamp (Record 2)
Thu Jan 18 00:09:00 2001
--VMstat
Processors: 2
Interval: 3,599,961,300 (us) usr: 4.95% sys: 0.59% idle: 188.90% queue: 0 freepte: 39590 pi: 1.14/s po: 2.56/s
--IOstat
Dk Signatur Read(B) Write(B) Read(t) Write(t) Idle(t) R(c) W(c) Q S
0 35CB34EC 3,325,952 4,095,488 360.24 680.17 2559.57 738 976 0 0 3,600,000,000
1 CE2EE96C 13,614,080 49,569,792 1213.47 1404.73 981.79 3098 1571 0 0 3,600,000,000
2 CE1F98E1 802,816 131,072 80.39 4.43 3515.17 98 1 0 0 3,600,000,000
3 CE2EE96D 0 552,960 0.00 10.96 3589.03 0 9 0 0 3,600,000,000
--FSstat
MountPoint PhysDrv Type Free (KB) Total (KB) %free
VolNum
C:\pagefile.sys PagingFile 116 2,048 5.66%
631AF993 2
D:\pagefile.sys PagingFile 2,367,696 3,072,000 77.07%
B5D97A0C 3000
H:\pagefile.sys PagingFile 136 184,320 0.07%
00D37BF0 180
C:\ 0 NTFS 1,099,109 4,192,492 26.22%
B8BA90A3 0
D:\ 1 NTFS 121,172 12,129,484 1.00%
6470EC21 0
E:\ 2 NTFS 3,130,168 35,540,408 8.81%
E8E20702 0
F:\ 3 NTFS 559,972 1,198,060 46.74%
A891FC02 0
H:\ 0 NTFS 21,314 208,025 10.25%
5CDA4CC9 0
--Network
T S MTU Speed RX-Kb/s TX-Kb/s RXe TXe Interface
24 1 1500 10000000 0 0 0 0 MS TCP Loopback interface [None] 3,599,984,375
6 1 1500 10000000 0 0 0 0 AMD PCNET Family Ethernet Adapter [0004AC4CE34D] 3,599,984,375
--TopTen
Interval: 3,599,953,125 (us)
ActiveCPUs: 2
TotalProcesses: 50
TotalThreads: 330
TotalIO: 0
TotalHandles: 8458
CPU absolute
PID User Sys Thrds Hlds
Idle 0 0 6,800,546 2 0
disp+work 257 224,547 10,575 4 655
sqlservr 153 75,000 11,312 43 961
disp+work 299 28,797 2,843 4 177
disp+work 323 7,922 1,032 4 166
disp+work 267 6,375 875 4 275
System 2 0 7,156 35 1033
perfwcol 179 1,031 4,625 2 71
disp+work 270 5,109 375 4 174
disp+work 356 3,266 328 4 163
CPU group
User Sys Thrds Hlds
Idle 0 6,800,546 2 0
disp+work 278,423 16,908 82 3879
sqlservr 75,000 11,312 43 961
System 0 7,156 35 1033
perfwcol 1,031 4,625 2 71
inetinfo 1,797 1,718 21 342
sqlagent 375 94 9 98
saposcol 140 172 4 77
SERVICES 47 125 20 285
LSASS 31 15 12 103
MEM absolute
PID Private Shared
sqlservr 153 839,528,448 0
disp+work 257 40,382,464 0
disp+work 267 24,035,328 0
disp+work 323 15,040,512 0
disp+work 330 12,075,008 0
disp+work 259 11,649,024 0
disp+work 290 8,265,728 0
disp+work 254 7,925,760 0
disp+work 299 7,815,168 0
disp+work 362 6,918,144 0
MEM group
Private Shared
sqlservr 839,528,448 0
disp+work 159,154,176 0
mmc 4,456,448 0
sqlagent 2,351,104 0
gwrd 1,716,224 0
perfwcol 1,662,976 0
msg_server 1,527,808 0
SERVICES 1,466,368 0
saposcol 1,413,120 0
sqlmangr 1,409,024 0
IO absolute
PID IO
IO group
IO
--Timestamp (Record 3)
Thu Jan 18 01:09:00 2001
--VMstat
Processors: 2
Interval: 3,613,164,500 (us) usr: 6.57% sys: 0.77% idle: 184.56% queue: 0 freepte: 39590 pi: 6.27/s po: 3.25/s
--IOstat
Dk Signatur Read(B) Write(B) Read(t) Write(t) Idle(t) R(c) W(c) Q S
0 35CB34EC 10,611,712 5,949,440 1730.99 966.24 902.81 2504 1284 0 0 3,600,046,875
1 CE2EE96C 81,314,304 57,608,192 8444.62 1914.87 0.00 19382 1615 0 0 10,359,496,492
2 CE1F98E1 1,184,382,976 15,892,480 30145.99 13504.57 0.00 46435 1154 2 0 43,650,568,894
3 CE2EE96D 143,171,584 206,843,904 3150.72 6055.04 0.00 2196 3198 0 0 9,205,776,116
--FSstat
MountPoint PhysDrv Type Free (KB) Total (KB) %free
VolNum
C:\pagefile.sys PagingFile 128 2,048 6.25%
631AF993 2
D:\pagefile.sys PagingFile 2,443,288 3,072,000 79.53%
B5D97A0C 3000
H:\pagefile.sys PagingFile 4,780 184,320 2.59%
00D37BF0 180
C:\ 0 NTFS 1,099,109 4,192,492 26.22%
B8BA90A3 0
D:\ 1 NTFS 121,104 12,129,484 1.00%
6470EC21 0
E:\ 2 NTFS 3,130,168 35,540,408 8.81%
E8E20702 0
F:\ 3 NTFS 559,972 1,198,060 46.74%
A891FC02 0
H:\ 0 NTFS 21,314 208,025 10.25%
5CDA4CC9 0
--Network
T S MTU Speed RX-Kb/s TX-Kb/s RXe TXe Interface
24 1 1500 10000000 0 0 0 0 MS TCP Loopback interface [None] 3,600,281,250
6 1 1500 10000000 0 0 0 0 AMD PCNET Family Ethernet Adapter [0004AC4CE34D] 3,600,281,250
--TopTen
Interval: 3,600,375,000 (us)
ActiveCPUs: 2
TotalProcesses: 50
TotalThreads: 330
TotalIO: 0
TotalHandles: 8473
CPU absolute
PID User Sys Thrds Hlds
Idle 0 0 6,668,766 2 0
disp+work 257 234,953 10,781 4 655
sqlservr 153 181,593 18,500 43 966
disp+work 299 28,922 2,719 4 177
disp+work 323 7,812 1,078 4 166
System 2 0 7,797 35 1033
disp+work 267 6,500 1,000 4 279
perfwcol 179 1,000 4,765 2 71
disp+work 270 5,063 281 4 174
inetinfo 133 2,500 2,563 21 342
CPU group
User Sys Thrds Hlds
Idle 0 6,668,766 2 0
disp+work 289,298 16,843 82 3883
sqlservr 181,593 18,500 43 966
System 0 7,797 35 1033
perfwcol 1,000 4,765 2 71
inetinfo 2,500 2,563 21 342
sqlagent 641 62 10 104
saposcol 125 188 4 77
SERVICES 47 94 20 285
msdtc 47 63 21 110
MEM absolute
PID Private Shared
sqlservr 153 912,257,024 0
disp+work 257 31,469,568 0
disp+work 267 19,525,632 0
disp+work 323 12,640,256 0
disp+work 330 12,107,776 0
disp+work 259 11,120,640 0
disp+work 296 8,568,832 0
disp+work 254 7,897,088 0
disp+work 362 6,914,048 0
disp+work 299 6,139,904 0
MEM group
Private Shared
sqlservr 912,257,024 0
disp+work 139,837,440 0
mmc 4,276,224 0
sqlagent 2,416,640 0
perfwcol 1,658,880 0
gwrd 1,626,112 0
msg_server 1,601,536 0
SERVICES 1,437,696 0
saposcol 1,409,024 0
sqlmangr 1,409,024 0
IO absolute
PID IO
IO group
IO
--Timestamp (Record 4)
Thu Jan 18 02:09:00 2001
--VMstat
Processors: 2
Interval: 3,600,020,700 (us) usr: 16.41% sys: 1.45% idle: 164.25% queue: 1 freepte: 39590 pi: 4.92/s po: 3.11/s
--IOstat
Dk Signatur Read(B) Write(B) Read(t) Write(t) Idle(t) R(c) W(c) Q S
0 35CB34EC 15,378,432 8,090,112 2373.63 1164.08 62.26 3303 1334 0 0 3,599,984,375
1 CE2EE96C 58,666,496 60,874,240 9851.18 3949.96 0.00 13140 2356 0 0 13,801,158,254
2 CE1F98E1 9,642,704,896 17,563,648 656950.62 18423.73 0.00 246238 1433 0 0 675,374,356,158
3 CE2EE96D 3,068,723,200 3,140,079,616 74307.85 125170.79 0.00 47364 48041 2 0 199,478,646,369
--FSstat
MountPoint PhysDrv Type Free (KB) Total (KB) %free
VolNum
C:\pagefile.sys PagingFile 120 2,048 5.86%
631AF993 2
D:\pagefile.sys PagingFile 2,457,168 3,072,000 79.99%
B5D97A0C 3000
H:\pagefile.sys PagingFile 5,004 184,320 2.71%
00D37BF0 180
C:\ 0 NTFS 1,099,109 4,192,492 26.22%
B8BA90A3 0
D:\ 1 NTFS 121,700 12,129,484 1.00%
6470EC21 0
E:\ 2 NTFS 3,130,168 35,540,408 8.81%
E8E20702 0
F:\ 3 NTFS 559,972 1,198,060 46.74%
A891FC02 0
H:\ 0 NTFS 21,314 208,025 10.25%
5CDA4CC9 0
--Network
T S MTU Speed RX-Kb/s TX-Kb/s RXe TXe Interface
24 1 1500 10000000 0 0 0 0 MS TCP Loopback interface [None] 3,599,953,125
6 1 1500 10000000 0 0 0 0 AMD PCNET Family Ethernet Adapter [0004AC4CE34D] 3,599,953,125
--TopTen
Interval: 3,599,906,250 (us)
ActiveCPUs: 2
TotalProcesses: 50
TotalThreads: 331
TotalIO: 0
TotalHandles: 8474
CPU absolute
PID User Sys Thrds Hlds
Idle 0 0 5,913,281 2 0
sqlservr 153 862,094 45,406 43 966
disp+work 257 233,047 12,125 4 656
disp+work 299 46,594 5,906 4 177
disp+work 296 12,047 2,437 4 165
System 2 0 8,484 35 1033
disp+work 267 6,750 672 4 279
disp+work 323 6,094 968 4 166
perfwcol 179 1,250 4,469 2 71
disp+work 270 5,265 359 4 174
CPU group
User Sys Thrds Hlds
Idle 0 5,913,281 2 0
sqlservr 862,094 45,406 43 966
disp+work 316,077 23,498 82 3884
System 0 8,484 35 1033
perfwcol 1,250 4,469 2 71
inetinfo 2,046 2,281 21 342
saposcol 188 281 4 77
SERVICES 46 141 20 285
msdtc 16 62 21 110
gwrd 32 32 6 164
MEM absolute
PID Private Shared
sqlservr 153 892,227,584 0
disp+work 257 44,204,032 0
disp+work 267 13,819,904 0
disp+work 323 13,299,712 0
disp+work 330 12,156,928 0
disp+work 259 11,141,120 0
disp+work 296 9,084,928 0
disp+work 254 7,946,240 0
disp+work 362 6,922,240 0
disp+work 299 5,935,104 0
MEM group
Private Shared
sqlservr 892,227,584 0
disp+work 148,635,648 0
mmc 4,272,128 0
sqlagent 2,416,640 0
perfwcol 1,679,360 0
msg_server 1,495,040 0
SERVICES 1,466,368 0
saposcol 1,413,120 0
sqlmangr 1,409,024 0
gwrd 1,380,352 0
IO absolute
PID IO
IO group
IO
--Timestamp (Record 5)
Thu Jan 18 03:09:00 2001
--VMstat
Processors: 2
Interval: 3,586,720,200 (us) usr: 7.35% sys: 0.81% idle: 184 queue: 0 freepte: 39590 pi: 5.67/s po: 4.12/s
--IOstat
Dk Signatur Read(B) Write(B) Read(t) Write(t) Idle(t) R(c) W(c) Q S
0 35CB34EC 16,280,576 8,559,616 2262.22 1233.07 104.69 3566 1599 0 0 3,600,000,000
1 CE2EE96C 69,891,584 68,056,064 7131.73 2272.67 0.00 15738 1823 0 0 9,404,409,396
2 CE1F98E1 2,161,000,448 20,086,784 212505.37 36931.50 0.00 81196 1615 0 0 249,436,884,130
, v r > ( ;', , r?i , n>: \ ip, -i iO. "^" ~ '5 ^ T-^-'-.O ?..? < •> ', / ' s 0 ϋ 32, 30, S02, i i
--FSstat
MountPoint PhysDrv Type Free (KB) Total (KB) %free
VolNum
C:\pagefile.sys PagingFile 124 2,048 6.05%
631AF993 2
D:\pagefile.sys PagingFile 2,451,004 3,072,000 79.79%
B5D97A0C 3000
H:\pagefile.sys PagingFile 4,624 184,320 2.51%
00D37BF0 180
C:\ NTFS 1,099,109 4,192,492 26.22%
B8BA90A3 0
D:\ NTFS 121,572 12,129,484 1.00%
6470EC21 0
E:\ NTFS 3,130,168 35,540,408 8.81%
E8E20702 0
F:\ NTFS 559,972 1,198,060 46.74%
A891FC02 0
H:\ NTFS 21,314 208,025 10.25%
5CDA4CC9 0
--Network
T S MTU Speed RX-Kb/s TX-Kb/s RXe TXe Interface
24 1 1500 10000000 0 0 0 0 MS TCP Loopback interface [None] 3,599,859,375
6 1 1500 10000000 0 0 0 0 AMD PCNET Family Ethernet Adapter [0004AC4CE34D] 3,599,859,375
--TopTen
Interval: 3,599,812,500 (us)
ActiveCPUs: 2
TotalProcesses: 50
TotalThreads: 330
TotalIO: 0
TotalHandles: 8465
CPU absolute
PID User Sys Thrds Hlds
Idle 0 0 6,613,875 2 0
sqlservr 153 240,813 19,641 43 961
disp+work 257 227,156 11,344 4 655
disp+work 299 29,922 3,172 4 177
disp+work 323 8,703 782 4 166
System 2 0 7,875 35 1033
disp+work 267 6,656 672 4 281
perfwcol 179 1,156 4,531 2 71
disp+work 270 5,110 219 4 174
inetinfo 133 2,141 2,453 21 342
CPU group
User Sys Thrds Hlds
Idle 0 6,613,875 2 0
disp+work 283,531 17,441 82 3887
sqlservr 240,813 19,641 43 961
System 0 7,875 35 1033
perfwcol 1,156 4,531 2 71
inetinfo 2,141 2,453 21 342
saposcol 93 203 4 77
sqlagent 125 47 98
SERVICES 47 78 20 285
gwrd 30 15 6 164
MEM absolute
PID Private Shared
sqlservr 153 886,140,928
disp+work 257 20,058,112 0
disp+work 267 17,321,984 0
disp+work 323 14,999,552 0
disp+work 330 12,259,328 0
disp+work 259 11,747,328 0
disp+work 290 9,072,640 0
disp+work 296 8,331,264 0
disp+work 254 7,884,800 0
disp+work 299 7,135,232 0
MEM group
Private Shared
sqlservr 886,140,928 0
disp+work 137,338,880 0
mmc 4,341,760 0
sqlagent 2,351,104 0
perfwcol 1,650,688 0
SERVICES 1,626,112 0
msg_server 1,609,728 0
gwrd 1,515,520 0
saposcol 1,409,024 0
sqlmangr 1,409,024 0
IO absolute
PID IO
IO group
IO
--Timestamp (Record 6)
Thu Jan 18 04:09:00 2001
--VMstat
Processors: 2
Interval: 3,599,965,200 (us) usr: 4.92% sys: 0.62% idle: 188.89% queue: 0 freepte: 39590 pi: 2.39/s po: 2.48/s
--IOstat
Dk Signatur Read(B) Write(B) Read(t) Write(t) Idle(t) R(c) W(c) Q S
0 35CB34EC 6,725,632 5,163,008 825.95 738.31 2035.73 1474 1072 0 0 3,600,000,000
1 CE2EE96C 28,791,808 46,677,504 2195.14 1316.68 88.16 6599 1425 0 0 3,600,000,000
2 CE1F98E1 9,363,456 8,192 1934.33 1.24 1664.42 582 1 0 0 3,600,000,000
3 CE2EE96D 409,600 954,368 49.38 32.33 3518.29 50 15 0 0 3,600,015,625
--FSstat
MountPoint PhysDrv Type Free (KB) Total (KB) %free
VolNum
C:\pagefile.sys PagingFile 124 2,048 6.05%
631AF993 2
D:\pagefile.sys PagingFile 2,444,780 3,072,000 79.58%
B5D97A0C 3000
H:\pagefile.sys PagingFile 4,280 184,320 2.32%
00D37BF0 180
C:\ NTFS 1,099,109 4,192,492 26.22%
B8BA90A3 0
D:\ NTFS 121,508 12,129,484 1.00%
6470EC21 0
E:\ NTFS 3,130,168 35,540,408 8.81%
E8E20702 0
F:\ NTFS 559,972 1,198,060 46.74%
A891FC02 0
H:\ NTFS 21,314 208,025 10.25%
5CDA4CC9 0
--Network
T S MTU Speed RX-Kb/s TX-Kb/s RXe TXe Interface
24 1 1500 10000000 0 0 0 0 MS TCP Loopback interface [None] 3,599,984,375
6 1 1500 10000000 0 0 0 0 AMD PCNET Family Ethernet Adapter [0004AC4CE34D] 3,599,984,375
--TopTen
Interval: 3,600,015,625 (us)
ActiveCPUs: 2
TotalProcesses: 50
TotalThreads: 330
TotalIO: 0
TotalHandles: 8468
CPU absolute
PID User Sys Thrds Hlds
Idle 0 0 6,800,328 2 0
disp+work 257 221,250 12,031 4 655
sqlservr 153 74,062 11,859 43 962
disp+work 299 28,968 2,938 4 177
disp+work 323 8,250 687 4 166
disp+work 267 6,281 1,016 4 283
System 2 0 7,000 35 1033
disp+work 270 5,328 313 4 174
perfwcol 179 1,110 4,500 2 71
inetinfo 133 3,016 2,469 21 342
CPU group
User Sys Thrds Hlds
Idle 0 6,800,328 2 0
disp+work 275,937 17,982 82 3889
sqlservr 74,062 11,859 43 962
System 0 7,000 35 1033
perfwcol 1,110 4,500 2 71
inetinfo 3,016 2,469 21 342
sqlagent 610 63 9 98
saposcol 125 109 4 77
SERVICES 32 125 20 285
msdtc 16 62 21 110
MEM absolute
PID Private Shared
sqlservr 153 880,480,256 0
disp+work 257 17,899,520 0
disp+work 267 16,699,392 0
disp+work 323 14,749,696 0
disp+work 330 12,406,784 0
disp+work 259 11,264,000 0
disp+work 299 10,256,384 0
disp+work 296 8,351,744 0
disp+work 254 7,884,800 0
disp+work 362 6,930,432 0
MEM group
Private Shared
sqlservr 880,480,256 0
disp+work 131,387,392 0
mmc 4,341,760 0
sqlagent 2,351,104 0
perfwcol 1,658,880 0
msg_server 1,564,672 0
SERVICES 1,441,792 0
saposcol 1,413,120 0
sqlmangr 1,409,024 0
EXPLORER 1,376,256 0
IO absolute
PID IO
IO group
IO
--Timestamp (Record 7)
Thu Jan 18 05:09:00 2001
--VMstat
Processors: 2
Interval: 3,613,419,200 (us) usr: 4.90% sys: 0.60% idle: 188.22% queue: 0 freepte: 39590 pi: 1.73/s po: 2.27/s
--IOstat
Dk Signatur Read(B) Write(B) Read(t) Write(t) Idle(t) R(c) W(c) Q S
0 35CB34EC 5,115,904 4,850,176 596.42 750.39 2253.15 1179 1091 0 0 3,599,984,375
1 CE2EE96C 20,445,696 43,931,648 1792.79 1255.59 551.59 4631 1355 0 0 3,599,984,375
2 CE1F98E1 1,015,808 13,172,736 159.80 31218.14 0.00 124 1040 0 0 31,377,950,478
3 CE2EE96D 81,920 585,728 9.94 14.66 3575.38 10 13 0 0 3,599,984,375
--FSstat
MountPoint PhysDrv Type Free (KB) Total (KB) %free
VolNum
C:\pagefile.sys PagingFile 124 2,048 6.05%
631AF993 2
D:\pagefile.sys PagingFile 2,432,956 3,072,000 79.20%
B5D97A0C 3000
H:\pagefile.sys PagingFile 3,424 184,320 1.86%
00D37BF0 180
C:\ NTFS 1,099,109 4,192,492 26.22%
B8BA90A3 0
D:\ NTFS 121,436 12,129,484 1.00%
6470EC21 0
E:\ NTFS 3,130,168 35,540,408 8.81%
E8E20702 0
F:\ NTFS 559,972 1,198,060 46.74%
A891FC02 0
H:\ NTFS 21,314 208,025 10.25%
5CDA4CC9 0
--Network
T S MTU Speed RX-Kb/s TX-Kb/s RXe TXe Interface
24 1 1500 10000000 0 0 0 0 MS TCP Loopback interface [None] 3,599,968,750
6 1 1500 10000000 0 0 0 0 AMD PCNET Family Ethernet Adapter [0004AC4CE34D] 3,599,968,750
--TopTen
Interval: 3,599,968,750 (us)
ActiveCPUs: 2
TotalProcesses: 50
TotalThreads: 329
TotalIO: 0
TotalHandles: 8470
CPU absolute
PID User Sys Thrds Hlds
Idle 0 0 6,801,407 2 0
disp+work 290 220,156 11,938 4 171
sqlservr 153 75,297 11,485 43 962
disp+work 299 29,329 2,562 4 177
disp+work 323 7,812 922 4 166
System 2 0 7, ,281 35 1033 i .p+work °QΓ, 5, ^0 ' "} 1 16"7
1 \>
| l W>-n ] 1 /<! 1 , "1 ' ■1 , / 1
1 , > 1 • 1! 1 1 J.
CPU group
User Sys Thrds Hlds
Idle 0 6,801,407 2 0
disp+work 274,592 17,359 82 3893
sqlservr 75,297 11,485 43 962
System 0 7,281 35 1033
perfwcol 1,297 4,297 2 71
inetinfo 2,734 2,641 21 342
sqlagent 468 47 9 98
saposcol 94 282 4 77
SERVICES 15 140 20 285
msdtc 0 47 21 110
MEM absolute
PID Private Shared
sqlservr 153 867,102,720 0
disp+work 257 18,313,216
disp+work 323 15,831,040
disp+work 290 15,097,856
disp+work 330 12,525,568
disp+work 267 11,501,568
disp+work 296 9,949,184
disp+work 254 7,880,704
disp+work 299 7,639,040
disp+work 356 7,049,216
MEM group
Private Shared
sqlservr 867,102,720
disp+work 135,348,224
mmc 4,341,760 0
sqlagent 2,351,104 0
perfwcol 1,658,880 0
msg_server 1,609,728 0
SERVICES 1,470,464 0
saposcol 1,409,024 0
sqlmangr 1,409,024 0
EXPLORER 1,376,256 0
IO absolute
PID IO
IO group
IO
--Timestamp (Record 8)
Thu Jan 18 06:09:00 2001
--VMstat
Processors: 2
Interval: 3,586,512,100 (us) usr: 5.38% sys: 0.60% idle: 188.76% queue: 0 freepte: 39590 pi: 2.49/s po: 2.10/s
--IOstat
Dk Signatur Read(B) Write(B) Read(t) Write(t) Idle(t) R(c) W(c) Q S
0 35CB34EC 7,949,312 4,606,976 878.72 698.52 2022.76 1712 1030 0 0 3,600,015,625
1 CE2EE96C 28,785,152 41,876,480 2346.40 1288.21 0.00 6552 1389 0 0 3,634,621,595
2 CE1F98E1 39,854,080 24,576 46027.57 3.21 0.00 3388 3 0 0 46,030,783,870
3 CE2EE96D 1,794,048 2,322,432 45.24 138.07 3416.63 37 41 0 0 3,599,953,125
--FSstat
MountPoint PhysDrv Type Free (KB) Total (KB) %free
VolNum
C:\pagefile.sys PagingFile 128 2,048 6.25%
631AF993 2
D:\pagefile.sys PagingFile 2,432,500 3,072,000 79.18%
B5D97A0C 3000
H:\pagefile.sys PagingFile 3,440 184,320 1.87%
00D37BF0 180
C:\ NTFS 1,099,109 4,192,492 26.22%
B8BA90A3 0
D:\ 1 NTFS 120,348 12,129,484 0.99%
6470EC21 0
E:\ 2 NTFS 3,130,168 35,540,408 8.81%
E8E20702 0
F:\ 3 NTFS 559,972 1,198,060 46.74%
A891FC02 0
H:\ 0 NTFS 21,314 208,025 10.25%
5CDA4CC9 0
--Network
T S MTU Speed RX-Kb/s TX-Kb/s RXe TXe Interface
24 1 1500 10000000 0 0 0 0 MS TCP Loopback interface [None] 3,600,000,000
6 1 1500 10000000 0 0 0 0 AMD PCNET Family Ethernet Adapter [0004AC4CE34D] 3,600,000,000
--TopTen
Interval: 3,600,000,000 (us)
ActiveCPUs: 2
TotalProcesses: 50
TotalThreads: 329
TotalIO: 0
TotalHandles: 8476
CPU absolute
PID User Sys Thrds Hlds
Idle 0 0 6,770,109 2 0
disp+work 290 215,953 11,391 4 171
sqlservr 153 88,062 11,984 43 962
disp+work 299 35,000 3,016 4 179
disp+work 296 17,860 750 4 167
disp+work 356 9,328 406 4 163
disp+work 323 7,641 750 4 166
disp+work 257 6,703 906 4 663
System 2 0 6,438 35 1033
perfwcol 179 1,234 4,344 2 71
CPU group
User Sys Thrds Hlds
Idle 0 6,770,109 2 0
disp+work 294,236 17,626 82 3899
sqlservr 88,062 11,984 43 962
System 0 6,438 35 1033
perfwcol 1,234 4,344 2 71
inetinfo 1,875 1,546 21 342
sqlagent 922 62 9 98
saposcol 141 109 4 77
msdtc 16 94 21 110
SERVICES 16 63 20 285
MEM absolute
PID Private Shared
sqlservr 153 869,076,992 0
disp+work 257 19,746,816 0
disp+work 323 15,761,408 0
disp+work 290 15,204,352 0
disp+work 330 12,673,024 0
disp+work 267 11,780,096 0
disp+work 296 10,698,752 0
disp+work 299 9,351,168 0
disp+work 356 8,249,344 0
disp+work 254 7,933,952 0
MEM group
Private Shared
sqlservr 869,076,992 0
disp+work 137,785,344 0
mmc 4,354,048 0
sqlagent 2,351,104 0
gwrd 1,847,296 0
perfwcol 1,654,784 0
msg_server 1,560,576 0
SERVICES 1,466,368 0
saposcol 1,409,024 0
sqlmangr 1,409,024 0
IO absolute
PID IO
IO group
IO
--Timestamp (Record 9)
Thu Jan 18 07:09:00 2001
--VMstat
Processors: 2
Interval: 3,613,594,500 (us) usr: 6.07% sys: 1.19% idle: 184.68% queue: 0 freepte: 39385 pi: 17.57/s po: 10.39/s
--IOstat
Dk Signatur Read(B) Write(B) Read(t) Write(t) Idle(t) R(c) W(c) Q S
0 35CB34EC 43,363,840 19,882,496 6809.32 2847.97 0.00 9330 3482 1 0 9,657,299,247
1 CE2EE96C 265,200,640 156,621,312 25389.07 4996.59 0.00 51229 4333 1 0 30,385,667,222
2 CE1F98E1 5,391,056,896 8,364,032 42311.94 13944.01 0.00 9808 657 0 0 56,255,963,029
3 CE2EE96D 548,864 1,438,720 34.90 42.59 3523.14 32 33 0 0 3,600,640,625
--FSstat
MountPoint PhysDrv Type Free (KB) Total (KB) %free
VolNum
C:\pagefile.sys PagingFile 120 2,048 5.86%
631AF993 2
D:\pagefile.sys PagingFile 2,436,856 3,072,000 79.32%
Figure imgf000248_0001
H:\pagefile.sys PagingFile 3,576 184,320 1.94%
00D37BF0 180
C:\ NTFS 1,099,109 4,192,492 26.22%
B8BA90A3 0
D:\ NTFS 120,040 12,129,484 0.99%
6470EC21 0
E:\ NTFS 3,130,168 35,540,408 8.81%
E8E20702 0
F:\ NTFS 559,972 1,198,060 46.74%
A891FC02 0
H:\ NTFS 21,314 208,025 10.25%
5CDA4CC9 0
--Network
T S MTU Speed RX-Kb/s TX-Kb/s RXe TXe Interface
24 1 1500 10000000 0 0 0 0 MS TCP Loopback interface [None] 3,600,671,875
6 1 1500 10000000 0 4 0 0 AMD PCNET Family Ethernet Adapter [0004AC4CE34D] 3,600,671,875
--TopTen
Interval: 3,600,640,625 (us)
ActiveCPUs: 2
TotalProcesses: 50
TotalThreads: 331
TotalIO: 0
TotalHandles: 8520
CPU absolute
PID User Sys Thrds Hlds
Idle 0 0 6,673,922 2 0
disp+work 290 223,719 12,250 4 171
sqlservr 153 102,032 17,813 43 965
disp+work 257 31,563 5,360 4 663
disp+work 299 29,625 2,844 4 179
Svstem 2 0 12, ^06 5 1045 i l l I ' 1 H i M I 10, '. i| I d i sp+wor k 270 9, 54 / 1,' ")7 4 17b
' I 1 , [ i I W' 1 i I / , H A A 'i A A 1 fl(, llliii. I /.' . . I, Hi 1 U I
CPU group-
User Sys Thrds Hlds
Idle 0 6,673,922 2 0 disp+work 330,985 28,939 82 3927 sqlservr 102,032 17,813 43 965
System 0 12,906 35 1045 mmc 2,640 4,562 81 perfwcol 1,172 4,484 2 71 inetinfo 1,578 1,657 21 342 sqlagent 407 109 9 98 saposcol 125 328 4 77
EXPLORER 125 235 4 55
MEM absolute-
PID Private Shared sqlservr 153 810,029,056 0 disp+work 257 94,199,808 0 disp+work 267 72,314,880 0 disp+work 270 36,696,064 0 disp+work 274 17,391,616 0 disp+work 259 14,159,872 0 disp+work 330 12,705,792 0 disp+work 305 12,636,160 0 disp+work 290 12,021,760 0 disp+work 323 11,522,048 0
MEM group-
Private Shared sqlservr 810,029,056 disp+work 321,236,992 mmc 9,789,440 0 sqlagent 2,371,584 0 gwrd 1,695,744 0 perfwcol 1,634,304 0 msg_server 1,486,848 0
SERVICES 1,454,080 0 saposcol 1,409,024 0 sqlmangr 1,409,024 0
IO absolute-
PID IO
IO group-
IO
--Timestamp (Record 10)
Thu Jan 18 08:09:00 2001
—VMstat
Processors: 2
Interval: 3,586,380,300 (us) usr: 8.60% sys: 1.82% idle: 179.90% queue: 0 freepte: 39300 pi: 29.95/s po : 23.63/s
—IOstat
Dk Signatur Read(B) Write(B) Read(t) Write(t) Idle(t) R(c) W(c) Q S
0 35CB34EC 70,411,776 35,246,080 11042.64 5426.62 0.00 16524 6567 0 0 16,469,277,106
1 CE2EE96C 368,412,672 347,737,088 34606.29 11056.46 0.00 87938 9086 0 0 45,662,754,261
2 CE1F98E1 9,248,825,344 44,646,400 94416.82 150154.96 0.00 12013 4177 0 0 244,571,785,623
3 CE2EE96D 16,384 1,650,688 2.42 55.89 3541.14 2 25 0 0 3,599,468,750
—FSstat
MountPoint PhysDrv Type Free (KB) Total (KB) %free
VolNum
C:\pagefile.sys PagingFile 116 2,048 5.66%
631AF993 2
D:\pagefile.sys PagingFile 2,444,208 3,072,000 79.56%
B5D97A0C 3000
H:\pagefile.sys PagingFile 3,860 184,320 2.09%
00D37BF0 180
C:\ NTFS 1,099,109 4,192,492 26.22%
B8BA90A3 0
D:\ NTFS 119,964 12,129,484 0.99%
6470EC21 0
E:\ NTFS 3,130,168 35,540,408 8.81%
E8E20702 0
F:\ NTFS 559,972 1,198,060 46.74%
A891FC02 0
H:\ NTFS 21,314 208,025 10.25%
5CDA4CC9 0
--Network
T S MTU Speed RX-Kb/s TX-Kb/s RXe TXe Interface
24 1 1500 10000000 0 0 0 0 MS TCP Loopback interface [None] 3,599,453,125
6 1 1500 10000000 1 12 0 0 AMD PCNET Family Ethernet
Adapter [0004AC4CE34D] 3,599,453,125
—TopTen
Interval: 3,599,468,750 (us) ActiveCPUs: 2 TotalProcesses: 50 TotalThreads: 330 TotalIO: 0 TotalHandles: 8528
CPU absolute-
PID User Sys Thrds Hlds
Idle 0 0 6,452,203 2 0 disp+work 290 228,594 12,406 4 171 sqlservr 153 119,328 24,843 43 965 disp+work 267 61,094 8,750 4 287 disp+work 257 58,015 9,047 4 665 disp+work 299 29,562 3,297 4 179 disp+work 274 28,922 2,656 4 160 disp+work 259 26,578 4,000 4 188 disp+work 270 23,922 4,969 4 176
System 2 0 19,313 35 1047
CPU group-
User Sys Thrds Hlds
Idle 0 6,452,203 2 0 disp+work 493,795 51,533 82 3935 sqlservr 119,328 24,843 43 965
System 0 19,313 35 1047 perfwcol 1,281 4,938 2 71 inetinfo 2,328 1,968 21 342 saposcol 265 391 4 77
SERVICES 47 203 20 285 sqlagent 140 32 9 98 msdtc 0 125 21 110
MEM absolute-
PID Private Shared sqlservr 153 740,327,424 disp+work 257 138,932,224 disp+work 267 122,298,368 disp+work 270 54,812,672 disp+work 259 54,771,712 disp+work 274 27,979,776 disp+work 323 13,533,184 disp+work 305 13,168,640 disp+work 330 12,632,064 disp+work 290 11,563,008
MEM group-
Private Shared sqlservr 740,327,424 0 disp+work 496,574,464 0 mmc 9,650,176 0 sqlagent 2,371,584 0 gwrd 1,613,824 0 perfwcol 1,609,728 0
SERVICES 1,519,616 0 msg_server 1,499,136 0 sqlmangr 1,409,024 0 msdtc 1,363,968 0
IO absolute-
PID IO
IO group-
IO
— Timestamp (Record 11)
Thu Jan 18 09:09:00 2001
— VMstat
Processors: 2
Interval: 3,613,662,000 (us) usr: 11.77% sys: 1.96% idle: 171.76% queue: 0 freepte: 39496 pi: 24.67/s po : 23.22/s
— IOstat
Dk Signatur Read(B) Write(B) Read(t) Write(t) Idle(t) R(c) W(c) Q S
0 35CB34EC 59,459,072 30,252,032 8504.62 4400.66 0.00 13568 6139 0 0 12,905,292,039
1 CE2EE96C 331,708,928 360,704,512 25091.91 11899.00 0.00 71749 11253 0 0 36,990,920,067
2 CE1F98E1 2,185,641,984 157,745,152 112235.35 144613.04 0.00 15216 5819 0 0 256,848,398,691
3 CE2EE96D 90,112 2,018,816 12.58 81.99 3505.34 11 49 0 0 3,599,921,875
—FSstat
MountPoint PhysDrv Type Free (KB) Total (KB) %free
VolNum
C:\pagefile.sys PagingFile 120 2,048 5.86%
631AF993 2
D:\pagefile.sys PagingFile 2,425,572 3,072,000 78.96%
B5D97A0C 3000
H:\pagefile.sys PagingFile 2,628 184,320 1.43%
00D37BF0 180
C:\ NTFS 1,099,109 4,192,492 26.22%
B8BA90A3 0
D:\ NTFS 116, 28 12,129,484 0.96%
E:\ 2 NTFS 3,027,768 35,540,408 8.52%
E8E20702 0
F:\ 3 NTFS 559,972 1,198,060 46.74%
A891FC02 0
H:\ 0 NTFS 21,314 208,025 10.25%
5CDA4CC9 0
--Network
T S MTU Speed RX-Kb/s TX-Kb/s RXe TXe Interface
24 1 1500 10000000 0 0 0 0 MS TCP Loopback interface [None] 3,599,937,500
6 1 1500 10000000 1 17 0 0 AMD PCNET Family Ethernet Adapter [0004AC4CE34D] 3,599,937,500
—TopTen
Interval: 3,599,937,500 (us)
ActiveCPUs: 2
TotalProcesses: 50
TotalThreads: 328
TotalIO: 0
TotalHandles: 8535
Figure imgf000253_0001
disp+work [values illegible in source] sqlservr 153 165,703 28,782 43 959 disp+work 257 132,188 15,734 4 669 disp+work 267 126,594 15,063 4 287 disp+work 274 44,812 4,438 4 160 disp+work 270 35,062 5,422 4 176 disp+work 299 29,094 3,156 4 179 disp+work 305 23,594 3,031 4 159 System 2 0 25,265 35 1054
CPU group-
User Sys Thrds Hlds
Idle 0 6,207,047 2 0 disp+work 681,220 69,231 82 3948 sqlservr 165,703 28,782 43 959
System 0 25,265 35 1054 perfwcol 1,141 4,812 2 71 inetinfo 1,625 1,563 21 342 saposcol 344 531 4 77 sqlagent 563 78 93
SPOOLSS 78 172 7
SERVICES 63 156 20 86
MEM absolute-
PID Private Shared sqlservr 153 09,84 9,088 disp+work 257 52,59 2,384 disp+work 267 37,37 5,744 disp+work 270 7,816 ,704 0 disp+work 259 4,985 , 600 0 disp+work 274 6,535 ,552 0 disp+work 323 6,441 ,344 0 disp+work 305 3,455 ,360 0 disp+work 330 3,398 ,016 0 disp+work 290 2,525 ,568 0
MEM group
Private Shared sqlservr 709,849,088 disp+work 476,282,880 mmc 9,789,440 sqlagent 2,265,088 0 gwrd 2,105,344 0 perfwcol 1,654,784 0 msg_server 1,470,464 0 saposcol 1,409,024 0 sqlmangr 1,409,024 0 SERVICES 1,372,160 0
IO absolute-
PID IO
IO group-
IO
—Timestamp (Record 12)
Thu Jan 18 10:09:00 2001
—VMstat
Processors: 2
Interval: 3,600,061,800 (us) usr: 8.29% sys: 1.42% idle: 180.56% queue: 0 freepte: 39590 pi: 9.80/s po: 17.81/s
—IOstat
Dk Signatur Read(B) Write(B) Read(t) Write(t) Idle(t) R(c) W(c) Q S
0 35CB34EC 20,634,112 22,803,968 2816.46 3143.95 0.00 4597 4794 0 0 5,960,423,308
1 CE2EE96C 337,077,248 272,611,840 9950.42 8343.35 0.00 29598 7932 0 0 18,293,783,366
2 CE1F98E1 15,704,064 45,809,664 3286.01 112136.56 0.00 189Ϊ 4567 0 0 115,422,586,516
3 CE2EE96D 180,224 1,769,472 8.47 65.86 3525.72 8 28 0 0 3,600,062,500
--FSstat
MountPoint PhysDrv Type Free (KB) Total (KB) %free VolNum
C:\pagefile.sys PagingFile 128 2,048 6.25% 631AF993 2
D:\pagefile.sys PagingFile 2,429,128 3,072,000 79.07% B5D97A0C 3000
H:\pagefile.sys PagingFile 3,084 184,320 1.67% 00D37BF0 180
C:\ 0 NTFS 1,099,109 4,192,492 26.22% B8BA90A3 0
D:\ 1 NTFS 116,800 12,129,484 0.96% 6470EC21 0
E:\ 2 NTFS 3,027,768 35,540,408 8.52% E8E20702 0
F:\ 3 NTFS 559,972 1,198,060 46.74% A891FC02 0
H:\ 0 NTFS 21,314 208,025 10.25% 5CDA4CC9 0
Figure imgf000254_0001
T S MTU Speed RX-Kb/s TX-Kb/s RXe TXe Interface
24 1 1500 10000000 0 0 0 0 MS TCP Loopback interface [None] 3,600,046,875
6 1 1500 10000000 1 11 0 0 AMD PCNET Family Ethernet
Adapter [0004AC4CE34D] 3,600,046,875
— TopTen
Interval: 3,600,031,250 (us)
[ActiveCPUs, TotalProcesses, TotalThreads and TotalIO values illegible in source]
TotalHandles: 8537
CPU absolute-
PID User Sys Thrds Hlds
Idle 0 0 6,500,453 2 0 disp+work 290 222,797 12,516 4 171 sqlservr 153 112,687 19,203 43 963 disp+work 257 68,359 11,453 4 673 disp+work 267 67,500 9,359 4 287 disp+work 270 33,516 5,594 4 176 disp+work 299 29,156 3,469 4 179 disp+work 274 19,688 2,281 4 160 System 2 0 20,391 37 1050 disp+work 305 10,375 1,188 4 159
CPU group-
User Sys Thrds Hlds
Idle 0 6,500,453 2 0 disp+work 479,439 51,296 82 3950 sqlservr 112,687 19,203 43 963
System 0 20,391 37 1050 perfwcol 1,203 4,485 2 71 inetinfo 2,172 2,078 21 342 mmc 344 656 4 81 sqlagent 609 156 93 saposcol 235 422 77
SPOOLSS 62 172 148
MEM absolute-
PID Private Shared sqlservr 153 717,778,944 0 disp+work 257 135,245,824 0 disp+work 267 126,996,480 0 disp+work 270 55,980,032 disp+work 259 15,212,544 disp+work 274 14,708,736 disp+work 330 11,984,896 disp+work 323 11,640,832 mmc 172 9,797,632 disp+work 290 9,711,616
MEM group
Private Shared sqlservr 717,778,944
Figure imgf000255_0001
sqlagent 2,265,088 0 perfwcol 1,609,728 0 msg_server 1,462,272 0 SERVICES 1,458,176 sqlmangr 1,409,024
EXPLORER 1,372,160 saposcol 1,359,872
IO absolute-
PID IO
IO group-
IO
—Timestamp (Record 13)
Thu Jan 18 11:09:00 2001
—VMstat
Processors: 2
Interval: 3,600,024,100 (us) usr: 6.28% sys: 0.92% idle: 185.56% queue: 0 freepte: 39590 pi: 5.04/s po: 8.53/s
—IOstat
Dk Signatur Read(B) Write(B) Read(t) Write(t) Idle(t) R(c) W(c) Q S
0 35CB34EC 12,894,720 12,226,560 1602.43 1716.77 280.73 2923 2601 0 0 3,599,937,500
1 CE2EE96C 71,990,272 136,085,504 4393.05 3789.30 0.00 14480 4034 0 0 8,182,350,902
2 CE1F98E1 8,314,880 25,313,280 1148.37 59131.20 0.00 928 2481 0 0 60,279,582,917
3 CE2EE96D 32,768 716,800 3.24 18.22 3578.40 4 14 0 0 3,599,875,000
—FSstat
MountPoint PhysDrv Type Free (KB) Total (KB) %free
VolNum
C:\pagefile.sys PagingFile 116 2,048 5.66%
631AF993 2
D:\pagefile.sys PagingFile 2,413,524 3,072,000 78.57%
B5D97A0C 3000
H:\pagefile.sys PagingFile 2,332 184,320 1.27%
00D37BF0 180
C:\ NTFS 1,099,109 4,192,492 26.22%
B8BA90A3 0
D:\ NTFS 116,768 12,129,484 0.96%
6470EC21 0
E:\ NTFS 3,027,768 35,540,408 8.52%
E8E20702 0
F:\ NTFS 559,972 1,198,060 46.74%
A891FC02 0
H:\ NTFS 21,314 208,025 10.25%
5CDA4CC9 0
--Network
T S MTU Speed RX-Kb/s TX-Kb/s RXe TXe Interface
24 1 1500 10000000 0 0 0 0 MS TCP Loopback interface [None] 3,599,875,000
6 1 1500 10000000 0 3 0 0 AMD PCNET Family Ethernet
Adapter [0004AC4CE34D] 3,599,875,000
— TopTen
Interval: 3,599,906,250 (us)
ActiveCPUs: 2
TotalProcesses: 50
TotalThreads: 331
TotalIO: 0
TotalHandles: 8536
CPU absolute-
PID User Sys Thrds Hlds
Idle 0 0 6,680,453 2 0 disp+work 290 239,344 14,609 4 171 sqlservr 153 90,063 15,297 43 965 disp+work 299 29,547 2,984 4 179 disp+work 270 23,234 3,890 4 176 disp+work 257 23,750 2,860 4 677 disp+work 267 12,328 1,859 4 287
System 2 0 12,797 37 1047 disp+work 323 8,766 875 4 166 disp+work 296 6,750 641 4 167
CPU group-
User Sys Thrds Hlds
Idle 0 6,680,453 2 0 disp+work 358,045 30,266 82 3948 sqlservr 90,063 15,297 43 965
System 0 12,797 37 1047 perfwcol 1,156 4,515 2 71 inetinfo 1,953 1,891 21 342 sqlagent 1,047 63 8 93 saposcol 234 156 4 77
SERVICES 47 94 20 286 gwrd 32 94 6 164
MEM absolute-
PID Private Shared sqlservr 153 724,701,184 disp+work 257 130,547,712 disp+work 267 79,089,664 disp+work 270 32,002,048 disp+work 290 24,842,240 disp+work 323 13,447,168 disp+work 330 12,316,672 mmc 172 9,687,040 disp+work 296 9,310,208 disp+work 254 7,954,432
MEM group
Private Shared sqlservr 724,701,184 disp+work 353,329,152 mmc 9,687,040 sqlagent 2,265,088 gwrd 2,146,304 perfwcol 1,662,976 msg_server 1,527,808
SERVICES 1,490,944 saposcol 1,413,120 sqlmangr 1,409,024
IO absolute-
PID IO
IO group-
IO
—Timestamp (Record 14)
Thu Jan 18 12:09:00 2001
—VMstat
Processors: 2
Interval: 3,586,080,900 (us) usr: 5.34% sys: 0.65% idle: 188.78% queue: 0 freepte: 39590 pi: 3.85/s po : 4.39/s
—IOstat
Dk Signatur Read(B) Write(B) Read(t) Write(t) Idle(t) R(c) W(c) Q S
0 35CB34EC 9,034,752 7,531,008 1071.82 1030.60 1497.57 2069 1563 0 0 3,600,000,000
1 CE2EE96C 58,501,120 76,995,072 3152.20 2286.69 0.00 11464 2474 0 0 5,438,907,089
2 CE1F98E1 3,284,992 16,384 270.46 1.94 3327.59 389 2 0 0 3,600,000,000
3 CE2EE96D 0 552,960 0.00 10.10 3589.89 0 9 0 0 3,600,000,000
—FSstat
MountPoint PhysDrv Type Free (KB) Total (KB) %free VolNum
C:\pagefile.sys PagingFile 120 2,048 5.86% 631AF993 2
D:\pagefile.sys PagingFile 2,415,724 3,072,000 78.64% B5D97A0C 3000
H:\pagefile.sys PagingFile 2,476 184,320 1.34% 00D37BF0 180
C:\ 0 NTFS 1,099,109 4,192,492 26.22% B8BA90A3 0
D:\ 1 NTFS 116,588 12,129,484 0.96% 6470EC21 0
E:\ 2 NTFS 3,027,768 35,540,408 8.52% E8E20702 0
F:\ 3 NTFS 559,972 1,198,060 46.74% A891FC02 0
H:\ 0 NTFS 21,314 208,025 10.25% 5CDA4CC9 0
--Network
T S MTU Speed RX-Kb/s TX-Kb/s RXe TXe Interface
24 1 1500 10000000 0 0 0 0 MS TCP Loopback interface [None] 3,599,953,125
6 1 1500 10000000 0 1 0 0 AMD PCNET Family Ethernet Adapter [0004AC4CE34D] 3,599,953,125
—TopTen
Interval: 3,599,984,375 (us)
ActiveCPUs: 2 TotalProcesses: 50 TotalThreads: 332 TotalIO: 0 TotalHandles: 8548
CPU absolute-
PID User Sys Thrds Hlds
Idle 0 0 6,770,094 2 0 disp+work 290 226,062 10,297 4 171 sqlservr 153 77,891 12,078 43 967 disp+work 299 29,625 2,906 4 179 disp+work 257 9,891 1,468 4 679 disp+work 296 10,312 547 4 167 disp+work 323 8,015 657 4 166
System 2 0 7,594 37 1048 disp+work 267 5,328 532 4 289 perfwcol 179 1,000 4,547 2 71
CPU group-
User Sys Thrds Hlds
Idle 0 6,770,094 2 0 disp+work 300,343 18,250 82 3954 sqlservr 77,891 12,078 43 967
System 0 7,594 37 1048 perfwcol 1,000 4,547 2 71 inetinfo 2,797 2,672 21 342 sqlagent 891 171 8 93 saposcol 78 219 4 77
SERVICES 31 234 21 288
LSASS 16 32 12 102
MEM absolute-
PID Private Shared sqlservr 153 724,893,696 0 disp+work 257 128,856,064 0 disp+work 267 86,138,880 0 disp+work 270 32,739,328 0 disp+work 290 17,399,808 0 disp+work 274 15,421,440 0 disp+work 323 15,020,032 0 disp+work 330 13,037,568 0 disp+work 296 12,021,760 0 mmc 172 9,687,040 0
MEM group-
Private Shared sqlservr 724,893,696 0 disp+work 372,846,592 0 mmc 9,687,040 0 sqlagent 2,265,088 0 perfwcol 1,662,976 0 msg_server 1,609,728 0
SERVICES 1,490,944 0 saposcol 1,413,120 0
EXPLORER 1,409,024 0 sqlmangr 1,409,024 0
IO absolute-
PID IO
IO group-
IO
—Timestamp (Record 15)
Thu Jan 18 13:09:00 2001
—VMstat
Processors: 2
Interval: 3,613,994,400 (us) usr: 8.93% sys: 1.32% idle: 178.70% queue: 0 freepte: 39590 pi: 16.09/s po : 18.13/s
—IOstat
Dk Signatur Read(B) Write(B) Read(t) Write(t) Idle(t) R(c) W(c) Q S
0 35CB34EC 37,955,584 22,844,928 5408.66 3364.65 0.00 8849 5009 0 0 8,773,327,175
1 CE2EE96C 215,400,960 278,340,608 16070.45 7988.35 0.00 47311 7769 0 0 24,058,812,208
2 CE1F98E1 21,651,456 45,572,096 2974.15 99174.99 0.00 2266 4365 0 0 102,149,157,329
3 CE2EE96D 32,768 1,516,544 5.02 60.44 3534.61 4 23 0 0 3,600,078,125
--FSstat
MountPoint PhysDrv Type Free (KB) Total (KB) %free
VolNum
C:\pagefile.sys PagingFile 128 2,048 6.25%
631AF993 2
D:\pagefile.sys PagingFile 2,406,600 3,072,000 78.34%
B5D97A0C 3000
H:\pagefile.sys PagingFile 1,784 184,320 0.97%
00D37BF0 180
C:\ 0 NTFS 1,099,109 4,192,492 26.22%
B8BA90A3 0
D:\ 1 NTFS 116,372 12,129,484 0.96%
6470EC21 0
E:\ 2 NTFS 3,027,768 35,540,408 8.52%
E8E20702 0
F:\ 3 NTFS 559,972 1,198,060 46.74%
A891FC02 0
H:\ 0 NTFS 21,314 208,025 10.25%
5CDA4CC9 0
--Network
T S MTU Speed RX-Kb/s TX-Kb/s RXe TXe Interface
24 1 1500 10000000 0 0 0 0 MS TCP Loopback interface [None] 3,600,140,625
6 1 1500 10000000 1 12 0 0 AMD PCNET Family Ethernet Adapter [0004AC4CE34D] 3,600,140,625
--TopTen
Interval: 3,600,093,750 (us)
ActiveCPUs: 2
TotalProcesses: 50
TotalThreads: 328
TotalIO: 0
TotalHandles: 8558
CPU absolute-
PID User Sys Thrds Hlds
Idle 0 0 6,458,344 2 0 disp+work 290 223,516 13,625 4 171 sqlservr 153 123,515 20,469 43 969 disp+work 267 115,656 10,609 4 289 disp+work 257 68,765 7,750 4 683 disp+work 299 29,797 3,172 4 179 disp+work 274 19,687 2,125 4 160
System 2 0 17,609 35 1051 disp+work 305 14,547 1,610 4 159 disp+work 270 11,328 1,875 4 176
CPU group-
User Sys Thrds Hlds
Idle 0 6,458,344 2 0 disp+work 517,623 47,124 82 3964 sqlservr 123,515 20,469 43 969
System 0 17,609 35 1051 perfwcol 1,422 4,422 2 71 inetinfo 1,672 1,703 21 342 sqlagent 828 94 8 93 saposcol 266 375 4 77
SERVICES 109 125 20 286 msdtc 62 94 21 110
MEM absolute-
PID Private Shared sqlservr 153 728,489,984 0 disp+work 257 143,519,744 0 disp+work 267 103,096,320 0 disp+work 323 51,081,216 0 disp+work 270 35,426,304 0 disp+work 274 15,572,992 0 disp+work 290 14,798,848 0 disp+work 259 12,541,952 0 disp+work 330 12,230,656 0 disp+work 296 11,108,352 0
MEM group-
Private Shared sqlservr 728,489,984 0 disp+work 460,496,896 0 mmc 9,682,944 0 gwrd 2,924,544 0 sqlagent 2,265,088 0 perfwcol 1,658,880 0 msg_server 1,544,192 0 saposcol 1,409,024 0 sqlmangr 1,409,024 0
SERVICES 1,392,640 0
IO absolute-
PID IO
IO group-
IO
—Timestamp (Record 16)
Thu Jan 18 14:09:00 2001
—VMstat
Processors: 2
Interval: 3,600,042,000 (us) usr: 8.96% sys: 1.52% idle: 179.01% queue: 0 freepte: 39590 pi: 23.73/s po: 20.58/s
—IOstat
Dk Signatur Read(B) Write(B) Read(t) Write(t) Idle(t) R(c) W(c) Q S
0 35CB34EC 55,858,176 26,415,104 8099.28 4145.70 0.00 12707 5881 0 0 12,244,996,795
1 CE2EE96C 309,177,856 314,481,664 21444.06 9355.81 0.00 69985 9156 0 0 30,799,881,911
2 CE1F98E1 52,355,072 26,378,240 8915.40 79102.25 0.00 4900 2842 0 0 88,017,652,084
3 CE2EE96D 65,536 21,647,360 10.91 20125.52 0.00 8 386 0 0 20,136,439,648
—FSstat
MountPoint PhysDrv Type Free (KB) Total (KB) %free
VolNum
C:\pagefile.sys PagingFile 120 2,048 5.86%
631AF993 2
D:\pagefile.sys PagingFile 2,391,916 3,072,000 77.86%
B5D97A0C 3000
H:\pagefile.sys PagingFile 1,088 184,320 0.59%
00D37BF0 180
C:\ NTFS 1,099,012 4,192,492 26.21%
B8BA90A3 0
D:\ NTFS 116,296 12,129,484 0.96%
6470EC21 0
E:\ NTFS 3,027,768 35,540,408 8.52%
E8E20702 0
F:\ NTFS 559,972 1,198,060 46.74%
A891FC02 0
H:\ NTFS 21,314 208,025 10.25%
5CDA4CC9 0
--Network
T S MTU Speed RX-Kb/s TX-Kb/s RXe TXe Interface
24 1 1500 10000000 0 0 0 0 MS TCP Loopback interface [None] 3,599,984,375
6 1 1500 10000000 1 12 0 0 AMD PCNET Family Ethernet
Adapter [0004AC4CE34D] 3,599,984,375
—TopTen
Interval: 3,599,984,375 (us) ActiveCPUs: 2 TotalProcesses: 50 TotalThreads: 328 TotalIO: 0 TotalHandles: 8565
CPU absolute-
PID User Sys Thrds Hlds
Idle 0 0 6,444,500 2 0 disp+work 290 224,172 12,547 4 171 sqlservr 153 127,750 21,781 43 971 disp+work 257 80,891 11,266 4 687 disp+work 267 65,750 10,891 4 289 disp+work 299 29,375 3,359 4 179 disp+work 270 28,594 2,953 4 176
System 2 0 21,406 35 1055 disp+work 323 18,547 2,797 4 166 disp+work 274 18,954 2,015 4 160
CPU group-
User Sys Thrds Hlds
Idle 0 6,444,500 2 0 disp+work 513,956 53,704 82 3965 sqlservr 127,750 21,781 43 971
System 0 21,406 35 1055 perfwcol 938 4,687 2 71 inetinfo 1,656 1,672 21 342 sqlagent 578 141 8 93 saposcol 187 422 4 77
SPOOLSS 235 157 7 148
SERVICES 63 282 20 286
MEM absolute-
PID Private Shared sqlservr 153 704,569,344 0 disp+work 257 135,393,280 0 disp+work 267 125,841,408 0 disp+work 270 46,485,504 0 disp+work 323 38,518,784 0 disp+work 330 26,054,656 0 disp+work 259 22,220,800 0 disp+work 274 21,331,968 0 disp+work 305 15,630,336 0 disp+work 290 13,352,960 0
MEM group-
Private Shared sqlservr 704,569,344 0 disp+work 491,585,536 0 mmc 9,687,040 0 sqlagent 2,240,512 0 perfwcol 1,654,784 0 gwrd 1,634,304 0 msg_server 1,568,768 0
SERVICES 1,425,408 0 saposcol 1,409,024 0 sqlmangr 1,409,024 0
IO absolute-
PID IO
IO group-
IO
— Timestamp (Record 17)
Thu Jan 18 15:09:00 2001
— VMstat
Processors: 2
Interval: 3,585,866,200 (us) usr: 6.63% sys: 0.98% idle: 185.5 queue: 0 freepte: 39590 pi: 10.04/s po: 12.39/s
— IOstat
Dk Signatur Read(B) Write(B) Read(t) Write(t) Idle(t) R(c) W(c) Q S
0 35CB34EC 28,039,168 20,866,048 3658.82 2588.25 0.00 5852 3703 0 0 6,247,077,672
1 CE2EE96C 136,020,992 189,797,376 9274.91 5120.55 0.00 27710 5363 0 0 14,395,475,520
2 CE1F98E1 19,988,480 21,749,760 6268.15 49485.03 0.00 1854 1936 0 0 55,753,191,085
3 CE2EE96D 3,022,848 4,022,272 33.22 188.54 3378.23 67 68 0 0 3,600,000,000
--FSstat
MountPoint PhysDrv Type Free (KB) Total (KB) %free VolNum
C:\pagefile.sys PagingFile 116 2,048 5.66% 631AF993 2
D:\pagefile.sys PagingFile 2,394,984 3,072,000 77.96% B5D97A0C 3000
H:\pagefile.sys PagingFile 1,912 184,320 1.04% 00D37BF0 180
C:\ 0 NTFS 1,099,001 4,192,492 26.21% B8BA90A3 0
D:\ 1 NTFS 116,172 12,129,484 0.96% 6470EC21 0
E:\ 2 NTFS 3,027,768 35,540,408 8.52% E8E20702 0
F:\ 3 NTFS 559,972 1,198,060 46.74% A891FC02 0
H:\ 0 NTFS 21,314 208,025 10.25% 5CDA4CC9 0
--Network
T S MTU Speed RX-Kb/s TX-Kb/s RXe TXe Interface
24 1 1500 10000000 0 0 0 0 MS TCP Loopback interface [None] 3,599,984,375
6 1 1500 10000000 [counters illegible in source] AMD PCNET Family Ethernet Adapter [0004AC4CE34D] 3,599,984,375
--TopTen
Interval: 3,599,968,750 (us)
ActiveCPUs: 2
TotalProcesses: 51
TotalThreads: 329
TotalIO: 0
TotalHandles: 8651
CPU absolute
PID User Sys Thrds Hlds
Idle 0 0 6,653,547 2 0 disp+work 290 223,406 12,265 4 171 sqlservr 153 97,813 15,515 43 958 disp+work 299 29,437 3,032 4 179 disp+work 267 26,969 3,875 4 289
System 2 0 13,031 35 1050 disp+work 270 10,281 1,422 4 176 disp+work 323 8,000 1,031 4 166 disp+work 274 7,828 875 4 160 disp+work 397 6,625 687 4 160
CPU group
User Sys Thrds Hlds
Idle 0 6,653,547 2 0 disp+work 334,624 27,046 82 3437 sqlservr 97,813 15,515 43 958
System 0 13,031 35 1050 perfwcol 1,046 4,454 2 71 inetinfo 1,735 1,625 21 342
DRWTSN32 328 703 1 622 sqlagent 734 94 8 93 saposcol 94 234 4 77
SERVICES 94 156 20 289
MEM absolute
PID Private Shared sqlservr 153 706,842,624 0 disp+work 267 108,457,984 0 disp+work 397 46,120,960 0 disp+work 270 40,566,784 0 disp+work 323 28,794,880 0 disp+work 290 13,889,536 0 disp+work 330 13,312,000 0 disp+work 274 10,153,984 0 disp+work 296 10,055,680 0 disp+work 259 9,711,616 0
MEM group
Private Shared sqlservr 706,842,624 0 disp+work 324,050,944 0 mmc 9,682,944 0 sqlagent 2,248,704 0 gwrd 2,146,304 0 perfwcol 1,654,784 0 msg_server 1,560,576 0 saposcol 1,409,024 0 sqlmangr 1,409,024 0 msdtc 1,372,160 0
IO absolute-
PID IO
IO group-
IO
—Timestamp (Record 18)
Thu Jan 18 16:09:00 2001
—VMstat
Processors: 2
Interval: 3,599,989,400 (us) usr: 8.89% sys: 1.89% idle: 178.43% queue: 0 freepte: 39590 pi: 9.79/s po: 16.62/s
—IOstat
Dk Signatur Read(B) Write(B) Read(t) Write(t) Idle(t) R(c) W(c) Q S
0 35CB34EC 21,124,608 24,628,736 2862.25 2929.93 0.00 4938 4518 0 0 5,792,192,835
1 CE2EE96C 133,763,072 252,774,912 9501.33 7129.33 0.00 29649 7568 0 0 16,630,666,634
2 CE1F98E1 21,651,456 32,374,784 7040.96 61060.04 0.00 2311 2684 0 0 68,101,004,980
3 CE2EE96D 1,392,640 2,314,240 6.75 90.70 3502.57 23 37 0 0 3,600,031,250
—FSstat
MountPoint PhysDrv Type Free (KB) Total (KB) %free
VolNum
C:\pagefile.sys PagingFile 124 2,048 6.05%
631AF993 2
D:\pagefile.sys PagingFile 2,379,708 3,072,000 77.46%
B5D97A0C 3000
H:\pagefile.sys PagingFile 1,700 184,320 0.92%
00D37BF0 180
C:\ 0 NTFS 1,099,001 4,192,492 26.21%
B8BA90A3 0
D:\ 1 NTFS 114,948 12,129,484 0.95%
6470EC21 0
E:\ 2 NTFS 3,027,768 35,540,408 8.52%
E8E20702 0
F:\ 3 NTFS 559,972 1,198,060 46.74%
A891FC02 0
H:\ 0 NTFS 21,314 208,025 10.25%
5CDA4CC9 0
--Network
T S MTU Speed RX-Kb/s TX-Kb/s RXe TXe Interface
24 1 1500 10000000 0 0 0 0 MS TCP Loopback interface [None] 3,600,109,375
6 1 1500 10000000 1 14 0 0 AMD PCNET Family Ethernet
Adapter [0004AC4CE34D] 3,600,109,375
—TopTen
Interval: 3,600,109,375 (us)
ActiveCPUs: 2
TotalProcesses: 51
TotalThreads: 332
TotalIO: 0
TotalHandles: 8665
CPU absolute
PID User Sys Thrds Hlds
Idle 0 0 6,423,578 2 0 disp+work 290 224,313 11,969 4 171 sqlservr 153 127,000 20,297 43 965 disp+work 267 85,485 20,812 4 291 disp+work 397 88,296 15,031 4 169
System 2 0 41,782 37 1048 disp+work 299 29,203 3,500 4 179 disp+work 270 27,516 4,828 4 176 disp+work 305 20,671 2,281 4 159 disp+work 274 12,140 1,797 4 160
CPU group
User Sys Thrds Hlds
Idle 0 6,423,578 2 0 disp+work 509,218 63,798 82 3448 sqlservr 127,000 20,297 43 965
System 0 41,782 37 1048 perfwcol 1,157 4,484 2 71 inetinfo 1,703 1,578 21 342 sqlagent 719 140 8 93 saposcol 156 266 4 77 gwrd 62 93 6 164
SERVICES 0 141 20 289
MEM absolute
PID Private Shared sqlservr 153 707,829,760 0 disp+work 397 122,122,240 0 disp+work 267 118,964,224 0 disp+work 270 96,399,360 0 disp+work 274 22,401,024 0 disp+work 305 17,231,872 0 disp+work 290 11,591,680 0 disp+work 330 10,891,264 0 disp+work 323 10,813,440 0 mmc 172 9,687,040 0
MEM group
Private Shared sqlservr 707,829,760 0 disp+work 447,066,112 0 mmc 9,687,040 0 sqlagent 2,244,608 0 perfwcol 1,658,880 0 msg_server 1,466,368 0
SERVICES 1,417,216 0 saposcol 1,413,120 0 sqlmangr 1,409,024 0 msdtc 1,372,160 0
IO absolute-
PID IO
IO group-
IO
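By way of illustration only, a VMstat line such as the one in Record 18 can be split into named numeric fields. The following is a minimal sketch in Python; the helper name parse_vmstat and the regular expression are assumptions made for this example and are not part of the logged system.

```python
import re

# Hypothetical helper: matches "name: value" pairs as they are printed on a
# VMstat line; trailing units such as "%", "/s" or "(us)" are ignored.
_FIELD = re.compile(r"(\w+):\s*([\d.,]+)")

def parse_vmstat(line):
    """Return each 'name: value' pair on a VMstat line as a float."""
    return {name: float(value.replace(",", ""))
            for name, value in _FIELD.findall(line)}

# VMstat line copied from Record 18.
sample = ("Interval: 3,599,989,400 (us) usr: 8.89% sys: 1.89% idle: 178.43% "
          "queue: 0 freepte: 39590 pi: 9.79/s po: 16.62/s")
stats = parse_vmstat(sample)
```

Stripping the thousands separators before conversion lets the same expression handle both the interval and the percentage fields.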
—Timestamp (Record 19)
Thu Jan 18 17:09:00 2001
—VMstat
Processors: 2
Interval: 3,599,952,100 (us) usr: 7.44% sys: 1.15% idle: 182.79% queue: 0 freepte: 39590 pi: 6.16/s po: 13.45/s
--IOstat
Dk Signatur Read(B) Write(B) Read(t) Write(t) Idle(t) R(c) W(c) Q S
0 35CB34EC 12,957,696 17,926,144 1745.79 2405.92 0.00 2926 3730 0 0 4,151,715,611
1 CE2EE96C 95,968,256 210,708,992 5793.25 6094.12 0.00 18579 6557 0 0 11,887,387,605
2 CE1F98E1 14,614,528 23,224,320 1111.76 45070.37 0.00 1753 2095 0 0 46,182,142,345
3 CE2EE96D 1,425,408 2,162,688 12.79 98.27 3488.82 27 38 0 0 3,599,890,625
—FSstat
MountPoint PhysDrv Type Free (KB) Total (KB) %free
VolNum
C:\pagefile.sys PagingFile 120 2,048 5.86%
631AF993 2
D:\pagefile.sys PagingFile 2,381,852 3,072,000 77.53%
B5D97A0C 3000
H:\pagefile.sys PagingFile 2,128 184,320 1.15%
00D37BF0 180
C:\ 0 NTFS 1,099,001 4,192,492 26.21%
B8BA90A3 0
D:\ 1 NTFS 115,116 12,129,484 0.95%
6470EC21 0
E:\ 2 NTFS 3,027,768 35,540,408 8.52%
E8E20702 0
F:\ 3 NTFS 559,972 1,198,060 46.74%
A891FC02 0
H:\ 0 NTFS 21,314 208,025 10.25%
5CDA4CC9 0
--Network
T S MTU Speed RX-Kb/s TX-Kb/s RXe TXe Interface
24 1 1500 10000000 0 0 0 0 MS TCP Loopback interface [None] 3,599,812,500
6 1 1500 10000000 0 9 0 0 AMD PCNET Family Ethernet
Adapter [0004AC4CE34D] 3,599,812,500
—TopTen
Interval: 3,599,812,500 (us)
ActiveCPUs: 2
TotalProcesses: 51
TotalThreads: 332
TotalIO: 0
TotalHandles: 8636
CPU absolute
PID User Sys Thrds Hlds
Idle 0 0 6,580,625 2 0 disp+work 290 222,156 11,828 4 171 sqlservr 153 101,453 17,938 43 967 disp+work 397 63,579 7,703 4 173 disp+work 267 53,218 7,063 4 291 disp+work 299 28,813 3,375 4 179 disp+work 305 14,860 1,688 4 159
System 2 0 15,922 37 1037 disp+work 270 13,062 2,282 4 176 disp+work 274 13,157 2,000 4 160
CPU group
User Sys Thrds Hlds
Figure imgf000269_0001
disp+work 429,751 39,673 82 3426 sqlservr 101,453 17,938 43 967
System 0 15,922 37 1037 perfwcol 984 4,781 2 71 inetinfo 2,781 2,297 21 342 sqlagent 656 63 8 93 saposcol 235 203 4 77 msdtc 31 109 21 110
SERVICES 31 62 20 289
MEM absolute
PID Private Shared sqlservr 153 724,668,416 0 disp+work 267 108,761,088 0 disp+work 397 108,056,576 0 disp+work 290 15,904,768 0 disp+work 323 14,020,608 0 disp+work 270 12,587,008 0 disp+work 330 11,407,360 0 mmc 172 9,797,632 0 disp+work 356 8,867,840 0 disp+work 296 8,822,784 0
MEM group
Private Shared sqlservr 724,668,416 0 disp+work 334,340,096 0 mmc 9,797,632 0 sqlagent 2,252,800 0 perfwcol 1,658,880 0
SERVICES 1,622,016 0 msg_server 1,560,576 0 saposcol 1,413,120 0
EXPLORER 1,409,024 0 sqlmangr 1,409,024 0
IO absolute-
PID IO
IO group-
IO
--Timestamp (Record 20)
Thu Jan 18 18:09:00 2001
—VMstat
Processors: 2
Interval: 3,599,975,700 (us) usr: 4.95% sys: 0.59% idle: [illegible in source] queue: 0 freepte: 39590 pi: 2.10/s po: 2.16/s
—IOstat
Dk Signatur Read(B) Write(B) Read(t) Write(t) Idle(t) R(c) W(c) Q S
0 35CB34EC 6,087,680 5,171,712 581.13 742.66 2276.19 1454 1041 0 0 3,600,000,000
1 CE2EE96C 24,870,912 44,877,824 1889.18 1328.59 382.21 5722 1477 0 0 3,600,000,000
2 CE1F98E1 1,024,000 8,626,176 236.89 19600.60 0.00 125 942 0 0 19,837,493,552
3 CE2EE96D 0 552,960 0.00 10.87 3589.12 0 9 0 0 3,600,000,000
—FSstat
MountPoint PhysDrv Type Free (KB) Total (KB) %free
VolNum
C:\pagefile.sys PagingFile 120 2,048 5.86%
631AF993 2
D:\pagefile.sys PagingFile 2,386,460 3,072,000 77.68%
B5D97A0C 3000
H:\pagefile.sys PagingFile 2,480 184,320 1.35%
00D37BF0 180
C:\ 0 NTFS 1,099,001 4,192,492 26.21%
B8BA90A3 0
D:\ 1 NTFS 115,104 12,129,484 0.95%
6470EC21 0
E:\ 2 NTFS 3,027,768 35,540,408 8.52%
E8E20702 0
F:\ 3 NTFS 559,972 1,198,060 46.74%
A891FC02 0
H:\ 0 NTFS 21,314 208,025 10.25%
5CDA4CC9 0
--Network
T S MTU Speed RX-Kb/s TX-Kb/s RXe TXe Interface
24 1 1500 10000000 0 0 0 0 MS TCP Loopback interface [None] 3,599,812,500
6 1 1500 10000000 0 0 0 0 AMD PCNET Family Ethernet
Adapter [0004AC4CE34D] 3,599,812,500
—TopTen
Interval: 3,599,812,500 (us)
ActiveCPUs: 2
TotalProcesses: 51
TotalThreads: 331
TotalIO: 0
TotalHandles: 8628
CPU absolute
PID User Sys Thrds Hlds
Idle 0 0 6,799,906 2 0 disp+work 290 219,344 11,485 4 171 sqlservr 153 72,859 11,484 43 967 disp+work 299 30,172 2,906 4 179 disp+work 296 10,047 515 4 167 disp+work 323 8,094 907 4 166 disp+work 397 6,375 1,047 4 177
System 2 0 6,937 37 1033 perfwcol 179 1,188 4,328 2 71 disp+work 356 4,531 391 4 163
CPU group
User Sys Thrds Hlds
Idle 0 6,799,906 2 0 disp+work 280,220 17,734 82 3426 sqlservr 72,859 11,484 43 967
System 0 6,937 37 1033 perfwcol 1,188 4,328 2 71 inetinfo 1,672 1,656 21 342 sqlagent 672 78 8 93
SERVICES 78 141 20 289 saposcol 109 94 4 77 msdtc 63 47 21 110
MEM absolute
PID Private Shared sqlservr 153 727,207,936 0 disp+work 397 94,773,248 0 disp+work 267 76,574,720 0 disp+work 290 16,416,768 0 disp+work 323 14,204,928 0 disp+work 296 13,021,184 0 disp+work 330 12,775,424 0 disp+work 270 9,945,088 0 mmc 172 9,797,632 0 disp+work 254 7,987,200 0
MEM group
Private Shared sqlservr 727,207,936 0 disp+work 281,296,896 0 mmc 9,797,632 0 sqlagent 2,252,800 0 gwrd 2,158,592 0 perfwcol 1,662,976 0 msg_server 1,560,576 0
SERVICES 1,458,176 0
EXPLORER 1,409,024 0 sqlmangr 1,409,024 0
IO absolute-
PID IO
IO group-
IO
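The %free column of the FSstat sections appears to be Free (KB) divided by Total (KB), expressed as a percentage rounded to two decimals. As a hedged illustration, the Python sketch below recomputes that figure for the FSstat rows of Record 20; the function name percent_free and the rounding rule are assumptions made for this example rather than a description of the reporting agent.

```python
# FSstat rows copied from Record 20:
# mount point -> (Free (KB), Total (KB), logged %free).
FSSTAT = {
    "C:\\pagefile.sys": (120, 2_048, 5.86),
    "D:\\pagefile.sys": (2_386_460, 3_072_000, 77.68),
    "H:\\pagefile.sys": (2_480, 184_320, 1.35),
    "C:\\": (1_099_001, 4_192_492, 26.21),
    "D:\\": (115_104, 12_129_484, 0.95),
    "E:\\": (3_027_768, 35_540_408, 8.52),
    "F:\\": (559_972, 1_198_060, 46.74),
    "H:\\": (21_314, 208_025, 10.25),
}

def percent_free(free_kb, total_kb):
    """Free space as a percentage of total space, to two decimals (assumed)."""
    return round(100.0 * free_kb / total_kb, 2)

# Largest gap between the recomputed and the logged percentage.
worst = max(abs(percent_free(free, total) - logged)
            for free, total, logged in FSSTAT.values())
```

A check of this kind is also useful for recovering a digit lost in the Free (KB) column when the logged percentage is still legible.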
—Timestamp (Record 21)
Thu Jan 18 19:09:00 2001
—VMstat
Processors: 2
Interval: 3,614,433,700 (us) usr: 5.42% sys: 0.57% idle: 187.20% queue: 0 freepte: 39590 pi: 0.87/s po : 1.88/s
--IOstat
Dk Signatur Read(B) Write(B) Read(t) Write(t) Idle(t) R(c) W(c) Q S
0 35CB34EC 1,601,536 4,437,504 234.61 682.54 2682.83 390 979 0 0 3,600,000,000
1 CE2EE96C 11,269,120 39,311,360 1013.35 1276.91 1309.72 2689 1391 0 0 3,600,000,000
2 CE1F98E1 28,762,112 14,426,112 26853.70 9844.53 0.00 2039 810 0 0 36,698,244,965
3 CE2EE96D 65,536 618,496 5.49 20.10 3574.40 8 18 0 0 3,600,000,000
--FSstat
MountPoint PhysDrv Type Free (KB) Total (KB) %free
VolNum
C:\pagefile.sys PagingFile 120 2,048 5.86%
631AF993 2
D:\pagefile.sys PagingFile 2,385,772 3,072,000 77.66%
B5D97A0C 3000
H:\pagefile.sys PagingFile 2,464 184,320 1.34%
00D37BF0 180
C:\ 0 NTFS 1,099,001 4,192,492 26.21%
B8BA90A3 0
D:\ 1 NTFS 115,024 12,129,484 0.95%
6470EC21 0
E:\ 2 NTFS 3,027,768 35,540,408 8.52%
E8E20702 0
F:\ 3 NTFS 559,972 1,198,060 46.74%
A891FC02 0
H:\ 0 NTFS 21,314 208,025 10.25%
5CDA4CC9 0
--Network
T S MTU Speed RX-Kb/s TX-Kb/s RXe TXe Interface
24 1 1500 10000000 0 0 0 0 MS TCP Loopback interface [None] 3,600,093,750
6 1 1500 10000000 0 0 0 0 AMD PCNET Family Ethernet
Adapter [0004AC4CE34D] 3,600,093,750
—TopTen
Interval: 3,600,093,750 (us)
ActiveCPUs: 2
TotalProcesses: 51
TotalThreads: 329
TotalIO: 0
TotalHandles: 8632
CPU absolute
PID User Sys Thrds Hlds
Idle 0 0 6,766,500 2 0 disp+work 290 223,468 9,953 4 171 sqlservr 153 85,844 12,266 43 967 disp+work 299 35,000 2,766 4 179 disp+work 296 17,750 828 4 167 disp+work 356 9,687 562 4 163 disp+work 323 8,015 1,062 4 166 disp+work 397 6,578 782 4 181
System 2 0 6,188 35 1033 perfwcol 179 1,421 4,203 2 71
CPU group
User Sys Thrds Hlds
Idle 0 6,766,500 2 0 disp+work 302,263 16,436 82 3430 sqlservr 85,844 12,266 43 967
System 0 6,188 35 1033 perfwcol 1,421 4,203 2 71 inetinfo 1,891 1,562 21 342 sqlagent 328 31 8 93 saposcol 109 203 4 77
SERVICES 47 156 20 289 msdtc 15 32 21 110
MEM absolute
PID Private Shared sqlservr 153 759,226,368 0 disp+work 397 85,409,792 0 disp+work 267 47,423,488 0 disp+work 290 16,449,536 0 disp+work 323 15,499,264 0 disp+work 330 12,947,456 0 disp+work 296 11,165,696 0 mmc 172 9,797,632 0 disp+work 356 9,502,720 0 disp+work 254 7,958,528 0
MEM group
Private Shared sqlservr 759,226,368 0 disp+work 241,762,304 0 mmc 9,797,632 0 sqlagent 2,252,800 0 gwrd 1,753,088 0 perfwcol 1,658,880 0 msg_server 1,560,576 0 saposcol 1,413,120 0 sqlmangr 1,409,024 0
EXPLORER 1,409,024 0
IO absolute
PID IO
IO group
IO
--Timestamp (Record 22)
Thu Jan 18 20:09:00 2001
--VMstat
Processors: 2
Interval: 3,585,495,500 (us) usr: 4.93% sys: 0.55% idle: 189.82% queue: 0 freepte: 39590 pi: 0.07/s po: 1.55/s
--IOstat
Dk Signatur Read(B) Write(B) Read(t) Write(t) Idle(t) R(c) W(c) Q S
0 35CB34EC 205,312 3,997,696 35.41 631.64 2932.93 43 900 0 0 3,600,000,000
1 CE2EE96C 884,736 33,489,408 114.03 1020.14 2465.82 214 1232 0 0 3,600,000,000
2 CE1F98E1 933,888 0 92.31 0.00 3507.68 114
Figure imgf000274_0001
3 CE2EE96D 8,192 561,152 1.35 13.25 3585.39 1 11 0 0 3,600,000,000
--FSstat
MountPoint PhysDrv Type Free (KB) Total (KB) %free
VolNum
C:\pagefile.sys PagingFile 120 2,048 5.86%
631AF993 2
D:\pagefile.sys PagingFile 2,382,948 3,072,000 77.57%
B5D97A0C 3000
H:\pagefile.sys PagingFile 2,248 184,320 1.22%
00D37BF0 180
C:\ 0 NTFS 1,099,001 4,192,492 26.21%
B8BA90A3 0
D:\ 1 NTFS 114,036 12,129,484 0.94%
6470EC21 0
E:\ 2 NTFS 3,027,768 35,540,408 8.52%
E8E20702 0
F:\ 3 NTFS 559,972 1,198,060 46.74%
A891FC02 0
H:\ 0 NTFS 21,314 208,025 10.25%
5CDA4CC9 0
--Network
T S MTU Speed RX-Kb/s TX-Kb/s RXe TXe Interface
24 1 1500 10000000 0 0 0 0 MS TCP Loopback interface [None] 3,599,890,625
6 1 1500 10000000 0 0 0 0 AMD PCNET Family Ethernet
Adapter [0004AC4CE34D] 3,599,890,625
--TopTen
Interval: 3,599,890,625 (us)
ActiveCPUs: 2
TotalProcesses: 51
TotalThreads: 330
TotalIO: 0
TotalHandles: 8638
CPU absolute
PID User Sys Thrds Hlds
Idle 0 0 6,806,062 2 0
disp+work 290 223,500 11,078 4 171
sqlservr 153 73,875 10,765 43 967
disp+work 299 28,859 2,609 4 179
disp+work 323 8,172 813 4 166
System 2 0 6,343 35 1033
disp+work 397 5,390 718 4 185
disp+work 296 5,359 407 4 167
perfwcol 179 1,375 4,219 2 71
disp+work 356 3,657 250 4 163
CPU group
User Sys Thrds Hlds
Idle 0 6,806,062 2 0
disp+work 276,375 16,281 82 3434
sqlservr 73,875 10,765 43 967
System 0 6,343 35 1033
perfwcol 1,375 4,219 2 71
inetinfo 1,687 1,782 21 342
sqlagent 360 47 8 93
saposcol 94 187 4 77
msg server 78 79 4 85
SERVICES 16 125 20 289
MEM absolute
PID Private Shared
sqlservr 153 757,809,152 0
disp+work 397 77,152,256 0
disp+work 267 35,467,264 0
disp+work 323 16,543,744 0
disp+work 290 13,807,616 0
disp+work 330 13,312,000 0
disp+work 296 11,780,096 0
mmc 172 9,805,824 0
disp+work 254 7,962,624 0
disp+work 362 7,024,640 0
MEM group
Private Shared
sqlservr 757,809,152 0
disp+work 213,364,736 0
mmc 9,805,824 0
sqlagent 2,252,800 0
perfwcol 1,662,976 0
msg server 1,581,056 0
SERVICES 1,433,600 0
saposcol 1,413,120 0
EXPLORER 1,409,024 0
sqlmangr 1,409,024 0
IO absolute
PID IO
IO group
IO
--Timestamp (Record 23)
Thu Jan 18 21:09:00 2001
--VMstat
Processors: 2
Interval: 3,614,577,600 (us) usr: 6.06% sys: 0.90% idle: 185.24% queue: 0 freepte: 39590 pi: 1.26/s po: 2.15/s
--IOstat
Dk Signatur Read(B) Write(B) Read(t) Write(t) Idle(t) R(c) W(c) Q S
0 35CB34EC 4,179,456 5,065,728 527.50 813.43 2259.07 921 1146 0 0 3,600,015,625
1 CE2EE96C 14,435,840 59,802,112 1516.88 1944.87 138.25 3360 2150 0 0 3,600,015,625
2 CE1F98E1 75,284,480 22,192,128 298151.60 54578.65 0.00 9017 1829 0 0 352,730,268,286
3 CE2EE96D 32,768 585,728 2.96 13.78 3583.26 4 13 0 0 3,600,015,625
--FSstat
MountPoint PhysDrv Type Free (KB) Total (KB) %free
VolNum
C:\pagefile.sys PagingFile 120 2,048 5.86%
631AF993 2
D:\pagefile.sys PagingFile 2,376,784 3,072,000 77.37%
B5D97A0C 3000
H:\pagefile.sys PagingFile 1,828 184,320 0.99%
00D37BF0 180
C:\ 0 NTFS 1,099,001 4,192,492 26.21%
B8BA90A3 0
D:\ 1 NTFS 113,100 12,129,484 0.93%
6470EC21 0
E:\ 2 NTFS 3,027,768 35,540,408 8.52%
E8E20702 0
F:\ 3 NTFS 559,972 1,198,060 46.74%
A891FC02 0
H:\ 0 NTFS 21,314 208,025 10.25%
5CDA4CC9 0
--Network
T S MTU Speed RX-Kb/s TX-Kb/s RXe TXe Interface
24 1 1500 10000000 0 0 0 0 MS TCP Loopback interface [None] 3,600,234,375
6 1 1500 10000000 0 0 0 0 AMD PCNET Family Ethernet
Adapter [0004AC4CE34D] 3,600,234,375
--TopTen
Interval: 3,600,250,000 (us)
ActiveCPUs: 2
TotalProcesses: 51
TotalThreads: 330
TotalIO: 0
TotalHandles: 8642
CPU absolute
PID User Sys Thrds Hlds
Idle 0 0 6,695,938 2 0
disp+work 296 228,656 12,640 4 167
sqlservr 153 101,719 22,016 43 967
disp+work 299 44,750 5,109 4 179
disp+work 290 27,953 4,953 4 171
disp+work 356 18,843 2,375 4 163
System 2 0 8,453 35 1035
disp+work 397 6,391 719 4 185
disp+work 323 5,250 734 4 166
perfwcol 179 969 4,625 2 71
CPU group
User Sys Thrds Hlds
Idle 0 6,695,938 2 0
disp+work 333,610 27,047 82 3434
sqlservr 101,719 22,016 43 967
System 0 8,453 35 1035
perfwcol 969 4,625 2 71
inetinfo 1,750 1,609 21 342
saposcol 109 156 4 77
sqlagent 234 31 8 93
SERVICES 46 203 20 289
mmc 16 78 4 83
MEM absolute
PID Private Shared
sqlservr 153 821,698,560 0
disp+work 397 65,249,280 0
disp+work 267 19,066,880 0
disp+work 296 16,683,008 0
disp+work 323 13,561,856 0
disp+work 330 13,082,624 0
disp+work 290 11,395,072 0
mmc 172 9,687,040 0
disp+work 254 7,966,720 0
disp+work 356 7,454,720 0
MEM group
Private Shared
sqlservr 821,698,560 0
disp+work 182,579,200 0
mmc 9,687,040 0
sqlagent 2,252,800 0
perfwcol 1,654,784 0
msg server 1,486,848 0
saposcol 1,409,024 0
EXPLORER 1,409,024 0
sqlmangr 1,409,024 0
SERVICES 1,376,256 0
IO absolute
PID IO
IO group
IO
--Timestamp (Record 24)
Thu Jan 18 22:09:00 2001
--VMstat
Processors: 2
Interval: 3,585,425,300 (us) usr: 4.92% sys: 0.60% idle: 189.75% queue: 0 freepte: 39590 pi: 1.94/s po: 2.00/s
--IOstat
Dk Signatur Read(B) Write(B) Read(t) Write(t) Idle(t) R(c) W(c) Q S
0 35CB34EC 5,820,416 4,212,224 668.09 689.01 2242.93 1252 1000 0 0 3,600,046,875
1 CE2EE96C 22,791,680 41,038,336 2030.01 1318.03 251.99 5136 1425 0 0 3,600,046,875
2 CE1F98E1 729,088 13,025,280 94.90 13541.68 0.00 89 887 0 0 13,636,597,615
3 CE2EE96D 0 552,960 0.00 12.07 3587.97 0 9 0 0 3,600,046,875
--FSstat
MountPoint PhysDrv Type Free (KB) Total (KB) %free
VolNum
C:\pagefile.sys PagingFile 120 2,048 5.86%
631AF993 2
D:\pagefile.sys PagingFile 2,375,116 3,072,000 77.31%
B5D97A0C 3000
H:\pagefile.sys PagingFile 1,744 184,320 0.95%
00D37BF0 180
C:\ 0 NTFS 1,099,001 4,192,492 26.21%
B8BA90A3 0
D:\ 1 NTFS 112,940 12,129,484 0.93%
6470EC21 0
E:\ 2 NTFS 3,027,768 35,540,408 8.52%
E8E20702 0
F:\ 3 NTFS 559,972 1,198,060 46.74%
A891FC02 0
H:\ 0 NTFS 21,314 208,025 10.25%
5CDA4CC9 0
--Network
T S MTU Speed RX-Kb/s TX-Kb/s RXe TXe Interface
24 1 1500 10000000 0 0 0 0 MS TCP Loopback interface [None] 3,600,000,000
6 1 1500 10000000 0 0 0 0 AMD PCNET Family Ethernet
Adapter [0004AC4CE34D] 3,600,000,000
--TopTen
Interval: 3,600,000,000 (us)
ActiveCPUs: 2
TotalProcesses: 51
TotalThreads: 330
TotalIO: 0
TotalHandles: 8640
CPU absolute
PID User Sys Thrds Hlds
Idle 0 0 6,803,672 2 0
disp+work 267 222,422 11,797 4 291
sqlservr 153 71,547 11,563 43 967
disp+work 299 28,875 2,875 4 179
disp+work 397 6,297 938 4 187
disp+work 323 6,422 797 4 166
System 2 0 6,719 35 1033
perfwcol 179 1,172 4,313 2 71
disp+work 270 5,078 344 4 176
inetinfo 133 2,609 2,391 21 342
CPU group
User Sys Thrds Hlds
Idle 0 6,803,672 2 0
disp+work 276,845 17,656 82 3436
sqlservr 71,547 11,563 43 967
System 0 6,719 35 1033
perfwcol 1,172 4,313 2 71
inetinfo 2,609 2,391 21 342
sqlagent 422 63 8 93
saposcol 157 141 4 77
SERVICES 47 203 20 289
msdtc 16 47 21 110
MEM absolute
PID Private Shared
sqlservr 153 822,194,176 0
disp+work 397 59,260,928 0
disp+work 267 24,383,488 0
disp+work 323 14,864,384 0
disp+work 259 11,591,680 0
disp+work 330 11,104,256 0
mmc 172 9,797,632 0
disp+work 254 7,954,432 0
disp+work 296 7,778,304 0
disp+work 362 7,098,368 0
MEM group
Private Shared
sqlservr 822,194,176 0
disp+work 180,441,088 0
mmc 9,797,632 0
sqlagent 2,252,800 0
gwrd 1,748,992 0
perfwcol 1,654,784 0
msg server 1,581,056 0
SERVICES 1,536,000 0
saposcol 1,409,024 0
sqlmangr 1,409,024 0
IO absolute
PID IO
IO group
IO
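The %free column of the FSstat sections above can be cross-checked against the Free (KB) and Total (KB) columns. The following is a minimal sketch and is not part of the collector; the function name `percent_free` is hypothetical, and the sample values are copied from the records above.

```python
def percent_free(free_kb: int, total_kb: int) -> float:
    """Percentage of free space, rounded to two decimals as in the log."""
    return round(100.0 * free_kb / total_kb, 2)


# (mount point, Free (KB), Total (KB), %free as logged)
rows = [
    ("C:\\pagefile.sys", 120, 2_048, 5.86),
    ("D:\\pagefile.sys", 2_385_772, 3_072_000, 77.66),
    ("D:\\", 115_024, 12_129_484, 0.95),
]

for mount, free_kb, total_kb, logged in rows:
    # Print the recomputed value next to the logged one for comparison.
    print(mount, percent_free(free_kb, total_kb), logged)
```

A mismatch between the recomputed and logged values would point to a transcription error in the row rather than a collector problem.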
Appendix G
Figure imgf000281_0001
Performance Analysis
ACMECORP
Introduction
Summary
Concepts
Availability
CPU Usage
Process Queue
Memory
Network I/O
Disk I/O
Paging Space
File System
Top 10
JANUARY 10, 2001
Introduction
This performance analysis report was prepared from data collected on the host ACMECORP from 01/01/2001, at 01:00, to 10/01/2001, at 00:00.
The data used in this report was obtained by a dedicated high-resolution, low-intrusion collector, developed specifically for this purpose, that runs on the target machine. The collector obtains data directly from the operating system, without any additional libraries or tools, and imposes minimal overhead on the system. The collected data is stored in a binary format to provide persistence. When sent automatically, it is compressed and encrypted to ensure fast delivery and confidentiality.
The content of this report is based on years of experience in performance analysis and capacity planning. The tool used to generate this report operates fully automatically, without direct human intervention. It uses an extensible inference engine, based on heuristics and rules, that is continuously improved. Using concepts such as "watermarks" and tolerance, it is possible to determine whether the usage of a computational resource is excessive and whether the excess is relevant.
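The watermark-and-tolerance idea can be sketched as follows. This is only an illustration under stated assumptions, not the report generator's actual inference logic: the function name `exceeds_watermark` is hypothetical, and the default tolerance of 10% of the monitoring period is borrowed from the disk findings later in this report.

```python
from typing import Sequence


def exceeds_watermark(samples: Sequence[float], watermark: float,
                      tolerance: float = 0.10) -> bool:
    """Return True when usage is both excessive and relevant.

    A sample is excessive when it is above the watermark. The excess is
    treated as relevant only when more than `tolerance` (a fraction of the
    monitoring period) of all samples are excessive.
    """
    if not samples:
        return False
    over = sum(1 for s in samples if s > watermark)
    return over / len(samples) > tolerance


# Hypothetical usage percentages; 3 of the 10 samples exceed 40.
usage = [20, 35, 80, 90, 30, 25, 85, 10, 15, 20]
print(exceeds_watermark(usage, watermark=40))
```

Separating the watermark from the tolerance lets a single short spike be ignored while sustained excess is still reported.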
The summary of the machine, obtained dynamically during the monitoring period, was:
Figure imgf000282_0001
OS : AIX 4.3.3.16
Host : ACMECORP
IP address : 192.168.1.15
Processors : 2 PowerPC_POWER3
Memory : 1024 MB
Model : IBM,9076-260
Serial : IBM,010003297
Summary
The last boot of the host ACMECORP took place on 27/12/2000, at 21:45.
This report is based on monitoring that took place between 01/01/2001, at 01:00, and 10/01/2001, at 00:00. The following points are worth highlighting for this period:
CPU usage was not a problem: its maximum never exceeded 13%.
The percentage of time the CPU spent waiting for an I/O occurrence remained below the limit of 40% during the whole monitoring period, which indicates that this system is not I/O bound.
The runnable processes queue was not a problem, with its maximum level never exceeding the number of active processors.
The blocked processes queue (awaiting input/output) remained above the level of 1 process(es), as shown in the graph below, thus resulting in a tight bottleneck in the system.
The average paging rate remained low for the whole monitoring period, reaching at most 1.2 pps, indicating that there was no memory constraint.
All of the real memory was allocated during the monitoring period for processes and file caching.
The amount of virtual memory in use remained low throughout the monitoring period, indicating a low demand for memory by processes.
The transmission rate of the network interface lo0 remained above 45% of the interface capacity the whole time, indicating a high level of overload.
The reception rate of the network interface lo0 remained above 45% for most of the time, during which the interface was overloaded.
The transmission and reception rates of the network interface css0 remained above 45% of the interface capacity the whole time, indicating a high level of overload.
The disk hdisk2 exceeded the usage limit for more than 10% of the monitoring period.
The paging space was not a problem, with usage never exceeding 6.5%.
Concepts
To understand a performance analysis report, it may help to review some basic concepts. The idea here is not to write a treatise on the subject, but to go through some fundamental aspects of performance.
System performance means different things to different people, ranging from resource consumption to the amount of work performed per unit of time. It is assumed here that improving performance means improving the response time seen by end users and/or increasing the throughput of both end-user work and batch work.
The performance of any system depends on how tied up its key resources are, because system performance is essentially a function of the time each key resource takes to service a request, plus the time the request spends queued waiting to be serviced (more details on queues follow). In a computer-based information-processing environment, the key resources are CPU, memory, disk I/O and network I/O.
To evaluate resource consumption, criteria must be established. These criteria consist of judging which of the many available system performance variables best express this consumption. In addition, the watermarks (the points at which a resource starts to be considered overcommitted, also known as thresholds) for these variables need to be defined. These watermarks are approximate and can vary depending on the characteristics of the system being analyzed.
Description of key resources
1 - CPU
CPUs can play a significant role in the response time of computational environments, especially when other resources are abundant. This is particularly true in environments where most of the required data is available in memory. For the CPU, the key variables for evaluating resource consumption are run-queue and CPU usage.
1.1 - Run-queue
Run-queue is the number of processes (threads, in fact) that are runnable (ready to execute), whether queued waiting for a CPU or already executing. It is a measure of how heavily the CPU is used in an environment comprising many processes (a commercial transaction environment, for example). The typical watermarks for run-queue range from the number of processors available to five times this value, depending on the response time required for a transaction versus the amount of CPU the transaction requires.
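The run-queue watermark range described above can be illustrated with a short sketch. The function name `classify_run_queue` and the three labels are hypothetical; only the range (number of processors up to five times that number) comes from the text.

```python
def classify_run_queue(run_queue: float, processors: int) -> str:
    """Place a run-queue length relative to the watermark range.

    The lower watermark is the number of processors and the upper
    watermark is five times that number.
    """
    low = processors
    high = 5 * processors
    if run_queue <= low:
        return "ok"
    if run_queue <= high:
        return "watch"          # inside the watermark range
    return "overcommitted"      # beyond even the most tolerant watermark


# A two-processor machine, as in the host summarized in this report.
for queue in (2, 6, 11):
    print(queue, classify_run_queue(queue, processors=2))
```

Where exactly inside the range a given system should be flagged depends, as noted above, on the response time required versus the CPU each transaction needs.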
1.2 - CPU Usage
CPU usage is a measure of how heavily the CPU is used in an environment comprising a few heavy processes (as in scientific or commercial environments with few, but complex, batch jobs). It can be used as a criterion for environments with many processes, but run-queue is more meaningful in these cases. CPU usage is expressed as a percentage and can be broken into four categories: usr, sys, idle and wio. Usr stands for user mode, the mode in which a process executes when not using any operating system service. Sys means system mode, the mode a process is placed into when using an operating system service. Idle, as the term suggests, is when a CPU has no process to execute. Wio stands for waiting for I/O, a special case of idleness in which the CPU is available but there are processes waiting for an I/O operation to complete. CPU usage is normally a concern when usr+sys is above 75 to 85%, in an environment with multiple processes, or is close to 100% / number of processors, in an environment with few processes.
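The two CPU usage criteria above can be written as one check. This is a sketch under assumptions: the function name `cpu_is_concern` is hypothetical, the percentages are taken as a share of total machine capacity (0 to 100), the default limit of 75% is the lower end of the 75 to 85% range, and "close to" is interpreted, arbitrarily, as reaching 90% of 100% / number of processors.

```python
def cpu_is_concern(usr_pct: float, sys_pct: float, processors: int,
                   many_processes: bool, limit: float = 75.0,
                   closeness: float = 0.9) -> bool:
    """Apply the usr+sys criterion for the given kind of environment."""
    busy = usr_pct + sys_pct
    if many_processes:
        # Multiple processes: concern above the 75 to 85% watermark.
        return busy > limit
    # Few processes: concern when busy time approaches one processor's share.
    return busy >= closeness * (100.0 / processors)


print(cpu_is_concern(60.0, 20.0, processors=2, many_processes=True))
print(cpu_is_concern(40.0, 8.0, processors=2, many_processes=False))
```

The second criterion exists because a single-threaded heavy process cannot use more than one processor, so a two-processor machine is saturated for that workload at about 50%.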
2 - Memory
Memory can play different roles in a computational environment, ranging from fast storage area for program data to disk data caching (making up for the slower speed of disk subsystems). This means that memory is consumed for very distinct purposes. Memory consumption, being understood as not only real memory (RAM), but as the entire virtual memory subsystem, can be well evaluated by paging activity, virtual memory usage and paging space usage.
Paging activity occurs when the real memory being managed by a virtual memory subsystem is overcommitted. To a small degree, it is not a problem, since the main purpose of the virtual memory subsystem is to maximize system throughput by allowing process memory to be swapped in and out. It becomes a critical issue when paging reaches high rates. The point is that paging indicates that the sum of the working sets of the processes (the ranges of virtual memory addresses that need to be accessible at a given moment), plus what is set aside for the operating system and file caching, exceeds the amount of real memory available. Paging is broken down into page in (pi) and page out (po). A page in is usually regarded as more serious, since it may indicate a thrashing condition (the system is spending too much time just paging). The watermark for paging (pi+po) is in the range of 10 pages per second.
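The paging watermark can be checked as shown in this sketch. The names `paging_rate` and `paging_is_excessive` are hypothetical, and the sample rates are illustrative values in pages per second.

```python
PAGING_WATERMARK = 10.0  # pages per second, applied to pi + po


def paging_rate(pi: float, po: float) -> float:
    """Total paging rate: page ins plus page outs, in pages per second."""
    return pi + po


def paging_is_excessive(pi: float, po: float,
                        watermark: float = PAGING_WATERMARK) -> bool:
    """True when the combined paging rate is above the watermark."""
    return paging_rate(pi, po) > watermark


print(paging_is_excessive(1.26, 2.15))  # light paging
print(paging_is_excessive(6.0, 5.0))    # combined rate above the watermark
```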
Paging space usage is a fundamental concern when analyzing the state of the virtual memory manager. If no paging space is available, no new process can be spawned and, very likely, some existing processes will be terminated by the operating system to make room in the paging space. So, while real memory constraints impact performance, paging space constraints put the entire execution environment at risk. The amount of virtual memory devoted to process segments (process data area) is directly related to paging space usage if the operating system in question uses early paging space allocation (space is allocated in the paging area whenever it is allocated in real memory). In this case, the amount of space being used in the paging area is the sum of the data areas of all running processes, which is the major component in determining the amount of real memory required by the system. If the amount of virtual memory in use exceeds the amount of real memory available, paging will very likely occur, initiating performance degradation. In an environment experiencing significant growth rates, especially in terms of users, it is advisable to keep the average paging space usage rate at 50%. Obviously, this concern does not exist or is less serious for operating systems that are able to allocate paging space dynamically.
3 - Disk I/O
Disk I/O is certainly one of the main topics in any discussion of performance. This is particularly true in commercial environments. Disks, being mechanical devices (unlike other, faster devices that are electronic), can, if not properly used, jeopardize the performance of an entire system. In addition, disks can show two very distinct personalities: one when accessing data in random mode (slower, because it involves arm movement), and another when accessing data in sequential mode (faster, because it involves only platter movement). Also, disk performance varies with the blocking factor (the amount of data involved in a single operation), since the impact of the overhead is diluted with large blocks. Therefore, disks must be closely watched. Among the key variables that provide information on disk I/O usage are bandwidth occupation, transfers (I/Os) per second, transfer rate (usually expressed in KB/s) and physical read to write ratio.
3.1 - Bandwidth occupation
Bandwidth occupation is probably the most important variable when evaluating disk I/O. It is calculated from the number of samples taken within a period (1 second, for example) that found a given disk to be busy. It is highly dependent on the rate of requests being sent to the disks and the type of data access required by these requests (since random requests take longer to service than sequential requests). With bandwidth occupation, it is possible to estimate whether disk I/O requests for a given disk are spending time in queues instead of being serviced promptly. The disk bandwidth occupation watermarks regarded as acceptable vary from 15%, for environments with predominantly random access (OLTP with simple transactions), to 65%, for environments with predominantly sequential access (data warehouse, complex batch applications, etc.). A criterion of 40% is adequate for mixed environments (which constitute the great majority of cases).
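The sampling-based calculation and the three watermarks above can be sketched as follows. The names `bandwidth_occupation`, `occupation_is_high` and the workload keys are hypothetical; the 15%, 40% and 65% figures are those given in the text.

```python
from typing import Sequence

# Acceptable bandwidth occupation (%) by predominant access pattern.
WATERMARKS = {"random": 15.0, "mixed": 40.0, "sequential": 65.0}


def bandwidth_occupation(busy_samples: Sequence[bool]) -> float:
    """Percentage of samples in the period that found the disk busy."""
    if not busy_samples:
        return 0.0
    return 100.0 * sum(busy_samples) / len(busy_samples)


def occupation_is_high(occupation: float, workload: str = "mixed") -> bool:
    """Compare an occupation percentage against the workload's watermark."""
    return occupation > WATERMARKS[workload]


# Hypothetical period: the disk was busy in 3 of 10 samples.
samples = [True] * 3 + [False] * 7
occupation = bandwidth_occupation(samples)
print(occupation, occupation_is_high(occupation, "random"),
      occupation_is_high(occupation, "mixed"))
```

The same 30% occupation is acceptable for a mixed workload but excessive for a predominantly random one, which is why the access pattern has to be known before the figure is judged.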
3.2 - Transfers per second
Transfers per second provide good complementary information to bandwidth occupation, especially for evaluating more subtle bottlenecks such as disk adapters (SCSI, FC-AL, SSA, etc.). Disk adapters have a ceiling, in terms of transfers per second, that might be reached without notice, thereby limiting the I/O capability of multiple disks. Physical disks support about 100 to 120 transfers per second when accessing data in random mode, and 10x these values or more when accessing data in sequential mode. It is therefore considered acceptable to keep disks operating at a sustained rate of about 50% of these values.
3.3 - Transfer rate
Transfer rate is the amount of data being received from or sent to disks per unit of time. Like bandwidth occupation and transfers per second, transfer rate is a function of the rate of requests being sent to the disks and the type of access required by these requests (random access takes more time to service and, therefore, limits transfer rates). Another key aspect of transfer rate is that it may also expose the ceiling of disk adapters in this respect. In addition, computer I/O buses may impose a limit on the transfer rates of the adapters inserted in them. The typical watermarks for transfer rates, per individual physical disk, vary from 400 to 1,000 KB/s for random I/O to between 4,000 KB/s and 25,000 KB/s (when using large blocks) for sequential I/O.
3.4 - Physical read to write ratio
The physical read to write ratio for disk I/O is important in determining whether a given database configuration is adequate and whether the type of disk layout being used is the most suitable. When the number of physical reads exceeds 5x the number of writes, it might mean (although not necessarily) that the database buffer cache is not big enough. In that case, the database software might be operating with an unacceptable hit ratio (the ratio of logical reads that are satisfied by the cache). A physical read to write ratio below 2.5x, although not a problem in itself, might not be adequate for certain disk layouts such as RAID-5, since this arrangement has a significant write penalty (additional effort required to perform write operations, when compared to read operations).
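The two ratio thresholds above can be expressed as a short sketch. The function names and the returned labels are hypothetical; only the 5x and 2.5x figures come from the text, and both outcomes are hints to investigate rather than firm diagnoses.

```python
def read_write_ratio(reads: int, writes: int) -> float:
    """Physical reads divided by physical writes (infinite if no writes)."""
    if writes == 0:
        return float("inf")
    return reads / writes


def assess_ratio(ratio: float) -> str:
    """Map a read to write ratio onto the two hints described in the text."""
    if ratio > 5.0:
        return "cache-suspect"   # buffer cache may be too small
    if ratio < 2.5:
        return "raid5-caution"   # write-heavy: RAID-5 write penalty matters
    return "ok"


# Hypothetical read and write counts for one monitoring interval.
for reads, writes in ((600, 100), (300, 100), (100, 100)):
    print(reads, writes, assess_ratio(read_write_ratio(reads, writes)))
```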
4 - Network I/O
Network I/O, although also a key resource, seldom plays a major role in influencing response time, except when wide area networks (WANs) are involved. Nevertheless, this resource must also be monitored, for it may hide some surprises. The key variables for evaluating network I/O are transfer rate (or bandwidth occupation relative to the maximum transfer rate), error rate and latency.
4.1 - Transfer rate
As with disk I/O, the transfer rate of network adapters depends on block size, although the impact is not as significant. On the other hand, there is a direct correlation between transfer rate and bandwidth usage in the case of network I/O, whereas with disk I/O this is not the case (since the data access mode must be considered). Typically, collision-prone networks (Ethernet without switches, for example) should operate at 30 to 40% of their nominal capacity, and other types of networks should be kept at 50 to 70% of their nominal capacity.
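Because transfer rate and bandwidth usage are directly related for networks, the check reduces to a ratio against nominal capacity. In this sketch the function names are hypothetical, rates and capacities are assumed to be in bits per second, and the limits used are the upper ends of the ranges given above (40% and 70%).

```python
def utilization_percent(rate_bps: float, capacity_bps: float) -> float:
    """Transfer rate as a percentage of the interface's nominal capacity."""
    return 100.0 * rate_bps / capacity_bps


def is_overloaded(utilization: float, collision_prone: bool) -> bool:
    """Apply the limit for the kind of network the interface is on."""
    limit = 40.0 if collision_prone else 70.0
    return utilization > limit


# Hypothetical: 4.5 Mbit/s on a 10 Mbit/s interface.
util = utilization_percent(4_500_000, 10_000_000)
print(util, is_overloaded(util, collision_prone=True),
      is_overloaded(util, collision_prone=False))
```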
4.2 - Error rate
Error rate provides a measure of how effective the transfer rate is in a network, since a high error rate means that the effective transfer rate is very low. Most network adapters and network device drivers provide some means of retrieving error information, which is presented as a complementary statistic to the transfer rate itself. Major causes of high error rates in LANs are collisions (two or more network devices trying to send data at the same time), standing waves (signal that remains in the network due to poor termination), full-duplex/half-duplex mismatch (operation mode mismatch between adapters and hubs/switches) and speed mismatch (speed mismatch between adapters and hubs/switches). A major cause of high error rates in WANs is noise (heat, electromagnetic noise, etc.). The watermark for error rate should be 1% or less for LANs and about 5% for WANs.
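The error rate watermarks can be applied as in the sketch below. The names are hypothetical, and expressing the error rate as errors per packet transferred is an assumption; the text does not state the denominator.

```python
# Error rate watermarks (%) by network type, as given in the text.
ERROR_WATERMARKS = {"lan": 1.0, "wan": 5.0}


def error_rate_percent(errors: int, packets: int) -> float:
    """Errors as a percentage of packets (assumed denominator)."""
    if packets == 0:
        return 0.0
    return 100.0 * errors / packets


def error_rate_is_high(errors: int, packets: int, kind: str) -> bool:
    """True when the error rate is above the watermark for `kind`."""
    return error_rate_percent(errors, packets) > ERROR_WATERMARKS[kind]


# Hypothetical counters: 2 errors in 100 packets.
print(error_rate_is_high(2, 100, "lan"), error_rate_is_high(2, 100, "wan"))
```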
4.3 - Network latency
Network latency can be measured by various schemes, the most common being echo requests (TCP/IP ping, for example). It is a measure of how long the first packet in a chain of packets takes to reach its destination. The point is that high bandwidth is not enough if the first packet takes a very significant time to arrive. This is the case with satellite links, but it may also apply to confined networks. For instance, the latency of ATM and gigabit Ethernet can be very small for conventional applications, but it is high for parallel computing applications. Another aspect of latency is that it may have an important software component, such as the one caused by operating system protocol stacks (communication subsystems). The typical latencies desired for corporate LANs (Local Area Networks) are in the range of 1 to 10 ms.
Availability
During the monitoring period covered by this analysis, the machine maintained the following availability rate.
Figure imgf000289_0001 (availability rate, scale 0 to 100, by date)
CPU Usage
Figure imgf000290_0001
CPU usage was not a problem, as shown in the graph below, with its average usage not exceeding 75%.
The percentage of time the CPU spent waiting for an I/O occurrence remained below the level of 40% during the whole monitoring period.
Figure imgf000290_0003
Figure imgf000290_0002
Process Queue
The runnable processes queue was not a problem, as shown in the graph below, never exceeding 2 process(es).
The blocked processes queue (awaiting input/output) remained above the level of 1 process(es), as shown in the graph below, thus resulting in a tight bottleneck in the system.
The greatest number of running processes and threads occurred on 05/01, at 00:00, with 157 processes and 195 threads.
Figure imgf000291_0001
Memory
The average paging rate remained below the recommended level during the whole monitoring period, indicating the absence of memory constraints.
Figure imgf000292_0001
The amount of virtual memory in use remained low throughout the monitoring period.
Figure imgf000292_0002
Figure imgf000292_0003
All of the real memory was allocated during the monitoring period for processes and file caching.
Network I/O Interface lo0
Figure imgf000293_0001
The transmission rate remained above 40% of the interface capacity, for the whole time, indicating, therefore, a heavy overload.
The reception rate remained above 40% for most of the time, when the interface was overloaded.
There were no errors in transmission and/or reception.
Figure imgf000293_0002
Figure imgf000293_0003
Network I/O Interface en0
The transmission and reception rates remained below 40% of the interface capacity, for the whole time, without, therefore, having an overload.
There were no errors in transmission and/or reception.
Figure imgf000294_0001
Network I/O Interface en1
Figure imgf000295_0001
The transmission and reception rates remained below 40% of the interface capacity, for the whole time, without, therefore, having an overload.
There were no errors in transmission and/or reception.
Figure imgf000295_0003
Figure imgf000295_0002
Network I/O Interface en2
Figure imgf000296_0001
The transmission and reception rates remained below 40% of the interface capacity, for the whole time, without, therefore, having an overload.
There were no errors in transmission and/or reception.
Figure imgf000296_0002
Network I/O Interface css0
Figure imgf000297_0001
The transmission and reception rates remained above 40% of the interface capacity, for the whole time, indicating, therefore, a heavy overload.
There were no errors in transmission and/or reception.
Figure imgf000297_0003
Figure imgf000297_0002
Disk I/O
Figure imgf000298_0001
The 4 disks were analyzed against the following criteria: transaction rate, transfer rate and usage.
The disk hdisk2 exceeded the tps limit for more than 10% of the monitoring period.
The disk hdisk2 exceeded the usage limit for more than 10% of the monitoring period.
The other disks showed good results, without exceeding the limit for any of the criteria.
The graphs relating to disk hdisk2 are shown on the following pages.
Disk I/O
hdisk2
Figure imgf000299_0004
Figure imgf000299_0001
hdisk2
Figure imgf000299_0002
hdisk2
Figure imgf000299_0003
Disk I/O Zoom
On 05/01, at 01:00, disk I/O showed the highest activity of the whole monitoring period. The graph below represents the status on this day.
Figure imgf000300_0001
Figure imgf000300_0002
Disk I/O Zoom
The graph below shows the disk hdisk2 Read/Write ratio, during the highest I/O activity period.
Figure imgf000301_0001
Figure imgf000301_0002
I/O Activity Zoom
Status of the 10 processes with the most I/O activity on 05/01, at 01:00. At that time there were 1020 open files.
ABSOLUTE
Figure imgf000302_0001
GROUP
Figure imgf000302_0002
Paging Space
Figure imgf000303_0001
The paging space (1572864 pages) was not a problem, as shown in the graph below, with the usage not exceeding 6.5%.
Figure imgf000303_0002
File System
Status of the file system at the end of the monitoring period:
FileSystem             Mounted           Total     Free      %Used
/dev/hd4               /                 32768     5232      84
/dev/hd2               /usr              1474560   264168    82
/dev/hd9var            /var              131072    30916     76
/dev/hd3               /tmp              131072    126776    3
/dev/hd1               /home             32768     31576     3
/dev/lv01              /backup           32768     26360     19
/dev/lv02              /temp             229376    74068     67
/dev/lv03              /usr/local        32768     28764     12
/dev/lvimage           /image            2621440   2538212   3
/dev/hdwlv001          /var/opt/perf     4587520   4318872   5
/dev/hdwlv002          /usr/lpp/perf     32768     8580      73
/dev/utilv000          /util             32768     31668     3
/dev/lv04              /DENYALL          524288    384328    26
/dev/im1p01homlv002    /home/cltadmin    4096      3904      4
/dev/im1p01homlv003    /home/db2fend     4096      3920      4
/dev/im1p01db2dv101    /home/db2inst1    1843200   249700    86
/dev/im1p01homlv001    /home/lsadmin     65536     60968     6
/dev/im1p01tivlv001    /opt/tivim1p1     12288     4436      63
/dev/audlv001          /opt/audim1p1     196608    190356    3
Paging space                             1572864   1484584   5
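The %Used column can be reproduced from the Total and Free columns. The sketch below is an illustration only; the function name `percent_used` is hypothetical. The values in the table are consistent with truncation to a whole percentage rather than rounding (for example, /home works out to about 3.6% and is listed as 3), so integer division is used.

```python
def percent_used(total: int, free: int) -> int:
    """Used space as a whole percentage, truncated as the table appears to be."""
    return 100 * (total - free) // total


# (mount point, Total, Free, %Used as listed) copied from the table above.
rows = [
    ("/", 32768, 5232, 84),
    ("/home", 32768, 31576, 3),
    ("/home/db2inst1", 1843200, 249700, 86),
    ("Paging space", 1572864, 1484584, 5),
]

for name, total, free, listed in rows:
    print(name, percent_used(total, free), listed)
```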
Top 10 01/JAN CPU
Processes that most used CPU during the monitoring period, as from 01/01.
ABSOLUTE
Figure imgf000305_0001
GROUP
Figure imgf000305_0002
Top 10 02/JAN CPU
ABSOLUTE
Figure imgf000306_0001
GROUP
Figure imgf000306_0002
Top 10 03/JAN CPU
ABSOLUTE
Figure imgf000307_0001
GROUP
Figure imgf000307_0002
Top 10 04/JAN CPU
ABSOLUTE
Figure imgf000308_0001
GROUP
Figure imgf000308_0002
Top 10 05/JAN CPU
ABSOLUTE
Figure imgf000309_0001
GROUP
Figure imgf000309_0002
Figure imgf000310_0001
ABSOLUTE
Figure imgf000310_0002
GROUP
Figure imgf000310_0003
Top 10 07/JAN CPU
ABSOLUTE
Figure imgf000311_0001
GROUP
Figure imgf000311_0002
Top 10 09/JAN CPU
ABSOLUTE
Figure imgf000312_0001
GROUP
Figure imgf000312_0002
Top 10 01/JAN Memory
Processes that most used memory during the monitoring period, as from 01/01. The usage is shown in KB.
ABSOLUTE
Figure imgf000313_0001
Top 10 02/JAN Memory
ABSOLUTE
Figure imgf000314_0001
Top 10 03/JAN Memory
ABSOLUTE
Figure imgf000315_0001
Top 10 04/JAN Memory
ABSOLUTE
Figure imgf000316_0001
Top 10 05/JAN Memory
ABSOLUTE
Figure imgf000317_0001
Top 10 06/JAN Memory
ABSOLUTE
Figure imgf000318_0001
Top 10 07/JAN Memory
ABSOLUTE
Figure imgf000319_0001
Top 10 09/JAN Memory
ABSOLUTE
Figure imgf000320_0001

Claims

What is claimed is:
1. A method, for use by an agent, of obtaining data from a device, the method comprising: receiving a plug-in containing system calls for obtaining the data from the device; loading the plug-in into the agent; obtaining the data from the device using the system calls; and transmitting the data over an external network using one or more of a plurality of protocols.
2. The method of claim 1, wherein: the agent includes shared libraries containing system calls for obtaining other data from the device; and the method further comprises loading the shared libraries into the agent when the plug-in is loaded.
3. The method of claim 1, wherein the data is obtained from the device periodically.
4. The method of claim 3, wherein the data is obtained every minute.
5. The method of claim 1, wherein the plurality of protocols comprises simple mail transfer protocol (SMTP), hyper text transfer protocol (HTTP), and secure sockets layer (SSL) protocol.
6. The method of claim 1, wherein data transmission is effected using at least one of a proxy and socket.
7. The method of claim 1, wherein: the agent resides on an internal network that includes the device; and the method further comprises selecting a machine on the internal network to transmit the data over the external network.
8. The method of claim 7, wherein the external network includes the Internet.
9. The method of claim 7, wherein the agent resides on the device.
10. The method of claim 7, wherein the agent resides on a machine located on the internal network that is not the device.
11. The method of claim 1, wherein: the device comprises a network device located on an internal network; and the agent resides on a server that is also on the internal network.
12. The method of claim 1, wherein the data relates to one or more of the following: a processor on the device, memory on the device, a hard drive on the device, an internal network on which the device is located, and software installed on the device.
13. A method of providing, to a client, data that was obtained by an agent from a remote device on an internal network, the method comprising: receiving the data via an external network, at least some of the data being received periodically; formatting the data; and making the formatted data accessible to a client via the external network.
14. The method of claim 13, wherein formatting comprises generating a report based on the data.
15. The method of claim 14, wherein the report comprises a natural language report.
16. The method of claim 13, wherein formatting comprises: generating a display based on the data; and updating the display periodically as new data is received periodically via the external network.
17. The method of claim 13, wherein the data is received every minute.
18. The method of claim 13, wherein formatting comprises: determining if the data indicates that an operational parameter of the device exceeds a preset limit; and generating a report to a client indicating that the operational parameter exceeds the preset limit.
19. The method of claim 13, wherein the external network includes the Internet.
20. The method of claim 13, wherein making the formatted data accessible to the client comprises providing a World Wide Web site through which the data can be accessed by the client.
21. The method of claim 13, wherein the formatted data is made accessible to a wireless device using wireless application protocol.
22. A computer program stored on a machine-readable medium, the computer program comprising an agent for obtaining data from a device, the computer program comprising instructions that cause a machine to: receive a plug-in containing system calls for obtaining the data from the device; load the plug-in into the agent; obtain the data from the device using the system calls; and transmit the data over an external network using one or more of a plurality of protocols.
23. The computer program of claim 22, wherein: the agent includes shared libraries containing system calls for obtaining other data from the device; and the computer program further comprises instructions that cause the machine to load the shared libraries into the agent when the plug-in is loaded.
24. The computer program of claim 22, wherein the data is obtained from the device periodically.
25. The computer program of claim 24, wherein the data is obtained every minute.
26. The computer program of claim 22, wherein the plurality of protocols comprises simple mail transfer protocol (SMTP), hyper text transfer protocol (HTTP), and secure sockets layer (SSL) protocol.
27. The computer program of claim 22, wherein data transmission is effected using at least one of a proxy and socket.
28. The computer program of claim 22, wherein: the agent resides on an internal network that includes the device; and the computer program further comprises instructions that cause the machine to select another machine on the internal network to transmit the data over the external network.
29. The computer program of claim 28, wherein the external network includes the Internet.
30. The computer program of claim 28, wherein the agent resides on the device.
31. The computer program of claim 28, wherein the agent resides on a machine located on the internal network that is not the device.
32. The computer program of claim 22, wherein: the device comprises a network device located on an internal network; and the agent resides on a server that is also on the internal network.
33. The computer program of claim 22, wherein the data relates to one or more of the following: a processor on the device, memory on the device, a hard drive on the device, an internal network on which the device is located, and software installed on the device.
34. A computer program stored on a machine-readable medium for providing, to a client, data that was obtained by an agent from a remote device on an internal network, the computer program comprising instructions that cause the machine to: receive the data via an external network, at least some of the data being received periodically; format the data; and make the formatted data accessible to a client via the external network.
35. The computer program of claim 34, wherein formatting comprises generating a report based on the data.
36. The computer program of claim 35, wherein the report comprises a natural language report.
37. The computer program of claim 34, wherein formatting comprises: generating a display based on the data; and updating the display periodically as new data is received periodically via the external network.
38. The computer program of claim 34, wherein the data is received every minute.
39. The computer program of claim 34, wherein formatting comprises: determining if the data indicates that an operational parameter of the device exceeds a preset limit; and generating a report to a client indicating that the operational parameter exceeds the preset limit.
40. The computer program of claim 34, wherein the external network includes the Internet.
41. The computer program of claim 34, wherein making the formatted data accessible to the client comprises providing a World Wide Web site through which the data can be accessed by the client.
42. The computer program of claim 34, wherein the formatted data is made accessible to a wireless device using wireless application protocol.
43. A method comprising:
(a) automatically and repeatedly collecting data indicative of an operating state of a machine, and
(b) automatically transmitting information related to the collected data to a location remote from the computer in the form of electronic mail messages complying with a standard electronic mail messaging protocol.
44. The method of claim 43 also including:
(a) receiving the electronic mail messages at a computer, and
(b) analyzing the information at the computer to derive performance measures.
45. The method of claim 44 also including
(a) generating a report embodying the performance measures, and
(b) making the report available electronically.
46. The method of claim 45 in which (a) the report comprises a natural language document expressed in a natural language format.
47. The method of claim 45 in which the report is made available on a web site.
48. The method of claim 43 in which the machine comprises a network server, desktop computer, or an intelligent appliance.
49. The method of claim 43, in which the standard electronic mail messaging protocol comprises a Simple Mail Transfer Protocol.
50. The method of claim 43, in which the collected data includes a time-ordered sequence of performance measurements taken at fixed time intervals.
51. The method of claim 43, in which the collected data includes measurements of at least one of CPU usage, process queue length, memory usage, memory paging rate, disk usage, network usage, paging space occupancy, file system occupancy, and process resource usage.
52. The method of claim 43, in which the information related to the collected data is compressed and encrypted for inclusion in the electronic mail message.
53. The method of claim 43, in which the collected data is collected from at least one of: a registry, a system call, a virtual file system, a virtual device, and an input/output control call to a device.
54. An article comprising a machine readable medium on which are tangibly stored machine-executable instructions for monitoring a machine, the instructions being operable to cause a machine to: (a) automatically and repeatedly collect data indicative of an operating state of the machine, and
(b) automatically transmit information related to the collected data to a location remote from the machine in the form of electronic mail messages complying with a standard electronic mail messaging protocol.
55. The computer program product of claim 54 in which the computer comprises a network server.
56. The article of claim 54, in which the standard electronic mail messaging protocol comprises a Simple Mail Transfer Protocol.
57. The article of claim 54, in which the collected data includes a time-ordered sequence of performance measurements taken at fixed time intervals.
58. The article of claim 54, in which the collected data includes measurements of at least one of CPU usage, process queue length, memory usage, memory paging rate, disk usage, network usage, paging space occupancy, file system occupancy, and process resource usage.
59. The article of claim 54, in which the information related to the collected data is compressed and encrypted for inclusion in the electronic mail message.
60. The article of claim 54, in which the collected data is collected from at least one of: a registry, a system call, a virtual file system, a virtual device, and an input/output control call to a device.
61. A method comprising (a) automatically and repeatedly receiving electronic mail messages that include information related to remotely collected data indicative of a performance of a machine, the electronic mail messages complying with a standard electronic mail messaging protocol, and
(b) automatically analyzing the information to determine the performance of the machine.
62. The method of claim 61 further comprising:
(a) extracting the information from the electronic mail messages.
63. The method of claim 62 further comprising generating a natural language report based on the analysis.
64. The method of claim 61, further comprising: generating an electronic mail message that includes the report; and transmitting the electronic mail message over a network.
65. The method of claim 61, wherein the collected data includes at least one time ordered sequence of performance measurements and wherein: analyzing the collected data includes comparing at least some of the collected data with a corresponding threshold value to determine whether the performance measurements are within a range of acceptable values.
66. The method of claim 65, wherein generating the performance report includes: selecting an information item based on the comparison of the performance measurement; and adding the selected information item to the performance report.
67. The method of claim 65 wherein: analyzing the collected data includes determining the number of performance measurements that are within the range of acceptable values; and selecting the information item is further based on the number of performance measurements that are within the range of acceptable values.
68. The method of claim 65 wherein the item of information includes a natural language sentence.
69. The method of claim 68 wherein the item of information includes at least one of a measurement value or the threshold value.
70. The method of claim 68 wherein at least part of the natural language sentence is enhanced to draw attention to the sentence.
71. The method of claim 70 wherein the part of the natural language sentence is enhanced by at least one of bold typeface, italicized typeface, colored typeface, underlining, and a different font size.
72. The method of claim 65 wherein the item of information includes a graphical display.
73. The method of claim 68 wherein at least part of the natural language sentence is a hyperlink to more detailed information about a section of the sequence of performance measurements.
74. An article comprising a machine readable medium on which are tangibly stored machine-executable instructions for monitoring a remote machine, the instructions being operable to cause a machine to:
(a) automatically and repeatedly receive electronic mail messages that include information related to remotely collected data indicative of a performance of the remote machine, the electronic mail messages complying with a standard electronic mail messaging protocol, and (b) automatically analyze the information to determine the performance of the remote machine.
75. The article of claim 74, wherein the instructions further cause the processor to:
(a) extract the information from the electronic mail messages.
76. The article of claim 75 wherein the instructions further cause the processor to generate a natural language report based on the analysis.
77. The article of claim 75, wherein the instructions further cause the processor to: generate an electronic mail message that includes the report; and transmit the electronic mail message over a network.
78. The article of claim 75, wherein the collected data includes at least one time ordered sequence of performance measurements and wherein: analyzing the collected data includes comparing at least some of the performance measurements with a corresponding threshold value to determine whether the performance measurements are within a range of acceptable values.
79. The article of claim 78, wherein generating the performance report includes: selecting an information item based on the comparison of the performance measurement; and adding the selected information item to the performance report.
80. The article of claim 78 wherein: analyzing the collected data includes determining the number of performance measurements that are within the range of acceptable values; and selecting the information item is further based on the number of performance measurements that are within the range of acceptable values.
81. The article of claim 78 wherein the item of information includes a natural language sentence.
82. The article of claim 81 wherein the item of information includes at least one of a measurement value or the threshold value.
83. The article of claim 81 wherein at least part of the natural language sentence is enhanced to draw attention to the sentence.
84. The article of claim 83 wherein the part of the natural language sentence is enhanced by at least one of bold typeface, italicized typeface, colored typeface, underlining, and a different font size.
85. The article of claim 78 wherein the item of information includes a graphical display.
86. The article of claim 81 wherein at least part of the natural language sentence is a hyperlink to more detailed information about a section of the sequence of performance measurements.
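
The analysis recited in claims 65 to 71 (compare each measurement with a threshold, count how many fall within the acceptable range, select a natural language sentence on that basis, and emphasize part of it) can be illustrated with a minimal Python sketch. This is not the applicant's implementation. The names `Finding`, `analyze`, and `build_report`, the `tolerated_fraction` parameter, the sentence wording, and the use of `**...**` to mark emphasis are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Sequence


@dataclass
class Finding:
    """One information item selected for the performance report (hypothetical type)."""
    metric: str
    within_range: int  # number of measurements inside the acceptable range
    total: int         # number of measurements analyzed
    sentence: str      # natural language sentence selected for the report


def analyze(metric: str, samples: Sequence[float], threshold: float,
            tolerated_fraction: float = 0.1) -> Finding:
    """Compare a time-ordered sequence of measurements with a threshold.

    The sentence is selected from the number of measurements that are
    within range. `tolerated_fraction` is an assumed tuning knob: the
    share of out-of-range samples still reported as a brief excursion.
    """
    if not samples:
        raise ValueError("no measurements to analyze")
    within = sum(1 for value in samples if value <= threshold)
    exceeded = len(samples) - within
    peak = max(samples)
    if exceeded == 0:
        sentence = (f"{metric} stayed at or below {threshold:g} "
                    f"in all {len(samples)} measurements.")
    elif exceeded / len(samples) <= tolerated_fraction:
        sentence = (f"{metric} briefly exceeded {threshold:g} "
                    f"({exceeded} of {len(samples)} measurements, peak {peak:g}).")
    else:
        # Emphasis markers stand in for the bold typeface mentioned in claim 71.
        sentence = (f"**{metric} exceeded {threshold:g}** in {exceeded} of "
                    f"{len(samples)} measurements (peak {peak:g}).")
    return Finding(metric, within, len(samples), sentence)


def build_report(findings: Iterable[Finding]) -> str:
    """Add each selected information item to the report, one sentence per line."""
    return "\n".join(finding.sentence for finding in findings)
```

For example, `analyze("CPU usage (%)", [40, 55, 95, 97, 60], threshold=90)` counts three measurements within range and selects the emphasized sentence, because two of five samples is more than the assumed 10% tolerance. Each sentence carries both the threshold value and a measured peak, in the spirit of claim 69.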
PCT/US2002/014885 2001-05-11 2002-05-10 Managing a remote device WO2002093399A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US09/853,839 2001-05-11
US09/853,839 US20020169871A1 (en) 2001-05-11 2001-05-11 Remote monitoring
US09/954,819 US20030055931A1 (en) 2001-09-18 2001-09-18 Managing a remote device
US09/954,819 2001-09-18

Publications (1)

Publication Number Publication Date
WO2002093399A1 (en) 2002-11-21

Family

ID=27127193

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2002/014885 WO2002093399A1 (en) 2001-05-11 2002-05-10 Managing a remote device

Country Status (1)

Country Link
WO (1) WO2002093399A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5758071A (en) * 1996-07-12 1998-05-26 Electronic Data Systems Corporation Method and system for tracking the configuration of a computer coupled to a computer network
US5958010A (en) * 1997-03-20 1999-09-28 Firstsense Software, Inc. Systems and methods for monitoring distributed applications including an interface running in an operating system kernel

Similar Documents

Publication Publication Date Title
US20020169871A1 (en) Remote monitoring
US6148335A (en) Performance/capacity management framework over many servers
US7979857B2 (en) Method and apparatus for dynamic memory resource management
US7979863B2 (en) Method and apparatus for dynamic CPU resource management
US7412709B2 (en) Method and apparatus for managing multiple data processing systems using existing heterogeneous systems management software
US6871228B2 (en) Methods and apparatus in distributed remote logging system for remote adhoc data analysis customized with multilevel hierarchical logger tree
US7076397B2 (en) System and method for statistical performance monitoring
EP1150212B1 (en) System and method for implementing polling agents in a client management tool
US9306975B2 (en) Transmitting aggregated information arising from appnet information
US6434613B1 (en) System and method for identifying latent computer system bottlenecks and for making recommendations for improving computer system performance
US20030055931A1 (en) Managing a remote device
US20090064086A1 (en) Systems and methods for packaging an application
US20080141240A1 (en) Verification of successful installation of computer software
US20110060827A1 (en) Managing application system load
US11818152B2 (en) Modeling topic-based message-oriented middleware within a security system
US9311598B1 (en) Automatic capture of detailed analysis information for web application outliers with very low overhead
CA2948700A1 (en) Systems and methods for websphere mq performance metrics analysis
US20070168053A1 (en) Framework for automatically analyzing I/O performance problems using multi-level analysis
US7275250B1 (en) Method and apparatus for correlating events
US20080271011A1 (en) Method and Apparatus for a Client Call Service
US20070250363A1 (en) Enumerating Events For A Client
WO2002093399A1 (en) Managing a remote device
US8990377B2 (en) Method to effectively collect data from systems that consists of dynamic sub-systems
Brim et al. M3c: managing and monitoring multiple clusters
US9448858B2 (en) Environment manager

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP