WO2004092971A1

WO2004092971A1 - Server allocation control method

Info

Publication number: WO2004092971A1
Application number: PCT/JP2003/004679
Authority: WO
Inventors: Yasuhiro Kokusho; Satoshi Tutiya; Tsutomu Kawai
Original assignee: Fujitsu Limited
Priority date: 2003-04-14
Filing date: 2003-04-14
Publication date: 2004-10-28
Also published as: JP3964909B2; JPWO2004092971A1

Abstract

Servers in a data center are allocated to network services automatically and in real time by a load distributing device, without any manual operation. The variation in the quantity of requests arriving at each network service is monitored, and the quantity of requests after a predetermined elapsed time is predicted. The quantity of servers allocated to the network service is controlled according to the magnitude of the predicted request quantity. The quantity of servers allocated to the network service is determined so that, when traffic of the quantity indicated by the predicted value arrives at the network service, the average response time to user terminals is equal to or less than a response time threshold predetermined by the operation manager. So that the server group includes only the minimum number of servers necessary for operating the network service, it is judged each time a server is added or eliminated whether or not the response time based on the predicted value is within a predetermined range.

Description

Server allocation control method

The present invention relates to a method for dynamically changing the configuration of a group of servers assigned to a network service so that the plurality of servers providing the network service achieve a certain level of response time.

Background art

Currently, various network services are provided via networks such as the Internet. So-called service providers (xSPs) that provide network services, such as Internet service providers (ISPs) and application service providers (ASPs), may outsource the maintenance and management of the servers that provide those services to a data center operator.

A data center operator has multiple servers in its own data center and allocates some of them to each network service. That is, a server group consisting of a plurality of servers is formed for each network service, and a service request for a network service is processed by one of the servers belonging to the corresponding server group. In addition, in its contract with each xSP, the data center operator guarantees service users a certain level of service, such as reliability, maintainability, availability, and response time.

If the number of servers assigned to a network service is too large, some servers will have a low operating rate and the servers will not be used effectively. Conversely, if the number of servers is too small, the service level cannot be kept at or above the guaranteed level. In addition, the load on the servers rises and falls with the number of service requests from the multiple user terminals connected to the network, and fluctuates from moment to moment. Therefore, when the data center provides multiple network services, the load differs from service to service. To keep the service from falling below the guaranteed level, the data center operator must therefore change the configuration of the server groups, reassigning servers from lightly loaded network services, or unused servers, to heavily loaded network services, and thereby use the servers it owns effectively.

In order to make such a configuration change possible, the data center operator installs a load balancer in front of the multiple servers it owns and, by changing the settings of the load balancer, adds a specific server to the group of servers that provide a certain network service or deletes a specific server from that group. The load balancer also spreads the load over the servers belonging to a server group by distributing service requests from user terminals according to a preset distribution ratio.

The settings of the load balancer can be changed manually based on the experience and judgment of the data center operation manager. It has also been proposed to judge from the current operation status (for example, from the CPU operation rate) whether the service level is above a certain level, and to change the settings of the load balancer automatically (see Patent Document 1).

However, the load on network services fluctuates in a complex manner with the time of day, seasonal factors, and human factors, and the fluctuations differ from one network service to another. It is difficult for an operation manager to predict the fluctuation pattern accurately based only on his or her experience and judgment. The conventional method does allow the load balancer to be set automatically in real time without manual intervention, but it allocates servers only according to the current operation status and the set service level, without judging whether more or fewer servers will be needed in the future. There is therefore a problem that the servers assigned according to the current operation status are not always the optimal allocation for the future.

(Patent Document 1) Japanese Patent Application Laid-Open No. 2002-224192

Disclosure of the Invention

An object of the present invention is to provide a method and a program for dynamically controlling the configuration of a plurality of servers assigned to a network service according to the number of requests from user terminals. A further object is to provide a method and a program for predicting, from the fluctuation of the number of requests arriving at each network service, the number of requests that will arrive after a certain period of time, and for controlling server allocation to each network service in accordance with the predicted value.

The above object is achieved by the invention according to claim 1, which provides a method for adjusting the number of servers in a server group in a network system having a plurality of user terminals connected to a network and a server group including a plurality of servers that are connected to the network and process requests from the plurality of user terminals. In this method, the number of requests from the plurality of user terminals is accumulated for each predetermined time; the characteristic of the number of requests against time is obtained as a function based on the accumulated numbers of past requests; the future number of requests is predicted by substituting a future time into the function; a first average response time per server is obtained by substituting the predicted number of requests into a relational expression between the average response time per server and the number of requests, the relational expression being obtained on the assumption that the number of requests from the plurality of user terminals follows a predetermined probability distribution; it is determined whether the first average response time is a positive value and falls within a range equal to or less than a preset threshold; and the number of servers included in the server group is increased or decreased according to the result of the determination.

The above object is also achieved by the invention according to claim 2, which provides the method of claim 1 wherein, when the result of the determination is that the first average response time is included in the range, (a) one server included in the server group is selected, (b) a server group from which the selected server has been excluded is assumed, (c) a second average response time per server is obtained for the plurality of servers included in the assumed server group, (d) a new determination is made as to whether the second average response time is included in the range, and (e) when the result of the new determination is that the second average response time is included in the range, the selected server is excluded from the configuration of the server group.
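The prediction and range check of claim 1 can be sketched as follows. This is only an illustration under stated assumptions: the straight-line least-squares fit, the M/M/1 relation T = 1/(μ − λ) used as the "relational expression", and all names (`predict_requests`, `avg_response_time`, `in_range`) are hypothetical and are not taken from this text, which says only that a function is fitted to past counts and that requests follow a predetermined probability distribution.

```python
from typing import Sequence


def predict_requests(times: Sequence[float], counts: Sequence[float],
                     future_time: float) -> float:
    """Fit counts = a*t + b by ordinary least squares and evaluate at future_time."""
    n = len(times)
    mean_t = sum(times) / n
    mean_c = sum(counts) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (c - mean_c) for t, c in zip(times, counts))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_c - slope * mean_t
    return slope * future_time + intercept


def avg_response_time(arrival_rate: float, service_rates: Sequence[float],
                      ratios: Sequence[float]) -> float:
    """Mean over servers of the ASSUMED M/M/1 response time 1 / (mu_i - lambda_i).

    Returns -1.0 (a non-positive value) if any server would be overloaded.
    """
    per_server = []
    for mu, ratio in zip(service_rates, ratios):
        lam = arrival_rate * ratio  # share of the traffic sent to this server
        if mu <= lam:
            return -1.0
        per_server.append(1.0 / (mu - lam))
    return sum(per_server) / len(per_server)


def in_range(response_time: float, threshold: float) -> bool:
    """True if the response time is positive and not above the threshold."""
    return 0.0 < response_time <= threshold


# Illustrative numbers only: requests per second observed at t = 0..3 s.
history_t = [0, 1, 2, 3]
history_c = [100, 110, 120, 130]
predicted = predict_requests(history_t, history_c, future_time=5)
t_avg = avg_response_time(predicted, service_rates=[100.0, 100.0], ratios=[0.5, 0.5])
print(predicted, t_avg, in_range(t_avg, threshold=0.05))
```

The positive-value condition matters in this model because an overloaded server yields a non-positive result, which must not be mistaken for a short response time.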

The above object is also achieved by the invention according to claim 3, which provides the method of claim 2 wherein, after (e), (a) to (e) are repeated, and the servers included in the server group are excluded from its configuration one by one until the result of the new determination is that the second average response time is no longer included in the range.
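Steps (a) to (e) and their repetition can be sketched as a loop. The function name `shrink_group`, the choice of always trying the last server in the list, and the toy response-time model are assumptions made for illustration; the text does not say which server is selected or give the response-time formula.

```python
from typing import Callable, List


def shrink_group(servers: List[str],
                 response_time: Callable[[List[str]], float],
                 threshold: float) -> List[str]:
    """Remove servers one at a time while the predicted average response time
    of the remaining servers stays positive and at or below the threshold.

    The caller is assumed to have already confirmed that the current group
    is within the range.
    """
    group = list(servers)
    while len(group) > 1:
        candidate = group[:-1]           # (a), (b): assume the last server is removed
        t2 = response_time(candidate)    # (c): second average response time
        if not (0.0 < t2 <= threshold):  # (d): new determination
            break                        # out of range: keep the server and stop
        group = candidate                # (e): actually exclude the server
    return group


def toy_model(group: List[str]) -> float:
    """Hypothetical model: 150 req/s predicted, split equally, 100 req/s per server."""
    spare = 100.0 - 150.0 / len(group)
    return 1.0 / spare if spare > 0 else -1.0


print(shrink_group(["WEB-A", "WEB-B", "WEB-C", "WEB-D"], toy_model, threshold=0.05))
```

Evaluating the candidate group before committing the removal keeps the live configuration within the range at every step.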

The above object is also achieved by the invention according to claim 4, which provides the method of claim 1 wherein the network system further has an unused server group including a plurality of unused servers connected to the network, and, when the result of the determination is that the average response time is not included in the range, one server included in the unused server group is selected and the selected server is added to the server group.
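Adding servers from the unused server group, applied repeatedly until the range is met, can be sketched as below. The name `grow_group`, the first-in-list selection from the pool, and the toy response-time model are illustrative assumptions; the starting group is assumed to be non-empty.

```python
from typing import Callable, List, Tuple


def grow_group(servers: List[str], unused: List[str],
               response_time: Callable[[List[str]], float],
               threshold: float) -> Tuple[List[str], List[str]]:
    """Move servers from the unused pool into the group, one at a time, until the
    predicted average response time is positive and at or below the threshold,
    or until the pool is exhausted."""
    group, pool = list(servers), list(unused)
    t = response_time(group)
    while not (0.0 < t <= threshold) and pool:
        group.append(pool.pop(0))   # select one unused server and add it
        t = response_time(group)    # re-evaluate with the enlarged group
    return group, pool


def toy_model(group: List[str]) -> float:
    """Hypothetical model: 150 req/s predicted, split equally, 100 req/s per server."""
    spare = 100.0 - 150.0 / len(group)
    return 1.0 / spare if spare > 0 else -1.0


new_group, remaining = grow_group(["WEB-A"], ["WEB-H", "WEB-I", "WEB-J"],
                                  toy_model, threshold=0.03)
print(new_group, remaining)
```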

The above object is also achieved by the invention according to claim 5, which provides the method of claim 4 wherein (f) a third average response time per server is obtained for the plurality of servers included in the new server group after the selected server has been added, (g) a new determination is made as to whether the third average response time falls within the range, (h) when the result of the new determination is that the third average response time is not included in the range, one server included in the unused server group is selected, and (i) the selected server is added to the new server group; after (i), (f) to (i) are repeated, and the servers included in the unused server group are added to the server group one at a time until the result of the new determination is that the third average response time is included in the range.

The above object is also achieved by the invention according to claim 6, which provides a program for a resource allocation control device in a network system having a plurality of user terminals connected to a network; a server group including a plurality of servers that are connected to the network and process requests from the user terminals; a load balancing device that is connected to the network and includes storage means for storing, at predetermined time intervals, the number of requests from the user terminals, the distribution rate of the number of requests, and the configuration information of the server group; and the resource allocation control device connected to the network. The program causes the resource allocation control device to obtain, as a function, the characteristic of the number of requests against time based on the numbers of past requests stored in the load balancing device; to predict the future number of requests by substituting a future time into the function; to obtain a first average response time per server by substituting the predicted number of requests into a relational expression between the average response time per server and the number of requests, the relational expression being obtained on the assumption that the number of requests from the plurality of user terminals follows a predetermined probability distribution; to determine whether the first average response time is a positive value and is included in a range equal to or less than a preset threshold; and to change the configuration information so as to increase or decrease the number of servers included in the server group according to the result of the determination.

The above object is also achieved by the invention according to claim 7, which provides the program of claim 6 that, when the result of the determination is that the average response time is included in the range, causes the resource allocation control device to (a) select one server included in the server group, (b) assume a server group from which the selected server has been excluded, (c) obtain a second average response time per server for the plurality of servers included in the assumed server group, (d) make a new determination as to whether the second average response time is included in the range, and (e) when the result of the new determination is that the second average response time is included in the range, delete the selected server from the configuration information.

The above object is also achieved by the invention according to claim 8, which provides the program of claim 7 that, after (e), performs (a) to (e) again and deletes the servers included in the server group from the configuration information one by one until the result of the new determination is that the second average response time is no longer included in the range.

The above object is also achieved by the invention according to claim 9, which provides the program of claim 6 wherein the network system further has an unused server group including a plurality of unused servers connected to the network, and, when the result of the determination is that the average response time is not included in the range, the program selects one server included in the unused server group and adds the selected server to the configuration information.

The above object is also achieved by the invention according to claim 10, which provides the program of claim 9 that (f) obtains a third average response time per server for the plurality of servers included in the new server group after the selected server has been added, (g) makes a new determination as to whether the third average response time falls within the range, (h) when the result of the new determination is that the third average response time is not included in the range, selects one server included in the unused server group, and (i) adds the selected server to the new server group; after (i), (f) to (i) are performed again, and the servers included in the unused server group are added to the configuration information one by one until the result of the new determination is that the third average response time is included in the range.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 is a diagram illustrating an example of the configuration of the entire system according to an embodiment of the present invention.

FIG. 2 is a block diagram illustrating a configuration example of a load distribution device.

FIG. 3 is a block diagram showing a configuration example of a mobile terminal such as a mobile phone or a PDA used as a user terminal.

FIG. 4 is a block diagram illustrating a configuration example of a server.

FIG. 5 is a diagram illustrating an example of a data configuration of server group configuration information stored in the RAM of the load distribution device.

FIG. 6 is a diagram illustrating a data configuration example of statistical information stored in the RAM of the load balancing device.

FIG. 7 is a diagram illustrating a data configuration example of data center configuration information stored in the RAM of the resource allocation control device.

FIG. 8 is a diagram showing an example of a data configuration of a table, stored in the RAM of the resource allocation control device, in which the number of requests processed per second by a server having a CPU of the reference clock frequency is stored for each application.

FIG. 9 is a flowchart illustrating the server group new creation processing.

FIGS. 10 and 11 are flowcharts illustrating the server allocation adjustment processing.

FIG. 12 is a diagram for explaining a prediction method using the least squares method.

BEST MODE FOR CARRYING OUT THE INVENTION

Hereinafter, embodiments of the present invention will be described with reference to the drawings. However, the technical scope of the present invention is not limited to such an embodiment.

The embodiment of the present invention is described in the following order: a configuration example of the entire system to which the method of the present invention is applied, configuration examples of the devices included in the system, examples of the data configurations stored in those devices, and the operation flow. First, a configuration example of the entire system to which the method of the present invention is applied will be described. FIG. 1 is a diagram illustrating an example of the configuration of the entire system according to an embodiment of the present invention. In the embodiment of the present invention, for simplicity, all servers that provide network services are described as servers on which web servers operate. The user terminal sends, as a service request, an http request including the address of the homepage to be viewed, and the web server sends the content corresponding to that homepage address to the user terminal.

First, there is a network 2 that connects a plurality of user terminals 1 and a plurality of servers 10 in a data center 3. Network 2 can be wired or wireless. The network 2 may be a LAN (Local Area Network) or a WAN (Wide Area Network), and the network 2 may be an internet connecting LANs and WANs.

As the user terminal 1, a mobile terminal such as a PC (personal computer), a mobile phone, or a PDA (Personal Digital Assistant) is used. The plurality of servers 10 in the data center 3 are grouped for each network service to be provided.

For example, FIG. 1 shows server groups 7₁ to 7₃. Server groups 7₁ and 7₂ each provide a separate network service as web servers. Server group 7₃ consists of unused servers that are not assigned to any network service. Each of the servers 10 in the data center is connected to the LAN 6 in the center. The LAN 6 in the center may be wired or wireless.

A request sent from the user terminal 1 is processed by a server 10 belonging to the server group that provides the network service corresponding to the request, and the network service is provided by returning the result to the user terminal. For example, if server group 7₁ provides the homepage content of company A, a server in server group 7₁ sends the content corresponding to the address to the user terminal that sent company A's homepage address as a browsing request (http request).

A load balancer 4 is provided at a stage preceding the servers in the data center, and is connected to the LAN 6 in the center and to the network 2. The load balancer 4 determines the server group corresponding to a request from a user terminal and distributes the requests so that the load on the servers belonging to that server group is not concentrated on a specific server.

Further, the load balancer 4 counts the number of requests (http requests) transmitted via the network 2 at regular time intervals for each server group 7 and accumulates the information as statistical information. In addition, a resource allocation control device 5 for controlling the configuration of the server group 7 by adding or deleting servers 10 belonging to the server group 7 is connected to the LAN 6 in the center.

Although not shown in FIG. 1, a configuration including a gateway, a router, a firewall, and the like between the load distribution device 4 and the network 2 is also possible. Although not shown in the figure, a storage device such as a disk array may be connected to the outside of the server 10 if the hard disk provided inside the server is insufficient.

Next, a configuration example of the devices included in the system of FIG. 1 will be described.

FIG. 2 is a block diagram showing a configuration example of the load distribution device 4. The CPU 21 executes the control in the load distribution device 4. A ROM (Read Only Memory) 22 stores a program executed when the load distribution device 4 is initialized, data necessary for the initialization, and a control program transferred to the RAM 23 at the time of initialization.

A RAM (Random Access Memory) 23 stores a control program and data such as a work result when the control program is executed. The communication device 24 has an interface for connection to the network 2 and the LAN 6 in the center, and enables transmission and reception of data to and from devices connected through the network 2 and the LAN 6 in the center. These components are connected to one another by a connection line 25 as shown in FIG. 2.

FIG. 3 is a block diagram showing a configuration example of a mobile terminal such as a mobile phone or a PDA used as the user terminal 1. The CPU 31 executes control in the mobile terminal. The ROM 32 stores a program executed when the mobile terminal is initialized, data necessary for initialization, and a control program transferred to the RAM 34 at the time of initialization.

The RAM 34 stores a control program and data such as a work result when the control program is executed. The communication device 35 has an interface for connection to the network 2 and the LAN 6 in the center, and enables data transmission and reception with devices connected via the network 2 or the LAN 6 in the center.

The input device 36 is a keypad, a pen-type input device, or the like, and is used by a user for inputting various commands and data. The display device 33 is a liquid crystal screen or the like, and displays a control result or the like of the CPU 31 to a user. These components are connected to one another by a connection line 37 as shown in FIG. 3.

FIG. 4 is a block diagram showing a configuration example of the resource allocation control device 5. The CPU 41 executes the control in the resource allocation control device 5. The ROM 42 stores a program executed when the resource allocation control device 5 is initialized and data necessary for the initialization. The hard disk 46 stores OS (Operating System) data for controlling the resource allocation control device 5.

The RAM 45 stores data such as a work result when the OS is executed. The communication device 48 has an interface for connecting to the network 2 and the LAN 6 in the center, and enables data transmission and reception to and from devices connected through the network 2 and the LAN 6 in the center.

The input device 47 is a keyboard, a mouse, or the like, and is used by a user to input various commands and data. The display device 43 is a liquid crystal monitor, a CRT, or the like, and displays a control result of the CPU to a user. The optical drive 44 is used to write data to or read data from media such as CD, DVD, and MO.

However, since the administrator can log in to the resource allocation control device 5 from the server 10 connected via the network and remotely control the resource allocation control device 5, a configuration without the input device 47 and the display device 43 may be used. The configuration of the PC used as the server 10 and the user terminal 1 is the same as that shown in FIG.

Next, an example of a data configuration stored in a device included in the system will be described.

FIG. 5 is a diagram illustrating a data configuration example of the server group configuration information stored in the RAM 23 of the load distribution device 4. As server group configuration information, the server group name 51, the representative IP address 52, the response time threshold 53, and, for each server belonging to the group, the server name 54, the IP address 55, and the distribution ratio 56 are stored for each server group.

The server group name 51 is a management name for specifying the server group 7. The representative IP address 52 is a number consisting of 32 bits for IPv4 and 128 bits for IPv6, and is the IP address that is disclosed to the outside for the network service provided by the server group 7. That is, when using the network service provided by server group 7₁, the user terminal 1 sends a request (http request) to the representative IP address 52 of server group 7₁, and when using the network service provided by server group 7₂, the user terminal 1 sends a request to the representative IP address 52 of server group 7₂.

A global IP address is used as the representative IP address 52, but a private IP address may be used if the network service is intended for a limited organization. The response time threshold 53 is one of the service levels that the xSP requires when contracting with the data center; the data center operator assigns servers 10 to the server group 7 that provides the network service, and manages the data center 3, so that the response time to the user terminal 1 does not exceed the response time threshold 53. The server name 54 of the belonging server information is a management name for specifying a server 10 belonging to each server group 7. The IP address 55 is the IP address of the server corresponding to the server name 54. A private IP address is used as the IP address 55, but a global IP address can also be used if there is a surplus of global IP addresses.

The distribution ratio 56 is the proportion of the requests sent from the user terminal 1 to the representative IP address 52 of the server group 7 that each belonging server 10 processes. Using this server group configuration information, the load balancer 4 searches for the representative IP address that matches the destination IP address of a request (http request) transmitted from the user terminal 1 and thereby identifies the server group 7, selects the server that will process the request based on the distribution ratios 56 of the servers belonging to that server group 7, translates the destination IP address of the request into the IP address 55 of the selected server, and transfers the request to that server.

For example, FIG. 5 stores information about two server groups. In server group A, the representative IP address 52 is GIP1, the response time threshold 53 is ST1, the server names 54 are WEB-A, WEB-B, WEB-C, and WEB-D, the IP addresses 55 are PIP1, PIP2, PIP3, and PIP4, and the distribution ratios 56 are 0.5, 0.3, 0.1, and 0.1, respectively. In other words, out of every 10 requests, 5 are distributed to WEB-A, 3 to WEB-B, 1 to WEB-C, and 1 to WEB-D. FIG. 5 also stores the representative IP address 52 of server group B and information about the three servers that belong to server group B.
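The lookup and distribution step can be illustrated with the server group A values quoted above. This is a sketch, not the device's actual logic: the dictionary layout, the function name `route`, and the use of weighted random selection are assumptions, since the text does not say how the distribution ratio is applied to individual requests. `GIP1`, `ST1`, and `PIP1` to `PIP4` are the symbolic placeholders from FIG. 5, kept here as plain strings.

```python
import random
from collections import Counter

# Server group configuration information, modelled on the FIG. 5 example.
SERVER_GROUPS = {
    "A": {
        "representative_ip": "GIP1",
        "response_time_threshold": "ST1",
        # (server name, IP address, distribution ratio)
        "servers": [
            ("WEB-A", "PIP1", 0.5),
            ("WEB-B", "PIP2", 0.3),
            ("WEB-C", "PIP3", 0.1),
            ("WEB-D", "PIP4", 0.1),
        ],
    },
}


def route(dest_ip, groups, rng):
    """Return (server name, server IP) for a request addressed to dest_ip,
    or None if no server group has that representative IP address."""
    for group in groups.values():
        if group["representative_ip"] == dest_ip:
            servers = group["servers"]
            weights = [ratio for _, _, ratio in servers]
            name, ip, _ = rng.choices(servers, weights=weights, k=1)[0]
            return name, ip  # the request's destination would be rewritten to ip
    return None


rng = random.Random(0)  # seeded only to make the illustration repeatable
counts = Counter(route("GIP1", SERVER_GROUPS, rng)[0] for _ in range(10_000))
print(counts.most_common())
```

Over many requests the share handled by each server approaches its distribution ratio; a deterministic weighted round-robin would be an equally valid reading of the text.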

FIG. 6 is a diagram illustrating an example of a data configuration of statistical information stored in the RAM 23 of the load distribution device 4. As statistical information, the number of arriving requests totaled at fixed time intervals is stored for each server group. For example, FIG. 6 stores the number of arriving requests per second: it records that, during the time 61 from T₁ to T₂, R₁₁ requests arrived at server group A and R₂₁ requests arrived at server group B. The latest data is appended at the bottom of the table in FIG. 6. In this way, information on the number of arrival requests 62 for the last n seconds (n is a natural number) before the current time is stored.
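One way to keep only the last n seconds of per-group counts is a bounded queue per server group. The class name `RequestStats` and its methods are hypothetical; the text specifies only that the newest row is appended and that n seconds of history are retained.

```python
from collections import deque


class RequestStats:
    """Per-server-group arrival counts for the most recent window_seconds intervals."""

    def __init__(self, group_names, window_seconds):
        # deque(maxlen=n) silently drops the oldest entry once n are stored
        self.window = {name: deque(maxlen=window_seconds) for name in group_names}

    def record(self, counts):
        """Append one interval's counts, e.g. {"A": 120, "B": 45}."""
        for name, history in self.window.items():
            history.append(counts.get(name, 0))

    def history(self, name):
        """Counts for one group, oldest first."""
        return list(self.window[name])


stats = RequestStats(["A", "B"], window_seconds=3)
for a, b in [(100, 40), (110, 42), (120, 45), (130, 47)]:
    stats.record({"A": a, "B": b})
print(stats.history("A"), stats.history("B"))
```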

FIG. 7 is a diagram showing a data configuration example of the data center configuration information stored in the RAM 45 of the resource allocation control device 5. The data center configuration information stores the server name 71, IP address 72, CPU clock speed 73, and assigned server group name 74 of each server 10, including unused servers not assigned to network services. The server name 71 and the IP address 72 are the same as the server name 54 and the IP address 55 in FIG.

The CPU clock speed 73 stores a value obtained by dividing the clock frequency of the CPU 41 mounted on the server 10 by the reference clock frequency. In the embodiment of the present invention, the reference clock frequency is 1 GHz. The CPU clock speed 73 thus shows how many times the reference clock frequency the clock frequency of the CPU mounted on the server 10 is. For a server 10 having a plurality of CPUs 41, a value obtained by dividing the total of their clock frequencies by the reference clock frequency is stored.

If the average request processing count of a server equipped with a CPU of the reference clock frequency (1 GHz) is denoted by μ, the average request processing count of a given server is calculated as the value of its CPU clock speed 73 multiplied by μ. The belonging server group name 74 specifies the server group 7 to which the server 10 of each server name 71 belongs.
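The clock-speed normalization and the resulting per-server processing rate amount to two small calculations. The function names and the example value of 200 requests per second for a reference-clock server are assumptions chosen only to make the arithmetic concrete.

```python
REFERENCE_CLOCK_GHZ = 1.0  # reference clock frequency used in the embodiment


def cpu_clock_speed(cpu_clocks_ghz):
    """CPU clock speed 73: total clock frequency of all CPUs in the server,
    divided by the reference clock frequency."""
    return sum(cpu_clocks_ghz) / REFERENCE_CLOCK_GHZ


def average_processing_rate(cpu_clocks_ghz, mu_reference):
    """Average requests processed per second by this server, given the rate
    mu_reference of a server with a single reference-clock CPU."""
    return cpu_clock_speed(cpu_clocks_ghz) * mu_reference


single = average_processing_rate([2.0], mu_reference=200.0)       # one 2 GHz CPU
dual = average_processing_rate([1.5, 1.5], mu_reference=200.0)    # two 1.5 GHz CPUs
print(single, dual)
```

Treating capacity as proportional to total clock frequency is the simplification this embodiment adopts; it ignores memory, I/O, and multi-CPU scaling effects.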

Unlike the server group configuration information in FIG. 5, the data center configuration information also stores information on unused servers. For example, in FIG. 7, it can be seen that the four servers whose server names 71 are WEB-H, WEB-I, WEB-J, and WEB-K are unused servers.

FIG. 8 is a diagram showing a data configuration example of a table, stored in the RAM 45 of the resource allocation control device 5, in which the number of requests processed per second by a server equipped with a CPU 41 of the reference clock frequency (1 GHz) is stored for each Web server software 81. Each entry in FIG. 8 is the number of requests processed per second when a plurality of http requests are sent to a server that is equipped with a CPU 41 of the reference clock frequency (1 GHz) and runs the given application as its web server software; for example, when application Aₙ is used, cₙ requests are processed per second. This data is collected in advance and input to the resource allocation control device beforehand.

Next, an operation flow for explaining the method of the present invention will be described.

FIG. 9 is a flowchart illustrating the server group new creation processing. When a new network service is to be provided, operation cannot start until the server group that provides the network service has been defined, so this processing is executed first. First, the resource allocation control device 5 receives a server group new creation request (S91). Normally, a new server group 7 is created at the initiative of the operation manager; here, the device receives a command that the operation manager enters through the input device 47 provided in the resource allocation control device 5. It also receives, as command arguments, the name of the newly created server group, the initial number of servers to be assigned, and the response time threshold.

Next, the name of the newly created server group, the initial number of servers to be allocated, and the response time threshold are stored (S92). In step S92, the information received as the command argument is temporarily stored in the RAM 45 provided in the resource allocation control device 5. Next, the data center configuration information is updated so that the servers for the initial number received in step S91 are added to the newly created server group (S93). In step S93, the servers corresponding to the initial number received in step S91 are selected from the entry of the server in which the server group name 74 is "unused" in the data center configuration information in FIG. Then, the field of the server group name 74 of the selected server may be changed to the name of the newly created server group received in step S91. Then, the representative IP address to be assigned to the newly created server group and the distribution rate of each server belonging to the server group are determined (S94). The representative IP address and the distribution ratio are set in the load distribution device 4, but are determined by the resource allocation control device 5.

The representative IP address is determined by selecting an arbitrary one from a set of unused IP addresses (reserved exclusively for use as representative IP addresses) stored in the hard disk of the resource allocation control device 5. The method of determining the initial value of the distribution rate is not particularly limited as long as the sum of the distribution rates is 1; for example, the value obtained by dividing 1 by the initial number received in step S91 may be used as the distribution rate of each server.

After step S94, new configuration information is transmitted to the load balancer (S95), and the server group new creation process ends. In step S95, the resource allocation control device transmits to the load balancer the name of the newly created server group, the representative IP address, the response time threshold stored in step S92, the names of the initial number of servers selected in step S93, and the distribution rates determined in step S94.
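The creation flow of steps S91 to S95 can be sketched in Python as follows. This is only an illustration under stated assumptions: the class names `Server` and `GroupConfig`, the function `create_server_group`, and the in-memory lists standing in for the data center configuration information and the pool of representative IP addresses are all hypothetical and do not appear in the source.

```python
from dataclasses import dataclass

UNUSED = "unused"  # marker used in the server group name field

@dataclass
class Server:
    name: str
    ip: str
    clock_ghz: float
    group: str = UNUSED

@dataclass
class GroupConfig:
    name: str
    representative_ip: str
    response_time_threshold: float
    distribution: dict  # server name -> distribution rate

def create_server_group(servers, free_ips, name, initial_count, threshold):
    """Sketch of S91-S95; returns the information sent to the load balancer."""
    if initial_count < 1:
        raise ValueError("initial_count must be at least 1")
    unused = [s for s in servers if s.group == UNUSED]
    if len(unused) < initial_count:
        raise ValueError("not enough unused servers")
    if not free_ips:
        raise ValueError("no unused representative IP address")
    chosen = unused[:initial_count]
    for s in chosen:                 # S93: move servers into the new group
        s.group = name
    rep_ip = free_ips.pop()          # S94: any unused representative IP
    rate = 1.0 / initial_count       # S94: equal rates, so the sum is 1
    return GroupConfig(name, rep_ip, threshold,
                       {s.name: rate for s in chosen})
```

The equal split is the example given in the text; any other initial assignment whose rates sum to 1 would fit the description equally well.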

When a new server group is created, the processing in FIG. 9 is executed. To determine whether the servers belonging to an existing server group are optimal for providing the network service, the resource allocation control device 5 performs server allocation adjustment processing. Here, the CPU clock frequency is used as a numerical value for measuring the processing capacity of the server group, and the processing capacity of the server group is adjusted simply by changing the number of servers. FIGS. 10 and 11 are flowcharts illustrating the server allocation adjustment processing. The resource allocation control device 5 periodically executes the server allocation adjustment processing to determine whether the servers allocated to the server group providing each network service are optimal. It is assumed that the table data in FIG. 8 has been input before this processing.

First, one server group that provides a network service is selected (S101). The resource allocation control device 5 refers to the belonging server group name 74 of the data center configuration information in FIG. 7 and selects one server group name other than “unused”.

Then, the number of arrival requests of the server group selected in step S101 after 60 seconds is predicted (S102). In step S102, first, the resource allocation control device 5 queries the load distribution device 4 for the number of arrival requests of the server group selected in step S101 over the latest 300 seconds. The load distribution device 4 refers to the statistical information in FIG. 6 and sends the number of arrival requests 62 for the latest 300 seconds of the corresponding server group to the resource allocation control device 5. Then, the number of arrival requests 60 seconds from now is predicted using the least squares method based on the number of arrival requests obtained for the 300 seconds.

FIG. 12 is a diagram for explaining a prediction method using the least squares method. Figure 12 shows the time (seconds) on the horizontal axis and the number of arrival requests (number) on the vertical axis, and plots data for 300 seconds on the coordinate plane at 1-second intervals. On the coordinate plane, a straight line that minimizes the distance in the vertical axis direction from each plotted point is obtained by the least squares method.

That is, α and β are obtained when the straight line is expressed as Y = α * X + β. The value obtained by substituting X = 60 into the obtained straight-line equation is the predicted value of the number of arrival requests after 60 seconds. If the calculated predicted value is smaller than 0, the predicted value after 60 seconds is assumed to be 0. In FIG. 12, the number of arrivals in each time interval is plotted as the number of arrivals at the midpoint of the time interval.
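The prediction of step S102 can be illustrated with a short Python sketch. It assumes, as the text suggests, that the current time is X = 0, that the samples are per-second arrival counts ordered from oldest to newest, and that each count is placed at the midpoint of its interval. The function name `predict_arrivals` and its parameters are hypothetical.

```python
def predict_arrivals(counts, horizon=60.0):
    """Fit Y = alpha * X + beta by least squares and evaluate at X = horizon.

    counts[k] is the number of arrivals in the k-th one-second interval,
    oldest first; the newest interval ends at X = 0 (the current time).
    """
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two samples")
    # Midpoint of each one-second interval, relative to the current time.
    xs = [-(n - k) + 0.5 for k in range(n)]
    mean_x = sum(xs) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, counts))
    alpha = sxy / sxx
    beta = mean_y - alpha * mean_x
    # A negative prediction is treated as 0, as stated in the text.
    return max(0.0, alpha * horizon + beta)
```

With 300 samples and `horizon=60.0` this corresponds to the 300-second window and 60-second look-ahead of the embodiment; both values can be changed without altering the method.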

In step S102, the number of arrival requests after 60 seconds is predicted. However, the present invention is not limited to 60 seconds, and the prediction time can be determined according to the policy of the operation manager. In addition, although the latest 300 seconds of data is used to predict the number of arrival requests after 60 seconds, the present invention is not limited to this. In the present embodiment, as a guide, data covering five times the interval between the current time and the predicted future time is used.

Returning to FIG. 10, the description will be continued. Next, based on the number of arrival requests predicted in step S102, the average response time T after 60 seconds per server belonging to the server group selected in step S101 is calculated (S103).

The response time T is calculated by the following equation (A).

$$T = \frac{1}{\mu \sum_{i=1}^{n} f_i - R} \tag{A}$$

Here, μ is the average number of requests processed per second, at the reference clock frequency, by the Web server software used by the servers belonging to the server group selected in step S101; this value can be obtained by referring to FIG. 8. R is the number of arrival requests for the server group selected in step S101 after 60 seconds, and is the value predicted in step S102. f_i is the relative clock magnification of the i-th server among the n servers belonging to the server group selected in step S101, and is the value calculated from the CPU clock speed 73 of the data center configuration information in FIG. 7.

The derivation method of equation (A) is as follows. Assuming that the entire server group selected in step S101 is a single window in queuing theory, the number of requests processed per second at that window is μ Σ f_i, and the frequency of request arrival at the window is the number of arrival requests R for the entire server group.

Assuming that the arrival of requests in the queue follows the Poisson model, the time that an incoming request waits in the queue is

$$\frac{\rho}{\mu \sum_{i=1}^{n} f_i - R}$$

Here, ρ is the window utilization rate in queuing theory, and is the probability that the window is busy at any time.

The Poisson model guarantees that the queue will not overflow at ρ < 1, which means that no matter how much time passes, the queue length will remain below a certain length. The waiting time means the average response time T of the servers belonging to the server group selected in step S101. Then, assuming that the average response time of the server takes the worst value, ρ is 1, so that

$$T = \frac{1}{\mu \sum_{i=1}^{n} f_i - R}$$

This is equation (A).
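As an illustration of equation (A) and the range check used in the following steps, the Python sketch below assumes the form T = 1/(μ Σ f_i − R) implied by the derivation above. The function names are hypothetical, and returning infinity when the denominator is exactly zero is an assumption made here to avoid a division by zero; the source does not discuss that case.

```python
import math

def average_response_time(mu, arrivals, clock_ratios):
    """Equation (A) as assumed here: T = 1 / (mu * sum(f_i) - R).

    mu           -- requests per second at the reference clock frequency
    arrivals     -- predicted number of arrival requests R
    clock_ratios -- relative clock magnification f_i of each server
    """
    denominator = mu * sum(clock_ratios) - arrivals
    if denominator == 0:
        return math.inf  # assumption: treat exact saturation as unbounded
    # A negative result means the arrivals exceed the group's capacity.
    return 1.0 / denominator

def within_threshold(t, threshold):
    """The range test 0 <= T <= Tp used in steps S104, S107 and S115."""
    return 0 <= t <= threshold
```

For example, with μ = 100, R = 250 and three servers at the reference clock, the denominator is 50 and T is 0.02 seconds; with only two such servers the denominator is negative and the range test fails.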

The response time of the i-th server belonging to the server group selected in step S101 can be derived as follows. Assuming that the i-th server in the server group is a window in queuing theory, the number of requests processed per second at the window is f_i × μ, and the frequency of request arrival at the window is the number of arrival requests R for the entire server group multiplied by the distribution ratio r_i of the i-th server. Therefore, the response time T_i of the i-th server belonging to the server group becomes

$$T_i = \frac{1}{f_i \mu - r_i R} \tag{B}$$

Then, it is determined whether or not the response time T calculated in step S103 satisfies 0≤T≤Tp with respect to a preset response time threshold Tp (S104). If the response time T falls within the range (0≤T≤Tp) in step S104, the service level guaranteed in the contract with the xSP is satisfied, but the allocated servers may be excessive, which means that there may be room for server reduction.

Therefore, if the response time T falls within the predetermined range in step S104, one arbitrary server belonging to the server group is selected (S105). Then, assuming a configuration in which the server selected in step S105 is excluded from the server group selected in step S101, the future response time T in the assumed new configuration is calculated again (S106). Step S106 can be calculated using equation (A) as in step S103.

Then, it is determined whether the response time T calculated in step S106 satisfies 0≤T≤Tp with respect to the response time threshold Tp (S107). In the assumed new configuration, the number of servers is reduced by one, so the value of the denominator of equation (A) becomes smaller, and the response time T becomes larger than the value calculated previously (step S103). Therefore, if the increased value still satisfies 0≤T≤Tp, it is safe to delete one server.

Therefore, if 0≤T≤Tp is satisfied in step S107, the server selected in step S105 is actually returned to the unused servers (S108), and the number of servers is reduced. In step S108, the affiliated server group name 74 corresponding to the server selected in step S105 in the data center configuration information in FIG. 7 is updated from the server group name selected in step S101 to "unused", which means that the server is removed from the configuration.

After step S108, the process returns to step S105 to determine whether there is room for further server reduction. If the response time T does not satisfy 0≤T≤Tp in step S107, the current configuration is known to be the minimum necessary, so the configuration is not changed and the process proceeds to the next step (moving to FIG. 11), where the distribution ratio is calculated (S109).
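The reduction loop of steps S104 to S108 can be sketched in Python as follows. This is a minimal illustration rather than the patented implementation: it assumes equation (A) has the form T = 1/(μ Σ f_i − R), represents each server only by its relative clock magnification, removes the last server in the list where the text allows an arbitrary choice, and never shrinks the group below one server. The names `response_time` and `shrink_group` are hypothetical.

```python
def response_time(mu, arrivals, clock_ratios):
    """Assumed form of equation (A); infinite at exact saturation."""
    denominator = mu * sum(clock_ratios) - arrivals
    return float("inf") if denominator == 0 else 1.0 / denominator

def shrink_group(mu, arrivals, clock_ratios, threshold):
    """Release servers one by one while 0 <= T <= Tp still holds.

    Returns (kept, released) as two lists of clock ratios.
    """
    kept = list(clock_ratios)
    released = []
    # S104: only try to shrink when the current group meets the threshold.
    if not (0 <= response_time(mu, arrivals, kept) <= threshold):
        return kept, released
    while len(kept) > 1:  # assumption: keep at least one server
        candidate = kept[:-1]  # S105/S106: tentatively drop one server
        if 0 <= response_time(mu, arrivals, candidate) <= threshold:  # S107
            released.append(kept.pop())  # S108: return it to "unused"
        else:
            break  # current configuration is the minimum necessary
    return kept, released
```

With μ = 100, R = 250, six reference-clock servers and a threshold of 0.05 seconds, the loop stops at three servers, because removing a fourth would make the denominator negative.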

In step S109 in FIG. 11, the distribution rates of the servers belonging to the server group are calculated by a method described later so that the response times of the servers belonging to the same server group become equal to each other. If the response times of the servers belonging to the same server group are not uniform, the response time of some servers may exceed the response time threshold even if the average response time obtained by averaging over the entire server group is less than the threshold. Therefore, it is necessary to control the distribution ratio of each server so that the response times of the servers belonging to the same server group are equal to each other.

The distribution rate in step S109 in FIG. 11 is calculated by equation (C).

$$r_i = \frac{1}{n}\left(1 - \frac{\mu}{R}\sum_{j=1}^{n} f_j\right) + \frac{\mu}{R} f_i \tag{C}$$

where r_i is the distribution ratio of the i-th server, n is the number of servers belonging to the server group, and μ is the average number of requests processed per second, at the reference clock frequency, by the Web server software used by the servers belonging to the server group selected in step S101. R is the number of arrival requests after 60 seconds for the entire server group selected in step S101, and is the value predicted in step S102. f_i is the relative clock magnification of the i-th server among the n servers belonging to the server group selected in step S101, and is obtained by referring to the CPU clock speed 73 in the data center configuration information in FIG. 7.

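The derivation method of equation (C) is as follows. From equation (B), the response time of the i-th server is

$$T_i = \frac{1}{f_i \mu - r_i R}$$

Here, assuming that the response times of the n servers are all equal, T_1 = T_2 = ... = T_n holds. Therefore,

$$f_1 \mu - r_1 R = f_2 \mu - r_2 R = \dots = f_n \mu - r_n R$$

that is,

$$r_{i+1} = r_i + \frac{\mu}{R}\left(f_{i+1} - f_i\right) \tag{D}$$

holds.

By transforming equation (D) and expressing r_i by r_1,

$$r_i = r_{i-1} + \frac{\mu}{R}\left(f_i - f_{i-1}\right) = r_{i-2} + \frac{\mu}{R}\left(f_i - f_{i-2}\right) = \dots = r_1 + \frac{\mu}{R}\left(f_i - f_1\right) \tag{E}$$

holds. Summing both sides of equation (E) for subscripts 1 to n gives

$$\sum_{i=1}^{n} r_i = n r_1 + \frac{\mu}{R}\sum_{i=1}^{n}\left(f_i - f_1\right)$$

Since the distribution rates sum to 1, that is, Σ r_i = 1, r_1 is obtained as follows.

$$r_1 = \frac{1}{n}\left(1 - \frac{\mu}{R}\sum_{i=1}^{n}\left(f_i - f_1\right)\right) \tag{F}$$

Then, substituting equation (F) into equation (E) gives

$$r_i = \frac{1}{n}\left(1 - \frac{\mu}{R}\sum_{j=1}^{n}\left(f_j - f_1\right)\right) + \frac{\mu}{R}\left(f_i - f_1\right) = \frac{1}{n}\left(1 - \frac{\mu}{R}\sum_{j=1}^{n} f_j\right) + \frac{\mu}{R} f_i$$

and equation (C) is derived.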
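The distribution rate calculation of step S109 can be checked numerically with the Python sketch below. It assumes that equation (C) has the form r_i = (1/n)(1 − (μ/R) Σ f_j) + (μ/R) f_i and that equation (B) has the form T_i = 1/(f_i μ − r_i R), as the derivation indicates. The function names are hypothetical, and rejecting a non-positive R is an assumption, since the formula divides by R.

```python
def distribution_rates(mu, arrivals, clock_ratios):
    """Assumed form of equation (C): rates that equalise per-server response time."""
    if arrivals <= 0:
        raise ValueError("equation (C) divides by R, so R must be positive")
    n = len(clock_ratios)
    k = mu / arrivals
    total = sum(clock_ratios)
    return [(1.0 - k * total) / n + k * f for f in clock_ratios]

def per_server_response_times(mu, arrivals, clock_ratios, rates):
    """Assumed form of equation (B): T_i = 1 / (f_i * mu - r_i * R)."""
    return [1.0 / (f * mu - r * arrivals)
            for f, r in zip(clock_ratios, rates)]

# Example: three servers at 1x, 2x and 3x the reference clock.
rates = distribution_rates(100.0, 500.0, [1.0, 2.0, 3.0])
times = per_server_response_times(100.0, 500.0, [1.0, 2.0, 3.0], rates)
```

In this example the rates sum to 1 and every server ends up with the same response time, which is the property step S109 is meant to establish. Note that when the group has a large capacity surplus the formula can give a comparatively slow server a negative rate; the source does not discuss that case, and this sketch does not attempt to handle it.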

After step S109, the resource allocation control device 5 transmits new configuration information to the load distribution device 4, and the load distribution device 4 updates the server group configuration information with the received configuration information (S110). In step S110, first, the resource allocation control device 5 refers to the data center configuration information in FIG. 7 and transmits to the load distribution device 4 the server group name selected in step S101, the server name 71 and IP address 72 of each server belonging to that server group, and the distribution rates calculated in step S109.

Then, the load distribution device 4 only needs to update the portion of the server group configuration information that corresponds to the received server group name with the received information. Next, it is determined whether the processing from step S102 onward, which judges whether the allocation of servers belonging to a server group is appropriate, has been completed for all the server groups (S111). If an unprocessed server group exists, the process returns to step S101 and continues. If the judgment has been completed for all the server groups in step S111, the processing ends.

If the response time T does not fall within the range of 0≤T≤Tp in step S104 of FIG. 10, one server belonging to the unused servers is selected (S112). Then, the server selected in step S112 is added to the server group (S113). The response time does not fall within the predetermined range in step S104 either because the number of arrival requests exceeds the processing capacity of the server group, that is, the denominator of equation (A) is negative, or because the number of arrival requests is within the processing capacity but the load is so high that the required response time cannot be met. In either case, it is necessary to increase the number of servers and thereby the processing capacity. Therefore, an arbitrary unused server is selected in step S112, and the selected server is added to the server group selected in step S101 in step S113. In the processing of step S113, the resource allocation control device 5 updates the affiliated server group name 74 corresponding to the server selected in step S112 in the data center configuration information of FIG. 7 to the server group name selected in step S101. Then, the future response time T in the new configuration is calculated again by equation (A) (S114).

It is determined again whether the response time calculated in step S114 falls within the predetermined range (0≤T≤Tp) (S115). If, in step S115, the response time still does not fall within the predetermined range even after one server has been added, the service level guaranteed in the contract with the xSP has not yet been satisfied, so the process returns to step S112 to add a further server. If the value falls within the predetermined range in step S115, the flow advances to step S109 to calculate the distribution rate.
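The addition loop of steps S112 to S115 can be sketched in Python in the same spirit. The sketch assumes equation (A) has the form T = 1/(μ Σ f_i − R) and represents servers only by their relative clock magnification. The names `response_time` and `grow_group` are hypothetical, and raising an error when the unused pool runs out is an assumption, because the source does not say what happens in that situation.

```python
def response_time(mu, arrivals, clock_ratios):
    """Assumed form of equation (A); infinite at exact saturation."""
    denominator = mu * sum(clock_ratios) - arrivals
    return float("inf") if denominator == 0 else 1.0 / denominator

def grow_group(mu, arrivals, clock_ratios, unused_ratios, threshold):
    """Add unused servers one by one until 0 <= T <= Tp holds.

    Returns (active, remaining_unused) as two lists of clock ratios.
    """
    active = list(clock_ratios)
    pool = list(unused_ratios)
    # S104/S115: repeat while the predicted response time is out of range.
    while not (0 <= response_time(mu, arrivals, active) <= threshold):
        if not pool:
            # Assumption: the source does not specify this case.
            raise RuntimeError("no unused server left to add")
        active.append(pool.pop())  # S112/S113: take any unused server
        # S114 is the recalculation performed by the loop condition.
    return active, pool
```

With μ = 100, R = 250, two reference-clock servers and a threshold of 0.05 seconds, one added server brings the denominator to 50 and T to 0.02 seconds, so the loop stops after a single addition.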

If there is a request to create a new server group during the execution of this process, the resource allocation control device 5 can also be set to execute the server group new creation processing in FIG. 9 as interrupt processing.

According to the embodiment of the present invention described above, it is possible to provide a method for automatically allocating servers in a data center to each network service in real time, through the load distribution device and without manual intervention. It is also possible to monitor the fluctuations in the amount of requests arriving at each network service, predict the amount of requests after a certain period of time, and control the amount of servers allocated to the network service in accordance with the magnitude of the predicted amount of requests.

Therefore, the burden on the operation manager can be reduced, and operation by fewer operation managers is possible. In addition, operation by inexperienced operation managers is possible. Here, the amount of servers allocated to the network service is determined so that, when the amount of traffic indicated by the predicted request value arrives at the network service, the average response time to the user terminals is equal to or less than a predetermined response time threshold.

Therefore, the service level agreed in a contract with a customer of the data center operator, such as an xSP, can be maintained at a certain level or more. In addition, because it is determined whether the response time based on the predicted value falls within the predetermined range each time the number of servers is increased or decreased by one, the server group can be configured with the minimum number of servers necessary for the operation of the network service. This allows data center operators to operate their server resources as efficiently as possible.

In the embodiment of the present invention described above, the server application is a web server, and a request from the user terminal is an http request. However, the present invention can also be applied to a case where another application program is used in the server, the user terminal sends a request to that application program, and the server sends a reply to the request to the user terminal. In the embodiment of the present invention, the CPU clock frequency is used as a numerical value for measuring the processing capacity of the server group, and the number of servers is simply increased or decreased in order to adjust the processing capacity of the server group. The present invention is also applicable to a case where computer resources such as a CPU, a memory, and a hard disk are individually quantified and increased or decreased. Further, in the embodiment of the present invention, the reference clock frequency is 1 GHz, but it is not limited thereto. In this case, the present invention is applicable if the average number of requests processed by each application on a server equipped with the clock frequency chosen as the reference clock frequency is collected in advance and input to the resource allocation control device in advance as shown in FIG. 8.

The protection scope of the present invention is not limited to the above embodiments, but extends to the inventions described in the claims and their equivalents.

Industrial applicability

According to the embodiment of the present invention described above, it is possible to provide a method for automatically allocating servers in a data center to each network service in real time, through the load distribution device and without manual intervention. It is also possible to monitor the fluctuations in the amount of requests arriving at each network service, predict the amount of requests after a certain period of time, and control the amount of servers allocated to the network service in accordance with the magnitude of the predicted amount of requests.

Claims

1. Multiple user terminals connected to the network,

A method for adjusting the number of servers in a network system having a server group including a plurality of servers connected to the network and configured to process requests from the plurality of user terminals,

Accumulate the number of requests from the plurality of user terminals for each predetermined time,

Based on the accumulated number of requests in the past, the characteristics of time and the number of requests are obtained as a function, and the number of future requests is predicted by substituting future time for the function,

When the number of requests from the plurality of user terminals is assumed to follow a predetermined probability distribution, the predicted number of requests is substituted into a relational expression between the number of requests and the average response time per one of the plurality of servers. Calculating a first average response time per one of the plurality of servers; determining whether the first average response time is a positive value and is included in a range equal to or less than a preset threshold value;

A method of adjusting the number of servers, which is characterized by increasing or decreasing the number of servers included in the server group according to the result of the determination.

2. In Claim 1,

When the result of the determination is that the average response time is included in the range, (a) selecting one server included in the server group,

(b) Assuming a server group excluding the selected server,

(c) calculating a second average response time per one of a plurality of servers included in the assumed server group,

(d) performing a new determination to determine whether the second average response time falls within the range,

(e) as a result of the new determination, if the second average response time is included in the range, the selected server is excluded from the configuration of the server group; a method of adjusting the number of servers characterized by the above.

3. In Claim 2,

After the above (e), the above (a) to (e) are repeated, and the servers included in the server group are excluded from the configuration one by one until, as a result of the new determination, the second average response time is no longer included in the range.

4. In Claim 1,

Further, it has an unused server group including a plurality of unused servers connected to the network,

If the result of the determination is that the average response time is not included in the range, one server included in the unused server group is selected,

A method of adjusting the number of servers, wherein the selected server is added to the server group.

5. In Claim 4,

(f) in a new server group after the selected server has been added, obtain a third average response time per one of a plurality of servers included in the new server group;

(g) performing a new determination to determine whether the third average response time falls within the range,

(h) as a result of the new determination, if the third average response time is not included in the range, select one server included in the unused server group,

(i) adding the selected server to the new server group,

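After the above (i), the above (f) to (i) are repeated. A method for adjusting the number of servers, wherein the servers included in the unused server group are added to the server group one by one until, as a result of the new determination, the third average response time is included in the range.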

6. Multiple user terminals connected to the network,

A server group including a plurality of servers connected to the network and processing requests from the user terminals;

A load distribution device connected to the network and including storage means for storing the number of requests from the user terminal at predetermined time intervals, a distribution ratio of the number of requests, and configuration information of the server group;

and a resource allocation control device connected to the network; in the resource allocation control device of the network system having the above,

Based on the number of past requests accumulated in the load balancer, the characteristics of time and number of requests are obtained as a function,

Substituting a future time into the function to predict the number of future requests,

When the number of requests from the plurality of user terminals is assumed to follow a predetermined probability distribution, the predicted number of requests is substituted into a relational expression between the number of requests and the average response time per one of the plurality of servers. Calculating a first average response time per one of the plurality of servers;

It is determined whether the first average response time is a positive value and is included in a range equal to or less than a predetermined threshold,

A program for changing the configuration information so as to reduce the number of servers included in the server group according to the result of the determination.

7. In Claim 6,

As a result of the determination, when the average response time is included in the range, (a) selecting one server included in the server group,

(b) assuming a server group excluding the selected server,

(c) causing a second average response time per one of a plurality of servers included in the assumed server group to be obtained,

(e) as a result of the new determination, when the second average response time is included in the range, the selected server is deleted from the configuration information.

8. In Claim 7,

After the above (e), the above (a) to (e) are performed again. A program for deleting the servers included in the server group one by one from the configuration information until, as a result of the new determination, the second average response time is no longer included in the above range.

9. In Claim 6,

Further, it has an unused server group including a plurality of unused servers connected to the network,

If the result of the determination is that the average response time is not included in the range, one of the servers included in the unused server group is selected,

A program for causing the selected server to be added to the configuration information.

10. In Claim 9,

(f) in a new server group after the selected server has been added, a third average response time per one of a plurality of servers included in the new server group is calculated;

(i) causing the selected server to be added to the new server group,

After the above (i), the above (f) to the above (i) are performed again. A program characterized by adding the servers included in the unused server group one by one to the configuration information until, as a result of the above-mentioned new determination, the third average response time is included in the above-mentioned range.