US20060117200A1 - Method, system, and apparatus for improving multi-core processor performance - Google Patents
- Publication number
- US20060117200A1
- Authority
- US
- United States
- Prior art keywords
- core
- state
- executing
- cores
- rationing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3234—Power saving characterised by the action undertaken
- G06F1/324—Power saving characterised by the action undertaken by lowering clock frequency
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3206—Monitoring of events, devices or parameters that trigger a change in power modality
- G06F1/3228—Monitoring task completion, e.g. by use of idle timers, stop commands or wait commands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3234—Power saving characterised by the action undertaken
- G06F1/3287—Power saving characterised by the action undertaken by switching off individual functional units in the computer system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3234—Power saving characterised by the action undertaken
- G06F1/329—Power saving characterised by the action undertaken by task scheduling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3234—Power saving characterised by the action undertaken
- G06F1/3296—Power saving characterised by the action undertaken by lowering the supply or operating voltage
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
A system, apparatus, and method for a core rationing logic to enable cores of a multi-core processor to adhere to various power and thermal constraints.
Description
- This application is a division of co-pending U.S. patent application Ser. No. 10/621,228 filed Jul. 15, 2003 and entitled “A Method, System, and Apparatus for Improving Multi-Core Processor Performance,” and is related to three concurrently filed U.S. Patent Applications, Attorney Docket Nos. 042390.P16355D, 042390.P16355D3 and 042390.P16355D4, also entitled “A Method, System, and Apparatus for Improving Multi-Core Processor Performance.”
- 1. Field
- The present disclosure pertains to the field of power management. More particularly, the present disclosure pertains to a new method and apparatus for improving multi-core processor performance despite power constraints.
- 2. Description of Related Art
- Power management schemes allow for reducing power consumption to achieve low power applications for various types of systems and integrated devices, such as servers, laptops, processors, and desktops. Typically, software methods are employed for systems and integrated devices to support multiple power states for optimizing performance based at least in part on the Central Processing Unit (CPU) activity.
- Present power management schemes either decrease voltage or frequency or both for reducing power consumption. However, this results in decreased overall performance. Also, some methods incorporate analog designs that have various challenges relating to loop stability for transient workloads, calibration, and tuning.
- With the introduction of processors with multiple cores, power management becomes a major concern because of the increase in cores operating at high frequencies and voltages and the need to adhere to various power constraints, such as thermal limits, maximum current, and Vcc range.
- The present invention is illustrated by way of example and not limitation in the Figures of the accompanying drawings.
- FIG. 1 illustrates a flowchart for a method utilized in accordance with an embodiment.
- FIG. 2 illustrates a bar chart utilized in accordance with an embodiment.
- FIG. 3 illustrates a bar chart utilized in accordance with an embodiment.
- FIG. 4 illustrates an apparatus in accordance with one embodiment.
- The following description provides a method and apparatus for improved multi-core processor performance despite power constraints. In the following description, numerous specific details are set forth in order to provide a more thorough understanding of the present invention. It will be appreciated, however, by one skilled in the art that the invention may be practiced without such specific details. Those of ordinary skill in the art, with the included descriptions, will be able to implement appropriate logic circuits without undue experimentation.
- As previously described, a problem exists for improving processor performance while adhering to power constraints. The present methods incorporate lowering the voltage or frequency at the expense of overall performance. In contrast, the claimed subject matter improves overall performance while adhering to power constraints. For example, a concept of “rationing the number of executing cores for a processor system” allows for increasing frequency as a result of disabling clocks to cores that are idle as they wait for a memory transaction to complete. The claimed subject matter exploits the idle time period of processor cores by disabling the clocks to those cores, which results in less power dissipation. Thus, a higher frequency can be utilized as a result of the decrease in power dissipation. In one embodiment, an appropriate executing core limit is calculated for the workload. Also, in the same embodiment, the number of executing cores is less than or equal to the number of available and ready threads. A thread is an independent set of instructions for a particular application.
- In one embodiment, the claimed subject matter facilitates selecting a voltage/frequency operating point based on a prediction of the activity level of the threads running on all of the cores collectively. For example, TPC-C threads tend to be active 50-60% of the time, and spend 40-50% of their time idle, waiting for memory references to be completed. In such an environment, one would specify an executing core limit that would be, in one embodiment, 60% of the total number of cores on the die; if there were 8 cores, one would set the executing core limit to, in this case, five. One would then specify a voltage-frequency operating point that corresponds to having only five cores active and three cores inactive (low power state) at a time; this is a significantly higher operating frequency than one would specify if one were allowing all eight cores to be simultaneously active. The core rationing logic constrains the operations of the die, guaranteeing that no more than five cores (in this case) are active at any given moment. Statistics are gathered regarding the occupancy of the Waiting and Rationing queues (which will be discussed further in connection with FIG. 1); at intervals these statistics are analyzed to determine whether the operating point (executing core limit and its associated voltage/frequency pair) should be changed. If the Waiting queue tends to be empty and the Rationing queue tends to be full, that is an indication that cores are not making progress when they could be, and that to improve performance the executing core limit should be raised and the voltage/frequency reduced; conversely, if the Rationing queue tends to be empty, and the Waiting queue tends to be full, this may be an indication that one can increase performance by reducing the executing core limit and increasing the voltage/frequency point.
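The periodic adjustment described above can be illustrated with a minimal Python sketch. The function names, the occupancy thresholds (`empty_below`, `full_above`), the one-core step size, and the round-up rule for the starting limit are all assumptions made for illustration; the disclosure does not specify them, and the matching voltage/frequency selection is left out.

```python
import math


def initial_executing_core_limit(total_cores, active_fraction):
    """Starting limit from the predicted fraction of time threads are active.

    Rounding up is an assumption; it matches the example of 60% of 8 cores
    giving an executing core limit of five.
    """
    return min(total_cores, max(1, math.ceil(total_cores * active_fraction)))


def next_executing_core_limit(ecl, total_cores, mean_waiting, mean_rationing,
                              empty_below=0.5, full_above=2.0):
    """Return the executing core limit for the next interval.

    mean_waiting and mean_rationing are average queue occupancies gathered
    over the last interval. The thresholds are hypothetical stand-ins for
    "tends to be empty" and "tends to be full".
    """
    if mean_waiting < empty_below and mean_rationing > full_above:
        # Ready cores are held back: raise the limit (at a lower frequency).
        return min(ecl + 1, total_cores)
    if mean_rationing < empty_below and mean_waiting > full_above:
        # Slots go unused: lower the limit (allowing a higher frequency).
        return max(ecl - 1, 1)
    return ecl
```

A caller would then look up the voltage/frequency pair associated with the returned limit; that table is processor-specific and is not modeled here.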
- FIG. 1 illustrates a flowchart for a method utilized in accordance with an embodiment. In one embodiment, the flowchart depicts a method for a state diagram.
- In the same embodiment, the state diagram illustrates a predetermined state machine for a processor core in a system. In this same embodiment, the state machine facilitates the “rationing of the cores” to improve processor performance as a result of disabling clocks to cores that are waiting for a memory transaction to complete.
- In one embodiment, the state diagram has four defined states: a Core Unassigned state 202, an Executing state 204, a Rationing FIFO Queue state 206, and a Waiting state 208. Initially, the Core Unassigned state is defined as follows: each core does not have an assigned thread. Subsequently, in the event that a core has a thread assigned to it, the claimed subject matter transitions to the Rationing FIFO Queue state 206. In one embodiment, FIFO is defined as First In First Out.
- Upon transitioning to the Rationing FIFO Queue state, the number of executing cores is compared with an executing core limit (ECL). In one embodiment, a processor or system specification determines the proper executing core limit in order to adhere to thermal power considerations. In one embodiment, the ECL is determined by a formula depicted later in the application. If the number of executing cores is less than the ECL, the particular core transitions to the Executing state 204 if the core was the next one to be processed in the FIFO queue. Otherwise, the core remains in the Rationing FIFO queue 206.
- Upon entering the Executing state, the core remains in this state unless an event occurs, such as a memory reference, an overheating event, and/or a fairness timeout. For example, a fairness timeout may be utilized to prevent a possible live lock state. In this context, a memory reference refers to a read or write operation to a particular memory address that does not reside in any cache memory coupled to the processor (“a miss in all levels of cache memory”). Therefore, an access to main memory is initiated.
- If an event occurs as previously described, the core transitions to the Waiting state 208. Upon completion of the event, the core transitions to the Rationing FIFO queue state. This sequence of cycling between states 204, 206, and 208 occurs until the particular thread is completed. The core then transitions to the Core Unassigned state.
- However, the claimed subject matter is not limited to the four defined states in the state diagram. The claimed subject matter supports different numbers of states. FIG. 1 merely illustrates an example of limiting the number of executing cores to be less than the available number of threads. For example, one embodiment would allow for multiple waiting states. Alternatively, the waiting states could be replaced by another queue state.
- Also, other embodiments of state diagrams would allow multiple priority levels for cores, as well as having different waiting queues depending on the nature of the event that provoked exit from the executing state (memory wait, thermal wait, ACPI wait, etc.).
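The four-state diagram described above can be written down as a small transition table. The following Python sketch is illustrative only: the event names (`THREAD_ASSIGNED`, `SLOT_GRANTED`, and so on) are hypothetical labels, and allowing thread completion only from the Executing state is an assumption, since the description does not say from which state the core returns to Core Unassigned.

```python
from enum import Enum, auto


class CoreState(Enum):
    UNASSIGNED = auto()  # Core Unassigned state 202
    EXECUTING = auto()   # Executing state 204
    RATIONING = auto()   # Rationing FIFO Queue state 206
    WAITING = auto()     # Waiting state 208


class Event(Enum):  # hypothetical event names, not taken from the disclosure
    THREAD_ASSIGNED = auto()
    SLOT_GRANTED = auto()      # head of the FIFO and executing cores < ECL
    BLOCKING_EVENT = auto()    # memory reference, overheating, fairness timeout
    EVENT_COMPLETED = auto()
    THREAD_COMPLETED = auto()


TRANSITIONS = {
    (CoreState.UNASSIGNED, Event.THREAD_ASSIGNED): CoreState.RATIONING,
    (CoreState.RATIONING, Event.SLOT_GRANTED): CoreState.EXECUTING,
    (CoreState.EXECUTING, Event.BLOCKING_EVENT): CoreState.WAITING,
    (CoreState.WAITING, Event.EVENT_COMPLETED): CoreState.RATIONING,
    # Assumption: a thread is only observed to complete while executing.
    (CoreState.EXECUTING, Event.THREAD_COMPLETED): CoreState.UNASSIGNED,
}


def next_state(state, event):
    """Return the state reached from `state` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"no transition from {state.name} on {event.name}") from None
```

Embodiments with multiple waiting states or priority levels would extend the table with additional states and events rather than change its shape.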
- Typically, a core executes a memory read or write operation and subsequently executes an operation that is dependent on that operation (for example, it makes use of the data returned by a memory read operation). Subsequently, the core “stalls” waiting for that memory operation to be completed. In such a case, it asserts a signal to the central core rationing logic indicating that it is stalled; this is the indication that it is eligible to be disabled by the core rationing logic. The core rationing logic responds to this signal by “napping” the core in question—it asserts a “nap” signal to the core, which causes the core to block instruction issue and then transition into a (cache-coherent) low power state. Furthermore, the core rationing logic puts an identifier for that core in the Waiting queue. When the memory operation completes, the core deasserts the “stall” signal; the core rationing logic responds to this by moving the identifier for that core from the Waiting queue to the Rationing queue. If the number of currently executing (not “napped”) cores is less than or equal to the Executing Core Limit, the core rationing logic removes the oldest identifier from the Rationing queue, and deasserts the “nap” signal to that core.
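The stall/nap handshake described above can be modeled in software as follows. This is a behavioral sketch under stated assumptions, not the hardware logic: the class and method names are hypothetical, cores are represented by integer identifiers, and a core counts as napped whenever its identifier is in either queue. The sketch wakes a core only while the executing count is strictly below the limit, following the state-diagram description, so that the count never exceeds the limit.

```python
from collections import deque


class CoreRationingLogic:
    """Behavioral model of the central core rationing logic (hypothetical API)."""

    def __init__(self, executing_core_limit):
        self.executing_core_limit = executing_core_limit
        self.executing = set()    # cores whose "nap" signal is deasserted
        self.waiting = deque()    # napped cores with an outstanding memory operation
        self.rationing = deque()  # napped cores ready to run, oldest first

    def thread_assigned(self, core_id):
        """A core received a thread: it joins the Rationing queue."""
        self.rationing.append(core_id)
        self._wake_eligible_cores()

    def stall_asserted(self, core_id):
        """The core stalled on memory: nap it and record it as waiting."""
        self.executing.remove(core_id)
        self.waiting.append(core_id)
        self._wake_eligible_cores()

    def stall_deasserted(self, core_id):
        """The memory operation completed: move the core to the Rationing queue."""
        self.waiting.remove(core_id)
        self.rationing.append(core_id)
        self._wake_eligible_cores()

    def _wake_eligible_cores(self):
        # Deassert "nap" for the oldest queued cores while a slot is free.
        while self.rationing and len(self.executing) < self.executing_core_limit:
            self.executing.add(self.rationing.popleft())
```

With a limit of two and three assigned cores, the third core stays in the Rationing queue until one of the first two stalls, at which point it is woken immediately.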
- FIG. 2 illustrates a bar chart utilized in accordance with an embodiment. In one embodiment, the bar chart depicts a percentage time spent executing for a 16-core multiprocessor as calculated by a Monte Carlo simulation for a variety of workloads. The independent axis illustrates the ECL for 2, 4, 6, 8, 10, 12, 14, and 16. Also, there is a bar for each ECL at a different workload as simulated with a memory reference duty cycle (with respect to executing time) of 1%, 30%, 40%, and 50%.
- Analyzing the 50% memory reference duty cycle highlights the fact that the percentage time executing saturates at 50%. Thus, processing the memory references consumes half of the executing time when the ECL is equal to the number of available threads.
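A simulation of this kind can be approximated with a short Python sketch. The model below is an assumption and is not the simulation behind FIG. 2: time advances in discrete steps, an executing core stalls with a fixed per-step probability, a waiting core finishes with a fixed per-step probability, and the memory reference duty cycle is interpreted as the fraction of time an unconstrained thread spends waiting. The function name, the `mean_burst` parameter, and the seed are hypothetical.

```python
import random
from collections import deque


def simulate_fraction_executing(cores, ecl, memory_duty, steps=50000,
                                mean_burst=100.0, seed=0):
    """Estimate the average fraction of cores executing per time step."""
    if not 0.0 < memory_duty < 1.0:
        raise ValueError("memory_duty must be strictly between 0 and 1")
    rng = random.Random(seed)
    p_stall = 1.0 / mean_burst  # chance per step that an executing core stalls
    mean_wait = mean_burst * memory_duty / (1.0 - memory_duty)
    p_done = min(1.0, 1.0 / mean_wait)  # chance per step that a wait completes
    executing, waiting = set(), set()
    rationing = deque(range(cores))     # every core starts ready to run
    executing_core_steps = 0
    for _ in range(steps):
        # Admit the oldest ready cores while the limit allows it.
        while rationing and len(executing) < ecl:
            executing.add(rationing.popleft())
        executing_core_steps += len(executing)
        finished = [c for c in sorted(waiting) if rng.random() < p_done]
        stalled = [c for c in sorted(executing) if rng.random() < p_stall]
        for c in finished:
            waiting.remove(c)
            rationing.append(c)
        for c in stalled:
            executing.remove(c)
            waiting.add(c)
    return executing_core_steps / (steps * cores)
```

Under this model, setting `ecl` equal to `cores` with `memory_duty=0.5` gives a result close to 0.5, consistent with the saturation behavior described above, while a small `ecl` caps the result at `ecl / cores`.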
- FIG. 3 illustrates a bar chart utilized in accordance with an embodiment. In addition to FIG. 2, FIG. 3 illustrates the total performance as calculated by the product of the percentage time executing and the frequency. The total performance also incorporates the fact that frequency is inversely proportional to the ECL. As previously described, this relationship exists because as one reduces the number of executing cores, this results in reducing power dissipation. Therefore, the frequency can be increased to remain at the steady-state thermal limit.
- Also, FIG. 3 depicts that the maximum percentage time executing is 70% for the 30% memory reference duty cycle. Also, the product of the saturation limit and the number of threads demarcates the onset of saturation. Of particular note is the onset of saturation because this may be the region for improved or optimum performance.
- In one embodiment, a self optimization formula is utilized to determine the appropriate ECL. In the formula, N depicts the number of threads that have context; % E depicts the percentage executing time; and % M depicts the percentage memory reference time. The formula is:
int(N×(% E/(% E+% M)))
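As a worked illustration, the formula translates directly into Python. The function and parameter names are chosen here for readability and do not appear in the disclosure; `int` is taken to mean truncation toward zero, as it does in Python.

```python
def executing_core_limit(n_threads, pct_executing, pct_memory):
    """ECL = int(N x (%E / (%E + %M))), truncating toward zero."""
    return int(n_threads * (pct_executing / (pct_executing + pct_memory)))
```

For 16 threads with 70% executing time and 30% memory reference time, the function returns 11, just under the product of the saturation limit and the number of threads (0.70 × 16 = 11.2). For a 50%/50% split it returns 8.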
- FIG. 4 depicts an apparatus in accordance with one embodiment. In one embodiment, the apparatus depicts a multi-core processor system with a plurality of processors 410 coupled individually to an independent bank of Level 3 (L3) cache memory. In the same embodiment, a plurality of four busses form two counter-rotating “rings”: a Request/Response (REQ0/RSP0) ring (402 and 404) in the clockwise direction, and a Request/Response (REQ1/RSP1) ring (406 and 408) in the counterclockwise direction. The circle in between the “P”s and the “C”s represents a pair of state devices for each ring. Thus, a set of circular pipelines is utilized for passing information from each processor core/cache bank to any other processor core/cache bank. The system interface logic contains the memory controllers for memory DIMMs, the router logic to handle the interconnection links to other processor dies and/or I/O subsystems, and assorted other system control logic (including the central core rationing controller).
- While certain exemplary embodiments have been described and shown in the accompanying drawings, it is to be understood that such embodiments are merely illustrative of and not restrictive on the broad invention, and that this invention not be limited to the specific constructions and arrangements shown and described, since various other modifications may occur to those ordinarily skilled in the art upon studying this disclosure.
Claims (5)
1. An integrated device with a plurality of processor cores comprising:
A first operational state for a core without an assigned thread;
A queue to store the plurality of cores with an assigned thread;
A second operational state for enabling the core to run a thread; and
A fourth operational state to disable a core.
2. The integrated device of claim 1 wherein the queue is a first in first out (FIFO) queue.
3. The integrated device of claim 1 wherein the core transitions from a second state to the third state if the number of enabled cores is less than an executing core limit.
4. The integrated device of claim 3 wherein the executing core limit is based at least in part on a formula, wherein N depicts the number of threads that have context; % E depicts the percentage executing time; and % M depicts the percentage memory reference time, and the formula is:
int(N×(% E/(% E+% M)))
5. The integrated device of claim 1 wherein the core transitions from a third state to the fourth state if the core is idle as it waits for completion of a memory operation.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/336,303 US20060117200A1 (en) | 2003-07-15 | 2006-01-20 | Method, system, and apparatus for improving multi-core processor performance |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/621,228 US20050050310A1 (en) | 2003-07-15 | 2003-07-15 | Method, system, and apparatus for improving multi-core processor performance |
US11/336,303 US20060117200A1 (en) | 2003-07-15 | 2006-01-20 | Method, system, and apparatus for improving multi-core processor performance |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/621,228 Division US20050050310A1 (en) | 2003-07-15 | 2003-07-15 | Method, system, and apparatus for improving multi-core processor performance |
Publications (1)
Publication Number | Publication Date |
---|---|
US20060117200A1 (en) | 2006-06-01 |
Family ID=34103183
Family Applications (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/621,228 Abandoned US20050050310A1 (en) | 2003-07-15 | 2003-07-15 | Method, system, and apparatus for improving multi-core processor performance |
US11/336,302 Abandoned US20060123263A1 (en) | 2003-07-15 | 2006-01-20 | Method, system, and apparatus for improving multi-core processor performance |
US11/336,681 Expired - Fee Related US7389440B2 (en) | 2003-07-15 | 2006-01-20 | Method, system, and apparatus for improving multi-core processor performance |
US11/336,303 Abandoned US20060117200A1 (en) | 2003-07-15 | 2006-01-20 | Method, system, and apparatus for improving multi-core processor performance |
US11/336,015 Expired - Fee Related US7392414B2 (en) | 2003-07-15 | 2006-01-20 | Method, system, and apparatus for improving multi-core processor performance |
US11/686,861 Expired - Fee Related US7788519B2 (en) | 2003-07-15 | 2007-03-15 | Method, system, and apparatus for improving multi-core processor performance |
Family Applications Before (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/621,228 Abandoned US20050050310A1 (en) | 2003-07-15 | 2003-07-15 | Method, system, and apparatus for improving multi-core processor performance |
US11/336,302 Abandoned US20060123263A1 (en) | 2003-07-15 | 2006-01-20 | Method, system, and apparatus for improving multi-core processor performance |
US11/336,681 Expired - Fee Related US7389440B2 (en) | 2003-07-15 | 2006-01-20 | Method, system, and apparatus for improving multi-core processor performance |
Family Applications After (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/336,015 Expired - Fee Related US7392414B2 (en) | 2003-07-15 | 2006-01-20 | Method, system, and apparatus for improving multi-core processor performance |
US11/686,861 Expired - Fee Related US7788519B2 (en) | 2003-07-15 | 2007-03-15 | Method, system, and apparatus for improving multi-core processor performance |
Country Status (8)
Country | Link |
---|---|
US (6) | US20050050310A1 (en) |
JP (1) | JP4413924B2 (en) |
KR (1) | KR100856605B1 (en) |
CN (2) | CN100555227C (en) |
DE (1) | DE112004001320B3 (en) |
GB (1) | GB2420435B (en) |
TW (1) | TWI280507B (en) |
WO (1) | WO2005010737A2 (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100011363A1 (en) * | 2008-07-10 | 2010-01-14 | International Business Machines Corporation | Controlling a computer system having a processor including a plurality of cores |
US8769323B2 (en) * | 2007-11-15 | 2014-07-01 | Intel Corporation | Method, apparatus, and system for optimizing frequency and performance in a multidie microprocessor |
US10649943B2 (en) | 2017-05-26 | 2020-05-12 | Dell Products, L.P. | System and method for I/O aware processor configuration |
US10762031B2 (en) | 2017-06-12 | 2020-09-01 | Dell Products, L.P. | System and method for setting equalization for communication between a processor and a device |
US11755059B2 (en) | 2019-03-13 | 2023-09-12 | Denso Corporation | System for setting an operating clock of a CPU of a vehicular device |
Families Citing this family (49)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4015934B2 (en) | 2002-04-18 | 2007-11-28 | 株式会社東芝 | Video coding method and apparatus |
US20050050310A1 (en) | 2003-07-15 | 2005-03-03 | Bailey Daniel W. | Method, system, and apparatus for improving multi-core processor performance |
US20050154573A1 (en) * | 2004-01-08 | 2005-07-14 | Maly John W. | Systems and methods for initializing a lockstep mode test case simulation of a multi-core processor design |
KR101108397B1 (en) | 2005-06-10 | 2012-01-30 | 엘지전자 주식회사 | Apparatus and method for controlling power supply in a multi-core processor |
KR101177125B1 (en) | 2005-06-11 | 2012-08-24 | 엘지전자 주식회사 | Method and apparatus for implementing hybrid power management mode in a multi-core processor |
KR100663864B1 (en) | 2005-06-16 | 2007-01-03 | 엘지전자 주식회사 | Apparatus and method for controlling processor mode in a multi-core processor |
GB0519981D0 (en) | 2005-09-30 | 2005-11-09 | Ignios Ltd | Scheduling in a multicore architecture |
US20070271130A1 (en) * | 2006-05-19 | 2007-11-22 | Microsoft Corporation | Flexible scheduling and pricing of multicore computer chips |
US7752468B2 (en) * | 2006-06-06 | 2010-07-06 | Intel Corporation | Predict computing platform memory power utilization |
US8028290B2 (en) | 2006-08-30 | 2011-09-27 | International Business Machines Corporation | Multiple-core processor supporting multiple instruction set architectures |
CN100451972C (en) | 2006-09-26 | 2009-01-14 | 杭州华三通信技术有限公司 | Method and apparatus for improving speed of multi-core system accessing critical resources |
US8359487B2 (en) * | 2008-03-19 | 2013-01-22 | Sony Corporation | System and method for effectively performing a clock adjustment procedure |
US8010822B2 (en) * | 2008-03-28 | 2011-08-30 | Microsoft Corporation | Power-aware thread scheduling and dynamic use of processors |
US20090259793A1 (en) * | 2008-04-10 | 2009-10-15 | Sony Corporation And Sony Electronics Inc. | System and method for effectively implementing an erase mode for a memory device |
US8181049B2 (en) * | 2009-01-16 | 2012-05-15 | Freescale Semiconductor, Inc. | Method for controlling a frequency of a clock signal to control power consumption and a device having power consumption capabilities |
EP2435914B1 (en) * | 2009-05-26 | 2019-12-11 | Telefonaktiebolaget LM Ericsson (publ) | Method and scheduler in an operating system |
US8543857B2 (en) * | 2009-09-26 | 2013-09-24 | Intel Corporation | Method and apparatus for low power operation of multi-core processors |
KR101680109B1 (en) * | 2009-10-29 | 2016-12-12 | 삼성전자 주식회사 | Multi-Core Apparatus And Method For Balancing Load Of The Same |
KR101648978B1 (en) * | 2009-11-05 | 2016-08-18 | 삼성전자주식회사 | Method for controlling power in low power multi-core system |
US20110138395A1 (en) * | 2009-12-08 | 2011-06-09 | Empire Technology Development Llc | Thermal management in multi-core processor |
TWI425359B (en) * | 2010-03-05 | 2014-02-01 | Asustek Comp Inc | Cpu core unlocking control apparatus applied to computer system |
CN101799773B (en) * | 2010-04-07 | 2013-04-17 | 福州福昕软件开发有限公司 | Memory access method of parallel computing |
CN102243523B (en) * | 2010-05-12 | 2014-01-08 | 英业达股份有限公司 | Data storage system with standby power supply mechanism |
US8484498B2 (en) * | 2010-08-26 | 2013-07-09 | Advanced Micro Devices | Method and apparatus for demand-based control of processing node performance |
US8495395B2 (en) | 2010-09-14 | 2013-07-23 | Advanced Micro Devices | Mechanism for controlling power consumption in a processing node |
US8726055B2 (en) | 2010-09-20 | 2014-05-13 | Apple Inc. | Multi-core power management |
WO2012098684A1 (en) | 2011-01-21 | 2012-07-26 | 富士通株式会社 | Scheduling method and scheduling system |
US8185758B2 (en) * | 2011-06-30 | 2012-05-22 | Intel Corporation | Method and system for determining an energy-efficient operating point of a platform |
US8719607B2 (en) | 2011-12-01 | 2014-05-06 | International Business Machines Corporation | Advanced Pstate structure with frequency computation |
US8782466B2 (en) * | 2012-02-03 | 2014-07-15 | Hewlett-Packard Development Company, L.P. | Multiple processing elements |
GB2514972B (en) | 2012-03-31 | 2020-10-21 | Intel Corp | Controlling power consumption in multi-core environments |
CN102789301A (en) * | 2012-05-17 | 2012-11-21 | 江苏中科梦兰电子科技有限公司 | Power management method of computer |
US9250682B2 (en) | 2012-12-31 | 2016-02-02 | Intel Corporation | Distributed power management for multi-core processors |
US9292288B2 (en) | 2013-04-11 | 2016-03-22 | Intel Corporation | Systems and methods for flag tracking in move elimination operations |
KR102110812B1 (en) * | 2013-05-30 | 2020-05-14 | 삼성전자 주식회사 | Multicore system and job scheduling method thereof |
US9792112B2 (en) | 2013-08-28 | 2017-10-17 | Via Technologies, Inc. | Propagation of microcode patches to multiple cores in multicore microprocessor |
US9513687B2 (en) | 2013-08-28 | 2016-12-06 | Via Technologies, Inc. | Core synchronization mechanism in a multi-die multi-core microprocessor |
CN104216861B (en) * | 2013-08-28 | 2019-04-19 | 威盛电子股份有限公司 | Microprocessor and method of synchronizing processing cores in the microprocessor |
US9465432B2 (en) | 2013-08-28 | 2016-10-11 | Via Technologies, Inc. | Multi-core synchronization mechanism |
KR20150050135A (en) | 2013-10-31 | 2015-05-08 | 삼성전자주식회사 | Electronic system including a plurality of heterogeneous cores and operating method thereof |
JP6291966B2 (en) * | 2014-03-31 | 2018-03-14 | 日本電気株式会社 | Initialization processing speed-up system, initialization processing speed-up device, initialization processing speed-up method, and initialization processing speed-up program |
KR102169692B1 (en) * | 2014-07-08 | 2020-10-26 | 삼성전자주식회사 | System on chip including multi-core processor and dynamic power management method thereof |
GB2528845B (en) * | 2014-07-30 | 2016-12-14 | Jaguar Land Rover Ltd | Feedback through brake inputs |
US9696787B2 (en) * | 2014-12-10 | 2017-07-04 | Qualcomm Innovation Center, Inc. | Dynamic control of processors to reduce thermal and power costs |
US9569264B2 (en) | 2015-03-17 | 2017-02-14 | Freescale Semiconductor, Inc. | Multi-core system for processing data packets |
CN106293644B (en) * | 2015-05-12 | 2022-02-01 | 超威半导体产品(中国)有限公司 | Power budget method considering time thermal coupling |
US20170185128A1 (en) * | 2015-12-24 | 2017-06-29 | Intel Corporation | Method and apparatus to control number of cores to transition operational states |
CN106201726A (en) * | 2016-07-26 | 2016-12-07 | 张升泽 | Many core chip thread distribution method and system |
CN106598203B (en) * | 2016-12-21 | 2019-04-23 | 上海海事大学 | A kind of method for managing power supply of chip multiprocessor system under data-intensive environment |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040128663A1 (en) * | 2002-12-31 | 2004-07-01 | Efraim Rotem | Method and apparatus for thermally managed resource allocation |
US6804632B2 (en) * | 2001-12-06 | 2004-10-12 | Intel Corporation | Distribution of processing activity across processing hardware based on power consumption considerations |
US7093147B2 (en) * | 2003-04-25 | 2006-08-15 | Hewlett-Packard Development Company, L.P. | Dynamically selecting processor cores for overall power efficiency |
Family Cites Families (32)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5168554A (en) * | 1989-10-13 | 1992-12-01 | International Business Machines Corporation | Converting trace data from processors executing in parallel into graphical form |
US5189314A (en) * | 1991-09-04 | 1993-02-23 | International Business Machines Corporation | Variable chip-clocking mechanism |
JPH0659906A (en) * | 1992-08-10 | 1994-03-04 | Hitachi Ltd | Method for controlling execution of parallel |
US5442775A (en) * | 1994-02-08 | 1995-08-15 | Meridian Semiconductor, Inc. | Two clock microprocessor design with stall |
US5737615A (en) * | 1995-04-12 | 1998-04-07 | Intel Corporation | Microprocessor power control in a multiprocessor computer system |
JPH09138716A (en) * | 1995-11-14 | 1997-05-27 | Toshiba Corp | Electronic computer |
JPH09185589A (en) | 1996-01-05 | 1997-07-15 | Toshiba Corp | Information processing system and power saving method for the system |
GB2311882B (en) | 1996-04-04 | 2000-08-09 | Videologic Ltd | A data processing management system |
US6549954B1 (en) * | 1997-01-16 | 2003-04-15 | Advanced Micro Devices, Inc. | Object oriented on-chip messaging |
US5913069A (en) | 1997-12-10 | 1999-06-15 | Cray Research, Inc. | Interleaving memory in distributed vector architecture multiprocessor system |
JP4172054B2 (en) | 1998-03-30 | 2008-10-29 | マツダ株式会社 | Control device for automatic transmission |
US6529921B1 (en) * | 1999-06-29 | 2003-03-04 | Microsoft Corporation | Dynamic synchronization of tables |
US6440282B1 (en) * | 1999-07-06 | 2002-08-27 | Applied Materials, Inc. | Sputtering reactor and method of using an unbalanced magnetron |
US6357016B1 (en) * | 1999-12-09 | 2002-03-12 | Intel Corporation | Method and apparatus for disabling a clock signal within a multithreaded processor |
US6889319B1 (en) * | 1999-12-09 | 2005-05-03 | Intel Corporation | Method and apparatus for entering and exiting multiple threads within a multithreaded processor |
US6550020B1 (en) * | 2000-01-10 | 2003-04-15 | International Business Machines Corporation | Method and system for dynamically configuring a central processing unit with multiple processing cores |
US6640282B2 (en) | 2000-01-25 | 2003-10-28 | Hewlett-Packard Development Company, L.P. | Hot replace power control sequence logic |
US6574739B1 (en) | 2000-04-14 | 2003-06-03 | Compal Electronics, Inc. | Dynamic power saving by monitoring CPU utilization |
US20020018877A1 (en) | 2000-08-02 | 2002-02-14 | Woodall Calvin L. | Reduced motion and anti slip pad |
AU2002236667A1 (en) * | 2000-10-31 | 2002-05-21 | Millennial Net, Inc. | Networked processing system with optimized power efficiency |
US6920572B2 (en) * | 2000-11-15 | 2005-07-19 | Texas Instruments Incorporated | Unanimous voting for disabling of shared component clocking in a multicore DSP device |
US6990598B2 (en) * | 2001-03-21 | 2006-01-24 | Gallitzin Allegheny Llc | Low power reconfigurable systems and methods |
AU2002368540A1 (en) * | 2001-05-16 | 2005-02-14 | North Carolina State University | Methods for forming tunable molecular gradients on substrates |
US6901522B2 (en) * | 2001-06-07 | 2005-05-31 | Intel Corporation | System and method for reducing power consumption in multiprocessor system |
JP3610930B2 (en) | 2001-07-12 | 2005-01-19 | 株式会社デンソー | Operating system, program, vehicle electronic control unit |
US20030079151A1 (en) * | 2001-10-18 | 2003-04-24 | International Business Machines Corporation | Energy-aware workload distribution |
US6985952B2 (en) * | 2001-10-31 | 2006-01-10 | International Business Machines Corporation | Energy-induced process migration |
US7318164B2 (en) * | 2001-12-13 | 2008-01-08 | International Business Machines Corporation | Conserving energy in a data processing system by selectively powering down processors |
EP1338956A1 (en) * | 2002-02-20 | 2003-08-27 | STMicroelectronics S.A. | Electronic data processing apparatus, especially audio processor for audio/video decoder |
US7480911B2 (en) * | 2002-05-09 | 2009-01-20 | International Business Machines Corporation | Method and apparatus for dynamically allocating and deallocating processors in a logical partitioned data processing system |
US6971034B2 (en) | 2003-01-09 | 2005-11-29 | Intel Corporation | Power/performance optimized memory controller considering processor power states |
US20050050310A1 (en) | 2003-07-15 | 2005-03-03 | Bailey Daniel W. | Method, system, and apparatus for improving multi-core processor performance |
Date | Country | Application Number | Publication | Status |
---|---|---|---|---|
2003-07-15 | US | US10/621,228 | US20050050310A1 | Abandoned |
2004-07-13 | CN | CNB2004100709137A | CN100555227C | Expired - Fee Related |
2004-07-13 | CN | CN200710106805.4A | CN101320289B | Expired - Fee Related |
2004-07-14 | GB | GB0602753A | GB2420435B | Expired - Fee Related |
2004-07-14 | DE | DE112004001320T | DE112004001320B3 | Expired - Fee Related |
2004-07-14 | WO | PCT/US2004/022354 | WO2005010737A2 | Application Filing (active) |
2004-07-14 | KR | KR1020067000942A | KR100856605B1 | IP Right Cessation |
2004-07-14 | JP | JP2006520257A | JP4413924B2 | Expired - Fee Related |
2004-07-14 | TW | TW093120990A | TWI280507B | IP Right Cessation |
2006-01-20 | US | US11/336,302 | US20060123263A1 | Abandoned |
2006-01-20 | US | US11/336,681 | US7389440B2 | Expired - Fee Related |
2006-01-20 | US | US11/336,303 | US20060117200A1 | Abandoned |
2006-01-20 | US | US11/336,015 | US7392414B2 | Expired - Fee Related |
2007-03-15 | US | US11/686,861 | US7788519B2 | Expired - Fee Related |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8769323B2 (en) * | 2007-11-15 | 2014-07-01 | Intel Corporation | Method, apparatus, and system for optimizing frequency and performance in a multidie microprocessor |
US8806248B2 (en) | 2007-11-15 | 2014-08-12 | Intel Corporation | Method, apparatus, and system for optimizing frequency and performance in a multidie microprocessor |
US9280172B2 (en) | 2007-11-15 | 2016-03-08 | Intel Corporation | Method, apparatus, and system for optimizing frequency and performance in a multidie microprocessor |
US9984038B2 (en) | 2007-11-15 | 2018-05-29 | Intel Corporation | Method, apparatus, and system for optimizing frequency and performance in a multidie microprocessor |
US20100011363A1 (en) * | 2008-07-10 | 2010-01-14 | International Business Machines Corporation | Controlling a computer system having a processor including a plurality of cores |
US7757233B2 (en) * | 2008-07-10 | 2010-07-13 | International Business Machines Corporation | Controlling a computer system having a processor including a plurality of cores |
US10649943B2 (en) | 2017-05-26 | 2020-05-12 | Dell Products, L.P. | System and method for I/O aware processor configuration |
US10877918B2 (en) | 2017-05-26 | 2020-12-29 | Dell Products, L.P. | System and method for I/O aware processor configuration |
US10762031B2 (en) | 2017-06-12 | 2020-09-01 | Dell Products, L.P. | System and method for setting equalization for communication between a processor and a device |
US11755059B2 (en) | 2019-03-13 | 2023-09-12 | Denso Corporation | System for setting an operating clock of a CPU of a vehicular device |
Also Published As
Publication number | Publication date |
---|---|
US7788519B2 (en) | 2010-08-31 |
GB2420435B (en) | 2008-06-04 |
US7392414B2 (en) | 2008-06-24 |
KR100856605B1 (en) | 2008-09-03 |
GB2420435A (en) | 2006-05-24 |
CN1577280A (en) | 2005-02-09 |
US20050050310A1 (en) | 2005-03-03 |
JP2007535721A (en) | 2007-12-06 |
DE112004001320B3 (en) | 2011-09-15 |
KR20060031868A (en) | 2006-04-13 |
US20060123263A1 (en) | 2006-06-08 |
WO2005010737A2 (en) | 2005-02-03 |
TW200515289A (en) | 2005-05-01 |
US20070198872A1 (en) | 2007-08-23 |
JP4413924B2 (en) | 2010-02-10 |
CN101320289A (en) | 2008-12-10 |
CN101320289B (en) | 2014-06-25 |
CN100555227C (en) | 2009-10-28 |
US20060117199A1 (en) | 2006-06-01 |
US7389440B2 (en) | 2008-06-17 |
GB0602753D0 (en) | 2006-03-22 |
TWI280507B (en) | 2007-05-01 |
US20060123264A1 (en) | 2006-06-08 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US7392414B2 (en) | Method, system, and apparatus for improving multi-core processor performance | |
US9904346B2 (en) | Methods and apparatus to improve turbo performance for events handling | |
TWI464570B (en) | Method, computer readable storage media, and multiple logical processor system for balancing performance and power savings of a computing device having multiple cores | |
US8190863B2 (en) | Apparatus and method for heterogeneous chip multiprocessors via resource allocation and restriction | |
TWI522801B (en) | Digital power estimator to control processor power consumption | |
EP0662652B1 (en) | Method and apparatus for reducing power consumption in a computer system | |
US20130155081A1 (en) | Power management in multiple processor system | |
US20090320031A1 (en) | Power state-aware thread scheduling mechanism | |
WO2007078628A2 (en) | Method and apparatus for providing for detecting processor state transitions | |
EP1570342A2 (en) | Apparatus and method for multi-threaded processors performance control | |
EP2430541A1 (en) | Power management in a multi-processor computer system | |
US20120173904A1 (en) | Method, apparatus, and system for energy efficiency and energy conservation including determining an optimal power state of the apparatus based on residency time of non-core domains in a power saving state | |
US20140351828A1 (en) | Apparatus and method for controlling multi-core system on chip | |
Wang et al. | Cache latency control for application fairness or differentiation in power-constrained chip multiprocessors | |
US8707063B2 (en) | Hardware assisted performance state management based on processor state changes | |
US20230205306A1 (en) | Default Boost Mode State for Devices | |
JPH0876875A (en) | Microcomputer application system | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | AS | Assignment | Owner name: INTEL CORPORATION, CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: BAILEY, DANIEL W.; DUTTON, TODD; FOSSUM, TRYGGVE; REEL/FRAME: 017506/0187; SIGNING DATES FROM 20030918 TO 20040127 |
 | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |