CN100585567C - Method and device for delaying access to data and/or instructions of a multiprocessor system - Google Patents

Method and device for delaying access to data and/or instructions of a multiprocessor system

Info

Publication number
CN100585567C
Authority
CN
China
Prior art keywords
processor
clock skew
processors
data
clock
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN 200580036461
Other languages
Chinese (zh)
Other versions
CN101048747A (en
Inventor
T·科特克
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Robert Bosch GmbH
Original Assignee
Robert Bosch GmbH
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Robert Bosch GmbH filed Critical Robert Bosch GmbH
Publication of CN101048747A publication Critical patent/CN101048747A/en
Application granted granted Critical
Publication of CN100585567C publication Critical patent/CN100585567C/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30181 Instruction operation extension or modification
    • G06F9/30189 Instruction operation extension or modification according to execution mode, e.g. mode flag
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3802 Instruction prefetching
    • G06F9/3824 Operand accessing
    • G06F9/3836 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3851 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution from multiple instruction streams, e.g. multistreaming
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/1629 Error detection by comparing the output of redundant processing systems
    • G06F11/1641 Error detection by comparing the output of redundant processing systems where the comparison is not performed by the redundant processing components
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/845 Systems in which the redundancy can be transformed in increased performance

Abstract

The invention relates to a method and device for delaying accesses to data and/or commands of a multiprocessor system comprising a first and a second processor to both of which a memory unit is assigned. The second processor operates with a clock pulse offset, and the device is designed in such a manner that the first processor accesses the memory unit, and the second processor, with a clock pulse offset, receives the data and/or commands.

Description

Method and device for delaying access to data and/or instructions of a multiprocessor system
Technical field
The present invention relates to a method and a corresponding delay unit for delaying access to data and/or instructions of a multiprocessor system.
Background art
In technical applications, particularly in motor vehicles, in the industrial goods sector (for example mechanical engineering), and in automation, more and more microprocessor-based or computer-based control and regulating systems are being used for safety-critical applications. Dual-computer systems or dual-processor systems (dual cores) are nowadays common computer systems for safety-critical applications, in particular in vehicles, for example for anti-lock braking systems, the Electronic Stability Program (ESP), and X-by-wire systems such as drive-by-wire, steer-by-wire, and brake-by-wire, and also in other networked systems. To meet the high safety requirements of future applications, powerful error detection and error handling mechanisms are needed, in particular to counter transient errors that arise, for example, as the semiconductor structures of computer systems shrink. Protecting the core itself, i.e. the processor, is relatively difficult. As mentioned, one solution is to use a dual-computer or dual-core system for error detection.
Such processor units with at least two integrated execution units are therefore known as dual-core or multi-core architectures. According to the current state of the art, such architectures are proposed mainly for two reasons:
First, they can increase performance: the two execution units or cores are regarded and treated as two computing units on one semiconductor chip. In this configuration the two execution units or cores process different programs or tasks. Because this increases performance, the configuration is called performance mode.
The second reason for implementing a dual-core or multi-core architecture is to increase safety: the two execution units redundantly process the same program. The results of the two execution units or CPUs (i.e. cores) are compared, and an error is detected if the comparison finds a mismatch. In the following, this configuration is called safety mode or error detection mode.
Thus there are today, on the one hand, dual-processor or multiprocessor systems that work redundantly in order to detect hardware errors (see dual-core or master-checker systems) and, on the other hand, dual-processor or multiprocessor systems whose processors process different data.
Summary of the invention
If these two modes of operation are now combined in one dual-processor or multiprocessor system according to the embodiments of the invention described below (for simplicity only a dual-processor system is referred to from here on, but the invention applies equally to multiprocessor systems), the two processors receive different data in performance mode and identical data in error detection mode.
Such a device or unit makes efficient operation of the dual-processor system possible, so that switching between the two modes (safety mode and performance mode) can take place during operation. The text below refers to processors, but the term equally covers cores or computing units.
When a dual-processor system (dual core) is implemented, each processor is usually given its own cache. A single cache is normally not sufficient, because it would have to be placed physically between the two processors. Owing to the long propagation delays between the cache and the two processors, the processors could then run only at a limited clock frequency. In such a system the cache serves as a fast intermediate memory, so that the processor does not always have to fetch data from the slow main memory. For this to work, close attention must be paid to the cache's access time when implementing it. The access time consists of the actual time needed to fetch the data from the cache and the time needed to pass the data to the processor. If the cache is placed physically far from the processor, transferring the data takes too long and the processor can no longer run at its full clock rate. Because of this timing problem, a dedicated cache is usually provided for each processor in a dual-processor system.
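The timing argument in the preceding paragraph can be written compactly. The following is only a sketch: the symbols are not from the original text, and it assumes that a cache access has to complete within one clock period T.

```latex
t_{\text{access}} = t_{\text{fetch}} + t_{\text{transfer}}, \qquad
t_{\text{access}} \le T = \frac{1}{f_{\text{clk}}}
\;\Longrightarrow\;
f_{\text{clk}} \le \frac{1}{t_{\text{fetch}} + t_{\text{transfer}}}
```

Under this assumption, a larger distance between cache and processor increases the transfer term and lowers the achievable clock frequency, which is the timing problem described above.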
The object of the invention is to specify a method and a device by which one cache can be saved in a dual-processor system, or redundant caches can be saved in a multiprocessor system. The saving is achieved by exploiting a clock offset.
To achieve this object, the invention specifies a method and a device for delaying access to data and/or instructions of a multiprocessor system having a first and a second processor, a memory unit being assigned to the first and second processors. The second processor operates with a clock offset, and the device is designed such that the first processor accesses the memory unit and the second processor receives the data and/or instructions with the clock offset. Advantageously, the memory unit is a cache memory, so that the advantages of this memory technology can be combined with those of the invention.
Expediently, the memory unit is addressed by at least one processor and is coupled directly to the processor that addresses it.
Advantageously, a delay element is included, and the device is designed such that the delay element uses the clock offset to bridge the propagation time of the data and/or instructions from the memory unit to the second processor.
It is also advantageous to provide comparison means by which the data and/or instructions are compared, these comparison means then being arranged physically close to a processor.
Expediently, the device is designed such that the clock offset is used to route the comparison data of the first processor to the second processor.
Advantageously, according to a refinement, write operations and read operations, only read operations, or only write operations are delayed during the access.
If the two processors now run with a clock offset, the proposed method and the corresponding device make it possible to save the second cache at the slave processor.
A dual-computer system contains two processors that can process identical or different tasks. The two processors can process these tasks clock-synchronously or with a clock offset. If the dual-processor system is designed for error detection, it is advantageous for the two processors to operate with a clock offset in order to avoid common-mode errors. The method is most effective if a non-integer clock offset greater than 1 is chosen. In this first form of application, the two processors or cores process the same task.
If the two processors process different tasks, it is advantageous for them to run synchronously at the clock edge, because external components such as memories can be driven with the clock of only one processor. Consequently, a dual-processor system that can switch between these two modes has so far been optimized for only one mode of operation.
According to the invention, this is remedied as follows: in a dual-processor system (or multiprocessor system) that can switch between two modes, such as safety mode and performance mode, the two processors operate with a clock offset in safety mode and without a clock offset in performance mode. Running without a clock offset in performance mode is advantageous because external components such as memories mostly run at a lower clock frequency and are designed to match the clock edge of only one processor. Otherwise the second, clock-offset processor would incur a wait state on every memory access, because the external component would be driven for this second processor half a clock cycle late.
By switching the clocking of the dual-processor system, optimum conditions for error detection are obtained in safety mode and maximum performance is obtained in performance mode.
The invention therefore advantageously starts from a method and a device for delaying access to data and/or instructions of a multiprocessor system having a first and a second processor, a memory unit being assigned to the first and second processors, the first and second processors operating with a clock offset, and the device being designed such that the two processors access the same memory unit with the clock offset.
Expediently, write operations and read operations are delayed during the access, and the device can switch between delayed and non-delayed access. A multiprocessor system with such a device is also disclosed.
In at least one mode, the two processors operate with a clock offset. The offset can be a whole number of clock cycles or a fraction of a clock cycle. A further variant is to use different clock frequencies in the two modes: in the safety-critical mode, for example, a lower clock frequency than in performance mode is used in order to suppress interference. The two variants can also be combined.
The first operating mode corresponds to a safety mode in which the two computing units process the same programs and/or data, and comparison means are provided that compare the states arising during the processing of the same programs for agreement.
The unit according to the invention and the method according to the invention allow both modes to be implemented optimally in a dual-processor system.
If the two processors operate in error detection mode (F mode), they receive the same data and instructions; if they operate in performance mode (P mode), each processor can access the memory. The unit then manages access to the memory or peripherals, which exist only once.
In F mode, the unit accepts the data and addresses of one processor (here called the master) and forwards them to components such as the memory or bus. The second processor (here called the slave) attempts the same access. The data distribution unit receives this access at a second port but does not forward the request to the other components. The data distribution unit delivers the same data to the slave as to the master and compares the data of the two processors. If the data differ, the data distribution unit (here DVE) indicates this with an error signal. Thus only the master works on the bus or memory, and the slave receives the same data (as in a dual-core system).
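The F-mode behaviour described above can be illustrated with a small software model. This is only a sketch under assumptions: the class name, the dictionary used as memory, and the `bus_reads` counter are hypothetical and do not come from the patent, and timing (the clock offset) is ignored.

```python
class DataDistributionUnit:
    """Toy model of the DVE in error detection (F) mode; names are illustrative."""

    def __init__(self, memory):
        self.memory = memory      # dict standing in for the single shared memory
        self.bus_reads = 0        # number of accesses actually forwarded to the bus
        self.error = False        # error signal raised on a mismatch

    def lockstep_read(self, master_addr, slave_addr):
        # Only the master's request is forwarded to the memory or bus.
        self.bus_reads += 1
        data = self.memory[master_addr]
        # The slave's request arrives at a second port; it is compared, not forwarded.
        if slave_addr != master_addr:
            self.error = True
        # Both processors are handed the same data.
        return data, data
```

Only one bus access is counted per pair of requests, which mirrors the statement that only the master works on the bus or memory.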
In P mode, the two processors process different program sections, so their memory accesses also differ. The DVE therefore accepts the requests of the processors and returns the results or requested data to the processor that requested them. If both processors want to access a component at the same time, one processor is placed in a wait state until the other has been served.
Switching between the two modes, and thus between the different ways the data distribution unit operates, is effected by a control signal. This signal can be generated by one of the two processors or externally.
If the dual-processor system runs with a clock offset in F mode and without one in P mode, the DVE unit delays the data for the slave accordingly, or stores the output data of the master until they can be compared with the output data of the slave for error detection.
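Storing the master's output until the slave's output is available amounts to a short delay line in front of the comparator. The sketch below assumes one tick per half clock cycle, so that three ticks model an offset of 1.5 clock cycles; the class name and the use of `None` for "no output yet" are assumptions made for illustration.

```python
from collections import deque


class DelayedComparator:
    """Holds master outputs for a fixed number of ticks before comparing."""

    def __init__(self, delay_ticks=3):
        # Pre-filled pipeline: the first comparisons have nothing to compare yet.
        self.fifo = deque([None] * delay_ticks)

    def tick(self, master_out, slave_out):
        """Advance one tick; return True if an error is detected."""
        self.fifo.append(master_out)
        delayed_master = self.fifo.popleft()
        if delayed_master is None or slave_out is None:
            return False
        return delayed_master != slave_out
```

Feeding the comparator a master stream and the same stream delayed by three ticks produces no error; a single corrupted slave value is flagged on the tick where it appears.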
Brief description of the drawings
Fig. 1 shows a dual-computer system,
Fig. 2 shows an exemplary implementation of the data distribution unit (DVE),
Fig. 3 shows an example of clock switching,
Fig. 4 shows a cache arranged at each processor,
Fig. 5 shows an embodiment in which one cache is used for two processors, and
Fig. 6 shows an embodiment with two flip-flops when a clock offset is used.
Detailed description of the embodiments
The clock offset is explained in more detail below for a dual-computer system with reference to Fig. 1.
Fig. 1 shows a dual-computer system with a first computer 100 (in particular a master computer) and a second computer 101 (in particular a slave computer). The entire system is operated with a specifiable clock, i.e. in specifiable clock cycles CLK. The clock is supplied to the dual-computer system via clock input CLK1 of computer 100 and clock input CLK2 of computer 101. As an example, this dual-computer system also includes a special feature for error detection: the first computer 100 and the second computer 101 operate with a time offset, in particular a specifiable time offset or a specifiable clock offset. Any desired time can be specified for the time offset, and any desired offset in clock cycles can be specified. This can be an integer number of clock cycles, but also, as in this example, an offset of 1.5 clock cycles, for instance, with the first computer 100 operating 1.5 clock cycles ahead of the second computer 101. This offset prevents common-mode errors (so-called common-mode failures) from disturbing the computers or processors, i.e. the cores of the dual-core system, in the same way and thus going undetected. Because of the offset, such a common-mode error hits the two computers at different points in program execution and therefore has different effects on them, which makes the error detectable. Errors with identical effects, as can occur without a clock offset, might not be detected by a comparison; the offset avoids this. To implement the time or clock offset (here in particular 1.5 clock cycles) in the dual-computer system, offset modules 112 to 115 are implemented.
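Why the offset makes a common-mode disturbance detectable can be seen in a simple timing model. The sketch below is an illustration under assumptions: time is counted in half clock cycles, each instruction is taken to occupy exactly one clock cycle, and the function names are hypothetical.

```python
from typing import Optional, Tuple

HALF_CYCLES_PER_INSTRUCTION = 2  # assumption: one instruction per clock cycle


def step_at(time_half_cycles: int, offset_half_cycles: int) -> Optional[int]:
    """Index of the instruction a core is executing at a given time, or None if not started."""
    local_time = time_half_cycles - offset_half_cycles
    if local_time < 0:
        return None
    return local_time // HALF_CYCLES_PER_INSTRUCTION


def disturbed_steps(glitch_time: int, slave_offset: int = 3) -> Tuple[Optional[int], Optional[int]]:
    """Program steps hit in the master and the slave by one disturbance at glitch_time."""
    return step_at(glitch_time, 0), step_at(glitch_time, slave_offset)
```

With an offset of three half cycles (1.5 clock cycles), a disturbance at time 10 hits step 5 in the master and step 3 in the slave, so the two result streams differ at the comparator. With an offset of zero both cores are hit at the same step, and identical wrong results could pass the comparison.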
To detect the common-mode errors mentioned, the system is designed, for example, to operate with a specified time or clock-cycle offset, here in particular 1.5 clock cycles: while one computer, for example computer 100, directly addresses the components, in particular external components 103 and 104, the second computer 101 operates with a delay of exactly 1.5 clock cycles relative to it. To produce the desired delay with its half-cycle component, i.e. 1.5 clock cycles, computer 101 is fed the inverted clock at clock input CLK2. As a result, however, the above-mentioned terminals of the computer, i.e. its data and instructions on the buses, must also be delayed by the stated number of clock cycles, here in particular 1.5 clock cycles, which is what the offset or delay modules 112 to 115 are provided for, as described.
In addition to the two computers or processors 100 and 101, components 103 and 104 are provided, which are connected to the two computers 100 and 101 via bus 116, consisting of bus lines 116A, 116B, and 116C, and via bus 117, consisting of bus lines 117A and 117B. Here 117 is an instruction bus, in which 117A denotes the instruction address bus and 117B the partial instruction (data) bus. Address bus 117A is connected to computer 100 via instruction address terminal IA1 (Instruction Address 1) and to computer 101 via instruction address terminal IA2 (Instruction Address 2). The instructions themselves are transmitted via partial instruction bus 117B, which is connected to computer 100 via instruction terminal I1 (Instruction 1) and to computer 101 via instruction terminal I2 (Instruction 2). A component 103, for example an instruction memory, in particular a secure instruction memory or the like, is interposed in instruction bus 117 consisting of 117A and 117B. This component, in particular as an instruction memory, is also operated with clock CLK in this example. In addition, 116 denotes a data bus, which contains a data address bus or data address line 116A and a data bus or data line 116B. 116A, i.e. the data address line, is connected to computer 100 via data address terminal DA1 (Data Address 1) and to computer 101 via data address terminal DA2 (Data Address 2). Likewise, the data bus or data line 116B is connected to computer 100 via data terminal DO1 (Data Out 1) and to computer 101 via data terminal DO2 (Data Out 2). Data bus line 116C also belongs to data bus 116 and is connected to computer 100 via data terminal DI1 (Data In 1) and to computer 101 via data terminal DI2 (Data In 2). A component 104, for example a data memory, in particular a secure data memory or the like, is interposed in data bus 116 consisting of lines 116A, 116B, and 116C. In this example, component 104 is also supplied with clock CLK.
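The role of the inverted clock can be checked with a few lines of arithmetic. In this sketch, time is measured in clock periods, rising edges of CLK fall on whole numbers, and the function names and the split into "one whole cycle plus inversion" are assumptions made for illustration.

```python
from fractions import Fraction


def rising_edges(count, inverted=False):
    """Times of the first `count` rising edges; the inverted clock is shifted by half a period."""
    phase = Fraction(1, 2) if inverted else Fraction(0)
    return [k + phase for k in range(count)]


def slave_edge_for(master_edge, whole_cycles=1):
    """Edge at which the slave sees what the master saw: whole-cycle delay plus the half period."""
    return master_edge + whole_cycles + Fraction(1, 2)
```

A master edge at time 2 corresponds to a slave edge at time 3.5, which is a rising edge of the inverted clock. The half cycle therefore comes from the inversion, and the remaining whole cycle has to be provided by delay elements.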
Components 103 and 104 stand for any components that are connected to the computers of the dual-computer system via a data bus and/or instruction bus and that can receive or output erroneous data and/or instructions as a result of accesses to data and/or instructions of the dual-computer system in write and/or read operations. To prevent errors, error code generators 105, 106, and 107 are provided, which generate an error code such as a parity bit or another error code such as an error-correcting code (ECC). Corresponding error code checking devices 108 and 109 are also provided for checking the respective error code, i.e. for example the parity bit or another error code such as ECC.
The comparison of the data and/or instructions with regard to the redundant execution in the dual-computer system takes place in comparators 110 and 111, as shown in Fig. 1. However, if there is a time offset, in particular a clock or clock-cycle offset, between computers 100 and 101, whether caused by a non-synchronous dual-processor system, by synchronization errors in a synchronous dual-processor system, or, as in this specific example, by a time or clock-cycle offset desired for error detection (here in particular 1.5 clock cycles), then during this offset a computer, here in particular computer 100, can write erroneous data and/or instructions to, or read them from, components, in particular external components such as memories 103 or 104 here, but also other users, actuators, or sensors. Because of the clock offset, the computer may also erroneously carry out a write access instead of an intended read access. These scenarios naturally lead to errors in the entire system, in particular when there is no clear way to tell which data and/or instructions were erroneously changed, and this also creates a recovery problem.
To solve this problem, a delay unit 102 is inserted into the lines of the data bus and/or the instruction bus, as shown. For clarity, only its insertion into the data bus is shown; the same is of course equally possible and conceivable for the instruction bus. Delay unit 102 delays the accesses, here in particular the memory accesses, such that a possible time or clock offset is compensated, in particular during error detection, for example by comparators 110 and 111, at least until the error signal is generated in the dual-computer system, i.e. until error detection has been carried out. Various variants can be implemented:
Delaying both write and read operations; delaying only write operations; or, although not preferred, delaying only read operations. A delayed write operation can be converted into a read operation by a change signal, in particular the error signal, in order to prevent erroneous writing.
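The effect of converting a delayed write into a read can be illustrated with a small behavioural model. This is a sketch, not the patented circuit: the class and field names, the dictionary used as memory, and the delay of two whole cycles are assumptions chosen for readability.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Access:
    op: str                      # "read" or "write"
    addr: int
    value: Optional[int] = None
    remaining: int = 0           # cycles left before the access reaches memory


class DelayUnit:
    """Holds accesses for a fixed number of cycles; an error signal defuses pending writes."""

    def __init__(self, memory, delay_cycles=2):
        self.memory = memory
        self.delay_cycles = delay_cycles
        self.pending = []

    def submit(self, op, addr, value=None):
        self.pending.append(Access(op, addr, value, self.delay_cycles))

    def tick(self, error_signal=False):
        """Advance one cycle and return the values of reads completed in this cycle."""
        results, still_pending = [], []
        for acc in self.pending:
            if error_signal and acc.op == "write":
                acc.op, acc.value = "read", None   # write becomes a harmless read
            acc.remaining -= 1
            if acc.remaining > 0:
                still_pending.append(acc)
            elif acc.op == "write":
                self.memory[acc.addr] = acc.value
            else:
                results.append(self.memory.get(acc.addr))
        self.pending = still_pending
        return results
```

If the comparator raises the error signal while the write is still held in the delay unit, the memory content stays unchanged; without an error the write lands after the configured delay.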
An exemplary implementation of the data distribution unit (DVE) is described below with reference to Fig. 2. The DVE preferably consists of a device for detecting the switch request (by means of IllOpDetect), the mode switch unit, and the Iram and Dram control modules.
IllOpDetect: Switching between the two modes is detected by the switch-detect units. Such a unit sits between the cache and the processor on the instruction bus and checks whether the instruction IllOp is loaded into the processor. If the instruction is detected, this event is reported to the mode switch unit. A separate switch-detect unit exists for each processor. The switch-detect unit does not have to be fault-tolerant, because it is duplicated and therefore redundant. Alternatively, this unit could be implemented singly and fault-tolerantly, but the redundant design is preferred.
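A switch-detect unit of this kind is essentially a passive snooper on the instruction path. The sketch below shows the idea; the opcode value `0xFFFF`, the class name, and the callback interface are hypothetical, since the text does not specify the encoding of IllOp or the signalling to the mode switch unit.

```python
ILLOP_OPCODE = 0xFFFF  # hypothetical encoding of the IllOp instruction


class SwitchDetect:
    """Watches instruction words on their way from the cache to one processor."""

    def __init__(self, processor_id, notify):
        self.processor_id = processor_id
        self.notify = notify  # callback standing in for the line to the mode switch unit

    def observe(self, instruction_word):
        if instruction_word == ILLOP_OPCODE:
            self.notify(self.processor_id)
        # The instruction stream itself passes through unchanged.
        return instruction_word
```

One instance per processor matches the statement that the unit is duplicated rather than made fault-tolerant on its own.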
ModeSwitch: Switching between the two modes is triggered by the switch-detect units. If a switch from lock mode to split mode is to take place, both switch-detect units detect it, because both processors process the same program code in lock mode. The switch-detect unit of processor 1 detects this 1.5 clock cycles before the switch-detect unit of processor 2. The mode switch unit halts processor 1 for two clock cycles by means of the wait signal. Processor 2 is likewise halted, 1.5 clock cycles later, but only for half a clock cycle, so that it is synchronized with the system clock. The status signal is then switched to split mode for the other components, and the two processors continue working. For the two processors to carry out different tasks, they must diverge in the program code. This is done by a read access to the processor ID directly after the switch to split mode. The processor ID read is different for each of the two processors. If a comparison is made against a target processor ID, a conditional jump instruction can then take the corresponding processor to a different program location. When switching from split mode to lock mode, one of the two processors reaches the switch first. That processor executes program code containing the switch instruction. This is registered by its switch-detect unit and reported to the mode switch unit. The mode switch unit halts the corresponding processor and informs the second processor of the synchronization request by an interrupt. The second processor receives the interrupt and can now run a software routine to finish its task. It then likewise jumps to the program location containing the switch instruction. Its switch-detect unit now also signals the mode switch request to the mode switch unit. At the next rising system clock edge, the wait signal of processor 1 is deactivated first, and the wait signal of processor 2 is deactivated 1.5 clock cycles later. The two processors now again run synchronously with a clock offset of 1.5 clock cycles.
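The wait times given for the switch from lock mode to split mode can be checked with simple arithmetic, and the branch on the processor ID can be shown alongside. In this sketch the function names, the task labels, and the target ID of 1 are assumptions; only the figures of 2, 1.5, and 0.5 clock cycles come from the description.

```python
from fractions import Fraction

OFFSET = Fraction(3, 2)  # processor 2 runs 1.5 clock cycles behind processor 1


def resume_times(detect_time_p1):
    """Times (in clock cycles) at which both processors continue after the switch."""
    p1_resume = detect_time_p1 + 2                 # processor 1 waits two cycles
    p2_detect = detect_time_p1 + OFFSET            # processor 2 sees IllOp 1.5 cycles later
    p2_resume = p2_detect + Fraction(1, 2)         # and waits only half a cycle
    return p1_resume, p2_resume


def select_task(processor_id, target_id=1):
    """Conditional jump on the processor ID read right after entering split mode."""
    return "task_a" if processor_id == target_id else "task_b"
```

Both processors resume at the same instant, which is what removes the clock offset for split mode, and the ID comparison then sends them to different program locations.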
If the system is in lock mode, both switch-detect units must inform the mode switch unit that they want to enter split mode. If the switch request comes from only one unit, the comparison units detect the error, because one of the two processors continues to supply them with data that do not match those of the halted processor.
If the two processors are in split mode and one does not switch back to lock mode, this can be detected by an external watchdog. With one trigger signal per processor, the watchdog notices that the waiting processor no longer reports in. If there is only one watchdog signal for the processor system, the watchdog may be triggered only in lock mode; the watchdog would then detect that the mode switch did not take place. The mode signal is a dual-rail signal: "10" stands for lock mode and "01" for split mode. "00" and "11" indicate that an error has occurred.
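A dual-rail signal is valid only when its two lines carry complementary values. The decoder below is a minimal sketch with hypothetical names; it treats "10" as lock mode and "01" as split mode, and it treats the two non-complementary patterns "00" and "11" as errors, which is the usual dual-rail convention and is an assumption wherever the text is unclear on this point.

```python
def decode_mode(rail_a: int, rail_b: int) -> str:
    """Decode the two-wire mode signal; non-complementary patterns are faults."""
    if (rail_a, rail_b) == (1, 0):
        return "lock"
    if (rail_a, rail_b) == (0, 1):
        return "split"
    raise ValueError(f"invalid dual-rail mode signal: {rail_a}{rail_b}")
```

A single stuck-at fault on either line always produces one of the invalid patterns in at least one of the two modes, which is why the encoding is suitable for a safety-critical status signal.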
IramControl: access to the instruction memory of the two processors is controlled by IRAMControl. IRAMControl must be designed to be safe, because it is a single point of failure. For each processor it consists of two state machines: a clock-synchronous one, iram1clkreset, and an asynchronous one, readiram1. In the safety-critical mode the state machines of the two processors monitor each other; in performance mode they work separately.
The reloading of the two processor caches is controlled by two state machines, namely the synchronous state machine iramclkreset and the asynchronous state machine readiram. These two state machines also allocate the memory accesses in split mode. Processor 1 has the higher priority here. After processor 1 has accessed main memory, the memory access permission is given to processor 2 if both processors want to access main memory again. The two state machines are implemented for each processor. In lock mode the output signals of the state machines are compared so that errors can be detected.
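The allocation rule for split mode (processor 1 has priority, but processor 2 gets its turn if both request again after processor 1 has been served) can be sketched as a single function. The function name and the `last` parameter are hypothetical; the actual state machines are not specified in this much detail.

```python
from typing import Optional


def grant_instruction_memory(req1: bool, req2: bool, last: Optional[int]) -> Optional[int]:
    """Return the processor (1 or 2) that may access main memory, or None.

    `last` is the processor that was granted access most recently.
    """
    if req1 and req2:
        # Processor 1 has priority unless it was served immediately before.
        return 2 if last == 1 else 1
    if req1:
        return 1
    if req2:
        return 2
    return None


# Both processors keep requesting: the grants alternate, starting with processor 1.
last = None
order = []
for _ in range(4):
    last = grant_instruction_memory(True, True, last)
    order.append(last)
print(order)
```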
In lock mode, the data for updating cache 2 are delayed by 1.5 clocks in the IRAM control unit.
Bit 5 in register 0 of SysControl encodes which core is involved: the bit is 0 for core 1 and high for core 2. This register is mapped to the memory area at address 65528.
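As an illustration of how software might evaluate this register, the sketch below extracts bit 5 from a register value. It assumes the reading that the bit is 0 for core 1 and 1 for core 2; the constant and function names are hypothetical, and the register value is passed in as a plain integer instead of being read from hardware.

```python
SYSCONTROL_REG0_ADDR = 65528  # address given in the description (0xFFF8)
CORE_ID_BIT = 5               # bit 5 of register 0 identifies the core


def core_from_sysctrl(reg0_value: int) -> int:
    """Return 1 or 2 depending on bit 5 of the SysControl register 0 value."""
    # Assumption: bit cleared means core 1, bit set means core 2.
    return 2 if (reg0_value >> CORE_ID_BIT) & 1 else 1


def after_split(reg0_value: int) -> str:
    """Conditional branch on the core number, as used after entering split mode."""
    return "task for core 1" if core_from_sysctrl(reg0_value) == 1 else "task for core 2"
```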
When core 2 accesses memory, the first check is which mode the computer is in. If the computer is in lock mode, the memory access of core 2 is suppressed. This signal is a common-rail signal, because it is safety-critical.
The program counter of processor 1 is delayed by 1.5 clocks so that it can be compared with the program counter of processor 2 in lock mode.
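The delayed comparison can be modelled in software with a short delay line. The sketch below is a simplified model, not the hardware: it samples both program counters once per half clock, so that a delay of 1.5 clocks corresponds to three steps, and all names are hypothetical.

```python
from collections import deque


class DelayLine:
    """Delay a stream of values by a fixed number of steps."""

    def __init__(self, steps: int = 3):  # 3 half clocks = 1.5 clocks
        self._buf = deque([None] * steps, maxlen=steps)

    def step(self, value):
        out = self._buf[0]       # value that entered `steps` steps ago
        self._buf.append(value)  # the oldest entry drops out automatically
        return out


def mismatches(pc1_stream, pc2_stream, steps: int = 3):
    """Return the step indices at which the delayed PC of processor 1 differs from processor 2."""
    line = DelayLine(steps)
    bad = []
    for t, (pc1, pc2) in enumerate(zip(pc1_stream, pc2_stream)):
        delayed = line.step(pc1)
        if delayed is not None and delayed != pc2:
            bad.append(t)
    return bad
```

With a fault-free processor 2 that simply lags by three steps, the function returns an empty list; any divergence shows up at the step where it occurs.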
In split mode, the caches of the two processors can be reloaded differently. If the system then switches to lock mode, the two caches are no longer coherent with each other. The two processors could therefore diverge, and the comparators would signal an error. To avoid this, a flag table is set up in IRAMControl. This table records whether a cache line was written in lock mode or in split mode. When a cache line is reloaded in lock mode, the corresponding entry is set to 0; in split mode it is set to 1, even if the line is updated in only one of the two caches. When a processor performs a memory access in lock mode, a check is made whether the cache line was updated in lock mode, that is, whether the line is identical in both caches. In split mode the processor may always access the cache line, regardless of the state of the Flag_Vector. The table needs to exist only once, because in the event of an error the two processors diverge and the error is therefore reliably detected at the comparators. Because the access times to the central table are relatively long, the table can also be copied to each cache.
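The bookkeeping of the flag table can be sketched as follows. This is a minimal model under assumptions: the class and method names are hypothetical, all lines start out flagged (an assumption about the reset state), and the consequence of a flagged line in lock mode (treating it as unusable so that it is reloaded) is an interpretation, not something the description spells out.

```python
class FlagTable:
    """One flag per cache line: 0 = last written in lock mode, 1 = written in split mode."""

    def __init__(self, n_lines: int):
        # Assumption: after reset no line is known to be identical in both caches.
        self.flags = [1] * n_lines

    def on_reload(self, line: int, lock_mode: bool) -> None:
        # Lock mode reloads both caches identically; split mode may touch only one.
        self.flags[line] = 0 if lock_mode else 1

    def usable(self, line: int, lock_mode: bool) -> bool:
        if not lock_mode:
            return True              # split mode ignores the flag
        return self.flags[line] == 0  # lock mode requires identical lines


table = FlagTable(4)
table.on_reload(1, lock_mode=True)
print(table.usable(1, lock_mode=True))   # line 1 was reloaded in lock mode
table.on_reload(1, lock_mode=False)
print(table.usable(1, lock_mode=True))   # line 1 may now differ between the caches
```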
DramControl: in this component, parity is formed for the address, data and memory control signals of each processor.
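How a parity bit protects such a signal can be shown in a few lines. The description does not say whether even or odd parity is used or how wide the signals are; the sketch below assumes even parity over an integer word, and the function names are hypothetical.

```python
def parity_bit(word: int) -> int:
    """Even parity: 1 if the word contains an odd number of 1 bits, else 0."""
    return bin(word).count("1") & 1


def parity_ok(word: int, parity: int) -> bool:
    """Check a received word against its transmitted parity bit."""
    return parity_bit(word) == parity


address = 0xFBFF
p = parity_bit(address)
print(parity_ok(address, p))           # unchanged word passes the check
print(parity_ok(address ^ 0x0010, p))  # a single flipped bit is detected
```

A single parity bit detects any odd number of flipped bits but not an even number, which is one reason it is combined with the comparison of the two processors in lock mode.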
For both processors there is a process for locking the memory. This process does not need to be implemented in a safety-oriented way, because erroneous memory accesses are detected by the comparators in lock mode, and no safety-relevant applications are executed in split mode. The process checks whether a processor wants to lock the memory against the other processor. The data memory is locked by accessing memory address $FBFF (= 64511). This signal should be present for exactly one clock, even if a wait command is applied to the processor at the time of the call. The state machine that manages data memory accesses consists of two main states:
-Processor state lock: the two processors work in lock mode, so the data memory locking function is not needed. Processor 1 coordinates the memory accesses.
-Processor state split: access conflicts on the data memory must now be resolved, and it must be possible to lock the memory.
The split-mode state is in turn divided into 7 states, which resolve access conflicts and allow the data memory to be locked against the respective other processor. The order in which they are listed also gives the priority when both processors request access at the same time.
-Core1_Lock: processor 1 has locked the data memory. If processor 2 wants to access the memory in this state, it is halted by a wait signal until processor 1 releases the data memory again.
-Core2_Lock: the same state as the previous one, except that processor 2 has now locked the data memory and processor 1 is halted when it tries to access the data memory.
-lock1_wait: the data memory was locked by processor 2 when processor 1 also wanted to reserve it for itself. Processor 1 is therefore pre-registered for the next memory lock.
-nex: the same for processor 2. The data memory was locked by processor 1 during the lock attempt, and processor 2 is pre-registered for the memory. For normal memory accesses without a lock, processor 2 may access before processor 1 if processor 1 had its turn immediately before.
-Memory access by processor 1: the memory is not locked in this case. Processor 1 may access the data memory. If it wants to lock the memory, it can do so in this state.
-Memory access by processor 2: processor 1 does not want to access the memory in the same clock, so the memory is free for processor 2.
-No processor wants to access the data memory.
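The priority order of the split-mode states can be approximated by a small arbitration function. This is a deliberately simplified sketch: it covers the two lock states and the unlocked access states, but it omits the pre-registration for the next lock and the alternation after a previous access by processor 1. All names are hypothetical.

```python
from typing import Optional, Set, Tuple


def arbitrate(locked_by: Optional[int], req1: bool, req2: bool) -> Tuple[Optional[int], Set[int]]:
    """Return (processor granted access, set of processors halted by a wait signal)."""
    if locked_by == 1:  # processor 1 holds the lock, processor 2 must wait
        return (1 if req1 else None), ({2} if req2 else set())
    if locked_by == 2:  # processor 2 holds the lock, processor 1 must wait
        return (2 if req2 else None), ({1} if req1 else set())
    if req1:            # unlocked: processor 1 has priority
        return 1, ({2} if req2 else set())
    if req2:            # unlocked and processor 1 idle: processor 2 may access
        return 2, set()
    return None, set()  # no processor wants to access the data memory


print(arbitrate(None, True, True))  # processor 1 is granted, processor 2 waits
print(arbitrate(2, True, True))     # lock held by processor 2, processor 1 waits
```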
As mentioned, the DVE is made up of the detection of the switch request (IllOPDetect), the mode switch unit, and IramControl and DramControl.
Fig. 3 shows an example of clock switching, in which the clock in one mode is changed relative to the other mode. The clock clk and the two processor clocks (core clocks) are shown for both modes.
In one mode the two processors work with a clock offset. The processors can be offset from each other by whole clocks as well as by fractions of a clock. Another variant is to use different clock frequencies in the two modes. In the safety-critical mode, for example, a lower clock frequency than in performance mode is used in order to suppress interference. The two variants can also be combined with each other.
The further specific implementation shown here also achieves the object stated at the outset.
When a dual-processor (dual-core) system in particular is implemented, one cache is provided for each processor, as shown schematically in Fig. 4. A single cache is normally not sufficient, because it would have to be placed spatially between the two processors. Because of the long propagation time between the cache and the two processors, the processors could then work only at a limited clock frequency.
A cache serves as fast working memory so that the processor does not always have to fetch data from the slow main memory. To achieve this, close attention must be paid to the access time when the cache is implemented. The access time consists of the actual time needed to fetch the data from the cache and the time needed to transfer the data to the processor. If the cache is placed far away from the processor, the data transfer takes very long and the processor can no longer run at its full clock rate. Because of this timing problem, a dedicated cache is usually provided for each processor in dual-processor systems.
If the two processors run with a clock offset, the method proposed in Fig. 5 makes it possible to omit the second cache.
A cache requires a large amount of chip area and also a large amount of current. It therefore also produces a large amount of waste heat, which has to be dissipated. If a cache can be omitted, the dual-processor system can be implemented considerably more cheaply.
In the dual-computer system described here, one processor is the master and one processor is the slave. The master processes the data first and therefore also controls the peripheral components such as memory, cache, DMA controller and so on. The slave processes the same data with a clock offset of, for example, 1.5 clocks. This also means that the slave receives data from the shared memory correspondingly later and supplies its data to the external components correspondingly later. The output data of the two processors (such as memory addresses, data and so on) are compared with each other. For this comparison, the results of the master must likewise be buffered for 1.5 clocks. This example system is described below.
According to Fig. 5, in order to use one cache for both processors, the instruction and data caches are placed directly at the master, as in a single-processor system. The master therefore suffers no performance loss from propagation time between the cache and the processor. Because the slave processes the data exactly 1.5 clocks later, this time is used to carry the data to the second processor, which is now spatially further away from the cache.
For this purpose, with the exemplary clock offset of 1.5 clocks, two flip-flops are used, as illustrated in Fig. 6. The first flip-flop is clocked by the master clock, and the second flip-flop is clocked by the slave clock. The first flip-flop is placed directly at the output of the source. The second flip-flop is placed closer to the slave, at the distance that the signal can cover in the time difference between the two clocks. This corresponds to the propagation length of half a clock for a clock offset of 1.5 clocks, and to the propagation length of one clock for a clock offset of 2 clocks. The second flip-flop then captures the signal. From there, the distance that the signal can cover in one full clock period is available once more. In the figure, 1.) shows the arrangement close to the receiver, 2.) corresponds to the length that can be covered in the clock difference, and 3.) is the length that can be covered in one clock after the second flip-flop.
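The two cases named for the flip-flop placement (half a clock for an offset of 1.5 clocks, one clock for an offset of 2 clocks) fit a simple rule: the stretch between the two flip-flops gets the offset minus one clock, and the stretch after the second flip-flop gets one full clock. The sketch below encodes this rule. The generalisation to other offsets between 1 and 2 clocks is an assumption, and the function name is hypothetical.

```python
from fractions import Fraction


def flipflop_budget(offset_clocks):
    """Return (propagation time between the flip-flops, time after the second one), in clocks."""
    offset = Fraction(offset_clocks)
    if not (1 < offset <= 2):
        # Assumption: the two-flip-flop arrangement is only sketched for this range.
        raise ValueError("sketch only covers offsets above 1 and up to 2 clocks")
    between = offset - 1         # time difference between master and slave clock edges
    after_second = Fraction(1)   # one full clock period to reach the receiver
    return between, after_second


print(flipflop_budget(Fraction(3, 2)))  # offset of 1.5 clocks
print(flipflop_budget(2))               # offset of 2 clocks
```

In both cases the two stretches add up to the clock offset, so the data reach the slave exactly when it needs them.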

Claims (20)

1. A method for operating a multiprocessor system having a first and a second processor, a memory unit being assigned to the first and second processors, wherein the first and second processors can be operated in a performance mode, in which the two processors execute different programs, and wherein the first and second processors can be operated in a safety mode, in which the two processors redundantly execute the same program, characterized in that, in the safety mode, the second processor works with a clock offset relative to the first processor; the first processor accesses the memory unit and the second processor receives the data with the clock offset; and, in the performance mode, the two processors work without a clock offset.
2. The method according to claim 1, characterized in that, in the performance mode, each of the two processors accesses the memory unit.
3. The method according to claim 1, characterized in that the clock offset is generated by a delay element and is used to bridge the propagation time of data and/or instructions from the memory unit to the second processor.
4. The method according to claim 1, characterized in that write operations and read operations are delayed by the clock offset of the second processor.
5. The method according to claim 1, characterized in that only write operations are delayed by the clock offset of the second processor.
6. The method according to claim 1, characterized in that only read operations are delayed by the clock offset of the second processor.
7. The method according to claim 1, characterized in that the clock offset is predefined as one or more half clocks.
8. The method according to claim 1, characterized in that the clock offset is predefined as an integer number of clocks.
9. The method according to claim 1, characterized in that the clock offset is predefined as 1.5 clocks.
10. The method according to claim 1, characterized in that one of the two processors is halted for the switch from the performance mode to the safety mode or from the safety mode to the performance mode.
11. A multiprocessor system having a first and a second processor, a memory unit being assigned to the first and second processors, wherein the first and second processors can be operated in a performance mode, in which the two processors execute different programs, and wherein the first and second processors can be operated in a safety mode, in which the two processors redundantly execute the same program, characterized in that, in the safety mode, the second processor works with a clock offset relative to the first processor; the first processor accesses the memory unit and the second processor receives the data with the clock offset; and, in the performance mode, the two processors work without a clock offset.
12. The multiprocessor system according to claim 11, characterized in that the memory unit is designed as a cache memory, the memory unit is arranged directly at the first processor, and the clock offset is generated by a delay element and is used to bridge the propagation time of data and/or instructions from the memory unit to the second processor.
13. The multiprocessor system according to claim 11, characterized in that write operations and read operations are delayed by the clock offset of the second processor.
14. The multiprocessor system according to claim 11, characterized in that only write operations are delayed by the clock offset of the second processor.
15. The multiprocessor system according to claim 11, characterized in that only read operations are delayed by the clock offset of the second processor.
16. The multiprocessor system according to claim 11, characterized in that the clock offset is predefined as one or more half clocks.
17. The multiprocessor system according to claim 11, characterized in that the clock offset is predefined as an integer number of clocks.
18. The multiprocessor system according to claim 11, characterized in that the clock offset is predefined as 1.5 clocks.
19. The multiprocessor system according to claim 11, characterized in that one of the two processors is halted for the switch from the performance mode to the safety mode or from the safety mode to the performance mode.
20. The multiprocessor system according to claim 11, characterized in that the memory unit is a cache.
CN 200580036461 2004-10-25 2005-10-25 Method and device for delaying multiprocessor system data and/or dictation visit Expired - Fee Related CN100585567C (en)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
DE102004051952.8 2004-10-25
DE200410051952 DE102004051952A1 (en) 2004-10-25 2004-10-25 Data allocation method for multiprocessor system involves performing data allocation according to operating mode to which mode switch is shifted
DE102004051992.7 2004-10-25
DE102004051950.1 2004-10-25
DE102004051937.4 2004-10-25
DE102004051964.1 2004-10-25

Publications (2)

Publication Number Publication Date
CN101048747A CN101048747A (en) 2007-10-03
CN100585567C true CN100585567C (en) 2010-01-27

Family

ID=36129010

Family Applications (5)

Application Number Title Priority Date Filing Date
CN 200580036488 Expired - Fee Related CN100511167C (en) 2004-10-25 2005-10-25 Method and device for monitoring memory cell of multiprocessor system
CN 200580036617 Expired - Fee Related CN100555233C (en) 2004-10-25 2005-10-25 Be used for carrying out synchronous method and apparatus at multicomputer system
CN 200580036441 Pending CN101048745A (en) 2004-10-25 2005-10-25 Method and device for switching over in multiprocessor system
CN 200580036538 Pending CN101048754A (en) 2004-10-25 2005-10-25 Method and device for distributing data from at least data source in multiprocessor system
CN 200580036461 Expired - Fee Related CN100585567C (en) 2004-10-25 2005-10-25 Method and device for delaying multiprocessor system data and/or dictation visit

Country Status (2)

Country Link
CN (5) CN100511167C (en)
DE (1) DE102004051952A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI716074B (en) * 2019-01-16 2021-01-11 開曼群島商創新先進技術有限公司 Method and device for improving CPU performance and electronic equipment

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8090984B2 (en) * 2008-12-10 2012-01-03 Freescale Semiconductor, Inc. Error detection and communication of an error location in multi-processor data processing system having processors operating in Lockstep
JP5218585B2 (en) * 2011-03-15 2013-06-26 オムロン株式会社 Control device and system program
JP5796311B2 (en) * 2011-03-15 2015-10-21 オムロン株式会社 Control device and system program
CN106850944A (en) * 2016-12-13 2017-06-13 北京元心科技有限公司 Smart machine awakening method and device
US10353767B2 (en) * 2017-09-14 2019-07-16 Bae Systems Controls Inc. Use of multicore processor to mitigate common mode computing faults

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE1269827B (en) * 1965-09-09 1968-06-06 Siemens Ag Method and additional device for the synchronization of data processing systems working in parallel

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI716074B (en) * 2019-01-16 2021-01-11 開曼群島商創新先進技術有限公司 Method and device for improving CPU performance and electronic equipment
US10983839B2 (en) 2019-01-16 2021-04-20 Advanced New Technologies Co., Ltd. Method, apparatus, and electronic device for improving CPU performance
US11269693B2 (en) 2019-01-16 2022-03-08 Advanced New Technologies Co., Ltd. Method, apparatus, and electronic device for improving CPU performance

Also Published As

Publication number Publication date
CN100555233C (en) 2009-10-28
CN101048749A (en) 2007-10-03
CN101048754A (en) 2007-10-03
CN100511167C (en) 2009-07-08
CN101048761A (en) 2007-10-03
CN101048747A (en) 2007-10-03
DE102004051952A1 (en) 2006-04-27
CN101048745A (en) 2007-10-03

Similar Documents

Publication Publication Date Title
JP4532561B2 (en) Method and apparatus for synchronization in a multiprocessor system
EP2027538B1 (en) Systems and methods for providing remote pre-fetch buffers
CN100585567C (en) Method and device for delaying multiprocessor system data and/or dictation visit
JP5199088B2 (en) Method and apparatus for controlling a computer system comprising at least two instruction execution units and one comparison unit
US20090044044A1 (en) Device and method for correcting errors in a system having at least two execution units having registers
JPH05197582A (en) Fault tolerant processor having majority decision system, whose dynamic reconstitution is possible
US20150234661A1 (en) Semiconductor integrated circuit device and system using the same
US20090044048A1 (en) Method and device for generating a signal in a computer system having a plurality of components
CN112667450B (en) Dynamically configurable fault-tolerant system with multi-core processor
US20090119540A1 (en) Device and method for performing switchover operations in a computer system having at least two execution units
US20080263340A1 (en) Method and Device for Analyzing a Signal from a Computer System Having at Least Two Execution Units
US7237148B2 (en) Functional interrupt mitigation for fault tolerant computer
US20080313384A1 (en) Method and Device for Separating the Processing of Program Code in a Computer System Having at Least Two Execution Units
US20070294559A1 (en) Method and Device for Delaying Access to Data and/or Instructions of a Multiprocessor System
JP3746957B2 (en) Control method of logical partitioning system
US20060195849A1 (en) Method for synchronizing events, particularly for processors of fault-tolerant systems
JP2839664B2 (en) Computer system
JP2002014943A (en) Failure-proof system and its failure detection method
US20140372837A1 (en) Semiconductor integrated circuit and method of processing in semiconductor integrated circuit
JPS5847746B2 (en) multiprocessor system
JP5888419B2 (en) Data processing apparatus, processor, and operation history recording method
HU184570B (en) Circuit arrangement for increasing reliability of the microcomputer systems

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20100127

Termination date: 20121025