WO1993019424A1 - System and method for supporting a multiple width memory subsystem - Google Patents

System and method for supporting a multiple width memory subsystem

Publication number
WO1993019424A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
bus
memory
width
bit
Prior art date
Application number
PCT/JP1993/000317
Other languages
French (fr)
Inventor
Derek J. Lentz
Cheng-Long Tang
Original Assignee
Seiko Epson Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Seiko Epson Corporation
Priority to JP5516422A (JPH07504773A)
Publication of WO1993019424A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/16 Handling requests for interconnection or transfer for access to memory bus
    • G06F13/18 Handling requests for interconnection or transfer for access to memory bus based on priority control
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0844 Multiple simultaneous or quasi-simultaneous cache accessing
    • G06F12/0855 Overlapped cache accessing, e.g. pipeline
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38 Information transfer, e.g. on bus
    • G06F13/40 Bus structure
    • G06F13/4004 Coupling between buses
    • G06F13/4009 Coupling between buses with data restructuring
    • G06F13/4018 Coupling between buses with data restructuring with data-width conversion

Definitions

  • Microprocessor Architecture Capable of Supporting Multiple Heterogeneous Processors, invented by Derek J. Lentz et al., Attorney Ref: SP016, Application Serial No. 07/726,893, filed July 8, 1991, which is hereby incorporated by reference in its entirety.
  • the present invention relates generally to the field of microprocessor memory systems, and more particularly to a system that supports a dual width memory bus.
  • a typical computer-based processor system (or computer system) consists of three major subsystems: a main memory, one or more central processing units (CPU) and an input-output (I/O) subsystem.
  • CPU central processing units
  • I/O input-output
  • the various subsystems must have interfaces to one another. For example, the memory and CPU need to communicate, as well as the CPU and I/O devices.
  • This communication is typically done via a bus.
  • the bus serves as a shared communication link between the subsystems.
  • Two major advantages of having a bus are low cost and versatility. By defining a single interconnection scheme, new devices and subsystems can easily be added to the computer system. Moreover, peripherals may even be ported between separate computer systems that use a common bus.
  • bus speed is largely limited by physical factors: the length of the bus and the number of devices (and, hence, bus loading). These physical limits prevent arbitrary bus speedup.
  • the objective of designing a memory subsystem is to attempt to match processor speed with the rate of information (or bandwidth) of memory at the lowest level and most reasonable cost.
  • a wider bus called a "memory bus" to increase the memory bandwidth or to reduce the latency of memory.
  • the memory bandwidth is the number of memory bytes that can be transferred (either fetched or stored) between the CPU and the memory per unit time.
  • the present invention provides a memory system interface design for a processor and a method of operating such an interface which provides access to a dual width memory bus.
  • the present invention provides a mechanism that allows a computer-based system to access either a 32 bit memory bus or a 64 bit memory bus.
  • the 32 bit memory bus would be used for low-end products, while the 64 bit memory bus would be used for high-end products.
  • a memory control unit (MCU) of the present invention supports both modes: the 32 bit bus mode and the 64 bit bus mode.
  • the present invention in one embodiment has been integrated onto a microprocessor chip.
  • Selecting a 32 bit or 64 bit memory subsystem provides a user with a flexible framework in which to design a system.
  • the user can adjust system cost and performance by choosing to utilize a 32 bit or 64 bit external bus.
  • the present invention provides a system and method which decreases the amount of wires necessary to transfer data.
  • a microprocessor chip incorporating the present invention allows switching between the 32 bit or 64 bit external memory bus without changing the control signals and/or system configuration.
  • the present invention provides a computer-based system and method for efficiently transferring data over an external memory bus between a main memory and a bus requestor, comprising a dual width memory subsystem configured to provide access to a plurality of different external memory buses.
  • the dual width memory subsystem comprises a plurality of multiplexers connected to receive data from the bus requestor and a storage device connected to receive and store data from the plurality of multiplexers, the data is stored in blocks depending on the width of the external bus.
  • the dual width memory subsystem comprises a storage device connected to receive and store data from the external memory bus, the data is stored in blocks depending on the width of the external bus.
  • a plurality of multiplexers connected to receive data from the storage device, and connected to send said data to a bus requestor in blocks determined by the limitations of the system.
  • FIG. 1 is a general block diagram of the system architecture 100 of the present invention
  • FIG. 2 is a circuit block diagram for a cache 110 write (store) to the main memory 150;
  • FIG. 3a and FIG. 3b are the cache write (store) data timing for a 32 bit and a 64 bit memory bus, respectively;
  • FIG. 4 is a circuit block diagram for a read (fetch) from main memory 150 to cache 110;
  • FIG. 5a and FIG. 5b show return read data timing for a 32 bit and a 64 bit memory bus, respectively;
  • FIG. 6 is a detailed circuit diagram of the Data Multiplexer Select 240 shown in FIG. 2;
  • FIG. 7 is a general flowchart for writing a data stream to main memory 150;
  • FIG. 8 is a general flowchart for reading a data stream from main memory 150.
  • System architecture 100 includes a CPU 105, a cache controller unit 110 which includes cache memories 113 and 115, an I/O subsystem 130, memory control and interface unit 120 (MCU), and interleaved memory banks 150a, 150b, 150c configured for interleaved operations.
  • the interleaved memory banks 150 are connected to MCU 120 via an external data bus 140.
  • the present invention allows MCU 120 to accept data from either a 32 bit or a 64 bit external bus 140. It is contemplated that the present invention will operate in a multiprocessor environment.
  • Cache memory 113, 115 serves as a buffer between CPU 105 and memory 150a, 150b, and 150c.
  • a cache is a small, fast memory located close to CPU 105 that holds the most recently accessed code or data.
  • CPU 105 is the fastest unit in the system, with a processor cycle typically of tens of nanoseconds, while memory 150 has a cycle time of hundreds of nanoseconds.
  • the speed gap between CPU 105 and memory 150 can be closed by using fast cache memory 110 between the two.
  • connecting a wider external bus 140 to MCU 120 allows more data to be transferred.
  • the present invention allows MCU 120 to be connected to memory buses with different data widths.
  • MCU 120 of a preferred embodiment of the present invention comprises a switch network 125 which includes a switch arbitration unit 132, a data cache interface circuit 117, an instruction cache interface circuit 112, an I/O interface circuit 135, and one or more memory port interface circuits 127 known as ports, each port interface circuit 127 includes a port arbitration unit 134.
  • MCU 120 is a circuit whereby data and instructions are transferred (read or written) between cache controller unit 110 (CCU) (both D-cache 115 and I-cache 113 (read only)), IOU 130 and main memory 150.
  • Switch network 125 is a means of communicating between a master and slave device.
  • the possible master devices are a D_Cache 115, an I_Cache 113, or an I/O Controller Unit (IOU) 130 and the possible slave devices are a memory port 127 or an IOU 130, for example.
  • IOU I/O Controller Unit
  • The function of switch network 125 is to receive the various instruction and data requests from CCU 110 and IOU 130. These units may be referred to as bus requestors. After having received these requests, the switch arbitration unit 132 passes them to the appropriate memory port (depending on the instruction address). The port 127, or ports as the case may be, will then generate the necessary timing signals, and send or receive the necessary data to/from external memory bus 140. Memory interface port 127 manages the data by sending to and receiving from interleaved memory 150. D-cache 115 requires that any data transaction be carried out in 64 bit blocks, regardless of whether the system is currently coupled to a 32 or 64 bit external memory bus 140.
  • Switch network 125 is connected to CCU 110, IOU 130, and memory port 127 via a set of tri-state buffered signal buses.
  • the tri-state buffered signal buses include a memory control data bus (MC_D_BUS) 126(a), a cache data bus (CC_D_BUS) 126(b), and a memory control instruction bus (MC_I_BUS) 126(c).
  • the present invention includes request buses CC_D_REQ 128(a) and CC_I_REQ 128(b) and control signals (not shown) MC_D_REQ_ACK, MC_D_DA_ACK, and
  • a bus transaction includes two parts: sending the address and receiving or sending the data.
  • Bus transactions are usually defined by what they do to memory: a read transaction transfers data from memory (to either the CPU or an I/O device, for example), and a write transaction writes data to the memory.
  • the address is first put on the memory address bus (not shown) to memory 150, together with the appropriate control signals indicating a read.
  • the memory responds by returning the data on bus 140 with the corresponding control signals.
  • a write transaction requires that the CPU or I/O device send both address and data, and requires no return of data.
  • the present invention contemplates being placed on a chip with either a 64 pin or a 32 pin external memory bus interface.
  • the 64 pin interface can be used in either 32 or 64 bit mode (i.e., with either a 64 bit or 32 bit external bus 140).
  • a chip with a 32 pin memory data bus interface can not operate in 64 bit mode.
  • a preferred embodiment will assume a 32 bit memory interface during power up (boot), read a word from a fixed location (on or off chip) and ascertain therefrom the configuration required for proper system operation.
  • CPU 105 reads and executes boot code.
  • the boot code instructs CPU 105 to read a specific memory location in memory 150. That memory location would have encoded in it the information to determine what size data bus is coupled to system 100.
  • An alternative embodiment includes pre-programming the chip hardware with the size of external bus 140.
  • the sub-systems described below for allowing access to a 32 bit or 64 bit external memory bus 140 are aware, immediately after the chip is powered up or after a hardware reset, which size external bus 140 is currently coupled to system 100.
  • other means for determining the size of the external bus will be apparent to those skilled in the art and in no way is the present invention limited to the techniques described above.
  • CC_D_BUS 126(b) and SW_WD 215, are used to send write data from the master device (e.g., D_cache 115) to a write data FIFO 230 (described below and shown in FIG. 2).
  • MC_D_BUS 126(a), SW_RD 450 and 455 are used to send the return read data from the slave device (memory port 127 or IOU 130) back to the master device.
  • SW_WD and SW_RD are both tri-state buses.
  • the present invention allows the system architecture described above to be interfaced with either a 32 or 64 bit external memory bus 140.
  • the present invention is designed to use a maximum of two clock cycles to send a word between cache 110 and memory interface port 127, or vice versa. For example, if cache 110 writes one long word (64 bits), and the system 100 is coupled to a 32 bit external bus 140, it will take two clock cycles to send the data to memory interface port 127.
  • FIG. 2 there is a logic design 200 (hereinafter subsystem 200) for writing to main memory 150.
  • Subsystem 200 represents the hardware necessary for either a 32 or 64 bit data transfer.
  • Subsystem 200 transfers data in a "double pumped" fashion.
  • the subsystem 200 can transfer one half word of data every half clock cycle. Since the buses are double-pumped, care is taken in the circuit design to ensure that there is no bus-conflict when the buses turn around and switch from one master to a new master. Double pumping reduces the number of required bit lines thereby minimizing expensive wire requirements with minimal performance degradation. Although the preferred embodiment implements a double pumping scheme, double pumping is not necessary for carrying out the present invention.
  • Subsystem 200 uses multiplexers 210, 220 to send data from data cache bus (CC_D_BUS) 126(b) to main memory 150.
  • Multiplexers 210, 220 of a preferred embodiment of the present invention use multiplexer latches. In other words, the multiplexers can temporarily store data. 16 or 32 bits (depending on whether there is a 32 or 64 bit memory bus 140, respectively) of data will be transferred to memory interface port 127 every half clock cycle and stored in a write data FIFO 230, located between memory interface port 127 and CCU 110.
  • Subsystem 200 also contains a buffer 250 and pad 260.
  • Buffer 250 is a tri-state output pad buffer to drive the external memory data bus and pad 260 is used to connect subsystem 200 to main memory 150.
  • FIGs. 3(a) and 3(b) show a pair of timing diagrams for writing data to memory 150 with either a 32 bit memory bus or a 64 bit memory bus, respectively. Data can be transferred in one cycle if a 64 bit memory bus is used, while it takes two cycles to transfer data using a 32 bit memory bus.
  • FIG. 3(a) shows a timing diagram for writing data to memory 150 with a 32 bit memory bus.
  • cache 110 sends a request via the cache request signal (CC_D_REQ) 128(a) for access to memory 150, shown at reference number 310.
  • MCU 120 acknowledges that request when MC_D_REQ_ACK goes high (at the rising edge of clock 305), as shown at reference number 315.
  • the data to be written to memory 150 appears on the CC_D_BUS 126(b), as shown at reference number 320.
  • the data is transferred to write data FIFO 230 during the next two clock 305 cycles.
  • the MC_D_DA_ACK signal indicates, as shown at reference number 325, that the data is currently being received in write data FIFO
  • the first 32 bits are sent during the first clock 305 cycle (16 bits per half clock cycle) and the second 32 bits are sent during the second clock 305 cycle.
  • FIG. 2 initially, all 64 bits of data act as inputs into multiplexers 210 and 220.
  • the first 32 bits are selected from multiplexers 210, 220 and saved via SW_WD 215, 217 in write data FIFO 230.
  • the second 32 bits are selected from multiplexers 210, 220 and saved in write data FIFO 230 via SW_WD 215, 217.
  • Those skilled in the relevant art will readily be capable of generating the necessary control signals/logic for multiplexers 210, 220 based on the disclosed timing signals described above and the information that the system 100 is currently coupled to a 32 bit external bus.
  • data multiplexer select 240 provides a scheme for selecting a set of bytes from the data being transferred from write data FIFO 230 to main memory 150.
  • a read-modify-write potentially only a portion of the data is modified. For example, as shown in FIG. 6(b) only the first 8 bits of W0 have been modified (as indicated by shading).
  • data multiplexer select 240 is not essential to practice the present invention. It is only an option that has been implemented in a preferred embodiment of the present invention.
  • FIG. 3(a) shows an example of two 64 bit words being written to memory 150 via a 32 bit external bus 140.
  • a sample write data FIFO 340 is shown with four 32 bit blocks. At this point, the data is ready to be sent on external data bus 140 in 32 bit blocks from write data FIFO 340.
  • FIG. 3(b) shows the timing diagram for writing data to main memory 150 with a 64 bit external memory bus 140.
  • cache 110 requests an acknowledgement from MCU 120 that it can write data to memory 150. Once again, this is accomplished by sending CC_D_REQ high at 350.
  • MCU 120 acknowledges the request by sending MC_D_REQ_ACK high at 355, at which point the data is sent onto the CC_D_BUS 126(b).
  • FIG. 4 shows the memory system for reading data (i.e., an information fetch).
  • cache 110 requires that data be returned back to cache 110 in 64 bit blocks. If memory port 127 returns a two long-word read request to cache 110, it will take two clock 505 cycles to send 128 bits to cache 110.
  • the SW_RD bus 450, 455 is used to send the return read data from the slave device (memory port 127 or IOU 130) back to the master device. This bus is not double-pumped because of the timing constraints of cache 110. Data is sent only when clock 505 is high. Cache 110 requires that the data be valid at the falling edge of clock 505.
  • Subsystem 400 also contains a buffer 440 and pad 450. Buffers 440 and 450 are used to translate the external pad voltages to the internal logic voltages and pad 450 is used to connect subsystem 400 to main memory 150.
  • FIG. 5(a) and FIG. 5(b) show the read data timing back to cache 110 for 32 bit and 64 bit bus modes, respectively.
  • FIG. 5(a) shows the timing diagram for reading data from main memory 150 using a 32 bit external memory bus 140. Initially, 32 bits of data are transferred over external memory bus 140 and placed in read data FIFO 430 in 32 bit blocks. Next, the data is placed on data lines SW_RD as shown at reference numbers 510 and 512.
  • the data when utilizing a 32 bit external bus 140 the data enters MCU 120 through port 127.
  • the data is then stored in read data FIFO 430 in 32 bit blocks. Initially, read data FIFO 430 is empty and data lines 450, 455 are available. However, once data lines SW_RD[31:0] 450 and SW_RD[63:32] 455 become unavailable, the data remains stored in read data FIFO 430 until the data lines 450, 455 become available (only data line 450 is used in 32 bit mode).
  • the first 32 bits in read data FIFO 430 are sent to multiplexers 410, 420.
  • multiplexer 420 is concerned with the lower 32 bits and multiplexer 410 is concerned with the higher 32 bits.
  • the first 32 bits are popped from read data FIFO 430 and placed at the input of multiplexer 420 via SW_RD[31:0] 450.
  • the second set of 32 bits will be popped from read data FIFO 430 and sent to multiplexer 410 via SW_RD[63:32] 455.
  • all 64 bits of data are placed at the inputs of multiplexers 410, 420, all 64 bits are selected from multiplexers 410, 420, and placed on MC_D_BUS 126(a) (or MC_I_BUS 126(c) as the case may be) and read into cache 110.
  • An alternative embodiment of the present invention can be configured with separate sets of multiplexers, one set for I_Cache 113 and a second set for D_Cache 115.
  • data line SW_RD[63:32] is optional for 32 bit (low cost) implementations.
  • FIG. 5(b) shows a timing diagram for reading data when a 64 bit external bus 140 is being utilized. Initially, the data from external data bus 140 is stored in read data FIFO 430. Since a 64 bit external data bus 140 is being used, the data is stored in read data FIFO 430 in 64 bit long words. The data remains in read data FIFO 430 until data lines SW_RD 450, 455 are available.
  • once SW_RD 450, 455 become available, all 64 bits are transferred to the inputs of the multiplexers via SW_RD 450, 455, as shown at reference number 550.
  • once MC_D_B_VLD goes high, as shown at 555, the data will subsequently be placed on MC_D_BUS 126(a) (or MC_I_BUS 126(c) as the case may be) during the next cycle of clock 505, as shown at reference number 560.
  • the data is transferred over MC_D_BUS 126(a) and forwarded to requesting cache 110.
  • Data is put into read data FIFO 430 when the switch read bus (SW_RD) is not available. Data is always put in write data FIFO 230 and read out according to memory timing requirements. If external bus 140 or SW_RD buses are currently being used by some other port, the oncoming write or read data is temporarily pushed into write data FIFO 230 or read data FIFO 430, respectively. When the requested bus becomes available (i.e., external bus 140 or SW_RD is released), data is popped from the particular FIFO and transferred to either memory 150, or requesting cache 110 or IOU 130. On the other hand, if the requested bus is available when the data arrives in the write data FIFO 230 or read data FIFO 430, then the data is immediately transferred through the respective FIFO onto the data lines.
  • SW_RD switch read bus
  • the memory system is designed to allow a 64 bit data path to operate in either 64 or 32 bit mode. Software can select which system configuration is used. The 32 bit mode control operation for both 64 and 32 bit chips is the same. Essentially, the control logic and the data path is similar to when the system is configured to connect to a 32 bit external bus and run in 32 bit mode. However, when a 32 bit external bus is used the upper bits of the switch 125 and the write data FIFO 230 or read data FIFO 430 are not used (i.e., SW_WD[31:16] and SW_RD[63:32] will be "don't care"). But as discussed above, the control logic remains the same.
  • the write data FIFO 230 and read data FIFO 430 must be able to store at least two sets of data at any given time, where a set of data is equal to the maximum block of data that is to be transferred. This ensures that when external bus 140 has been accessed and the first set of data is being placed onto external bus 140, a second set of data is immediately available to be put on external bus 140. Thus, there is guaranteed to be no lag time between the first set of data being placed on external bus 140 and the second set of data being placed on external bus 140.
  • the present invention is directly scalable (e.g. 64/128 bits).
  • the present invention is applicable to dual width I/O buses.
  • the present invention is not limited to external buses, but can be applied to internal buses as well.
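
The write path described in the items above (double-pumped SW_WD transfers into write data FIFO 230) can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the function names are hypothetical, the low-order bits are assumed to be sent first, and the model ignores arbitration, acknowledge signals, and the data multiplexer select. It shows that a 64 bit word moves as 16 bit chunks per half clock cycle (two clock cycles) in 32 bit mode and as 32 bit chunks per half clock cycle (one clock cycle) in 64 bit mode, and that the FIFO holds blocks sized to the external bus width.

```python
from collections import deque

def half_cycle_chunks(word64, bus_width):
    """Split a 64 bit word into the chunks sent per half clock cycle.

    Assumption: the low-order chunk is sent first.
    """
    if bus_width not in (32, 64):
        raise ValueError("bus width must be 32 or 64 bits")
    chunk_bits = bus_width // 2  # 16 bits (32 bit bus) or 32 bits (64 bit bus)
    mask = (1 << chunk_bits) - 1
    return [(word64 >> shift) & mask for shift in range(0, 64, chunk_bits)]

def write_word(word64, bus_width, fifo):
    """Push one 64 bit word into the FIFO as bus-width blocks.

    Returns the number of full clock cycles the transfer takes.
    """
    chunk_bits = bus_width // 2
    chunks = half_cycle_chunks(word64, bus_width)
    # Two half-cycle chunks (one full clock cycle) form one FIFO entry.
    for i in range(0, len(chunks), 2):
        fifo.append(chunks[i] | (chunks[i + 1] << chunk_bits))
    return len(chunks) // 2

WORD = 0x1122334455667788  # arbitrary test pattern

fifo32 = deque()
cycles32 = write_word(WORD, 32, fifo32)

fifo64 = deque()
cycles64 = write_word(WORD, 64, fifo64)

print(cycles32, [hex(entry) for entry in fifo32])
print(cycles64, [hex(entry) for entry in fifo64])
```

In this model, writing two 64 bit words in 32 bit mode leaves four 32 bit blocks in the FIFO, which matches the sample write data FIFO described for FIG. 3(a).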

Abstract

The present invention provides a memory system interface design, which provides access to a dual width memory bus. Specifically, a subsystem and method provides for interfacing with a 32 bit or a 64 bit bus. The 32 bit bus would be used for low end products, and the 64 bit bus would be used for high end products. A memory control unit (MCU) supports both the 32 bit and 64 bit modes. Selecting a 32 bit or 64 bit memory subsystem gives a user more room to adjust system cost and performance.

Description

DESCRIPTION
Title: SYSTEM AND METHOD FOR SUPPORTING A MULTIPLE WIDTH MEMORY SUBSYSTEM
CROSS-REFERENCE TO RELATED APPLICATION
The present application is related to the following application, assigned to the Assignee of the present application:
Microprocessor Architecture Capable of Supporting Multiple Heterogeneous Processors, invented by Derek J. Lentz et al., Attorney Ref: SP016, Application Serial No. 07/726,893, filed July 8, 1991, which is hereby incorporated by reference in its entirety.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to the field of microprocessor memory systems, and more particularly to a system that supports a dual width memory bus.
2. Discussion of Related Art
A typical computer-based processor system (or computer system) consists of three major subsystems: a main memory, one or more central processing units (CPU) and an input-output (I/O) subsystem. In a computer system, the various subsystems must have interfaces to one another. For example, the memory and CPU need to communicate, as well as the CPU and I/O devices.
This communication is typically done via a bus. The bus serves as a shared communication link between the subsystems. Two major advantages of having a bus are low cost and versatility. By defining a single interconnection scheme, new devices and subsystems can easily be added to the computer system. Moreover, peripherals may even be ported between separate computer systems that use a common bus.
One reason bus design is so difficult is that the maximum bus speed is largely limited by physical factors: the length of the bus and the number of devices (and, hence, bus loading). These physical limits prevent arbitrary bus speedup.
The objective of designing a memory subsystem is to attempt to match processor speed with the rate of information (or bandwidth) of memory at the lowest level and most reasonable cost. For main memory, we can use a wider bus called a "memory bus" to increase the memory bandwidth or to reduce the latency of memory. In the case of a memory subsystem, the memory bandwidth is the number of memory bytes that can be transferred (either fetched or stored) between the CPU and the memory per unit time. Hence, the maximum memory bus bandwidth B is equal to B = W/Tm bytes/s, where W is the width of the word in bytes delivered per memory cycle Tm.
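As a quick worked example of the bandwidth relation above (bytes delivered per memory cycle divided by the memory cycle time), the sketch below compares a 32 bit and a 64 bit memory bus. The 200 ns memory cycle time is an assumed, illustrative figure and is not taken from this application; the function name is likewise hypothetical.

```python
def memory_bandwidth(width_bits, cycle_time_s):
    """Return the peak bandwidth B = W / Tm in bytes per second."""
    width_bytes = width_bits // 8  # W: bytes delivered per memory cycle
    return width_bytes / cycle_time_s

TM = 200e-9  # assumed memory cycle time Tm (200 ns), for illustration only

b32 = memory_bandwidth(32, TM)
b64 = memory_bandwidth(64, TM)

print(f"32 bit bus: {b32 / 1e6:.1f} MB/s")
print(f"64 bit bus: {b64 / 1e6:.1f} MB/s")
```

For a fixed memory cycle time, doubling the bus width doubles the peak bandwidth, which is the cost/performance trade-off the dual width interface is meant to expose.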
Oftentimes, a variety of different size memory buses are available to help increase performance. However, designing a system that allows access to multiple external buses having different widths presents a design problem. If, for example, a system is currently configured to accept data from the memory bus in 32 bit blocks, a 64 bit data transfer will create a predicament for the CPU and/or cache. Consequently, a system is needed that allows memory buses with different widths to be utilized without changing the overall configuration of the computer system. For a more in depth discussion of the above, see Hennessy et al., Computer Architecture: A Quantitative Approach, Morgan Kaufmann Publishers (1990).
SUMMARY OF THE INVENTION
The present invention provides a memory system interface design for a processor and a method of operating such an interface which provides access to a dual width memory bus. Specifically, the present invention provides a mechanism that allows a computer-based system to access either a 32 bit memory bus or a 64 bit memory bus. The 32 bit memory bus would be used for low-end products, while the 64 bit memory bus would be used for high-end products. A memory control unit (MCU) of the present invention supports both modes: the 32 bit bus mode and the 64 bit bus mode. The present invention in one embodiment has been integrated onto a microprocessor chip.
Selecting a 32 bit or 64 bit memory subsystem provides a user with a flexible framework in which to design a system. The user can adjust system cost and performance by choosing to utilize a 32 bit or 64 bit external bus. The present invention provides a system and method which decreases the number of wires necessary to transfer data. Moreover, a microprocessor chip incorporating the present invention allows switching between the 32 bit or 64 bit external memory bus without changing the control signals and/or system configuration.
The present invention provides a computer-based system and method for efficiently transferring data over an external memory bus between a main memory and a bus requestor, comprising a dual width memory subsystem configured to provide access to a plurality of different external memory buses. The dual width memory subsystem comprises a plurality of multiplexers connected to receive data from the bus requestor and a storage device connected to receive and store data from the plurality of multiplexers; the data is stored in blocks depending on the width of the external bus. Furthermore, the dual width memory subsystem comprises a storage device connected to receive and store data from the external memory bus; the data is stored in blocks depending on the width of the external bus. A plurality of multiplexers is connected to receive data from the storage device and to send said data to a bus requestor in blocks determined by the limitations of the system.
BRIEF DESCRIPTION OF THE DRAWINGS
The above and further advantages of this invention may be better understood by referring to the following description taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a general block diagram of the system architecture 100 of the present invention;
FIG. 2 is a circuit block diagram for a cache 110 write (store) to the main memory 150;
FIG. 3a and FIG. 3b are the cache write (store) data timing for a 32 bit and a 64 bit memory bus, respectively;
FIG. 4 is a circuit block diagram for a read (fetch) from main memory 150 to cache 110;
FIG. 5a and FIG. 5b show return read data timing for a 32 bit and a 64 bit memory bus, respectively;
FIG. 6 is a detailed circuit diagram of the Data Multiplexer Select 240 shown in FIG. 2;
FIG. 7 is a general flowchart for writing a data stream to main memory 150; and
FIG. 8 is a general flowchart for reading a data stream from main memory 150.
DETAILED DESCRIPTION
I. Environment Background for the Present Invention
Referring to FIG. 1, there is provided in accordance with a preferred embodiment of the present invention a microprocessor architecture designated generally as 100. System architecture 100 includes a CPU 105, a cache controller unit 110 which includes cache memories 113 and 115, an I/O subsystem 130, memory control and interface unit 120 (MCU), and interleaved memory banks 150a, 150b, 150c configured for interleaved operations. The interleaved memory banks 150 are connected to MCU 120 via an external data bus 140. The present invention allows MCU 120 to accept data from either a 32 bit or a 64 bit external bus 140. It is contemplated that the present invention will operate in a multiprocessor environment.
Cache memory 113, 115 serves as a buffer between CPU 105 and memory 150a, 150b, and 150c. Generally, a cache is a small, fast memory located close to CPU 105 that holds the most recently accessed code or data. Typically, CPU 105 is the fastest unit in the system, with a processor cycle of tens of nanoseconds, while memory 150 has a cycle time of hundreds of nanoseconds. The speed gap between CPU 105 and memory 150 can be closed by using fast cache memory 110 between the two. However, regardless of how fast CPU 105 and cache 110 are, performance will suffer if there are no means of retrieving the data in a fast, efficient manner. Consequently, connecting a wider external bus 140 to MCU 120 allows more data to be transferred. Thus, the present invention allows MCU 120 to be connected to memory buses with different data widths.
MCU 120 of a preferred embodiment of the present invention comprises a switch network 125 which includes a switch arbitration unit 132, a data cache interface circuit 117, an instruction cache interface circuit 112, an I/O interface circuit 135, and one or more memory port interface circuits 127 known as ports. Each port interface circuit 127 includes a port arbitration unit 134. MCU 120 is a circuit whereby data and instructions are transferred (read or written) between cache controller unit 110 (CCU) (both D-cache 115 and I-cache 113 (read only)), IOU 130 and main memory 150.
Switch network 125 is a means of communicating between a master and slave device. To switch network 125 the possible master devices are a D_Cache 115, an I_Cache 113, or an I/O Controller Unit (IOU) 130 and the possible slave devices are a memory port 127 or an IOU 130, for example.
The function of switch network 125 is to receive the various instruction and data requests from CCU 110 and IOU 130. These units may be referred to as bus requestors. After having received these requests, the switch arbitration unit 132 passes them to the appropriate memory port (depending on the instruction address). The port 127, or ports as the case may be, will then generate the necessary timing signals, and send or receive the necessary data to/from external memory bus 140. Memory interface port 127 manages the data by sending to and receiving from interleaved memory 150. D-cache 115 requires that any data transaction be carried out in 64 bit blocks, regardless of whether the system is currently coupled to a 32 or 64 bit external memory bus 140.
Switch network 125 is connected to CCU 110, IOU 130, and memory port 127 via a set of tri-state buffered signal buses. The tri-state buffered signal buses include a memory control data bus (MC_D_BUS) 126(a), a cache data bus (CC_D_BUS) 126(b), and a memory control instruction bus (MC_I_BUS) 126(c). Furthermore, the present invention includes request buses CC_D_REQ 128(a) and CC_I_REQ 128(b) and control signals (not shown) MC_D_REQ_ACK, MC_D_DA_ACK, and MC_D_B_VLD.
Generally, a bus transaction includes two parts: sending the address and receiving or sending the data. Bus transactions are usually defined by what they do to memory: a read transaction transfers data from memory (to either the CPU or an I/O device, for example), and a write transaction writes data to the memory. In a read transaction, the address is first put on the memory address bus (not shown) to memory 150, together with the appropriate control signals indicating a read. The memory responds by returning the data on bus 140 with the corresponding control signals. A write transaction requires that the CPU or I/O device send both address and data, and requires no return of data.
The present invention contemplates being placed on a chip with either a 64 pin or a 32 pin external memory bus interface. As will be appreciated, the 64 pin interface can be used in either 32 or 64 bit mode (i.e., with either a 64 bit or 32 bit external bus 140). A chip with a 32 pin memory data bus interface cannot operate in 64 bit mode.
At reset, a preferred embodiment will assume a 32 bit memory interface during power up (boot), read a word from a fixed location (on or off chip) and ascertain therefrom the configuration required for proper system operation. In particular, during power up CPU 105 reads and executes boot code. The boot code instructs CPU 105 to read a specific memory location in memory 150. That memory location has encoded in it the information needed to determine what size data bus is coupled to system 100. An alternative embodiment includes pre-programming the chip hardware with the size of external bus 140. Thus, the sub-systems described below for allowing access to a 32 bit or 64 bit external memory bus 140 are aware, immediately after the chip is powered up or after a hardware reset, which size external bus 140 is currently coupled to system 100. Of course, other means for determining the size of the external bus will be apparent to those skilled in the art and in no way is the present invention limited to the techniques described above.
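The boot-time width determination described above can be sketched in software. This is only an illustrative model: the address CONFIG_ADDR, the bit-0 encoding of the configuration word, and the function name detect_bus_width are assumptions introduced here, since the specification does not give the location or format of the configuration word.

```python
# Hypothetical sketch of boot-time bus-width detection.
# CONFIG_ADDR and the bit-0 encoding are illustrative assumptions.
CONFIG_ADDR = 0x0000_0000  # assumed fixed location read by the boot code
WIDTH_64_FLAG = 0x1        # assumed encoding: bit 0 set -> 64 bit bus


def detect_bus_width(read_word_32, pins=64):
    """Return 32 or 64, starting from the 32 bit default used at reset.

    read_word_32 is a callable that reads one 32 bit word over the bus
    while the interface is still in 32 bit mode; pins is the width of
    the chip's external memory data interface.
    """
    word = read_word_32(CONFIG_ADDR)
    wants_64 = bool(word & WIDTH_64_FLAG)
    # A chip with a 32 pin interface cannot operate in 64 bit mode.
    if wants_64 and pins >= 64:
        return 64
    return 32
```

The read is performed in 32 bit mode because that mode works on both chip variants; the result then fixes the width used by the write and read subsystems.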
Referring to FIG's. 1, 2, and 4, CC_D_BUS 126(b) and SW_WD 215 are used to send write data from the master device (e.g., D_cache 115) to a write data FIFO 230 (described below and shown in FIG. 2). MC_D_BUS 126(a) and SW_RD 450 and 455 are used to send the return read data from the slave device (memory port 127 or IOU 130) back to the master device. SW_WD and SW_RD are both tri-state buses.
II. The Dual Width Memory Subsystem
The present invention allows the system architecture described above to be interfaced with either a 32 or 64 bit external memory bus 140. In order to facilitate dual width memory transfers, the present invention is designed to use a maximum of two clock cycles to send a word between cache 110 and memory interface port 127, or vice versa. For example, if cache 110 writes one long word (64 bits), and the system 100 is coupled to a 32 bit external bus 140, it will take two clock cycles to send the data to memory interface port 127.

Referring to FIG. 2, there is a logic design 200 (hereinafter subsystem 200) for writing to main memory 150. Subsystem 200 represents the hardware necessary for either a 32 or 64 bit data transfer. Subsystem 200 transfers data in a "double pumped" fashion. For example, instead of transferring one word of data every clock cycle, the subsystem 200 can transfer one half word of data every half clock cycle. Since the buses are double-pumped, care is taken in the circuit design to ensure that there is no bus conflict when the buses turn around and switch from one master to a new master. Double pumping reduces the number of required bit lines, thereby minimizing expensive wire requirements with minimal performance degradation. Although the preferred embodiment implements a double pumping scheme, double pumping is not necessary for carrying out the present invention.
Subsystem 200 uses multiplexers 210, 220 to send data from data cache bus (CC_D_BUS) 126(b) to main memory 150. Multiplexers 210, 220 of a preferred embodiment of the present invention are multiplexer latches. In other words, the multiplexers can temporarily store data. Every half clock cycle, 16 or 32 bits of data (depending on whether there is a 32 or 64 bit memory bus 140, respectively) are transferred to memory interface port 127 and stored in a write data FIFO 230, located between memory interface port 127 and CCU 110.
Subsystem 200 also contains a buffer 250 and pad 260. Buffer 250 is a tri-state output pad buffer to drive the external memory data bus and pad 260 is used to connect subsystem 200 to main memory 150.
FIG's. 3(a) and 3(b) show a pair of timing diagrams for writing data to memory 150 with either a 32 bit memory bus or a 64 bit memory bus, respectively. Data can be transferred in one cycle if a 64 bit memory bus is used, while it takes two cycles to transfer data using a 32 bit memory bus.
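The relationship between bus width, half-cycle transfers, and clock cycles can be illustrated with a short model. This is a sketch under stated assumptions, not the circuit of FIG. 2: the function names are hypothetical, and sending the low-order piece first is an assumption, since the description does not state the ordering.

```python
def double_pumped_beats(word64, bus_width):
    """Split a 64 bit word into the pieces sent each half clock cycle.

    Returns a list of (piece, bits) tuples. The low-order-first
    ordering is an assumption made for illustration.
    """
    if bus_width not in (32, 64):
        raise ValueError("bus_width must be 32 or 64")
    bits = bus_width // 2          # 16 or 32 bits per half clock cycle
    mask = (1 << bits) - 1
    return [((word64 >> shift) & mask, bits) for shift in range(0, 64, bits)]


def clock_cycles(word64, bus_width):
    """Two half-cycle beats make up one full clock cycle."""
    return len(double_pumped_beats(word64, bus_width)) // 2
```

Under this model a 64 bit word needs four 16 bit beats (two clock cycles) in 32 bit mode and two 32 bit beats (one clock cycle) in 64 bit mode, which matches the cycle counts stated for FIG's. 3(a) and 3(b).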
Specifically, FIG. 3(a) shows a timing diagram for writing data to memory 150 with a 32 bit memory bus. Initially, cache 110 sends a request via the cache request signal (CC_D_REQ) 128(a) for access to memory 150, shown at reference number 310. MCU 120 acknowledges that request when MC_D_REQ_ACK goes high (at the rising edge of clock 305), as shown at reference number 315. Next, if access to memory is granted, the data to be written to memory 150 appears on the CC_D_BUS 126(b), as shown at reference number 320. Once the data appears on the CC_D_BUS 126(b), the data is transferred to write data FIFO 230 during the next two clock 305 cycles. The MC_D_DA_ACK signal indicates, as shown at reference number 325, that the data is currently being received in write data FIFO 230. Every time 32 bits of data enter MCU 120 they are placed in write data FIFO 230. The first 32 bits are sent during the first clock 305 cycle (16 bits per half clock cycle) and the second 32 bits are sent during the second clock 305 cycle.

Referring to FIG. 2, initially, all 64 bits of data act as inputs into multiplexers 210 and 220. During the first clock 305 cycle, the first 32 bits are selected from multiplexers 210, 220 and saved via SW_WD 215, 217 in write data FIFO 230. During the second clock cycle, the second 32 bits are selected from multiplexers 210, 220 and saved in write data FIFO 230 via SW_WD 215, 217. Those skilled in the relevant art will readily be capable of generating the necessary control signals/logic for multiplexers 210, 220 based on the disclosed timing signals described above and the information that the system 100 is currently coupled to a 32 bit external bus. Once the data is saved in write data FIFO 230, it can be written (stored) to memory 150 whenever external data bus 140 becomes available.

Oftentimes, not all of the data needs to be written to memory 150 (e.g., during a read-modify-write). Consequently, a data multiplexer select 240 is provided. Referring to FIG. 6(a), data multiplexer select 240 provides a scheme for selecting a set of bytes from the data being transferred from write data FIFO 230 to main memory 150. During a read-modify-write, potentially only a portion of the data is modified. For example, as shown in FIG. 6(b) only the first 8 bits of W0 have been modified (as indicated by shading). Initially, all 32 bits are placed at inputs ORG0 660 through ORG3 666 of multiplexers A 610 through D 640. The data at these inputs is the data originally read during the read portion of the read-modify-write operation. This data is modified and placed at the other inputs NEW0 650 through NEW3 656 of multiplexers A 610 through D 640. As illustrated in this example, only the first 8 bits have been modified; the remaining 24 bits of data should not be stored into memory. Thus, data line NEW0 650 is selected in multiplexer A 610 and data lines ORG1 662, ORG2 664, and ORG3 666 are selected in multiplexers B 620, C 630, and D 640. This has the effect of storing the data as originally read from memory except for the modified portion of the data. The structure and operation of the control logic for selecting the outputs of multiplexers A 610 through D 640 will be apparent to those skilled in the art.
Note that the data multiplexer select 240 is not essential to practice the present invention. It is only an option that has been implemented in a preferred embodiment of the present invention.
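The byte selection performed during a read-modify-write can be modeled as a per-byte choice between the originally read data and the modified data. The sketch below is illustrative only: the function name merge_bytes, the one-bit-per-byte mask, and the mapping of byte lane 0 to the low-order 8 bits are assumptions, not details taken from FIG. 6.

```python
def merge_bytes(original32, new32, modified_mask):
    """Model of a four-way byte select for one 32 bit word.

    modified_mask has one bit per byte lane (bit 0 -> lane 0, and so on).
    A set bit selects the NEW byte; a clear bit keeps the ORG byte.
    Lane 0 is assumed here to be the low-order 8 bits of the word.
    """
    result = 0
    for lane in range(4):
        source = new32 if (modified_mask >> lane) & 1 else original32
        result |= source & (0xFF << (8 * lane))
    return result
```

With a mask of 0b0001 only the first byte comes from the modified word and the other three bytes are stored exactly as they were read, which corresponds to the single shaded byte in the example above.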
The timing diagram shown in FIG. 3(a) shows an example of two 64 bit words being written to memory 150 via a 32 bit external bus 140. A sample write data FIFO 340 is shown with four 32 bit blocks. At this point, the data is ready to be sent on external data bus 140 in 32 bit blocks from write data FIFO 340.

FIG. 3(b) shows the timing diagram for writing data to main memory 150 with a 64 bit external memory bus 140. Initially, cache 110 requests an acknowledgement from MCU 120 that it can write data to memory 150. Once again, this is accomplished by sending CC_D_REQ high at 350. MCU 120 acknowledges the request by sending MC_D_REQ_ACK high at 355, at which point the data is sent onto the CC_D_BUS 126(b). At the beginning of the next clock cycle (shown at reference number 365) 32 bits of data are transferred to write data FIFO 230 via lines SW_WD 215, 217. Once again, the MC_D_DA_ACK signal goes high to acknowledge that write data FIFO 230 is receiving the data from cache 110. In contrast to the 32 bit memory bus timing constraints, it only takes one clock cycle to transfer the 64 bits to write data FIFO 230, because 32 bits are transferred every half cycle. A sample write data FIFO 375 is shown with two 64 bit blocks. At this point, the data is ready to be driven onto external data bus 140 in 64 bit blocks.

Referring to FIG. 2, as stated above, all 64 bits of data act as inputs to multiplexers 210 and 220. When a 64 bit external data bus 140 is coupled to system 100, all 64 bits of data are selected from multiplexers 210, 220. Thus, 64 bit blocks are stored in write data FIFO 230. During the first half clock cycle the first 32 bits are placed on SW_WD 215 and during the second half clock cycle the second 32 bits are placed on SW_WD 217. Consequently, it only takes one clock cycle to transfer 64 bits from cache 110 to write data FIFO 230.
The procedures discussed above for writing a 32 bit or 64 bit data stream to main memory 150 via external data bus 140 are generally outlined in FIG. 7. Note that the procedure is exactly the same for both 32 and 64 bit data transfers, except step 750. If system 100 is coupled to a 32 bit external bus then the data transfer takes two cycles and if system 100 is coupled to a 64 bit external bus then the data transfer takes only one cycle. Since the 32 bit only implementation is a subset of the 64 bit implementation, the same MCU 120 control logic can be used in both. MCU 120 control is designed to change the control signals (i.e., inputs to the multiplexers) according to the width of memory bus 140. Those skilled in the art will readily be capable of generating the necessary control logic to operate the present invention given the timing and hardware configuration described above.
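The write procedure summarized above can be sketched as a small behavioral model in which a 64 bit word from the cache is stored in the write data FIFO in blocks equal to the bus width and later popped onto the external bus. The names push_write_data and drain_to_bus are hypothetical, and storing the low-order 32 bits first in 32 bit mode is an assumption made for illustration.

```python
from collections import deque


def push_write_data(fifo, word64, bus_width):
    """Store one 64 bit word in the write data FIFO in bus-width blocks."""
    if bus_width == 64:
        fifo.append(word64 & 0xFFFF_FFFF_FFFF_FFFF)   # one 64 bit block
    elif bus_width == 32:
        fifo.append(word64 & 0xFFFF_FFFF)             # first 32 bits (assumed low half)
        fifo.append((word64 >> 32) & 0xFFFF_FFFF)     # second 32 bits
    else:
        raise ValueError("bus_width must be 32 or 64")


def drain_to_bus(fifo, bus_available):
    """Pop blocks in order while the external bus reports itself available."""
    sent = []
    while fifo and bus_available():
        sent.append(fifo.popleft())
    return sent
```

Writing two 64 bit words in this model leaves four 32 bit blocks in the FIFO in 32 bit mode and two 64 bit blocks in 64 bit mode, in line with the sample FIFO contents described for FIG's. 3(a) and 3(b). Only the block size changes between modes, which reflects the statement that the same control flow serves both widths.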
FIG. 4 shows the memory system for reading data (i.e., an information fetch). In a similar fashion to a write operation, cache 110 requires that data be returned back to cache 110 in 64 bit blocks. If memory port 127 returns a two long-word read request to cache 110, it will take two clock 505 cycles to send 128 bits to cache 110. The SW_RD bus 450, 455 is used to send the return read data from the slave device (memory port 127 or IOU 130) back to the master device. This bus is not double-pumped because of the timing constraints of cache 110. Data is sent only when clock 505 is high. Cache 110 requires that the data be valid at the falling edge of clock 505. Since the data is received from the port 127 when clock 505 is high, if the SW_RD bus 450, 455 were double-pumped, the earliest that cache 110 would get the data would be at the positive edge of clock 505, not at the negative edge of clock 505. Since the SW_RD bus 450, 455 is not double-pumped, this bus is only active (not tri-stated) during clock 505 and there is no problem with bus buffer conflict where two bus drivers drive the same wires at the same time.
Subsystem 400 also contains a buffer 440 and pad 450. Buffers 440 and 450 are used to translate the external pad voltages to the internal logic voltages and pad 450 is used to connect subsystem 400 to main memory 150.

FIG. 5(a) and FIG. 5(b) show the read data timing back to cache 110 for 32 bit and 64 bit bus modes, respectively. FIG. 5(a) shows the timing diagram for reading data from main memory 150 using a 32 bit external memory bus 140. Initially, data is transferred over external memory bus 140 and placed in read data FIFO 430 in 32 bit blocks. Next, the data is placed on data lines SW_RD as shown at reference numbers 510 and 512. When MC_D_B_VLD goes high, as shown at reference number 515, the MC_D_BUS is available. The data/instruction requested by cache 110 will then appear on data bus (MC_D_BUS) 126(a)/instruction bus (MC_I_BUS) 126(c), respectively, following the next rising edge of clock 505, as shown at reference number 520. At this point, the data is being transferred to cache 110.
Referring to FIG's. 1 and 4, when utilizing a 32 bit external bus 140 the data enters MCU 120 through port 127. The data is then stored in read data FIFO 430 in 32 bit blocks. Initially, read data FIFO 430 is empty and data lines 450, 455 are available. However, once data lines SW_RD[31:0] 450 and SW_RD[63:32] 455 become unavailable, the data remains stored in read data FIFO 430 until the data lines 450, 455 become available (only data line 450 is used in 32 bit mode).
As soon as data line 450 becomes available, the first 32 bits in read data FIFO 430 are sent to multiplexers 410, 420. Specifically, multiplexer 420 is concerned with the lower 32 bits and multiplexer 410 is concerned with the higher 32 bits. The first 32 bits are popped from read data FIFO 430 and placed at the input of multiplexer 420 via SW_RD[31:0] 450. Next, the second set of 32 bits will be popped from read data FIFO 430 and sent to multiplexer 410 via SW_RD[63:32] 455. Once all 64 bits of data are placed at the inputs of multiplexers 410, 420, all 64 bits are selected from multiplexers 410, 420, and placed on MC_D_BUS 126(a) (or MC_I_BUS 126(c) as the case may be) and read into cache 110.
An alternative embodiment of the present invention can be configured with separate sets of multiplexers, one set for I_Cache 113 and a second set for D_Cache 115. In addition, data line SW_RD[63:32] is optional for 32 bit (low cost) implementations.

Referring to FIG. 5(b), a timing diagram for reading data when a 64 bit external bus 140 is being utilized is shown. Initially, the data from external data bus 140 is stored in read data FIFO 430. Since a 64 bit external data bus 140 is being used, the data is stored in read data FIFO 430 in 64 bit long words. The data remains in read data FIFO 430 until data lines SW_RD 450, 455 are available. Once SW_RD 450, 455 become available, all 64 bits are transferred to the inputs of the multiplexers via SW_RD 450, 455, as shown at reference number 550. When MC_D_B_VLD goes high, as shown at 555, the data will subsequently be placed on MC_D_BUS 126(a) (or MC_I_BUS 126(c) as the case may be) during the next cycle of clock 505, as shown at reference number 560. The data is transferred over MC_D_BUS 126(a) and forwarded to requesting cache 110.
Referring again to FIG. 4, after the data enters read data FIFO 430 and data lines 450, 455 are available, all 64 bits of data are placed at the inputs of multiplexers 410, 420. The lower 32 bits are placed at the input to multiplexer 420 via data line SW_RD[31:0] 450. The upper 32 bits are placed at the input to multiplexer 410 via data line SW_RD[63:32] 455. After all 64 bits have been placed on data lines SW_RD[31:0] 450 and SW_RD[63:32] 455, the data is selected from multiplexers 410, 420 and forwarded to cache 110.
The procedures discussed above for reading a 32 bit or 64 bit data stream from main memory 150 via external data bus 140 are generally outlined in FIG. 8. Once again note that the procedure is exactly the same for both 32 and 64 bit data transfers, except steps 850, 860, and 870. If system 100 is coupled to a 32 bit external bus then the data transfer takes two cycles per long word (64 bits) and if system 100 is coupled to a 64 bit external bus then the data transfer takes only one cycle per long word. Those skilled in the art will readily be in a position to generate the necessary control logic to operate the present invention given the timing and hardware configuration described above.
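The read procedure can be modeled in the same spirit: blocks arrive in the read data FIFO at the bus width, and the cache always receives a 64 bit long word. The function name assemble_read_word is hypothetical. The low-half-first ordering in 32 bit mode follows the description, in which the first 32 bits popped go to the lower multiplexer and the second 32 bits go to the upper multiplexer.

```python
from collections import deque


def assemble_read_word(fifo, bus_width):
    """Pop bus-width blocks from the read data FIFO and return one 64 bit word."""
    if bus_width == 64:
        # One FIFO entry already holds the full long word.
        return fifo.popleft()
    if bus_width == 32:
        low = fifo.popleft()    # first 32 bits -> lower multiplexer input
        high = fifo.popleft()   # second 32 bits -> upper multiplexer input
        return ((high & 0xFFFF_FFFF) << 32) | (low & 0xFFFF_FFFF)
    raise ValueError("bus_width must be 32 or 64")
```

In 32 bit mode two FIFO entries are consumed per long word, which is why that mode takes two cycles per long word, while 64 bit mode consumes a single entry.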
Data is put into read data FIFO 430 when the switch read bus (SW_RD) is not available. Data is always put in write data FIFO 230 and read out according to memory timing requirements. If external bus 140 or SW_RD buses are currently being used by some other port, the oncoming write or read data is temporarily pushed into write data FIFO 230 or read data FIFO 430, respectively. When the requested bus becomes available (i.e., external bus 140 or SW_RD is released), data is popped from the particular FIFO and transferred to either memory 150, or requesting cache 110 or IOU 130. On the other hand, if the requested bus is available when the data arrives in the write data FIFO 230 or read data FIFO 430, then the data is immediately transferred through the respective FIFO onto the data lines.
The memory system is designed to allow a 64 bit data path to operate in either 64 or 32 bit mode. Software can select which system configuration is used. The 32 bit mode control operation for both 64 and 32 bit chips is the same. Essentially, the control logic and the data path are similar to those used when the system is configured to connect to a 32 bit external bus and run in 32 bit mode. However, when a 32 bit external bus is used the upper bits of the switch 125 and the write data FIFO 230 or read data FIFO 430 are not used (i.e., SW_WD[31:16] and SW_RD[63:32] will be "don't care"). But as discussed above, the control logic remains the same.
To take full advantage of the design of the present invention, the write data FIFO 230 and read data FIFO 430 must be able to store at least two sets of data at any given time, where a set of data is equal to the maximum block of data that is to be transferred. This ensures that when external bus 140 has been accessed and the first set of data is being placed onto external bus 140, a second set of data is immediately available to be put on external bus 140. Thus, there is guaranteed to be no lag time between the first set of data being placed on external bus 140 and the second set of data being placed on external bus 140. In addition, the present invention is directly scalable (e.g., 64/128 bits).
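The buffering behavior of the two FIFOs, including the pass-through case and the minimum depth of two sets, can be captured in a toy model. The class PortFifo and its method names are hypothetical, and raising an error when the FIFO is full is a simplification; the description does not say how a full FIFO is handled.

```python
from collections import deque


class PortFifo:
    """Toy model of a write or read data FIFO in front of a shared bus."""

    def __init__(self, depth=2):
        # The description calls for room for at least two sets of data.
        if depth < 2:
            raise ValueError("depth must be at least 2")
        self.depth = depth
        self.entries = deque()

    def arrive(self, block, bus_free):
        """Accept a block; return the block driven onto the bus, or None."""
        if bus_free:
            # Bus available: the oldest data goes out at once. With an
            # empty FIFO this is the block that has just arrived.
            self.entries.append(block)
            return self.entries.popleft()
        if len(self.entries) >= self.depth:
            raise OverflowError("FIFO full")   # simplification, see lead-in
        self.entries.append(block)             # bus busy: hold the block
        return None

    def bus_released(self):
        """Pop the oldest waiting block when the bus becomes available."""
        return self.entries.popleft() if self.entries else None
```

With a depth of two, a second set of data can already be waiting while the first is on the bus, so the next transfer can start as soon as the bus is released.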
Consequently, one skilled in the art can readily design a system that provides for dual width memory bandwidth with a variety of bit transfer combinations. In addition, the present invention contemplates a multiple width memory bus. Thus, it is contemplated that one skilled in the art could readily design a system, utilizing the teachings of the present invention described above, that is configured to handle, for example, a 32 bit, 64 bit, and/or 128 bit external data bus. Thus, there are infinite combinations of external data bus widths that could be implemented in one system with the teachings of the present invention.
It is contemplated that one skilled in the art can apply the teachings of the present invention to any type of bus in a computer-based system. For example, the present invention is applicable to dual width I/O buses. Furthermore, the present invention is not limited to external buses, but can be applied to internal buses as well.
While the invention has been particularly shown and described with reference to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention.

CLAIMS

What is claimed is:
1. A dual width memory subsystem, configured to provide access to a plurality of different width buses in a computer-based system, for efficiently transferring data between a memory and one or more bus requestors over the bus currently coupled to the system, said dual width memory subsystem comprising:
(a) means for determining the width of the bus currently coupled to the computer-based system;
(b) selecting means, configured to receive data from said bus requestor, and for selecting blocks of said data to be outputted; and
(c) storage means for receiving and storing said outputted data in blocks corresponding to said determined width; wherein said selecting means permits said dual width memory subsystem to access buses having different widths.
2. The system of claim 1, further comprising:
(d) a selecting means for selecting individual bytes for transfer to the memory.
3. The system of claim 1, wherein the bus requestor is a data cache or an instruction cache or an I/O device.
4. The system of claim 1, further comprising bus means, located between said selecting means and said storage means, for transferring said data, and said bus means is configured for passing data during each clock phase to said storage means.
5. The system of claim 1, wherein the subsystem operates in a multiprocessor environment.
6. The system of claim 1, wherein said selecting means comprises a plurality of multiplexers.
7. A dual width memory subsystem, configured to provide access to a plurality of different width buses in a computer-based system, for efficiently transferring data between a memory and one or more bus requestors over the bus currently coupled to the system, said dual width memory subsystem comprising:
(a) means for determining the width of the bus currently coupled to the computer-based system;
(b) temp means for receiving and storing data from the bus in blocks corresponding to said determined width; and
(c) selecting means, connected to receive data from said temp means, and connected to send said received data to said bus requestor in blocks corresponding to the limitations of the computer-based system.
8. The system of claim 7, wherein said bus requestor is a data cache or an instruction cache or an I/O device.
9. The system of claim 7, wherein said subsystem operates in a multiprocessor environment.
10. The system of claim 7, further comprising control means for allowing data to be transferred in at most two clock cycles.
11. A method for efficiently writing data to a memory from a bus requestor over a bus in a computer-based system, the computer-based system is configured to allow access to a plurality of different width buses, the method comprising the steps of:
(1) determining the width of a bus coupled to the computer-based system;
(2) requesting access to said memory;
(3) sending a data stream to the inputs of a plurality of multiplexers;
(4) selecting data from said data stream and storing said selected data into a temporary FIFO in blocks equal to the width of said bus; and
(5) popping data from said FIFO and placing it on said bus once said bus is available.
12. The method of claim 11, wherein during said step (3) said data stream is transferred during each clock phase into said write data FIFO.
13. The method of claim 11, further comprising a step of selecting which of said popped data will be transferred to memory.
14. A method of efficiently reading data from a memory location requested by a bus requestor in a computer-based system, the computer-based system is configured to allow access to a plurality of different width buses, the method comprising the steps of:
(1) determining the width of a bus coupled to the computer-based system;
(2) requesting access to said memory;
(3) placing a data stream on said bus once access is granted to said memory;
(4) placing said data stream from said bus into a FIFO in blocks equal to the width of said bus;
(5) popping data from said FIFO and placing it at the inputs to a plurality of multiplexers; and
(6) sending said data located in said plurality of multiplexers to the bus requestor.
15. A method for efficiently transferring data over a bus between a memory and one or more bus requestors in a dual width memory subsystem configured to provide access to a plurality of different width buses in a computer-based system, said method comprising the steps of:
(1) determining the width of a bus coupled to the computer-based system;
(2) selecting blocks of data to be stored in a temporary FIFO corresponding to the width of said bus as determined by step (1) and receiving and storing said selected data in said temporary FIFO; and
(3) writing said selected data stored in said temporary FIFO to said memory.
PCT/JP1993/000317 1992-03-18 1993-03-17 System and method for supporting a multiple width memory subsystem WO1993019424A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP5516422A JPH07504773A (en) 1992-03-18 1993-03-17 System and method for supporting multi-width memory subsystems

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US85360492A 1992-03-18 1992-03-18
US07/853,604 1992-03-18

Publications (1)

Publication Number Publication Date
WO1993019424A1 true WO1993019424A1 (en) 1993-09-30

Family

ID=25316483

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP1993/000317 WO1993019424A1 (en) 1992-03-18 1993-03-17 System and method for supporting a multiple width memory subsystem

Country Status (3)

Country Link
US (3) US5594877A (en)
JP (2) JPH07504773A (en)
WO (1) WO1993019424A1 (en)


Family Cites Families (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2337376A1 (en) * 1975-12-31 1977-07-29 Honeywell Bull Soc Ind DEVICE ALLOWING THE TRANSFER OF BLOCKS OF VARIABLE LENGTH BETWEEN TWO INTERFACES OF DIFFERENT WIDTH
US4514808A (en) * 1978-04-28 1985-04-30 Tokyo Shibaura Denki Kabushiki Kaisha Data transfer system for a data processing system provided with direct memory access units
US4453229A (en) * 1982-03-11 1984-06-05 Grumman Aerospace Corporation Bus interface unit
US4667305A (en) * 1982-06-30 1987-05-19 International Business Machines Corporation Circuits for accessing a variable width data bus with a variable width data field
US4663728A (en) * 1984-06-20 1987-05-05 Weatherford James R Read/modify/write circuit for computer memory operation
KR900007564B1 (en) * 1984-06-26 1990-10-15 모토로라 인코포레이티드 Data processor having dynamic bus sizing
US4716527A (en) * 1984-12-10 1987-12-29 Ing. C. Olivetti Bus converter
JPS61139866A (en) * 1984-12-11 1986-06-27 Toshiba Corp Microprocessor
JPS61175845A (en) * 1985-01-31 1986-08-07 Toshiba Corp Microprocessor system
US5265234A (en) * 1985-05-20 1993-11-23 Hitachi, Ltd. Integrated memory circuit and function unit with selective storage of logic functions
JPS6226561A (en) * 1985-07-26 1987-02-04 Toshiba Corp Personal computer
JPS62165610A (en) * 1986-01-17 1987-07-22 Toray Ind Inc Plastic optical cord having high heat resistance
JPS649561A (en) * 1987-07-02 1989-01-12 Seiko Epson Corp Computer
GB2211326B (en) * 1987-10-16 1991-12-11 Hitachi Ltd Address bus control apparatus
US4878166A (en) * 1987-12-15 1989-10-31 Advanced Micro Devices, Inc. Direct memory access apparatus and methods for transferring data between buses having different performance characteristics
US5045998A (en) * 1988-05-26 1991-09-03 International Business Machines Corporation Method and apparatus for selectively posting write cycles using the 82385 cache controller
US5125084A (en) * 1988-05-26 1992-06-23 Ibm Corporation Control of pipelined operation in a microcomputer system employing dynamic bus sizing with 80386 processor and 82385 cache controller
US5202969A (en) * 1988-11-01 1993-04-13 Hitachi, Ltd. Single-chip-cache-buffer for selectively writing write-back and exclusively writing data-block portions to main-memory based upon indication of bits and bit-strings respectively
US5235693A (en) * 1989-01-27 1993-08-10 Digital Equipment Corporation Method and apparatus for reducing buffer storage in a read-modify-write operation
US5255378A (en) * 1989-04-05 1993-10-19 Intel Corporation Method of transferring burst data in a microprocessor
JP2504206B2 (en) * 1989-07-27 1996-06-05 三菱電機株式会社 Bus controller
US5224213A (en) * 1989-09-05 1993-06-29 International Business Machines Corporation Ping-pong data buffer for transferring data from one data bus to another data bus
JPH0484253A (en) * 1990-07-26 1992-03-17 Mitsubishi Electric Corp Bus width control circuit
US5191653A (en) * 1990-12-28 1993-03-02 Apple Computer, Inc. Io adapter for system and io buses having different protocols and speeds
US5255374A (en) * 1992-01-02 1993-10-19 International Business Machines Corporation Bus interface logic for computer system having dual bus architecture
US5440752A (en) * 1991-07-08 1995-08-08 Seiko Epson Corporation Microprocessor architecture with a switch network for data transfer between cache, memory port, and IOU
US5617546A (en) * 1993-12-22 1997-04-01 Acer Incorporated Data bus architecture compatible with 32-bit and 64-bit processors
US5611071A (en) * 1995-04-19 1997-03-11 Cyrix Corporation Split replacement cycles for sectored cache lines in a 64-bit microprocessor interfaced to a 32-bit bus architecture

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ELECTRONIC DESIGN vol. 33, no. 12, May 1985, HASBROUCK HEIGHTS, NEW JERSEY US pages 219 - 225 ZOCH ET AL. '68020 DYNAMICALLY ADJUSTS ITS DATA TRANSFERS TO MATCH PERIPHERAL PORTS' *
PATENT ABSTRACTS OF JAPAN vol. 12, no. 100 (P-683)(2947) 2 April 1988 & JP,A,62 232 061 ( CASIO COMPUTER ) 12 October 1987 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004051491A1 (en) * 2002-11-29 2004-06-17 Nokia Corporation A method and a system for detecting bus width, an electronic device, and a peripheral device

Also Published As

Publication number Publication date
US6047348A (en) 2000-04-04
JPH07504773A (en) 1995-05-25
US5594877A (en) 1997-01-14
JP2004185639A (en) 2004-07-02
US5887148A (en) 1999-03-23

Similar Documents

Publication Publication Date Title
US5594877A (en) System for transferring data onto buses having different widths
US5003465A (en) Method and apparatus for increasing system throughput via an input/output bus and enhancing address capability of a computer system during DMA read/write operations between a common memory and an input/output device
JP3765586B2 (en) Multiprocessor computer system architecture.
US5978866A (en) Distributed pre-fetch buffer for multiple DMA channel device
US5649230A (en) System for transferring data using value in hardware FIFO'S unused data start pointer to update virtual FIFO'S start address pointer for fast context switching
US5509124A (en) Coupled synchronous-asychronous bus structure for transferring data between a plurality of peripheral input/output controllers and a main data store
US7526626B2 (en) Memory controller configurable to allow bandwidth/latency tradeoff
US6636927B1 (en) Bridge device for transferring data using master-specific prefetch sizes
US5919254A (en) Method and apparatus for switching between source-synchronous and common clock data transfer modes in a multiple processing system
US6330630B1 (en) Computer system having improved data transfer across a bus bridge
US7127573B1 (en) Memory controller providing multiple power modes for accessing memory devices by reordering memory transactions
US6754739B1 (en) Computer resource management and allocation system
WO1986003608A1 (en) Queue administration method and apparatus
US5199106A (en) Input output interface controller connecting a synchronous bus to an asynchronous bus and methods for performing operations on the bus
US20050253858A1 (en) Memory control system and method in which prefetch buffers are assigned uniquely to multiple burst streams
US6925532B2 (en) Broadcast system in disk array controller
US20040221075A1 (en) Method and interface for improved efficiency in performing bus-to-bus read data transfers
US7409486B2 (en) Storage system, and storage control method
US5923857A (en) Method and apparatus for ordering writeback data transfers on a bus
US6301627B1 (en) Method/system for identifying delayed predetermined information transfer request as bypassable by subsequently-generated information transfer request using bypass enable bit in bridge translation control entry
US6836823B2 (en) Bandwidth enhancement for uncached devices
EP1596280A1 (en) Pseudo register file write ports
US6907454B1 (en) Data processing system with master and slave processors
WO2003014948A1 (en) System architecture of a high bit rate switch module between functional units in a system on a chip
AU2002326916A1 (en) Bandwidth enhancement for uncached devices

Legal Events

Date Code Title Description
AK Designated states Kind code of ref document: A1; Designated state(s): JP

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)