SYSTEM AND METHOD FOR SEARCHING PATTERNS IN REAL-TIME OVER A SHARED MEDIA
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to systems and methods for multiplexing data streams, and particularly, to a system and method enabling simultaneous real-time pattern searching of multiple logical streams communicated over a shared media.
2. Discussion of the Prior Art
The MPEG-2 Generic Coding of Moving Pictures and Associated Audio: Systems Recommendation H.222.0 ISO/IEC 13818-1 defines the mechanisms for combining, or multiplexing, several types of multimedia information into one program stream. This standard uses a known method of multiplexing, called packet multiplexing, where data, video and audio packets, etc. are interleaved one after the other onto a single MPEG-2 transport stream (TS). The individual streams comprising the data, video and audio packets absent system or timing data are called elementary streams. The Packetized Elementary Streams (PES's) are elementary streams comprising all the header and data required to enable decoding of the elementary stream associated with a program, and PES packets may be up to 64 kbytes in length.
Transport Streams (TS's) are defined for transmission networks and allow for the multiplexing of many Packetized Elementary Streams which are packetized into smaller packets of fixed length, e.g., 188 bytes. These TS's additionally may suffer from occasional transmission errors. Each TS packet consists of a TS Header, followed optionally by ancillary data called Adaptation Field, followed typically by some or all the data from one PES packet. The TS Header consists of a Sync byte (0x47), flags, indicators, a Packet Identifier (PID), plus other information used for error detection, timing, etc. The semantics for the MPEG-2 TS header are as follows:
Sync byte: (8-bits) a fixed value 0x47.
Transport error indicator: (1-bit) Indicates an uncorrectable bit error exists in the current TS packet.
Payload unit start indicator: (1-bit) indicates the presence of a new PES packet or a new TS-PSI (Program Specific Information) Section.
Transport priority: (1-bit) indicates a higher priority than other packets.
PID: (13-bits) Values 0 and 1 are pre-assigned, while values 2 to 15 are reserved. Values 0x0010 to 0x1FFE may be assigned by the Program Specific Information (PSI). Value 0x1FFF is used for Null packets.
Transport scrambling control: (2-bits) indicates the scrambling mode of the packet payload.
Adaptation field control: (2-bits) indicates the presence of adaptation field or payload.
Continuity counter: (4-bits) One continuity counter per PID. It increments with each non-repeated TS packet having the corresponding PID.
Each MPEG-2 program stream may be characterized as a data stream that is encapsulated using MPEG-2 TS packets, with each packet containing a header field with a packet identifier (PID). The PID field is particularly used by an MPEG-2 transport demultiplexer ("Demux") to tune to a particular set of PID's that correspond to a given program stream. Each program stream must have a set of distinct PID's (except for PID=0x1FFF for the Null packet). For example:
program stream 1: <video PID=0x101, audio PID=0x102, secondary audio PID=0x107, 0x1FFF> valid
program stream 2: <video PID=0x101, audio PID=0x200, private data PID=0x107, 0x1FFF> valid
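By way of illustration only, the TS header semantics listed above may be sketched in Python as follows. The function names parse_ts_header and ts_payload are hypothetical and are not taken from the standard or from the present disclosure; the bit positions follow the field widths given above, and the payload helper assumes the usual convention that the first byte after the header gives the adaptation field length when an adaptation field is present.

```python
def parse_ts_header(packet: bytes) -> dict:
    """Unpack the 4-byte MPEG-2 TS header of one 188-byte packet."""
    if len(packet) != 188:
        raise ValueError("TS packets are 188 bytes long")
    if packet[0] != 0x47:
        raise ValueError("missing sync byte 0x47")
    b1, b2, b3 = packet[1], packet[2], packet[3]
    return {
        "transport_error_indicator": (b1 >> 7) & 0x1,
        "payload_unit_start_indicator": (b1 >> 6) & 0x1,
        "transport_priority": (b1 >> 5) & 0x1,
        "pid": ((b1 & 0x1F) << 8) | b2,          # 13-bit PID
        "transport_scrambling_control": (b3 >> 6) & 0x3,
        "adaptation_field_control": (b3 >> 4) & 0x3,
        "continuity_counter": b3 & 0x0F,
    }


def ts_payload(packet: bytes) -> bytes:
    """Return the payload with the header and any adaptation field removed."""
    afc = parse_ts_header(packet)["adaptation_field_control"]
    if not afc & 0x1:            # no payload present in this packet
        return b""
    start = 4                    # skip the 4-byte TS header
    if afc & 0x2:                # adaptation field precedes the payload
        start += 1 + packet[4]   # length byte plus the field itself
    return packet[start:]


# Illustrative packet: PID 0x101, start of a new PES packet, payload only.
example = bytes([0x47, 0x41, 0x01, 0x15]) + bytes(184)
header = parse_ts_header(example)
```

A demultiplexer tuned to a program stream would, under these assumptions, compare header["pid"] against the set of PID's for that stream and discard Null packets (PID 0x1FFF).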
For a typical MPEG-2 transport stream multiplexer, several program streams, originating from different sources (e.g., network interface, or local storage sub-system), are routed over a shared data bus (i.e., a local bus) and stored in a local packet memory. Finally, packets are removed from the packet memory and transmitted over the output channel.
It would be highly desirable to provide for MPEG-2 applications, a mechanism for monitoring the data transfer over the local bus, by parsing the MPEG-2 Transport Layer, removing MPEG-2 TS header and adaptation fields, and searching the payload field, for a specific pattern (e.g., a start code "0x000001").
U.S. Pat. No. 5,734,429 describes an apparatus for detecting start code for bit stream of compressed image according to the MPEG-standard. The disclosed apparatus however, can only handle the case where only one bit stream is present. Similarly, U.S. Pat. No. 5,727,036 describes a method for searching start code for high bit rate but only covers the case where only one bit stream is present.
It would be desirable to provide a mechanism that searches for patterns in a multi-program multiplexed stream, in real-time, and moreover, that searches in parallel several bit streams that are multiplexed and transmitted over a shared media (bus, network, etc.).
It would be additionally desirable to provide a real-time search mechanism that searches for patterns in a multiprogram multiplexed stream, wherein individual bit streams are transferred as buffer (burst-basis) units, and that further provides tracking and transferring of buffer unit context when a search pattern crosses a buffer boundary, i.e., when the starting part of the pattern resides in one buffer and the remainder part resides in a following buffer.
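The buffer-boundary difficulty noted above may be illustrated with a minimal Python sketch for the start code 0x000001. The class name StartCodeMatcher and its members are hypothetical; the sketch handles a single bit stream only and simply retains the partial-match state between successive buffer units, which is one possible approach and not necessarily the one adopted herein.

```python
class StartCodeMatcher:
    """Finds 0x000001 in a byte stream delivered as successive chunks."""

    def __init__(self):
        self.zeros = 0      # consecutive 0x00 bytes seen so far, capped at 2
        self.position = 0   # total bytes consumed across all chunks

    def feed(self, chunk: bytes) -> list:
        """Return stream offsets of the byte following each start code."""
        hits = []
        for b in chunk:
            self.position += 1
            if b == 0x00:
                self.zeros = min(self.zeros + 1, 2)
            else:
                if b == 0x01 and self.zeros == 2:
                    hits.append(self.position)
                self.zeros = 0
        return hits


# The pattern starts in the first chunk and completes in the second.
matcher = StartCodeMatcher()
first = matcher.feed(b"\xff\x00")
second = matcher.feed(b"\x00\x01\xe0")
```

Because the count of pending zero bytes survives the call to feed, a start code split between two buffers is still detected when the second buffer arrives.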
SUMMARY OF THE INVENTION
It is an object of the present invention to provide a system and method for performing real-time search of a bit pattern over a shared media supporting multiple simultaneous data streams each associated with a different program source.
It is a further object of the present invention to provide in an MPEG-2 packet Time Division Multiplexed transport system, a real-time data monitoring mechanism for data transfers occurring over a local bus that parses the MPEG-2 Transport Layer, removes MPEG-2 Transport Stream header and adaptation fields, searches the payload field for a specific pattern, and, identifies and indexes points where the search patterns occur.
It is yet another object of the present invention to provide in an MPEG-2 packet transport system, a real-time data monitoring and pattern searching system that additionally tracks all the individual bit streams at a given time, and performs context switching on a transfer unit (burst) basis.
According to the principles of the invention, there is provided a system and method for providing real-time searching and indexing of patterns included in packets of a packet stream over a time-domain multiplexed shared media (e.g., local bus, local area network, etc) and, particularly providing searching of several data streams transmitted over this shared media using a real-time search engine. Given a set of data streams, where a data stream is characterized as
a set of linked buffers (initially stored in a Packet Memory (30), for example, using a linked list data structure, and associated with a program stream using a unique identifier called the Queue Identifier (QID)), with each buffer capable of being transferred in blocks of variable block sizes over a shared media using a time-domain multiplexing (TDM) technique (e.g., one block from data stream(i) is transmitted, then one block from data stream(j) is transmitted, and so forth), the real-time search engine is enabled to: capture each individual data transfer; parse the higher level protocol that encapsulates the payload; remove the transport layer header, and perform the pattern searching in the data stream payload. In the event of a pattern being found, the Real-Time Search Engine (100) generates a special status message containing the offset within the packet payload, the location within the associated buffer, and the contents of the packet following the location of the search pattern.
This search is performed in real-time (as the blocks traverse the media), and the search results are reported to a host processor using one or more pre-defined messages. Additionally, as a pattern can cross a buffer boundary, i.e., the starting part of the pattern may reside in one buffer and the remainder part may reside in the subsequent buffer, additional complexity is introduced. The Real-Time Search Engine addresses this complexity by performing a context switching operation which includes: capturing each individual data transfer; mapping a start data transfer address and a data stream identifier (QID) using a buffer number as the key; saving the current contents of a Context Register into an SRAM Context Storage (107); fetching the context corresponding to the new data stream (using the Buffer number as the key); and, copying its contents into the Context Register (105). Once the appropriate context is fetched, the Real-Time Search Engine continues parsing the data stream.
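The context switching operation described above may be sketched, under simplifying assumptions, as follows. The class ContextSwitchingSearch is hypothetical; a Python dictionary stands in for the SRAM Context Storage (107), an attribute stands in for the Context Register (105), and the buffer number is passed in directly rather than being mapped from the start data transfer address. The search pattern is assumed to be the start code 0x000001.

```python
class ContextSwitchingSearch:
    """Searches interleaved blocks, keeping one context per buffer."""

    def __init__(self):
        self.context_storage = {}   # buffer number -> saved context
        self.current_buffer = None  # buffer owning the live context
        self.context = None         # stand-in for the Context Register

    def on_transfer(self, buffer_no: int, block: bytes) -> list:
        """Process one data transfer; return (buffer, offset) hits."""
        if buffer_no != self.current_buffer:
            # Save the live context, then fetch the one for the new buffer.
            if self.current_buffer is not None:
                self.context_storage[self.current_buffer] = self.context
            self.context = self.context_storage.get(
                buffer_no, {"zeros": 0, "offset": 0})
            self.current_buffer = buffer_no
        ctx, hits = self.context, []
        for b in block:
            ctx["offset"] += 1
            if b == 0x00:
                ctx["zeros"] = min(ctx["zeros"] + 1, 2)
            else:
                if b == 0x01 and ctx["zeros"] == 2:
                    # Offset of the byte following the pattern.
                    hits.append((buffer_no, ctx["offset"]))
                ctx["zeros"] = 0
        return hits


# Blocks from buffers 1 and 2 alternate on the shared media.
engine = ContextSwitchingSearch()
engine.on_transfer(1, b"\x00\x00")          # pattern begins in buffer 1
engine.on_transfer(2, b"\x01\xff")          # unrelated stream, no match
found = engine.on_transfer(1, b"\x01\xb3")  # pattern completes in buffer 1
```

In this sketch the block from buffer 2 does not disturb the partial match held for buffer 1, since each buffer resumes from its own saved context.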
Advantageously, the real-time pattern searching method and system of the invention may be implemented in any transport packet scheme, e.g., IP, ATM, where one or more data streams are transmitted over a shared media using a time domain multiplexing technique.
BRIEF DESCRIPTION OF THE DRAWINGS
Further features, aspects and advantages of the apparatus and methods of the present invention will become better understood with regard to the following description, appended claims, and accompanying drawings where:
FIG. 1 is a diagram illustrating the connection of the Real-Time Search Engine (100) with the Host Sub-System;
FIG. 2 illustrates an example of several data stream (70) transfers over a shared bus and the segmentation of an MPEG-2 transport packet;
FIG. 3 illustrates the linked list data structure (81) used to store the data streams in the Packet Memory (30);
FIG. 4 is a high-level block diagram illustrating the Real-Time Search Engine (100) and its connection with the Host Sub-System Local Bus (60);
FIG. 5 is a detailed block diagram of the Real-Time Search Engine (100) according to the preferred embodiment of the invention;
FIG. 6 illustrates the syntax for commands used to control the Real-time Search Engine (100);
FIG. 7 illustrates the data format of the dynamic context switching registers (105); and
FIG. 8 illustrates the data format of the Result Register (104).
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
FIG. 1 is a block diagram illustrating a preferred implementation of the Real-time Pattern Searching System (9) of the invention. As shown in FIG. 1, MPEG-2 program streams may be originated from a network connection (10) or from a local storage device (50). In any case, the stream contents are transmitted in blocks which are written into a Packet Memory (30) in a manner that facilitates re-assembly of the original MPEG-2 program stream contents. Particularly, as shown in FIG. 1, there is provided an input channel (11) from where an MPEG-2 Program Stream originates via a Network Interface Module (10) that processes and generates the MPEG-2 data streams for temporary storage in the Packet Memory storage device (30). Although not shown, there may be further implemented an MPEG-2 Transport stream demultiplexer (Demux) block which demultiplexes individual MPEG-2 program streams for temporary storage in a Packet Memory storage device as shown and described in commonly-owned, co-pending U.S. patent application Ser. No. 09/448,333, the contents and disclosure of which is incorporated by reference as if fully set forth herein. Alternatively, an MPEG-2 program stream may originate from a Local Storage device (50), where it is received by a Local Storage Interface (40), for processing and storage in the Packet Memory (30). In the preferred embodiment, the Packet Memory (30) is used to cache both incoming and outgoing program streams. Buffers from individual program streams are stored in the Packet Memory (30) as will be described in greater detail herein.
A Real-time Search and Indexing Engine (100) is provided that is MPEG-2 aware, and supports multiple MPEG-2 program streams that are transferred over a non-multiplexed Local Bus (60). The Real-Time Search and Indexing Engine (100) particularly monitors the data transfers into the Packet Memory device (30), (i.e., the write operations), capturing them into a Bus Interface and Caching mechanism (101), shown and described herein with respect to FIG. 2, and generates the proper indexing in the manner to be described in greater detail.
A Host Processor (20) is attached to the Local Bus (60), for controlling the operation of each of the modules connected to the Local Bus (60). That is, the Host Processor (20) attached to the local bus (60), controls the various system data flows which include: 1) data flow from the Network Interface Module (10) to the Packet Memory (30); 2) data flow from the Local Storage Interface (40) to the Packet Memory (30); and, 3) data flow including Host Processor (20) commands to the Real-Time Search Engine (100).
FIG. 2 illustrates a logical diagram for the data flows from the individual program streams (70) that are multiplexed into a single shared transport media (71) using Time Division Multiplexing (TDM). As shown, the TDM slots (73) on the shared transport media may have variable sizes, depending on the burst size supported by the shared media (71). Typically, for a Local Bus scenario, the information residing in the header field (72) includes a "start of transfer address" (hereinafter ADDR) that suffices to associate the data transfer block with its source (QID). According to the MPEG-2 standard, the TS packet (75) includes a header (75a) having a packet identifier (PID), flags, indicators, error detection and timing information, in addition to its associated MPEG content payload (75b). As depicted in FIG. 2, each data burst associated with a QID may comprise a whole or partial TS packet (75), in the case of segmentation.
As shown in FIG. 3, for purposes of this invention, a program stream is considered a sequence of buffers (80) identifiable by a buffer id. These buffers are stored in the Packet Memory (30) using a linked list data structure (81) such as illustrated in FIG. 3 and manageable by the Host Processor (20). This buffer management scheme is advantageous as it improves memory utilization by eliminating fragmentation.
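A linked list of buffers of the kind described above may be sketched in Python as follows. The names Buffer, PacketMemory, store and walk are hypothetical, and a dictionary keyed by buffer id stands in for the Packet Memory (30); the actual layout of the data structure (81) is that shown in FIG. 3 and is not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Buffer:
    buffer_id: int
    data: bytes
    next_id: Optional[int] = None   # link to the following buffer, if any


class PacketMemory:
    """Holds buffers by id; a program stream is a chain of linked buffers."""

    def __init__(self):
        self.buffers = {}

    def store(self, buffer_id: int, data: bytes, prev_id: Optional[int] = None):
        """Store a buffer and, optionally, link it after an earlier one."""
        self.buffers[buffer_id] = Buffer(buffer_id, data)
        if prev_id is not None:
            self.buffers[prev_id].next_id = buffer_id

    def walk(self, head_id: int):
        """Yield the buffers of one program stream in stream order."""
        current = head_id
        while current is not None:
            buf = self.buffers[current]
            yield buf
            current = buf.next_id


# Buffers need not be adjacent or in numeric order to form one stream.
memory = PacketMemory()
memory.store(7, b"ab")
memory.store(3, b"cd", prev_id=7)
memory.store(9, b"ef", prev_id=3)
stream = b"".join(buf.data for buf in memory.walk(7))
```

Since any free buffer can be linked into any stream, no contiguous region has to be reserved per stream, which is the memory utilization benefit noted above.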
FIG. 4 is a high-level block diagram illustrating the Real-Time Search Engine (100) and its connection with the Host Sub-System Local Bus (60). As shown in FIG. 4, the Real-Time Search Engine (100) includes a local Bus Interface and Caching (101) mechanism that captures each block written into the Packet Memory (30) via the Local Bus (60); and, a field programmable gate array (110) including: an Input Register (102) for receiving captured information routed over an INPUT bus (102') from the Bus Interface and Caching unit (101); a DEL Register (103) for receiving delimiters (DEL) used to signal start of transfer, data and control fields generated by the Bus Interface and Caching unit (101) and routed over a DEL bus (103'); a Decoder and Finite State Machine block (120) for performing all decoding and context handling functions; a Result Register (104) for reporting data/address results to the Host Processor (20) using pre-defined messages via the local Bus Interface and Caching (101) unit; and, a Context Register (105) for storing information relating to the context switching functions. Further included is the SRAM Context Switching Memory (107) for temporarily storing context information from the Context Register (105) including buffer address information (107') and data information (107"). With further regard to FIG. 4, each data transfer block (write operation into the Packet Memory (30)) routed over the non-multiplexed Local Bus (60) for capture by the Real-Time Search Engine (100), includes data and address information that is input to the Local Bus Interface and Caching block (101) via DATA bus (61) and address ADDR bus (62), respectively.
The Local Bus Interface and Caching block (101) outputs include: information carried over an INPUT bus (102') for input to the Input Register (102); and, the Delimiter information, carried via the DEL (delimiter) bus (103'), having <"address", "command", or "data"> values indicating the type of information that is carried on the INPUT bus (102').
The Host Processor (20) controls the operation of the Real-Time Search Engine (100) by sending commands over the Local Bus (60) at a pre-defined address range. These commands include "Clear Buffer", "Start Buffer" and "Chain Buffer" which are decoded and processed by the Decoder and Finite State Machines (120) block. Particularly, before starting the transfer of a new MPEG-2 program stream, the Host Processor (20) issues a command ("Start Buffer") to start the Real-Time Search Engine (100). This command assigns the buffer number for the first buffer, and the Packet Identifier (PID) for which the search is to be performed. The Real-Time Search Engine (100) initializes the Context Register (105) corresponding to the new program stream, enabling the capture of data transfers associated with this program stream. For the subsequent buffers belonging to this program stream, the Host Processor (20) must issue a "Chain Buffer" command linking the current buffer with the following buffer before data transfers associated with the next buffer start. It is understood that the "Chain Buffer" command may be initiated any time following the "Start Buffer" command. When a data transfer block first accesses a new buffer (one that has not been previously accessed), as it switches from the current to the next buffer, the Real-Time Search Engine (100) automatically performs a "chain buffer" operation, by copying the context corresponding to the current buffer into an area in SRAM Context Storage (107) associated with the context of the next buffer. It also resets an "enable context" flag that indicates a pending request for context switching.
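The interplay of the "Start Buffer" and "Chain Buffer" commands described above may be modeled, in simplified form, by the following Python sketch. The class CommandModel, its method names and the context fields are hypothetical; the actual command syntax is that of FIG. 6. The "Clear Buffer" command and the "enable context" flag are not modeled, since their detailed behavior is not relied upon here.

```python
class CommandModel:
    """Simplified model of Start Buffer / Chain Buffer handling."""

    def __init__(self):
        self.contexts = {}  # buffer number -> context (stand-in for SRAM)
        self.pending = {}   # next buffer number -> current buffer number

    def start_buffer(self, buffer_no: int, pid: int):
        """Initialize the context for the first buffer of a program stream."""
        self.contexts[buffer_no] = {"pid": pid, "zeros": 0}

    def chain_buffer(self, current_no: int, next_no: int):
        """Record the link; the copy is deferred until first access."""
        self.pending[next_no] = current_no

    def context_for(self, buffer_no: int) -> dict:
        """Called on each data transfer to obtain the buffer's context."""
        if buffer_no not in self.contexts and buffer_no in self.pending:
            previous = self.pending.pop(buffer_no)
            # First access to the chained buffer: inherit the search state.
            self.contexts[buffer_no] = dict(self.contexts[previous])
        return self.contexts[buffer_no]


model = CommandModel()
model.start_buffer(4, pid=0x101)
model.chain_buffer(4, 9)              # may be issued any time after start
model.context_for(4)["zeros"] = 2     # buffer 4 ends on two zero bytes
inherited = model.context_for(9)      # buffer 9 resumes the partial match
```

Deferring the copy until the first access means the next buffer inherits the state as it stands at the end of the current buffer, rather than the state at the time the command was issued.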
More specifically, the MPEG-2 aware Real-Time Search and Indexing Engine (100) has been architected with the following capabilities: 1) capability to search a given pattern inside the MPEG-2 payload field for any given MPEG-2 program stream. Once the pattern is found, the Real-time Search Engine (100) reports the pattern location (buffer number, buffer offset, and byte position inside the MPEG-2 Transport Stream packet); 2) the capability to handle multiple program streams, dynamically switching context (each program stream has its own context), on a block (smallest transfer unit in the local bus) basis. Context switching is performed whenever a new data block is transferred from a buffer that is not the current buffer. The search engine keeps the entire context stored locally (e.g., SRAM); and, 3) the capability to report the search results by writing one or more specialized data blocks into a pre-allocated shared memory region, and utilizing a direct mapping scheme to associate the report with the buffer where the pattern was found. Specifically, once the Real-Time Search Engine (100) finds the search pattern, it reports the byte immediately following the search pattern by writing a special data transfer block containing the contents of a "Result Register" into the Packet Memory (30) as described herein with respect to FIGS. 5 and 8. The maximum number of reports per buffer is pre-assigned, however, this number may be altered by increasing the shared memory region used for reporting the search results. Preferably, the search engine (100) is implemented using FPGA's to enable system modifications without extensive re-design.
FIG. 5 is a further detailed block diagram of the Real-Time Search Engine (100) and particularly, the Decode and Finite State Machine Block (120). As shown in FIG. 5, the DEL (delimiter) bus (103') carries DEL information (103") for input to a DECODER block (114) which decodes the DEL parameter information to enable routing of the INPUT information (102") into the appropriate register: ADDR REG (111), CMD REG (112), or DATA REG (113) under actual control of the corresponding control lines: (111'), (112'), and (113'). As referenced herein, these DEL values signal the following types of information: Address, indicating the starting address of the data transfer; Command, indicating the command issued by the Host Processor (20); and, Data, indicating the data transfer payload.
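The routing performed by the DECODER block (114) may be pictured with the following Python sketch. The enumeration Del and the class Decoder are hypothetical, and the delimiter values are represented by symbolic labels because the numeric encoding on the DEL bus is not specified above.

```python
from enum import Enum


class Del(Enum):
    ADDRESS = "address"   # starting address of the data transfer
    COMMAND = "command"   # command issued by the Host Processor
    DATA = "data"         # data transfer payload


class Decoder:
    """Routes each INPUT value to the register selected by its delimiter."""

    def __init__(self):
        self.addr_reg = None   # stand-in for ADDR REG (111)
        self.cmd_reg = None    # stand-in for CMD REG (112)
        self.data_reg = None   # stand-in for DATA REG (113)

    def route(self, delimiter: Del, value: int):
        if delimiter is Del.ADDRESS:
            self.addr_reg = value
        elif delimiter is Del.COMMAND:
            self.cmd_reg = value
        elif delimiter is Del.DATA:
            self.data_reg = value
        else:
            raise ValueError("unknown delimiter")


decoder = Decoder()
decoder.route(Del.ADDRESS, 0x2000)   # start of a transfer
decoder.route(Del.DATA, 0x47)        # first payload byte of that transfer
```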
Particularly, the ADDR REG (111) is connected to a CONTEXT SWITCHING HANDLER (115), which is a Finite State Machine that checks the contents of the ADDR REG (111), and compares it with the current buffer number stored at the Context Register (105). If the contents of the ADDR REG (111) match the current buffer number (indicating that the data transfer belongs to the current buffer), no context switch is performed. If the CONTEXT SWITCHING HANDLER (115) determines that the current (input) address belongs to a different buffer, it saves the current context, i.e., contents of Context Register (105), in the SRAM Context Storage memory (107), and fetches the context corresponding to the new (input) buffer from the CONTEXT STORAGE (107), updating the contents of the Context Register (105). FIG. 7 illustrates the dynamic contents of the Context Register (105) as comprising a first context word (122) and a second context word (123). Particularly, as shown in FIG. 7, the first context word (122) format is as follows: