CA2364316A1 - Improved efficiency masked matching - Google Patents

Improved efficiency masked matching

Info

Publication number
CA2364316A1
CA2364316A1 (application CA002364316A)
Authority
CA
Canada
Prior art keywords
bit
candidate
patterns
group
pattern
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002364316A
Other languages
French (fr)
Inventor
Heng Liao
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsemi Storage Solutions Ltd
Original Assignee
PMC Sierra Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by PMC Sierra Ltd filed Critical PMC Sierra Ltd
Publication of CA2364316A1 publication Critical patent/CA2364316A1/en
Abandoned legal-status Critical Current

Classifications

    • H04L 45/742 — Route cache; operation thereof (under H04L 45/00: Routing or path finding of packets in data switching networks; H04L 45/74: Address processing for routing)
    • H04L 9/40 — Network security protocols (under H04L 9/00: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications)
    • H04L 63/0227 — Filtering policies (under H04L 63/02: Network architectures or protocols for separating internal from external traffic, e.g. firewalls)
    • H04L 63/101 — Access control lists [ACL] (under H04L 63/10: Controlling access to devices or network resources)
    • H04L 69/22 — Parsing or analysis of headers
    • Y10S 707/99931 — Database or file accessing
    • Y10S 707/99933 — Query processing, i.e. searching

Abstract

Methods and apparatus for reducing the search space processed by masked matching methods. The search space is reduced by grouping the candidate bit patterns into groups and subgroups that have internal bit agreement between the members. By applying the mask matching methods only to a select number of groups, selected by their bit agreement with the target bit pattern, the computation time and memory requirement of the masked matching method are reduced.

Description

IMPROVED EFFICIENCY MASKED MATCHING
Field of the Invention

The present invention relates to methods and apparatus for matching a target bit pattern against a multitude of candidate bit patterns and, more specifically, to techniques for enhancing the efficiency of such matching methods.
Background of the Invention

In data networks, routers classify data packets to determine the micro-flows that the packets belong to and then apply the classification to the packets accordingly. Flow identification is the essential first step for providing any flow-dependent service. A number of network services require packet classification, including access control, firewalls, policy-based routing, provision of integrated/differentiated qualities of service, traffic billing, and secure tunnelling. In each application, the classifier determines which micro-flow an arriving packet belongs to so as to determine whether to forward or filter it, where to forward it to, what class of service it should receive, the scheduling tag/state/parameter that it is associated with, or how much should be charged for transporting it. The classifier maintains a set of rules about packet headers for flow classification.
To clarify, a router is a multi-port network device that can receive and transmit data packets from/to each port simultaneously. Data packets typically have a regular format with a uniform header structure. The header structure usually contains data fields such as address or packet type. When a packet is received from a port, the router uses the header information to determine whether the packet is discarded, logged, or forwarded. If a packet is forwarded, the router also calculates which output port the packet will go to. The router also counts the number of each type of packet passing through. The forwarding decision (where to send the packet) is typically made based on the destination address carried in the packet. In an Internet Protocol router, forwarding involves a lookup process called the Longest Prefix Match (LPM), which is a special case of the general mask matching process.
The LPM uses a route table that maps a prefix rule (a mask-matching rule with all the wildcard bits located at the contiguous least significant bits) to an output port ID. An example of an LPM route table is given below:
#   32-bit Prefix                               Output Port ID

1   xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx    port 0
2   1111 0010 1100 xxxx xxxx xxxx xxxx xxxx    port 1
3   1101 0011 0001 xxxx xxxx xxxx xxxx xxxx    port 3
4   1111 0010 1100 1100 0011 xxxx xxxx xxxx    port 2
5   0010 0000 0001 1111 1111 0000 0000         port 4

where x is a wildcard bit.

An input packet with destination address "1111 0010 1100 1100 0011 1111 1111 1111" should be forwarded to port 2 because it matches entries #2 and #4, but #4 has priority over #2 because the prefix length (number of non-wildcard bits) of #4 is longer than that of #2.
The router or a firewall will also examine the input packets to determine if they should be discarded and logged. This is usually done with an Access Control List (ACL) lookup. An ACL can be a user-configurable mask-matching rule set (based on packet header fields) that categorizes certain types of traffic that may be hazardous to the network. Hence, when a packet that matches an ACL entry is received, the router/firewall should take action according to the ACL to discard and log the packet or alert the network administrator.
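The LPM lookup described above reduces to a masked comparison plus a longest-prefix tie-break. Below is a minimal sketch, assuming each route-table entry is stored as a mask/value pair; this encoding and all names are illustrative, not the patent's own data layout:

```c
#include <stdint.h>

/* One route entry as a mask/value pair: a prefix bit matters only where
 * the mask bit is 1; wildcard ('x') positions have a 0 mask bit. */
typedef struct {
    uint32_t value;      /* prefix bits, wildcards stored as 0 */
    uint32_t mask;       /* 1 = significant bit, 0 = wildcard  */
    int      port;       /* output port ID                     */
    int      prefix_len; /* number of non-wildcard bits        */
} route_t;

/* Longest Prefix Match: among all entries whose significant bits agree
 * with the destination address, pick the one with the longest prefix. */
int lpm_lookup(const route_t *table, int n, uint32_t dest)
{
    int best_port = -1, best_len = -1;
    for (int i = 0; i < n; i++) {
        if ((dest & table[i].mask) == table[i].value &&
            table[i].prefix_len > best_len) {
            best_len  = table[i].prefix_len;
            best_port = table[i].port;
        }
    }
    return best_port;
}
```

With entries #2 and #4 of the table above encoded this way, a destination whose first 20 bits are 1111 0010 1100 1100 0011 matches both, and the longer prefix of #4 wins, as in the worked example.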
Such devices as explained above use general multi-layer classification methods in carrying out the device's function. General multi-layer classification requires the examination of arbitrary data/control fields in multiple protocol layers. The prior art grammatical/lexical parser provides flexible solutions to this problem, but the cost of supporting a large rule set is high.
A multiple field classifier is a simple form of classifier that relies on a number of fixed fields in a packet header. A classic example is the 7-dimensional classification, which examines the SA/DA/TOS/Protocol fields in the IP header and the SPORT/DPORT/PROTOCOL FLAG fields in the TCP/UDP header. Because a multi-field classifier deals with fixed fields, parsing is not required. Instead of dealing with variable-length packets, the multi-field classifier does classification on fixed-sized search keys. The search key is a data structure of the extracted packet data fields. The multi-field classifier assumes the search keys are extracted from the packet before being presented to the classifier.
The problem of multiple field classification can be transformed into the problem of condition matching in a multi-dimensional search key space, where each dimension represents one of the data fields the classifier needs to examine. A classification rule specifies conditions to be matched in all dimensions.
The classification rules specify value requirements on several fixed common data fields.
Previous studies show that a majority of existing applications require up to 8 fields to be specified:
source/destination network-layer addresses (32 bits each for IPv4), source/destination transport-layer port numbers (16 bits each for TCP and UDP), the Type-of-Service (TOS) field (8 bits), the Protocol field (8 bits), and the transport-layer protocol flags (8 bits), for a total of 120 bits. The number of fields and the total width of the fields may increase for future applications.
Rules can be represented in a number of ways including exact number match, prefix match, range match, and wildcard match. Wildcard match was chosen to be the only method of rule representation that did not sacrifice generality. Any other forms of matching are translated into one or multiple wildcard match rules. A wildcard match rule is defined as a ternary string, where each bit can take one of three possible values:
'1', '0', or 'x'. A bit of '1' or '0' in the rule requires the matching search key bit in the corresponding position to have exactly the same value, and a bit of 'x' in the rule can match either '0' or '1' in the search key.
An example of a rule specification on a 16-bit field is given below:

The classifier wants to match: 1111 0000 xx1x 0xx1
The mask is:                   1111 1111 0010 1001
The target value is:           1111 0000 0010 0001
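A mask/target pair of this form can be tested against a search key with a single AND-and-compare. A sketch, assuming rules are stored as 16-bit mask/value words (the function name and layout are illustrative):

```c
#include <stdint.h>

/* A wildcard rule stored as a (mask, value) pair: the mask has a 1
 * wherever the rule bit is '0' or '1', and the value holds those
 * required bits (wildcard positions stored as 0).
 * For the 16-bit rule 1111 0000 xx1x 0xx1:
 *   mask  = 1111 1111 0010 1001 = 0xFF29
 *   value = 1111 0000 0010 0001 = 0xF021 */
static int rule_matches(uint16_t key, uint16_t mask, uint16_t value)
{
    /* Wildcard positions are zeroed out of the key before comparing. */
    return (key & mask) == value;
}
```

The search key 1111 0000 1010 0011 matches this rule (every significant bit agrees), while 1111 0000 1010 0010 does not (the final required '1' is absent).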
Prefix match rules can be represented as wildcard rules naturally by contiguous 'x' bits in the rules. However, the don't-care bits in a general wildcard rule do not have to be contiguous. Ranges or multiple disjoint point values may be defined by using multiple masked matching rules. For example, an 8-bit range must be broken into two masked matching rules '00010xxx' and '00110xxx'. Even with this limitation, the masked matching form is still considered to be an efficient representation, because most of the ranges in use can be broken down into a small number of mask rules. A compiler can handle the task of breaking down user rule specifications in a convenient syntax, so the complexity can be hidden from the user.
Each rule represents a region in the multi-dimensional space. Each search key (representing a packet to be classified) defines a point in this space.
Points that fall into one region are classified as a member of the associated class. Ambiguity arises when multiple regions overlap each other. A single priority order is defined among the rules to resolve the ambiguity. The rules are numbered from 0 to N-1. The rule indices define the priority among the rules in ascending order. The region with higher priority will cover the region with lower priority. In other words, if a packet satisfies both rule[i] and rule[j], and i<j, it is classified into class[i]; otherwise it is classified into class[j].
One advantage of mask matching is its dimension independence. Multiple concatenated fields can be classified with the same method as if they were one wide field. This is accomplished by concatenating the masks of the target strings.

The prior solutions can be grouped into the following categories:
Sequential match

For each arriving packet, this approach evaluates each rule sequentially until a rule is found that matches all the fields of the search key. While this approach is simple and efficient in its use of memory (memory size grows linearly as the size of the rule set increases), it is unsuitable for high-speed implementation: the time required to perform a lookup grows linearly with rule set size.
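The sequential approach can be sketched as a linear scan in priority order, assuming each wildcard rule has been reduced to a mask/value pair (an illustrative encoding, not prescribed by the surveyed schemes):

```c
#include <stdint.h>

/* One wildcard rule as a mask/value pair over a 32-bit search key. */
typedef struct { uint32_t mask, value; } rule_t;

/* Scan the rules in priority order and return the index of the first
 * rule whose significant bits all agree with the key, or -1 if none
 * match.  Lookup time grows linearly with the rule-set size. */
int sequential_match(const rule_t *rules, int n, uint32_t key)
{
    for (int i = 0; i < n; i++)
        if ((key & rules[i].mask) == rules[i].value)
            return i;
    return -1;
}
```

A catch-all rule (mask 0, value 0) placed last gives every key a default classification, mirroring a default route or default ACL action.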
Grid of Tries

The 'Grid of Tries' (or Tuple Space Search) uses an extension of the trie data structure to support two fields per search key. This is a good solution for a two-dimensional rule set, but it is not easy to extend the concept to more fields.
The cross-producing scheme is an extension of the 'Grid of Tries' that requires a linear search of the database to find the best matching filter. Hence the effectiveness of cross-producing is not clear. The grid of tries approach requires intensive precompute time, and the rule update speed is slow.
A scheme based on tries is presented by Douceur et al. in US Patent 5,995,971 and US Patent 5,956,721. This method utilizes a trie-indexed hierarchy forest ("rhizome") that accommodates wildcards for retrieving, given a specific input key, a pattern stored in the forest that is identical to or subsumes the key.
This approach has the weakness of not supporting "conflict" between patterns (as stated in line 2126, column 22 of US Patent 5,995,971). Patterns that partially overlap but do not subsume one another (e.g. patterns "100x" and "1x00") are in "conflict"; because they overlap each other only partially, they may not be stored in the rhizome defined by the patent, since no defined hierarchical relationship holds for these patterns. In networking applications, such conflicts widely exist in router access lists and firewall policies. This weakness limits the use of this classification scheme.
Concurrent Cross Producing

T.V. Lakshman in "High Speed Policy-Based Packet Forwarding Using Efficient Multi-Dimensional Range Matching", Proceedings of ACM SIGCOMM '98 Conference, September 1998, presented a hardware mechanism for concurrent matching of multiple fields.
For each dimension this scheme does a binary search on projections of regions onto that dimension to find the best match region. A bit-level parallelism scheme is used to solve the cross-producing problem among dimensions. The memory size required by this scheme grows quadratically, and the memory bandwidth grows linearly, with the size of the rule set. Because of the computational complexity of the cross-producing operation, this scheme has a poor scaling property. This scheme also requires a time-consuming data structure generation process, hence the rule update speed is slow.
Ternary CAM

Hardware ternary CAMs (Content Addressable Memory) can be used for classification. Ternary CAMs store three-value digits: '0', '1' or 'x' (wildcard). The CAMs have good lookup performance and fast rule update time, but the hardware cost (silicon area) and power consumption are high. Moreover, the CAMs require full-custom physical design, which prevents easy migration between different IC technologies. For these reasons, currently available CAMs are typically small.
Recursive Flow Classification

The recursive flow classifier (RFC), as discussed in Pankaj Gupta and Nick McKeown, "Packet Classification on Multiple Fields", SIGCOMM, September 1999, Harvard University, and Pankaj Gupta and Nick McKeown, "Packet Classification using Hierarchical Intelligent Cuttings", Proc. Hot Interconnects VII, August 1999, Stanford, exploits the heuristics in typical router policy database structures (router microflow classifiers, access lists, firewall rules). RFC uses multiple reduction phases, each step consisting of a set of parallel memory lookups. Each lookup is a reduction in the sense that the value returned by the memory lookup is shorter (is expressed in fewer bits) than the index of the memory access. The algorithm can support very high lookup speed at a relatively low memory bandwidth requirement. Since it relies on the policy database structure, in the worst case little reduction can be achieved at each step, and the performance becomes nondeterministic. In a normal case, the lookup performance gain is achieved at the cost of a high memory size and a very long precomputation time. For a large rule set (16K rules), the RFC precompute time exceeds the practical limit of a few seconds. In general, RFC is suitable for small classifiers with static rules.
All of the above methods involve, in one form or another, large computation or lookup times.
Whichever method is implemented, the cost in time and complexity eventually increases to unacceptable levels. Furthermore, whichever method is used, the whole search space of possible candidate bit patterns is searched for a match with the target bit pattern. What is required is a method which reduces the search space for whichever mask matching method is implemented.
Summary of the Invention

The present invention meets the above need by providing methods and apparatus for reducing the search space processed by mask matching methods. The search space is reduced by grouping the candidate bit patterns into groups and subgroups that have internal bit agreement between the members. By applying the mask matching methods only to a select number of groups, selected by their bit agreement with the target bit pattern, the computation time and memory requirement of the mask matching method are reduced.
In a first aspect the present invention provides a method of increasing the efficiency of a process for finding a match between a target bit pattern and at least one of a plurality of candidate bit patterns, the method comprising dividing said candidate bit patterns into specific groups such that, for every group, members of that group have bit agreement with every other member in said group, and applying said process only to groups whose members have bit agreement with said target bit pattern.
In a second aspect the present invention provides a method of increasing the efficiency of a process for finding a match between a target bit pattern and at least one of a plurality of candidate bit patterns, the method comprising dividing said candidate bit patterns into search space groups, each group having an aggregate number of members less than the total aggregate number of candidate bit patterns, and applying said process only to groups whose members are in bit agreement with the target bit pattern on at least a first bit position.
In a third aspect, the present invention provides a method of increasing the efficiency of a process for finding a match between a target bit pattern and at least one of a plurality of candidate bit patterns, the method comprising grouping said candidate bit patterns based on a value of at least one specific bit position of said candidate bit patterns, and discarding those candidate bit patterns whose value at said at least one specific bit position does not match a value of said target bit pattern.
In a fourth aspect, the present invention provides a method of increasing the efficiency of a process for finding a match between a target bit pattern and at least one of a plurality of candidate bit patterns, the method comprising:
a) grouping said candidate bit patterns based on a value of a specific bit in a specific bit position in said candidate bit patterns;
b) discarding candidate bit patterns which have a specific bit in the specific bit position whose value does not match the value of the corresponding bit in the target bit pattern;
c) repeating steps a)-b) on the remaining candidate bit patterns with different bit positions until the number of remaining candidate bit patterns is at a minimum; and
d) applying said process to the remaining candidate bit patterns.
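Steps a) through d) above amount to an iterative pruning loop over the candidate set. A sketch, assuming ternary candidates are stored as (care, value) bit masks; the representation and names are this sketch's own, not the patent's:

```c
#include <stdint.h>

/* A ternary candidate pattern as a (care, value) pair:
 * care bit 1 means the bit is '0' or '1' (held in value);
 * care bit 0 means the bit is 'x' (don't-care). */
typedef struct { uint32_t care, value; } pattern_t;

/* One pruning pass at a single bit position: keep only candidates
 * whose bit there is 'x' or equals the target's bit (steps a-b).
 * Prunes in place and returns the new candidate count; repeating
 * this at further bit positions (step c) shrinks the search space
 * before the full masked-matching process runs (step d). */
int prune_at_bit(pattern_t *cand, int n, uint32_t target, int bitpos)
{
    uint32_t bit = 1u << bitpos;
    int kept = 0;
    for (int i = 0; i < n; i++) {
        int dont_care = !(cand[i].care & bit);
        int agrees    = (cand[i].value & bit) == (target & bit);
        if (dont_care || agrees)
            cand[kept++] = cand[i];   /* keep; otherwise discard */
    }
    return kept;
}
```

Each pass discards every candidate that provably cannot match, so the surviving set can only shrink as more bit positions are examined.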

Brief Introduction to the Drawings

A better understanding of the invention will be obtained by a consideration of the detailed description below, in conjunction with the following drawings, in which:
Figure 1 illustrates a Patricia tree data structure according to the prior art;
Figure 2 is a block diagram illustrating a data structure for an internal node in a modified Patricia tree according to an embodiment of the invention;
Figure 3 is a block diagram illustrating a data structure for a leaf node in a modified Patricia tree according to an embodiment of the invention;
Figure 4 illustrates a modified Patricia tree according to the invention and how it is constructed;
Figure 5 is a block diagram illustrating how minimum search spaces can be combined to create a physical set;
Figure 6 is a schematic diagram illustrating an example of how a given target bit pattern can be used to traverse the Patricia tree of Figure 4 to arrive at a relevant physical set;
Figure 7 is a high level flow chart detailing the steps in the creation of a Patricia tree and physical subsets;
Figure 8 is a flow chart detailing the steps in the traversal of a modified Patricia tree as shown in the example of Figure 6;
Figure 9 is an illustration of the lookup tables generated for the EMM process;

Figure 10 is a schematic illustrating an example of an implementation of the EMM process in conjunction with the modified Patricia tree traversal;
Figure 11 is a block diagram illustrating a possible format for a physical set entry necessary to implement priority indexing; and Figure 12 is a block diagram showing the different parts of a possible hardware embodiment of the invention.
Detailed Description of the Preferred Embodiment

To clarify the terminology used in this description, the following definitions are provided:
Bit Position: a bit position is the placement of a specific bit in a bit pattern. Thus, if a bit pattern is 10XX1011, then the bit positions and their values are as follows (counting from the left):

Bit position 0: 1
Bit position 1: 0
Bit position 2: X
Bit position 3: X
Bit position 4: 1
Bit position 5: 0
Bit position 6: 1
Bit position 7: 1

From the above, it should therefore be clear that bit position 2 is subsequent to, or higher than, bit position 1. Similarly, bit position 4 is prior to, or lower than, bit positions 5, 6, and 7. Clearly, such relative positioning is dependent on a starting bit position from which relative positioning is determined. The above assumes a starting bit position of 0. However, there is nothing preventing bit position 7 from being the starting bit position. In this case, bit position 5 would be prior to bit position 4 and bit position 1 would be subsequent to bit position 2.
Filter: A filter defines a mask-matching rule. The method below assumes that a filter contains W bits and forms a concatenation of all the fields at which the classifier needs to look. A filter is further defined as a W-bit 3-value rule set. Each bit can take the value of '1', '0', or 'x', which means that the bit needs to be 1, 0, or don't-care (x) to match this filter. Each rule can also be termed a filter bit.
Rule Set: A rule set is a collection of N filter rules indexed from 0 to N-1. The rule set defines a single order of priority. A rule with a larger index value has higher priority than a rule with a smaller index.
The symbols F[0] .. F[N-1] are used to represent filters 0..N-1. A rule set can be represented by the 3-value matrix F[N,W], where F[i,j] means the mask matching value of rule i, bit j. Members of a rule set are also termed candidate bit patterns.
Field Vector: The packet field data to be classified. It is a W-bit 2-value vector (each bit either '0' or '1') that is the concatenation of the key fields of a data packet. The field vector is represented by the symbol K[0..W-1]. This is also termed a target bit pattern.
Bit agreement: the concept of bit agreement is analogous to the idea of matching. A bit value of '1' agrees only with another bit value of '1'. A bit value of '0' does not agree with a bit value of '1' but does agree with a bit value of '0'. A bit value of 'x' agrees with all bit values. Thus, a bit pattern of 101X10X is in bit agreement with both 1011101 and 1010101, but not with 101111X. Similarly, a bit pattern of 10XX is in bit agreement with XX11 and with 1001, but not with 11XX. Bit agreement may be constrained to a specific bit position. Thus, bit pattern 10XX1 is in bit agreement with 10100 up to bit position 3 but not with respect to bit position 4. Similarly, bit pattern X1001 is in bit agreement with bit pattern 111XX only up to bit position 1, since at bit position 2 the respective bit values are not equal.
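The bit-agreement test can be expressed compactly if each ternary pattern is stored as a (care, value) pair: a position disagrees only when both patterns care about it and their values differ there, while 'x' agrees with everything. The encoding below is an assumption made for illustration:

```c
#include <stdint.h>

/* Ternary pattern: care bit 1 = '0' or '1' (held in value),
 * care bit 0 = 'x'.  Bit 6 holds the leftmost pattern bit for the
 * 7-bit examples in the text. */
typedef struct { uint32_t care, value; } tpat_t;

/* Two patterns are in bit agreement iff no position is cared about
 * by both AND differs in value between them. */
int bit_agreement(tpat_t a, tpat_t b)
{
    uint32_t both_care = a.care & b.care;
    uint32_t differ    = a.value ^ b.value;
    return (both_care & differ) == 0;
}
```

Encoding the text's examples this way, 101X10X agrees with 1011101 but not with 101111X, matching the definition above.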
Patricia tree: A data structure that provides relatively fast retrieval, insertion and removal operations for a search database. A Patricia tree is a binary tree. Each internal node in this tree has exactly two branches and a bit index value. The tree is traversed starting at the root; when a given node is reached, movement can occur in either of two directions.
Each leaf in the tree contains a data value. The numeric bit index of an internal node specifies the specific bit in an input key that governs movement from the node. If that bit has a one value, one branch is taken; if it has a zero value, the other branch is taken, and so forth through successive nodes along the path until a leaf node is reached. While the bit index for any child is greater than that of its parent, the bit indices encountered along any such path need not be consecutive, merely ascending. A look-up operation on the Patricia tree involves following the branches progressively, starting from the root, until a leaf node is reached. A match is only considered to be found if the data value at the leaf node is equal to the search key. The manner in which nodes are interconnected to form a Patricia tree, and the values of their associated bit indices, are strictly determined by the values that are to be stored in the leaf nodes and by where disagreements occur, starting at the root and continuing at each successively higher bit position, among all such leaf values. Each internal node (and hence a bit index) specifies a bit position at which such a disagreement exists. The Patricia tree structure has been used to do exact bit vector matches and longest prefix matches.
As an example of a Patricia tree, Figure 1 illustrates a Patricia tree with the following rule set:
Bit position  0 1 2 3 4 5
Rule 0        1 1 0 0 1 0
Rule 1        0 1 0 1 0 1
Rule 2        0 1 0 1 0 0
Rule 3        1 1 1 0 0 0
Rule 4        0 1 0 0 0 1

In Figure 1, the circles denote internal nodes while the rounded boxes denote leaf nodes. In the figure, the index relates to the bit position being examined. Thus, if a match for the target bit pattern 010101 is being searched for, the search starts with bit position 0 (also known as the starting bit position).
At this bit position, since the target pattern has a value of 0, the right hand branch is taken from internal node 10 to internal node 30. It should be noted that internal nodes 20 and 30 only reference indices 2 and 3 since, for bit position 1, ALL the candidate bit patterns have the same value of '1'. Thus, bit position 1 is not a bit position through which the different patterns can be differentiated. This means that, for the candidate bit patterns that have a 0 in bit position 0, bit position 1 cannot be used to differentiate between them. Similarly, for the candidate bit patterns with a '1' in bit position 0, bit position 1 cannot be used to differentiate between them.
Returning to the search, the target pattern has a value of 1 at bit position 3. This means that internal node 40 is the next destination. Bit position 5 (index = 5) can be used to differentiate between the two remaining candidate bit patterns. Since the target bit pattern has a value of '1' at bit position 5, the right hand branch is taken, leading to leaf node 50 and a match.
While the above example illustrates the use of a Patricia tree for a most specific match or most specific pattern, this is also its limitation: only very specific patterns can be found.
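The traversal just walked through can be written as a short loop. The node layout below is an assumed encoding for illustration, not the modified structure the invention defines later:

```c
#include <string.h>

/* One encoding of a classic Patricia lookup: each internal node tests
 * one bit index of the search key (given here as a '0'/'1' string at
 * least as long as any tested index) and follows child0 or child1; a
 * leaf stores the full pattern, and a match is declared only if the
 * leaf value equals the key. */
typedef struct node {
    int is_leaf;
    int bit_index;              /* internal node: bit to test      */
    struct node *child0, *child1;
    const char *value;          /* leaf node: stored exact pattern */
} pnode_t;

const char *patricia_lookup(const pnode_t *n, const char *key)
{
    while (!n->is_leaf)
        n = (key[n->bit_index] == '1') ? n->child1 : n->child0;
    return strcmp(n->value, key) == 0 ? n->value : NULL;
}
```

Note the final comparison against the stored leaf value: because skipped bit positions are never examined during descent, a Patricia lookup must verify the whole key at the leaf, exactly as the definition above requires.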
The present invention uses the concept of the Patricia tree to reduce the search space that needs to be searched for a match. The original concept of the Patricia tree can only represent exact match rules efficiently. Matching a rule with a wildcard (an 'x' in a bit position) requires that the rule be fully expanded into exact match rules that cover the wildcard space. Thus, a bit pattern having a single wildcard will require 2 leaf nodes, while a bit pattern with n wildcards will need 2^n leaf nodes to expand. The present invention extends the Patricia tree concept to "prune" or reduce the search space of wildcard rules.
This pruned or reduced search space can then be searched for the proper match. Without pruning, all the candidate bit patterns (or rules) in the classifier database would have to be examined for each target bit pattern. By traversing the wildcard Patricia tree using the target bit pattern, a majority of the candidate bit patterns can be eliminated, thereby significantly reducing the number of candidate bit patterns that have to be examined.
The wildcard Patricia tree has two types of nodes: internal nodes and leaf nodes. Each internal node has two descendants, and a leaf node has no descendants. Each internal node represents a branch point to two different directions, along which different pruning is done to the search space. A leaf node represents a Minimum Subset or MSS that cannot be pruned any further.
The data structure for an internal node is illustrated in Figure 2. A bit index field 60 specifies the bit position to which the internal node refers, while a child0 field 70 points to the next node if the bit position referenced by the bit index field 60 has a value of '0'. Similarly, a child1 field 80 points to the next node if the bit position referenced by the bit index field 60 has a value of '1'. It should be noted that there is no childX field, as the target bit pattern is the pattern being matched when traversing the Patricia tree. Accordingly, there can be no X value, as the target bit pattern only has definite '1' or '0' values in its bit positions.
Each leaf node points to an MSS of the rule database. The MSSs are stored separately in a data structure described in the next section. The data structure of the leaf node only contains a subset index that is used to identify the associated subset in the data structure. This data structure is illustrated in Figure 3.
As can be seen, the leaf node 90 merely contains an index that references an associated minimum subset.
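The node layouts of Figures 2 and 3 might be rendered as C structs along these lines; the field names follow the description above, while the exact types and widths are assumptions:

```c
/* Internal node (Figure 2): a bit index and two child pointers.
 * There is no childX pointer because the target bit pattern being
 * matched contains only definite '0'/'1' bits. */
struct internal_node {
    int bit_index;                 /* field 60: bit position tested */
    struct internal_node *child0;  /* field 70: taken on a '0' bit  */
    struct internal_node *child1;  /* field 80: taken on a '1' bit  */
};

/* Leaf node (Figure 3): only the index identifying the associated
 * minimum subset (MSS) in the separately stored subset structure. */
struct leaf_node {
    int subset_index;
};
```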
With a given wildcard rule set, the set can be turned into a Patricia tree and minimum subsets using the following method which is explained in C pseudo-code:
Node* Make_tree(Set rset, int index)
{
    Set group0, group1, groupx;
    Node* p = new Node;

    while (index < KEY_WIDTH) {
        group0 = ∅; group1 = ∅; groupx = ∅;
        for (j = 0; j < sizeof(rset); j++) {
            switch (rset.rule[j].bit[index]) {
                case 0: group0 = group0 ∪ {rset.rule[j]}; break;
                case 1: group1 = group1 ∪ {rset.rule[j]}; break;
                case x: groupx = groupx ∪ {rset.rule[j]}; break;
            }
        }
        if ((group0 == ∅) || (group1 == ∅)) {
            /* no disagreement found at this bit position */
            index = index + 1;
        } else {
            p->type = INTERNAL_NODE;
            p->bit_index = index;
            p->child0 = Make_tree(group0 ∪ groupx, index + 1);
            p->child1 = Make_tree(group1 ∪ groupx, index + 1);
            return p;
        }
    }
    p->type = LEAF_NODE;
    p->subset_index = Create_new_minimum_subset(rset);
    return p;
}
As can be seen, the Make_tree function is recursively applied to generate a tree and subtrees within the tree. In the initial pass, the pseudo-code groups the candidate bit patterns into three groups: group0 for those with 0 at the initial bit position, group1 for those with 1 at the initial bit position, and groupx for those with x at the initial bit position.
Then each of these groups is in turn subdivided into smaller groups based on the bit value at the bit position referenced by the continuously increasing index. The end result is that the final leaf nodes are subsets whose member candidate bit patterns have bit agreement with each other. It should be noted that groupx is added to both group0 and group1 prior to subdividing group0 and group1. This is done to account for the bit agreement between an 'x' value and a '0' or '1' bit value.

In other words, the code starts with the complete rule set and index = 0. At each step, the pseudo-code finds the first bit position, greater than or equal to the initial index value, where a disagreement in bit value exists among the rules. Only the bit pair of '1' and '0' is considered to be a disagreement; a value of 'x' agrees with both '1' and '0'. The code finds bit disagreements by dividing the rule set (the rset argument of the Make_tree function) into three separate groups, represented by group0, group1, and groupx respectively, for each index value incrementally. Grouping is done according to the bit value of each rule at the indexed bit position - if a rule has a 0 value at the indexed position, it is added to group0, and so on. For a specific index value, after the grouping is done, if both group0 and group1 are not empty, a bit disagreement exists for that index value. In this case an internal node is created. The internal node has two separate branches corresponding to bit values of 0 and 1. The Make_tree function is called to generate the subtrees for the 0-branch and the 1-branch. Note that the rule set for the 0-branch consists of both group0 and groupx, and the rule set for the 1-branch consists of group1 and groupx. In other words, groupx is duplicated for both branches since no pruning can be done for this group.
If the Make_tree function cannot find any bit disagreement for a rule set (i.e. after the index has been incremented beyond the key width), no further pruning on the set can be done. Therefore, a "leaf node" is created to represent the MSS.
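The Make_tree construction above can be sketched in executable form. The following Python sketch is an illustrative reconstruction, not the patent's implementation: rule patterns are modelled as strings over '0', '1' and 'x', and rules 0-5 of the Table 1 example below are used for demonstration.

```python
KEY_WIDTH = 6  # width of the search key in this example

def make_tree(rset, index=0):
    """Recursively split the rule set at the first bit position with a
    0/1 disagreement; 'x' agrees with both values and goes to both sides."""
    while index < KEY_WIDTH:
        group0 = [r for r in rset if r[index] == '0']
        group1 = [r for r in rset if r[index] == '1']
        groupx = [r for r in rset if r[index] == 'x']
        if group0 and group1:  # bit disagreement found -> internal node
            return {'type': 'internal', 'bit_index': index,
                    'child0': make_tree(group0 + groupx, index + 1),
                    'child1': make_tree(group1 + groupx, index + 1)}
        index += 1             # no disagreement at this position
    return {'type': 'leaf', 'subset': rset}  # minimum search subset

def leaf_subsets(node):
    """Collect the MSS stored at each leaf, left to right."""
    if node['type'] == 'leaf':
        return [node['subset']]
    return leaf_subsets(node['child0']) + leaf_subsets(node['child1'])

rules = ['110010', '01010x', '010x01', '11x00x', '010001', 'xxxxxx']
tree = make_tree(rules)
```

Running leaf_subsets(tree) on these rules yields four leaf subsets, corresponding to boxes 130 through 160 of Figure 4.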

An example of the results of an application of the pseudo-code is illustrated in Figure 4. Given the rule set or candidate bit patterns in Table 1, the tree of Figure 4 is produced.
Table 1 Rule set:

Bit position:  0 1 2 3 4 5
Rule 0:        1 1 0 0 1 0
Rule 1:        0 1 0 1 0 x
Rule 2:        0 1 0 x 0 1
Rule 3:        1 1 x 0 0 x
Rule 4:        0 1 0 0 0 1
Rule 5:        x x x x x x
Rule 6:        0 1 1 x x x

At the initial bit position (index = 0) the candidate bit patterns are divided into 3 groups, group0, group1, and groupx, as shown in box 100. Each of group0 and group1 in box 100 is further subdivided into subgroups in box 110 and box 120 respectively. As can be seen, the initial group0 is subdivided using index = 3, since index = 1 for this group cannot be used to subdivide the group as there is bit agreement between all group members; only when index = 3 is there bit disagreement. These subgroups are shown in box 130 (for bit value = 0 at bit position 3) and in box 140 (for bit value = 1 at bit position 3). The initial group1 is subdivided (see box 120) using index = 4. This is because, for this group1, indices 1, 2 and 3 cannot be used to subdivide as all of the group members have bit agreement until index = 4.

The other subgroups derived from the rule set are shown in boxes 150 and 160. These subgroups result from the initial group having a bit value of '1' in the initial bit position. As can be seen, each one of these subgroups is indexed or referenced by a subset_index value. The subset_index values and the subset or subgroup members are as follows:
subset_index = 0 [box 130]: {010001, 010x01, xxxxxx}
subset_index = 1 [box 140]: {01010x, 010x01, xxxxxx}
subset_index = 2 [box 150]: {11x00x, xxxxxx}
subset_index = 3 [box 160]: {110010, xxxxxx}
These subgroups can be reduced to a smaller number of what can be termed physical sets (PS) to which a mask matching method can be applied. While there are 4 MSS sets in the example, in practical applications there can be large numbers of such sets, with the number of elements in each set being quite varied. To simplify storage and sorting through what could be large numbers of MSS sets, these sets can be combined into physical sets referenced by a PS index number. If this is done, a target bit pattern can then traverse the tree (effectively grouping the candidate bit patterns and "discarding" from consideration those bit patterns whose bit values at the relevant bit positions do not agree with the target bit pattern) to arrive at a PS index number at the resulting leaf node.
The PS index number refers to a physical set which contains candidate bit patterns having a high probability of matching the target bit pattern. A mask matching process can then be applied to the physical set. Since the physical set contains fewer elements than the complete rule or candidate set, the mask matching process becomes more efficient, as it is applied only to a smaller set.
The above approach is even more practical if the mask matching process or the hardware to be used can only support a fixed set of fixed-size groups. Thus, given a list of MSS's and a group size limit LIMIT, the MSS's have to be packed into physical sets having a size of, at most, LIMIT.
If an MSS group is of size LIMIT, then it cannot be packed together with others into a physical set; it becomes a physical set on its own. However, MSS's that do not fill a physical set can be packed together, saving the memory that would otherwise have to be allocated to the empty slots in the non-full MSS's.
Clearly, each MSS cannot be larger than the mask matching process LIMIT; otherwise the MSS cannot be fitted into the fixed-size subsets supported. In practice, LIMIT should be made as large as possible within the physical constraints to relax this constraint as far as possible. The maximum size of the MSS actually determines the number of rules that can subsume or overlap any one rule in the set. From our study of typical IP classification rule sets, we have found that the maximum MSS size is quite small (<256). Hence, the limitation does not seriously impair the applicability of the invention.

The following C pseudo-code function illustrates how the MSS groups can be packed into physical sets:
Pack_MSS packs a list of MSS's into a list of PS's:

Set Pack_MSS(Set LIST_MSS)
{
    Set LIST_PS;
    Set T;
    LIST_PS = ∅;
    while (LIST_MSS not empty) {
        T = ∅;
        for (each set MSS in LIST_MSS) {
            if (sizeof(T ∪ MSS) <= LIMIT) {
                T = T ∪ MSS;
                LIST_MSS = LIST_MSS - MSS;
            }
        }
        LIST_PS = LIST_PS ∪ {T};
    }
    return LIST_PS;
}
Note that memory savings may be achieved by the Pack_MSS function when multiple overlapping MSS's are merged into one PS - the overlapping rules appear only once in the merged PS set. A larger LIMIT allows more MSS's to be packed together, and hence the chances of getting memory savings are larger. This is another reason for using the largest practical LIMIT value. However, note that increasing LIMIT indefinitely is also not feasible, as it imposes a significant burden on the associative lookup process.
The Pack_MSS function not only generates a list of PS sets, but also establishes a mapping from MSS to PS: each MSS is mapped into the PS that Pack_MSS packs it into. An index lookup table is generated to represent the mapping, so that one can find the associated PS from the MSS index. Alternatively, the Patricia tree leaf node can store a pointer to the associated PS and save the indexed lookup processing step.
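The greedy packing described by Pack_MSS can be sketched as follows in Python (an illustrative reconstruction; MSS's are modelled as frozensets of rule strings, LIMIT is the fixed physical-set capacity, and each MSS is assumed to hold at most LIMIT rules, as guaranteed by construction):

```python
def pack_mss(list_mss, limit=4):
    """Greedily merge MSS's into physical sets of at most `limit` rules;
    rules shared by overlapping MSS's are stored only once."""
    pending = list(list_mss)
    list_ps = []
    while pending:
        t = set()
        remaining = []
        for mss in pending:
            if len(t | mss) <= limit:
                t |= mss          # pack this MSS into the current PS
            else:
                remaining.append(mss)
        pending = remaining
        list_ps.append(t)
    return list_ps

mss_list = [frozenset({'010001', '010x01', 'xxxxxx'}),   # subset 0
            frozenset({'01010x', '010x01', 'xxxxxx'}),   # subset 1
            frozenset({'11x00x', 'xxxxxx'}),             # subset 2
            frozenset({'110010', 'xxxxxx'})]             # subset 3
```

With LIMIT = 4 this reproduces the two physical sets of Figure 5: PS0 = {010001, 01010x, 010x01, xxxxxx} and PS1 = {110010, 11x00x, xxxxxx}.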
Figure 5 schematically illustrates what occurs when Pack_MSS is implemented. The subgroups in boxes 130, 140, 150 and 160 (from Figure 4) are packed into physical sets 170, 180. As can be seen, the subgroups in boxes 130 and 140 are packed into physical set 170 (PS0), and elements or members common to the subgroups are represented only once in the physical set. The resulting physical set 170 from subgroups 130, 140 will thus have the members:
PS0 = {010001, 01010x, 010x01, xxxxxx}
Similarly, the subgroups in boxes 150 and 160 are packed into physical set 180 (PS1) to result in:

PS1 = {110010, 11x00x, xxxxxx}
The example in Figure 5 assumes LIMIT = 4.
With the tree completed and the physical sets created, it can thus be determined whether a given target bit pattern matches a candidate bit pattern. As noted above, the process essentially comprises traversing the tree to "discard" the groups of candidate bit patterns which do not have bit agreement with the target bit pattern at specific bit positions. The tree traversal can be accomplished using the method of the C pseudo-code function below:
Tree_lookup traverses the Patricia tree to locate the corresponding PS:

int Tree_lookup(key, Node* tree)
{
    if (tree->type == INTERNAL_NODE) {
        switch (key[tree->bit_index]) {
        case 0: PS_INDEX = Tree_lookup(key, tree->child0); break;
        case 1: PS_INDEX = Tree_lookup(key, tree->child1); break;
        }
    }
    else // LEAF_NODE
    {
        MSS_INDEX = tree->subset_index;
        PS_INDEX = MSS2PS_MAPPING[MSS_INDEX];
    }
    return PS_INDEX;
}
As an example of the application of the Tree_lookup function, Figure 6 schematically illustrates a traversal of the tree in Figure 4 using the physical sets generated in Figure 5. For this example, the target bit pattern is 110001. As can be seen, the traversal begins with index = 0, meaning bit position = 0. Since the target bit pattern has a value of '1' at this bit position, the '1' branch in Figure 6 is taken. The next index value, and hence the next bit position, is 4. Since the target bit pattern has a value of '0' at this bit position (as shown by the underlined value in the figure), the '0' branch is taken. This leads to subset_index = 2 or MSS_INDEX = 2. This MSS_INDEX (as can be seen in Fig. 6) leads through the MSS/PS mapping index to a PS_INDEX (or physical set index) denoting physical set 1, or PS1.
From here, the physical set (PS1) can be retrieved and the mask matching method can be applied to the physical set using the target bit pattern.
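The traversal just described can be sketched as a short Python routine run against a hand-built copy of the Figure 4 tree (the node layout and the MSS-to-PS mapping table here are illustrative assumptions, not the patent's data structures):

```python
def tree_lookup(key, node, mss2ps):
    """Walk internal nodes by the key bit at each node's bit_index,
    then map the leaf's subset index to its physical set index."""
    while node['type'] == 'internal':
        branch = 'child1' if key[node['bit_index']] == '1' else 'child0'
        node = node[branch]
    return mss2ps[node['subset_index']]

def leaf(i):
    return {'type': 'leaf', 'subset_index': i}

# The Figure 4 tree: the root tests bit 0; its 0-branch tests bit 3,
# and its 1-branch tests bit 4.
tree = {'type': 'internal', 'bit_index': 0,
        'child0': {'type': 'internal', 'bit_index': 3,
                   'child0': leaf(0), 'child1': leaf(1)},
        'child1': {'type': 'internal', 'bit_index': 4,
                   'child0': leaf(2), 'child1': leaf(3)}}

# Subsets 0 and 1 were packed into PS0, subsets 2 and 3 into PS1 (Fig. 5).
mss2ps = {0: 0, 1: 0, 2: 1, 3: 1}
```

For the target pattern 110001, bit 0 is '1' and bit 4 is '0', so the walk ends at subset index 2 and returns physical set 1, as in Figure 6.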
To summarize the steps involved in the above process, Fig. 7 illustrates a flowchart detailing these general steps. The initial phase starts with a rule set or a set of candidate bit patterns 190. A tree is then created from this rule set (step 200). The tree creation step essentially groups the candidate bit patterns into subsets based on their bit values at specific bit positions. From this, the minimum subsets or the subgroups in Fig. 4 can be obtained (step 210). The subgroups can then be packed into physical sets (step 220).
With the physical sets created along with the tree, a given target bit pattern can thus be used to find a match. Fig. 8 illustrates the steps in this process.
As can be seen in Fig. 8, the process begins with determining the bit position/index at an internal node of the tree (step 230). Step 240 is that of determining if the bit value at the bit position referred to in the internal node has a value of '1'. If so, then step 250 notes that the '1' branch has to be taken. If, on the other hand, the bit value equals '0', then the '0' branch is taken (step 260). Either way, step 270 checks if the next node is a leaf node. If not, then the node must be an internal node and step 280 checks the index/bit position at that node. Connector A then details jumping back to step 230. On the other hand, if the node found in step 270 is a leaf node, then the physical set pointed to by the leaf node is retrieved (step 290). The chosen mask matching method is then applied to this physical set to determine if a match exists or not (step 300).
The above-described invention can be customized to work with any mask matching method. As an example, it can be customized with the applicant's other invention, the enhanced mask matching method (EMMM), as described in the copending US application entitled MULTI-FIELD CLASSIFICATION USING ENHANCED MASKED MATCHING.
The EMMM breaks the target bit pattern and the candidate bit patterns into sets of s-bit-wide chunks. The number of chunks is W/s. For each chunk, a 3-dimensional partial match array for all the bit combinations in the chunk is precomputed. The partial match array is represented by:

M[N][W/s][2^s]

The partial match array element M[k][i][j] represents the precomputed partial match result for all the bit combinations within the chunk, where i specifies the chunk index, j specifies an s-bit combination value, and k indexes the N-bit partial filter matching result.
For each chunk, the field bits are used to index into the 3-D partial match array to fetch the N-bit partial match result vector. The fetched partial match vectors for all the chunks are then ANDed together to form the complete match vector.
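The chunked precomputation and lookup can be sketched in Python as follows (an illustrative reconstruction with s = 2; for simplicity the array is built per physical set rather than as the full M[N][W/s][2^s] array, and the set is not padded to LIMIT entries):

```python
S = 2  # chunk size s

def chunk_matches(chunk_bits, rule_chunk):
    """'0'/'1' in the rule must match the key bit exactly; 'x' matches both."""
    return all(r in ('x', b) for b, r in zip(chunk_bits, rule_chunk))

def build_partial_match(ps_rules, key_width):
    """M[chunk_index][chunk_value] -> one match bit per rule in the PS."""
    return [[[1 if chunk_matches(format(v, '0{}b'.format(S)),
                                 rule[c * S:(c + 1) * S]) else 0
              for rule in ps_rules]
             for v in range(2 ** S)]
            for c in range(key_width // S)]

def emmm_lookup(M, key):
    """AND the fetched partial match vectors across all chunks of the key."""
    result = [1] * len(M[0][0])
    for c, table in enumerate(M):
        v = int(key[c * S:(c + 1) * S], 2)  # this chunk's field bits
        result = [a & b for a, b in zip(result, table[v])]
    return result
```

For physical set PS1 = {110010, 11x00x, xxxxxx} and the target 110001, the per-chunk vectors (1 1 1), (1 1 1) and (0 1 1) AND together to give (0 1 1): the second and third rules match while the first does not, mirroring the Fig. 10 result (which pads the set to LIMIT = 4 entries by duplicating the last rule).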
It should be noted that each bit in the match vector references a specific candidate bit pattern in the relevant physical set. If there are multiple matches in a physical set, priority encoding is used to resolve the conflict. Thus, it is important for this specific embodiment that the candidate bit patterns in the physical set be placed in their order of priority. This requirement will be made clear later in this document.
To generate the partial match array for EMMM, the following method, detailed in C pseudo-code, can be used:
GENERATE_EMM_ARRAY generates the EMMM partial match array from LIST_PS.

Parameters:
    NUM_PS         the number of PS's supported
    KEY_WIDTH      the width of the search key
    CHUNK_SIZE     the EMMM chunk size (speedup factor)
    LIMIT          the maximum number of rules in each PS
    LIST_PS[i][j]  the j-th rule in the i-th PS
    M              the EMMM partial match array for LIST_PS

SUBROUTINE Match(x, y) // x is an s-bit binary vector, y is an s-bit ternary vector
{
    result = 1;
    for (t = 0; t < CHUNK_SIZE; t++) {
        switch (y[t]) {
        case 'x': p = 1; break;
        case '1': p = (x[t] == 1); break;
        case '0': p = (x[t] == 0); break;
        }
        result = result & p;
    }
    return result;
}

GENERATE_EMM_ARRAY(Set LIST_PS)
{
    Boolean M[NUM_PS][KEY_WIDTH/CHUNK_SIZE][2^CHUNK_SIZE][LIMIT];
    int ps_index, chunk_index, chunk_val, rule_index;
    for (ps_index = 0; ps_index < NUM_PS; ps_index++)
        for (chunk_index = 0; chunk_index < KEY_WIDTH/CHUNK_SIZE; chunk_index++) // the chunk
            for (chunk_val = 0; chunk_val < 2^CHUNK_SIZE; chunk_val++) // enumerate bit combinations in a chunk
                for (rule_index = 0; rule_index < LIMIT; rule_index++)
                    M[ps_index][chunk_index][chunk_val][rule_index] =
                        Match(chunk_val,
                              LIST_PS[ps_index][rule_index][(chunk_index+1)*CHUNK_SIZE-1 : chunk_index*CHUNK_SIZE]);
}
As an example of the application of the above method, Fig. 9 illustrates a partial match array for the physical sets generated in Fig. 5. As can be seen, the candidate bit patterns in physical sets PS0 and PS1 are "chunked" into chunks that are 2 bits wide. Each chunk is indexed as 0, 1, or 2, depending on its placement in the bit pattern. Since the value of LIMIT is 4, to have a full physical set, the last rule/candidate bit pattern in the set is duplicated to fill the set. Since the rules/bit patterns are in their order of priority (i.e. the highest priority rule is at the top), it is the lowest priority rule/bit pattern that is duplicated.
It should be noted that in the tables of Fig. 9, the rules in the physical sets are also indexed as 0, 1, 2, 3, along with the physical set and the chunk. Thus, for PS_INDEX = 0, chunk_index = 1, and rule_index = 2, one needs to examine physical set PS0, the 3rd rule in the physical set (bit pattern 010x01, since indexing begins at index = 0), and the second chunk of that rule, 0x.
To generate the tables in Figure 9, all possible bit combinations for a given chunk size are generated and each bit combination is compared with the relevant chunk of each candidate bit pattern. If there is a match between a bit combination and the chunk, a '1' is entered in the partial match vector for that chunk index. The partial match vector is the final column in the tables of Fig. 9. As an example, in Fig. 9 the chunk size equals 2 and this means 4 possible bit combinations (00, 01, 10, 11). For chunk index 0 of physical set 0, there are 16 entries - one for every correlation between the 4 bit combinations and the 4 rule chunks. Thus, for chunk value 01 (bit combination 01) in chunk index 0 (the first chunk) of physical set PS0 and rule index 2 (the 3rd rule in PS0), the rule chunk has a value of 01; compared with the chunk value of 01, there is a match and, accordingly, a '1' is entered into the partial match vector.

A similar process is used when the table relating to the second physical set (PS_INDEX = 1) is created.
Once the tables are created and the tree is traversed, the target bit pattern can be matched with the physical set members using EMMM. This can be done by applying the following C pseudo-code:
EMMM_LOOKUP(PS_INDEX, Key)
{
    Boolean M[NUM_PS][KEY_WIDTH/CHUNK_SIZE][2^CHUNK_SIZE][LIMIT]; // the EMMM array
    Boolean result[LIMIT];
    int chunk_index, chunk_val, rule_index;
    int class_index;

    // set result vector to 11...1
    for (rule_index = 0; rule_index < LIMIT; rule_index++)
        result[rule_index] = 1;

    // calculate the match vector
    for (chunk_index = 0; chunk_index < KEY_WIDTH/CHUNK_SIZE; chunk_index++) { // for each chunk
        chunk_val = Key[(chunk_index+1)*CHUNK_SIZE-1 : chunk_index*CHUNK_SIZE];
        for (rule_index = 0; rule_index < LIMIT; rule_index++)
            result[rule_index] = result[rule_index] &
                M[PS_INDEX][chunk_index][chunk_val][rule_index];
    }

    // do priority encoding on result
    for (rule_index = 0; rule_index < LIMIT; rule_index++)
        if (result[rule_index]) break;
    class_index = rule_index;

    return class_index;
}
An example of this application is illustrated in Fig. 10. This example uses the same target bit pattern used in traversing the tree in Fig. 6.
As can be seen in Fig. 10, the target bit pattern is chunked into 3 chunks: 11, 00, and 01, each chunk being indexed independently. From the tree traversal example in Fig. 6, the target bit pattern 11 00 01 is to be associated with physical set PS1 (PS_INDEX = 1) for a possible match. As shown in Fig. 10, each chunk of the target bit pattern is independently matched against the relevant portion of the physical set tables in Fig. 9 to determine the partial match vectors.
To explain this process, the first chunk of the target bit pattern (CHUNK_INDEX = 0) can be used as an example. Given that (from Fig. 10):

CHUNK_INDEX = 0, PS_INDEX = 1, CHUNK_VALUE = 11

then, from the PS1 table of Fig. 9, the partial match vector for this chunk and its value is (1 1 1 1). Similarly, the partial match vectors for the other chunks are found. To duplicate the data in Fig. 10, the values for the other chunks and their partial match vectors are:

CHUNK_INDEX = 1, PS_INDEX = 1, CHUNK_VALUE = 00, partial match vector = (1 1 1 1)
CHUNK_INDEX = 2, PS_INDEX = 1, CHUNK_VALUE = 01, partial match vector = (0 1 1 1)

These partial match vectors are then ANDed together in a bitwise manner to give the final match vector. To illustrate this:
Partial match vectors:  (1 1 1 1) & (1 1 1 1) & (0 1 1 1)
Final match vector:     (0 1 1 1)
The final match vector denotes that there is a match in rows 1, 2, and 3 of physical set PS1 (PS_INDEX = 1) for the target bit pattern 11 00 01. It also denotes that there is no match between the target bit pattern and the candidate bit pattern in row 0 of physical set PS1.
To determine which of these matching candidate bit patterns takes precedence, priority between the matching bit patterns must be determined. As noted at the beginning, the priority in the rule set is determined by the order of the bit patterns in the rule set; thus candidate bit pattern 0 in the rule set has the highest priority. In the physical sets the bit patterns are also in order of priority. Thus, the candidate bit pattern in row 1 of physical set PS1 has the highest priority among the matches. For this reason, the result index in Fig. 10 is given as 1. If the matching bit pattern with the highest priority were in position 2, then the result index would equal 2. With result index = 1, this corresponds to rule 3 in the rule set. Clearly, when the rules in the rule set are placed in the physical sets, each rule's ranking or placement in the original rule set is noted. The matching bit pattern is therefore determined to be rule 3 in the original rule set.
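The priority-encoding step described above amounts to finding the first set bit of the final match vector; since entries are stored in priority order, that position selects the winning rule. A minimal Python sketch:

```python
def priority_encode(match_vector):
    """Return the index of the highest-priority (first) matching entry,
    or -1 if the vector contains no match at all."""
    for i, bit in enumerate(match_vector):
        if bit:
            return i
    return -1
```

For the final match vector (0 1 1 1) of the running example, the encoder returns result index 1, which maps back to rule 3 of the original rule set.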
It should be noted that, while the above uses a rule's positioning in the physical set and in the rule set to determine priority, a priority indexing system may be used to eliminate the need for priority placement. A physical set entry may have the format given in Figure 11 to implement priority indexing. The system in Figure 11 assumes a 6-bit pattern and 2-bit chunks of the candidate bit pattern, while position 330 would contain a priority index. If there are multiple matches in a physical set, the priority index would determine priority - a higher priority index denoting either higher or lower priority. Priority encoding would therefore be simpler, albeit at the cost of a few bits per entry. These few bits per entry can be saved by simply ordering the entries in order of priority, albeit at the cost of more logic when placing the entries in the physical set.
With regard to implementing the method described above in hardware, Figure 12 illustrates a block diagram showing the different parts of a possible hardware embodiment. To explain Figure 12, a search key 230 is fed into a Patricia tree search engine 240 and a multiplexer 250. Coupled to the Patricia tree search engine is a RAM bank 260 which contains the Patricia tree data structure. The multiplexer 250 has an output that is received by a merge bus 270. Also input into the merge bus 270 is the output of the Patricia tree search engine 240. The Patricia tree search engine 240 outputs a PS index which indicates which physical set is being referenced, along with the two-bit output of the multiplexer 250. Also input into the merge bus 270 is the output of a sequencer 280. The output of the merge bus 270 is a 20-bit address that is fed into the EMMM array in memory or RAM 290. The output of this array 290 is fed into D flip-flops 300 by way of AND gate 310. The other input of the AND gate 310 is the output of the OR gate 320. OR gate 320 has two inputs, one from a priority encoder 330 and the other from the sequencer 280. The output of the priority encoder 330 is fed into a lookup table 340 along with the PS index or physical set index from the Patricia tree search engine. The ultimate output of the lookup table 340 is the matching bit pattern.
The block diagram of Figure 12 has essentially two major components. The first is the Patricia tree component, comprising the Patricia tree search engine 240 and the memory 260 containing the Patricia tree data structure. The second component is composed of the hardware search engine 290 implementing the EMMM method as outlined above and disclosed in the applicant's application entitled MULTI-FIELD CLASSIFICATION USING ENHANCED MASKED MATCHING. The physical set index from the Patricia tree search engine is sent to the merge bus 270 in conjunction with the two-bit output of the multiplexer 250, which determines the chunks to be examined. The merge bus takes the physical set index and the chunk to be examined, along with the sequencer output which determines which chunk of the search key is to be looked up, and combines all this into a 20-bit address that references the memory 290 containing the EMMM array as described above. The output of this memory 290 is sent through the AND gate 310 for storage in the D flip-flops 300. Essentially, the AND gate 310, D flip-flops 300 and OR gate 320 perform a running AND operation on the results of the lookups into the EMMM array. As can be seen in Figure 12, the output of the D flip-flops 300 is also fed into the priority encoder 330. Once all the results from the EMMM array 290 have been processed by the loop defined by AND gate 310, D flip-flops 300, and OR gate 320, the priority encoder determines from the match vector which of the matching bit patterns has priority over the others. The result of the priority encoder, a result index, is then sent to the lookup table 340 along with the physical set index from the Patricia tree search engine and, based on these two pieces of data, the final matching bit pattern is found from the lookup table.
In terms of the performance of the combined EMMM method and the Patricia tree pruning as outlined above, the following is an analysis of their performance and cost in terms of hardware. In the wildcard Patricia tree, because of the pruning process defined by the Make_tree algorithm, the two subsets representing the child nodes of any internal node must have at least one uncommon element. Hence any leaf node of a wildcard Patricia tree has at least one unique element that does not belong to the remaining leaf nodes. Therefore:

Number of PS's <= Number of Leaf Nodes (number of MSS's) <= Number of rules in the full set.

Also, because the Patricia tree is a binary tree, we know:

Number of Internal Nodes <= Number of Leaf Nodes - 1.

Because the bit index of a descendant node is greater than that of its parent node, and the bit index value is smaller than the width of the key:

Depth of Patricia tree <= KEY_WIDTH + 1.
Memory cost for the Patricia tree:

Patricia tree size = Number of Internal Nodes + Number of Leaf Nodes <= 2 x Number of rules - 1.

Worst-case computation cost of a Patricia tree lookup:

Tree lookup steps = Depth of Patricia tree - 1 <= KEY_WIDTH.

Each tree lookup step consists of one memory access to fetch the tree node and a logical comparison.

EMMM array memory cost = NUM_PS x KEY_WIDTH/CHUNK_SIZE x 2^CHUNK_SIZE x LIMIT.

Memory bandwidth (bits per lookup) = KEY_WIDTH/CHUNK_SIZE x LIMIT.
The parameter NUM_PS should be chosen according to the maximum rule set size and the LIMIT. Because of the overlapping of rules among the PS's, the following equation should be satisfied:

NUM_PS x LIMIT x (1 - overlapping factor) > NUM_RULES

hence:

NUM_PS = NUM_RULES / LIMIT / (1 - overlapping factor).
The following table gives a set of assumptions on possible system parameters:

Name                 Value    Comments
NUM_RULES            256K
KEY_WIDTH            128
CHUNK_SIZE           2
LIMIT                128
Overlapping factor   20%

The memory cost estimation:

Name            Value                           Comments
NUM_PS          256K/128/(1-20%) = 2560
Internal Nodes  43 x 256K = 11 Mb               7 bits for bit index + 18 bits per pointer x 2 = 43 bits per record
Leaf Nodes      12 x 256K = 3.0 Mb              12 bits for PS index (addressing a space of 4096 PS's)
EMMM ARRAY      2560 x 128/2 x 4 x 128 = 80 Mb

The memory bandwidth cost (per lookup):

Name                      Value                        Comments
Internal node accesses    128 x 43 = 5504 bits         fetch internal node records; each record is 43 bits wide
Leaf node accesses        1 x 12 = 12 bits             fetch one leaf node record; each record is 12 bits wide
EMMM accesses             KEY_WIDTH/CHUNK_SIZE =       each access fetches a 128-bit
                          128/2 = 64; 64 x 128 =       wide partial match vector
                          8192 bits
Total memory bandwidth    ~13708 bits/lookup

At OC-48 rates, the classifier needs to have a throughput of 6M lookups per second. This translates into 6M x 13708 ~ 82 Gbps of memory bandwidth. At a 200 MHz clock frequency, the required raw memory bandwidth can be achieved using a 410-bit wide memory bus. The actual design can be optimized by using a mixture of embedded SRAM, embedded DRAM, and off-chip DRAM/SRAM.
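The cost figures above can be checked with a few lines of Python (the parameter names are illustrative; the values are those assumed in the tables: NUM_RULES = 256K, KEY_WIDTH = 128, CHUNK_SIZE = 2, LIMIT = 128, overlapping factor = 20%):

```python
NUM_RULES = 256 * 1024
KEY_WIDTH = 128
CHUNK_SIZE = 2
LIMIT = 128
OVERLAP = 0.20

# number of physical sets needed
num_ps = NUM_RULES / LIMIT / (1 - OVERLAP)                    # 2560

# EMMM array size in bits
emmm_bits = int(num_ps) * (KEY_WIDTH // CHUNK_SIZE) * 2 ** CHUNK_SIZE * LIMIT

# per-lookup memory bandwidth in bits
internal_node_bits = KEY_WIDTH * 43                  # up to 128 fetches of 43-bit records
leaf_node_bits = 1 * 12                              # one 12-bit leaf record
emmm_fetch_bits = (KEY_WIDTH // CHUNK_SIZE) * LIMIT  # 64 fetches of 128-bit vectors
total_bw = internal_node_bits + leaf_node_bits + emmm_fetch_bits
```

This reproduces the 2560 physical sets, the roughly 80 Mb EMMM array (83,886,080 bits), and the 13708 bits-per-lookup bandwidth figure.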
A person understanding the above-described invention may now conceive of alternative designs, using the principles described herein. All such designs which fall within the scope of the claims appended hereto are considered to be part of the present invention.

Claims (12)

We Claim:
1. A method of increasing the efficiency of a process for finding a match between a target bit pattern and at least one of a plurality of candidate bit patterns, the method comprising:
a) dividing said candidate bit patterns into specific groups such that for every group, members of that group have bit agreement with every other member in said group; and b) applying said process only to groups whose members have bit agreement with said target bit pattern.
2. A method as in claim 1 wherein said dividing is accomplished by:
a1) defining a working bit position;
a2) grouping said candidate bit patterns based on bit values at said bit position such that for every group all members in that group have the same bit value at said bit position;
a3) defining a new working bit position;
a4) applying steps a1) - a3) to every group such that members of resulting subgroups have bit agreement at all bit positions up to the new working bit position; and a5) applying steps a2) - a4) to all resulting subgroups such that final resulting subgroups have bit agreement at all bit positions, wherein each new working bit position is subsequent to its preceding working bit position.
3. A method of increasing the efficiency of a process for finding a match between a target bit pattern and at least one of a plurality of candidate bit patterns, the method comprising:
a) dividing said candidate bit patterns into search space groups, each group having an aggregate number of members lesser than the total aggregate number of candidate bit patterns; and b) applying said process only to groups whose members are in bit agreement with the target bit pattern on at least a first bit position.
4. A method as in claim 3 wherein said dividing is accomplished by:
a1) defining a working bit position;
a2) grouping said candidate bit patterns based on bit values at said bit position such that for every group all members in that group have the same bit value at said bit position;
a3) defining a new working bit position;
a4) applying steps a1) - a3) to every group such that members of resulting subgroups have bit agreement at all bit positions up to the new working bit position; and a5) applying steps a2) - a4) to all resulting subgroups such that final resulting subgroups have bit agreement at all bit positions, wherein each new working bit position is subsequent to its preceding working bit position.
5. A method as in claim 3 further including combining at least two of said search space groups into a minimum search space group.
6. A method of increasing the efficiency of a process for finding a match between a target bit pattern and at least one of a plurality of candidate bit patterns, the method comprising grouping said candidate bit patterns based on a value of at least one specific bit position of said candidate bit patterns whose value of said at least one specific bit position does not match a value of said target bit pattern.
7. A method of increasing the efficiency of a process for finding a match between a target bit pattern and at least one of a plurality of candidate bit patterns, the method comprising:
a) grouping said candidate bit patterns based on a value of a specific bit in a specific bit position in said candidate bit patterns;
b) discarding candidate bit patterns which have a specific bit in the specific bit position whose value does not match the value of the corresponding bit in the target bit pattern;
c) repeating steps a)- b) to the remaining candidate bit patterns with different bit positions until the number of remaining candidate bit patterns is at a minimum; and d) applying said process to the remaining candidate bit patterns.
8. A system for matching a target bit pattern with at least one of a plurality of candidate bit patterns, the system comprising:
- dividing means for dividing said candidate bit patterns into specific groups such that for every group, members of that group have bit agreement with every other member in said group; and - mask matching means for matching said target bit pattern with at least one of said candidate bit patterns, said mask matching means being applied only to groups whose members are in bit agreement with the target bit pattern on at least a first bit position.
9. A system as in claim 8 wherein said dividing means executes the following method:
a1) defining a working bit position;
a2) grouping said candidate bit patterns based on bit values at said bit position such that for every group all members in that group have the same bit value at said bit position;
a3) defining a new working bit position;
a4) applying steps a1) - a3) to every group such that members of resulting subgroups have bit agreement at all bit positions up to the new working bit position; and a5) applying steps a2) - a4) to all resulting subgroups such that final resulting subgroups have bit agreement at all bit positions, wherein each new working bit position is subsequent to its preceding working bit position.
10. A system for increasing the efficiency of a mask matching system for matching a target bit pattern with at least one of a plurality of candidate bit patterns, the system comprising:
- grouping means for grouping said candidate bit patterns into specific groups, said grouping being based on a value of at least one specific bit position of said candidate bit patterns such that, for each group, members of a group have bit agreement with other members of said group; and
- classification means for classifying said specific groups into subgroups, at least one candidate subgroup being in bit agreement with said target bit pattern on at least one bit position, wherein said mask matching system is applied only to said at least one candidate subgroup.
11. Computer readable media having encoded thereon computer readable and computer executable code for executing a method for increasing the efficiency of a process for finding a match between a target bit pattern and at least one of a plurality of given candidate bit patterns, said method comprising:
a) grouping said candidate bit patterns based on a value of a specific bit in a specific bit position in said candidate bit patterns;
b) discarding candidate bit patterns which have a specific bit in the specific bit position whose value does not match the value of the corresponding bit in the target bit pattern;
c) repeating steps a)-b) on the remaining candidate bit patterns with different bit positions until the number of remaining candidate bit patterns is at a minimum; and
d) applying said process to the remaining candidate bit patterns.
12. A method of increasing the efficiency of a mask matching process for finding a match between a target bit pattern and at least one of a plurality of given candidate bit patterns, said candidate bit patterns being divided into groups according to bit agreement between said candidate bit patterns, the method comprising:
- receiving said target bit pattern;
- determining which of said groups is in most bit agreement with said target bit pattern; and - applying said process to at least one group that is in most bit agreement with said target bit pattern.
CA002364316A 2001-09-17 2001-12-03 Improved efficiency masked matching Abandoned CA2364316A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/953,215 US7054315B2 (en) 2001-09-17 2001-09-17 Efficiency masked matching
US09/953,215 2001-09-17

Publications (1)

Publication Number Publication Date
CA2364316A1 true CA2364316A1 (en) 2003-03-17

Family

ID=25493716

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002364316A Abandoned CA2364316A1 (en) 2001-09-17 2001-12-03 Improved efficiency masked matching

Country Status (2)

Country Link
US (1) US7054315B2 (en)
CA (1) CA2364316A1 (en)

Families Citing this family (87)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7487200B1 (en) * 1999-09-23 2009-02-03 Netlogic Microsystems, Inc. Method and apparatus for performing priority encoding in a segmented classification system
US7382787B1 (en) 2001-07-30 2008-06-03 Cisco Technology, Inc. Packet routing and switching device
US7418536B2 (en) * 2001-07-30 2008-08-26 Cisco Technology, Inc. Processor having systolic array pipeline for processing data packets
US7058724B1 (en) * 2001-11-02 2006-06-06 Cisco Technology, Inc. Arrangement for routing a received signaling message based on a selected summary route in an SS7 network
KR100456671B1 (en) * 2001-11-24 2004-11-10 주식회사 케이티 Parallel lookup engine and method for fast packet forwarding in network router
US7231383B2 (en) * 2002-05-01 2007-06-12 Lsi Corporation Search engine for large-width data
US7899067B2 (en) * 2002-05-31 2011-03-01 Cisco Technology, Inc. Method and apparatus for generating and using enhanced tree bitmap data structures in determining a longest prefix match
US7525904B1 (en) 2002-06-20 2009-04-28 Cisco Technology, Inc. Redundant packet routing and switching device and method
US7450438B1 (en) 2002-06-20 2008-11-11 Cisco Technology, Inc. Crossbar apparatus for a forwarding table memory in a router
US7710991B1 (en) 2002-06-20 2010-05-04 Cisco Technology, Inc. Scalable packet routing and switching device and method
US6934252B2 (en) * 2002-09-16 2005-08-23 North Carolina State University Methods and systems for fast binary network address lookups using parent node information stored in routing table entries
US7274699B2 (en) * 2002-09-20 2007-09-25 Caterpillar Inc Method for setting masks for message filtering
US7117196B2 (en) * 2002-11-22 2006-10-03 International Business Machines Corporation Method and system for optimizing leaf comparisons from a tree search
US7536476B1 (en) * 2002-12-20 2009-05-19 Cisco Technology, Inc. Method for performing tree based ACL lookups
US7216321B2 (en) * 2003-03-10 2007-05-08 Atrenta, Inc. Pattern recognition in an integrated circuit design
US7200785B2 (en) * 2003-03-13 2007-04-03 Lsi Logic Corporation Sequential tester for longest prefix search engines
US7415463B2 (en) * 2003-05-13 2008-08-19 Cisco Technology, Inc. Programming tree data structures and handling collisions while performing lookup operations
US7382777B2 (en) * 2003-06-17 2008-06-03 International Business Machines Corporation Method for implementing actions based on packet classification and lookup results
US7516126B2 (en) * 2003-06-30 2009-04-07 Intel Corporation Method and apparatus to perform a multi-field matching search
US7840696B2 (en) * 2003-07-25 2010-11-23 Broadcom Corporation Apparatus and method for classifier identification
US7633886B2 (en) * 2003-12-31 2009-12-15 University Of Florida Research Foundation, Inc. System and methods for packet filtering
US7366728B2 (en) * 2004-04-27 2008-04-29 International Business Machines Corporation System for compressing a search tree structure used in rule classification
US7412431B2 (en) * 2004-04-27 2008-08-12 International Business Machines Corporation Method for managing multi-field classification rules relating to ingress
US7454396B2 (en) * 2004-04-27 2008-11-18 International Business Machines Corporation Method for compressing multi-field rule specifications
EP1762079A1 (en) 2004-06-23 2007-03-14 Qualcomm Incorporated Efficient classification of network packets
US7711893B1 (en) * 2004-07-22 2010-05-04 Netlogic Microsystems, Inc. Range code compression method and apparatus for ternary content addressable memory (CAM) devices
US7725450B1 (en) 2004-07-23 2010-05-25 Netlogic Microsystems, Inc. Integrated search engine devices having pipelined search and tree maintenance sub-engines therein that maintain search coherence during multi-cycle update operations
US7747599B1 (en) 2004-07-23 2010-06-29 Netlogic Microsystems, Inc. Integrated search engine devices that utilize hierarchical memories containing b-trees and span prefix masks to support longest prefix match search operations
US8886677B1 (en) * 2004-07-23 2014-11-11 Netlogic Microsystems, Inc. Integrated search engine devices that support LPM search operations using span prefix masks that encode key prefix length
US20060045088A1 (en) * 2004-08-25 2006-03-02 Nokia Inc. Method of using Patricia tree and longest prefix match for policy-based route look-up
US8005084B2 (en) * 2004-11-30 2011-08-23 Broadcom Corporation Mirroring in a network device
US7680107B2 (en) * 2004-11-30 2010-03-16 Broadcom Corporation High speed trunking in a network device
US7826481B2 (en) * 2004-11-30 2010-11-02 Broadcom Corporation Network for supporting advance features on legacy components
US7715384B2 (en) 2004-11-30 2010-05-11 Broadcom Corporation Unicast trunking in a network device
US7830892B2 (en) 2004-11-30 2010-11-09 Broadcom Corporation VLAN translation in a network device
US7554984B2 (en) * 2004-11-30 2009-06-30 Broadcom Corporation Fast filter processor metering and chaining
US8014390B2 (en) * 2004-11-30 2011-09-06 Broadcom Corporation Policy based routing using a fast filter processor
US7889712B2 (en) 2004-12-23 2011-02-15 Cisco Technology, Inc. Methods and apparatus for providing loop free routing tables
US20060221956A1 (en) * 2005-03-31 2006-10-05 Narayan Harsha L Methods for performing packet classification via prefix pair bit vectors
US20060221967A1 (en) * 2005-03-31 2006-10-05 Narayan Harsha L Methods for performing packet classification
US7668160B2 (en) * 2005-03-31 2010-02-23 Intel Corporation Methods for performing packet classification
US7551609B2 (en) * 2005-10-21 2009-06-23 Cisco Technology, Inc. Data structure for storing and accessing multiple independent sets of forwarding information
US7825777B1 (en) 2006-03-08 2010-11-02 Integrated Device Technology, Inc. Packet processors having comparators therein that determine non-strict inequalities between applied operands
US7298636B1 (en) 2006-03-08 2007-11-20 Integrated Device Technology, Inc. Packet processors having multi-functional range match cells therein
CN101554021A (en) * 2006-08-02 2009-10-07 佛罗里达大学研究基金会股份有限公司 Succinct representation of static packet classifiers
US7697518B1 (en) 2006-09-15 2010-04-13 Netlogic Microsystems, Inc. Integrated search engine devices and methods of updating same using node splitting and merging operations
US20080101222A1 (en) * 2006-10-30 2008-05-01 David Alan Christenson Lightweight, Time/Space Efficient Packet Filtering
US7987205B1 (en) 2006-11-27 2011-07-26 Netlogic Microsystems, Inc. Integrated search engine devices having pipelined node maintenance sub-engines therein that support database flush operations
US8086641B1 (en) 2006-11-27 2011-12-27 Netlogic Microsystems, Inc. Integrated search engine devices that utilize SPM-linked bit maps to reduce handle memory duplication and methods of operating same
US7953721B1 (en) 2006-11-27 2011-05-31 Netlogic Microsystems, Inc. Integrated search engine devices that support database key dumping and methods of operating same
US7831626B1 (en) 2006-11-27 2010-11-09 Netlogic Microsystems, Inc. Integrated search engine devices having a plurality of multi-way trees of search keys therein that share a common root node
JP4995125B2 (en) * 2008-03-12 2012-08-08 株式会社アイピーティ How to search fixed length data
US9489221B2 (en) * 2008-06-25 2016-11-08 Microsoft Technology Licensing, Llc Matching based pattern inference for SMT solvers
CN101478551B (en) * 2009-01-19 2011-12-28 清华大学 Multi-domain network packet classification method based on multi-core processor
CN101925109B (en) 2009-06-16 2012-12-26 华为技术有限公司 Method and device for controlling channel mapping
EP2445289B1 (en) * 2009-06-16 2015-09-23 Huawei Technologies Co., Ltd. Method for mapping control channel, method for detecting control channel and device thereof
US8085603B2 (en) * 2009-09-04 2011-12-27 Integrated Device Technology, Inc. Method and apparatus for compression of configuration bitstream of field programmable logic
EP2552059B1 (en) * 2010-03-24 2014-12-03 Nec Corporation Packet transfer system, control apparatus, transfer apparatus, method of creating processing rules, and program
US20120310941A1 (en) * 2011-06-02 2012-12-06 Kindsight, Inc. System and method for web-based content categorization
US10229139B2 (en) 2011-08-02 2019-03-12 Cavium, Llc Incremental update heuristics
CN103377261A (en) * 2012-04-28 2013-10-30 瑞昱半导体股份有限公司 Access control list management device, executive device and method
US9049200B2 (en) * 2012-07-27 2015-06-02 Cisco Technology, Inc. System and method for improving hardware utilization for a bidirectional access controls list in a low latency high-throughput network
US9031932B2 (en) * 2012-09-06 2015-05-12 Oracle International Corporation Automatic denormalization for analytic query processing in large-scale clusters
US10083200B2 (en) * 2013-03-14 2018-09-25 Cavium, Inc. Batch incremental update
US9595003B1 (en) 2013-03-15 2017-03-14 Cavium, Inc. Compiler with mask nodes
US9195939B1 (en) 2013-03-15 2015-11-24 Cavium, Inc. Scope in decision trees
US10229144B2 (en) 2013-03-15 2019-03-12 Cavium, Llc NSP manager
JP6221501B2 (en) * 2013-08-19 2017-11-01 富士通株式会社 NETWORK SYSTEM, ITS CONTROL METHOD, NETWORK CONTROL DEVICE, AND ITS CONTROL PROGRAM
US10503716B2 (en) * 2013-10-31 2019-12-10 Oracle International Corporation Systems and methods for generating bit matrices for hash functions using fast filtering
US9596215B1 (en) * 2015-04-27 2017-03-14 Juniper Networks, Inc. Partitioning a filter to facilitate filtration of packets
US20160335296A1 (en) * 2015-05-14 2016-11-17 Blue Sage Communications, Inc. Memory System for Optimized Search Access
US10496680B2 (en) 2015-08-17 2019-12-03 Mellanox Technologies Tlv Ltd. High-performance bloom filter array
US10049126B2 (en) 2015-09-06 2018-08-14 Mellanox Technologies Tlv Ltd. Cuckoo hashing with selectable hash
US10068034B2 (en) * 2016-09-07 2018-09-04 Mellanox Technologies Tlv Ltd. Efficient matching of TCAM rules using hash tables in RAM
US10491521B2 (en) 2017-03-26 2019-11-26 Mellanox Technologies Tlv Ltd. Field checking based caching of ACL lookups to ease ACL lookup search
US10476794B2 (en) 2017-07-30 2019-11-12 Mellanox Technologies Tlv Ltd. Efficient caching of TCAM rules in RAM
US10747783B2 (en) 2017-12-14 2020-08-18 Ebay Inc. Database access using a z-curve
US11327974B2 (en) 2018-08-02 2022-05-10 Mellanox Technologies, Ltd. Field variability based TCAM splitting
US11003715B2 (en) 2018-09-17 2021-05-11 Mellanox Technologies, Ltd. Equipment and method for hash table resizing
US10944675B1 (en) 2019-09-04 2021-03-09 Mellanox Technologies Tlv Ltd. TCAM with multi region lookups and a single logical lookup
US11184282B1 (en) * 2020-04-17 2021-11-23 Vmware, Inc. Packet forwarding in a network device
US11539622B2 (en) 2020-05-04 2022-12-27 Mellanox Technologies, Ltd. Dynamically-optimized hash-based packet classifier
US11782895B2 (en) 2020-09-07 2023-10-10 Mellanox Technologies, Ltd. Cuckoo hashing including accessing hash tables using affinity table
JP2024022699A (en) * 2020-11-06 2024-02-21 株式会社Preferred Networks Information processing device, information processing method, and computer program
US11917042B2 (en) 2021-08-15 2024-02-27 Mellanox Technologies, Ltd. Optimizing header-based action selection
US11929837B2 (en) 2022-02-23 2024-03-12 Mellanox Technologies, Ltd. Rule compilation schemes for fast packet classification
US11968285B2 (en) 2022-02-24 2024-04-23 Mellanox Technologies, Ltd. Efficient memory utilization for cartesian products of rules

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5995971A (en) 1997-09-18 1999-11-30 Microsoft Corporation Apparatus and accompanying methods, using a trie-indexed hierarchy forest, for storing wildcard-based patterns and, given an input key, retrieving, from the forest, a stored pattern that is identical to or more general than the key
US5956721A (en) 1997-09-19 1999-09-21 Microsoft Corporation Method and computer program product for classifying network communication packets processed in a network stack
US6341130B1 (en) * 1998-02-09 2002-01-22 Lucent Technologies, Inc. Packet classification method and apparatus employing two fields
US6289013B1 (en) * 1998-02-09 2001-09-11 Lucent Technologies, Inc. Packet filter method and apparatus employing reduced memory
US6560610B1 (en) * 1999-08-10 2003-05-06 Washington University Data structure using a tree bitmap and method for rapid classification of data in a database

Also Published As

Publication number Publication date
US7054315B2 (en) 2006-05-30
US20030123459A1 (en) 2003-07-03

Similar Documents

Publication Publication Date Title
US7054315B2 (en) Efficiency masked matching
Taylor Survey and taxonomy of packet classification techniques
US7116663B2 (en) Multi-field classification using enhanced masked matching
US7536476B1 (en) Method for performing tree based ACL lookups
US10476794B2 (en) Efficient caching of TCAM rules in RAM
Gupta et al. Algorithms for packet classification
Spitznagel et al. Packet classification using extended TCAMs
Lakshminarayanan et al. Algorithms for advanced packet classification with ternary CAMs
US7668160B2 (en) Methods for performing packet classification
Li et al. Tuple space assisted packet classification with high performance on both search and update
Iyer et al. ClassiPl: an architecture for fast and flexible packet classification
Meiners et al. Hardware based packet classification for high speed internet routers
US8375165B2 (en) Bit weaving technique for compressing packet classifiers
Nikitakis et al. A memory-efficient FPGA-based classification engine
Pao et al. A multi-pipeline architecture for high-speed packet classification
Pao et al. Efficient packet classification using TCAMs
Li et al. TabTree: A TSS-assisted bit-selecting tree scheme for packet classification with balanced rule mapping
Meiners et al. Topological transformation approaches to optimizing TCAM-based packet classification systems
Waldvogel Multi-dimensional prefix matching using line search
Tan et al. Mbittree: A fast and scalable packet classification for software switches
Macián et al. An evaluation of the key design criteria to achieve high update rates in packet classifiers
Lim et al. High-speed packet classification using binary search on length
Erdem et al. Clustered hierarchical search structure for large-scale packet classification on FPGA
Pao et al. Parallel tree search: An algorithmic approach for multi-field packet classification
Taylor et al. On using content addressable memory for packet classification

Legal Events

Date Code Title Description
EEER Examination request
FZDE Discontinued