WO2003030370A2

WO2003030370A2 - Method of decoding a turbo product code utilizing a scalable and hardware efficient forward error correction decoder

Info

Publication number: WO2003030370A2
Application number: PCT/US2002/031812
Authority: WO
Inventors: Anthony M. Jones; Paul Wasson; Peter Gentle; Chris Walker; David J. Casebolt; Edward R. Coulter; Alan R. Danielson; Jason A. Dearien; Jason G. Franklin; Nathan J. Hungerford
Original assignee: Comtech Aha Corporation
Priority date: 2001-10-04
Filing date: 2002-10-04
Publication date: 2003-04-10
Also published as: AU2002334863A1; WO2003030370A3

Abstract

A method and apparatus for decoding a codeword having a plurality of bits, wherein each bit includes a soft value. The apparatus includes a plurality of bit cells (10) arranged in a first array (100). Stored within each bit cell is the soft value for a corresponding bit of the codeword. The apparatus also includes a controller module (106) coupled to the first array. The controller module performs a component decode on each soft value in the plurality of bit cells, rotating the soft values between each of the bit cells along the first array by using a connection scheme (110).

Description

METHOD OF DECODING A TURBO PRODUCT CODE UTILIZING A SCALABLE AND HARDWARE EFFICIENT FORWARD ERROR CORRECTION DECODER

Related Application

This Patent Application claims priority under 35 U.S.C. 119(e) of the co-pending U.S. Provisional Patent Application, Serial No. 60/327,646 filed October 4, 2001, and entitled "SCALABLE AND REGULAR HARDWARE-EFFICIENT FORWARD ERROR CORRECTION DECODER". The Provisional Patent Application, Serial No. 60/327,646 filed October 4, 2001, and entitled "SCALABLE AND REGULAR HARDWARE-EFFICIENT FORWARD ERROR CORRECTION DECODER" is also hereby incorporated by reference.

Field of the Invention

The present invention relates to an apparatus and method of decoding, in general, and in particular, an apparatus and method of decoding a turbo product code utilizing a scalable and hardware efficient decoder.

Background of the Invention

Existing decoding systems utilize several module assemblies which are routed to one another to decode encoded data which enters within the system. Such existing decoding systems may include several RAMs as well as several Soft In-Soft Out (SISO) decoders which pass the encoded data back and forth to iteratively decode the data, find the lowest confidence values and the nearest neighbors. Conventional techniques for decoding iterative codes generally have a long latency to allow for many iterations. The latency is generally many data block times. In most communication systems excessive latency cannot be tolerated, so iterations are pipelined and run in parallel to decrease the latency. This increases the amount of logic, especially for large block sizes that give the best coding performance.

Most importantly, the codes used in TPC blocks require that each bit can affect the new value of every other bit during decoding. This generates large amounts of global interconnect, exacerbated by increased parallelism. Global interconnect routing is difficult to manage and is not scalable or regular which generally creates timing problems in the back-end.

Existing decoders also have a number of disadvantages. The design of the existing decoding systems is not scalable, because the hardware performance of the decoders is dependent on the block size of the turbo product code as well as the number of soft value bits that are input. In addition, the design of existing decoding systems uses longer interconnect routing, which causes difficulty in the back-end assembly of the system and requires a large logic density within the system. Moreover, the long interconnect routing between decoding components in the existing decoding systems causes a large amount of power dissipation in the decoder. In effect, the turnaround time and the area of silicon consumed by the decoder, particularly for large block size applications, are increased. Further, existing decoding systems are not designed to have a fixed latency or to use time efficiently, due to the difficulty of performing many parallel operations at one time.

What is needed is a scalable decoding system having routing configurations which do not need to be changed for turbo block codes of different sizes. What is also needed is a decoding system in which all of the decoding components are localized and the routing between all the decoding components is known. In addition, what is needed is a decoding system in which decoding of large block sizes can be performed at high decoding rates without extensive routing between decoding components.

Summary of the Invention

In one aspect of the invention, a decoder decodes an encoded codeword having a plurality of bits. Each of the plurality of bits includes a soft value. The decoder comprises a plurality of bit cells that are arranged in a first array. Each bit cell stores within it the soft value for a corresponding bit. A controller module is coupled to the first array of bit cells. The controller module performs a component decode on each soft value in the plurality of bit cells. The controller module rotates the soft values between each of the bit cells along the first array using a connection scheme.

In another aspect of the present invention, a decoder decodes an encoded turbo product code block having (n,k) bits in a first array. The decoder comprises 'n' number of bit cells arranged in the first array. Each bit cell receives a soft value for a corresponding bit. A controller module is coupled to the first array of bit cells. The controller performs a component decode on each soft value by shifting the soft value along each bit module in the first array.

In yet another aspect of the present invention, a method decodes an encoded codeword having a plurality of encoded bits. The method comprises the step of receiving the encoded codeword, wherein each received encoded bit is loaded into a corresponding bit cell in an array of bit cells. The method comprises the step of calculating a syndrome result for the encoded codeword, wherein the syndrome result is calculated by comparing a minimum input bit confidence between each bit cell in the array. The method comprises the step of determining a nearest neighbor code set for each bit from the syndrome result, wherein a nearest neighbor confidence value is stored in each bit cell in the array. The method comprises the step of calculating a pair of lowest difference metric values for the encoded codeword, wherein the lowest difference metric values are calculated by summing a minimum sum value in each bit cell. The method includes the step of generating an extrinsic LLR value for each bit in the encoded codeword, wherein the extrinsic LLR value is determined from a lowest difference metric value from a current bit pair confidence value.

Other features and advantages of the present invention will become apparent after reviewing the detailed description of the preferred embodiments set forth below.

Brief Description of the Drawings

Figure 1a illustrates a 4x4 bit cell array block diagram of the decoder in accordance with the present invention.

Figure 1b illustrates placement of a plurality of local controllers within a 4x4 bit cell array in accordance with the present invention.

Figure 2a illustrates a general block diagram of the decoder in accordance with the present invention.

Figure 2b illustrates a rotation order of soft bits for a 4 bit cell array in accordance with the present invention.

Figure 3 illustrates a block diagram of a bit cell of the preferred embodiment in accordance with the present invention.

Figures 4a-4c illustrate a shift operation of a hyper dimension of a 4x4 bit array in accordance with the present invention.

Figures 5a-5b illustrate several "figure 8" configurations of the preferred embodiment in accordance with the present invention.

Figure 6 illustrates the division order of input bits into the decoder according to the preferred embodiment of the present invention.

Figure 7 illustrates a block diagram of the syndrome and parity calculation process according to the preferred embodiment of the present invention.

Figure 8 illustrates a detailed block diagram of the difference metric calculation process according to the preferred embodiment of the present invention.

Figure 9 illustrates a decoder block diagram of the extrinsic LLR calculation process according to the preferred embodiment of the present invention.

Figure 10 illustrates a hyper dimension rotation process of the bit cells according to the present invention.

Figures 11a-11b illustrate a three dimensional TPC block and a corresponding super cell in accordance with an alternative embodiment of the present invention.

Figure 12 illustrates several super cell connection schemes of the decoder in accordance with an alternative embodiment of the present invention.

Figures 13a-13c illustrate a hyper dimension rotation process of a super cell array in accordance with an alternative embodiment of the present invention.

Figure 14 illustrates a block diagram of the 8x4 super cell array in accordance with an alternative embodiment of the present invention.

Figure 15 illustrates a block diagram of a SISO tile of an alternative embodiment of the present invention.

Figure 16 illustrates an overall block diagram of the decoder in accordance with an alternative embodiment of the present invention.

Figure 17 illustrates a block diagram of the j bit cell and the k bit cell in accordance with an alternative embodiment of the present invention.

Figure 18 illustrates a shift operation between bit cells in accordance with an alternative embodiment of the present invention.

Figure 19 illustrates a block diagram of the syndrome and parity calculation process according to an alternative embodiment of the present invention.

Figure 20 illustrates a detailed block diagram of the difference metric calculation process according to an alternative embodiment of the present invention.

Figure 21 illustrates a block diagram of the extrinsic LLR calculation process according to an alternative embodiment of the present invention.

Figure 22 illustrates a block diagram of the shifting of bits between bit cells according to an alternative embodiment of the present invention.

Figure 23 illustrates an overall block diagram of the processing of bits through a SISO tile according to an alternative embodiment of the present invention.

Detailed Description of the Preferred Embodiment

Reference will now be made in detail to the preferred and alternative embodiments of the invention, examples of which are illustrated in the accompanying drawings. While the invention will be described in conjunction with the preferred embodiments, it will be understood that they are not intended to limit the invention to these embodiments. On the contrary, the invention is intended to cover alternatives, modifications and equivalents, which may be included within the spirit and scope of the invention as defined by the appended claims. Furthermore, in the following detailed description of the present invention, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it should be noted that the present invention may be practiced without these specific details. In other instances, well known methods, procedures, components, and circuits have not been described in detail so as not to unnecessarily obscure aspects of the present invention.

The present invention is directed to an apparatus for and method of generating a scalable and regular forward error correction decoder system. The decoder 1 of the present invention preferably utilizes turbo product codes (TPCs), extended Hamming codes as well as parity codes, but is not limited to these codes. In addition, the decoder 1 of the present invention preferably decodes TPCs of two or three dimensions, whereby the TPC codes have a possible additional diagonal dimension, denoted as an enhanced or hyper dimension. Generally, the decoder 1 preferably uses soft input values and iterates the values through all desired dimensions of the bit cell array or SISO tile 100 to generate values indicating how each soft value should be changed. The decoder 1 preferably uses an architecture based on a systolic approach, whereby the data is processed and decoded using nearest-neighbor connections. The use of nearest-neighbor connections in the present decoder 1 overcomes a deficiency in existing decoders by removing all of the global interconnect associated with the flow of data between decoding components. In addition, the decoder 1 of the present invention is also advantageous over prior art decoders, because all of the soft values are distributed in a plurality of bit cells. Each bit cell then performs the component decode on its respective soft value bit and provides a result to a series of controllers, whereby the controllers determine the nearest neighbors to decode the block.

Figure 1a illustrates a general block diagram of a two-dimensional bit cell array 100 having 4x4 bit cells 10 in accordance with the present invention. Generally, each box shown in Figure 1a represents a bit cell module or bit cell 10, whereby each bit cell 10 receives one or more soft values for a corresponding bit. For example, as shown in Figure 2a, for an encoded codeword which has a turbo product code block of 256x256 bits, the decoder will preferably have 256x256 bit cells 10, wherein each bit cell contains one soft value bit. In addition, as shown in Figure 1a, a set of bit cells 10 are grouped together to form a tile of soft-in soft-out (SISO) decoding modules or a SISO tile 100. For example, the decoder array 100 shown in Figure 2a includes 256x256 bit cells 10 grouped to form 8x8 SISO tiles 100, whereby each SISO tile 100 includes 8x8 bit cells 10. Thus, there are 8 rows and 8 columns of bit cells 10 in each SISO tile 100.

Preferably, as shown in Figure 1a, a local controller 104 is coupled to a respective row and column of bit cells 10 to control a local calculation loop. For example, as shown in Figure 1a, local controller 104a controls the row of bit cells including bit cells 10a, 10b, 10e and 10f. In addition, local controller 104a controls the column of bit cells including 10a, 10c, 10i, and 10k. Similarly, as shown in Figure 1a, local controller 104b preferably controls the row of bit cells including bit cells 10c, 10d, 10g and 10h. In addition, local controller 104b preferably controls the column of bit cells including 10b, 10d, 10j, and 10m. It should be noted that the local controller 104 can alternatively be configured to control any combination of rows and columns. For instance, local controller 104b can be configured to control the row of bit cells including bit cells 10c, 10d, 10g and 10h and the column including 10e, 10g, 10n, and 10q.

Preferably, one local controller 104 controls one x axis and one y axis of the bit cell 10 array. Alternatively, one local controller 104 controls a row of bit cells 10 whereas another local controller 104 controls a column of bit cells 10. As shown in Figure 1b, local controllers 104a-d are placed in a row configuration between columns of the bit cells which the controller respectively controls. Thus, as shown in Figure 1b, local controller 104a is placed in between bit cells 10a, 10b, 10c and 10d. Similarly, local controller 104b is placed in between bit cells 10e, 10f, 10g and 10h and so on. Alternatively, the local controllers 104 are placed in a diagonal configuration along the bit cell 10 array in the SISO 100.

Thus, each row and column of bit cells 10 in a local calculation loop is coupled to a local controller module 104, whereby the local controller controls the rotations or shifts of the soft values in each of the bit cells 10 within the row and column. Each local controller 104 preferably controls a number of bit cells 10, whereby the bit cells 10 communicate with one another and the local controller 104. In addition, each local controller 104 communicates with the bit cells 10 in the local calculation loop as well as other local controllers in the SISO tile 100. Preferably, the bit cells 10 in a local loop rotate the soft values between themselves as well as to their respective local controller 104. In addition, as shown in Figure 1a, each local controller 104 is coupled to a global controller module 106. The global controller module 106 controls each of the local controller modules 104 to process the entire rows and columns of the TPC block.

Figure 3 illustrates a block diagram of the preferred embodiment of an individual bit cell 10 in accordance with the present invention. As shown in Figure 3, the bit cell 10 preferably includes a spin register 12, a difference metric register 14, a previous result register 16, an original data register 18 and a shadow register 20. It is preferred that each bit cell 10 in the present invention has the same register components. Alternately, the bit cell 10 includes other or additional components, including but not limited to logic modules, latch registers and other RAMs (not shown). As discussed above in Figure 1a, each bit cell 10 is coupled to a local controller module 104, whereby the local controller 104 preferably controls the bit cells 10 externally. Alternately, each bit cell 10 includes the local controller module 104 within.

As discussed above, an array of bit cells 10 configured together forms a SISO tile 100. As shown in Figure 2a, each SISO tile 100 and each bit cell 10 within is preferably coupled to an input module 108 in which the input module 108 provides soft values to corresponding bit cells 10 in the SISO 100. Preferably, the soft values for a particular bit are input into the shadow register 20 of the particular bit cell 10. The shadow register 20 for each bit cell 10 preferably contains one soft value for the bit corresponding to that bit cell 10. Alternately, the shadow register 20 in a bit cell 10 contains more than one soft value for the corresponding bit. The original data register 18 in the bit cell 10 stores the original starting soft values of the bit associated with the bit cell 10 for a given block. The previous result register 16 in the bit cell 10 stores the decoded result from the previous block. The difference metric register 14 in the bit cell 10 in Figure 3 stores the difference metrics of the bit for each block. The spin register 12 in the bit cell 10 is used to generate the input for the next axis iteration as well as generate the output of the final result after the final block has been decoded. The details of how each of these components work together will be discussed in more detail below.
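The register set of a bit cell can be pictured with a small software model. The following is only a sketch: the class name, field names, integer soft values and the two helper methods are illustrative assumptions, and the register numbers in the comments are taken from the description of Figure 3.

```python
from dataclasses import dataclass


@dataclass
class BitCell:
    """Hypothetical software model of one bit cell 10 (not the hardware design)."""
    shadow: int = 0             # shadow register 20: soft value received from the input module
    original_data: int = 0      # original data register 18: starting soft value for the block
    previous_result: int = 0    # previous result register 16: previously decoded result
    difference_metric: int = 0  # difference metric register 14
    spin: int = 0               # spin register 12: forms the input for the next axis iteration

    def load(self, soft_value: int) -> None:
        """Accept a soft value into the shadow register."""
        self.shadow = soft_value

    def start_block(self) -> None:
        """Copy the loaded value into the working registers (assumed behaviour)."""
        self.original_data = self.shadow
        self.spin = self.shadow


cell = BitCell()
cell.load(-5)
cell.start_block()
print(cell)
```

Keeping the shadow register separate from the working registers in this model reflects the idea that a new soft value can be loaded without disturbing the value being decoded; whether the hardware overlaps loading and decoding in exactly this way is an assumption.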

The general overview of the workings of the bit cell 10 array or SISO 100 will now be discussed. As shown in Figure 1a, SISO tile 102a includes bit cells 10a, 10b, 10c, and 10d whereas SISO tile 102b includes bit cells 10e, 10f, 10g, and 10h. In addition, SISO tile 102c includes bit cells 10i, 10j, 10k and 10m, whereas SISO tile 102d includes bit cells 10n, 10p, 10q and 10r. Also, as shown in Figure 1a, row 1 is designated as having bit cells 10a, 10b, 10e and 10f whereas row 2 in the bit array includes bit cells 10c, 10d, 10g and 10h. In addition, column 1 in the bit array, as shown in Figure 1a, includes bit cells 10a, 10c, 10i and 10k, whereas column 2 includes 10b, 10d, 10j and 10m.

As stated above, the decoder 100 of the present invention decodes the bit cells 10 in local calculation loops as well as global calculation loops, as indicated by the arrows shown in Figure 1a, to find the nearest neighbor values. In addition, the array of bit cells 10 in the SISO 100 includes several multiplexers 110 which control processing of the local and global calculation loops. In the local operation or local calculation loop, the local controller 104 activates the multiplexer or gate 110 to be closed such that the soft values are rotated only between the bit cells in a row or column in the SISO tile 102. For instance, in the local calculation loop for row 1, the local controller 104a iteratively shifts or rotates the soft values between the bit cells 10a and 10b in SISO 102a when the multiplexers 110a and 110b are in a local mode. Similarly, for the local calculation loop in row 1, the local controller 104b iteratively shifts or rotates the soft values between the bit cells 10e and 10f in SISO tile 102b when the multiplexers 110a and 110b are in a local mode. The same process for the local calculation loops occurs for rows 2, 3 and 4.

For the local calculation loop in column 1, the local controller 104a iteratively shifts or rotates the soft values between bit cells 10a and 10c in SISO tile 102a when the multiplexers 110c and 110d are in a local mode. In addition, for the local calculation loop in column 1, the local controller 104a iteratively shifts or rotates the soft values between the bit cells 10i and 10k in SISO tile 102c when the multiplexers 110c and 110d are in a local mode. The same process for the local calculation loops occurs for columns 2, 3 and 4. Once the local controllers 104 communicate with their respective bit cells 10, the local controllers 104 preferably communicate the soft values with each other and relay a result to the global controller 106.

Each local controller module 104 is preferably coupled with the global controller module 106, as shown in Figure 1a. Alternatively, each individual bit cell 10 in the array is coupled directly to the global controller module 106. After the local operation is complete, a global operation is performed of a length that is equal to the length of the row, whereby the soft value of each bit cell is compared with each bit cell in the global calculation loop. As stated above, the global calculation loop is performed to find the nearest neighbor data for the encoded vector. For the bit cell 10 array shown in Figure 1a the global shift is of length four, whereby the local controller 104a opens the multiplexer 110a and local controller 104b opens the multiplexer 110b. Thus, the soft value in bit cell 10a is globally rotated through each bit cell 10 in the row and back to bit cell 10a. Similarly, the soft value in bit cell 10b is rotated and compared with bit cells 10c, 10d and 10a and back to bit cell 10b. The global shift preferably occurs in each row in the bit array at the same time. Alternatively, the global shift occurs in each row in the bit array at different times. Generally, the global shift allows each bit cell 10 and corresponding local controller 104 to look at the soft values of every other bit cell 10 in the same row or column.
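The difference between the local and global modes of the multiplexers can be sketched in a few lines. This is a behavioural illustration only: the function name `rotate_row`, the `tile_width` and `global_mode` parameters, and the right-hand rotation direction are assumptions, and the letters stand in for the soft values held in cells 10a, 10b, 10e and 10f of row 1.

```python
def rotate_row(values, tile_width, global_mode):
    """Rotate a row of soft values by one cell to the right.

    global_mode=False models the multiplexers in local mode: each tile-sized
    group rotates on its own. global_mode=True models the multiplexers opened
    so that the whole row forms a single loop.
    """
    if global_mode:
        return values[-1:] + values[:-1]
    out = []
    for start in range(0, len(values), tile_width):
        group = values[start:start + tile_width]
        out.extend(group[-1:] + group[:-1])
    return out


row = ["a", "b", "e", "f"]             # stand-ins for cells 10a, 10b, 10e, 10f
print(rotate_row(row, 2, False))       # local loops:  ['b', 'a', 'f', 'e']
print(rotate_row(row, 2, True))        # global loop:  ['f', 'a', 'b', 'e']

# A global shift of length four returns every value to its starting cell.
state = row
for _ in range(len(row)):
    state = rotate_row(state, 2, True)
print(state == row)                    # True
```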

Similarly, for a global operation performed on a column of the bit array, the global controller 106 instructs local controllers 104a and 104c to open the multiplexers 110c and 110d so that the soft values are rotated and compared between the bit cells 10a, 10c, 10i and 10k in the entire column and back to their original positions. In addition, the soft value originating in each bit cell 10b, 10d, 10j and 10m is rotated and compared globally along the entire column and back to its original starting position. The global operation for the y-axis preferably occurs in each column at the same time. Alternatively, the global operation for the y-axis in each column occurs at different times. The details of the application of the global operation will be discussed in more detail below.

Figure 2b illustrates four bit cells each having one soft value within, denoted as S₁, S₂, S₃, and S₄. The first cycle, or cycle 0, is shown in Table 1 as having S₁ XORed with itself; S₂ XORed with itself; S₃ XORed with itself and S₄ XORed with itself. As stated above, during the global operation, each soft value is preferably rotated by one bit cell in either direction. In cycle 1, each soft value is rotated to the right by one bit cell, whereby S₁ is XORed with S₄; S₂ is XORed with S₁; and so on. In cycle 2, each soft value is rotated to the right by an additional bit cell. Again, in cycle 3, each soft value is again rotated to the right by another bit cell. In cycle 4, the soft values S₁, S₂, S₃, and S₄ are rotated to the right by one bit cell such that the soft values are placed back into their original bit cells as in cycle 0. More detail regarding the actual decoding process in relation to the global calculation loop is discussed below.
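The cycle-by-cycle pairing described above can be reproduced with a short sketch. It assumes that each cell keeps its own value in place while a rotating copy moves one cell to the right per cycle; the function name and the `Sx^Sy` labels (where `^` marks the XOR pairing) are illustrative only.

```python
def global_rotation_schedule(n):
    """For cycles 0..n, list the index of the soft value visiting each bit cell.

    Cell c is assumed to hold S[c] throughout; in cycle t it is visited by the
    value that started t cells to its left (rotation to the right).
    """
    schedule = []
    for cycle in range(n + 1):
        visitors = [(cell - cycle) % n for cell in range(n)]
        schedule.append(visitors)
    return schedule


schedule = global_rotation_schedule(4)
for cycle, visitors in enumerate(schedule):
    pairs = [f"S{cell + 1}^S{v + 1}" for cell, v in enumerate(visitors)]
    print(cycle, pairs)
```

In cycle 1 the first two pairs are S1 with S4 and S2 with S1, matching the text, and cycle 4 repeats cycle 0. Over cycles 0 to 3 every cell is paired once with every value in the loop.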

In addition to having extended Hamming bits, turbo product codes, in both two and three dimensions, have a "block parity" mechanism in which redundant bits are added to the encoded block of data in the form of one or more parity rows and columns. To decode a turbo product code having a parity row and column, the decoder 1 must perform a component decode on a unique combination of soft values which are not located entirely in a row or column. This unique combination of soft values is called an enhanced dimension, such as a "hyper-diagonal".

Figures 4a-4c illustrate an example (3,4) x (3,4) turbo product code matrix having a parity row and column. Specifically, Figure 4a illustrates the block matrix 144 before rotation. Figure 4b illustrates the block matrix 144a after the rows are rotated to form the enhanced columns. Figure 4c illustrates the block matrix 144b after the rows are rotated back to their original positions. The block matrix 144 includes soft values located in the array as Sᵢⱼ, where i is the row and j is the column. It should be noted that the turbo product code matrix is not restricted to a length which is a power of two and thus can be any length. The parity row in the block matrix 144 includes soft values in (i,j) bits, namely bits 14, 24, 34 and 44. Similarly, the parity column in the block matrix includes bits 41, 42, 43 and 44. The decoder 1 of the present invention utilizes diagonal dimensions to treat each of the combinations of soft values from the block in Figure 4a as an individual code, whereby these combinations are treated as another dimension:

(S₁₁, S₂₂, S₃₃, S₄₄) (S₁₂, S₂₃, S₃₄, S₄₁) (S₁₃, S₂₄, S₃₁, S₄₂) (S₁₄, S₂₁, S₃₂, S₄₃)

In decoding the parity rows and columns, the decoder 1 reconfigures the block matrix 144 by rotating each column or row by a predetermined amount of bits, whereby the hyper-diagonal that is to be decoded is aligned as a single row or column. As shown in the example in Figure 4b, to decode a parity column, the enhanced dimension (S₁₁, S₂₂, S₃₃, S₄₄) is formed by rotating each of the rows in the hyper diagonal by a predetermined amount of bits. Thus, as shown in Figure 4b, the first row is not rotated; the second row is rotated to the left by 3; the third row is rotated to the left by 2 and the last row is rotated to the left by 1. The enhanced dimension, once all of the diagonal bits are aligned for a single diagonal, is then simply a column decode, or an enhanced column. The remaining diagonals in the block matrix 144 are also preferably aligned in columns.
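The alignment of hyper-diagonals into columns can be checked with a small sketch. The helper names are illustrative, and the rotation convention is an assumption: here row r (counting from 0) is rotated cyclically by r positions toward lower column indices. In a four-wide block, a cyclic rotation by 1, 2 or 3 positions in one direction equals a rotation by 3, 2 or 1 positions in the other, so the amounts depend on which direction the figure calls "left".

```python
def rotate_left(row, amount):
    """Cyclically rotate a list toward lower indices by `amount` positions."""
    amount %= len(row)
    return row[amount:] + row[:amount]


def align_hyper_diagonals(block):
    """Rotate row r by r positions so each hyper-diagonal becomes a column."""
    return [rotate_left(row, r) for r, row in enumerate(block)]


def restore_rows(aligned):
    """Undo align_hyper_diagonals by rotating each row back."""
    return [rotate_left(row, -r) for r, row in enumerate(aligned)]


# Label each soft value by its 1-based (row, column) position.
block = [[f"S{i}{j}" for j in range(1, 5)] for i in range(1, 5)]
aligned = align_hyper_diagonals(block)
columns = [list(col) for col in zip(*aligned)]
for col in columns:
    print(col)
print(restore_rows(aligned) == block)
```

The four printed columns are the four hyper-diagonal combinations listed above, so an ordinary column decode on the rotated block decodes the enhanced dimension, and rotating back restores the original layout.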

The decoder 1 of the present invention then performs the local and global calculation loops discussed above on the enhanced dimensions. Similarly, to decode a parity row, the hyper diagonal is decoded by rotating each of the columns a predetermined amount. The enhanced dimension, once all of the diagonal bits are aligned for a single diagonal, is then simply a row decode, or an enhanced row. The remaining diagonals in the block matrix are also preferably aligned in rows. The decoder 1 of the present invention then performs the local and global calculation loops discussed above on the enhanced rows. Any enhanced dimension can be supported since the rows (or columns) can be rotated by any amount and in any direction. It is preferred that the decoder 1 uses already determined logic results from the individual decoded soft values in the rows and/or columns to decode the hyper diagonals. Alternatively, the decoder 1 decodes the hyper diagonals using logic without using the decoded soft values from each row and/or column. More detail regarding enhanced dimensions and hyper diagonals of turbo product codes is disclosed in co-pending U.S. Application Serial Number 09/808,884 filed on March 14, 2001 entitled "ENHANCED TURBO PRODUCT CODES" which is hereby incorporated by reference.

When decoding an enhanced dimension in a TPC block of data, the soft value in the last bit cell that is shifted must loop back to its original bit cell position in the bit cell array. Due to the long length of large TPC block sizes, such connections may not satisfy the nearest neighbor requirement. For example, the soft value in the 256th bit cell 10 in a 256 bit cell array will travel 256 bits along the bit cell 10 array to be rotated back to its original bit cell position. Therefore, the decoder 1 of the present invention preferably utilizes a figure-of-eight configuration to reduce this deficiency. Figures 5a and 5b illustrate a number of possible "figure 8" connections that can be used between the bit cells 10 for a particular row. It should be noted that although only eight bit cells 10 are shown in Figure 5a, any number of bit cells 10 may be used. For instance, 16 bit cell positions are illustrated in Figure 5b. Further, it should also be noted that the configuration of "figure 8" connections is not limited to the examples shown and described in Figures 5a-5b.

As shown in Figures 5a and 5b, the bits are placed in a predetermined order such that the amount of routing between each of the bit cells 10 is a minimum. The predetermined order also depends on the number of bit cells that are involved in the "figure 8" configuration. The "figure 8" connection of bit cells 10 shown in Figure 5a is broken down into a first half and a second half of bit cells 10, whereby the first half includes groups 1 and 2 (bits 0, 1, 7, 6) whereas the second half includes groups 3 and 4 (bits 2, 3, 5, 4). As shown in Figure 5a, the bit cells 10 are arranged such that the starting order in the first half of the groups is interdigitated with the second half, which is in reverse order. Following the example shown in Figure 5a, each of the groups contains two bit cells. Thus, the bit cells are placed in a configuration order (0, 1, 7, 6, 2, 3, 5, 4), such that the arrows show the order from bit cell 0 to bit cell 7. The "figure 8" order begins from bit cell 0 to bit cell 1 in group 1 and skips over group 2 to bit cell 2 in group 3. Following, the "figure 8" order proceeds from bit cell 2 to bit cell 3 in group 3 and in a reverse bit order to bit cell 4 and bit cell 5 in group 4. Proceeding therefrom, the "figure 8" order travels from bit cell 5, skips over bit cells 2 and 3 in group 3 and continues to bit cell 6 in group 2. Continuing, the order travels from bit cell 6 to bit cell 7 and proceeds in reverse order back to bit cell 0 in group 1.
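The routing benefit of the "figure 8" placement can be quantified with a short sketch. It assumes the cells sit in a single line of equally spaced slots and measures, for each pair of logically consecutive cells in the ring (including the wrap from the last cell back to cell 0), how many slots apart they are. The function name `max_hop` and the unit-spacing model are assumptions made for illustration.

```python
def max_hop(physical_order):
    """Longest slot distance between logically consecutive cells in the ring."""
    n = len(physical_order)
    position = {bit: slot for slot, bit in enumerate(physical_order)}
    return max(abs(position[b] - position[(b + 1) % n]) for b in range(n))


straight = [0, 1, 2, 3, 4, 5, 6, 7]
figure8 = [0, 1, 7, 6, 2, 3, 5, 4]    # placement order quoted for Figure 5a

print(max_hop(straight))   # 7: the wrap-around from cell 7 back to cell 0
print(max_hop(figure8))    # 3
```

In the straight placement the loop-back wire grows with the row length, whereas in this eight-cell example the interleaved placement bounds every connection at three slots.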

The decoder 1 of the present invention preferably loads and unloads the rows and columns with bits in the pre-twisted "figure 8" order shown in Figures 5a and 5b. Preferably, the "figure 8" order is fixed in hardware, whereby a buffer is utilized to simultaneously store a portion of the block as the bits are loaded and unloaded. Alternatively, the decoder 1 of the present invention stores and utilizes an index register, whereby the decoder 1 carries an index with each bit value as the bits are rotated through the "figure 8" connections during loading and unloading. This allows the decoder 1 to re-order the bits in any possible configuration, whereby any sequence of indices is supported.

In another alternative embodiment, an index shift register (not shown) is used to load and unload the bits in a pre-twisted order. In this alternative embodiment, some linear combination of the current index and the index of the value being passed is used. For example, if the index is denoted i and the shifting index is denoted σ, the condition relating i and σ is given by i = σ ⊗ α, where ⊗ is some linear operator. Since the operation ⊗ is linear, this can be reduced to i = σ Θ where Θ is the inverse operation of ⊗. The linear combination σ Θ α can be pre-computed using the index shift register (not shown), since i and σ are known. In other words, this linear combination keeps track of the variable i with respect to the variable α and vice versa.

The details of the decoding process to find the nearest neighbors for a given codeword will now be discussed. As stated above, the decoder 1 of the present invention performs a local component decode on the soft values in each bit cell located in a particular row or column separately. Preferably, the first iteration is performed on a column, whereby the next iteration on the row and so on. Alternately, the first iteration is performed on a row, whereby the next iteration is performed on the column and so on. Further, alternately, more than one column or row is iteratively decoded in parallel at a particular time.

In general, the decoder 1 of the present invention decodes data in six overall steps. First, the encoded data is input into the input module 108, whereby the input module 108 provides the encoded bits to the SISO tile 100 of the decoder 1. The decoder 1 then calculates the syndrome and parity values for each encoded bit and finds the nearest neighbor code set. Following this, the decoder 1 finds the minimum difference metric values and calculates the extrinsic Log-Likelihood Ratio (LLR) values, whereby the extrinsic values are output by the output module 112 (Figure 2a). Each of these steps will now be discussed in detail.

As described above, the decoder 1 of the present invention includes a tiled array of SISOs 100, wherein each SISO tile 100 includes a number of bit cells 10 arranged in rows and columns (Figure 2a). As stated above, the decoder 1 of the present invention performs a local calculation loop on each individual row and column of bit cells 10 in each SISO tile 100, whereby each local calculation loop is controlled by the local controller module 104 (Figure 1a). It is preferred that the local controller module 104 is a local state machine; however, other registers and modules are alternatively used. A global control module 106 controls all of the local controllers 104 in the SISO array to process the row, column and hyper axis iterations on the encoded block. Each local controller module 104 preferably collects the local results from the group of bit cells in its respective local loop and communicates the local results to the other local controller modules 104 and to the global controller 106. As will be discussed below, all of the local vector results from the group of local bit cells 10 are processed to find a common global result. In finding the common global result, each local group of bit cells 10 calculates a local syndrome and each local syndrome is XORed, under a global operation controlled by the global controller 106, to find the vector syndrome of the whole loop.

As stated above and shown in Figure 1a, the bit cells 10, local controller modules 104, and the global controller module 106 form the bit cell arrays in SISO tile 100. Preferably, a set number of bit cells 10 in a SISO tile 100 is coupled to a respective local controller 104. Thus, the local controllers 104 are arranged in rows and columns throughout the bit cell array 100, whereby each local controller 104 controls a set number of rows and columns of bit cells in a SISO tile 100. As stated above, the global controller 106 is coupled to each local controller 104, whereby the global controller 106 controls the global shifts of the soft values in the bit cells 10 in a row or column array. In addition, the global controller 106 controls the timing of the local controllers 104 in decoding the soft values in the bit cells 10. Alternatively, each bit cell 10 is coupled to a separate local controller 104 or includes a local controller 104 within.

In inputting the encoded data into the decoder 1 of the present invention, encoded data is loaded into the input module 108 shown in Figure 2a. The input module 108 receives the packets of encoded data and loads them into a TPC block order module (not shown) and the input module 108 outputs the data to the bit cell array 100. The encoded data received at the input module 108 may be over any communication medium, including, but not limited to, hardwire, optics, antenna and fiber optics.

The bits provided by the input module 108 to the bit cell array are preferably split into a predetermined number of columns corresponding to the number of groups present in the "figure 8" connection (Figures 5a and 5b). Alternatively, the bits provided by the input module 108 to the bit cell array are split into a predetermined number of rows corresponding to the number of groups present in the "figure 8" connection. As shown in Figure 6, the soft bits are divided into four columns by the input module 108, whereby the individual bits are placed into the bit cell array of each row in the shifted order. It should be noted that the present invention is not limited to the shifted order shown in Figure 6. It is preferred that each column is loaded into and unloaded from groups that are independent of one another.

The bits are passed from the input module 108 to the shadow register 20 of each bit cell 10 in the first row of the bit cell 10 array in the SISO 100. It is understood that the array of bits is alternatively split into any number of groups. As stated above in the enhanced dimension discussion, the input data is preferably loaded into the SISO 100 in a predetermined order that depends on the number of groups present in the array, as also shown in Figure 6. Thus, the input data is input into the SISO 100 in a shifted order configuration instead of a natural input order, as shown in Figures 5a and 5b. As shown in Figure 6, the shifted order configuration causes the four columns of data to be spread out through the SISO tile 100 instead of forming distinct, separate groups. However, it is understood that the "figure 8" pattern applies to a vector of any bit size with an appropriate number of bit cells 10 per SISO tile 100.

As the SISO tile 100 receives the soft value vectors from the input module 108, the soft values are placed within the shadow register 20 of each bit cell 10 until all the bit cells 10 in the SISO tile 100 have been loaded. As the bits are loaded, the shadow register 20 of a particular bit cell 10 preferably shifts the soft value within to the shadow register 20 in the adjacent bit cell 10. For example, as soft values are loaded row by row into the SISO tile 100 shown in Figure 1a, the soft value stored in bit cell 10a is shifted to bit cell 10b. In addition, the soft value stored in bit cell 10c is shifted to bit cell 10d. Preferably, the input module 108 simultaneously shifts the soft values to the bit cells 10 in the first row of the SISO tile 100. This process preferably continues until all of the bit cells 10 in the SISO tile, and all of the SISO tiles themselves, are filled with the soft values of the vector.

As the soft values are loaded into the input module 108 and passed onto the SISO tile 100, the global controller module 106 preferably counts each bit being passed until the entire block of data has been loaded into the SISO 100. The global controller module 106 then preferably communicates to each local controller module 104 that the entire block of encoded data has been loaded. The local controller modules 104 then preferably communicate a signal to their respective local loops of bit cells 10 in the respective SISOs 100 to initiate the decoding process on the loaded data. Once the bit cells 10 receive the signal from their respective local controllers 104, the contents in the SPIN register 12 are loaded into the Shadow register 20. In addition, the data previously in the Shadow register 20 are loaded into the original data registers 18. This is because the data in the SPIN register 12 are the final results from the previously decoded block.

Once all of the soft values are loaded into all of the SISO tiles in the decoder 1, the decoder 1 begins calculating the syndromes and parity bits for the encoded codeword. The syndrome calculation determines whether the currently loaded codeword is a valid codeword. If the result of the syndrome calculation is '0', then the currently loaded codeword is valid. However, if the result of the syndrome calculation is not '0', then the loaded codeword is not valid. In the case that the codeword is not valid, the syndrome result is determined to be the Galois address of the bit that is in error or has a '1' value.

Figure 7 illustrates a block diagram of a plurality of bit cells 10, each coupled to the local controller module 104. The syndrome calculation preferably starts when the global controller module 106 sends a signal command to each of the local controller modules 104, whereby each local controller module 104 sends the command to its respective bit cells in the respective loops. In the first axis iteration of the block, the contents in the original data register 18 are copied into the SPIN register 12 of each bit cell 10.
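The register hand-off described above can be modelled with a small Python class. This is a minimal sketch, assuming each register holds a single signed integer; the class name `BitCell` and its method names are hypothetical and only mirror the register roles named in the text.

```python
from dataclasses import dataclass


@dataclass
class BitCell:
    spin: int = 0      # SPIN register 12: working value for the current axis
    shadow: int = 0    # Shadow register 20: load/unload path
    original: int = 0  # original data register 18: input data for this block

    def on_block_loaded(self):
        """Block-loaded signal: previous results move to the Shadow register,
        and the newly loaded data moves to the original data register."""
        self.shadow, self.original = self.spin, self.shadow

    def start_first_axis(self):
        """First axis iteration: copy the original data into the SPIN register."""
        self.spin = self.original


cell = BitCell(spin=5, shadow=-3, original=9)
cell.on_block_loaded()     # shadow now holds the old result 5, original holds -3
cell.start_first_axis()    # spin now holds the new input -3
print(cell)
```

The simultaneous tuple assignment reflects that both transfers happen on the same signal, so the old Shadow contents are not lost before they reach the original data register.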

It is preferred that at the end of each axis iteration, the SPIN register 12 is automatically loaded with the encoded bits that are to be decoded in the next axis iteration. The syndrome and parity values for each bit cell 10 are preferably calculated locally inside each local controller module 104, whereby the bit cells 10 in the loop provide the necessary values. Alternatively, the syndrome and parity values are calculated within the bit cell 10 itself or within another controller module.

Figure 7 illustrates that the local controller module 104 includes a parity module 114, a syndrome module 116, a min1 module 118, a min2 module 120, a lowi1 module 122 and a lowi2 module 124. In addition, the parity module 114 is coupled to a parity multiplexer 126 and the syndrome module 116 is coupled to a syndrome multiplexer 128. In addition, the SPIN registers 12 in the bit cells 10 communicate with one another, as is discussed in more detail below.

Figure 7 also illustrates a Galois register 134, which is preferably located within the same local controller module 104 performing the decoding process, although a Galois register may be located in a local controller module which only controls the rows of the bit cell arrays and a local controller module which only controls the columns of the bit cell arrays. However, the Galois register 134 is shown outside of the local controller module 104 for exemplary purposes. This is because the Galois address in one axis is used by the decoder 1 to decode the data in another axis. Thus, the decoder 1 decoding a row of bits in a bit cell array will use the Galois address of the column respective to the bit cell 10 that is currently being decoded. Similarly, the decoder 1 decoding a column of bits in a bit cell array will use the Galois address of the row respective to each corresponding bit cell 10 that is being decoded. It is preferred that each Galois register 134, per row or column, is provided with the Galois address at the same time.

The syndrome for the codeword is calculated by rotating the SPIN register 12 and the Galois Address in the Galois register 132 through the entire vector of the codeword. The local controller module 104 rotates the Galois Address in the Galois Register 132 with respect to the SPIN registers 12 for each of the bit cells 10. As shown in Figure 7, the decoder 1 sums the Galois address value in the Galois register 132 of each '1' value in the codeword vector, wherein each '1' value is stored in the syndrome module 116. The value in the SPIN register 12 for each bit cell 10 is first examined to check if the sign of the input soft value for that bit cell 10 is '1' as the Galois register 132 rotates the Galois address with the data being shifted. If the sign of the soft value in the SPIN register 12 for a particular bit cell 10 is '1', then the Galois address of that soft value in the bit cell 10 is XORed at the syndrome multiplexer 128 with the current value of the syndrome in the syndrome module 116. The results are preferably accumulated in the syndrome module 116.

After the entire group of bit cells 10 in a SISO tile 100 is finished processing, the local results inside each local controller module 104 are XORed with one another to get a final syndrome result. In the present example shown in Figure 2a, there are 8 SISO tiles 100 in the x-axis and 8 SISO tiles 100 in the y-axis, wherein each SISO tile contains 8 x 8 bit cells. Therefore, there are 8 local controller modules 104 in SISO tile 100 for the x and y axes that communicate with each other in the local calculation loops. The local parity value for the soft values in each bit cell 10 is preferably calculated in parallel with the calculation of the syndrome. As shown in the local controller 104 in Figure 7, the sign of each bit provided by each of the SPIN registers is XORed with current parity values in the parity module 114 and the result is preferably accumulated in the parity module 114. These results are also preferably XORed, in parallel, with the local syndrome results. The final parity result is then preferably stored in all local controller modules 104 in the decoder.
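The local and global syndrome and parity accumulation can be sketched as follows. This is an illustration under assumptions, not the circuit itself: Galois addresses are modelled as plain integers with the parity bit at address 0, the function names are hypothetical, and the example vector is a small extended Hamming style word chosen by hand.

```python
from functools import reduce
from operator import xor


def local_syndrome_parity(signs, addresses):
    """Accumulate one local loop: XOR the addresses of '1' bits and XOR all signs."""
    syndrome, parity = 0, 0
    for sign, addr in zip(signs, addresses):
        if sign == 1:
            syndrome ^= addr   # syndrome accumulation
        parity ^= sign         # parity accumulation, done in parallel
    return syndrome, parity


def global_syndrome_parity(local_results):
    """XOR the local (syndrome, parity) results of all local loops together."""
    syndromes, parities = zip(*local_results)
    return reduce(xor, syndromes, 0), reduce(xor, parities, 0)


# Hard decisions indexed by address 0..7; 0 ^ 1 ^ 2 ^ 3 == 0 and the weight is even.
signs = [1, 1, 1, 1, 0, 0, 0, 0]
signs[5] ^= 1                      # inject a single error at address 5
loops = [(signs[0:4], range(0, 4)), (signs[4:8], range(4, 8))]
local = [local_syndrome_parity(s, a) for s, a in loops]
print(global_syndrome_parity(local))   # (5, 1): syndrome points at address 5
```

Because XOR is associative and commutative, the local loops can be accumulated independently and combined in any order, which is what allows the work to be split across local controller modules.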

The decoder 1 also finds the two minimum input bit confidences and their respective Galois addresses by using the algorithm described above. As shown in Figure 7, the local controller module 104 determines the local minimum value in the lowi1 module 122 and the local second minimum value in the lowi2 module 124 from the values in the SPIN register of each bit cell 10 and the Galois address from the Galois register 134. In addition, the local controller 104 stores the location of the lowi1 value, which is designated as min1, in the min1 module 118. Similarly, the local controller 104 stores the location of the lowi2 value, which is designated as min2, in the min2 module 120. The two local minimum values are compared with the local minimum values in other bit cells 10 in the bit array 100 to generate two overall minimum lowi1 and lowi2 values, as well as their respective locations, mini1 and mini2. These overall minimum lowi1 and lowi2 values and their respective locations, mini1 and mini2, are stored in the respective modules in every local controller module 104 in the decoder 1. In the case where several locations each have equal minimum values for a particular row or column, the local controller module 104 preferably stores the minimum values with their respective minimum Galois addresses in a RAM (not shown). Alternatively, the local controller module 104 stores the minimum values and their respective minimum Galois addresses within any of the modules 104 in the bit cell arrays. These stored values can then be retrieved for later calculations.
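The search for the two minimum confidences, first locally and then across local loops, can be sketched as below. The function names are hypothetical, Galois addresses are modelled as plain integers, and ties are resolved in favour of the lower address, which is one reading of the "minimum Galois addresses" rule above.

```python
def two_minima(confidences, addresses):
    """Return (lowi1, min1, lowi2, min2) for one local loop.

    Sorting (confidence, address) tuples breaks ties by the lower address.
    """
    ranked = sorted(zip(confidences, addresses))
    (lowi1, min1), (lowi2, min2) = ranked[0], ranked[1]
    return lowi1, min1, lowi2, min2


def merge_minima(local_results):
    """Combine the local results into the two overall minima."""
    candidates = []
    for lowi1, min1, lowi2, min2 in local_results:
        candidates += [(lowi1, min1), (lowi2, min2)]
    (lowi1, min1), (lowi2, min2) = sorted(candidates)[:2]
    return lowi1, min1, lowi2, min2


conf = [5, 2, 7, 2, 9, 1, 4, 8]          # confidence magnitudes at addresses 0..7
local_a = two_minima(conf[0:4], range(0, 4))
local_b = two_minima(conf[4:8], range(4, 8))
print(merge_minima([local_a, local_b]))  # (1, 5, 2, 1)
```

Only the two best candidates of each local loop need to be exchanged, since the two overall minima must be among them.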

Preferably, the results stored in the syndrome module 116, parity module 114, min1 module 118 and min2 module 120 are held through the entire axis iteration. There are 4 possible outcomes from the syndrome/parity calculation:

      syndrome   parity
A)       0         0
B)       0         1
C)       1         0
D)       1         1

The results of the syndrome and parity calculations are used to determine whether or not the vector is valid and can be used as a center codeword. If the vector is not valid, the decoder 1 uses the syndrome and parity results to correct the vector to a valid center codeword. A valid center codeword allows the decoder 1 to find the Galois addresses of the two common bits that can be used to find a set of nearest neighbor codewords.

In case A), the syndrome result is 0 and the parity result is 0. Thus, the codeword is valid and there are no errors. The two common bits are located in the min1 and min2 locations. In case B), the syndrome result is 0 and the parity result is 1. Thus, the syndrome result is valid, although the parity result is invalid. Therefore, there is a parity error and the address of the parity bit, where the Galois address is 0, is placed in the min1 module 118. In addition, the address previously in the min1 module 118 is now placed in the min2 module 120. In case C), the syndrome result is 1 and the parity result is 0. Therefore, the syndrome result is invalid, although the parity bit is valid. In this case, there are two errors, so the Galois address stored in the syndrome module 116 is placed in the min1 module 118. In addition, the parity bit, where the Galois Address is zero, is placed in the min2 module 120. In case D), the syndrome result is 1 and the parity result is 1. Therefore, both the syndrome and parity values are invalid. Since there is one error in the syndrome, the Galois address stored in the syndrome module 116 is placed in the min1 module 118, whereas the prior location in the min1 module 118 is placed in the min2 module 120.
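The four cases can be summarised as a small selection function. This is a sketch only: the function name `select_common_bits` is hypothetical, addresses are plain integers with the parity bit at address 0, and a syndrome result of "1" in the table is read as any nonzero syndrome.

```python
def select_common_bits(syndrome, parity, min1, min2):
    """Return the updated (min1, min2) locations for cases A) through D)."""
    if syndrome == 0 and parity == 0:   # case A: valid codeword, keep both
        return min1, min2
    if syndrome == 0:                   # case B: parity error, parity bit is address 0
        return 0, min1
    if parity == 0:                     # case C: two errors
        return syndrome, 0
    return syndrome, min1               # case D: one error at the syndrome address


print(select_common_bits(0, 0, 3, 5))   # (3, 5)
print(select_common_bits(0, 1, 3, 5))   # (0, 3)
print(select_common_bits(6, 0, 3, 5))   # (6, 0)
print(select_common_bits(6, 1, 3, 5))   # (6, 3)
```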

Once the syndrome values have been calculated, the decoder 1 of the present invention finds the nearest neighbor code set for the vector. The locations stored in the min1 module 118 and min2 module 120 are the locations of the common Hamming code bits. Hamming code bits preferably have a weight of four, although any other weight is contemplated. In the preferred embodiment, the Hamming code bits are such that the nearest neighbor codewords are those in which min1, min2 and two other, floating bits are asserted. The two floating bits are determined by placing a '1' value in the min1 and min2 locations and marching a third '1' through every location in the codeword, whereby the syndrome value is used to locate the fourth bit that is a nearest neighbor. The algorithm used by the decoder 1 is as follows:

syndrome = min1 XOR min2 XOR Galois Address

As stated above, the syndrome is calculated by rotating the SPIN register 12 and the Galois Address through each bit cell 10 in the entire vector of the codeword. The local controller module 104 rotates the Galois Address in the Galois Registers 132 with the SPIN registers 12 from each of the bit cells 10. In addition, each local controller module 104 preferably includes a fixed or constant Galois Address module 134 which is used in each respective local bit cell. Thus, each bit cell 10 is coupled to and communicates with the Galois Address register 132 as well as the constant Galois Address module 134. As shown in Figure 7, the local controller module 104 XORs the min1 value in the min1 module 118 with the min2 value in the min2 module 120 at multiplexer 146 and drives the results across all of the local bit cells 10 in the SISO tile 100. Since each of the local controller modules 104 calculates the min1 and min2 values the same way, each bit cell 10 in the vector preferably receives the same min1 XOR min2 value result from its respective local controller module 104. Alternatively, each bit cell 10 receives a different min1 XOR min2 value result for the same local controller modules 104.

In determining the nearest neighbor bit for the codeword, the decoder 1 XORs the rotating value in the Galois register 132 with the value in the fixed Galois Address module 134 at Galois multiplexer 136. The XORed Galois result is then sent to comparator 138, whereby the XORed Galois result is compared to the XORed min value received from the min1 module 118 and the min2 module 120 of each bit cell 10 in the array by a global rotation. If this XORed Galois result value at comparator 138 is equal to the XORed min value from multiplexer 146, then the nearest neighbor value is found. When each bit cell 10 detects this condition, a load dm signal is produced and the confidence value in the SPIN register 12 of each bit cell 10 is stored in the difference metric register 14. Alternatively, the hard decision value as well as the confidence value in the SPIN register in each bit cell 10 is stored in the difference metric register 14.
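The comparator condition can be illustrated with a short simulation. This sketch assumes Galois addresses are the integers 0 to n-1 with n a power of two, so that every bit cell sees exactly one matching rotating address; the function name `load_difference_metrics` is hypothetical.

```python
def load_difference_metrics(confidences, min1, min2):
    """For each bit cell, capture the confidence of its nearest neighbor partner.

    A cell with fixed address f loads the value rotating past it with
    address r when r XOR f equals min1 XOR min2.
    """
    n = len(confidences)
    target = min1 ^ min2                  # value driven to every bit cell
    dm = [None] * n
    for fixed in range(n):                # constant Galois address of the cell
        for rotating in range(n):         # address rotating with the SPIN data
            if rotating ^ fixed == target:        # comparator match ("load dm")
                dm[fixed] = confidences[rotating]
    return dm


conf = [5, 2, 7, 2, 9, 1, 4, 8]
print(load_difference_metrics(conf, min1=5, min2=1))
# [9, 1, 4, 8, 5, 2, 7, 2]
```

The match condition is equivalent to rotating = min1 XOR min2 XOR fixed, which is the syndrome expression given earlier: the four asserted addresses of a weight-four neighbor XOR to zero.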

After the nearest neighbor value is determined, the decoder 1 of the present invention finds the minimum difference metric values of the codeword. The minimum difference metric values are determined by using the hard decision value of the nearest neighbor bit and the corresponding confidence value to find the sum of the minimum confidences that are stored in the SPIN register 12 and difference metric register 14 of each bit cell 10. The SPIN register 12 and difference metric register 14 are preferably shifted between bit cells 10 in a loop during local operations to find the sums that are utilized by the local controller module 104.

Each sum determined from the values in the SPIN register 12 and the difference metric register 14 of each bit cell 10 is preferably based on the results of the syndrome and parity calculations. In cases B) and D) above, there is one error based on the syndrome and/or parity bit, whereby the confidence value from the Galois address register 132 stored in the min1 module 118 is negated before the sum is calculated. In case C), there are two errors, such that the confidences from the Galois address register 132 stored in min1 and min2 are both negated before the sum is calculated.

Figure 8 illustrates a more detailed block diagram of the difference metric calculation process in accordance with the present invention. After the syndrome and parity errors have been fixed, the local controller module 104 shown in Figure 8 compares the sum result from the SPIN register 12 and difference metric register 14 of each bit cell 10 in the local loop with the values in the lowi1 and lowi2 modules 122, 124. In addition, the local controller module 104 stores the smallest difference metric value in the lowi1 register 122 and the locations of the values which generate the smallest difference metric value in a newmin1 register 140 and a newmin2 module 142. In addition, the second smallest sum is stored in the lowi2 module 124. One of the locations stored in the newmin1 module 140 is the rotating Galois address received from the Galois address register 132, as shown in Figure 8. The location stored in the newmin2 module 142 is calculated by XORing the value in the min1 module 118 with the value in the min2 module 120 and the Galois address location from the Galois register 132, as shown in Figure 8. The contents previously stored in the lowi1 module 122 and lowi2 module 124 are the minimum confidence values that were used to find the minimum confidences in the syndrome step. Once the Galois address locations of the minimum confidence values are determined, the contents in the lowi1 module 122 and the lowi2 module 124 are not needed and can be overwritten to determine the minimum difference metrics.

Each difference metric value preferably has an equal difference metric value in its corresponding pair location. Thus, the Galois address associated with the smallest difference metric value is stored in the newmin1 module 140 and the corresponding pair's address is stored in the newmin2 module 142 after being XORed with the min1 and min2 values. In searching for the minimum difference metric value and the second minimum difference metric value, the local controller module 104 preferably disregards the pair's value, because the pair's value is equal to the stored minimum difference metric value. In the case where there are equal minimum difference metric values, the stored minimum difference metric is preferably replaced when a new difference metric value is equal to or smaller than the difference metric value associated with the newmin1 register 140 or newmin2 register 142, as shown in Figure 8. Once the difference metrics are determined, the local controller module 104 sends the minimum difference metric value to the difference metric module 14 of each bit cell 10.

The minimum difference metric values in the bit cells 10 in each local loop are then compared with the minimum difference metric values in the other local loops of bit cells 10 in the SISO tile 100. After all the local loops are compared, the decoder 1 generates the two overall minimum difference metric values, which are placed in the lowi1 module 122 and lowi2 module 124 in each local controller module 104. In addition, the decoder 1 preferably stores the locations that generate the two overall minimum difference metric values in the newmin1 register 140 and newmin2 register 142 in each local controller module 104. In the case where there are several equal minimum difference metric values, the values having the minimum Galois address are stored in the newmin1 module 140 and newmin2 module 142. Since there are 8 SISO tiles in the present example bit array (Figure 2a), all of the local controller modules 104 will find the same result after 8 cycles.
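The search for the two smallest difference metrics can be sketched as follows. Several simplifications are assumed here and are not taken from the text: addresses are the integers 0 to n-1, the difference metric of a pair is the plain sum of the two confidences (the negation applied in the error cases is omitted), each pair is counted once, and ties go to the lower address. The function name is hypothetical.

```python
def min_difference_metrics(confidences, min1, min2):
    """Return the two smallest (metric, address, pair_address) tuples."""
    target = min1 ^ min2
    seen = set()
    sums = []
    for addr in range(len(confidences)):
        pair = addr ^ target                       # partner address of this bit
        if addr in (min1, min2) or addr in seen:   # skip common bits and repeats
            continue
        seen.update((addr, pair))
        sums.append((confidences[addr] + confidences[pair], addr, pair))
    sums.sort()                                    # ties resolved by lower address
    return sums[0], sums[1]


conf = [5, 2, 7, 2, 9, 1, 4, 8]
best, second = min_difference_metrics(conf, min1=5, min2=1)
print(best)     # (10, 3, 7): stored as lowi1 with locations newmin1 / newmin2
print(second)   # (11, 2, 6): stored as lowi2
```

Skipping the partner of an address already visited corresponds to disregarding the pair's value, since both members of a pair produce the same sum.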

After the minimum difference metric values are determined, the decoder 1 of the present invention calculates the extrinsic Log-Likelihood Ratio (LLR) value for each bit in the vector. The extrinsic LLR value for each input bit other than the common bits is calculated by subtracting the difference metric of the center codeword from the confidence value of the current bit pair in the difference metric register 14. In addition, for all common bits, the extrinsic LLR value for the input bit is calculated by subtracting the difference metric value in the lowi2 module 124 from the confidence value of the current bit pair, wherein the difference metric of the bit is the sum of the confidence value for the current bit and the confidence value of the current bit pair. The LLR result is negated based on a sign provided by the difference metric register. In particular, the LLR value is negated if the current bit pair confidence has its corresponding Galois address at neither the mini1 nor the mini2 location. In addition, the current bit pair confidence value will be negated if the Galois address is the mini2 location but has a syndrome or parity error, as discussed above. Similarly, the current bit pair confidence value will be negated if the Galois address is the mini1 location and has a syndrome error but no parity error, as discussed above.

Figure 9 illustrates a block diagram of the LLR value calculation process in the local controller module in accordance with the present invention. Preferably, the extrinsic LLR value calculation process begins by the contents in the original data register 18 being rotated along the bit cell array such that the bit in the difference metric register 14 is not used. Alternatively, the extrinsic LLR value calculation process begins by the contents being copied from the original data register 18 into the SPIN register 12 of each bit cell 10. The data in the original data register 18 is used later in the process to generate the input for the next axis iteration, whereby the data in the original data register 18 is not copied into the SPIN register 12 in the last iteration. In the last iteration, the final result is calculated as the sum of the input data bit with the extrinsic LLR value from every axis. If the data from the original data register 18 is not copied to the SPIN register 12, the SPIN register 12 will contain the original input data summed with every extrinsic LLR value except the current axis result. Thus, the final result is calculated by summing the current axis result with the contents in the SPIN register 12.

The pair of hard decision and confidence values for each of the bits are then stored in the difference metric register 14 of each bit cell 10. The difference metric register 14 and the SPIN register 12 are shifted by the local controller module 104, so that the LLR extrinsic values can be calculated from the hard decision and soft confidence information in every bit cell 10 in the local loop. Preferably, the LLR extrinsic value result is stored in the SPIN register 12. The LLR extrinsic value result is preferably multiplied by a feedback constant (block 152) that is supplied by the global controller module 106 and implemented in a lookup table for a final extrinsic axis result which is stored in the min1 module 118. Preferably, the data is shifted between bit cells 10 in the local loop on every other clock to allow time for the bit cells 10 to process the input for the next axis iteration.

For every iteration except the last iteration, the LLR extrinsic result is preferably summed with the contents in the SPIN register 12 and the previous result register 16 to generate the input for the next axis iteration. The result of this sum is preferably stored in the SPIN register 12 and the weighted LLR extrinsic result is written to the previous result register 16 in the same step. However, in the last iteration, the LLR extrinsic result is preferably summed with the contents stored in the SPIN register 12 and the result of this sum is stored back into the SPIN register 12 to be output.

Following the last iteration, the decoded data is output when the new input data block has been completely loaded into the decoder 1. The global controller 106 counts every bit being loaded into the decoder 1 and generates a signal when the entire TPC block has been loaded in the decoder 1. The signal is directly communicated to the bit cells 10. Alternatively, this signal is received in each of the bit cells 10 from each respective local controller 104. Once the signal is received in the bit cell 10, the contents in the Shadow register 20 are loaded into the original data register 18. Also, the sign of the contents from the previous block's final results which are in the SPIN register 12 and original data register 18 are loaded into the Shadow register 20 in the bit cell. In other words, the value in the Shadow register 20 of the first bit cell 10 is sent to the Shadow register 20 of the second bit cell 10. In addition, the value previously in the Shadow register 20 of the second bit cell 10 is sent to the Shadow register of the third bit cell and so on. The data from the Shadow registers 20 of each of the bit cells 10 are output one row or column at a time. Alternatively, each of the rows and columns are output at the same time.

The extrinsic LLR value, once determined, is weighted by multiplying the extrinsic LLR value by a feedback constant (block 152). However, if the extrinsic LLR value is out of range, the value is clipped (block 152). Further, the extrinsic LLR value is negated if the hard decision for the bit is '0' or if there is a parity error, which is determined by the decoder 1 looking in a lookup table. However, if the hard decision value is '0' and there is a parity error for the bit, the extrinsic LLR value is negated twice so the result is the original weighted, clipped extrinsic LLR value.
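The weighting, clipping and negation steps can be sketched as a single function. This is an illustration under assumptions: the function name, the example feedback constant of 0.5 and the clipping limit of 7 are hypothetical, and the arithmetic is done directly rather than through the lookup table mentioned above.

```python
def weight_clip_negate(llr, feedback, max_magnitude, hard_decision, parity_error):
    """Weight the extrinsic LLR, clip it to range, then apply the sign rule."""
    value = llr * feedback                                   # weighting
    value = max(-max_magnitude, min(max_magnitude, value))   # clipping
    # Negate for a '0' hard decision or a parity error; both together cancel.
    if (hard_decision == 0) != bool(parity_error):
        value = -value
    return value


print(weight_clip_negate(6, 0.5, 7, hard_decision=1, parity_error=False))   # 3.0
print(weight_clip_negate(20, 0.5, 7, hard_decision=1, parity_error=False))  # 7
print(weight_clip_negate(6, 0.5, 7, hard_decision=0, parity_error=True))    # 3.0
```

Expressing the double negation as an exclusive-or of the two conditions avoids negating twice when both apply.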

Each LLR result produced by the decoder 1 represents the amount by which the original decoding result must be changed. To prevent the decoder 1 from correlating the calculated LLR result with itself for an iteration, the decoder 1 utilizes the following equation in finding the extrinsic LLR result

outdata = in + decode(in )

where ζ is the last iteration index. Thus, the decoder 1 stores the last N-1 results from the iterations by preferably using a simple stack arrangement. It should be noted that the decoder 1 of the present invention may also utilize this method for the three dimensional TPC blocks.

Since the LLR extrinsic data is added to the original data in the decoding process, the decoder 1 of the present invention preferably clips and rounds the data to match the original data during the summation process. In applications where the quantization levels are few, the decoding process is very sensitive to the weighting, rounding and clipping. Thus, the number of bits to be clipped is preferably small and a look-up table can be used to generate the weighting, rounding and clipping attributes to the decoded code.

Following the last iteration, the decoded data is preferably output when the next input data block has been completely loaded into the decoder 1. The global controller 106 counts every bit being loaded into the decoder 1; thus the global controller 106 generates a signal that the entire TPC block has been loaded. Preferably, the signal is directly communicated to the bit cells 10. Alternatively, the signal is directly communicated to each of the respective local controller modules 104. In the preferred embodiment, once the signal is received in the bit cell 10, the contents in the Shadow register 20 are loaded into the bit cell's original data register 18. Also, the signs of each of the bits stored in the SPIN register 12 and original data register 18 from the previous block's last iteration are loaded into the Shadow register 20 of each bit cell 10. The data in the Shadow register 20 of each bit cell 10 is preferably output one row or column at a time. Alternatively, all of the rows and columns are output at the same time. The output from each Shadow register 20 contains the hard decision of the decoded TPC result and the corresponding confidence value of each respective bit from the input data. This hard decision result as well as the corresponding confidence value is output from the decoder 1 to an output module 112 (Figure 2a) which outputs the decoded data.

As stated above, the TPC block includes parity rows and columns which are decoded by the present decoder 1. Thus, the decoder 1 utilizes hyper diagonals or enhanced dimensions to decode the parity bit row and column of the two dimensional encoded block. Preferably, the decoder 1 begins the hyper axis parity calculation by rotating each hyper diagonal of the TPC block into columns. Alternatively, the decoder 1 rotates each hyper axis into rows. As stated above, the soft values in the bit cells 10 are preferably rotated in either or both directions to reduce the number of overhead clocks necessary to set up the enhanced arrays and put the soft bits back into the original bit cell 10 order. Alternatively, the soft values are rotated along the bit cells 10 in only one direction.

As shown and discussed in Figures 5a and 5b, the bits are input into the bit cell array in a shifted order to accommodate the "figure 8" configuration. In order to process the hyper dimension using the "figure 8" configuration, the decoder shifts the values in the bit cells 10 in the shifted order shown in Figure 10 for each row or column.

Figure 10 illustrates the "figure 8" process of shifting the bits in a 4x4 array to decode the hyper dimension. The first row in Figure 10 is shown to have a soft value in each bit cell 0-15, in which bit cell 0 contains the soft value for the first bit in the encoded block. As shown in Figure 10, the row includes groups 1 through 4, whereby group 1 includes bit positions 0-3; group 2 includes bit positions 12-15 in reverse order; group 3 includes bit positions 4-7; and group 4 includes bit positions 8-11 in reverse order. As shown in Figure 10, the bit positions in the first row are positioned as they are loaded into the bit cell 10 array by the input module 108. The bit positions in the second row are shifted to the left by one bit cell, whereby the bit positions that are located at an end of a group are shifted along the lines of the arrows shown. For example, bit '0', previously in group 1, is shifted to the first position in group 2. The bit positions in the third row are shifted to the left by another bit cell (two bit cells in comparison to the first row). For example, bit '1', previously in group 1, is shifted to the first position in group 2. The bit positions in the fourth row are shifted to the left by another bit cell (three bit cells in comparison to the first row). For example, bit '2', previously in group 1, is shifted to the first position in group 2. It should be noted that although the bit positions are shifted to the left by one position, the bit positions may be shifted in any direction along any axis and by any predetermined number of positions.
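The effect of shifting each successive row by one further position can be sketched with a plain cyclic rotation. This is a simplified illustration: it does not model the "figure 8" wiring or the group layout of Figure 10, and the function names are hypothetical.

```python
def rotate_left(row, r):
    """Cyclically rotate a list left by r positions."""
    r %= len(row)
    return row[r:] + row[:r]

def align_hyper_diagonals(block):
    """Rotate row r left by r positions so that each hyper diagonal
    of the square block lines up in a single column."""
    return [rotate_left(row, r) for r, row in enumerate(block)]

def restore(aligned):
    """Undo the alignment by rotating each row back to the right."""
    return [rotate_left(row, -r) for r, row in enumerate(aligned)]

# 4x4 block holding bit positions 0-15 in natural row-major order
block = [[4 * r + c for c in range(4)] for r in range(4)]
aligned = align_hyper_diagonals(block)
columns = [list(col) for col in zip(*aligned)]
print(columns[0])   # the main diagonal of the original block
```

Once a diagonal sits in a column, it can be handed to the same column (y axis) processing used for ordinary iterations, and `restore` returns the values to their original order.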

As stated above, the decoder 1 shifts the bit positions in the rows or columns to align the diagonal into an enhanced dimension. In other words, once the hyper axis is aligned in a column, the enhanced dimension of bit cells 10 is processed as a y axis iteration by each local controller module 104 using the decoding process discussed above. Alternatively, if the columns are shifted instead of the rows, the local controller 104 processes the array as if it is an x axis iteration using the decoding process discussed above. The decoder 1 preferably calculates the parity of the axis using an XOR tree or logic. This process occurs for each hyper diagonal array in the TPC block.

In another embodiment, the decoder 1 of the present invention also supports decoding three-dimensional TPC codes. The decoder 1 of the present invention decodes the three dimensional TPC block by handling or "packing" the three dimensional block into a two dimensional block, whereby the third dimension is folded into a single cell, denoted as a super-cell. The decoder 1 of the present invention thereby decodes the three dimensional block as a two dimensional block. It should be noted that the packing order and the details of decoding the extra dimension are not limited to the examples shown.

Figure 11a illustrates a 4x4x4 three dimensional TPC block 200. It should be noted that other sized three dimensional blocks are contemplated in the present invention. Figure 11b illustrates a corresponding two-dimensional super cell array 202 which comprises several super cells 203 within, whereby each super cell 203 comprises the bits along a predetermined vector in the three dimensional block 200. Specifically, each super cell 203 includes a single bit in the x plane, a single bit in the y plane and an entire vector in the z plane. Alternatively, the super cell 203 includes a single bit in two axes and an entire vector in another axis. In other words, the super cell 203 includes, but is not limited to, a bit in the y and z axes and an entire vector in the x axis and so on. The super cell 203 in the upper left hand corner of the super cell array 202 in Figure 11b contains bit 111, which represents the bit in the x, y and z axes (3-D block 200) as well as bits 112-114 which are the rest of the bits in the vector for the z plane.

As shown in Figure 11a, the number ijk in each cell in the block 200 indicates the position of the bit by x-row, y-column and z-plane, respectively. Thus, as shown in the block in Figure 11a, the array in the z-plane in the upper left corner of the three dimensional block 200 includes bits (S₁₁₁,S₁₁₂,S₁₁₃,S₁₁₄). Regarding the same S₁₁₁-S₁₁₄ bit array in the super cell array 202 (Figure 11b), the 111-114 bit array is packed in a clockwise configuration in the upper left hand corner. Similarly, the array in the lower right hand corner of the three dimensional block 200 includes bits (S₄₄₁,S₄₄₂,S₄₄₃,S₄₄₄), whereby the same S₄₄₁-S₄₄₄ array is shown in the lower right hand corner of the super cell array 202. It should be noted that the super cell 203 is not limited to a clockwise configuration and may have any configuration.

In relation to the decoder 1 of the present invention, each bit shown in Figure 11b represents a bit cell 20, whereby the decoder 1 rotates the bit cells 20 in performing the decoding process on the three dimensional encoded block 200. As discussed above in relation to the two dimensional TPC code, the decoder 1 of the present invention rotates and shifts the soft values of the bits between bit cells 20 in order to find the nearest neighbor values of the code. Such shifts in relation to the three dimensional blocks of encoded data are designated as connection schemes. A few of the connection schemes that the decoder 1 may use within a super-cell array 202 are shown in Figure 12. Specifically, Figure 12 illustrates the Inter-Z Connection 204, the Calculate Connection 206, the Move Connection 208 and the Snake Connection 210 schemes.

Although these connection schemes are discussed in regard to the present invention, the decoder 1 of the present invention is not limited to the actual connection schemes shown. Thus, other connection schemes may be used by the decoder 1. Such other connection schemes, however, must be configured with the local controller module 104 and the global controller module 106 to ensure that the nearest neighbor values are accurately determined for an encoded codeword.

As shown in Figure 12, each box corresponds to a bit cell 20, whereby the arrows represent the order of the rotations between the bit cells 20. Alternatively, the bit cells 20 are shifted in the direction opposite to that shown in Figure 12. In addition, each row or column of bit cells 20 shown in Figure 12 is rotated in a different direction. Although each connection arrow is shown separately for clarity, the actual connection schemes are present within the super-cell itself. As shown in Figure 12, the Inter-Z Connection scheme 204 shifts between the bit cells 20a and 20c in the first column and the bit cells 20b and 20d in the next column. Similarly, the Inter-Z Connection scheme shifts between the bit cells 20a and 20b and the bit cells 20c and 20d in the next row. Alternatively, the Inter-Z Connection scheme 204 shifts between the bit cells 20a and 20c in a column and shifts between bit cells 20a and 20b in a row.

The Calculate Connection scheme 206 as shown in Figure 12 processes bit cells 20a and 20i in a column and skips over the bit cell 20c that adjoins the processed bit cells 20a, 20i in the same column. Similarly, the Calculate Connection scheme 206 processes bit cell 20a in a row and then skips over the bit cell 20b that adjoins the processed bit cell 20e in the same row. The Move Connection scheme 208 allows the decoder 1 to systematically move along the bit cells 20a and 20b in a row. Alternatively, the Move Connection scheme allows the decoder 1 to systematically move along the bit cells 20a and 20c in a column. The Snake Connection scheme 210 systematically processes each of the bit cells in the super cells as shown in Figure 12. Specifically, the decoder 1 utilizes the Snake Connection scheme 210, as shown in Figure 12, to systematically process the bit cells along the z-plane shown in an "S" type configuration. It should be noted that each of the above connection schemes is not limited to only rows and columns and is also usable in the z-planes. Also, it should be noted that the above connection schemes are not limited to the direction of the arrows shown in the figures and discussed in the present specification.

The decoder 1 decodes bit cells having soft bits (S₁₁₁,S₁₂₁,S₁₃₁,S₁₄₁) in the three dimensional block 200 by utilizing the connection schemes in the super cell array 202 to shift and rotate between the bit cells 20. Particularly, the decoder 1 utilizes the Calculate Connection scheme 206 to move along a linear axis of bit cells (S₁₁₁,S₁₂₁,S₁₃₁,S₁₄₁), because the bit cells 20 in the far left column of the three dimensional block 200 are not positioned in a linear row in the super cell array 202. As shown in Figure 11b, the first column in the super cell array includes bit cells (S₁₁₁,S₁₁₃,S₁₂₁,S₁₂₃,S₁₃₁,S₁₃₃,S₁₄₁,S₁₄₃). In order for the decoder 1 to decode the row (S₁₁₁,S₁₂₁,S₁₃₁,S₁₄₁), the decoder 1 performs the Calculate Connection scheme 206 beginning with S₁₁₁ and skips over S₁₁₃ in the super cell array 202 to decode bit cell S₁₂₁. The decoder 1 continues to decode the first column of encoded bits by using the Calculate Connection scheme 206 to move from S₁₂₁ and skip over S₁₂₃ to decode bit cell S₁₃₁. The decoder 1 then moves from S₁₃₁, skips over bit cell S₁₃₃ and decodes bit cell S₁₄₁ by utilizing the Calculate Connection scheme 206. At this point, the first column of the three dimensional block is decoded.
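The skip-over-every-other-cell behavior of the Calculate Connection scheme can be sketched as a stride-2 walk over a packed array. The sketch below is illustrative only: the function names are hypothetical, and the position of each z index inside a 2x2 super cell (the `SUB` table) is an assumption chosen so that the first column reproduces the listing given above; the actual packing of Figure 11b may differ.

```python
N = 4  # 4x4x4 block; a label (i, j, k) stands for bit S_ijk

# Assumed placement of z index k inside a 2x2 super cell: (row, column)
SUB = {1: (0, 0), 2: (0, 1), 3: (1, 0), 4: (1, 1)}

def pack():
    """Pack the 3-D block into an 8x8 bit cell array of 2x2 super cells."""
    grid = [[None] * (2 * N) for _ in range(2 * N)]
    for i in range(1, N + 1):
        for j in range(1, N + 1):
            for k, (dr, dc) in SUB.items():
                grid[2 * (j - 1) + dr][2 * (i - 1) + dc] = (i, j, k)
    return grid

def calculate_connection(cells, start=0, stride=2):
    """Visit every stride-th bit cell, skipping the cells in between."""
    return cells[start::stride]

grid = pack()
first_column = [row[0] for row in grid]
print(first_column)                        # S111, S113, S121, S123, ...
print(calculate_connection(first_column))  # S111, S121, S131, S141
```

Starting the same walk at `start=1` would visit S₁₁₃, S₁₂₃, S₁₃₃ and S₁₄₃, the corresponding vector of the third z-plane under this assumed layout.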

As the first column (S₁₁₁,S₁₂₁,S₁₃₁,S₁₄₁) is processed, the decoder 1 simultaneously processes the other columns along the first z-plane in the block 200. Alternatively, as the first column is processed, the decoder processes any other axis in the block 200. In processing the second left column (S₂₁₁,S₂₂₁,S₂₃₁,S₂₄₁) in the three dimensional block 200, the decoder 1 performs the Calculate Connection scheme 206 to process the bit cells (S₂₁₁,S₂₂₁,S₂₃₁,S₂₄₁) in the super cell array 202 in the same manner as (S₁₁₁,S₁₂₁,S₁₃₁,S₁₄₁). The same simultaneously occurs for columns (S₃₁₁,S₃₂₁,S₃₃₁,S₃₄₁) and (S₄₁₁,S₄₂₁,S₄₃₁,S₄₄₁). Therefore, all of the bit cells 20 processed for the first z-plane are shown below:

z-plane 1: (S₁₁₁,S₁₂₁,S₁₃₁,S₁₄₁)(S₂₁₁,S₂₂₁,S₂₃₁,S₂₄₁)(S₃₁₁,S₃₂₁,S₃₃₁,S₃₄₁)(S₄₁₁,S₄₂₁,S₄₃₁,S₄₄₁)

Alternatively, the decoder 1 processes all of the columns in the first z-plane in a non-simultaneous manner. Although not discussed in detail, the decoder 1 also utilizes the Calculate Connection scheme 206 to decode all of the bit cells in rows in the first z-plane in the block.

Once the bit cells 20 are shifted in the first z-plane, the decoder 1 performs an Inter-Z Connection scheme 204 to shift from the bit cells 20 in the first z-plane to the bit cells 20 in the second z-plane. For the first column in the second z-plane, the decoder 1 performs the Inter-Z Connection scheme 204 to begin the decoding process at bit cell S₁₁₂, which is shown in the second column in the super cell array 202. The decoder 1 then performs the Calculate Connection scheme 206 to skip over every other bit cell 20 in the super cell array 202, as discussed above. The process repeats for bit cells 20 in the third z-plane and the fourth z-plane of the three dimensional block 200. The order in which the bit cells 20 are processed in each z-plane is shown below:

z-plane 2: (S₁₁₂,S₁₂₂,S₁₃₂,S₁₄₂)(S₂₁₂,S₂₂₂,S₂₃₂,S₂₄₂)(S₃₁₂,S₃₂₂,S₃₃₂,S₃₄₂)(S₄₁₂,S₄₂₂,S₄₃₂,S₄₄₂)

z-plane 3: (S₁₁₃,S₁₂₃,S₁₃₃,S₁₄₃)(S₂₁₃,S₂₂₃,S₂₃₃,S₂₄₃)(S₃₁₃,S₃₂₃,S₃₃₃,S₃₄₃)(S₄₁₃,S₄₂₃,S₄₃₃,S₄₄₃)

z-plane 4: (S₁₁₄,S₁₂₄,S₁₃₄,S₁₄₄)(S₂₁₄,S₂₂₄,S₂₃₄,S₂₄₄)(S₃₁₄,S₃₂₄,S₃₃₄,S₃₄₄)(S₄₁₄,S₄₂₄,S₄₃₄,S₄₄₄)

It should be noted that it is not necessary that the decoder 1 process the columns in the particular order discussed above. For instance, the decoder 1 alternatively utilizes the Calculate Connection scheme to process the first z-plane and the last z-plane of the three dimensional block 200. The decoder 1 of the present invention simultaneously decodes these two planes using the Calculate Connection scheme:

z-plane 1: (S₁₁₁,S₁₂₁,S₁₃₁,S₁₄₁)(S₂₁₁,S₂₂₁,S₂₃₁,S₂₄₁)(S₃₁₁,S₃₂₁,S₃₃₁,S₃₄₁)(S₄₁₁,S₄₂₁,S₄₃₁,S₄₄₁)

z-plane 4: (S₁₁₄,S₁₂₄,S₁₃₄,S₁₄₄)(S₂₁₄,S₂₂₄,S₂₃₄,S₂₄₄)(S₃₁₄,S₃₂₄,S₃₃₄,S₃₄₄)(S₄₁₄,S₄₂₄,S₄₃₄,S₄₄₄)

Next, the decoder 1 of the present invention utilizes the Inter-Z Connection scheme to decode the remaining two z-planes by rotating the columns by one position. Specifically, the (S₁₁₁,S₁₂₁,S₁₃₁,S₁₄₁) column is rotated to the right by one position, whereas the (S₁₁₄,S₁₂₄,S₁₃₄,S₁₄₄) column is rotated to the left by one position. The remaining code combinations are then calculated using the Calculate Connection scheme as shown below:

z-plane 2: (S₁₁₂,S₁₂₂,S₁₃₂,S₁₄₂)(S₂₁₂,S₂₂₂,S₂₃₂,S₂₄₂)(S₃₁₂,S₃₂₂,S₃₃₂,S₃₄₂)(S₄₁₂,S₄₂₂,S₄₃₂,S₄₄₂)

z-plane 3: (S₁₁₃,S₁₂₃,S₁₃₃,S₁₄₃)(S₂₁₃,S₂₂₃,S₂₃₃,S₂₄₃)(S₃₁₃,S₃₂₃,S₃₃₃,S₃₄₃)(S₄₁₃,S₄₂₃,S₄₃₃,S₄₄₃)

It should be noted that the decoder 1 of the present invention is not limited to calculating the columns in the order of z-planes shown above. After an iteration of these columns in the z-planes, the decoder 1 performs another rotate of the columns using the Inter-Z Connection scheme to loop back to the originally decoded columns (S₁₁₁,S₁₂₁,S₁₃₁,S₁₄₁) and (S₁₁₄,S₁₂₄,S₁₃₄,S₁₄₄). The decoder 1 then performs the same iteration technique on the rows and the planes of the three dimensional block. The decoder 1 of the present invention uses information that is supplied by each super cell 203 in decoding the three dimensional block in the z-axis. Thus, the super-cell array 202 is constructed to decode the z-dimension of the encoded code directly, whereby the local control module 204 is included within each super-cell 203 (not shown). Alternatively, the local control module 204 is externally configured to each super cell 203.

Alternatively, the decoder 1 decodes the three dimensional block 200 by first decoding a predetermined number of bit cells 20 in each super cell 203. After the decoder 1 decodes the super cell 203, the decoder 1 uses the Move Connection scheme 208 to rotate along the super-cell array 202 and decode another predetermined super cell 203 having the same number of bit cells 20 within. For example, in the super cell array 202 in Figure 11b, the decoder 1 decodes all the bit cells 20 along the z-plane in the fourth column (S₄₁₁,S₄₂₁,S₄₃₁,S₄₄₁) by performing a Snake Connection scheme 210 in the corresponding group of bits in the super cell array 202. Specifically, the decoder 1 decodes the following super cells 203 using the Snake Connection scheme 210 to move within the super cell 203 and the Move Connection scheme 208 to move between the super cells 203:

(S₄₁₁,S₄₁₂,S₄₁₃,S₄₁₄)(S₄₂₁,S₄₂₂,S₄₂₃,S₄₂₄)(S₄₃₁,S₄₃₂,S₄₃₃,S₄₃₄)(S₄₄₁,S₄₄₂,S₄₄₃,S₄₄₄)

To decode all of the bits along the z-plane in the third column (S₃₁₁,S₃₂₁,S₃₃₁,S₃₄₁) in the three dimensional block 200, the decoder 1 performs a Move Connection scheme to the left by two bit column positions, whereby the super cells 203 to be decoded are (S₃₁₁,S₃₁₂,S₃₁₃,S₃₁₄). Thus, the following subsets in the super cell are decoded by the decoder 1 using the Snake Connection scheme 210 to move within the super cell 203 and the Move Connection scheme 208 to move between the super cells 203:

(S₃₁₁,S₃₁₂,S₃₁₃,S₃₁₄)(S₃₂₁,S₃₂₂,S₃₂₃,S₃₂₄)(S₃₃₁,S₃₃₂,S₃₃₃,S₃₃₄)(S₃₄₁,S₃₄₂,S₃₄₃,S₃₄₄)

and so on. Once all of the z-axes have been decoded using the above process, a final Move Connection scheme 208 is performed and the decoder 1 rotates to the left once more and ends up back at its original position (S₄₁₁,S₄₁₂,S₄₁₃,S₄₁₄). It should be noted that the Move Connection scheme 208 can be performed such that the decoder 1 moves along the bit cells 20 in the super cell array 202 in any direction by any number of rows or columns.

Although the super cell array 202 in the above example is a square block, the super cell array 202 alternatively has a non-square dimension. In the case of a non-square super cell array 202, the connection schemes are configured to skip over more than one bit cell in a row, column or plane to uniformly decode the row, column or plane. However, it should be noted that other connection scheme configurations, including partial rotates, left- and right-rotates and further parallelization in either direction are alternatively utilized. Further, the decoder 1 would properly be configured such that the nearest neighbor computation would not become deficient.

As stated above with respect to the two dimensional code, the three dimensional TPC code also includes forward error correction capabilities and a parity check plane. The decoder 1 of the present invention performs a hyper decode along an enhanced dimension in the three dimensional TPC block to properly decode the codeword. The enhanced dimension for a three-dimensional TPC code can be decoded due to the symmetry inherent in the diagonal shown in Figure 13a. For example, the 4x4x4 TPC block 200 in Figure 11a and corresponding super cell array 202 in Figure 13a has an enhanced super cell diagonal which includes super cells 209, 215, 216 and 218. Within each super cell 203 in the enhanced diagonal are the individual bit cells 20 which are designated as different shades. The shades of bit cells 20 represent the bits which are aligned along a diagonal axis in the three dimensional block 200 (Figure 11a). As stated above, the enhanced dimension is not limited to a diagonal variation.

Figure 13a illustrates the first four diagonal enhanced code groupings or enhanced bit cells for each bit in the z-plane for the first column of bits (S₁₁₁,S₁₁₂,S₁₁₃,S₁₁₄) in accordance with the present invention. The first group 209, represented in black in Figure 13a, includes bit cells (S₁₁₁,S₂₂₂,S₃₃₃,S₄₄₄); the second group 215, represented in gray, includes bit cells (S₁₁₂,S₂₂₃,S₃₃₄,S₄₄₁); the third group 216, represented in white, includes bit cells (S₁₁₃,S₂₂₄,S₃₃₁,S₄₄₂); and the fourth group 218, represented in gray stripes, includes bit cells (S₁₁₄,S₂₂₁,S₃₃₂,S₄₄₃). In the corresponding three dimensional TPC block 200, the bit cells in the grouping above are positioned along the diagonal in the z-plane (not shown).

To decode the enhanced dimension of the three dimensional TPC block 200, the decoder 1 performs two steps. One step includes rotating the enhanced bit cells 20 within each super cell 203 such that the enhanced bit cells 20 are aligned in the same corresponding bit positions within each super cell 203. In addition, the decoder 1 also rotates each of the entire super cells a certain number of positions in the super cell array 202 to align the super cells 203 along one axis. For example, the bit cells 20 in the second super cell 215 are rotated by one bit position using the Snake Connection scheme 210 such that the bit cells 20 in the second group 215 are aligned with the bit cells 20 in the first group 209, as shown in Figure 13b. Similarly, the bit cells 20 in the third group 216 are rotated by two bit positions using the Snake Connection scheme 210 such that the bit cells 20 in the third super cell 216 are aligned with the bit cells 20 in the first super cell 209 and second super cell 215. The same process is performed on the fourth super cell 218. Once all the bit cells 20 within each of the super cells 203 are rotated, the super cells will be aligned along the hyper diagonal, as shown in Figure 13b.
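The first of the two steps, rotating the z-vector inside each diagonal super cell, can be sketched as below. The sketch treats each super cell as a simple list of its four z-plane bits and models the Snake Connection rotation as a cyclic left rotation; both are simplifying assumptions, and the function names are hypothetical.

```python
N = 4  # 4x4x4 block; a label (i, j, k) stands for bit S_ijk

def enhanced_group(g):
    """Bits of the g-th diagonal grouping (g = 0..3), e.g. g = 1 gives
    S112, S223, S334, S441."""
    return [(m, m, (m - 1 + g) % N + 1) for m in range(1, N + 1)]

def rotate_left(vec, r):
    """Cyclically rotate a list left by r positions."""
    r %= len(vec)
    return vec[r:] + vec[:r]

# z-vectors of the four super cells on the diagonal (S11k, S22k, S33k, S44k)
diagonal_cells = [[(m, m, k) for k in range(1, N + 1)] for m in range(1, N + 1)]

# Rotate the second cell by one position, the third by two, the fourth by three
aligned = [rotate_left(cell, idx) for idx, cell in enumerate(diagonal_cells)]

# After the rotation, position p of every super cell holds a bit of group p
for p in range(N):
    print(p, [cell[p] for cell in aligned])
```

After this step, the second step only needs to move whole super cells into one column or row, since matching positions inside the cells already belong to the same grouping.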

Once the individual bit cells 20 are lined up within each super cell 203, the decoder 1 performs the Move Connection scheme 208 to rotate each super cell 203 by an appropriate number of positions to line up the super cells 203 in one column, as shown in Figure 13c. Alternatively, the super cells 203 are rotated an appropriate number of positions to line up the super cells 203 in rows. As stated above, each super cell 203 can be rotated in any direction by any number of bits, such that the super cells 203 are lined up in a column or row. It should be noted that the decoder 1 may alternatively align the super cells 203 into a column or row first and then perform the Snake Connection scheme to align the individual bit cells 20 within the super cells 203. Once the super cells 203 are aligned in a column or row, as shown in Figure 13c, a component decode is performed by the decoder 1 using the process described above. The decoding process for a three dimensional code performs a component decode on the soft value in each bit cell 20 located in a particular row or column in the super cell array 202. The first iteration is performed on a column, whereby the next iteration is performed on the row and so on. Alternatively, the first iteration is performed on a row, whereby the next iteration is performed on the column and so on. Alternatively, more than one column or row is iteratively decoded at a particular time or simultaneously.

Figure 16 illustrates a block diagram of the decoder in accordance with the present invention.

In general, the decoder 2 decodes the encoded three dimensional TPC data in six overall steps. First, the encoded data is input into the input module 222 (Figure 16), whereby the input module 222 provides the encoded bits to the SISO array of the decoder 2 (Figure 16). The decoder 2 then calculates the syndrome and parity values for each encoded bit and then finds the nearest neighbor code set. Next, the decoder 2 finds the minimum difference metric values and calculates the Log-Likelihood Ratio (LLR) extrinsic values. The LLR extrinsic values are then output from the output module 224 (Figure 16). Each of these steps will now be discussed in detail.

Figure 14 illustrates a block diagram of the super cell 203 in accordance with the present invention. For three dimensional TPC codes, the decoder 2 is configured to include an array of SISO tiles or super cell arrays 202, whereby each super cell 203 includes an array of bit cells 20 within. The bit cells 20 in each super cell 203 are of two types, designated as k bit cells 22 and j bit cells 24, as shown in Figure 14. In addition, each super cell 203 includes two types of local controller modules that are coupled to the bit cells 20. As stated above, each encoded data vector is decoded in local calculation loops, whereby each loop is controlled by the respective local controller module. One local controller module, designated as the xz-controller 230, controls the x axis and z axis iterations in the TPC block. In addition, the other local controller module, designated as the yh-controller 232, controls the y axis and hyper axis iterations in the TPC block.

Each local controller module 230, 232 collects the local results from its respective group of local bit cells 22, 24. As will be discussed below, all of the local vector results from the group of local bit cells 22, 24 are processed to find a common global result.

As stated above, a super cell 203 is a single bit cell in the x plane, a single bit cell in the y plane, and an entire vector of bit cells in the z plane. For example, as shown in Figure 14, the super cell 203 shown includes 32 bit cells. 8 of the bit cells shown in Figure 14 are j bit cells 24, which are full bit cells, whereas 24 of the bit cells are k bit cells 22, which are shift only bit cells. It should be noted that although the super cell is shown to include 32 bit cells, any other number of bit cells can be included.

In the example shown in Figure 14, the 8 j bit cells 24 in the super cell 203 represent that the 8 encoded data vectors received within the super cell 203 are processed in parallel in the x, y, and hyper axes of the TPC block. As shown in Figure 14, there are 4 z-planes, each of which includes 8 bit cells within, so that there are 32 z-plane bit cells altogether in the super cell 203. As shown in Figure 15, each SISO tile 202 includes 32 super cells 203, whereby there are 8 super cells 203 in one axis and 4 super cells in the other axis. As shown in Figure 16, there are 8x8 or 64 SISO tiles 202 in the decoder 2. Therefore, there are 32 super cells across the x axis for a given encoded codeword. In addition, since there are 64 SISO tiles in the decoder 2 shown in Figure 16, there are approximately 16,000 (8 x 32 x 64 = 16,384) total j bit cells 24, each of which processes a vector in parallel across the decoder 2 for the x, y, and hyper dimensions.
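The cell counts quoted above follow from simple multiplication, which the short calculation below reproduces. The constant names are chosen for illustration; the per-super-cell and per-tile figures are those stated for Figures 14 through 16.

```python
# Figures stated in the description of Figures 14-16
J_CELLS_PER_SUPER_CELL = 8       # full bit cells
K_CELLS_PER_SUPER_CELL = 24      # shift-only bit cells
SUPER_CELLS_PER_TILE = 8 * 4     # 8 in one axis, 4 in the other
TILES = 8 * 8                    # SISO tiles in the decoder

bit_cells_per_super_cell = J_CELLS_PER_SUPER_CELL + K_CELLS_PER_SUPER_CELL
super_cells_across_x = 4 * 8     # 4 super cells per tile times 8 tiles
total_j_cells = J_CELLS_PER_SUPER_CELL * SUPER_CELLS_PER_TILE * TILES
total_bit_cells = bit_cells_per_super_cell * SUPER_CELLS_PER_TILE * TILES

print(bit_cells_per_super_cell)  # 32
print(super_cells_across_x)      # 32
print(total_j_cells)             # 16384, i.e. roughly 16,000
print(total_bit_cells)           # 65536
```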

The super cells 203, local controller modules 230, 232, and global controller module 220 together form a SISO tile 202. As shown in Figure 15, the SISO tile 202 includes an 8x4 array of super cells 203. However, the SISO tile 202 alternatively includes any desired sized super cell 203 array. As shown in Figure 15, the SISO tile 202 includes 64 yh-controllers 232 and 32 xz-controllers 230. The global controller module 220, shown in the upper left hand corner of the SISO tile 202 in Figure 15, is used to control the timing of all the local controller modules.

Figure 17 illustrates a block diagram of the j bit cell 24 and k bit cell 22 according to the decoder of the present invention. Each j bit cell 24 and k bit cell 22 includes a SPIN register 32, a previous result register 36, an EXT1 register 42, an original data register 38 and a Shadow register 40. As shown in Figure 17, the j bit cell 24 also includes a difference metric register 34. Alternatively, the j and k bit cells 24, 22 include other components, including, but not limited to, RAMs, latches and other registers.

In the present example, data is loaded into the input module 222 of the decoder 2 at 16 soft values per clock. Alternatively, data is loaded into the input module 222 at any other number of soft values per clock. The input module 222 shifts the 16 soft values across the top of each SISO tile in the decoder 2 until all 64 soft values are loaded into the super cells 203. The soft values are loaded in parallel into the Shadow registers 40 of each first bit cell. Each Shadow register 40 contains one soft value, although more than one soft value is alternatively included within each Shadow register 40.

As with two dimensional codes, the data is processed using the "figure 8" configuration during the decoding process. Thus, each soft value is loaded into each SISO tile 202 in a shifted order (Figure 5b). Alternatively, each soft value is loaded into each SISO tile 202 in a natural input order. Either method utilizes a Galois register 234, such that the Galois register 234 is loaded with the appropriate values for the corresponding bit positions. The Galois register 234 is preferably located within the same local controller module 230 performing the decoding process, although a Galois register 234 may be located in a local controller module which only controls the rows of the bit cell arrays and a local controller module which only controls the columns of the bit cell arrays. However, the Galois register 234 is shown outside of the local controller module 230 for exemplary purposes. As stated above, the shifted order (Figure 5b) ensures that the decoder 2 decodes the input bits in the proper order. In order to process the hyper dimension using the "figure 8" configuration, the decoder shifts the values in the bit cells 10 in the shifted order shown in Figure 10 for each row or column in the super cell 203.

As the input module 222 receives another 64 soft values and begins loading the soft values into each SISO tile 202, the Shadow register 40 in each of the first bit cells 20 in the first x row, designated in Figure 18 as '0', shifts its contents to the Shadow register in the first bit cell in the second x row. At the same time, the input module 222 shifts its contents into the Shadow register 40 in the first bit cell in the first x row. This process continues until all of the SISO tiles 202 have been filled.

As shown in Figure 18, the data is input from the input module 222 into the decoder 2 one row at a time until all of the j bit cells 24 receive the bits. A signal sent either from the global controller module or local controller module causes the super cells 203 to shift the data using the Snake Connection scheme 210 (Figure 12). Thus, the bits are shifted from the j bit cells 24 to the k bit cells 22 in a particular super cell 203. In addition, the last row of k bit cells 22 in a super cell 203 shifts its data into the j bit cells 24 of the next super cell 203 in the SISO tile 202. This parallel load fills the second z plane, whereby new data fills the first z plane, one x-row at a time. Alternatively, the first z plane is filled wherein all the x-rows are filled at one time. This process continues until all z planes are full and the complete data block is loaded.

As the soft values are loaded into the input module 222 and passed to the super cells 203, the global controller module 220 counts each bit until the entire block of data has been loaded. The global controller module 220 then communicates to the xz-controller 230 that the entire block of encoded data has been loaded. The xz-controller 230 then communicates a signal to the super cells 203 to initiate the decoding process on the loaded data. Once the super cells 203 receive the signal from their respective local controller, the contents in the shadow registers 40 are loaded into the original data registers 38 in each bit cell 20.

If the decoder 2 is not decoding the bits during the first iteration, the signal sent by the xz-controller 230 transfers the previous block result from the SPIN register 32 of a previous bit cell 20 into the Shadow register 40 of a following bit cell 20. Thus, the Shadow register 40 in the first bit cell "0" (Figure 18) of the first row receives data from the input module 222 if the signal is sent by the xz-controller 230. Otherwise, the first bit cell, designated as '0' in Figure 18, in the first row of the SISO tile 202 receives the data from the Shadow register 40 of the 32nd bit cell in the SISO tile 202, which is shown in Figure 18 as bit cell "31". This is described in further detail below.

Next, the decoder 2 calculates the syndromes and parity values for bits in the encoded codeword. The syndrome calculation determines if the current codeword is a valid codeword. If the result of the syndrome calculation is '0', then the codeword is valid. However, if the result of the syndrome calculation is not '0', then the syndrome result is the Galois address of the error bit.

Figure 19 illustrates a block diagram of the decoder 2 of the present invention in decoding a three dimensional TPC block. As shown in Figure 19, the xz-controller 230 includes a syndrome module 236, a parity module 238, a min1 module 240, a min2 module 242, a lowi1 module 244 and a lowi2 module 246. The syndrome cycle starts when the global controller module 220 (Figure 15) sends a start command to the local controller modules 230, 232, whereby each local controller module 230, 232 sends the start command to the super cells 203 in its assigned SISO tile 202. The syndrome and parity values are calculated locally inside each local controller module 230, 232. In the first axis iteration, the contents in the original data register 38 of each j bit cell 24 are copied into its respective SPIN register 32, as shown in Figure 19.

The syndrome is calculated by summing the Galois addresses, held in the Galois address registers 234, of each '1' value in the data vector. The contents of the SPIN register 32 are examined to check if the sign of the input value is '1'. In addition, when processing the x axis, the yh-controller 232 rotates the value in the Galois address register 234, as shown in Figure 19. As shown in Figure 19, if the sign of the input value in the SPIN register 32 is '1', the Galois address of the current bit is XORed with the current value which is stored in the syndrome module 236. After all of the soft values in the local loop of j bit cells 24 are processed, the global controller 220 XORs the local results in each xz-controller 230 with the local results in all of the other xz-controllers to get the final syndrome result. In the example decoder 2 shown in Figure 16, there are eight SISO tiles 202 in the x axis. Thus, the decoder 2 shown in Figure 16 has eight xz-controllers 230 which communicate with one another to find the final syndrome result.
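
The syndrome accumulation described above can be sketched in software as follows. This is only an illustrative model, not the hardware of Figure 19: it assumes that Galois addresses can be represented as small integers, and the function names local_syndrome and combine_syndromes are hypothetical.

```python
from functools import reduce
from operator import xor


def local_syndrome(sign_bits, galois_addresses):
    """XOR together the Galois address of every bit whose sign is 1.

    Models one local loop of bit cells feeding one syndrome module.
    """
    syndrome = 0
    for sign, address in zip(sign_bits, galois_addresses):
        if sign == 1:
            syndrome ^= address
    return syndrome


def combine_syndromes(local_results):
    """XOR the per-controller local results into the final syndrome."""
    return reduce(xor, local_results, 0)


# Example: an 8-bit vector split across two local loops of four cells each.
# Integer addresses 0..7 are an assumption made for illustration.
signs = [0, 1, 1, 0, 0, 0, 1, 0]
addresses = list(range(8))
tile_a = local_syndrome(signs[:4], addresses[:4])
tile_b = local_syndrome(signs[4:], addresses[4:])
final_syndrome = combine_syndromes([tile_a, tile_b])
```

Because XOR is associative and commutative, combining the local results gives the same value as one pass over the whole vector, which is why the controllers can work on their local loops independently.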

The local parity value for each soft value is calculated in parallel with the syndrome calculation as shown in Figure 19. Alternatively, the local parity value for each soft value is calculated in a non-parallel fashion with the syndrome calculation. In calculating the parity of the soft values, the sign of each value in the SPIN register 32 of each j bit cell 24 is XORed with the value already in the parity module 238 and the result is accumulated in the parity module 238. This process repeats for each iteration until a final parity result is determined. The final parity result, once determined, is then stored in all xz-controllers 230 in the decoder 2.

The decoder 2 also finds the two lowest input bit confidences and their respective Galois addresses by using the algorithm described above. As shown in Figure 19, the xz-controller 230 holds the local minimum value in the lowi1 module 244 and the local second lowest value in the lowi2 module 246. In addition, the xz-controller 230 stores the location of the lowi1 value, which is designated as min1, in the min1 module 240. Similarly, the xz-controller 230 stores the location of the lowi2 value, which is designated as min2, in the min2 module 242. The two local minimum values are compared with other local minimum values in other j bit cells 24 to generate two overall minimum lowi1 and lowi2 values as well as their respective locations, mini1 and mini2. In the case where several locations each have equal minimum values for a particular row or column, the xz-controller 230 stores the minimum values with their respective minimum Galois addresses in a RAM (not shown). Alternatively, the xz-controller 230 stores the minimum values and their respective minimum Galois addresses within any of the modules. These stored values can then be retrieved for later calculations.
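
The search for the two lowest confidences and their locations can be sketched as a single streaming pass. This is a minimal sketch under assumptions: confidences and Galois addresses are modeled as integers, ties are resolved in favor of the lower Galois address (one reading of the tie handling described above), and the function name two_minimum_confidences is hypothetical.

```python
def two_minimum_confidences(confidences, galois_addresses):
    """Return ((lowest, its_address), (second_lowest, its_address)).

    Entries are compared as (confidence, address) tuples, so equal
    confidences are ordered by the lower Galois address.
    """
    first = None   # plays the role of the lowest value and its location
    second = None  # plays the role of the second lowest value and its location
    for entry in zip(confidences, galois_addresses):
        if first is None or entry < first:
            first, second = entry, first
        elif second is None or entry < second:
            second = entry
    return first, second


# Example with a tie on the lowest confidence value.
lowest, second_lowest = two_minimum_confidences([5, 2, 7, 2, 9], [0, 1, 2, 3, 4])
```

Local results from several controllers can be merged by running the same function over the concatenated local pairs.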

The results stored in the syndrome module 236, parity module 238, min1 module 240 and min2 module 242 are held through the entire axis iteration. There are four possible outcomes from this syndrome/parity calculation:

      syndrome   parity
A)       0          0
B)       0          1
C)       1          0
D)       1          1

The results of the syndrome and parity calculations are used to determine whether or not the vector is valid and can be used as a center codeword. If the vector is not valid, the decoder 2 uses the syndrome and parity results to correct the vector to a valid center codeword. A valid center codeword allows the decoder 2 to find the Galois addresses of the two common bits that can be used to find a set of nearest neighbor codewords.

In case A), the syndrome result is 0 and the parity result is 0. Thus, the codeword is valid and there are no errors. The two common bits are located in the min1 and min2 locations. In case B), the syndrome result is 0 and the parity result is 1. Thus, the syndrome result is valid, although the parity result is invalid. Therefore, there is a parity error and the address of the parity bit, where the Galois address is 0, is placed in the min1 module 240. In addition, the address previously in the min1 module 240 is now placed in the min2 module 242. In case C), the syndrome result is 1 and the parity result is 0. Therefore, the syndrome result is invalid, although the parity bit is valid. In this case, there are two errors, so the Galois address stored in the syndrome module 236 is placed in the min1 module 240. In addition, the parity bit, where the Galois address is zero, is placed in the min2 module 242. In case D), the syndrome result is 1 and the parity result is 1. Therefore, both the syndrome and parity values are invalid. Since there is one error in the syndrome, the Galois address stored in the syndrome module 236 is placed in the min1 module 240, whereas the prior location in the min1 module 240 is placed in the min2 module 242.
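
The four cases can be summarized as a small selection function. The sketch below is illustrative only: the name select_common_bits is hypothetical, the arguments min1 and min2 stand for the first and second minimum locations held in modules 240 and 242, and a syndrome result of 1 in the table is treated as any nonzero syndrome, which is an interpretation based on the statement that a nonzero syndrome is the Galois address of the error bit.

```python
PARITY_BIT_ADDRESS = 0  # the parity bit is the bit whose Galois address is 0


def select_common_bits(syndrome, parity, min1, min2):
    """Return the updated (min1, min2) locations of the two common bits."""
    if syndrome == 0 and parity == 0:
        # Case A: valid codeword, keep the two minimum-confidence locations.
        return min1, min2
    if syndrome == 0:
        # Case B: parity error only, the parity bit becomes min1.
        return PARITY_BIT_ADDRESS, min1
    if parity == 0:
        # Case C: two errors, the syndrome address and the parity bit.
        return syndrome, PARITY_BIT_ADDRESS
    # Case D: one error at the syndrome address, old min1 moves to min2.
    return syndrome, min1


# Example: syndrome 5 with a parity error (case D), previous minima at 3 and 6.
updated = select_common_bits(5, 1, 3, 6)
```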

Once the syndrome values have been calculated, the decoder 2 of the present invention finds the nearest neighbor code set for the vector. The locations stored in the min1 module 240 and min2 module 242 are the locations of the common Hamming code bits. Hamming code bits have a weight of four, although any other weight is contemplated. In this embodiment, the Hamming code bits are such that the nearest neighbor codewords are set as min1, min2, and two other floating bits that are asserted. The two floating bits are determined by placing a '1' value in the min1 and min2 locations. In addition, a third '1' is marched through every location in the codeword, whereby the syndrome is used to locate the fourth bit that is a nearest neighbor. The algorithm used by the decoder 2 is as follows:

syndrome = min1 XOR min2 XOR Galois Address

As stated above, the syndrome is calculated by rotating the SPIN register 32 and the Galois Address in the Galois Address Register 234 through the entire vector of the codeword. The yh-controller 232 rotates the Galois Address in the Galois Registers 234 with the SPIN registers 32 from each of the bit cells. In addition, each yh-controller 104 includes a fixed or constant Galois Address module 235 which is used on each respective local j bit cell 24. Each j bit cell 24 shown in Figure 19 is coupled to and communicates with the Galois Address register 234 as well as the constant Galois Address module 235. As shown in Figure 19, the xz-controller 230 XORs the min1 value in the min1 module 240 with the min2 value in the min2 module 242 and drives the results across all of the j bit cells 24 in the SISO tile 202. Since each of the xz-controllers 230 in a local loop calculates the min1 and min2 values the same way, each j bit cell 24 in the vector receives the same min1 XOR min2 value result. Alternatively, each bit cell 228 receives a different min1 XOR min2 value result for different xz-controllers 230.

In determining the nearest neighbor bit for the codeword, the decoder 2 XORs the rotating value in the Galois register 234 with the value in the fixed Galois Address module 235 at Galois multiplexer 250. The XORed Galois result is then sent to comparator 252 of each j bit cell 24, whereby the XORed Galois result is compared to the XORed min value. If this XORed Galois result value is equal to the XORed min value, then the nearest neighbor value is found. When each j bit cell 24 detects this condition, a load_dm signal is produced. When the load_dm signal is produced, the value in the SPIN register 32 is stored in the difference metric register 34, whereby the difference metric register 34 stores the nearest neighbor hard decision bit as well as the corresponding confidence values of the bit.
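
The comparison performed in each bit cell can be modeled in a few lines. This is a hedged sketch rather than the circuit of Figure 19: Galois addresses are assumed to be integers, the function names nearest_neighbor_partner and load_dm_asserted are hypothetical, and min1 and min2 stand for the locations held in modules 240 and 242.

```python
def nearest_neighbor_partner(min1, min2, galois_address):
    """Address of the fourth bit paired with the bit at galois_address."""
    return min1 ^ min2 ^ galois_address


def load_dm_asserted(rotating_address, fixed_address, min1, min2):
    """True when the rotating address is the partner of the cell's fixed address.

    Mirrors the comparator: (rotating XOR fixed) is compared with (min1 XOR min2).
    """
    return (rotating_address ^ fixed_address) == (min1 ^ min2)


# Example: common bits at addresses 3 and 5, bit cell with fixed address 1.
partner = nearest_neighbor_partner(3, 5, 1)
matches = [r for r in range(8) if load_dm_asserted(r, 1, 3, 5)]
```

For a given cell, exactly one rotating address satisfies the comparison, so the signal fires once per full rotation of the address register.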

After the nearest neighbor value is determined, the decoder 2 of the present invention finds the minimum difference metric values of the codeword. The minimum difference metric values are determined by using the hard decision value of the nearest neighbor bit and the corresponding confidence value to find the sum of the minimum confidences that are stored in the SPIN register 32 and difference metric register 34. The SPIN register 32 and difference metric register 34 are shifted between j bit cells 24 in a loop during local operations to find the sums that are utilized by the xz-controller 230 as shown in Figure 20.

Each sum that is determined from the values in the SPIN register 32 and the difference metric register 34 is based on the results of the syndrome and parity calculations. In the cases B) and D) from above, there is one error based on the syndrome and/or parity bit, whereby the confidence value from the Galois address register 234 stored in the min1 module 240 is negated before the sum is calculated. In case C), there are two errors such that the confidences from the Galois address register 234 stored in min1 and min2 are both negated before the sum is calculated.

Figure 20 illustrates a more detailed block diagram of the xz-controller 230 in accordance with the present invention. The xz-controller 230 shown in Figure 21 compares each sum result and stores the two smallest difference metric values as well as the locations that created the two smallest difference metric values. The smallest difference metric value is stored in the lowi1 register 244 and the locations of the values which generate the smallest difference metric value are stored in a newmin1 register 254 and a newmin2 module 256. In addition, the second lowest sum is stored in the lowi2 module 246. One of the locations, stored in the newmin1 module 254, is the rotating Galois address received from the Galois address register 234. The location stored in the newmin2 module 256 is calculated by XORing the value in the min1 module 240 with the value in the min2 module 242 and the Galois address location, as shown in Figure 21.

The sign of the value in the DM module 34 changes during this operation to indicate whether the computed LLR value (discussed below) is to be negated. Although the actual value of the difference metric does not change when the sign is changed, the sign of the difference metric depends on the cases described above in the step of determining the syndrome. In case A) described above, the value in the min1 module 240 is XORed with the sign of the data in the SPIN register 32. However, for the other cases (B, C or D) in the syndrome step, the value in the min2 module 242 is XORed with the sign in the SPIN register 32.

As stated above, the contents previously stored in the lowi1 module 244 and lowi2 module 246 are the minimum confidence values that are used to find the minimum confidences in the syndrome step. Once the Galois address locations of the minimum confidence values are determined, the contents in the lowi1 module 244 and the lowi2 module 246 are not needed and can be overwritten.

Each difference metric value has an equal difference metric value in its corresponding pair location. Thus, the Galois address associated with the smallest difference metric value is stored in the newmin1 module 254 and the corresponding pair's address is stored in the newmin2 module 256. In searching for the minimum difference metric value and the second lowest difference metric value, the xz-controller 230 disregards the pair's value, because the pair's value is equal to the stored minimum difference metric value. In the case where there are equal minimum difference metric values, the minimum difference metric is replaced when a new difference metric value is equal to or smaller than the difference metric value stored in the newmin1 register 254 or newmin2 register 256, as shown in Figure 20.

The minimum difference metric values in the j bit cells 24 in each local loop are then compared with the minimum difference metric values in the other local loops of j bit cells 24. After all the local loops are compared, the decoder 2 generates the two overall minimum difference metric values, which are placed in the lowi1 module 244 and lowi2 module 246 of each xz-controller 230. In addition, the decoder 2 stores the locations that generate the two overall minimum difference metric values in the newmin1 register 254 and newmin2 register 256 in each xz-controller 230. In the case where there are several equal minimum difference metric values, the values having the minimum Galois address are stored in the newmin1 module 254 and newmin2 module 256. Since there are 8 SISO tiles in the present example bit array (Figure 16), all of the xz-controllers 230 will find the same result after 8 cycles.

After the minimum difference metric values are determined, the decoder 2 of the present invention calculates the extrinsic Log-Likelihood Ratio (LLR) value for each bit in the vector. The extrinsic LLR value for the input bit, but not including the common bits, is calculated by subtracting the difference metric of the center codeword from the confidence value of the current bit pair in the difference metric register 34. In addition, for all common bits, the extrinsic LLR value for the input bit is calculated by subtracting the difference metric value in the lowi2 module 246 from the confidence value of the current bit pair, wherein the difference metric of the bit is the sum of the confidence value for the current bit and the confidence value of the current bit pair. The LLR result is negated based on a sign provided by the difference metric register 34. In particular, the LLR value is negated if the current bit pair confidence has its corresponding Galois address at neither the mini1 nor the mini2 location. In addition, the current bit pair confidence value will be negated if the Galois address is the mini2 location but has a syndrome or parity error, as discussed above. Similarly, the current bit pair confidence value will be negated if the Galois address is the mini1 location and has a syndrome error but no parity error, as discussed above.

Figure 20 illustrates a block diagram of the LLR value calculation process in the module in accordance with the present invention. The LLR calculation process begins by the decoder 2 copying the contents in the original data register 38 into the SPIN register 32. The data in the original data register 38 is used later in the process to generate the input for the next axis iteration. In the last iteration, however, the data in the original data register 38 is not copied to the SPIN register 32. In the last iteration, the final result is calculated as the sum of the input data bit with the extrinsic LLR value from every axis. If the data from the original data register 38 is not copied to the SPIN register 32, the SPIN register 32 will contain the original input data summed with every extrinsic LLR value except the current axis result. Thus, the final result is calculated by summing the current axis result with the contents in the SPIN register 32.

The pair of hard decision and confidence values for each of the bits are then stored in the respective difference metric register 34. The difference metric register 34 and the SPIN register 32 are shifted by the xz-controller 230, so that the LLR extrinsic values can be calculated from the hard decision and soft confidence information in every j bit cell 24 in the local loop. The LLR extrinsic value result is then stored in the SPIN register 32. The LLR extrinsic value result is multiplied by a feedback constant that is supplied by the global controller module 220 and implemented in a lookup table for a final extrinsic axis result which is stored in the min1 module 240. The data is shifted between j bit cells 24 in the local loop on every other clock to allow time for the bit cells to process the input for the next axis iteration.

For every iteration except the last iteration, the LLR extrinsic result is summed with the contents in the SPIN register 32 and the previous result register 36 to generate the input for the next axis iteration. The result of this sum is stored in the SPIN register 32 and the weighted LLR extrinsic result is written to the previous result register 36 in the same step. However, in the last iteration, the LLR extrinsic result is summed with the contents stored in the SPIN register 32 and the result of this sum is stored back into the SPIN register 32 to be output.

Following the last iteration, the decoded data is output when the new input data block has been completely loaded into the decoder 2. The global controller 220 counts every bit being loaded into the decoder 2 and generates a signal when the entire TPC block has been loaded in the decoder 2. The signal is directly communicated to the bit cells 20. Alternatively, this signal is received in each of the bit cells 20 from each respective xz-controller 230. Once the signal is received in the bit cell 20, the contents in the Shadow register 40 are loaded into the original data register 38, as shown in Figure 22. Also, the sign of the contents from the previous block's final results which are in the SPIN register 32 and original data register 38 are loaded into the Shadow register 40 in the bit cell of the next super cell, as shown by arrow 3 in Figure 22. In other words, as shown in Figure 22, the value in the Shadow register 40a of the first bit cell 20a is sent to the Shadow register 40b of the second bit cell 20b. In addition, the value previously in the Shadow register 40b of the second bit cell 20b is sent to the Shadow register (not shown) of the third bit cell (not shown) and so on. The data from the Shadow registers 40 of each of the bit cells 20 are output one row or column at a time. Alternatively, each of the rows and columns are output at the same time.

As shown in Figure 22, the Shadow register 40a in the first bit cell 20a receives data either from the input module 222 or from the last bit cell 20 in a previous super cell 203 depending on whether the data input is from the first row. When an unload signal is asserted by the global controller 220, data is unloaded as 64 bit pairs from the Shadow registers 40 of the last bit cell in every super cell 203 that is in the last x row of the SISO tile 202. The decoder 2 unloads 64 bit pairs, because the x dimension of the code is 64 bits wide (and because the input is loading 64 soft values).

The output of the Shadow register 40 contains the hard decision of the TPC result and the sign of the input data. The TPC result and the sign of the input data are output from the SISO tile 202 for the output module 224 to process. If the data pair is not equal, the output module 224 utilizes a counter which incrementally generates the corrections count. When the input module 222 loads another 64 soft values, the output module 224 shifts out 64 bit pairs from the Shadow register 40 of the first bit cell 20 in the last x row of the SISO tile 202. The Shadow register 40 in the second to the last x row of each super cell 203 shifts its contents into the Shadow register 40 of the first bit cell "0" in the last row of the super cells 203, as shown in Figure 23. This process continues until the complete z-plane of bit cells 20 is unloaded. When the z-plane of bit cells is unloaded, the global controller 220 generates a signal that shifts the data in the Shadow register 40 of every first bit cell, designated as "0" in Figure 23, to the Shadow register 40 of the next bit cell, designated as "1" in Figure 23, within each super cell 203. In addition, bit cell 1 shifts its contents to bit cell 2 and so on.

The signal from the global controller also shifts the data from the last bit cells 20, designated as "31" in Figure 23, in the last row to the Shadow registers 40 of the first bit cell 0 in the first row of super cells 203 in the next SISO tile 202, designated as "0". This shift opens the first row of bit cells 20 for the current input data and fills all the other rows of bit cells 20 with data to be output from bit cell 31 in the previous SISO tile row. In addition, the contents in the Shadow registers 40 in bit cell "0" and bit cell "31" are output to the output module 224. This continues until all planes are unloaded.
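
The corrections count kept by the output module during unloading can be illustrated with a short sketch. It assumes each unloaded bit pair is modeled as a tuple of the decoded hard decision and the sign of the original input; the function name count_corrections is hypothetical.

```python
def count_corrections(bit_pairs):
    """Count the pairs whose decoded bit differs from the received sign bit."""
    return sum(1 for decoded, received in bit_pairs if decoded != received)


# Example: four unloaded pairs, two of which were changed by the decoder.
corrections = count_corrections([(0, 0), (1, 0), (1, 1), (0, 1)])
```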

The z-axis parity calculation of the three dimensional TPC block 200 is unique because the entire z vector is available in a single super cell 203. The xz-controller 230 receives data from the SPIN register 32 from every j bit cell 24 in the neighboring super cell 203. The following parity operation is repeated eight times to calculate the z axis parity for 8 super cells 203 in the x dimension of each SISO tile 202. It should be noted that the parity operation is repeated any number of times depending on the number of super cells 203 in any dimension of each SISO tile.

The decoder 2 of the present invention calculates the parity of the z axis using an XOR tree. Alternatively, other methods to calculate the parity of z-axis are used. The SPIN register 32 in each super cell 203 is available to the parity calculation. If the decoder 2 of the present invention detects a parity error in the z-axis, a parity error flag is set.
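
An XOR tree reduces the sign bits pairwise, level by level, until a single parity bit remains. The sketch below is a software model of that idea under the assumption that the inputs are 0/1 sign bits; the function name xor_tree_parity is hypothetical and the padding of odd-sized levels with 0 is an implementation choice of this example.

```python
def xor_tree_parity(bits):
    """Reduce a list of 0/1 values to one parity bit with a pairwise XOR tree."""
    level = list(bits)
    if not level:
        return 0
    while len(level) > 1:
        if len(level) % 2:
            level.append(0)  # XOR with 0 leaves the unpaired value unchanged
        level = [level[i] ^ level[i + 1] for i in range(0, len(level), 2)]
    return level[0]


# Example: a z vector with three sign bits set has odd parity.
z_parity = xor_tree_parity([1, 0, 1, 1])
```

The tree needs a number of levels logarithmic in the vector length, which is the usual reason for preferring it over a serial chain of XOR gates.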

The LLR extrinsic value, once calculated, is weighted by multiplying the value by a z feedback constant. In addition, the LLR extrinsic value is clipped if the result is out of range. Further, the LLR extrinsic value is negated if the hard decision is 0 or if there is a parity error. Note that if the hard decision value of the bit is 0 and there is also a parity error, the extrinsic value is negated twice, so the result is the original weighted, clipped extrinsic LLR value. If the parity calculation is for the last axis iteration, the result is the sum of the weighted extrinsic LLR value and the value in the SPIN register for that bit cell. However, if the parity calculation is not for the last axis iteration, the result is the sum of the weighted extrinsic LLR value, the value in the previous result register 36, the value in the EXT1 register, which is the result from the axis previous to the axis processed to find the previous result, and the value in the original data register 38.
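
The weighting, clipping, negation and summation steps above can be sketched as follows. This is an illustrative model only: the symmetric clipping range given by limit, the use of plain Python numbers instead of a lookup table, and the function names z_axis_extrinsic and next_z_result are all assumptions of this example.

```python
def z_axis_extrinsic(llr, z_feedback, hard_decision, parity_error, limit):
    """Weight, clip and conditionally negate the z-axis extrinsic LLR value."""
    weighted = llr * z_feedback
    clipped = max(-limit, min(limit, weighted))
    # Negate for a hard decision of 0 or for a parity error; when both hold,
    # the two negations cancel, which is an exclusive-or of the conditions.
    negate = (hard_decision == 0) != bool(parity_error)
    return -clipped if negate else clipped


def next_z_result(extrinsic, last_iteration, spin, previous_result, ext1, original):
    """Combine the extrinsic value with the stored register contents."""
    if last_iteration:
        return extrinsic + spin
    return extrinsic + previous_result + ext1 + original


# Example: a hard decision of 0 with no parity error flips the sign.
value = z_axis_extrinsic(10, 0.5, 0, False, 7)
```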

The result is stored in the SPIN register 32, the extrinsic LLR result is stored in the previous result register 36, and the contents of the previous result register 36 are shifted into EXT1. The super cells 203 are then shifted so the xz-controller 230 can operate on 8 new z axes. This is repeated 8 times, because there are 8 super cells 203 in the x dimension of the SISO tile 202, as shown in Figure 16.

The present invention has been described in terms of specific embodiments incorporating details to facilitate the understanding of the principles of construction and operation of the invention. Such reference herein to specific embodiments and details thereof is not intended to limit the scope of the claims appended hereto. It will be apparent to those skilled in the art that modifications may be made in the embodiment chosen for illustration without departing from the spirit and scope of the invention.

Claims

What is claimed is:

1. A decoder for decoding an encoded codeword having a plurality of bits, wherein each of the plurality of bits includes a soft value, the decoder comprising: a. a plurality of bit cells arranged in a first array, wherein each bit cell stores the soft value within for a corresponding bit; and b. a controller module coupled to the first array of bit cells, the controller module for performing a component decode on each soft value in the plurality of bit cells, wherein the controller module rotates the soft values between each of the bit cells along the first array using a connection scheme.

2. A decoder for decoding an encoded turbo product code block having (n,k) bits in a first array, the decoder comprising: a. 'n' number of bit cells arranged in the first array, wherein each bit cell receives a soft value for a corresponding bit; and b. a controller module coupled to the first array of bit cells, wherein the controller performs a component decode on each soft value by shifting the soft value along each bit module in the first array.

3. A method of decoding an encoded codeword having a plurality of encoded bits, the method comprising the steps of: a. receiving the encoded codeword, wherein each received encoded bit is loaded into a corresponding bit cell in an array of bit cells; b. calculating a syndrome result for the encoded codeword, wherein the syndrome result is calculated by comparing a minimum input bit confidence between each bit cell in the array; c. determining a nearest neighbor code set for each bit from the syndrome result, wherein a nearest neighbor confidence value is stored in each bit cell in the array; d. calculating a pair of lowest difference metric values for the encoded codeword, wherein the lowest difference metric values are calculated by summing a minimum sum value in each bit cell; and e. generating an extrinsic LLR value for each bit in the encoded codeword, wherein the extrinsic LLR value is determined from a lowest difference metric value from a current bit pair confidence value.