METHOD AND APPARATUS FOR SOLVING KEY EQUATION POLYNOMIALS IN DECODING ERROR CORRECTION CODES

Inventors: Chen-Yi Lee, Hsinchu (TW); Hsie-Chia Chang, Hsinchu (TW)

Correspondence Address:
PERKINS COIE LLP
P.O. BOX 2168
MENLO PARK, CA 94026 (US)

Appl. No.: 10/155,488
Filed: May 22, 2002

Foreign Application Priority Data
Nov. 28, 2001 (TW) 090129778

Publication Classification
Int. Cl. H03M 13/00

ABSTRACT

The presently invention discloses a method for computing error locator polynomial and error evaluator polynomial in the key equation solving step of the error correction code decoding process whereby the polynomials are generated through at most t intermediate iterations that can be implemented with minimal amount of hardware circuitry. However, depending on the selected (N,K) code, the number of cycles required for the calculation of the polynomials would be within the time required for the calculation of upstream data. Additionally, the present invention for computing the error locator polynomial and the error value polynomial employs an efficient scheduling of a small number of registers and finite-field multipliers (FFMs) without the need of finite-field inverters (FFIs) is illustrated. Using these new methods, a new area-efficient architecture that uses only 4t+2p+4 registers and three FFMs and no FFIs is presented to implement the inversionless Euclidean algorithm.
Fig. 2a

Fig. 2b
METHOD AND APPARATUS FOR SOLVING KEY EQUATION POLYNOMIALS IN DECODING ERROR CORRECTION CODES

BACKGROUND OF THE INVENTION

[0001] In the transmission of data from a source location to a destination location through a variety of media, noise caused by the transmission path and/or the media itself causes errors in the transmitted data. Thus, the data transmitted is not the same as the data received. In order to determine the errors in the received data, various methods and techniques have been developed to detect and correct the errors in the received data. One of the methods is to generate a codeword which includes a message part (data to be transmitted) and a parity part (information for performing error correction).

[0002] Among the most well-known error-correcting codes, the BCH (Bose-Chaudhuri-Hocquenghen) codes and the RS (Reed-Solomon) codes are the most widely used block codes in the communication field and storage systems. The mathematical basis of BCH and RS codes is explained by: E. R. Berlekamp, *Algebraic Coding Theory*, McGraw-Hill, New York, 1968; and Richard E. Blahut, *Theory and Practice of Error Control Codes*, Addison-Wesley, 1983.

[0003] An (N, K) BCH or RS code has K message symbols and N coded symbols, where each symbol belongs to GF(q) for a BCH code or GF(qm) for a RS code. A binary (N, K) BCH code can correct up to t errors with N=2m−1, N-K≥2t. An (N, K) RS code can correct up to t errors and ρ erasures with

\[ t = \left\lfloor \frac{N-K-\rho}{2} \right\rfloor . \]

[0004] For binary BCH codes, an error can be corrected simply by finding out the error location. For RS codes, an error can be corrected by finding out the error location and the error value. In RS codes, an erasure is defined to be an error with a known error location, and hence its correction reduces to finding the error value.

[0005] The method steps for common popular RS decoder architectures for the correction of errors can be summarized into four steps: (1) calculating the syndromes from the received codewords, (2) computing the error locator polynomial and the error evaluator polynomial, (3) finding the error locations, and (4) computing error values. If both errors and erasures are corrected, the four steps are modified to: (1) calculating the Forney syndromes from the received codewords and the erasure locations, (2) computing the errata locato polynomial and the errata evaluator polynomial, (3) finding the errata locations, and (4) computing the errata values.

[0006] Referring to FIG. 1, the general decoding steps are illustrated. Note that for simplification, the error-only RS decoder is introduced. The received data, R(x), is provided to a syndrome generator 10,20 to generate a syndrome polynomial, S(x), representing the error pattern of the codeword from which the errors can be corrected. The syndrome is then provided to a key equation solver 12,22 to generate an error locator polynomial, Ω(x), and an error evaluator polynomial, Ω(x). The error locator polynomial indicates the location(s) of the error and the error evaluator polynomial indicates the amount of the error. In the next step, the error locator polynomial is passed to a Chien search engine 14,24 to generate the root(s), \( β \), representing the location(s) of the errors. Then the error evaluator 16,26 receiving the root(s) and the error evaluator polynomial, \( Ω(x) \), calculates the error value(s) of the root(s).

[0007] The second step in the above-mentioned four-step procedure involves solving the key equation, which is

\[ S(x) = 0 \mod x^N \]

(1)

where \( S(x) \) is the syndrome polynomial, \( o(x) \) is the error locator polynomial and \( Ω(x) \) is the error evaluator polynomial. When both errors and erasures are corrected, \( o(x) \) and \( Ω(x) \) are the errata locato polynomial and the errata evaluator polynomial, respectively. In addition, the errata locato polynomial \( o(x) \) becomes the product of \( Ω(x) \) and \( λ(x) \) corresponding to the error locato polynomial and the erasure locato polynomial, respectively.

[0009] The techniques frequently used to solve the key equation (1) include the Berlekamp-Massey algorithm and the Euclidean algorithm. The extension of these algorithms to correct both errors and erasures can be found in the Blahut article cited above. Here a novel inversionless decomposed Euclidean architecture is invented to reduce the hardware complexity drastically while maintaining the overall decoding speed.

[0010] Prior art technologies applied the traditional Euclidean algorithm (or variation thereof) for the calculation of the error locator polynomial and the error evaluator polynomial, and designed circuits based upon these algorithms. However, each of these algorithms require a large number of registers, finite-field multipliers (FFM) and perhaps a finite-field inverters (FFI). Each of the FFM and FFI translates into a hardware circuitry and real estate on an integrated circuit chip. Therefore, the goal here is to derive a method for solving of the polynomials in an efficient manner and to minimize the amount of circuitry required in the implementation of the algorithm. The number of registers and FFM is typically a function of the variable t. Table 1 illustrates the authors of the architectures for correcting error-only codewords and the corresponding number of registers, FFM and FFI:

<table>
<thead>
<tr>
<th>Reference</th>
<th>Registers as a function of t</th>
<th>FFM as a function of t</th>
<th>FFI</th>
</tr>
</thead>
<tbody>
<tr>
<td>Reed</td>
<td>( 8t + 2 )</td>
<td>( 8t )</td>
<td>0</td>
</tr>
<tr>
<td>Song</td>
<td>( 6t + 4 )</td>
<td>( 6t + 2 )</td>
<td>0</td>
</tr>
<tr>
<td>Wu</td>
<td>( 7t + 5 )</td>
<td>( t + 1 )</td>
<td>( 1 )</td>
</tr>
</tbody>
</table>

[0011] From Table 1, Reed proposed the implementation of the inversionless Euclidean algorithm requires \( 8t+2 \) registers, \( 8t \) FFM and \( 0 \) FFI in *VLSI Implementation of A Pipeline Reed-Solomon Decoder*; IEEE Transaction on Computers, vol. C-34, pp. 393-403, May 1985. In addition, the article *An Efficient Architecture for Implementing the Modified Euclidean Algorithm*, the 9th NASA Symposium...
on VLSI Design, November 2000, Song demonstrated an architecture requiring 64+4 registers and 64+2 FFMs and no FFI.

[0012] On the other hand, the article *An Area-efficient Versatile Reed-Solomon Decoder for ADSL*, IEEE International Symposium on Circuits and Systems, May 1999, Wu et al. presented the architecture reducing the number of FFMs but requiring the relatively complex FFI, which will limit speed and impose a significant hardware complexity.

[0013] Therefore, it would be desirable to have an inversionless method and apparatus that requires no FFIs and minimizes the number of registers and FFMs in the implementation thereof.

**SUMMARY OF THE INVENTION**

[0014] Accordingly, it is an object of the present invention to provide a method and apparatus for solving key equation polynomials in the decoding of codewords. Based upon the Euclidean algorithm, it can be implemented with minimal hardware circuitry.

[0015] It is another object of the present invention to provide a method and apparatus for solving key equation within a 1-step iterative decoding procedure while the prior art architectures require at most 2t iterations.

[0016] It is yet another object of the present invention to provide a method and apparatus for solving key equation polynomials without decreasing the overall decoding speed of the decoder.

[0017] Briefly, in a presently preferred embodiment, a method for computing error locator polynomial and error evaluator polynomial in the key equation solving step of the error correction code decoding process is presented whereby the polynomials are generated through at most t intermediate iterations that can be implemented with minimal amount of hardware circuitry. However, depending on the selected (N,K) code, the number of cycles required for the calculation of the polynomials would be within the time required for the calculation of upstream data.

[0018] Additionally, a presently preferred embodiment for computing the error locator polynomial and the error value polynomial employs an efficient scheduling of a small number of registers and finite-field multipliers (FFMs) without the need of finite-field inverters (FFIs) is illustrated. Using these new methods, a new area-efficient architecture that uses only 4t+2p+2 registers and three FFMs and no FFIs is presented to implement the inversionless Euclidean algorithm. This method and architecture can be applied to a wide variety of RS and BCH codes with suitable code sizes.

[0019] More specifically, the 3-FFM architecture of the presently preferred embodiment for solving key equation polynomials can also be utilized to calculate the Forney syndrome polynomial T(x) described above. This method and architecture can be applied to correct the error-only as well as the error-and-erasure codewords.

[0020] An advantage of the present invention is that it provides a method and apparatus for solving key equation polynomials in the decoding of codewords. Based upon the Euclidean algorithm, it can be implemented with minimal hardware circuitry.

[0021] Another advantage of the present invention is that it provides a method and apparatus for solving key equation polynomials within a t-iteration decoding procedure while other architectures require at most 2t iterations. It will maintain the overall decoding speed of the decoder.

[0022] Yet another advantage of the present invention is that it provides an identical method and apparatus for not only solving key equation polynomials but also calculating the Forney syndrome polynomial T(x). It can be applied to the correction of errors as well as erasures.

**BRIEF DESCRIPTION OF THE DRAWINGS**

[0023] FIGS. 1a and 1b illustrates the processing blocks in the decoding or codewords;

[0024] FIG. 2a–FIG. 2c shows a three-FFM architecture of the preferred embodiment for calculating the errata evaluator polynomial, \( \Omega(x) \), in the key equation solver.

[0025] FIG. 2d shows a three-FFM architecture of the preferred embodiment for calculating the errata location polynomial, \( \sigma(x) \), in the key equation solver.

[0026] FIG. 2e shows a three-FFM architecture of the preferred embodiment for calculating the Forney syndrome, \( T(x) \).

**DESCRIPTION OF THE PREFERRED EMBODIMENTS**

[0027] Firstly, we will show our modified decoding procedure requiring at most t iterations while the previous decoding procedure requires at most 2t iteration. Following the inversionless Euclidean algorithm is illustrated and the errata value(x) and errata location(x) produced by \( \Omega(x) \) and \( \sigma(x) \) in our inversionless decoding procedure are identical to the errata value(x) and errata location(x) produced by the original algorithm. Secondly, we decompose the inversionless Euclidean algorithm for reducing the number of registers to 4t+2p+2 and the number of FFMs to 3. Finally, we show the condition on N, K such that our architecture can be applied.

[0028] The Euclidean Decoding Procedure

[0029] For illustrating the Euclidean algorithm, we rewrite (1) as:

\[
\Omega(x) = x^{n-k}Q(x) + T(x)\nu(x) \tag{2}
\]

[0030] where \( Q(x) \) is the quotient polynomial of \( x^{n-k} \) and \( T(x)\nu(x) \), \( T(x) = S(x)\Lambda(x) \) is the Forney syndrome polynomial, and \( \nu(x) = \lambda(x)\Lambda(x) \) is the errata locator polynomial, which is the product of the error locator polynomial, \( \Lambda(x) \), and the erasure locator polynomial, \( \Lambda(x) \). Therefore, the errata evaluator polynomial, \( \Omega(x) \), can be calculated by the similar process of computing the GCD polynomial of \( x^{n-k} \) and \( T(x) \) through the Euclidean algorithm, whose decoding process can be shown as follows:

\[
\begin{align*}
R_1^{(t)}(x) &= x^{n-k} \\
R_2^{(t)}(x) &= T(x) \\
R_3^{(t)}(x) &= R_1^{(t-1)}(x) = R_2^{(t-1)}(x)Q(x) \\
R_4^{(t)}(x) &= R_3^{(t-1)}(x) = R_2^{(t-1)}(x)Q(x) \tag{3}
\end{align*}
\]

[0031] where \( Q(x) \) is the i-th quotient polynomial and \( R_3^{(i)}(x) \) is the i-th remainder polynomial. Each iterative step in (4) performs a polynomial division operation. Note that
the i-th dividend polynomial $R^{(i-2)}(x)$ and the i-th divisor polynomial $R^{(i-1)}(x)$ are equivalent to the $(i-2)$-th and the $(i-1)$-th remainder polynomials respectively. After n division operations, the n-th remainder polynomial, $R^{(0)}(x)$, is assumed to be the errata evaluator polynomial $\Omega(x)$. From the extended form of Euclidean algorithm introduced by Error-Control Coding for Data Networks, Kluwer Academic, 1999, the similar decoding process except the minor difference in the initial condition can be used to determine the errata locator polynomial $\sigma(x)$, which is also described as follows:

$$
\begin{align*}
\mu^{(1)}(x) &= 1 \\
\mu^{(0)}(x) &= \Lambda(x) \\
\mu^{(i)}(x) &= \mu^{(i-1)}(x) + \rho^{(i-1)}(x)Q^{(0)}(x) \\
\rho^{(i)}(x) &= \mu^{(i-1)}(x) + \rho^{(i-1)}(x)Q^{(0)}(x)
\end{align*}
$$

(4)

[0032] where

$$
\Lambda(x) = \prod_{a_i \in \Lambda} (1 + a_i x)
$$

[0033] represents the erasure locator polynomial and $\Lambda$ is the erasure set. Note that all $Q^{(0)}(x)$ here are equivalent to the i-th quotient polynomial $Q^{(0)}(x)$ in (3). Similarly, after n iterations, $\mu^{(n)}(x)$ is assumed to the errata locator polynomial $\sigma(x)$. From (3) and (4), it can be shown that the sum of $\deg(R^{(i-1)}(x))$ and $\deg(\mu^{(i)}(x))$ equals to a constant number, N-K+s, where s is the number of actual erasures and hence, equals the degree of $\Lambda(x)$.

[0034] Our Modified Decoding Procedure

[0035] The proposed modified decoding procedure calculating the quotient polynomial with degree one in advance is shown as follows:

[0036] Initial Condition

$$
A^{(0)}(0)=x^{n-K}, M^{(0)}(0)=\Omega^{(0)}(x)=T(x)
$$

[0037] For(i=0 to 1)

$$
\delta=\deg(A^{(i)}(x)), \Lambda=\deg(M^{(i)}(x))
$$

[0038] if$(\deg(\sigma^{(i)}(x)) \geq \deg(\Omega^{(i)}(x)))$

$$
\begin{align*}
q^{(i)}(x) &= A^{(i)}(x) / M^{(i)}(x) \\
q^{(i)}(0) &= 0 \\
\sigma^{(i)}(x) &= M^{(i)} \sigma^{(i-1)} + A^{(i)}(x) \rho^{(i)}(x)
\end{align*}
$$

(5)

[0039] else

$$
\begin{align*}
A^{(i+1)}(x) &= A^{(i)}(x) + \rho^{(i)}(x) \\
M^{(i+1)}(x) &= M^{(i)}(x) \\
q^{(i+1)}(x) &= q^{(i)}(x) + 1 \\
\sigma^{(i+1)}(x) &= \sigma^{(i)}(x) + \rho^{(i)}(x)
\end{align*}
$$

(6)

[0040]

[0041] else

$$
\begin{align*}
\Omega^{(i)}(x) &= q^{(i)}(x) + \rho^{(i)}(x) \\
\sigma^{(i)}(x) &= \sigma^{(i)}(x)
\end{align*}
$$

(7)

[0042] where $q^{(0)}(x)=q^{(0)}(x)+\rho^{(0)}(x)$ is the i-th dummy quotient polynomial, $\Omega^{(i)}(x)$ is the i-th dummy remainder polynomial, and $A^{(i)}(x)$ and $M^{(i)}(x)$ are the leading coefficients of the i-th dummy dividend polynomial $A^{(i)}(x)$ with degree of $\delta$ and the i-th dummy remainder polynomial with degree of $\Delta$, respectively. Note that if there are only errors, the erasure locator polynomial, $\Lambda(x)$ equals 1 and the Forney syndrome polynomial, $T(x)$ should be altered to the syndrome polynomial $S(x)$.

[0043] As compared with (3), if we assume the i-th dividend polynomial $R^{(i-2)}(x)$ to $A^{(0)}(x)$ as well as the i-th divisor polynomial $R^{(i-1)}(x)$ to $M^{(0)}(x)$, the difference in degree between $A^{(0)}(x)$ and $M^{(0)}(x)$ equating $\delta-\Delta$ implies the decoding procedure shown above will take at most

$$
\left\lfloor \frac{\delta - \Delta + 1}{2} \right\rfloor
$$

[0044] iterations to calculate the i-th remainder polynomial $R^{(0)}(x)$.

[0045] Note that our modified decoding procedure will stop at $\deg(\Omega^{(i)}(x))=\deg(\sigma^{(i)}(x))$ and in the meantime, $\sigma^{(i)}(x)$ is the errata locator polynomial $\sigma(x)$ with degree of s+v. That $s$ and $v$ represent the number of actual erasures and error(s). Recalling $\deg(\sigma^{(i)}(x))=\deg(\Lambda(x))=s$, the degree of $\sigma^{(i)}(x)$ will increase from s to s+v. In a specific case with degree of $Q^{(i)}(x)$ in (3) all equating one, v division operations are needed and in the decoding procedure shown above, the total number of iterations is v as a result that accomplishing each division operation takes 1 iteration with $\delta-\Delta=\deg(q^{(i)}(x))=1$. Owing to $s \geq 1$, the modified decoding procedure above requires at most t iterations for solving key equation polynomials.

[0046] The Inversionless Decoding Procedure

[0047] For eliminating the inverse operation within our modified decoding procedure, a novel inversionless decoding procedure is proposed and shown as follows:

[0048] Initial Condition:

$$
A^{(0)}(x)=x^{n-K}, M^{(0)}(x)=\Omega^{(0)}(x)=T(x)
$$

[0049] For(i=0 to 1)

$$
\delta=\deg(A^{(i)}(x)), \Lambda=\deg(M^{(i)}(x))
$$

[0050] if$(\deg(\sigma^{(i)}(x)) \geq \deg(\Omega^{(i)}(x)))$

$$
\begin{align*}
q^{(i)}(x) &= A^{(i)}(x) / M^{(i)}(x) \\
q^{(i)}(0) &= 0 \\
\sigma^{(i)}(x) &= M^{(i)} \sigma^{(i-1)} + A^{(i)}(x) \rho^{(i)}(x)
\end{align*}
$$

(7)

[0051] else

$$
\begin{align*}
A^{(i+1)}(x) &= A^{(i)}(x) + \rho^{(i)}(x) \\
M^{(i+1)}(x) &= M^{(i)}(x) \\
q^{(i+1)}(x) &= q^{(i)}(x) + 1 \\
\sigma^{(i+1)}(x) &= \sigma^{(i)}(x) + \rho^{(i)}(x)
\end{align*}
$$

(8)
else
\[ A^{(0)}(x) \in (x), M^{(0)}(x) = M^{(0)}(x) \]
\[ \eta^{(0)}(x) = \eta^{(0)}(x), m^{(1)}(x) = m^{(0)}(x) \]

[0053] else

\[ \Omega(x) = \Omega^{(0)}(x), \sigma(x) = \sigma^{(0)}(x) \] Finish

[0054] where \( \Omega(x) \) and \( \sigma(x) \) are the modified errata evaluator polynomial and errata locator polynomial, respectively. It can be shown that \( \sigma(x) \) and \( \Omega(x) \) can be used to find the same error location(s) and error value(s) as the original \( \sigma(x) \) and \( \Omega(x) \) do. While compared with other approaches, our proposed inversionless Euclidean algorithm not only eliminates the costly inversion operation but also introduces a t-iteration decoding procedure.

[0055] The Decomposed Architecture

[0056] Here we propose a decomposed architecture from the proposed inversionless Euclidean algorithm, which works with individual coefficients of the polynomial instead of the entire polynomial as a whole.

[0057] And (7)-(8) can be decomposed as the following two equations:

\[ \Omega^{(i+1)}(x) \in (x), M^{(0)}(x) \]
\[ \eta^{(0)}(x) = \eta^{(0)}(x), m^{(1)}(x) = m^{(0)}(x) \]
\[ \Omega^{(i+1)}(x) = \Omega^{(i)}(x) + \sum_{j=1}^{2} \lambda_{j} \Omega^{(i)}(x) \]
\[ \eta^{(0)}(x) = \eta^{(0)}(x) + \sum_{j=0}^{1} \eta^{(0)}(x) \]

[0058] where \( \delta \) and \( \Delta \) represent the degree of \( A^{(0)}(x) \) and \( M^{(0)}(x) \) respectively, and \( \sigma^{(i+1)}(x) \) and \( \sigma^{(i+1)}(x) \) corresponds to the j-th and k-th coefficient of \( \Omega^{(i+1)}(x) \) with degree of \( \delta = 2 \) and \( \delta + 1 \) with degree of \( \delta \) at the i-th iteration. From (9)-(10), if \( M^{(0)}(x), \Omega^{(0)}(x), \eta^{(0)}(x) \) and \( \eta^{(0)}(x) \) can be calculated in advance, there are only three finite-field multiplications needed to compute \( \sigma^{(i+1)}(x) \) and \( \sigma^{(i+1)}(x) \). The detailed cycle operation of our inversionless decomposed architecture can be seen in Table 2. For simplifying notations, we let \( \delta = 1 \) without loss of generality.

<table>
<thead>
<tr>
<th>Cycle</th>
<th>( \Omega^{(i+1)}(x) ) and ( \sigma^{(i+1)}(x) )</th>
</tr>
</thead>
<tbody>
<tr>
<td>Initial</td>
<td>( w \cdot M^{(0)}M^{(0)} )</td>
</tr>
<tr>
<td>j = 0</td>
<td>( \Omega^{(0)}(x) = w \cdot \Omega^{(0)}(x) )</td>
</tr>
<tr>
<td>j = 1</td>
<td>( \Omega^{(1)}(x) = w \cdot \Omega^{(0)}(x) )</td>
</tr>
<tr>
<td>j = 2</td>
<td>( \Omega^{(2)}(x) = w \cdot \Omega^{(1)}(x) )</td>
</tr>
<tr>
<td>j = 3</td>
<td>( \Omega^{(3)}(x) = w \cdot \Omega^{(2)}(x) )</td>
</tr>
<tr>
<td>( k = 0 )</td>
<td>( \Omega^{(0)}(x) = w \cdot \Omega^{(0)}(x) )</td>
</tr>
<tr>
<td>( k = 1 )</td>
<td>( \Omega^{(1)}(x) = w \cdot \Omega^{(1)}(x) )</td>
</tr>
<tr>
<td>( k = 2 )</td>
<td>( \Omega^{(2)}(x) = w \cdot \Omega^{(2)}(x) )</td>
</tr>
</tbody>
</table>

[0059] It is evident from Table 1 that, at cycle \( j = 0 \), the computation of \( \Omega^{(0+1)}(x) \) requires \( w \cdot M^{(0)}(x) \) and \( \Omega^{(0)}(x) \), which have been calculated at the initialization cycle. Similarly, at cycle \( j = 1 \), the computation of \( \Omega^{(1+1)}(x) \) also requires \( \Omega^{(0)}(x) \), which has been calculated at cycle \( j = 0 \). Note that each cycle needs three finite-field multiplications and the calculations process of \( \sigma^{(i+1)}(x) \) is similar to that of \( \Omega^{(i+1)}(x) \).

[0060] The inversionless decomposed Euclidean algorithm shown above suggests a 2-GTM implementation of the key equation solver, which is illustrated in FIG. 2. The branch labeling in FIG. 2 corresponds to a particular time instance. As compared with Table 1, FIG. 2(a) shows the initialization cycle, when \( j = 0 \), refer to Table 1, is \( w \cdot M^{(0)}(x) \) and \( \Omega^{(0)}(x) \) is computed by a finite field multiplier 34, equation \( \Omega^{(0)}(x) = \Omega^{(0)}(x) + \Omega^{(0)}(x) \), where \( M^{(0)}(x) \) is computed by a finite field multiplier 30, \( M^{(0)}(x) \) is computed by a finite field multiplier 32, these two terms are added by a finite field adder 36 to have \( \Omega^{(0)}(x) \). As shown in FIG. 2(b), FIG. 2(b) indicates the calculation cycle for \( q^{(0)}(x) \) and \( \Omega^{(0)}(x) \). Since \( q^{(0)}(x) = \Omega^{(0)}(x) \), the finite field multiplier 40 is used to compute \( q^{(0)}(x) \), since \( \Omega^{(0)}(x) = w \cdot \Omega^{(0)}(x) + M^{(0)}(x) \), the finite field multiplier 44 realize \( w \cdot \Omega^{(0)}(x) \) and the finite field multiplier 42 realize \( M^{(0)}(x) \), these two terms are added by a finite field adder 36 to have \( \Omega^{(0)}(x) \).

[0061] The process for computing other coefficients of \( \Omega^{(i+1)}(x) \) is expressed in FIG. 2(c), when \( j = 1 \), refer to Table 1, \( \Omega^{(i+1)}(x) \) can be obtained by finite field multiplier 50, 52, 54 and a finite field adder 56. Because the computation process of \( \sigma^{(i+1)}(x) \) is similar to that of \( \Omega^{(i+1)}(x) \), the hardware used to compute \( \sigma^{(i+1)}(x) \) can be reconfigured to calculated \( \sigma^{(i+1)}(x) \), which is presented in FIG. 2(d), the hardware is similar to FIG. 2(c), with multipliers 60, 62, 64 and an adder 66 to compute \( \sigma^{(i+1)}(x) \) for \( k = 1 \) to \( \psi \), as illustrate in Table 1.

[0062] This architecture can be used for error-only correction as well as error-and-eraser correction. Compared to existing proposals requiring 6t to 8t FFMs, the preferred embodiment of the present invention significantly reduces hardware complexity down to 3 FFMs. However, in order to finish the i-th iteration, the architecture of the preferred embodiment requires \( 8t + t \) cycles whereas prior art architectures requires only two to three cycles. The additional time required for generating the data under the architecture of the present invention does not slow down the overall system processing speed. Due to the overall system processing speed dominated by the syndrome calculator and Chien Search, each taking N cycles to finish, our architecture slowing down the Euclidean algorithm (taking N cycles) will not impact the decoding speed.

[0063] Additionally, the method and apparatus of the present invention also minimize the amount of required registers. Recalling (9) and (10), \( \eta \) representing the degree of \( \sigma^{(i+1)}(x) \) is equivalent to the degree of \( \mu^{(i)}(x) \) in (4) and similarly, \( \delta \) representing the degree of \( \Omega^{(i)}(x) \) is equivalent to that of \( R^{(i-1)}(x) \) in (3). As shown earlier, \( \deg(\mu^{(i)}(x)+\deg(\Omega^{(i-1)}(x)))=N-K+S \leq 2t+\psi \), where \( t \) and \( \psi \) represent the number of errors and erasures in the decoding of codewords and consequently, in the preferred embodiment of the present invention, \( 2t+\psi+2 \) registers are used to store the coefficients of \( \Omega^{(i)}(x) \) and \( \sigma^{(i+1)}(x) \), and another \( 2t+\psi \) registers can be used for storing the coefficients of \( \mu^{(i)}(x) \) and \( \Omega^{(i-1)}(x) \). Hence, calculating \( \Omega^{(i)}(x) \) and \( \sigma^{(i+1)}(x) \) iteratively always totally requires \( 4t+2\psi+2 \) registers and if there are only errors corrected, the amount of required registers is \( 4t+2 \) and the previously proposed architectures requiring \( 6t \) to \( 8t \) registers.
Furthermore, the preferred embodiment of the present invention can also be used to calculate the Forney syndrome polynomial, \( T(x) \), which is defined as:

\[
T(x) = S(x)A(x) \mod x^{N-K}
\]

(11)

where

\[
A(x) = \prod_{i=1}^{N} (1 + \lambda_i x^i)
\]

(12)

is the erasure locator polynomial and \( \lambda_i \) is the \( i \)-th erasure magnitude. \( T(x) \) can be obtained by following procedures:

[0067] **Initial Condition**

\[ T^{(0)}(x) = S(x) \]

(13)

[0068] For \( t = 0 \) to \( t \)

\[
T^{(t)}(x) = T^{(t-1)}(x)(1 + \lambda_t x^t) \mod x^{N-K}
\]

(14)

[0069] if \( t = \psi \)

\[ T^{(\psi+1)}(x) = T^{(t)}(x) \]

(15)

[0070] else

\[ T^{(0)}(x) \]

(16)

[0071] where \( \lambda_t(x) \) is the \( i \)-th auxiliary polynomial for computing the \( i \)-th iteration Forney syndrome polynomial, \( T^{(i+1)}(x) \). Note that \( \lambda_t(x) \) can be expressed as \( 1 + \lambda_t^{(0)} x^t + \lambda_t^{(1)} x^{2t} \), and \( T^{(i+1)}(x) \) can be decomposed as the following results:

\[
T^{(i+1)}(x) = T^{(i)}(x) A^{(i)}(x) \mod x^{N-K}
\]

(17)

It is evident that the process calculating the \( t \)-th coefficient, \( T^{(t)}(x) \), is very similar to (9) or (10), and therefore, the 3-FFM architecture can be used to obtain the Forney syndrome polynomial \( T(x) \), which is illustrated in FIG. 2(a), as refer to equation (14), the second term is computed by a finite field multiplier 72, the third term is computed by a finite field multiplier 70 and these two terms and the first term are added by a finite field adder 74 to have \( T^{(i+1)}(x) \).

[0072] Application Conditions

[0073] The total number of cycles required to compute \( \delta(x) \) and \( \Omega(x) \) using the 3-FFM architecture of the preferred embodiment is of interest in considering the potential impact on the overall system performance. From the proposed iterative decoding process, \( 0 \leq t < 2 \) in (9) and \( 0 \leq t < 2 \) in (10) implying the number of cycles required to compute \( \Omega^{(i+1)}(x) \) is \( \delta + 1 \) and calculating \( \Omega^{(i+1)}(x) \) needs \( \psi + 1 \) cycles in the \( i \)-th iteration. However, one more cycle is needed to get \( q_1^{(0)} \) and \( q_2^{(0)} \), and the proposed decoding procedure requires \( \delta + \psi + 1 \) cycles in one iteration totally. From \( \psi + \Delta = N-K+s \) and \( \Delta = 1 \), it is clear that \( \delta + \psi + 1 = N-K+s+2 \leq 2t+1 \) for RS (N, K) code. Table 3 shows the maximum number of cycles for different RS (N, K) codes in Table 3 ranging from 4 to 16. If \( N \) is larger than the number of cycles required, then our 3-FFMs architecture can be applied to reduce the hardware complexity while maintaining the overall decoding speed.

[0075] There are many applications of BCH and RS codes in communications and storage systems that benefit from methods and apparatus of the present invention. For example, Digital Versatile Disks (DVDs) use a RS product code which is 182,172 in the row direction and 208,192 in the column direction. Digital TV broadcasting uses a 204,188 RS code. CD-ROM uses a number of smaller (32,28) and (28,24) RS codes. In the optical fiber submarine cable systems, RS (255,239) code is used and standardized to provide burst error correcting capability. In wireless telecommunications, the AMPS cellular phone systems uses (40,28) and (48,36) binary BCH codes, which are shortened codes of the (63,51) code. The (63,51) code, which can correct up to 2 errors (N-K=12,m=6), requires fewer than 12 cycles (t=2, row 1 of Table 3). All of these applications, as well as many others, can benefit from the method and apparatus of the present invention.

[0076] Although the present invention has been described in terms of specific embodiments, it is anticipated that alterations and modifications thereof will no doubt become apparent to those skilled in the art. It is therefore intended that the following claims be interpreted as covering all such alterations and modifications as fall within the true spirit and scope of the invention.

What is claimed is:

1. An apparatus for solving key equation polynomials in decoding error correction codes, a novel inversionless decomposed architecture which is frequently used in BCH and Reed-Solomon decoders comprising:

   a. syndrome calculator that received codewords and output a syndrome polynomial to a key equation solver;

   b. a key equation solver that calculated error locators polynomial and error evaluator polynomial and output error location;

   c. a Chein Search that received said error location polynomial and input a result to an error value calculator and output said error location;

   d. an error value calculator that received signal from said key equation solver and Chein Search, output an error value.

2. An apparatus for solving key equation polynomials in decoding error correction codes according to claim 1, wherein said apparatus is used for BCH and Reed-Solomon (RS) decoders.

3. An apparatus for solving key equation polynomials in decoding error correction codes according to claim 1, wherein said apparatus is applied to BCH and Reed-Solomon (RS) decoders which is a kind of inversionless decomposed architecture.

**TABLE 3**

<table>
<thead>
<tr>
<th>N - K</th>
<th>t</th>
<th>p</th>
<th>cycles</th>
<th>t</th>
<th>p</th>
<th>Cycles</th>
</tr>
</thead>
<tbody>
<tr>
<td>4</td>
<td>2</td>
<td>—</td>
<td>12</td>
<td>1</td>
<td>2</td>
<td>6</td>
</tr>
<tr>
<td>6</td>
<td>3</td>
<td>—</td>
<td>24</td>
<td>2</td>
<td>2</td>
<td>16</td>
</tr>
<tr>
<td>8</td>
<td>4</td>
<td>—</td>
<td>40</td>
<td>3</td>
<td>2</td>
<td>30</td>
</tr>
<tr>
<td>10</td>
<td>5</td>
<td>—</td>
<td>60</td>
<td>4</td>
<td>2</td>
<td>48</td>
</tr>
<tr>
<td>12</td>
<td>6</td>
<td>—</td>
<td>84</td>
<td>5</td>
<td>2</td>
<td>70</td>
</tr>
<tr>
<td>14</td>
<td>7</td>
<td>—</td>
<td>112</td>
<td>6</td>
<td>2</td>
<td>96</td>
</tr>
<tr>
<td>16</td>
<td>8</td>
<td>—</td>
<td>144</td>
<td>7</td>
<td>2</td>
<td>126</td>
</tr>
</tbody>
</table>
4. An apparatus for solving key equation polynomials in decoding error correction codes according to claim 1, wherein said apparatus can be applied to the correction of errors as well as erasures.

5. An apparatus for solving key equation polynomials in decoding error correction codes in claim 1, wherein said method and apparatus is applied in inversionless Euclidean.

6. An apparatus for solving key equation polynomials in decoding error correction codes according to claim 1, wherein said apparatus can eliminate the finite-field inverter (FFI) to finish.

7. An apparatus for solving key equation polynomials in decoding error correction codes according to claim 1, wherein said apparatus is only needed at iteration decoding procedure.

8. An apparatus for solving key equation polynomials in decoding error correction codes according to claim 1, wherein said apparatus including said decomposed technique which can also drastically reduce the required number of finite-field multipliers (FFMs) from 4t-6t to 3.

9. An apparatus for solving key equation polynomials in decoding error correction codes according to claim 1, wherein said apparatus including said decomposed technique that uses only 4t+2p+4 registers.

10. An apparatus for solving key equation polynomials in decoding error correction codes according to claim 1, wherein said apparatus including said decomposed technique that no FFIs is presented to implement the inversionless Euclidean algorithm.

11. An apparatus for solving key equation polynomials in decoding error correction codes according to claim 1, wherein said apparatus can use to calculate the Forney syndrome polynomial.

12. An apparatus for solving key equation polynomials in decoding error correction codes according to claim 1, wherein said apparatus is further operable in communication.

13. A method for solving key equation polynomials in decoding error correction codes. In particular, a novel method for inversionless decomposed architecture which is frequently used in BCH and Reed-Solomon decoders executable instructions for:

   (a) received said codewords and calculate said syndrome;

   (b) produced said errata locator polynomial and errata evaluator polynomial;

   (c) search said error location;

   (d) calculated said error value.

14. A method for solving key equation polynomials in decoding error correction codes according to claim 13, wherein said method is used for BCH and Reed-Solomon (RS) decoders.

15. A method for solving key equation polynomials in decoding error correction codes according to claim 13, wherein said method is applied to BCH and Reed-Solomon (RS) decoders which is a kind of inversionless decomposed architecture.

16. A method for solving key equation polynomials in decoding error correction codes according to claim 13, wherein said method can be applied to the correction of errors as well as erasures.

17. A method for solving key equation polynomials in decoding error correction codes according to claim 13, wherein said method is applied in inversionless Euclidean.

18. A method for solving key equation polynomials in decoding error correction codes according to claim 13, wherein said method can eliminate the finite-field inverter (FFI) to finish.

19. A method for solving key equation polynomials in decoding error correction codes according to claim 13, wherein said method is only needed at iteration decoding procedure.

20. A method for solving key equation polynomials in decoding error correction codes according to claim 13, wherein said method including said decomposed technique which can also drastically reduce the required number of finite-field multipliers (FFMs) from 4t-6t to 3.

21. An apparatus for solving key equation polynomials in decoding error correction codes according to claim 1, wherein said method including said decomposed technique that uses only 4t+2p+4 registers.

22. A method for solving key equation polynomials in decoding error correction codes according to claim 13, wherein said method including said decomposed technique which no FFIs is presented to implement the inversionless Euclidean algorithm.

23. A method for solving key equation polynomials in decoding error correction codes according to claim 13, wherein said method including said decomposed technique which can also use to calculate the Forney syndrome polynomial.

24. A method for solving key equation polynomials in decoding error correction codes according to claim 13, wherein said method and apparatus is further operable in communication.

25. A method for solving key equation polynomials in decoding error correction codes. In particular, a novel method for inversionless decomposed architecture which is frequently used in BCH and Reed-Solomon decoders, wherein improving process including:

   (a) improved the speed of said Euclidean algorithm;

   (b) embellished said decoded procedure to reduce half decoded result;

   (c) combined the calculate which said errata locator polynomial and errata evaluator polynomial.

26. A method as recited in claim 25, wherein said Euclidean algorithm and time is shared said finite-field multipliers (FFMs).

27. A method as recited in claim 25, wherein said method can reduce said hardware area.

28. A method as recited in claim 25, wherein said modified Euclidean algorithm is a decomposed architecture, eliminated the limit of finite-field inversionless.

29. A method as recited in claim 25, wherein said inversionless Euclidean algorithm including total iteration number of degree is less than t but also other architectures requires at most 2t iterations.

30. A method as recited in claim 28, wherein said inversionless Euclidean algorithm use the degree of said error locator polynomial increase from p+1 to p+4.

31. A method as recited in claim 25, wherein said inversionless Euclidean algorithm, the number of total iterations in our modified procedure is less than t.
32. A method for solving key equation polynomials in decoding error correction codes. In particular, a novel method for inversionless decomposed architecture which is frequency used in BCH and Reed-Solomon decoders including:

(a) each iteration could eliminate at least one degree;
(b) combined the hardware of said errata locator polynomial and errata evaluator polynomial;
(c) a number of FFMs is reduced to 3.

33. A method as recited in claim 32, wherein said speed of inversionless Euclidean algorithm slowing down, but it will not impact the decoding speed.

34. A method as recited in claim 32, wherein said BCH and Reed-Solomon (RS) decoder, Digital Versatile Disks (DVDs) use a RS product code which is (182,172) in the row direction and (208,192) in the column direction.

35. A method as recited in claim 32, wherein said BCH and Reed-Solomon (RS) decoder, digital TV broadcasting uses a (204,188) RS code.

36. A method as recited in claim 32, wherein said BCH and Reed-Solomon (RS) decoder, CD-ROM uses a number of smaller RS codes, including (32,28),(28,24).

37. A method as recited in claim 32, wherein said BCH and Reed-Solomon (RS) decoder, in wireless communications, the AMPS cellular phone system uses (40,28) and (48,36) binary BCH codes, which are shortened codes of the (63,51) code.

* * * * *