Construction of Reed Solomon - detection and correction

Question

How can i construct RS code which can detect and correct errors.

E.g, I would like to construct RS(76,64,8), where

8: GF(2⁸) field, symbol size
64: information symbols
76: information(64) + check symbols(12)

I could easily construct 6 symbol correcting code, using pyfinite python library (https://pypi.org/project/pyfinite/).

I am also interested in constructing - a different variant - which can provide 8 symbol detection OR 4 symbol correction using the 12 check symbols.

Can you clarify your question? Please see [ask], [help/on-topic]. — AMC, Jul 10 '20 at 21:49
You haven't asked if you would want me to modify my C code interactive RS demo program so it's 4 byte correction and 8 byte detection. (It could fail if there are 9 or more bytes in error). — rcgldr, Jul 21 '20 at 20:08
Thanks @rcgldr, if possible, please provide this feature of 4 byte correction OR 8 byte detection in your code. — Raviy, Jul 26 '20 at 20:01
@Raviy - technically it's 4 byte error correction AND 8 byte error detection, as opposed to 4 byte error correction only OR 8 byte error detection only. I assume that erasures (known bad locations) aren't being considered. — rcgldr, Jul 27 '20 at 15:57
@Raviy - I updated my answer to include the simplest fix to limit the maximum number of errors corrected to <= 4. — rcgldr, Jul 27 '20 at 16:38

rcgldr · Accepted Answer · 2020-07-27T16:35:03.037

Is there a reason you can't use the python pyfinite code to handle the cases you want?

If interested, I have an old interactive RS ecc demo written in C. The user selects 1 of 30 possible GF(2^8) fields, some parameters about the RS generator polynomial, number of parity bytes (a define limits this to 20, but it can be changed), number of data bytes. Then the use can enter data, encode data, change data, change data and mark as erasure, fix data, ... . The code includes the 3 common decoders, PGZ matrix inversion, Berleykamp Massey discrepancy, Sugiyama's extended Euclid algorithm. The Euclid decoder is similar to a hardware implementation (emulating a pair of shift registers), since it was used to assist a hardware team implementing RS code back in the 1980's. I used Visual Studio to compile it, but there shouldn't be much issue with other compilers. It's too large to post directly in this answer, so here is a link to a zip file, that includes the source code and a readme.txt file:

http://rcgldr.net/misc/eccdemo8.zip

With 12 parities, RS code can correct up to 4 errors while still detecting 8 errors. Assume the worst case non-failing scenario of 8 errors. The code calculates the wrong 4 locations for errors, resulting in a total of 12 errors, the 4 mis-corrected errors plus the existing 8 errors. This can never fail because the Hamming distance between any two valid codewords is 13 bytes. For an example failure case, there could be 9 errors, the code calculates the wrong 4 errors, resulting in a total of 13 error for a possible mis-correction (which will create a valid codeword, but the wrong codeword).

To check for up to 8 errors after calculating up to 4 errors locations, the generated error locator polynomial should be verified using all 12 syndromes. This is done at line 871 in GenPErrorsE(), the Sugiyama extended Euclid decoder. That check could also be included in the other 2 decoders, but since the demo program calls all 3 decoders, it's not needed. Note that if the decoders calculate 6 errors, then it will always produce a valid codeword, but possibly the wrong codeword if there are actually 7 or more errors. The simplest fix to handle this in eccdemo8.c is to limit the number of errors to <= 4, which only requires inserting 4 lines of code at line 204:

    GenForneyErr();             /* generate forney err values */

    /* insert this code to limit to 4 errors */        
    if(vOffsets.size > 4){      /* limit to 4 errors */
        printf("uncorrectable, > 4 errors\n");
        return;
    }

    printf("vLocators:      ");

There's also another type of Reed Solomon code referred to as "original view" (versus the more common "BCH" view). For a RS(n,k) code, decoding operates on n symbols, while "BCH view" decoders operates on n-k symbols (syndromes), making "BCH view" significantly faster than "original view". Some erasure only codes, are based on "original view", but the most common case is Raid 6, which generates "BCH view" syndromes (yet another approach, the syndromes are not part of a "codeword"). The Wiki article explains this:

https://en.wikipedia.org/wiki/Reed%E2%80%93Solomon_error_correction

Thank you very much for providing your code, suggestions. I am a novice in RS codes. As far as i understand, i can use your code or pyfinite without modifying the encoding/decoding routines for using RS code only for correcting errors. But, to provide some amount of detection and correction capability (different variant), i guess i need to come up with a different parity check matrix, decoding mechanism. If you have any pointers/references on how to construct the parity check matrix and decoding according to my needs, kindly let me know. — Raviy, Jul 12 '20 at 20:08
Thanks @rcgldr for your inputs. As far as i understand, your solution will provide 4 symbol correction for data (64)+8 check symbols. After this is done, the remaining 4 check symbols can be used to detect upto 4 symbol errors in the corrected(72) + 4 symbols. — Raviy, Jul 16 '20 at 18:10
Thanks @rcgldr, if possible, can you please elaborate on steps/algorithm involved in encoding and decoding for the approach you propose. — Raviy, Jul 22 '20 at 22:59
Thanks @rcgldr for explaining. I assume encoding is similar, i.e one needs to generate 12 ECC symbols for 64 data symbols as i do in the standard case (6 symbol correcting code). — Raviy, Jul 23 '20 at 18:19
During decoding - is it possible to 1st detect if there are <= 8 errors, then employ correction if needed and then check for miscorrections? Thanks a lot for your suggestions — Raviy, Jul 23 '20 at 18:26
I believe, this way the overhead will be less in the common case when there is no error. Because, typically detection is less expensive ( in terms of computation) when compared to correction. — Raviy, Jul 23 '20 at 18:29
@Raviy - I deleted prior my prior comments as they are not needed anymore (you might want to do the same). Unless there are zero errors (all 12 syndromes == 0), to determine the number errors, an error locator polynomial has to be generated, using Berlekamp - Massey method or Sugiyama's extended Euclid method. It would be simplest to use all 12 syndromes to find up to 6 errors, but then treat 5 and 6 error cases as non-correctable, before determining error values and doing the actual correction. — rcgldr, Jul 23 '20 at 20:47

Construction of Reed Solomon - detection and correction

1 Answers1