Reed Solomon Decoding - Error Correction - Syndromes Calculation

Question

I am implementing Reed-Solomon decoding for QR codes in C++. So far I have implemented the main part of decoding and error detection, following the ISO/IEC 18004:2006 manual. According to Annex B (error correction decoding steps), the syndromes S(i) are calculated as S(i) = R(a^i). Assume the High error correction level, so a Version 1 QR code has 9 data codewords and 17 error correction codewords, 26 codewords in total. I therefore assume that the polynomial R(x) shown on p. 76 of the ISO/IEC 18004:2006 manual is the sequence of data codewords and error correction codewords, each with the corresponding power of x. So S(i) = R(a^j), where i = 0...15 and j = 0...25 for the High error correction level. But when I run my code on a complete QR code matrix with no errors, where I expect all syndromes to be zero, I get non-zero syndromes. Have I misunderstood something about syndrome calculation in Galois field arithmetic for Reed-Solomon decoding?

This is more of a math question about Reed-Solomon codes than a C++ question and your question doesn't involve C++ details. Consider removing the C++ tag. You might look at the math section of stackexchange. — doug, Jan 11 '17 at 17:04

rcgldr · Accepted Answer · 2017-07-02T14:47:12.307

After looking at QR Code references, for version 1, level H, with 9 data bytes and 17 error correction bytes, using generator polynomial g(x) = (x-1)(x-a)(x-a^2)...(x-a^(16)) you should be using syndromes S(i) = R(a^i) for i = 0 to 16. In a no error case, all 17 syndromes should be zero.
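A minimal C++ sketch of this check, assuming GF(2^8) with reducing polynomial hex 11d and a = 2, and assuming the received codewords are stored in transmission order so that the first data codeword is the coefficient of the highest power of x. The names `gf_mul`, `syndromes` and `rs_encode` are hypothetical; `rs_encode` is included only so that the zero-syndrome case can be checked against a freshly encoded block.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Byte = std::uint8_t;

// Multiply in GF(2^8), reducing by x^8 + x^4 + x^3 + x^2 + 1 (hex 11d).
Byte gf_mul(Byte a, Byte b) {
    unsigned product = 0, aa = a;
    for (unsigned bb = b; bb != 0; bb >>= 1) {
        if (bb & 1u) product ^= aa;     // add (xor) the current multiple of a
        aa <<= 1;                       // multiply a by x
        if (aa & 0x100u) aa ^= 0x11du;  // reduce when degree reaches 8
    }
    return static_cast<Byte>(product);
}

// S(i) = R(a^i) for i = 0 .. num_ecc-1, evaluated with Horner's rule.
// Assumption: received[0] is the coefficient of the highest power of x.
std::vector<Byte> syndromes(const std::vector<Byte>& received, int num_ecc) {
    std::vector<Byte> s(static_cast<std::size_t>(num_ecc), 0);
    Byte alpha_i = 1;  // a^0
    for (Byte& si : s) {
        Byte acc = 0;
        for (Byte c : received) acc = static_cast<Byte>(gf_mul(acc, alpha_i) ^ c);
        si = acc;
        alpha_i = gf_mul(alpha_i, 2);  // next power of a
    }
    return s;
}

// Hypothetical helper: systematic encoding with g(x) = (x-1)(x-a)...(x-a^(num_ecc-1)).
std::vector<Byte> rs_encode(const std::vector<Byte>& data, int num_ecc) {
    assert(num_ecc > 0 && data.size() + static_cast<std::size_t>(num_ecc) <= 255);
    std::vector<Byte> g{1};  // generator, highest power first
    Byte root = 1;
    for (int i = 0; i < num_ecc; ++i) {
        g.push_back(0);  // multiply g(x) by (x + root); minus is xor in GF(2^8)
        for (std::size_t k = g.size() - 1; k > 0; --k) g[k] ^= gf_mul(g[k - 1], root);
        root = gf_mul(root, 2);
    }
    std::vector<Byte> work(data);
    work.resize(data.size() + static_cast<std::size_t>(num_ecc), 0);
    for (std::size_t i = 0; i < data.size(); ++i) {  // long division by g(x)
        Byte coef = work[i];
        if (coef == 0) continue;
        for (std::size_t k = 1; k < g.size(); ++k) work[i + k] ^= gf_mul(g[k], coef);
    }
    std::copy(data.begin(), data.end(), work.begin());  // data followed by remainder
    return work;
}
```

For version 1, level H, `syndromes(rs_encode(data, 17), 17)` should return 17 zeros for any 9 data bytes. If clean data gives non-zero syndromes, a reversed coefficient order or a different field polynomial are possible causes worth checking.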

There's a decent wiki article for Reed Solomon error correction:

http://en.wikipedia.org/wiki/Reed%E2%80%93Solomon_error_correction

The wiki article contains a link to a NASA tech brief RSECC tutorial:

http://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/19900019023.pdf

Link to C source code for a console program that demonstrates RSECC methods for 8 bit field (user chooses from 29 possible fields). I use Microsoft compilers or Visual Studio to compile it and Windows to run it, but it should work on most systems.

Note - I updated the ecc demo program to handle erasures in addition to errors, just in case it could be useful. Also added code to calculate error value polynomial Omega in case Euclid method is not used. The link is the same as before:

http://rcgldr.net/misc/eccdemo8.zip

Update based on the questions in comments:

My question about which GF(2^8):

GF(2^8) is based on 9 bit polynomial
        x^8 + x^4 + x^3 + x^2 + 1 = hex 11d
        primitive is x + 0 (hex 2)
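As an illustration of this field, here is a small sketch that builds exponent and logarithm tables for GF(2^8) from the polynomial hex 11d and the primitive element hex 2, then multiplies and inverts through the tables. The struct `Gf256` and its member names are hypothetical, not taken from the ISO document or from the demo program.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

struct Gf256 {
    // exp_table is doubled in length so that exp_table[log a + log b]
    // can be read without reducing the index modulo 255.
    std::array<std::uint8_t, 512> exp_table{};
    std::array<int, 256> log_table{};  // log_table[0] is unused (log of 0 is undefined)

    Gf256() {
        unsigned x = 1;  // a^0
        for (int i = 0; i < 255; ++i) {
            exp_table[i] = static_cast<std::uint8_t>(x);
            log_table[x] = i;
            x <<= 1;                      // multiply by the primitive element (hex 2)
            if (x & 0x100u) x ^= 0x11du;  // reduce by x^8 + x^4 + x^3 + x^2 + 1
        }
        for (int i = 255; i < 512; ++i) exp_table[i] = exp_table[i - 255];
    }

    std::uint8_t mul(std::uint8_t a, std::uint8_t b) const {
        if (a == 0 || b == 0) return 0;
        return exp_table[log_table[a] + log_table[b]];
    }

    std::uint8_t inv(std::uint8_t a) const {
        assert(a != 0);  // zero has no multiplicative inverse
        return exp_table[255 - log_table[a]];
    }
};
```

With these tables, a^8 comes out as hex 1d (the low 8 bits of hex 11d), and a^255 wraps back to 1, which is what makes hex 2 usable as the primitive element for this polynomial.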

Looking up QR code references, different generator polynomials are used depending on the correction level: L (low), M (medium), Q (quality), H (high).

Question about decoding using matrices: the Sklar paper shows decoding using linear equations and matrix inversion. This procedure has to assume a maximum error case t, which is floor(e / 2), where e is the number of error correction bytes (also called parity bytes or redundant bytes). If the determinant is zero, try t-1 errors; if that is also zero, try t-2 errors, and so on, until the determinant is non-zero or t is reduced to zero.

The Euclid or Berlekamp Massey decoding methods will automatically determine the number of errors.

In all cases, if there are more than t errors, there's some chance that a mis-correction will occur, depending on the odds of producing t locations where none of them are out of range. If any of the t locations found from error correction are out of range, then an uncorrectable error has been detected.

Update #2

I did a quick overview of the ISO document.

The generator polynomial is (x - 1) (x - 2) (x - 2^2) ..., so the syndromes to check are S(0) to S(n-1), where n is the number of error correction bytes, as you mentioned before. In the case of zero errors, all syndromes S(0) to S(n-1) should be zero.

The ISO document uses the term codewords to refer to bytes (or symbols), but in most ecc articles, the term codeword refers to an array of bytes including data and error correction bytes, and the error correction bytes are often called parity bytes, redundant bytes or remainder bytes. So keep this in mind if reading other ecc articles.

Page 37 of the ISO document mentions "erasures" and "errors", which is RSECC terminology. "Erasures" refer to bad (or potentially bad) data bytes at known locations, detected outside of RSECC. "Errors" refer to bad bytes not detected outside of RSECC, and only determined during RSECC decoding. The document then notes that there are no invalid data bit patterns, which would imply that there is no "erasure" detection. It then adds to the confusion by showing an equation that includes erasure and error counts.

If you're curious, the NASA pdf file on RSECC explains erasure handling starting at page 86, but I don't think this applies to QR codes.

http://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/19900019023.pdf

Getting back to the ISO document, it uses p to denote the number of error correction bytes used for misdecode protection, as opposed to being used for correction. This is shown in table 9 on page 38. For version 1, which seems to be what you're using, reinterpreting:

error correction level
|    number of data bytes
|    |    number of ecc bytes used for correction
|    |    |    number of ecc bytes used for misdecode protection (p)
|    |    |    |    correction capability
L   19    4    3    2/26 ~ 07.69%
M   16    8    2    4/26 ~ 15.38%
Q   13   12    1    6/26 ~ 23.08%
H    9   16    1    8/26 ~ 30.77%

Given that this table shows that the expected correction capability is met without the usage of erasures, then even if erasures could be detected, they are not needed.

With GF(2^8), there are 255 (not 256) possible error locations that can be generated by RSECC decoding, but in version 1, there are only 26 valid locations. Any generated location outside of the 26 valid locations would be a detection of an uncorrectable error. So for L level, the 3 p bytes translate into miscorrection odds of 1/(2^24), and the location range multiplies this by (26/255)^2, for a probability of ~6.20E-10. For H level, the 1 p byte translates into miscorrection odds of 1/(2^8), and the location range multiplies this by (26/255)^8, for a probability of ~4.56E-11.

Note that for version 2, p = 0 for levels M, Q, H, relying on the location range (44/255)^(8 or 11 or 14) for miscorrection probabilities of 7.86E-7, 4.04E-9, 2.07E-11.
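The estimates above can be reproduced with a few lines of arithmetic. This is only a sketch of the reasoning in this answer: (1/256)^p for the misdecode protection bytes, times (n/255)^t for the chance that all t generated locations fall inside the n valid ones. The function name and parameter names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Rough miscorrection estimate: (1/256)^p * (n/255)^t, where
//   total_codewords     n = data bytes + error correction bytes in the block
//   correctable_errors  t = number of errors the level is specified to correct
//   p                   = number of error correction bytes reserved for misdecode protection
double miscorrection_probability(int total_codewords, int correctable_errors, int p) {
    assert(total_codewords > 0 && total_codewords <= 255);
    assert(correctable_errors >= 0 && p >= 0);
    return std::pow(1.0 / 256.0, p) *
           std::pow(total_codewords / 255.0, correctable_errors);
}
```

For example, `miscorrection_probability(26, 2, 3)` corresponds to version 1 level L, `miscorrection_probability(26, 8, 1)` to version 1 level H, and `miscorrection_probability(44, 14, 0)` to version 2 level H.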

edited Jul 02 '17 at 14:47

answered Jan 11 '17 at 17:04

rcgldr


Assuming the High error correction level, we have 9 data codewords & 17 error correction codewords = 26 codewords in total for Version 1 QR codes. Also, for High ECL, 16 syndromes are needed, S0...S15, following ISO/IEC for QR codes, because the generator polynomial starts from a^0, i.e. g(x) = (x-a^0)*(x-a^1)*... So, R(x) = DCs + ECCs. My question is whether the data and error correction codewords in the r(x) polynomial are ordered least significant first, and so on ... – dimkatsi91 Jan 16 '17 at 13:07
Thanks. Also, for everyone that comes up to that thread, I have to say that I found the next notes set : http://mathcs.holycross.edu/~little/MSRI-UP2009/LecturesWeek3Handouts.pdf good enough for understanding Extended Euclidean Algorithm for Reed Solomon Codes in Galois Fields. – dimkatsi91 Feb 04 '17 at 14:27
@dimkatsi - Were you able to get your error correction code working? – rcgldr Feb 20 '17 at 00:32
@dimkatsi - I assume you were able to get the syndromes == zero when there are no errors? You might want to try my latest demo code (download it again). If you run the demo, enter these values to the first 5 prompts: 1, N, Y, 17, 9 . The data starts off as all zeroes (no error). Enter 'C' to change data: enter an offset (0->25) and value (0x01 -> 0xFF), hit return a second time to get back to main "menu", then enter 'F' to fix (correct) the data. Or, post a link to your code and I'll check it out. – rcgldr Feb 20 '17 at 12:12
@dimkatsi - There is one special ending case for Euclid: sometimes Omega has a leading zero (I don't know if there can be more than one leading zero). In my two shift register implementation, I shift out the leading dividend zeroes in my main Euclid loop, but after the loop I have a check for the number of terms that should be in Omega and shift right to bring the leading zero(s) back if needed. – rcgldr Feb 20 '17 at 12:13
Hello, a question about Forney's method for calculating the error values. As described in https://downloads.bbc.co.uk/rd/pubs/whp/whp-pdf-files/WHP031.pdf , the error value is Yj = Xj * (Omega(Xj^-1) / Lambda'(Xj^-1)), but when I apply this to the example on p. 38, I get Λ'(a^-9) = 6 and not 5 as stated there. Am I missing something ? – dimkatsi91 Mar 01 '17 at 20:10
@dimkatsi - There's an issue with the explanation of the derivative of a finite field polynomial on page 20. It's explained in [wiki formal derivative](http://en.wikipedia.org/wiki/Forney_algorithm#Formal_derivative): multiplication by the exponent in the derivative is repeated addition. Since addition in GF(256) or GF(16) is xor, then for f(x) = 1 + f1 x + f2 x^2 + f3 x^3 + ..., f'(x) = f1 + (f2 x + f2 x) + (f3 x^2 + f3 x^2 + f3 x^2) + ... = f1 + f3 x^2 + ... . The even-power terms of f contribute 0 and the odd-power terms keep their coefficients fn. Given Λ(x) = 5 x^2 + 5x + 15, then Λ'(x) = 5 regardless of x. – rcgldr Mar 01 '17 at 21:58
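A short sketch of the formal derivative described in the comment above, assuming coefficients are stored lowest power first and that the field has characteristic 2 (so adding a value to itself an even number of times gives 0). The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Byte = std::uint8_t;

// Formal derivative over GF(2^m). poly[k] is the coefficient of x^k.
// k * f_k means f_k added to itself k times; under xor this is f_k
// for odd k and 0 for even k.
std::vector<Byte> formal_derivative(const std::vector<Byte>& poly) {
    std::vector<Byte> d;
    for (std::size_t k = 1; k < poly.size(); ++k) {
        d.push_back((k % 2 == 1) ? poly[k] : Byte{0});
    }
    return d;
}

// The example from the comment: Lambda(x) = 5 x^2 + 5 x + 15.
void check_lambda_example() {
    std::vector<Byte> d = formal_derivative({15, 5, 5});
    assert((d == std::vector<Byte>{5, 0}));  // Lambda'(x) = 5 + 0 x
}
```

Because the x term of the derivative vanishes, evaluating it at any point, including a^-9, gives 5 in this example.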
Hello, just a question about error correction; If I have erasures, should I follow different correction approach from errors situation ? – dimkatsi91 Mar 30 '17 at 09:57
@dimkatsi - the erasures are used to modify the syndromes, and the modified syndromes used to generate error locations as usual. Then erasures and error locations are merged. Omega from Euclid can't be used because it's based on error locations only, so it has to be recalculated. The example code and the NASA article I linked to above handle this. I'm not sure how you get erasures from QR Code, since it seems there are no invalid data patterns. – rcgldr Mar 30 '17 at 15:09
With invalid data patterns may you mean wrong format string bit sequence ( 15 bits sequence) ? ( Table C1/ ISO/IEC 18004 ) – dimkatsi91 Mar 30 '17 at 16:50
@dimkatsi - I meant invalid data patterns, which are 8 bit data patterns, none of them invalid from what I understand. – rcgldr Mar 30 '17 at 19:48
About erasures, as mentioned in the wiki, they are errors except that we know the locations where they occurred. So there is no need to calculate the error locator polynomial, only the error magnitude polynomial, and they are corrected just like errors. But the question remains: how does my code know beforehand which is an error and which is an erasure ? – dimkatsi91 Apr 03 '17 at 08:33
@dimkatsi - the ISO .pdf document mentions erasures, but the error correction procedure in Annex B doesn't deal with erasures, so there's an apparent conflict in the document. Erasure detection would be done before and outside of RS (Reed Solomon) correction and used as an input to RS correction. – rcgldr Apr 03 '17 at 11:09
Thanks for the information. I am just looking for a condition that tells whether an error is an error or an erasure, but from what I understand there is no such condition, since erasures are determined before the Reed Solomon error correction algorithm runs. – dimkatsi91 Apr 03 '17 at 11:25
@dimkatsi - error and erasure is described in [wikiversity RS for coders](https://en.wikiversity.org/wiki/Reed%E2%80%93Solomon_codes_for_coders#Error_and_erasure_correction). It's also handled in my demo code [eccdemo8.zip](http://rcgldr.net/misc/eccdemo8.zip) . I don't know how your specific QR code reader works and how erasures would be detected. – rcgldr Apr 03 '17 at 11:37
Hello, which is the best Opencv QR Detect Algorithm ? I have found next resource : http://dsynflo.blogspot.gr/2014/10/opencv-qr-code-detection-and-extraction.html . What is your opinion sir ? – dimkatsi91 May 28 '17 at 18:04
@dimkatsi - I'm not familiar with QR visual detection, which converts the scanned image of an angled code into a non-angled 2d image of the code, perhaps as a matrix of 0 and 1 bits. – rcgldr May 28 '17 at 21:27
Just for everyone's information about the DSynFlo QR detection algorithm: it detects a QR symbol with high accuracy. It is implemented using OpenCV and it stores all needed images, both the captured image and the detected QR code symbol image. So it is well suited for QR code symbol detection, since it uses affine transformation image processing techniques. In conclusion, it may be the best code out there for QR code symbol detection. – dimkatsi91 Jun 16 '17 at 23:36
Also, sir I would like to know your name if it is possible. I would like to mention your help in my master thesis document if it is ok with you. Thanks again for any help you provided me. – dimkatsi91 Jun 16 '17 at 23:43
@dimkatsi - email me at rcgldr@cox.net . – rcgldr Jun 17 '17 at 01:54
