I am attempting to clean up a flat file in a C-ISAM database that has corrupt data. The flat file is fixed position delimited, and the field I'm trying to clean up is a date field. The field uses an eight digit format (YYYYMMDD). What I am trying to do is use a text editor with regex capabilities to find any row where this column does not contain eight digits and replace it with a null value.
Because the true data that would have been there is irrecoverable and because the conversion scripts throw away invalid dates, the date does not need to be valid, only the format. In other words, 20160512 and 99999999 would be valid values, but all blanks is not, nor would " 123 ".
I don't have a great grasp of RegEx because I end up working with it so infrequently, but I'm most of the way there:
(?<=^.{1018})[^\d]{8}
This skips the first 1018 columns in a row and selects the next 8 characters if there are no digits within them. This catches the first 'not valid' example I gave but not the second. I need one that does both.