I am working on a C# program that checks the line length of every row in several large text files (100,000+ rows each) before they are imported with an SSIS package. I will also be checking other values on each line to verify they are correct before importing them into my database.
For example, I am expecting a line length of 3000 characters and then a CR at 3001 and LF at 3002, so overall a total of 3002 characters.
ReadLine() treats a CR or an LF as the end of a line, so I can't inspect the CR and LF characters themselves. Until now I had only been checking that each line was 3000 characters long. I have just run into a file that has an LF at position 3001 but is missing the CR. ReadLine() reports 3000 characters, which is correct, but the file will fail in my SSIS package because the CR is missing.
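To make the problem concrete, here is a small sketch showing that ReadLine() returns the same string whether the line ends in CR LF or in a bare LF. The class and method names (`ReadLineDemo`, `FirstLine`) are made up for illustration, and a `StringReader` stands in for the real file.

```csharp
using System;
using System.IO;

static class ReadLineDemo
{
    // Returns the first line of the text exactly as ReadLine() sees it.
    // The terminator (CR, LF, or CR LF) is stripped and never returned.
    public static string FirstLine(string text)
    {
        using (var reader = new StringReader(text))
        {
            return reader.ReadLine();
        }
    }

    static void Main()
    {
        // Both calls yield "abc", so the missing CR is invisible here.
        Console.WriteLine(FirstLine("abc\r\nrest").Length); // prints 3
        Console.WriteLine(FirstLine("abc\nrest").Length);   // prints 3
    }
}
```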
I have verified that Read() will read each char one at a time, which lets me determine whether each line has a CR and LF, but this seems rather unproductive, and since some of the files I will encounter have upwards of 5,000,000 rows it seems very inefficient. I would also then need to add each char to a string, or use ReadBlock() and convert a char array into a string, so that I can check other values in the line.
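For reference, a minimal sketch of a buffered variant of this idea: read the file in large chunks with Read(char[], int, int), split on LF by hand, and check that each line is exactly the expected length plus a trailing CR. It assumes every record should be `dataLength` characters followed by CR LF. The names `LineTerminatorChecker` and `FindBadLines` and the 64 KB buffer size are hypothetical choices, not anything prescribed by SSIS.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class LineTerminatorChecker
{
    // Returns the 1-based numbers of lines that are not exactly
    // dataLength characters followed by CR LF.
    public static List<long> FindBadLines(TextReader reader, int dataLength)
    {
        var bad = new List<long>();
        var line = new StringBuilder(dataLength + 2);
        var buffer = new char[64 * 1024]; // assumed chunk size
        long lineNumber = 0;
        int read;

        while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                char c = buffer[i];
                if (c != '\n')
                {
                    line.Append(c);
                    continue;
                }

                lineNumber++;
                // line holds everything before the LF, including the CR if present.
                bool ok = line.Length == dataLength + 1 && line[dataLength] == '\r';
                if (!ok) bad.Add(lineNumber);
                // Other per-line checks could run on line.ToString(0, dataLength) here.
                line.Clear();
            }
        }

        // Trailing characters with no final LF count as a bad line.
        if (line.Length > 0) bad.Add(lineNumber + 1);
        return bad;
    }

    static void Main()
    {
        // Tiny stand-in for a real file, using 3-character records.
        string sample = "abc\r\nabc\nab\r\nabc\r\n";
        using (var reader = new StringReader(sample))
        {
            // Line 2 is missing its CR and line 3 is too short.
            Console.WriteLine(string.Join(",", FindBadLines(reader, 3))); // prints 2,3
        }
    }
}
```

For a real file, the same method could be given a `StreamReader` opened on the file path with `dataLength` set to 3000. Because the scan resynchronises on every LF, one bad line does not throw off the checks for the lines after it.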
Does anyone have any ideas on an efficient way to check each line for the CR and LF, along with other values on the line, without wasting resources and while finishing in a reasonably timely manner?