I need to count the records in a delimited text file that is 600 MB in size. The column delimiter is a pipe (|), and every value is qualified with a special character (in this case ±). Some values contain a newline character, so counting lines gives the wrong result: when I read the sample below one line at a time I get 9 records, but the correct count is 7. The original post also showed this data as an image. Sample data:
±0000958779±|±KR±|±FEOUL±|±2F, 759, YEOKFAM-DONF, FANFNAM-FU±|±±
±0000958774±|±KR±|±BUFAN±|±208-7, CHOEUM-DONF, BUFANJIN-FU±|±±
±0000518874±|±RU±|±M.O, F. Odincovo±|±ZAO " Mremium Otel Menedjment"±|±±
±0000518971±|±RU±|±Famara±|±ul.Molevaya,80,
FamarFkaya ForodFka±|±±
±0000519050±|±RU±|±MoF VniiFFok±|±VlaFenko Ol'Fa VaFil'evna±|±±
±0000519027±|±RU±|±Ft-MeterFburF±|±DorozhinFkaya LariFa Anatol
evna±|±±
±0000958779±|±KR±|±FEOUL±|±MART AV CLINIC(CLOFED)±|±±
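
One possible way to get the count, sketched in Python (the question does not name a language, so this choice is an assumption). The idea is to stream the file line by line and track whether a ±-qualified value is still open: a physical line that contains an odd number of qualifiers flips that state, and a record is counted only on a line that ends with no value open. The sketch assumes the qualifier never appears inside a value and that the file is UTF-8 encoded; the function names `count_records` and `count_records_in_file` are hypothetical.

```python
from typing import Iterable

QUALIFIER = "±"  # text qualifier taken from the sample data


def count_records(lines: Iterable[str], qualifier: str = QUALIFIER) -> int:
    """Count logical records when qualified values may span several lines.

    Assumption: the qualifier character never occurs inside a value.
    """
    records = 0
    open_field = False  # True while we are inside an unterminated value
    for line in lines:
        # An odd number of qualifiers on a line opens or closes a value.
        if line.count(qualifier) % 2 == 1:
            open_field = not open_field
        # A record ends on a non-blank line that leaves no value open.
        if not open_field and line.strip():
            records += 1
    return records


def count_records_in_file(path: str, encoding: str = "utf-8") -> int:
    # The encoding is an assumption; adjust it (for example to "cp1252")
    # if the qualifier does not decode as expected.
    # Iterating the file object reads one line at a time, so the 600 MB
    # file is never loaded into memory as a whole.
    with open(path, encoding=encoding) as handle:
        return count_records(handle)
```

Applied to the nine sample lines above, `count_records` returns 7, because the two split values (the Famara address and the name ending in "evna") each leave a value open at the end of their first line.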