My app writes the contents of text files to a db memo field. Most files display properly in the db form, but some display in a run-on fashion (all contents on one line). If I open these files in Wordpad they display correctly, so I save them as plain text to make them display properly in the db form.

I'm tired of doing this :-) and would like to perform this "conversion" programmatically. The "analyze the BOM" method I found online says both types of files are ASCII. Trying to load a problem file into a RichTextBox as rich text returns "input file not in correct format". Using the immediate window in debug, I've found that the files that display properly use \r\n for newline, and the ones that display improperly use only \n. Using Regex.Replace to replace \n with \r\n just causes the \r\n to appear visibly as text (rather than as actual line-break characters). The question "Text shown in single line in notepad" seems to be the same type of issue, except the file is being streamed and massaged line-by-line, which I'm thinking shouldn't be necessary.

So... all I need to do programmatically is to somehow mimic the behavior of opening the file in Wordpad and overwriting it as plain text. I've experimented extensively (unsuccessfully) with Encoding.Convert, and the thing that really baffles me is that the problem files appear to be ASCII encoded just like the "good" files. The only difference is that the good files use \r\n where the problem files use just \n. Any help is appreciated...

For what it's worth, this is the statement that's loading the text file contents into the temporary string array (whose contents are later loaded to db). I realize I might need to ReadAllText into a "work" string and massage it before loading it to the array... but the only massaging it needs is to be opened in Wordpad and then overwritten as plain text. If only I could figure out how to do that!

txtfileary[aryctr] = File.ReadAllText(textfile, Encoding.ASCII);
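A minimal sketch of the "work string" idea described above, assuming the only difference between the good and problem files really is \n versus \r\n. The class and method names (`LineEndings`, `ToCrLf`, `ReadAsCrLf`) are made up for illustration; only `File.ReadAllText` and `string.Replace` come from the framework.

```csharp
using System;
using System.IO;
using System.Text;

public static class LineEndings
{
    // Hypothetical helper: turns every line ending into \r\n.
    // Existing \r\n pairs are collapsed to \n first so they are not doubled into \r\r\n.
    public static string ToCrLf(string text)
    {
        return text.Replace("\r\n", "\n")
                   .Replace("\n", "\r\n");
    }

    // Hypothetical helper: reads a file as ASCII (as in the statement above)
    // and normalizes its line endings before the text is stored anywhere.
    public static string ReadAsCrLf(string path)
    {
        return ToCrLf(File.ReadAllText(path, Encoding.ASCII));
    }
}
```

With that in place, the load statement could become `txtfileary[aryctr] = LineEndings.ReadAsCrLf(textfile);`, and files that already use \r\n should pass through unchanged.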

Martijn Pieters
BrianJames
  • Opening it in Wordpad and saving it merely replaces the `\n`, or *nix style line endings, with a carriage return/line feed pair (AKA CRLF or `\r\n`) Windows-style line endings. Encoding isn't involved here at all. Can't explain why your Regex.Replace didn't work, because you didn't include the code. – Ken White Mar 01 '18 at 18:38
  • Have you tried something like `txtFileAry[aryCtr] = Regex.Replace(File.ReadAllText(textFile, Encoding.ASCII), "(?<!\r)\n", "\r\n");`? It would be helpful to see the code and a small sample data file. – Rufus L Mar 01 '18 at 18:51
  • I've also found that if, after populating the database, I change the definition of the memo field from plain text to rich text, everything looks good. And if I change it back to plain text, the changes persist i.e. it still looks good. So it looks like if, after adding records to the database, I go through that little dance (change the memo field definition and then change it back again) I can accomplish what I want. But of course, I'd rather not have to perform that little routine each time I update the database. – BrianJames Mar 01 '18 at 19:01
  • `txtfileary[aryctr] = pattern.Replace(txtfileary[aryctr], @"\r\n")` – BrianJames Mar 01 '18 at 19:03
  • `Regex pattern = new Regex(@"\n")` – BrianJames Mar 01 '18 at 19:03
  • DOH! I don't need the @ (verbatim string) prefix on the replacement string in the Replace statement. With it, \n was being replaced by the literal text \\r\\n instead of an actual CR/LF pair. Took it out and all looks good. Sorry for taking up part of your day... but Ken you did help by making me focus on the Regex.Replace strategy. – BrianJames Mar 01 '18 at 19:24
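
For anyone landing here later, the difference discussed in the comments can be reproduced with a short sketch. The class and method names are hypothetical. The pattern `(?<!\r)\n` is the one suggested by Rufus L above. The behavior relied on is that .NET replacement strings only interpret `$` substitutions, so a backslash in a verbatim replacement string is inserted as-is.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReplaceDemo
{
    // Reproduces the problem: @"\r\n" as a replacement is four literal characters
    // (backslash, r, backslash, n), so they show up as visible text.
    public static string WithVerbatimReplacement(string s)
    {
        return Regex.Replace(s, @"\n", @"\r\n");
    }

    // The fix: a regular string literal, where the C# compiler turns "\r\n"
    // into a real carriage return + line feed. The negative lookbehind skips
    // any \n that is already preceded by \r.
    public static string WithEscapedReplacement(string s)
    {
        return Regex.Replace(s, @"(?<!\r)\n", "\r\n");
    }
}
```

The verbatim prefix is harmless on the pattern itself, because the regex engine interprets `\n` there; it only causes trouble on the replacement argument.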

0 Answers