So I have an assignment for C# in which I need to work with text files and separate words at commas and other punctuation marks. I chose to do it like this:

string Book1 = "@\\..\\Knyga1.txt";
string punctuation = " ,.?!;:\"";
string Read1 = File.ReadAllText(Book1);
string[] FirstFileWords = Read1.Split(punctuation.ToCharArray());

But I've run into a problem. My text files are supposed to be like books, so obviously there are going to be multiple lines. Is there any way to add "the enter key", or whatever the thing that makes a new line is called (sorry for my bad English), as one of the punctuation marks? When I work with individual words later, for example printing out the longest words, words at the start of lines 2, 3, and so on take up two lines in the console.

BligenN

4 Answers

5

Just add \r\n to the list. That's the "enter key" (i.e. a "new line") on Windows, and it's what Environment.NewLine returns there.

string punctuation = " ,.?!;:\"\r\n";

\r stands for "carriage return" and \n stands for "line feed" which, when used together, are called a "new line" (as explained on the above MSDN page and other places like this SO answer).
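One thing to watch for, shown here as a minimal sketch rather than as part of the original answer: Split treats every character in the list as its own delimiter, so a \r\n pair (or a comma followed by a space) produces an empty string between the two delimiters. Passing StringSplitOptions.RemoveEmptyEntries drops those. The sample text and the SplitDemo/SplitWords names are made up for illustration.

```csharp
using System;

public static class SplitDemo
{
    // Splits text on the punctuation list from the answer, dropping empty entries.
    public static string[] SplitWords(string text)
    {
        string punctuation = " ,.?!;:\"\r\n";
        return text.Split(punctuation.ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        // Hypothetical two-line sample with a Windows line ending.
        string sample = "Hello, world!\r\nSecond line.";
        foreach (string word in SplitWords(sample))
        {
            Console.WriteLine(word);
        }
    }
}
```

Without RemoveEmptyEntries, the same call would also return empty strings for ", ", "!\r", "\r\n" and the trailing ".".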

Additionally, there are other not-so-common "vertical whitespace" characters (see my question here for reference). So, to be complete, I would do this to include "vertical tab", "form feed", "next line", "line separator", and "paragraph separator":

string punctuation = " ,.?!;:\"\r\n\v\f\u0085\u2028\u2029";

Here's a Wikipedia article that describes all these and other whitespace chars.

rory.ap
  • What's the difference between \n and \r exactly? And what is a carriage return? – BligenN Dec 02 '16 at 15:37
  • @BligenN \n or newline originally meant just go down one line, and \r or carriage return meant go to the beginning of the line. So you needed both to go to the start of the next line. It's a typewriter thing. – juharr Dec 02 '16 at 15:39
  • Thanks for the update! Real useful information even though I won't be needing it this time as I'm mostly making my own text files and the goal is to make us learn to work with text and not to get too advanced :) But a huge thanks anyway man, appreciate it – BligenN Dec 02 '16 at 15:46
4

To add new lines to your group you need to use the new line and carriage return characters:

string punctuation = " ,.?!;:\"\r\n";
TheLethalCoder
1

If you want to include the end of line, you need \n.
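A small sketch of what happens with \n alone, assuming the file uses Windows line endings (the sample string and the NewlineOnlyDemo/SplitOnLineFeed names are hypothetical): the \r stays attached to the last word of each line, so it either has to be trimmed afterwards or added to the delimiter list.

```csharp
using System;

public static class NewlineOnlyDemo
{
    // Splits on space and '\n' only, leaving any '\r' inside the words.
    public static string[] SplitOnLineFeed(string text)
    {
        return text.Split(new[] { ' ', '\n' }, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        // Hypothetical sample with a Windows line ending in the middle.
        string sample = "first line\r\nsecond line";
        foreach (string word in SplitOnLineFeed(sample))
        {
            // TrimEnd('\r') removes the leftover carriage return, if any.
            Console.WriteLine("[" + word.TrimEnd('\r') + "] raw length " + word.Length);
        }
    }
}
```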

rbucinell
1

You can use char.IsPunctuation to find all punctuation characters:

// Scan all characters and keep only the punctuation ones (585); needs "using System.Linq;"
string punctuation = string.Concat(Enumerable.Range(0, char.MaxValue)
  .Select(c => (char)c)
  .Where(c => char.IsPunctuation(c)));

You may also want to add some characters which, technically, are not punctuation: space, line feed, carriage return:

string punctuation = " \r\n" + 
  string.Concat(Enumerable.Range(0, char.MaxValue)
    .Select(c => (char)c)
    .Where(c => char.IsPunctuation(c)));
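As a rough sketch of how such a delimiter string could then be used (the sample text and the names LongestWordDemo and LongestWords are illustrative assumptions, not from the answer), the text is split with empty entries removed and the longest words are picked out:

```csharp
using System;
using System.Linq;

public static class LongestWordDemo
{
    // Space, CR, LF plus every char that char.IsPunctuation accepts.
    private static readonly char[] Delimiters =
        (" \r\n" + string.Concat(Enumerable.Range(0, char.MaxValue)
            .Select(c => (char)c)
            .Where(c => char.IsPunctuation(c)))).ToCharArray();

    // Returns all words that share the maximum length, in order of appearance.
    public static string[] LongestWords(string text)
    {
        string[] words = text.Split(Delimiters, StringSplitOptions.RemoveEmptyEntries);
        if (words.Length == 0) return words;
        int max = words.Max(w => w.Length);
        return words.Where(w => w.Length == max).ToArray();
    }

    public static void Main()
    {
        // Hypothetical sample spanning two lines.
        string sample = "Short words, then\r\nlongest ones.";
        Console.WriteLine(string.Join(", ", LongestWords(sample)));
    }
}
```

Note that char.IsPunctuation also accepts the apostrophe and the hyphen, so words such as "don't" would be split in two with this delimiter set.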
Dmitry Bychenko