
This question has been asked before for other languages, but I couldn't find anything on using a regex or any other algorithm to solve it in C#.

For example:

Photosynthesis maintains atmospheric oxygen levels and supplies all of the organic compounds and most of the energy necessary for life on Earth. Most cases, oxygen is also released as a waste product. (((((THIS SERIES OF SPACES HERE THAT SUGGEST THE END OF A PARAGRAPH))))
Although photosynthesis is performed differently by different species, the process always begins when energy from light is absorbed by proteins called reaction centers that contain green chlorophyll pigments.

should be formatted as:

Photosynthesis maintains atmospheric oxygen levels and supplies all of the organic compounds and most of the energy necessary for life on Earth. Most cases, oxygen is also released as a waste product.

Although photosynthesis is performed differently by different species, the process always begins when energy from light is absorbed by proteins called reaction centers that contain green chlorophyll pigments.

How do I get this done?

Kurubaran
WyldeBoyy

2 Answers

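// Requires: using System; using System.Text.RegularExpressions;
var SpacedText = "Some sample text.           This should be a new paragraph.";

// Replace each run of two or more whitespace characters with a newline.
var NewlineText = Regex.Replace(SpacedText, @"\s{2,}", Environment.NewLine);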

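Change the 2 in the regex to the minimum number of consecutive whitespace characters you want it to break on. Note that \s also matches tabs and line breaks, not only spaces.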

Environment.NewLine can be replaced with whatever newline delimiter you need (<br /> for HTML, or any listed here).
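To get the blank line between paragraphs shown in the question, a variant along these lines might work. It is only a sketch: the class and method names (ParagraphBreaker, BreakOnSpaceRuns) are made up for illustration, it matches spaces and tabs with [ \t] instead of \s so that existing line breaks are not counted, and it assumes at most one line break directly follows the run of spaces.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ParagraphBreaker
{
    // Hypothetical helper: replaces each run of at least minSpaces spaces or tabs
    // (plus one directly following line break, if present) with a blank line.
    public static string BreakOnSpaceRuns(string text, int minSpaces)
    {
        // {n,} means "n or more", so minSpaces is a minimum, not an exact count.
        string pattern = "[ \\t]{" + minSpaces + ",}(\\r?\\n)?";

        // Two newlines in a row produce an empty line between the paragraphs.
        string paragraphBreak = Environment.NewLine + Environment.NewLine;

        return Regex.Replace(text, pattern, paragraphBreak);
    }

    public static void Main()
    {
        string input = "First paragraph.      Second paragraph.";
        Console.WriteLine(BreakOnSpaceRuns(input, 4));
    }
}
```

Runs shorter than minSpaces, such as the single space between sentences, are left untouched.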

  • This sort of does the trick :D but how do I change the regex so that it breaks on 4 or more spaces instead of 'exactly' 4 spaces? – WyldeBoyy Aug 17 '14 at 09:43
  • If you play around with the regex, you'll find that the number specifies only the minimum number of spaces to break on. Change the 2 to a 4 and you're away. –  Aug 17 '14 at 09:44

My best guess is to match the period that ends a sentence, any trailing whitespace, and the line break that follows it, and then replace the whole match with a period followed by a carriage return/line feed.

In this case the regex would be

 \.\s*[\r\n]+

http://regex101.com/r/cU2tF9/1
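In C# this could be applied roughly as follows. This is a sketch, not tested against the asker's real data: the class and method names (SentenceBreaker, NormalizeSentenceBreaks) are invented for illustration, and Environment.NewLine stands in for the carriage return/line feed.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SentenceBreaker
{
    // Hypothetical helper: wherever a period is followed by optional whitespace
    // and at least one line break, collapse that to a period plus one newline.
    public static string NormalizeSentenceBreaks(string text)
    {
        return Regex.Replace(text, @"\.\s*[\r\n]+", "." + Environment.NewLine);
    }

    public static void Main()
    {
        string input = "First sentence.     \nSecond sentence.";
        Console.WriteLine(NormalizeSentenceBreaks(input));
    }
}
```

Because the pattern requires a line break, a period followed only by spaces on the same line is not matched.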

Pieter21