
The input is:

<p>1:4 And David said unto him, How went the matter? I pray thee, tell me.</p>

<p>And he answered, That the people are fled from the battle, and many of the people also are fallen and dead; and Saul and Jonathan his son are dead also.</p>

Here the first paragraph starts with a verse number (1:4), while the second contains only text.

I want to find every <p> tag in the HTML file that contains only text and merge its content into the previous <p> tag.

That is, the result should be:

1:4 And David said unto him, How went the matter? I pray thee, tell me. And he answered, That the people are fled from the battle, and many of the people also are fallen and dead; and Saul and Jonathan his son are dead also.

Can I do it like this?

Regex.IsMatch(html, @"^[a-zA-Z]+$");

How can I do that?

Max
TinKerBell
  • Are you saying that you want to merge human paragraphs within a verse so that each HTML `paragraph` contains the whole verse, starting with the Biblical reference? – azhrei Mar 18 '13 at 06:06

1 Answer


Looks like I got what you're trying to achieve:

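using System;
using System.Text;
using System.Text.RegularExpressions;

// `input` holds the HTML text, one <p>...</p> element per line
StringBuilder sb = new StringBuilder();
foreach (string raw in input.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries))
{
    string line = raw.Trim();

    // notice different regex, i.e.:
    // new paragraph starts with `<p>x:y` and ends with `</p>`

    if (sb.Length > 0 && Regex.IsMatch(line, @"^<p>[0-9]+:[0-9]+.+</p>$"))
    {
         sb.AppendLine(); // insert line break before each new verse
    }

    sb.Append(line); // text-only paragraphs stay on the current verse's line
}
string result = sb.ToString();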

Works for me; see sandbox: one, two.
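If the goal is one `<p>` element per verse rather than several elements on one line, a minimal sketch along the following lines might help. It assumes plain `<p>` tags without attributes or nested markup, and that every verse begins with a `chapter:verse` reference followed by whitespace; a continuation paragraph that happens to start with something like `12:30 ` would be misread as a new verse. The names `VerseMerger` and `MergeVerses` are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class VerseMerger
{
    // Matches one <p>...</p> element; group 1 is its inner text.
    private static readonly Regex Paragraph =
        new Regex(@"<p>(.*?)</p>", RegexOptions.Singleline);

    // Assumption: a verse starts with a reference such as "1:4 ".
    private static readonly Regex VerseStart = new Regex(@"^\d+:\d+\s");

    public static string MergeVerses(string html)
    {
        var verses = new List<string>();
        foreach (Match m in Paragraph.Matches(html))
        {
            string text = m.Groups[1].Value.Trim();
            if (text.Length == 0) continue; // skip empty paragraphs

            if (verses.Count == 0 || VerseStart.IsMatch(text))
            {
                verses.Add(text); // start a new verse
            }
            else
            {
                // text-only paragraph: append to the previous verse
                verses[verses.Count - 1] += " " + text;
            }
        }

        // Wrap each merged verse in a single <p> element, one per line.
        var lines = new List<string>();
        foreach (string verse in verses)
        {
            lines.Add("<p>" + verse + "</p>");
        }
        return string.Join("\n", lines);
    }
}
```

Calling `VerseMerger.MergeVerses(html)` on the two paragraphs from the question should give a single `<p>` that starts with `1:4` and ends with the text of the second paragraph. For anything beyond simple markup like this, an HTML parser is likely a safer choice than regular expressions.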

abatishchev
  • It's not working for me. Please help me get only the text of a `<p>` tag that contains no number, i.e. only:

    And he answered, That the people are fled from the battle, and many of the people also are fallen and dead; and Saul and Jonathan his son are dead also.

    – TinKerBell Mar 18 '13 at 09:09
  • @TinKerBell: `line = line.Replace("<p>", "").Replace("</p>", "")` will remove both tags, but *from everywhere in the text*
    – abatishchev Mar 18 '13 at 18:39
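
To strip only the enclosing tags of a line, rather than every occurrence in the text, one option is to check the start and end of the string explicitly. This is a rough sketch that assumes each line is a single `<p>...</p>` element without attributes; `TagStripper` and `StripOuterParagraph` are hypothetical names.

```csharp
using System;

public static class TagStripper
{
    public static string StripOuterParagraph(string line)
    {
        const string open = "<p>", close = "</p>";
        string text = line.Trim();

        // Remove the tags only when they wrap the whole line.
        if (text.Length >= open.Length + close.Length &&
            text.StartsWith(open, StringComparison.OrdinalIgnoreCase) &&
            text.EndsWith(close, StringComparison.OrdinalIgnoreCase))
        {
            text = text.Substring(open.Length,
                                  text.Length - open.Length - close.Length);
        }
        return text.Trim();
    }
}
```

Lines that are not wrapped in `<p>...</p>` are returned unchanged apart from trimming, and any tags inside the paragraph text are left alone.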