
This would be the sample text:

<option value="USD">American Samoa, United States Dollar (USD)</option>
<option value="EUR">Andorra, Euro (EUR)</option>
<option value="AOA">Angola, Kwanza (AOA)</option>
<option value="XCD">Anguilla, East Caribbean Dollar (XCD)</option>
<option value="XCD">Antigua and Barbuda, East Caribbean Dollar (XCD)</option>
<option value="ARS">Argentina, Peso (ARS)</option>

This is my try:

<option selected="selected" value="[A-Z]{3}">(?<Test>).+</option>.

The problem is that it only matches the first occurrence it finds, while I want it to get them all. What am I missing in my attempt?

Quoter
  • I don't see any C# code in the question. Did you select the wrong tag? Or is there something you haven't shared? – psubsee2003 May 24 '14 at 23:03
  • First of all, regex is not good for parsing HTML. Second, your regex does not match any element in the text above. Could you show your C# code? – Oleksii Aza May 24 '14 at 23:03
  • Hi guys, sorry I'm using Expresso (much faster than F5/shift F5) at the moment to test it. After that, I'll do it in C#. But why are you saying it isn't matching anything? I have to say that I copied just a few lines. I'm running this against all currencies. – Quoter May 24 '14 at 23:10
  • It isn't matching anything because you're trying to match selected="selected" and none of the options have this attribute – briba May 24 '14 at 23:19
  • Isn't it because only 1 option in your text is selected? Your regex requires selected="selected", so it can only match that one. – Bartosz Wójtowicz May 24 '14 at 23:29
  • Ah I see, I should be escaping that. But what I need is everything between the option tags. This html is just a rip from a site. I just need those currencies in my application in a resource file. – Quoter May 25 '14 at 08:14

2 Answers

Regex is not recommended for HTML parsing.

Why don't you use the HTML Agility Pack?

http://htmlagilitypack.codeplex.com/

Here is an example:

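 // Requires the HtmlAgilityPack package: using HtmlAgilityPack;
 HtmlDocument doc = new HtmlDocument();
 doc.LoadHtml("YOUR HTML STRING");
 // "//option" selects every option; append [@selected='selected'] to keep only the selected one.
 HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//option");
 if (nodes != null) // SelectNodes returns null when nothing matches
 {
    foreach (HtmlNode node in nodes)
    {
       string text = node.InnerText;                  // "American Samoa, United States Dollar (USD)"
       string value = node.Attributes["value"].Value; // "USD"
    }
 }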

You can also install it via NuGet =)

If you like this solution, you can read a little bit more about XPath:

http://www.w3schools.com/XPath/xpath_syntax.asp

If you still want to use Regex, you can check this site:

http://www.jslab.dk/tools.regex.php

briba
  • Hi, sounds very interesting. Wasn't aware of this pack. That's exactly what I need, just the text between the option tags. Thanks! – Quoter May 25 '14 at 08:16
  • You're welcome @Quoter. Please let me know if you need some help =) – briba May 25 '14 at 14:47

Are you talking about something like this regex (the quotes are doubled because it is written for a C# verbatim string)?

<option value=""[A-Z]{3}""[^<]*</option>

Here is a full C# program; see the output at the bottom of the live C# demo.

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        string s1 = @"<option value=""USD"">American Samoa, United States Dollar (USD)</option>
<option value=""EUR"">Andorra, Euro (EUR)</option>
<option value=""AOA"">Angola, Kwanza (AOA)</option>
<option value=""XCD"">Anguilla, East Caribbean Dollar (XCD)</option>
<option value=""XCD"">Antigua and Barbuda, East Caribbean Dollar (XCD)</option>
<option value=""ARS"">Argentina, Peso (ARS)</option>";
        var myRegex = new Regex(@"<option value=""[A-Z]{3}""[^<]*</option>");
        // Matches returns every occurrence; Match would return only the first one.
        MatchCollection AllMatches = myRegex.Matches(s1);

        Console.WriteLine("\n" + "*** Matches ***");
        if (AllMatches.Count > 0)
        {
            foreach (Match SomeMatch in AllMatches)
            {
                Console.WriteLine("Overall Match: " + SomeMatch.Value);
            }
        }

        Console.WriteLine("\nPress Any Key to Exit.");
        Console.ReadKey();
    } // END Main
} // END Program
zx81
  • No, I need everything in the option tags. Crasher came up with a nice solution that should work. Thanks for your help. – Quoter May 25 '14 at 08:17
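
For the requirement stated in the last comment (the text between the option tags rather than the whole element), the regex approach can be extended with named groups. The following is only a minimal sketch, assuming the markup always has the simple shape of the sample in the question; the names CurrencyOptions, Parse, code and text are hypothetical and do not come from either answer. It also allows other attributes, such as selected="selected", to appear in the tag without being required.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of either answer.
static class CurrencyOptions
{
    // "code" captures the value attribute, "text" captures what sits between the tags.
    // [^>]* tolerates extra attributes (for example selected="selected") without requiring them.
    static readonly Regex OptionRegex = new Regex(
        @"<option\b[^>]*\bvalue=""(?<code>[A-Z]{3})""[^>]*>(?<text>[^<]*)</option>");

    public static List<KeyValuePair<string, string>> Parse(string html)
    {
        var result = new List<KeyValuePair<string, string>>();
        // Regex.Matches walks through every match in the input.
        foreach (Match m in OptionRegex.Matches(html))
        {
            result.Add(new KeyValuePair<string, string>(
                m.Groups["code"].Value,
                m.Groups["text"].Value.Trim()));
        }
        return result;
    }
}

class Demo
{
    static void Main()
    {
        string html = @"<option value=""USD"">American Samoa, United States Dollar (USD)</option>
<option value=""EUR"">Andorra, Euro (EUR)</option>";

        // Print each currency code with its display text.
        foreach (var pair in CurrencyOptions.Parse(html))
        {
            Console.WriteLine(pair.Key + " = " + pair.Value);
        }
    }
}
```

Codes such as XCD appear more than once in the full list, so a list of pairs is used here instead of a dictionary keyed by code. For markup that is less regular than the sample, the HTML Agility Pack approach from the first answer is likely the safer choice.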