0

Hi, I need to extract ONE letter from a string.

The string I have is a big block of HTML, but the part I need to search in is this text:

Vahvistustunnus&nbsp;M&nbsp;:

And I need to get the M inside the nbsp's

So, who is the quickest regex-guru out there? :)

Bergi
Jonas Cannehag
  • Some questions... Is it always an `M` or can it be any character? What are you wanting to do with it? Find it? Replace it? Validate that it exists? What flavor of regex (PCRE, POSIX, etc...)? And in what context do you plan on using the regex (C#, Java, Vim, Notepad++, etc...)? – Robbie Apr 16 '12 at 14:39
  • It can be any character; I want to extract it and use that character later on. The flavor of regex is unknown, since it will be used in UI tests with molybdenum and I'm not sure how they do the match (https://www.molyb.org/confluence/display/molyb/Home) – Jonas Cannehag Apr 16 '12 at 14:52
  • Ok, thanks.. one last question... is the match you are looking for always preceded by the literal `Vahvistustunnus`? – Robbie Apr 16 '12 at 14:59

2 Answers

1

OK, according to this page in the molybdenum API docs, the result will be all of the groups concatenated together. Given that you just want the character between the two `&nbsp;`'s, it's not good enough to match the whole thing and then pull out the group. Instead you'll need to do something like this:

(?<=Vahvistustunnus&nbsp;)[a-zA-Z](?=&nbsp;)

Warning: this might not work for you, because lookbehinds `(?<=pattern)` are not available in all regex flavors. Specifically, I think that because molybdenum is a Firefox extension, it's likely using the ECMA (JavaScript) regex flavor, and ECMA doesn't support lookbehinds.

If that's the case, then I'm going to have to ask someone else to answer your question, as my (amateur) regex ninja skills don't go much further than that. If you were using the regex in JavaScript code, there would be ways around this limitation, but based on your description, it sounds like you have to solve this problem with nothing but a raw regex?
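For the plain-JavaScript case mentioned above, one workaround can be sketched roughly like this: match the surrounding text as well, but read only the capture group. `extractCode` and `supportsLookbehind` are made-up helper names for illustration, not anything molybdenum provides, and the sketch assumes the HTML contains the literal `&nbsp;` entity.

```javascript
// Hypothetical helper, not part of molybdenum: returns the single letter
// between the two &nbsp; entities, or null if nothing matches.
function extractCode(html) {
  // No lookbehind needed: match the whole phrase, keep only group 1.
  var match = /Vahvistustunnus&nbsp;([a-zA-Z])&nbsp;/.exec(html);
  return match ? match[1] : null;
}

// Reports whether this engine accepts lookbehind syntax at all.
function supportsLookbehind() {
  try {
    new RegExp("(?<=a)b");
    return true;
  } catch (e) {
    return false; // engines without lookbehind throw a SyntaxError here
  }
}

console.log(extractCode("<td>Vahvistustunnus&nbsp;M&nbsp;:</td>"));
```

This only helps when you control the code that runs the regex; it does not get around a tool that concatenates all groups or hands back the whole match.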

Robbie
  • I'm getting an "Unexpected Exception: invalid quantifier ?<=Vahvistustunnus&nbsp;)[a-zA-Z](?=&nbsp;)" on that one. Works perfectly in C#, though – Jonas Cannehag Apr 16 '12 at 15:47
  • chrome://molybdenum/content/js/extensions.js, lineNumber -> 579, stack -> RegExp("(?<=Vahvistustunnus&nbsp;)[a-zA-Z](?=&nbsp;)")@:0 ("(?<=Vahvistustunnus&nbsp;)[a-zA-Z](?=&nbsp;)","checkval")@chrome://molybdenum/content/js/extensions.js:579 ("(?<=Vahvistustunnus&nbsp;)[a-zA-Z](?=&nbsp;)","checkval")@chrome://molybdenum/content/selenium/htmlutils.js:60 – Jonas Cannehag Apr 16 '12 at 15:49
  • Yeah, it must be using ECMA then, I guess. I'll have a think about it, but for the time being I'm stuck. – Robbie Apr 16 '12 at 15:49
1

Looks like it uses JavaScript and if so

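var str = "Vahvistustunnus&nbsp;M&nbsp;:";
// Capture group 1 holds the letter between the two &nbsp; entities.
var patt = /Vahvistustunnus&nbsp;([A-Z])&nbsp;:/;
var match = str.match(patt);
// match is null when the text is not found, so guard before indexing.
var result = match ? match[1] : null; // "M"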

should work.
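If the letter can be any character, and the HTML might contain either the literal `&nbsp;` entity or an already-decoded non-breaking space (U+00A0), a slightly more tolerant variant might look like the sketch below. Both of those conditions are assumptions about the page, and `extractBetweenNbsp` is a made-up name.

```javascript
// Hypothetical variant: accepts "&nbsp;" or the decoded U+00A0 character
// on either side, and captures any single character between them.
var NBSP = "(?:&nbsp;|\\u00A0)";
var patt = new RegExp("Vahvistustunnus" + NBSP + "(.)" + NBSP + ":");

function extractBetweenNbsp(html) {
  var match = patt.exec(html);
  return match ? match[1] : null; // null when the label is not found
}

console.log(extractBetweenNbsp("Vahvistustunnus&nbsp;M&nbsp;:"));
console.log(extractBetweenNbsp("Vahvistustunnus\u00A07\u00A0:"));
```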

Jonas Elfström
  • The regex works fine in JavaScript, but molybdenum refuses to extract the value into a variable, I'm afraid. I should probably look at another testing framework :) Thanks anyway, mate! – Jonas Cannehag Apr 17 '12 at 05:55