5

I want to write a URL-extracting function in Objective-C. The input text can be anything and may or may not contain HTML anchor tags.

Consider this:

NSString* input1 = @"This is cool site <a   href=\"https://abc.com/coolstuff\"> Have fun exploring </a>";
NSString* input2 = @"This is cool site <a target=\"_blank\" href=\"https://abc.com/coolstuff\"> Must visit </a>";
NSString* input3 = @"This is cool site <a href=\"https://abc.com/coolstuff\" target=\"_blank\" > Try now </a>";

In every case I want the modified string to be "This is cool site https://abc.com/coolstuff".

All text between the anchor tags should be dropped, and the pattern needs to tolerate other attributes in the anchor tag, such as target.

I can do something like

NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"<a\\s+href=\"(.*?)\">.*?</a>" options:NSRegularExpressionCaseInsensitive error:nil];
NSString* modifiedString = [regex stringByReplacingMatchesInString:inputString options:0 range:NSMakeRange(0, [inputString length]) withTemplate:@"$1"];

This works fine with input1 but fails in the other cases.

Thanks

  • possible duplicate of [Objective-C - Finding a URL within a string](http://stackoverflow.com/questions/5998969/objective-c-finding-a-url-within-a-string) – vokilam Feb 13 '14 at 19:51

3 Answers

10

Try this one:

<a[^>]+href=\"(.*?)\"[^>]*>.*?</a>
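
As a rough sketch of how this pattern could be dropped into the replacement code from the question: the helper name StringByReplacingAnchorsWithURLs is hypothetical and not part of the original posts, and the pattern is shown already escaped for an Objective-C string literal. Note that "." does not match line breaks by default, so link text spanning several lines would presumably need NSRegularExpressionDotMatchesLineSeparators added to the options.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (name is an assumption, not from the original posts):
// replaces every anchor element in `input` with the value of its href attribute.
static NSString *StringByReplacingAnchorsWithURLs(NSString *input) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"<a[^>]+href=\"(.*?)\"[^>]*>.*?</a>"
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        // The pattern failed to compile; return the input unchanged.
        NSLog(@"Invalid pattern: %@", error);
        return input;
    }
    // $1 refers to the first capture group, i.e. the href value.
    return [regex stringByReplacingMatchesInString:input
                                           options:0
                                             range:NSMakeRange(0, input.length)
                                      withTemplate:@"$1"];
}
```

Because [^>]+ and [^>]* absorb any attributes before and after href, the three inputs from the question should all reduce to the same string.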
Sabuj Hassan
5

Or try this one:

<a.+?href="([^"]+)

EXPLAINED

<a - match opening tag

.+? - match anything lazily

href=" - match href attribute

([^"]+) - capture href value

OUTPUT

https://abc.com/coolstuff
https://abc.com/coolstuff
https://abc.com/coolstuff
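
One thing to keep in mind: this pattern stops right after the href value, so using it with stringByReplacingMatchesInString and a $1 template would leave the rest of the tag and the link text in place. It seems better suited to pulling the URLs out. A minimal sketch of that, assuming a hypothetical helper named HrefValuesInString (not from the original posts):

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (name is an assumption): collects the href value of
// every anchor-like tag found in `input`, in order of appearance.
static NSArray<NSString *> *HrefValuesInString(NSString *input) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"<a.+?href=\"([^\"]+)"
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:NULL];
    NSMutableArray<NSString *> *urls = [NSMutableArray array];
    if (regex == nil) {
        return urls; // pattern failed to compile; report no URLs
    }
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:input options:0 range:NSMakeRange(0, input.length)];
    for (NSTextCheckingResult *match in matches) {
        // Capture group 1 holds the text between the quotes after href=.
        NSRange hrefRange = [match rangeAtIndex:1];
        if (hrefRange.location != NSNotFound) {
            [urls addObject:[input substringWithRange:hrefRange]];
        }
    }
    return urls;
}
```

Since the pattern begins with a bare "<a", it can also start matching at other tags whose names begin with "a", so treat this as a starting point rather than a robust HTML parser.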
gwillie
0
<[aA].+?href[ ]*=[ ]*[\\]?"(.*?)[\\]?".*?>(.+?)<\/[aA]>

Here, the first group ($1) captures the URL and the second group ($2) captures the link text.
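
To illustrate reading both groups, here is a hedged sketch that uses a lazy variant of the pattern (lazy quantifiers, with the backslash before each quote optional), escaped for an Objective-C string literal. The helper name FirstLinkInString and the dictionary keys "url" and "text" are assumptions made for this example only.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (name and keys are assumptions): returns the URL and
// the trimmed link text of the first anchor in `input`, or nil if none matches.
static NSDictionary<NSString *, NSString *> *FirstLinkInString(NSString *input) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:
             @"<[aA].+?href[ ]*=[ ]*[\\\\]?\"(.*?)[\\\\]?\".*?>(.+?)</[aA]>"
                                                  options:0
                                                    error:NULL];
    NSTextCheckingResult *match =
        [regex firstMatchInString:input options:0 range:NSMakeRange(0, input.length)];
    if (match == nil) {
        return nil; // no anchor found (or the pattern failed to compile)
    }
    // Group 1 is the href value, group 2 is the text between the tags.
    NSString *url = [input substringWithRange:[match rangeAtIndex:1]];
    NSString *text = [[input substringWithRange:[match rangeAtIndex:2]]
        stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];
    return @{ @"url" : url, @"text" : text };
}
```

The lazy quantifiers matter here: with a greedy (.*) the first group can run on to the last quote inside the tag, which would pull attributes such as target into the captured URL.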

Shyam Bhat