I'm using this great plugin: https://github.com/maranomynet/linkify/blob/master/1.0/jquery.linkify-1.0.js

to turn URLs in text into links by manipulating the DOM. The problem is with links like this: http://en.wikipedia.org/wiki/The_Godfather_(novel)

The generated link is "http://en.wikipedia.org/wiki/The_Godfather_(novel", without the closing parenthesis.

What could I change in the linkify code to handle parentheses, etc.?

Thanks!

PS: Hey, it seems that Stack Overflow could use this too! haha ;)

EDIT:

I just saw the post on Daring Fireball, and it's working great. The problem now is with simple URLs like www.google.com (I think it has to do with the first regex, "noProtocolUrl"). This is what I've got right now:

var noProtocolUrl = /(^|["'(\s]|<)(www\..+?\..+?)((?:[:?]|\.+)?(?:\s|$)|>|[)"',])/g,
    httpOrMailtoUrl = /\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:(?:[^\s()<>.]+[.]?)+|\((?:[^\s()<>]+|(?:\([^\s()<>]+\)))*\))+(?:\((?:[^\s()<>]+|(?:\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))/gi,
    linkifier = function (html) {
        return FormatLink(html
            .replace(noProtocolUrl, '$1<a href="<``>://$2" rel="nofollow external" class="external_link">$2</a>$3')  // NOTE: we escape `"http` as `"<``>` to make sure `httpOrMailtoUrl` below doesn't find it as a false-positive
            .replace(httpOrMailtoUrl, '<a href="$1" rel="nofollow external" class="external_link">$1</a>')
            .replace(/"<``>/g, '"http'));  // reinsert `"http`
    },

With "www.facebook.com" I get this (with the rel and class attributes showing up as plain text next to the link):

www.facebook.com" rel="nofollow external" class="external_link">www.facebook.com
Santiago
  • What about links **in** parentheses? (http://example.com) – Kobi May 02 '11 at 17:01
  • Kobi, I really didn't think of that... I think it couldn't be done then... Hmm... – Santiago May 02 '11 at 17:13
  • Kobi's comment is presumably the reason it works the way it does. However, it seems to me that it should be possible to implement logic like: If the URL has a `(` in it, and doesn't have a `)` after that, then assume the `)` at the end is part of the URL. Otherwise, assume that it's not. Or if you want to get really fancy, you could make an AJAX `HEAD` call to the URL without the paren. If you get a 404, linkify it with the paren. There are probably some good reasons not to do that, but it might be a fun experiment. – Tyler May 06 '11 at 05:22
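
The heuristic Tyler describes can be approximated by counting parentheses. The following is only a sketch under that assumption: `trimTrailingParen` is a hypothetical helper, not part of the linkify plugin, and it treats a trailing `)` as part of the URL only when an unmatched `(` appears earlier in the URL.

```javascript
// Hypothetical helper (not part of jquery.linkify): decide whether a
// trailing ")" belongs to the URL or to the surrounding text.
function trimTrailingParen(url) {
  if (url.charAt(url.length - 1) !== ')') {
    return url; // nothing to decide
  }
  var opens = (url.match(/\(/g) || []).length;
  var closes = (url.match(/\)/g) || []).length;
  // More ")" than "(" means the last ")" closes a parenthesis that was
  // opened outside the URL, as in "(http://example.com)".
  return closes > opens ? url.slice(0, -1) : url;
}
```

With this rule, a candidate match of `http://example.com)` loses its final parenthesis, while `http://en.wikipedia.org/wiki/The_Godfather_(novel)` is left intact.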

1 Answer

From what I've found, the regular expression found here (originally created by John Gruber of Daring Fireball, modified by naren1012) seems to do the trick.

To implement, replace this code:

 var noProtocolUrl = /(^|["'(\s]|&lt;)(www\..+?\..+?)((?:[:?]|\.+)?(?:\s|$)|&gt;|[)"',])/g,
      httpOrMailtoUrl = /(^|["'(\s]|&lt;)((?:(?:https?|ftp):\/\/|mailto:).+?)((?:[:?]|\.+)?(?:\s|$)|&gt;|[)"',])/g,
      linkifier = function ( html ) {
          return html
                      .replace( noProtocolUrl, '$1<a href="<``>://$2">$2</a>$3' )  // NOTE: we escape `"http` as `"<``>` to make sure `httpOrMailtoUrl` below doesn't find it as a false-positive
                      .replace( httpOrMailtoUrl, '$1<a href="$2">$2</a>$3' )
                      .replace( /"<``>/g, '"http' );  // reinsert `"http`
        },

With this code:

 var noProtocolUrl = /(^|["'(\s]|&lt;)(www\..+?\..+?)((?:[:?]|\.+)?(?:\s|$)|&gt;|[)"',])/g,
  httpOrMailtoUrl = /\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:(?:[^\s()<>.]+[.]?)+|\((?:[^\s()<>]+|(?:\([^\s()<>]+\)))*\))+(?:\((?:[^\s()<>]+|(?:\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))/gi,
      linkifier = function ( html ) {
          return html
                      .replace( noProtocolUrl, '$1<a href="<``>://$2">$2</a>$3' )  // NOTE: we escape `"http` as `"<``>` to make sure `httpOrMailtoUrl` below doesn't find it as a false-positive
                      .replace(httpOrMailtoUrl, '<a href="$1">$1</a>')
                      .replace( /"<``>/g, '"http' );  // reinsert `"http`
        },
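
Regarding the `www.` problem described in the question's edit: the likely cause is that the Gruber-style pattern also matches bare `www.` hosts, so it finds `www.facebook.com` again inside the anchor that the `noProtocolUrl` pass has just produced and wraps it a second time. One possible way around this, sketched below, is to drop the `noProtocolUrl` pass and do a single replace with a callback that prepends `http://` when the match has no scheme. `linkifyText` is a hypothetical name, the sketch assumes the input is plain (or already HTML-escaped) text with no existing anchors, and the quote characters are written as `\u` escapes so the pattern survives encoding problems.

```javascript
// Sketch only: single-pass linkifier based on the pattern from the answer.
// \u00ab \u00bb \u201c \u201d \u2018 \u2019 are « » “ ” ‘ ’.
var urlPattern = /\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:(?:[^\s()<>.]+[.]?)+|\((?:[^\s()<>]+|(?:\([^\s()<>]+\)))*\))+(?:\((?:[^\s()<>]+|(?:\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?\u00ab\u00bb\u201c\u201d\u2018\u2019]))/gi;

// Hypothetical helper: wraps every URL-like match in an anchor.
function linkifyText(text) {
  return text.replace(urlPattern, function (url) {
    // Prepend a scheme only when the match does not already start with one.
    var href = /^[a-z][\w-]+:/i.test(url) ? url : 'http://' + url;
    return '<a href="' + href + '" rel="nofollow external" class="external_link">' + url + '</a>';
  });
}

console.log(linkifyText('Visit www.facebook.com today'));
console.log(linkifyText('http://en.wikipedia.org/wiki/The_Godfather_(novel)'));
```

Because each URL is matched exactly once, there is no generated markup for a second pattern to trip over, and the `"<``>` escaping trick is no longer needed.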
Brad Butner
  • Brad, this is not working but I think it could be because the last characters on the regex are displaying wrong: «»â��â��â��â�� – Santiago May 10 '11 at 04:10
  • Thanks Brad, I just saw the original post on Daring Fireball and it's working! I just have a little problem with "www.simple.com" links now, so I edited the question... – Santiago May 10 '11 at 16:19