I am building a minimal text-to-markup converter, very similar to Textile. Bold, italic, and strikethrough (marked with *, _ and - respectively) convert correctly, but I also use the following expression to turn HTTP URLs into links automatically:

/([^\(])(https?:\/\/([-\w\.]+)+(:\d+)?(\/([\w\/_\.\-]*(\?\S+)?)?)?)/

The problem is that if a URL contains, for example, a dash, the strikethrough expression (/\-([^\*]+?)\-/) also matches inside it, so the link is rewritten as follows:

site.com/path-with-dashes to site.com/path<del>with</del>dashes

What is the best way to make both conversions work together? I assume that changing the strikethrough expression to require a space, or the start of a line, before the opening dash would work, but I can't manage to write this as a single expression.

Rhys
  • Regexes do not really look suited to that job. You have the advantage that a URI, and that includes a URL, never contains a space: use this to split your text "word by word", where a word is anything that does not contain spaces, and if a word successfully parses as an absolute URI, make a link out of it. This way the rest of your process does not change. – fge Dec 20 '12 at 22:32
  • Great, that is a good point. Thanks. – Rhys Dec 20 '12 at 22:35
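
The word-by-word idea from the comment above could be sketched roughly as follows. This is only an illustration in Python (the question does not say which language the converter is written in), and the names `is_absolute_http_url`, `apply_inline_markup` and `convert` are hypothetical. `apply_inline_markup` stands in for the existing bold/italic/strikethrough pass and shows only the strikethrough expression from the question. URL words are turned into links directly, and the inline markup runs only on the text between them, so dashes inside a URL are never seen by the strikethrough expression.

```python
import re
from html import escape
from urllib.parse import urlparse

# The strikethrough pattern from the question, unchanged.
STRIKE = re.compile(r"\-([^\*]+?)\-")


def is_absolute_http_url(word: str) -> bool:
    """Return True if the word parses as an absolute http(s) URL with a host."""
    try:
        parts = urlparse(word)
    except ValueError:  # urlparse rejects some malformed hosts
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def apply_inline_markup(text: str) -> str:
    """Hypothetical stand-in for the existing markup pass (strikethrough only)."""
    return STRIKE.sub(r"<del>\1</del>", text)


def convert(text: str) -> str:
    """Link URL words; run the inline markup only on the text between them."""
    out = []
    pending = []  # non-URL pieces waiting for the inline markup pass
    # The capturing group keeps the whitespace runs, so spacing is preserved.
    for piece in re.split(r"(\s+)", text):
        if is_absolute_http_url(piece):
            # Flush the text collected so far, then emit the link untouched.
            out.append(apply_inline_markup("".join(pending)))
            pending = []
            url = escape(piece, quote=True)
            out.append(f'<a href="{url}">{url}</a>')
        else:
            pending.append(piece)
    out.append(apply_inline_markup("".join(pending)))
    return "".join(out)


print(convert("see http://site.com/path-with-dashes and -old text- here"))
```

Because non-URL pieces are joined back together before the markup pass, a strikethrough that spans several words still works. One known limitation of this sketch: punctuation directly attached to a URL, such as a trailing full stop, ends up inside the link.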
