I'm trying to linkify hashtags using a regex. Most cases work, except when a hashtag is followed by a dot, as in "#hot.": that should linkify only #hot, while #hot.hot should still be treated as a valid hashtag.

Here is my regex code:

var text = "#hot#hot hot #hot #hot.hot #hót #hot_hot #hot, (#hot) #hot. hot";
text.replace(/#([^\b#,() ]*)/g, '<a href="/$1">#$1</a>');

Output:

<a href="/hot">#hot</a><a href="/hot">#hot</a> hot <a href="/hot">#hot</a> <a href="/hot.hot">#hot.hot</a> <a href="/hót">#hót</a> <a href="/hot_hot">#hot_hot</a> <a href="/hot">#hot</a>, (<a href="/hot">#hot</a>) <a href="/hot.">#hot.</a> hot

The only issue is that "#hot." should linkify only #hot, while #hot.hot should remain valid.

krisrak
  • I am unsure if the "\b" inside the character class actually does anything. There is no character that matches "\b", so all characters would be included in "[^\b]". At least it does not handle things like exclamation marks, if that was the intention. – Jens Sep 03 '14 at 15:01
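
A quick check of the point above, as a minimal sketch: inside a character class JavaScript reads `\b` as the backspace character (U+0008) rather than as a word boundary, so in practice `[^\b...]` excludes nothing that appears in ordinary text. The variable names below are illustrative only.

```javascript
// Inside a character class, \b means the backspace character (U+0008).
const backspaceClass = /[\b]/;

const matchesBackspace = backspaceClass.test("\u0008"); // string is a single backspace
const matchesPlainText = backspaceClass.test("hot hot"); // no backspace in this string

// Outside a character class, \b is a zero-width word-boundary assertion.
const boundaryMatches = "hot. hot".match(/hot\b/g);

console.log(matchesBackspace, matchesPlainText, boundaryMatches);
```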

2 Answers

Your regex is fine, but you have to add a word boundary at the end:

#([^\b#,() ]*)\b
              ^-------- Here
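
To see the effect on the sample string from the question, here is a small sketch that applies the pattern with the trailing `\b`. The variable names (`hashtag`, `linked`, `tags`) are made up for illustration, and `matchAll` assumes a reasonably recent JavaScript engine.

```javascript
const text = "#hot#hot hot #hot #hot.hot #hót #hot_hot #hot, (#hot) #hot. hot";

// Pattern from the question plus a trailing \b: the greedy group backtracks
// until it ends right after a word character, which drops a trailing dot.
const hashtag = /#([^\b#,() ]*)\b/g;

// Wrap every hashtag in a link, as in the question.
const linked = text.replace(hashtag, '<a href="/$1">#$1</a>');
console.log(linked);

// Collect just the captured tag names to inspect what was matched.
const tags = Array.from(text.matchAll(hashtag), (m) => m[1]);
console.log(tags);
```

With this input the last hashtag is captured as `hot` rather than `hot.`, while `hot.hot` is kept whole. One caveat: `\b` in JavaScript is defined in terms of ASCII word characters, so a tag that ends in an accented letter (for example `#café`) would be cut at its last ASCII letter.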

Federico Piazza
  • Thanks, `\b` did the trick. If I use this I also don't need `,()`; I can just use `/#([^\b# ]*)\b/g` – krisrak Sep 03 '14 at 15:37

Give this regex a try instead:

/#([^\W]+)/g

\w matches only ASCII letters, digits, and underscores, and \W is its opposite. Putting \W in a negated character class ([^\W]) therefore gives you \w again, so the match stops at the first character outside that set. That links only #hot in "#hot." and "#hot,", but it also cuts #hot.hot down to #hot and, because JavaScript's \w does not cover accented letters, #hót down to #h.
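
If accented letters and inner dots both need to survive, one possible alternative (not the pattern from this answer, just a sketch) is to use Unicode property escapes with the `u` flag and allow a dot only when more tag characters follow it. The name `unicodeHashtag` is made up for illustration, and the pattern assumes an engine that supports `\p{...}` (ES2018) and `matchAll`.

```javascript
const text = "#hot#hot hot #hot #hot.hot #hót #hot_hot #hot, (#hot) #hot. hot";

// [^\W] is the same as \w: ASCII letters, digits, and underscore only.
const asciiTags = Array.from(text.matchAll(/#([^\W]+)/g), (m) => m[1]);

// Assumed alternative: Unicode letters, marks, digits, and underscore,
// with a dot allowed only between two runs of such characters.
const unicodeHashtag = /#([\p{L}\p{M}\p{N}_]+(?:\.[\p{L}\p{M}\p{N}_]+)*)/gu;
const unicodeTags = Array.from(text.matchAll(unicodeHashtag), (m) => m[1]);

console.log(asciiTags);
console.log(unicodeTags);
```

Listing the allowed characters explicitly, instead of excluding a few separators, means punctuation such as `!` or `?` after a tag is left out of the link as well.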

theftprevention