-1

I'm having trouble extracting text from an HTML tag using regex.

I want to extract the value of the title attribute from the following HTML:

<a href="http://google.com/" target="_self" title="TEXTDATA" class="encyclopedia">Google</a>

The expected result:

TEXTDATA

I want to extract only the text TEXTDATA.

I have tried, but without success.

Emma
elevaku

3 Answers

1

Here we match everything up to a left boundary (title="), capture the data we want, and then, if we like, match the rest of the string so the whole input can be replaced by the captured group:

<.+title="(.+?)"(.*)


const regex = /<.+title="(.+?)"(.*)/gm;
const str = `<a href="http://google.com/" target="_self" title="TEXTDATA" class="encyclopedia">Google</a>`;
const subst = `$1`;

// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);

console.log('Substitution result: ', result);
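If only the captured value is needed, and not a rewritten copy of the whole string, the same idea can be sketched with `String.prototype.match` and the first capture group. This is a minimal sketch assuming the input contains a single tag with one title attribute; the variable names are illustrative only.

```javascript
// Same left boundary as above, without the trailing (.*) group,
// since nothing is being replaced here.
const pattern = /<.+title="(.+?)"/;
const html = `<a href="http://google.com/" target="_self" title="TEXTDATA" class="encyclopedia">Google</a>`;

// match() returns null when nothing matches, so guard before indexing.
const match = html.match(pattern);
const title = match ? match[1] : null;

console.log(title);
```

Because `.+` is greedy, an input with several title attributes on one line would yield only the last one with this pattern.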

RegEx

If this expression isn't quite what you need, it can be modified and tested on regex101.com.

RegEx Circuit

jex.im can also help to visualize the expression.


PHP

$re = '/<.+title="(.+?)"(.*)/m';
$str = '<a href="http://google.com/" target="_self" title="TEXTDATA" class="encyclopedia">Google</a>';
$subst = '$1';

$result = preg_replace($re, $subst, $str);

echo $result;
Community
Emma
0

Use this regex:

title=\"([^\"]*)\"

See: Regex
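As a rough illustration of how this pattern might be applied, here is a short JavaScript sketch. The helper name `extractTitles` and the sample input are assumptions made for the example, not part of the answer. In a regex literal the quotes do not need backslash escapes, so the pattern is written as `title="([^"]*)"`.

```javascript
// Hypothetical helper: collects the value of every title="..." attribute.
function extractTitles(input) {
  // [^"]* stops at the first closing quote, so each match stays inside one attribute.
  const pattern = /title="([^"]*)"/g;
  // matchAll requires the g flag; m[1] is the first capture group of each match.
  return Array.from(input.matchAll(pattern), (m) => m[1]);
}

const sample = `<a href="http://google.com/" target="_self" title="TEXTDATA" class="encyclopedia">Google</a>`;
const titles = extractTitles(sample);

console.log(titles[0]);
```

Note that this pattern would also match attributes whose names merely end in `title` (for example `data-title`). For anything beyond simple, known input, an HTML parser is usually the safer choice.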

Hamed Ghasempour
-1

<a href="http://google.com/" target="_self" class="encyclopedia">Google</a>

Remove the title attribute and try again.

hio