
I want to convert my html strings which contain <p style=“text-align:center; others-style:value;”>Content</p> to <center>Content</center>, and <p style=“text-align:right; others-style:value;”>Content</p> to <right>Content</right> etc.

My previous question received an answer that achieves this perfectly. The regex is:

$RegEx = '/<(.*)(text-align:)(.*)(center|left|right|justify|inherit|none)(.*)(\"|\”|\'|\’)>(.*)(<\/.*)/s';
$string = preg_replace($RegEx, '<$4>$7</$4>', $string);

However, my strings may contain more than one occurrence of text-align. For example, I might have <div style=‘text-align:left; others-style:value;’ class=‘any class’>Any Content That You Wish</div><p style=“text-align:center; others-style:value;”>Content</p>. I want it to become <left>Any Content That You Wish</left><center>Content</center>, but the regex outputs only <center>Content</center>.

How can I get what I want in PHP? Many thanks.

user2335065
  • Don't parse html using regex. https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – Pushpesh Kumar Rajwanshi Apr 18 '19 at 07:23
  • @PushpeshKumarRajwanshi, I am converting html to non-html. What are the better ways? – user2335065 Apr 18 '19 at 07:29
  • It is recommended to use HTML parser for manipulating html text and transform to your desired structure. Have a look at this answer https://stackoverflow.com/a/18349154/2102956. – Pushpesh Kumar Rajwanshi Apr 18 '19 at 07:33
  • @PushpeshKumarRajwanshi How can I get text-align values inside the style tag in html parser? – user2335065 Apr 18 '19 at 07:36
  • If there are no nested tags in your case, you can use this regex `<(?:(p|div)).*?\bstyle\s*=\s*[“"'‘]?text-align:([a-zA-Z]+).*?>(.*?)<\/\1>` and replace with `<$2>$3</$2>` [Demo](https://regex101.com/r/PewaCi/1) – Pushpesh Kumar Rajwanshi Apr 18 '19 at 07:39
  • @PushpeshKumarRajwanshi thank you, but how can I use it in a PHP script? When I try it, it says "Parse error: syntax error, unexpected '‘' (T_STRING) in ...". It probably mistakenly thinks that some of the tags are closing tags. – user2335065 Apr 18 '19 at 07:49

1 Answer


As I wrote in the comments, you should not use regex to manipulate HTML content. If your HTML contains only non-nested tags, though, a regex can do the job.
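For nested or less regular markup, a parser-based version is usually more robust. The following is only a minimal sketch using PHP's built-in DOMDocument and DOMXPath: the helper name `alignTagsWithDom` is hypothetical, and it assumes the attribute values use plain ASCII quotes and the content is plain ASCII (the typographic quotes shown in the question would not be read as attribute delimiters by an HTML parser).

```php
<?php
// Sketch only: assumes ASCII-quoted attributes and ASCII content.
function alignTagsWithDom(string $html): string   // hypothetical helper name
{
    libxml_use_internal_errors(true);              // suppress warnings on imperfect markup
    $doc = new DOMDocument();
    $doc->loadHTML('<body>' . $html . '</body>');
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Snapshot the matches first, because the loop below modifies the tree.
    $nodes = iterator_to_array($xpath->query('//p[@style] | //div[@style]'));

    foreach ($nodes as $node) {
        if (!preg_match('/text-align\s*:\s*([a-z]+)/i', $node->getAttribute('style'), $m)) {
            continue;                              // styled, but no text-align: leave untouched
        }
        $new = $doc->createElement(strtolower($m[1]));
        while ($node->firstChild) {                // move the children, keeping nested markup
            $new->appendChild($node->firstChild);
        }
        $node->parentNode->replaceChild($new, $node);
    }

    // Serialize only the children of <body>, not the wrapper the parser added.
    $out = '';
    foreach ($doc->getElementsByTagName('body')->item(0)->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo alignTagsWithDom('<p>First Paragraph</p><p style="text-align:center; color:red;">Content</p>');
```

Elements without a text-align style are left as they are. The serializer may insert line breaks between block elements, so the exact whitespace of the result can differ from the input.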

For this particular case, you can use this regex,

<(p|div).*?\bstyle\s*=\s*.*?text-align:([a-zA-Z]+).*?>(.*?)</\1>

And replace it with <\2>\3</\2>

Regex Demo

PHP code demo

$html = '<p style=“text-align:center; others-style:value;”>Content</p>
<div style=‘text-align:left; others-style:value;’ class=‘any class’>Any Content That You Wish</div><p style=“text-align:center; others-style:value;”>Content</p>';

// Single-quoted pattern so PHP passes \b, \s and \1 to the regex engine unchanged.
$newhtml = preg_replace('~<(p|div).*?\bstyle\s*=\s*.*?text-align:([a-zA-Z]+).*?>(.*?)</\1>~', '<\2>\3</\2>', $html);
echo $newhtml;

Prints,

<center>Content</center>
<left>Any Content That You Wish</left><center>Content</center>
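
One caveat with the pattern above: the lazy `.*?` before `style` can run past the end of an opening tag, so an unstyled `<p>` that sits on the same line before a styled one is swallowed into the match and lost. A possible tightening, sketched here under the assumption that attribute values never contain `>`, replaces those `.*?` with `[^>]*?` so the match stays inside a single opening tag. The sample input is made up for illustration, and the added `s` flag (not in the original pattern) lets the tag content span several lines.

```php
<?php
// Hypothetical sample: the first paragraph has no text-align and must survive.
$html = '<p>First Paragraph</p><p style="text-align:center; others-style:value;">Content</p>'
      . '<div style="text-align:left;" class="any class">Any Content That You Wish</div>';

// [^>]*? cannot cross a ">", so "style" must belong to the tag that was just opened.
$pattern = '~<(p|div)\b[^>]*?\bstyle\s*=[^>]*?text-align\s*:\s*([a-zA-Z]+)[^>]*>(.*?)</\1>~s';

$newhtml = preg_replace($pattern, '<$2>$3</$2>', $html);
echo $newhtml;
```

For this input it prints `<p>First Paragraph</p><center>Content</center><left>Any Content That You Wish</left>`. The non-nested restriction still applies.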
Pushpesh Kumar Rajwanshi
  • Thank you! But now I encounter an issue: if my html content contains some paragraphs with no text-align at the very beginning, they go missing after running the code. Example: `$html = '<p>First Paragraph</p><p style=“text-align:center; others-style:value;”>Content</p><div style=‘text-align:left; others-style:value;’ class=‘any class’>Any Content That You Wish</div><p style=“text-align:center; others-style:value;”>Content</p>';` The `<p>First Paragraph</p>` would disappear while I still want it to be retained. Thank you! – user2335065 Apr 28 '19 at 07:54