
All these functions, like htmlspecialchars(), htmlentities(), html_entity_decode(), urldecode() etc., are a mess. So I decided to build my own function that does exactly what I need by combining the functionality of each. The problem is the lines that I have commented out in the array. I want to encode the chars "& # ;", but if I do so, my function will also re-encode the chars it has already encoded. Is there a way to encode these chars only when they are not part of an encoded element from the table? (The table could grow in the future.) This is my function:

function my_htmlspecialchars($string, $type='encode')
{

    $all = array(
        '΄'     => '&#180;',
        '\\'    => '&#92;',
        '\''    => '&#39;',
        '"'     => '&#34;',
        '|'     => '&#124;',
        '~'     => '&#126;',
        '{'     => '&#123;',
        '}'     => '&#125;',
        '€'     => '&#8364;',
//      '&'     => '&amp;',
        ':'     => '&#58;',
        '!'     => '&#33;',
        '@'     => '&#64;',
        '+'     => '&#43;',
        '='     => '&#61;',
        '^'     => '&#94;',
        '$'     => '&#36;',
        '*'     => '&#42;',
        '%'     => '&#37;',
        '?'     => '&#63;',
//      '#'     => '&#35;',
//      ';'     => '&#59;',
        '`'     => '&#96;',
        ','     => '&#44;',
        '.'     => '&#46;',
        '('     => '&#40;',
        ')'     => '&#41;',
        '['     => '&#91;',
        ']'     => '&#93;',
        '<'     => '&lt;',
        '>'     => '&gt;'
    );

    $count = 0;
    $output = $string;

        // do the work
        switch($type)
        {
            case "encode":                  
                $output = str_replace(array_keys($all), $all, $string, $count);
                break;

            case "decode":
                $output = str_replace($all, array_keys($all), $string, $count);
                break;
        }


    return $output;
}

Usage example:

    $orgText = "\"' $ # @ % & &lt;p&gt;test&lt;/p&gt; <span>test2</span>";
    $orgText .= ": ; ? ! @ + = & ` ΄ ' \" < > ( ) { } \\ | [ ] ~ ^ * # , . % € ";
    echo 'org: '.$orgText."<br>";
##  $myText = htmlspecialchars($myText, ENT_QUOTES);
#   echo 'sch: '.$myText."<br>";    
    $encode = htmlentities($orgText, ENT_QUOTES);
    echo 'entities encode: '.$encode."<br>";
    $decode = html_entity_decode($encode, ENT_QUOTES, 'UTF-8')."<br>";
    echo 'entities decode1: '.$decode;
    echo 'entities decode2: '.html_entity_decode($decode, ENT_QUOTES, 'UTF-8')."<br>";  

    $myEncode = my_htmlspecialchars($orgText, 'encode');
    echo 'my encode: '.$myEncode."<br>";
    $myDecode = my_htmlspecialchars($myEncode, 'decode');
    echo 'my decode: '.$myDecode."<br>";
    /* The output in a browser should be:  
    "' $ # @ % &

    test
    test2: ; ? ! @ + = & ` ΄ ' " < > ( ) { } \ | [ ] ~ ^ * # , . % € 
    */

The output of the above example is (in a browser):

 org: "' $ # @ % & <p>test</p> test2: ; ? ! @ + = & ` ΄ ' " < > ( ) { } \ | [ ] ~ ^ * # , . % €
entities encode: "' $ # @ % & &lt;p&gt;test&lt;/p&gt; <span>test2</span>: ; ? ! @ + = & ` � ' " < > ( ) { } \ | [ ] ~ ^ * # , . % �
entities decode1: "' $ # @ % & <p>test</p> test2: ; ? ! @ + = & ` � ' " < > ( ) { } \ | [ ] ~ ^ * # , . % �
entities decode2: "' $ # @ % &

test
test2: ; ? ! @ + = & ` � ' " < > ( ) { } \ | [ ] ~ ^ * # , . % �

my encode: "' $ # @ % & <p>test</p> <span>test2</span>: ; ? ! @ + = & ` ´ ' " < > ( ) { } \ | [ ] ~ ^ * # , . % €
my decode: "' $ # @ % &

test
test2: ; ? ! @ + = & ` ΄ ' " < > ( ) { } \ | [ ] ~ ^ * # , . % € 

I think the solution involves preg_replace(), but I can't figure out how.
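One possible direction, sketched here under assumptions rather than as a tested solution: instead of str_replace(), walk the string once with preg_replace_callback() and let the pattern match either a complete existing entity (left untouched) or a single character (looked up in the table). The function name `encode_outside_entities` and the small `$exampleMap` are hypothetical and only for illustration; the sketch assumes PHP 7+ and UTF-8 input.

```php
<?php
// Hypothetical helper: encodes characters from $map, but skips anything
// that already looks like an HTML entity (&name; &#123; &#x7B;).
function encode_outside_entities(string $string, array $map): string
{
    // Group 1: an existing entity. Otherwise: any single character.
    // The "u" flag makes "." match whole UTF-8 characters, "s" includes newlines.
    $pattern = '/(&(?:[a-zA-Z][a-zA-Z0-9]*|#[0-9]+|#[xX][0-9a-fA-F]+);)|./su';

    $result = preg_replace_callback(
        $pattern,
        function (array $m) use ($map): string {
            // Group 1 is only present when a whole entity was matched.
            if (isset($m[1]) && $m[1] !== '') {
                return $m[0];
            }
            // Single character: encode it if the table knows it.
            return $map[$m[0]] ?? $m[0];
        },
        $string
    );

    // preg_replace_callback() returns null on error (e.g. invalid UTF-8).
    return $result ?? $string;
}

// Reduced example table; the full table from the question could be passed instead.
$exampleMap = array(
    '&' => '&amp;',
    '#' => '&#35;',
    ';' => '&#59;',
    '<' => '&lt;',
    '>' => '&gt;',
);

echo encode_outside_entities('a & b; &lt;p&gt; #1 <i>', $exampleMap), "\n";
// prints: a &amp; b&#59; &lt;p&gt; &#35;1 &lt;i&gt;
```

Because existing entities are skipped, running the function a second time on its own output should leave it unchanged. The limitation is that input which merely looks like an entity (for example a literal `&T;` typed by a user) is also left alone, since the pattern cannot tell it apart from an encoded element.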

Cœur
ioaniatr
  • Why do you need a "custom" function? – Oliver Charlesworth Nov 09 '14 at 21:48
  • I have tried all the mentioned functions, but all have disadvantages for the way I want to use them. For example, `htmlentities($string)` in many cases needs `html_entity_decode(html_entity_decode($string))` so that the browser displays the text as it should. I want to use specific functionality of each by defining different input variables in one function. The above example is not the complete function, only the problem part. – ioaniatr Nov 09 '14 at 22:00
  • I'd be interested in seeing a test-case that demonstrates a need to call the method twice! – Oliver Charlesworth Nov 09 '14 at 22:17
  • You can try the example code that I have added above. – ioaniatr Nov 09 '14 at 22:36
  • The code behaves as expected; a single call is sufficient to decode: http://ideone.com/a6djDZ. – Oliver Charlesworth Nov 09 '14 at 22:45
  • The code looks fine (as text), but the first time the browser does not recognize the "<p>" as an element, only as text. Here is your live example: http://ioaniatr.gr/test.php – ioaniatr Nov 09 '14 at 23:12
  • As you can see, my function works fine as long as it does not encode the `& # ;` chars. So my problem is how to encode these chars too, but only when they are not part of an already encoded element. For example, encoding `&` would give `&amp;`. So how can I tell that the chars `& ;` are part of an encoded element and should not be re-encoded? – ioaniatr Nov 09 '14 at 23:38
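
For the `&` case specifically, the built-in htmlspecialchars() has a fourth parameter, `$double_encode`, which when set to `false` leaves existing entities alone. This only covers the characters htmlspecialchars() handles (`& < > " '`), not the whole custom table, so it is a partial illustration rather than a replacement for the function in the question. The sample string `$text` is made up for the example.

```php
<?php
// Sample input: a bare "&", already encoded entities, and raw tags.
$text = 'a & b &lt;p&gt; &#59; <i>';

// Fourth argument false = do not re-encode existing entities.
$encoded = htmlspecialchars($text, ENT_QUOTES, 'UTF-8', false);
echo $encoded, "\n";
// prints: a &amp; b &lt;p&gt; &#59; &lt;i&gt;

// A single decode pass turns every entity back into its character.
$decoded = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');
echo $decoded, "\n";
// prints: a & b <p> ; <i>
```

Note that the decoded string is not identical to `$text`: the entities that were already present in the input are decoded too, which is the same ambiguity the question describes.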

0 Answers