
I've got a section of code on a b2evo PHP site that does the following:

$content = preg_replace_callback(
    '/[\x80-\xff]/',
    create_function( '$j', 'return "&#".ord($j[0]).";";' ),
    $content);

What does this section of code do? My guess is that it strips out ASCII characters between 128 and 255, but I can't be sure.

Also, as it stands, every time this bit of code is called from within a page, PHP allocates and then does not free up to 2 KB of memory. If the function is called 1000+ times on a page (this can happen), then the page uses an extra 2 MB of memory.

This is causing problems with my web application. Why am I losing memory, and how do I rewrite this so I don't get a memory leak?

seanyboy
  • As I point out in my update, replace this RE function with htmlentities and it should be fine... – PhiLho Nov 17 '08 at 15:06

3 Answers


It's create_function that's leaking your memory: every call defines a new function that stays registered until the script ends. Just use a normal function instead and you'll be fine.

The function itself replaces the matched characters with numeric HTML entities (&#xxx;).
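As a minimal sketch of that advice, assuming PHP 5.3 or later (where closures are available), the callback can be written as an anonymous function instead of going through create_function. The helper name bytes_to_entities is hypothetical and only there to make the example self-contained.

```php
<?php
// Hypothetical helper (not from the original code): replaces every byte
// in the range 0x80-0xFF with a numeric entity built from its byte value.
function bytes_to_entities($content)
{
    return preg_replace_callback(
        '/[\x80-\xff]/',
        // A closure is released once it is no longer referenced, whereas
        // create_function() registers a new function on every call.
        function ($m) {
            return '&#' . ord($m[0]) . ';';
        },
        $content
    );
}

echo bytes_to_entities("caf\xE9"), "\n"; // prints "caf&#233;"
```

Plain ASCII input passes through unchanged, since the pattern only matches bytes from 128 upwards.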

Greg

It isn't really stripping them: it replaces high-ASCII characters (bytes 128 to 255) with their numeric entities.

See preg_replace_callback.
create_function is used to make an anonymous function, but you can use a plain function instead:

$content = 'Çà ! Nœm dé fîçhïèr tôrdù, @ pöür têstër... ? ~ Œ[€]';
$content = preg_replace_callback('/[\x80-\xff]/', 'CB_CharToEntity', $content);
echo $content . '<br>';
echo htmlspecialchars($content) . '<br>';
echo htmlentities($content) . '<br>';
echo htmlentities($content, ENT_NOQUOTES, 'cp1252') . '<br>';

function CB_CharToEntity($matches)
{
    return '&#' . ord($matches[0]) . ';';
}

[EDIT] Found a cleaner, probably faster way to do the job! ^_^ Just use htmlentities with options fitting your needs.
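A minimal sketch of the htmlentities variant, assuming the content is ISO-8859-1 (an assumption; the question does not state the encoding). It is not a drop-in equivalent of the regex version: htmlentities emits named entities where they exist and also escapes &, < and >.

```php
<?php
// Assumption: $content holds ISO-8859-1 bytes; adjust the charset
// argument if the site uses a different encoding.
$content = "caf\xE9 <b>";

// ENT_NOQUOTES leaves single and double quotes untouched.
$out = htmlentities($content, ENT_NOQUOTES, 'ISO-8859-1');

echo $out, "\n"; // prints "caf&eacute; &lt;b&gt;"
```

If the existing markup in $content must survive untouched, the callback approach is the safer of the two.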

PhiLho

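It's a lot simpler to use preg_replace with the /e flag in your case (note that the /e modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so this only works on older versions):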

$content = preg_replace(
    '/[\x80-\xff]/e',
    '"&#".ord("$0").";"',
    $content);
newacct