The issue: my WordPress blog's error log is flooded with "charset not supported, assuming utf-8" messages; it grows from 0 bytes to 450 MB in 24 hours (~28k page views, if the stats are correct).

Details: I have a WordPress-powered blog on a shared hosting account. It has been running for years, and this was never an issue until fairly recently, though I can't pinpoint exactly when it started. A few months ago I started to exceed my allowed resources (mostly memory), so the host moved me to a different server, and I had to upgrade the account for a higher resource allowance. The old server was running PHP 5; this one runs PHP 7. I'm on the latest WP plus around 15 popular plugins, all at their respective latest versions. The theme is ancient; it's been there from the beginning.

Yesterday I deleted a 9 GB(!) error log from the site's root; today, 24 hours later, it's at 500 MB. All lines are similar:

[datetime] PHP Warning:  html_entity_decode(): charset `keep-ali0' not supported, assuming utf-8 in /home/accountname/public_html/wp-includes/formatting.php on line 5124
[datetime] PHP Warning:  htmlentities(): charset `/[^0-9\.]/' not supported, assuming utf-8 in /home/accountname/public_html/wp-content/plugins/wp-super-cache/wp-cache-base.php on line 5
... etc.

I parsed the older 2 GB log:

  • they came from 13 files: 3 core WP files, others from 6 different plugins
  • only from these functions: htmlentities(), htmlspecialchars(), html_entity_decode()
  • over 1000 unique "charsets": all are garbage, most include non-printable chars, others just weird stuff: paths (not mine!), regexes, integers, hex values...: #^[a-z]:[/\\]#i, meta_value, 0x7fe858ae2920, /home/someone-elses-account-name/public_html/includes/functions.php, ...
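
A breakdown like the one above could be produced with a short script. The following is only a minimal sketch: the function names `parseCharsetWarning` and `summarizeLog` are made up for illustration, and the regular expression assumes every warning follows the exact layout of the two sample lines shown earlier.

```php
<?php
// Sketch: tally "charset not supported" warnings by function, file and bogus charset.
// Assumption: each warning sits on one line, in the format shown in the log excerpt.

/** Returns the parts of one warning line, or null if the line is something else. */
function parseCharsetWarning(string $line): ?array
{
    // The charset is wrapped in a backtick and a single quote, as PHP prints it.
    $pattern = '/PHP Warning:\s+(\w+)\(\): charset `(.*)\' not supported, '
             . 'assuming utf-8 in (.+) on line (\d+)/s';
    if (!preg_match($pattern, $line, $m)) {
        return null;
    }
    return ['function' => $m[1], 'charset' => $m[2], 'file' => $m[3], 'line' => (int) $m[4]];
}

/** Counts occurrences per function, per file and per charset value. */
function summarizeLog(iterable $lines): array
{
    $summary = ['function' => [], 'file' => [], 'charset' => []];
    foreach ($lines as $line) {
        $hit = parseCharsetWarning(rtrim($line, "\r\n"));
        if ($hit === null) {
            continue; // not a charset warning
        }
        foreach (array_keys($summary) as $key) {
            $value = $hit[$key];
            $summary[$key][$value] = ($summary[$key][$value] ?? 0) + 1;
        }
    }
    return $summary;
}

// Hypothetical usage; SplFileObject iterates line by line, so a multi-GB log
// is never loaded into memory at once:
// print_r(summarizeLog(new SplFileObject('/path/to/error_log')));
```

Counting the keys of `$summary['file']` and `$summary['charset']` gives the number of distinct source files and distinct garbage charsets.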

Where do these values come from?

Where do I even start troubleshooting this?

miken32
flamey
  • Not a solution, but as a temporary hack, change the value of the error_reporting option in php.ini so that it does not include the constant E_WARNING. This may be a bit tricky, as the value is a bit mask and E_WARNING may be specified by using the complement of another option. This will stop the warning messages from being written to your log file while you solve the real problem. –  Dec 06 '18 at 22:28
  • Alternatively, go to /wp-includes/formatting.php on line 5124 and change html_entity_decode to @html_entity_decode - again not a solution but just suppression of the error messages. –  Dec 06 '18 at 22:38
  • Do you have edit access to the source code? If you can edit the source code, I will tell you how to find the source of the problem. –  Dec 07 '18 at 00:11
  • https://bugs.php.net/bug.php?id=71876 – miken32 Dec 07 '18 at 03:13
  • @miken32 - Wow, I never suspected a bug in PHP, you are truly a guru! Thanks for the lesson; next time something really strange occurs I will also consider PHP bugs. A very valuable lesson indeed, thank you again. –  Dec 07 '18 at 03:57
  • LOL I wouldn't make a habit of it @magenta, 99.5% of the time someone posts a question insisting they've found a bug in a programming language, they have not. – miken32 Dec 07 '18 at 04:33
  • @magenta the problem with editing WP or plugin code is that the changes will be overwritten by the next update. Being on shared hosting, I don't have access to php.ini, and the host probably wouldn't change it for me, as it would affect all accounts on that server. I'll try `ini_set('internal_encoding', 'utf-8')` tonight, but even if this helps it will be overwritten by a WP update, I guess. – flamey Dec 07 '18 at 07:44
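
The bit-mask point from the first comment can be shown without touching php.ini. This is a rough sketch only: `withoutWarnings` is a hypothetical helper name, and applying it silences every E_WARNING, not just the charset ones, so it is a stopgap at best.

```php
<?php
// Sketch: clear only the E_WARNING bit from an error_reporting level.
// withoutWarnings() is a made-up helper name, not a PHP or WordPress function.
function withoutWarnings(int $level): int
{
    // ~E_WARNING has every bit set except E_WARNING, so AND-ing removes just that bit.
    return $level & ~E_WARNING;
}

// Hypothetical usage at runtime (hides ALL warnings, so use with care):
// error_reporting(withoutWarnings(error_reporting()));
```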

1 Answer

It appears this is a known bug in PHP. It is difficult to reproduce, so it has stuck around for a while.

https://bugs.php.net/bug.php?id=71876

Various workarounds have been suggested, including:

  • Setting internal_encoding=utf-8 in php.ini or using ini_set('internal_encoding', 'utf-8');
  • Ensuring that default_charset is not set in php.ini
  • Adding the character set to the function call, e.g. html_entity_decode($x, null, 'utf-8');

These workarounds appear to have mixed results.
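
As a rough sketch of the first and third workarounds in code: the file location and the helper name `decode_entities_utf8` below are assumptions for illustration, not part of WordPress or of the bug report. A file placed in `wp-content/mu-plugins/` is loaded automatically by WordPress and is not replaced by core or plugin updates, which makes it one possible home for the `ini_set()` call. Whether this actually stops the warnings is not guaranteed, given the mixed results reported.

```php
<?php
// Hypothetical file: wp-content/mu-plugins/force-internal-encoding.php
// (the file name is an assumption; any .php file in that directory is auto-loaded)

// Workaround 1: set internal_encoding at runtime instead of in php.ini.
ini_set('internal_encoding', 'UTF-8');

// Workaround 3: name the charset explicitly so PHP never has to look it up.
// decode_entities_utf8() is a made-up wrapper for code you control.
// ENT_COMPAT | ENT_HTML401 mirrors the default flags of PHP 7.
function decode_entities_utf8(string $text): string
{
    return html_entity_decode($text, ENT_COMPAT | ENT_HTML401, 'UTF-8');
}
```

The wrapper only helps in code you write yourself; calls inside WordPress core and third-party plugins still rely on the configured default, which is why the runtime setting comes first.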

miken32
  • Thank you! This appears to be it. I'll give these a try over the next few days. I don't have access to php.ini, though, but I'll talk to support again. – flamey Dec 07 '18 at 05:59
  • So, I couldn't use any of the workarounds (one php.ini for many accounts on shared hosting; code changes would be overwritten by software updates). But hosting support added `internal_encoding utf-8` to the Apache web server config via an include config (? something like that), and it worked. Thank you for pointing me in the right direction! – flamey Dec 07 '18 at 11:57