
i have a problem i never encountered before, and i think it has something to do with the apache configuration, which i'm not very well versed in.

first, there is a php script with a search form. the form is transmitted via POST.

then there's the result list of search hits. here the original search query is passed as part of the url, e.g.: search.php?id=1234&query=foo. this also works - as long as there are no umlauts (äöüÄÖÜß...) chars transmitted.

as soon as i include umlauts in the search query, the first part that transmits the query string as POST works, but passing it (urlencoded) in the URL leads to a 403.

so:

  • search.php?id=1234&query=bar works
  • search.php?id=1234&query=b%E4r leads to 403 (%E4 = "ä" iso-8859-1 urlencoded)
  • search.php?id=1234&query=b%C3%A4r leads to 403 (%C3%A4 = "ä" utf-8 urlencoded)
  • submitting umlauts via POST works
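
For reference, the two encoded forms in the list above can be reproduced with a short sketch. It uses Python's standard `urllib.parse.quote` purely as an illustration (the app itself is PHP); the variable names are made up for this example. It shows that `%E4` is the single ISO-8859-1 byte for "ä", while `%C3%A4` is its two-byte UTF-8 form.

```python
from urllib.parse import quote

query = "bär"

# Percent-encode the same string under two different byte encodings.
latin1_encoded = quote(query, encoding="latin-1")  # "ä" becomes one byte
utf8_encoded = quote(query, encoding="utf-8")      # "ä" becomes two bytes

print(latin1_encoded)  # b%E4r
print(utf8_encoded)    # b%C3%A4r
```

Both forms are valid percent-encoding, so a 403 on both suggests the rejection is triggered by the high bytes themselves rather than by a malformed URL.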

i converted the app from iso-8859-1 to utf-8, but that made no difference.

i also tested it on my local machine, here it works flawlessly - as expected.

remote server setup (where it doesn't work):

Apache/2.2.12 (Ubuntu),
PHP Version 5.2.10-2ubuntu6.7, Suhosin Patch 0.9.7, via CGI/FastCGI

local setup (here the same works):

Apache/2.2.8 (Win32) PHP/5.3.5
PHP Version 5.3.5 via mod_php

does anybody have an idea why the remote apache/php-cgi doesn't accept properly urlencoded umlauts in the url?

additional info: i also tried to create a static file with an umlaut in its name, and both /t%C3%A4st.php and /täst.php get served without problem. täst.php?foo=täst fails.

note: ?foo=%28, where %28 is "(", works also.
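
The manual checks above could be turned into a small probe, sketched here under assumptions: the host `example.invalid` is a placeholder, and `probe_urls` and `status_of` are hypothetical helper names, not part of any existing tool. The idea is to send one request per byte range (plain ASCII, percent-encoded ASCII, a Latin-1 high byte, a UTF-8 multi-byte sequence) and compare status codes.

```python
from urllib.parse import quote
import urllib.error
import urllib.request

BASE = "http://example.invalid/search.php"  # placeholder host, replace with the real one


def probe_urls(base):
    """Build one test URL per byte range of interest in the query string."""
    cases = {
        "plain ascii": "bar",
        "encoded ascii": quote("(", safe=""),                  # %28
        "latin-1 high byte": quote("bär", encoding="latin-1"),  # b%E4r
        "utf-8 multi-byte": quote("bär", encoding="utf-8"),     # b%C3%A4r
    }
    return {name: f"{base}?id=1234&query={value}" for name, value in cases.items()}


def status_of(url):
    """Return the HTTP status code for url (performs a real request)."""
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 403 when the server rejects the request


if __name__ == "__main__":
    for name, url in probe_urls(BASE).items():
        print(name, url, status_of(url))
```

If only the two high-byte cases return 403, that points at a filter on non-ASCII bytes in the query string rather than at the script itself.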

stefs
  • pedantic, I know, but "ß" has no umlauts... – Stephen Feb 01 '11 at 12:53
  • sorry :) how is this superset of kind-of special chars called? – stefs Feb 01 '11 at 13:02
  • do you have any mod_security-like module with some broken rules? does this also happen with any non-ascii character like àéù? – Arnaud Le Blanc Feb 01 '11 at 13:04
  • àáéèùú do not work either. ascii characters work, non-ascii don't. – stefs Feb 01 '11 at 13:09
  • @user576875: sorry, i can't answer the question regarding the mod_security. phpinfo says there's suhosin and the ioncube loader installed. i don't have access to any configuration directives. – stefs Feb 01 '11 at 13:13
  • Without exact logging from the HTTP server, it's tough to say. If you have access to the httpd binary through a shell, you can run httpd -l to list the current modules and check for mod_security. If the server owner put suhosin on, it wouldn't be a stretch for them to put on mod_sec as well. If you do find that module in Apache, the server owner should be able to pull the server-wide error log and confirm or deny whether it's mod_sec. An easy way to check if it's PHP, again if you have a shell, is to just run php; if it's a PHP issue, it'll kick out the error to you. – Alexander Blair Feb 01 '11 at 13:19
  • It doesn't sound like an HTTPd/security problem, because you put the special character(s) in the query part. The URL spec doesn't define an encoding for it; that's up to the application. Can you add a simple "die('hello');" as the first line of your search.php script? – mabe.berlin Apr 04 '12 at 17:04
  • Are you using some web application framework? Some might filter certain URL characters for security reasons. – Andrei Bârsan Apr 08 '12 at 20:58
  • i don't work on that project anymore, and so this question isn't relevant to me personally anymore. other than that: no, i didn't use a framework. i still think the culprit was some kind of mis-configured security mod for apache. – stefs Apr 11 '12 at 11:41

1 Answer


Apache doesn't escape that; the browser does.

You need to use urlencode and urldecode to avoid issues with these kinds of characters.

Some browsers, like old Netscape, just send the URL as written, with 8-bit characters in it. Others, notably MSIE, encode the URL as UTF-8 before sending it to the web server, so an 8-bit character arrives as two bytes, each with the 8th bit set. There is no indication whatsoever, in the request headers or elsewhere, that the URL is encoded in UTF-8.
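
Because the request carries no encoding marker, the receiving side can only guess. A minimal sketch of one common heuristic follows, written in Python for illustration; `decode_query_value` is a hypothetical helper name, and the Latin-1 fallback is an assumption that fits this thread's ISO-8859-1 background, not a general rule.

```python
from urllib.parse import unquote_to_bytes


def decode_query_value(raw):
    """Percent-decode raw, then try UTF-8 and fall back to Latin-1.

    The fallback is a guess: Latin-1 maps every byte to a character,
    so it never fails, even when the real encoding was something else.
    """
    data = unquote_to_bytes(raw)
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return data.decode("latin-1")


print(decode_query_value("b%C3%A4r"))  # bär (valid UTF-8)
print(decode_query_value("b%E4r"))     # bär (invalid UTF-8, decoded as Latin-1)
```

Trying UTF-8 first works reasonably well in practice because Latin-1 text with high bytes is rarely valid UTF-8 by accident.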

Sein Oxygen
  • i'm aware of that. but i suspect some apache configuration/mis-configured security mod to block requests if there are certain characters in the url. – stefs Apr 11 '12 at 11:42
  • The other problem is that urldecode is supposed to be done automatically in PHP, so this should not cause a problem. However, if you're being served a 403, it has to be Apache: if the request got into PHP and errored, the error would be a 500. A 403 says that Apache is failing to load the file, and with GET strings this has to be a rule that is set up on the Apache server. – Barkermn01 Apr 12 '12 at 14:14