A PHP file is receiving a URL encoded string via GET. However, some scripts may send strings encoded with the urlencode() function, while other scripts may send strings encoded with the rawurlencode() function.

What would be the best way to check which function was used to encode the string, so the appropriate decoding function (urldecode() or rawurldecode()) can be called?

So far, my only idea is code like this:

if (stristr($string, "%20"))...
miken32
Mindaugas Li
  • This is something you should state in the documentation of your API: which URL encoding the remote end should use. – Ferrybig Apr 04 '16 at 09:17
  • Alright, so how does it answer my question? – Mindaugas Li Apr 04 '16 at 09:19
  • `echo this_sign_is_on_url('+') === true ? 'urldecode' : 'rawurldecode';` Don't know.... – Chay22 Apr 04 '16 at 09:22
  • It answers your question by telling you to implement a protocol and not accept every garbage someone sends you. Imagine if TCP/IP protocol tried to be smart and guess what the other side meant, instead of implementing a protocol to work by. If you receive malformed request or something you can't work with, have your script return an error message and hang up. – Mjh Apr 04 '16 at 09:57
  • I know what should or should not be done, but the fact is that the script wasn't developed by me. I took it over and now want to change from urlencode to rawurlencode while maintaining compatibility with old installations. So I need an answer to THIS QUESTION ONLY. – Mindaugas Li Apr 04 '16 at 10:00

1 Answer

Both functions leave the characters matching [0-9A-Za-z_.-] untouched and convert every other byte to a percent sign followed by its two-digit hexadecimal value, with two exceptions. rawurlencode() encodes a space as %20, while urlencode() encodes it as +. And rawurlencode() leaves ~ unescaped (as of PHP 5.3), while urlencode() turns it into %7E. Note that a literal + in the data is encoded as %2B by both functions.
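As a quick illustration, here is how the two encoders treat a space, a literal plus sign and a tilde. The sample string is made up for this sketch, and the tilde behaviour shown assumes PHP 5.3 or later:

```php
<?php
// Sample input chosen for illustration: contains a space, a literal "+" and a "~".
$str = "a b+c~d";

echo urlencode($str), PHP_EOL;    // a+b%2Bc%7Ed
echo rawurlencode($str), PHP_EOL; // a%20b%2Bc~d
```

In both outputs the literal plus sign shows up as %2B, so an unescaped + can only ever stand for a space.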

For decoding, this means that any sequence matching the regular expression %[0-9A-Fa-f]{2} is decoded identically by either function. That only leaves a + to worry about: rawurldecode() leaves it untouched, while urldecode() turns it into a space. Since neither encoder ever emits an unescaped + for anything other than a space (a plus sign in the data always arrives as %2B), you can simply use urldecode() on the server side and not worry about any testing.

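<?php
$str = "foo bar baz";
$raw = rawurlencode($str); // foo%20bar%20baz
$enc = urlencode($str);    // foo+bar+baz

// Print one result per line.
echo rawurldecode($raw), PHP_EOL;
echo rawurldecode($enc), PHP_EOL; // the "+" is left as-is
echo urldecode($raw), PHP_EOL;
echo urldecode($enc), PHP_EOL;
?>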

Output:

foo bar baz
foo+bar+baz
foo bar baz
foo bar baz
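The same check can be run on a string that contains the trickier characters. This is only a sketch: decode_either() is a hypothetical helper name, the sample string is made up, and the type declarations assume PHP 7 or later. It also assumes every sender used one of the two PHP functions discussed here.

```php
<?php
// Hypothetical helper: decode a value without knowing which encoder produced it.
function decode_either(string $encoded): string
{
    // Both encoders send a literal "+" as %2B, so an unescaped "+"
    // can only be a urlencode() space; urldecode() handles both inputs.
    return urldecode($encoded);
}

// Sample data with spaces, a literal plus sign, an equals sign and a tilde.
$original = "1 + 1 = 2 ~ ok";

var_dump(decode_either(urlencode($original)) === $original);    // bool(true)
var_dump(decode_either(rawurlencode($original)) === $original); // bool(true)
```

If the string is read from $_GET rather than parsed by hand, keep in mind that PHP has already decoded it once, so calling urldecode() on it again would decode it twice.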
miken32