I want to check whether a JSON string, posted via a curl call, starts with a specific substring. The check kept returning FALSE instead of TRUE, so I tried to debug it. I first changed my code to read from $_GET instead of $_POST so that I could pass the data as a query string in the URL. I always sanitize incoming data, and it looks like that is where the problem comes from.

I ran var_dump() on the data and saw that the reported string length differs between the sanitized and the unsanitized version.

This is the json: [{"userProfileUrl":"/users/jDoe","avatar":"/images/1.jpg","name":"John Doe","id":1}]

This is what the URL used for debugging looks like:

example.com?file=[{"userProfileUrl":"/users/jDoe","avatar":"/images/1.jpg","name":"John Doe","id":1}]

This is the code to do the checking:

function verifyFileStart($string, $startString)
{
    $len = strlen($startString);
    return (substr($string, 0, $len) === $startString);
}

This is with sanitizing:

<?php
define("STARTSWITH", '[{"userProfileUrl":"');

$file = $_GET['file'];

// With Sanitize
$file = filter_var($file, FILTER_SANITIZE_STRING);

$part = STARTSWITH;

echo var_dump(verifyFileStart($file, $part));
echo nl2br("\n\n");
echo var_dump($file);
echo nl2br("\n\n");
echo var_dump($part);

// Here is the function (verifyFileStart) that does the checking
?>

var_dump() Output:

bool(false)

string(128) "[{"userProfileUrl":"/users/jDoe","avatar":"1","name":"John Doe","id":1}]"

string(20) "[{"userProfileUrl":"" 

This is without sanitizing:

<?php
define("STARTSWITH", '[{"userProfileUrl":"');

$file = $_GET['file'];

/* Without Sanitize
$file = filter_var($file, FILTER_SANITIZE_STRING);*/

$part = STARTSWITH;

echo var_dump(verifyFileStart($file, $part));
echo nl2br("\n\n");
echo var_dump($file);
echo nl2br("\n\n");
echo var_dump($part);

// Here is the function (verifyFileStart) that does the checking
?>

var_dump() Output:

bool(true)

string(72) "[{"userProfileUrl":"/users/jDoe","avatar":"1","name":"John Doe","id":1}]"

string(20) "[{"userProfileUrl":"" 

Why is the string length 128 with sanitizing and 72 without?

My function verifyFileStart($string, $startString) only reports a match when the string has not been sanitized. How should I deal with this and make the check work with sanitized data?

Thanks in advance!

Enes Palit

1 Answer

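With FILTER_SANITIZE_STRING, filter_var does more than strip tags: it also encodes quotes as HTML entities, so every " becomes &#34;. You probably don't see this because you are looking at the string in the browser, which renders the entities as plain quote characters again. It also explains the lengths: the string in your output contains 14 double quotes, each entity is 4 characters longer than the quote it replaces, and 72 + 14 × 4 = 128.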

Look at it with "view source", or set a breakpoint and examine the string in the debugger. Or examine the response in the Network tab of the browser's dev tools. Or use Fiddler :)

Your string actually becomes this: [{&#34;userProfileUrl&#34;:&#34;/users/jDoe&#34;,&#34;avatar&#34;:&#34;/images/1.jpg&#34;,&#34;name&#34;:&#34;John Doe&#34;,&#34;id&#34;:1}]
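
To make the length difference concrete, here is a minimal sketch. It does not call the filter itself (FILTER_SANITIZE_STRING is deprecated as of PHP 8.1); instead a hypothetical helper, encodeQuotes(), emulates only the quote encoding described above. The input is the string from the var_dump() output in the question, and verifyFileStart() is copied from the question.

```php
<?php
// Hypothetical helper, not part of PHP: emulates only the quote-encoding
// part of FILTER_SANITIZE_STRING (it does not strip tags).
function encodeQuotes(string $s): string
{
    return str_replace(['"', "'"], ['&#34;', '&#39;'], $s);
}

// Copied from the question.
function verifyFileStart($string, $startString)
{
    $len = strlen($startString);
    return (substr($string, 0, $len) === $startString);
}

$raw       = '[{"userProfileUrl":"/users/jDoe","avatar":"1","name":"John Doe","id":1}]';
$prefix    = '[{"userProfileUrl":"';
$sanitized = encodeQuotes($raw);

echo strlen($raw), "\n";            // 72
echo substr_count($raw, '"'), "\n"; // 14 double quotes, each grows by 4 characters
echo strlen($sanitized), "\n";      // 128

// The raw prefix no longer matches, because the quotes are now entities.
var_dump(verifyFileStart($sanitized, $prefix));               // bool(false)

// Encoding the prefix the same way makes the comparison consistent again.
var_dump(verifyFileStart($sanitized, encodeQuotes($prefix))); // bool(true)
```

Encoding the prefix is only one way to get a match; it keeps the comparison tied to however the input happened to be sanitized, which is fragile.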

obe
  • Thanks for the clarification, but how should I do the check against define("STARTSWITH", '[{"userProfileUrl":"'); without disabling sanitizing? – Enes Palit Feb 05 '20 at 23:38
  • Well, you could take the beginning of the sanitized version and use it instead of your current version. Or you could try to run `filter_var()` on your current version before doing the comparison. But, to be honest, this approach of comparing the beginning of a JSON string seems a little strange. Maybe you can share more details about the ultimate goal... – obe Feb 05 '20 at 23:44
  • I need to compare the beginning because more than one kind of JSON is sent to the server. I want to know whether it is valid JSON, and for each JSON I need to check the beginning because the sources I pull my JSON files from are not structured. Some examples (user, machine, building) start with: `[{"userProfileUrl":"` ; `[{"MachineName":"` ; `[{"buildingHight":"`. So in my index.php I have an if-statement to check whether the JSON contains a user, machine or building – Enes Palit Feb 06 '20 at 00:12
  • so why not just decode the JSON and then check its content in a more "structured" way? – obe Feb 06 '20 at 01:22
  • Yeah you're right, I'll go with that approach. Thanks again – Enes Palit Feb 06 '20 at 03:17
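
The approach agreed on in the comments, decoding the JSON and inspecting its content, could look roughly like the sketch below. The function name detectJsonType() and the returned labels are hypothetical; the key names are the ones quoted in the comments. It assumes the raw, unsanitized string is decoded, since a string whose quotes were turned into &#34; is no longer valid JSON. It also checks whether the key is present in the first object rather than whether it comes first, which is a looser test than the original prefix comparison.

```php
<?php
// Hypothetical helper: returns 'user', 'machine', 'building', or null
// when the input is not valid JSON or has an unexpected shape.
function detectJsonType(string $json): ?string
{
    $data = json_decode($json, true); // objects become associative arrays

    // Expect a non-empty list whose first element is an object.
    if (!is_array($data) || !isset($data[0]) || !is_array($data[0])) {
        return null;
    }

    // Key names taken from the examples in the comments (spelling kept as-is).
    $markers = [
        'userProfileUrl' => 'user',
        'MachineName'    => 'machine',
        'buildingHight'  => 'building',
    ];

    foreach ($markers as $key => $type) {
        if (array_key_exists($key, $data[0])) {
            return $type;
        }
    }

    return null;
}

var_dump(detectJsonType('[{"userProfileUrl":"/users/jDoe","id":1}]')); // string(4) "user"
var_dump(detectJsonType('not json'));                                   // NULL
```

With this shape, validation happens on the decoded structure, and individual values can be sanitized or escaped at the point where they are used or output.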