I am building a script whose goal is to check up to 100 URLs for validity (no 404).

The only variable in the URL is the page number, like so:

http://example.com/category/id/products/page/1
http://example.com/category/id/products/page/2

and so on, up to 100.

As soon as my code reaches an invalid URL, I want it to stop and echo the number it has reached. This is the code I have been trying, to no avail:

$url ="http://example.com/category/id/products/page/1";

if (false !== strpos($url, $id)) {

    $pageNumber = 2;
    $check = true;

do{

    $urlIterate = "http://example.com/category/id/products/page/".$pageNumber;

    if(false !== strpos($urlIterate, $id)){

        $pageNumber++;

    }

    else{

        $check = false;

    }

}

while($pageNumber <= 99);

}

else{

    $check = false;
    echo 'No pages were found at all';

}

echo "There were ". $pageNumber." pages.";

?>
sepp2k
zak
  • What I'm not understanding is the $id variable. How is it initialized? Also, once you're in the do..while loop, is the $id variable changed? – CodeGodie Oct 23 '14 at 12:47
  • Also, if you want validity to check for 404, why are you not using the **[get_headers()](http://php.net/manual/en/function.get-headers.php)** PHP function? – CodeGodie Oct 23 '14 at 12:53
  • $id variable is static and defined by myself depending on the ID of the user I am reviewing. – zak Oct 23 '14 at 12:56

2 Answers

1

I'm not sure if this is what you're looking for, but try this:

<?php

    $id_to_search = "90";

    for ($i = 1; $i <= 100; $i++) {
        $url = "http://example.com/category/id/products/page/" . $i;
        $values = parse_url($url);
        $paths = explode('/', $values['path']);
        $id_from_url = $paths[5];
        if ($id_to_search === $id_from_url) {
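            $headers = get_headers($url);
            // match on the status code only, so HTTP/1.1 responses are caught too;
            // a failed request (false) is treated like a 404
            if ($headers === false || strpos($headers[0], ' 404') !== false) {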
                echo "URL Found! URL is invalid(404). URLs searched = " . $i . "<br>";
            } else {
                echo "URL is valid<br>";
            }
        } else {
            echo "URL was searched but it does not match the ID we are looking for<br>";
        }
    }
CodeGodie
  • This is essentially what I'm looking for. However, I've tried this method, and its problem is that it only checks the URLs you input, not all the ones up to 100. My goal is to check a URL; if it's good, check the next URL, and so on until 100 have been checked. If one is *not* good, echo the number it got to. – zak Oct 23 '14 at 13:37
  • Did not fix my issue holistically but with some alterations I made it work, thanks for your help – zak Oct 24 '14 at 12:27
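
The behaviour asked for in the comments (check each page in turn and stop at the first invalid one) could be sketched roughly as follows. This is only a sketch under assumptions: `lastValidPage` and `$notA404` are hypothetical names that appear nowhere in the thread, and "valid" is assumed to mean "the request succeeded and the status line is not a 404". The validity check is passed in as a callable so the loop can be tried without making real requests.

```php
<?php
// Hypothetical helper: returns the last valid page number (0 if none is valid).
// $isValid is an injected callable that receives the full URL and returns bool.
function lastValidPage(string $baseUrl, int $maxPages, callable $isValid): int
{
    for ($page = 1; $page <= $maxPages; $page++) {
        if (!$isValid($baseUrl . $page)) {
            return $page - 1; // stop at the first invalid URL
        }
    }
    return $maxPages; // every page up to the limit was valid
}

// Assumed validity check: get_headers() succeeded and the
// status line does not report a 404.
$notA404 = function (string $url): bool {
    $headers = @get_headers($url);
    return $headers !== false && strpos($headers[0], ' 404') === false;
};

// Usage (performs real HTTP requests, so it is left commented out):
// echo "There were " . lastValidPage("http://example.com/category/id/products/page/", 100, $notA404) . " pages.";
```

Returning the count instead of echoing inside the loop keeps the "no pages at all" case (a result of 0) easy to handle at the call site.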
0

Why are you not using a for loop? It is a better fit when we know how many iterations we will need.

for($i = 1; $i <= 100; $i++){
    $urlIterate = "http://example.com/category/id/products/page/".$i; //generate url
    $headers = get_headers($urlIterate, 1); //get headers (false on failure)
    //a failed request or any non-200 status line counts as an error
    if($headers === false || strpos($headers[0], ' 200') === false){
        if($i > 1) //if there was at least one found
            echo 'Last found number is ' . ($i-1);
        else
            echo 'No pages were found at all';
        break; //stops the 'for' loop
    }
}

Your code is looking for $id in the URLs. What's the point of that?
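
Status lines differ between servers and protocol versions (for example `HTTP/1.0` versus `HTTP/1.1`), so it may be safer to extract the numeric code than to compare against one exact string. A minimal sketch, assuming a hypothetical helper named `statusCodeFromLine` that is not part of PHP or of the code above:

```php
<?php
// Hypothetical helper: extracts the numeric status code from a status line
// such as "HTTP/1.1 200 OK". Returns null if the line is not a status line.
function statusCodeFromLine(string $statusLine): ?int
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $statusLine, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Usage (performs a real HTTP request, so it is left commented out):
// $headers = get_headers("http://example.com/category/id/products/page/1");
// $ok = $headers !== false && statusCodeFromLine($headers[0]) === 200;
```

Note that `get_headers()` follows redirects by default, so index 0 holds the status line of the first response only.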

Rafał Swacha