
I have a GenerateSitemap.php file where I can configure the crawler, but I don't understand how to make the crawler skip specific URLs, for example https://example.com/?page=1, https://example.com/?page=10, and https://example.com/?page=125. I use the Spatie sitemap package in Laravel and tried the solution below, but it didn't work:

public function sitemap()
{
    SitemapGenerator::create('https://example.com')
        ->shouldCrawl(function (UriInterface $url) {
            return strpos($url->getPath(), '?page') === false;
        })
        ->writeToFile(public_path('sitemap.xml'));
}

1 Answer


The problem is that you are using the getPath() method of UriInterface. That would only work if "?page" appeared in the path of the URL, but what you actually want to inspect is the query string. Use getQuery() instead of getPath(), and make the strpos() needle look like "page=".
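
A quick way to see the difference is to inspect both parts of the URL directly. This is a minimal sketch, assuming Guzzle's PSR-7 Uri implementation (a dependency of the Spatie crawler) is available:

use GuzzleHttp\Psr7\Uri;

$url = new Uri('https://example.com/?page=1');

echo $url->getPath();  // "/"      -- the path never contains "?page"
echo $url->getQuery(); // "page=1" -- the page parameter lives here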

public function sitemap()
{
    SitemapGenerator::create('https://example.com')
        ->shouldCrawl(function (UriInterface $url) {
            // Check the query string, not the path. Note that strpos()
            // does substring matching, so 'page=1' alone already matches
            // page=10 and page=125 (and would also exclude page=11,
            // page=100, and so on).
            return strpos($url->getQuery(), 'page=1') === false &&
                   strpos($url->getQuery(), 'page=10') === false &&
                   strpos($url->getQuery(), 'page=125') === false;
        })
        ->writeToFile(public_path('sitemap.xml'));
}

Of course, if you have more pages, you can put the numbers you want to exclude in an array and check against it, as in the sketch below.
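
Here is a minimal sketch of that idea. It uses parse_str() to compare the page value exactly, which avoids the substring overlap noted above; the $excluded variable name and the list of page numbers are just illustrative:

public function sitemap()
{
    // Hypothetical list of page numbers to exclude; adjust as needed.
    $excluded = ['1', '10', '125'];

    SitemapGenerator::create('https://example.com')
        ->shouldCrawl(function (UriInterface $url) use ($excluded) {
            // Parse the query string into an array so we can compare
            // the page value exactly instead of by substring.
            parse_str($url->getQuery(), $query);

            return ! isset($query['page'])
                || ! in_array($query['page'], $excluded, true);
        })
        ->writeToFile(public_path('sitemap.xml'));
}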
