
Google Search Console is reporting broken image link errors for some of the pages on my legacy website. Many of these pages seem not to be indexed by Google, and I suspect the broken links may be the cause.

Here's the evidence: In the Console, I select one of the pages in which Googlebot has found an error. Then I click "Fetch as Google", and the following error is displayed: "Googlebot couldn't get all resources for this page". It lists one or more external image links from the page that are "not found". And indeed, some of those external image paths are broken.

If I click "View as Search Result" for each defective page, the Console typically displays a blank search results page. I assume that means these pages have not been indexed by Google.

Here's the problem: Correcting a broken image path might seem easy, but in this case, it's not. My website has over 70,000 pages, with data pulled from a MySQL database containing hundreds of thousands of items. Each web page has multiple images linked from a product supplier's website. Most of the images are stored in the default image folder on the supplier's website, but some are stored in various other locations. Those locations are not predictable, and that is what causes the problem.

This problem was anticipated from the start. Assuming that a percentage of the external image paths would inevitably be broken, each image tag is already coded with the following JavaScript, to hide any ugly error messages:

<img src="http://www.product-supplier.com/default-image-folder/12345678.gif" alt="Image not available." onerror="this.style.display='none';" width="150">

This JavaScript allows all the product images to be displayed correctly on the web page when their paths are correct. But if an image path is faulty, then only white space is displayed. Visually, this is acceptable for humans, but Googlebot doesn't understand the JavaScript, so it treats the broken link as a required resource.
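For clarity, the inline handler above is equivalent to this small standalone function (a minimal sketch; the function name is my own illustration, and a plain object stands in for the `<img>` element, since the real thing only exists in a browser):

```javascript
// Hide an image element by setting its inline display style,
// mirroring the onerror attribute in the markup above.
// "hideBrokenImage" is an illustrative name, not from my actual pages.
function hideBrokenImage(img) {
  img.style.display = 'none';
}

// In a real page this would be wired up as:
//   <img src="..." onerror="hideBrokenImage(this)">
// Demonstration with a stand-in object for an <img> element:
const fakeImg = { style: { display: 'inline' } };
hideBrokenImage(fakeImg);
console.log(fakeImg.style.display); // 'none'
```

The browser fires the `error` event only when the image fails to load, so images with valid paths are unaffected.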

Here are my questions: Is there any way to prevent Googlebot from attempting to verify all the external image links? Can I indicate to Googlebot that the external image links don't matter? Is there any way to hide the image links from Googlebot?

If it's true that Google tends not to index any page with a broken link to an external image, then will it also decline to index a page with a broken link to an external website? If so, that would create a powerful incentive not to link to external web pages at all, since we have no control over them, and they are occasionally deleted.

Constraints:

  • The supplier does not explain their criteria for storing some of their product images in non-standard locations on their website.
  • The supplier does not provide a link for each image.
  • Given the vast amount of product data, it's not feasible to comb through it to find each individual broken link.
  • It would not be feasible to host all the images relevant to the supplier's constantly changing product catalog, as that would require too much ongoing maintenance.
  • Therefore, a percentage of the image links will always be broken.
  • My web pages are generated programmatically from my MySQL database, which is updated regularly with new data from the supplier.
  • My programming knowledge is limited to some PHP and very little JavaScript, so please answer in simple terms. Thanks.
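To make the setup concrete, here is a rough sketch of how each product row turns into the image tag shown earlier. My real code is PHP, so this Node-style JavaScript version, with an invented field name (`imageId`), is only an illustration of the idea:

```javascript
// Build the <img> tag for one product row, assuming the supplier's
// default image folder. The imageId field name is illustrative only;
// the real database columns differ.
function buildImageTag(product) {
  const src = 'http://www.product-supplier.com/default-image-folder/' +
              product.imageId + '.gif';
  return '<img src="' + src + '" alt="Image not available." ' +
         'onerror="this.style.display=\'none\';" width="150">';
}

const tag = buildImageTag({ imageId: '12345678' });
console.log(tag);
```

The point is that the URL is assembled from the database blindly: there is no step that checks whether the supplier actually stored that image in the default folder, which is why some of the generated paths end up broken.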
Photon
  • The presence of broken image urls should not be enough to stop google from indexing the site. https://support.google.com/webmasters/answer/7474347?hl=en&ref_topic=9002753 – Håken Lid Aug 11 '18 at 22:46
  • The Google Search Console seems to indicate that most of the problematic pages are not being indexed. In that console, Google lists the problematic web pages. If I click on each of the URLs in turn, and then click on View as Search Result, that typically leads to a blank page. I assume that means there is no Google search result for those problematic web pages. For reference, here is an introduction to the Google Search Console: http://ryanmackellar.com/blog/new-search-console-index-coverage/ – Photon Aug 12 '18 at 01:56
  • Granted, that does not prove that the broken image links are the reason why Google has not indexed those pages. But no other errors are reported for those pages. So if I could get Google to ignore the broken image links, then I think the odds of being indexed would be higher. – Photon Aug 12 '18 at 02:11
  • Håken Lid, you have raised an important issue, and so I have revised my question accordingly. Thanks. – Photon Aug 12 '18 at 03:39

0 Answers