I am trying to fetch the most relevant image from a URL: the image that is closest to the title text of the page. Put differently, I want to score each image by how close it is to the title text, and then fetch the image with the highest score.
The title 'text' could be in a heading element
<h1>title text</h1>, <h2>title text</h2>, etc.
Or it may match the alt attribute of an
<img alt='title text'> tag.
Or it may be in any other element, such as
<p>, <span>, <div>, etc.
For example, let's say the title of the page is as follows:
<title>White Gold Round Diamond Wedding Band: Jewelry: Amazon.com</title>
And in the body of the page we have something like:
<h1>White Gold Round Diamond Wedding Band</h1>
Let's say the image closest to the above tag is inside a div, as follows:
<div class='abc'>
<img src='efg' />
</div>
Then the above image should get the highest score.
If instead an img's alt attribute matches the title, then that image should get the highest score.
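To make the scoring rules above concrete, here is a minimal Python sketch using only the standard library. It rests on several assumptions that are not part of the original description: "distance" is measured as the number of start tags and text nodes between an image and the text node that best matches the title, "matches" means a word-overlap (Jaccard) similarity of at least 0.5, and the names `PageScanner`, `similarity`, `best_image` and `alt_threshold` are hypothetical, chosen for illustration only.

```python
import re
from html.parser import HTMLParser


def similarity(a, b):
    """Jaccard overlap between the lowercase word sets of two strings."""
    ta = set(re.findall(r"[a-z0-9]+", a.lower()))
    tb = set(re.findall(r"[a-z0-9]+", b.lower()))
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


class PageScanner(HTMLParser):
    """Records the <title>, text nodes and <img> tags with a document-order position."""

    def __init__(self):
        super().__init__()
        self.pos = 0          # incremented for every start tag and counted text node
        self.context = None   # "title", "script" or "style" while inside one of them
        self.title = ""
        self.texts = []       # (position, text)
        self.images = []      # (position, src, alt)

    def handle_starttag(self, tag, attrs):
        self.pos += 1
        if tag == "img":
            a = dict(attrs)
            self.images.append((self.pos, a.get("src") or "", a.get("alt") or ""))
        elif tag in ("title", "script", "style"):
            self.context = tag

    def handle_endtag(self, tag):
        if tag == self.context:
            self.context = None

    def handle_data(self, data):
        if self.context == "title":
            self.title += data
        elif self.context is None and data.strip():
            self.pos += 1
            self.texts.append((self.pos, data.strip()))


def best_image(html, alt_threshold=0.5):
    """Return the src of the highest-scoring image, or None if the page has no images."""
    scanner = PageScanner()
    scanner.feed(html)
    scanner.close()
    if not scanner.images:
        return None
    title = scanner.title.strip()

    # Anchor: the text node (heading, p, span, div, ...) most similar to the title.
    anchor = None
    if scanner.texts:
        pos, text = max(scanner.texts, key=lambda t: similarity(t[1], title))
        if similarity(text, title) > 0:
            anchor = pos

    def score(image):
        pos, _src, alt = image
        alt_sim = similarity(alt, title)
        if alt_sim >= alt_threshold:
            return 1.0 + alt_sim              # an alt match outranks any proximity score
        if anchor is None:
            return 0.0
        return 1.0 / (1 + abs(pos - anchor))  # closer to the anchor means a higher score

    # On ties, max() keeps the image that appears first in the document.
    return max(scanner.images, key=score)[1]
```

Calling `best_image(page_source)` on the downloaded HTML returns the `src` of the chosen image. Document-order distance is only a rough stand-in for visual closeness; a distance measured through the DOM tree, or from rendered positions, may work better on real pages.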
Thanks in advance.